Asked by: Akshay Basutkar | Asked: 9/27/2023 | Last edited by: desertnaut, Akshay Basutkar | Updated: 9/27/2023 | Views: 33
KNN algorithm throws ValueError: Unknown label type: 'continuous'
Question:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler
path = "/content/cirrhosis.csv"
data = pd.read_csv(path)
data = data.loc[0:311]
data.head()
for col in data.columns:
    if data[col].dtype == 'int64' or data[col].dtype == 'float64':
        data[col].fillna(data[col].mean(), inplace=True)
    elif data[col].dtype == 'object':
        # mode() returns a Series, so take its first entry as the fill value
        data[col].fillna(data[col].mode()[0], inplace=True)
label_encoder = LabelEncoder()
for column in data.columns:
    if data[column].dtype == 'object':
        data[column] = label_encoder.fit_transform(data[column])
print(data)
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data)
data = pd.DataFrame(scaled_data, columns=data.columns)
inputs = data.drop(['ID', 'Stage'],axis=1)
output = data.drop(['ID', 'N_Days', 'Status', 'Drug', 'Age', 'Sex', 'Ascites', 'Hepatomegaly', 'Spiders', 'Edema', 'Bilirubin', 'Cholesterol', 'Albumin', 'Copper', 'Alk_Phos', 'SGOT', 'Tryglicerides', 'Platelets', 'Prothrombin'], axis=1)
print(inputs)
print(output)
x_train, x_test, y_train, y_test = train_test_split(inputs, output, train_size=0.8)
model = KNeighborsClassifier(n_neighbors=31)
model.fit(x_train,y_train)
y_pred = model.predict(x_test)
I was trying to improve the accuracy of my KNN model, so I added feature scaling. With feature scaling in place, training the model with model.fit() throws a ValueError. Without feature scaling the algorithm works.
/usr/local/lib/python3.10/dist-packages/sklearn/neighbors/_classification.py:215: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
return self._fit(X, y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-73-f656e2af91bb> in <cell line: 2>()
1 model = KNeighborsClassifier(n_neighbors=31)
----> 2 model.fit(x_train,y_train)
3 y_pred = model.predict(x_test)
4 print(y_pred)
5 print(y_test)
2 frames
/usr/local/lib/python3.10/dist-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
216 "multilabel-sequences",
217 ]:
--> 218 raise ValueError("Unknown label type: %r" % y_type)
219
220
ValueError: Unknown label type: 'continuous'
Answer:
1 upvote
Ugur Yigit
9/27/2023
#1
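Can you check whether your response variable is continuous? You are doing a classification task, so a continuous variable in y_train or y_test can cause this error. Scaling the whole dataset is probably what triggered it: MinMaxScaler was applied to every column, including the target, so the target became continuous.
Your response variable should be categorical, for example 0/1 or Yes/No.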
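The point above can be sketched as follows. This is a minimal illustration, not the asker's actual pipeline: the DataFrame is synthetic (made-up values standing in for cirrhosis.csv, with only two of its feature columns), `n_neighbors=5` and `random_state=0` are arbitrary choices, and `type_of_target` is the scikit-learn helper used here to inspect how the target is interpreted. It first shows that scaling the whole frame makes `Stage` continuous, then scales only the features and leaves the labels untouched.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for cirrhosis.csv (values are invented for illustration).
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Bilirubin": rng.uniform(0.3, 28.0, size=100),
    "Albumin": rng.uniform(2.0, 4.6, size=100),
    "Stage": np.tile([1, 2, 3, 4], 25),  # integer class labels 1..4
})

# Scaling every column maps Stage to 0.0, 0.333..., 0.666..., 1.0,
# which scikit-learn no longer recognises as class labels.
scaled_all = pd.DataFrame(MinMaxScaler().fit_transform(data), columns=data.columns)
print(type_of_target(scaled_all["Stage"]))

# Separate features and target first; the target is never scaled.
X = data.drop(columns=["Stage"])
y = data["Stage"]  # 1-D Series, which also avoids the DataConversionWarning
print(type_of_target(y))

x_train, x_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)

# Fit the scaler on the training features only, then reuse it on the test set.
scaler = MinMaxScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(x_train_scaled, y_train)
y_pred = model.predict(x_test_scaled)
```

Fitting the scaler on the training split only is a choice made in this sketch, not something the question does; it keeps test-set statistics out of the training step. Selecting the target as `data["Stage"]` rather than a one-column DataFrame is what gives `fit()` the 1-D shape the warning asks for.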
Comments
0 upvotes
Akshay Basutkar
9/29/2023
Yes, that worked. I checked the output and found that it was in continuous form as well, so I corrected it.