Is it possible to do classification without converting the data into numeric values?

Asked by: SH_IQ  Asked: 10/30/2023  Last edited by: marc_s, SH_IQ  Updated: 10/31/2023  Views: 31

Q:

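I am trying to understand how KNN (k-nearest neighbors) works for classifying the Iris dataset. As I understand it, to do classification I have to prepare the data as numeric values. The code below, which I am following, does not convert anything to numeric values. Is that correct? When do the values need to be converted to numeric, and when do they not? I ask because the code below achieves an accuracy of 0.97. Could I get some clarification?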

# import libraries 
import pandas as pd # Import Pandas for data manipulation using dataframes
import numpy as np # Import Numpy for data statistical analysis 
import matplotlib.pyplot as plt # Import matplotlib for data visualisation
import seaborn as sns

# Load the dataset into a single DataFrame (it is split into train/test sets further down)
iris_df = pd.read_csv('iris.csv')

# Drop the Species (target label) column to get the feature matrix
X = iris_df.drop(['Species'],axis=1)
X

y = iris_df['Species']
y

# Import train_test_split from scikit library
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.35)

# Fitting K-NN to the Training set
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)

y_predict = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_predict)
sns.heatmap(cm, annot=True, fmt="d")
plt.show()  # needed to display the heatmap when run as a script rather than in a notebook

print(classification_report(y_test, y_predict))
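
One way to see what the classifier above actually receives is to inspect the column dtypes. The following is a minimal sketch, not the original code: it assumes scikit-learn's bundled Iris data (`load_iris`) as a stand-in for `iris.csv`, and the extra `color` column is hypothetical, added only to illustrate a non-numeric feature. Under those assumptions, the four measurement columns are already numeric, scikit-learn classifiers accept string class labels in `y` as they are, and only non-numeric feature columns would need encoding first.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled Iris data stands in for the iris.csv used in the question.
iris = load_iris(as_frame=True)
X = iris.data  # four measurement columns, already floats

# Rebuild string labels to mimic a 'Species' column of species names.
y = pd.Series(iris.target_names[iris.target.to_numpy()], name="Species")

print(X.dtypes)  # feature dtypes: numeric, so no conversion is needed
print(y.dtype)   # label dtype: strings

# String labels in y are accepted directly by scikit-learn classifiers.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)
pred = knn.predict(X.iloc[:3])
print(pred)      # predictions come back as species names

# A non-numeric *feature* is a different story: distances cannot be computed on it.
X_bad = X.assign(color="red")  # hypothetical string-valued feature column
rejected = False
try:
    KNeighborsClassifier(n_neighbors=5).fit(X_bad, y)
except (ValueError, TypeError) as err:
    rejected = True
    print("Non-numeric feature rejected:", type(err).__name__)
```

In other words, conversion matters for the inputs the distance metric is computed on (the columns of `X`), not for the class labels.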
Tags: python, classification, knn

A: No answers yet