A way to reduce feature-selection time for a model with more than 612 features

Asked by: Gray · Asked: 10/31/2023 · Last edited by: desertnaut · Updated: 11/1/2023 · Views: 17

Q:

I need to build an SVM model with 612 features, so I want to reduce them with forward sequential feature selection, but it takes a very long time. Is there any way to cut down the computation time?

Here is my code:

import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X = df.drop(['Class'], axis=1)
y = df['Class'] # AD=0 MCI=1

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

X_train.shape, X_test.shape

cols = X_train.columns
scaler = StandardScaler()
#scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
X_train = pd.DataFrame(X_train, columns=cols)
X_test = pd.DataFrame(X_test, columns=cols)
X_train.describe()

print(X_train)
print(X_test)

import joblib
import sys
# compatibility shim: older mlxtend versions import sklearn.externals.joblib
sys.modules['sklearn.externals.joblib'] = joblib
from mlxtend.feature_selection import SequentialFeatureSelector as sfs

svc=SVC()
svc2 = sfs(estimator=svc, k_features='best', forward=True, verbose=1,
           scoring='accuracy')  # 'r2' is a regression metric; use a classification metric here
svc2.fit(X_train, y_train)
params = {'C': [0.001, 0.01, 0.1, 0, 1, 10, 100, 1000],
          'kernel': ['linear', 'rbf', 'poly', 'sigmoid'],
          'gamma': [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09,
                    0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
          'degree': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
grid_search = GridSearchCV(svc2, params, cv=4)
grid_search.fit(X_train, y_train)
print (grid_search.best_params_)

clf = SVC(**grid_search.best_params_)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
machine-learning time scikit-learn svm feature-selection

Comments

0 · Nick ODell · 10/31/2023
I'd suggest dropping sequential feature selection and instead selecting variables based on their correlation with the dependent variable.
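A minimal sketch of that suggestion using scikit-learn's `SelectKBest` with `f_classif` (an ANOVA F-test, a common stand-in for correlation with a class label). The toy data and the choice of keeping 50 features are illustrative assumptions, not part of the original post:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Toy data standing in for the 612-feature matrix
X, y = make_classification(n_samples=200, n_features=612, random_state=42)

# Score each feature against the class label and keep the top 50
selector = SelectKBest(score_func=f_classif, k=50)
X_reduced = selector.fit_transform(X, y)

print(X_reduced.shape)  # (200, 50)
```

Unlike forward selection, this scores all 612 features in a single pass instead of refitting the SVM hundreds of times.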
0 · DataJanitor · 11/2/2023
You have 6,400 combinations in your parameter grid, run 4-fold CV, and wrap it all in SFS. I'm not surprised it takes a long time.
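One way to tame that cost, sketched here on toy data: decouple the hyperparameter search from the feature selection, and sample a fixed number of settings with `RandomizedSearchCV` (with folds run in parallel via `n_jobs=-1`) instead of exhausting the full grid. The distributions and `n_iter=20` are illustrative assumptions:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Toy data standing in for an already feature-reduced matrix
X, y = make_classification(n_samples=200, n_features=50, random_state=42)

# Sample 20 settings instead of the full 6,400-point grid
param_dist = {
    'C': loguniform(1e-3, 1e3),
    'gamma': loguniform(1e-2, 1e0),
    'kernel': ['linear', 'rbf'],
}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=20, cv=4,
                            n_jobs=-1, random_state=42)
search.fit(X, y)
print(search.best_params_)
```

With 20 sampled settings and 4 folds this fits 80 models, versus 25,600 for the full grid, before SFS multiplies the cost further.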

A: No answers yet