Asked by: Gabriel Bueno Guimaraes Asked: 9/21/2023 Last edited by: desertnaut, Gabriel Bueno Guimaraes Updated: 9/22/2023 Views: 28
Backward Difference Encoder
Question:
I am trying to apply a Backward Difference Encoder to some columns and then train a logistic regression model.
import numpy as np
import category_encoders as ce
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

def train_model_v0(X, y, model, cat_enc_method):
    # Applying Ordinal Encoding to the dependent variable 'churn'
    target_encoder = OrdinalEncoder()
    y = target_encoder.fit_transform(y.values.reshape(-1, 1)).flatten()
    # Defining the steps of the data processing pipeline
    steps = [
        ('Change_Columns', ChangeColumns()),  # Custom step to change columns (definition not shown in the question)
        ('Categorical_Encoder', cat_enc_method(cols=['state', 'area_code', 'international_plan', 'voice_mail_plan'])),
        ('scaler', StandardScaler()),  # Standardization step (zero mean, unit variance)
        ('model', model)  # The machine learning model to be trained
    ]
    # Defining the hyperparameters to be tested in a grid search
    grid_features = [
        {
            'model__penalty': ['l2', None],  # L2 regularization or none
            'model__C': np.logspace(0, 1, 10, base=0.001),  # Regularization parameter C: 10 values from 1 down to 0.001
            'model__solver': ['lbfgs', 'newton-cg', 'sag']  # Optimization algorithms
        },
        {
            'model__penalty': ['l1', 'l2'],  # L1 or L2 regularization
            'model__C': np.logspace(0, 1, 10, base=0.001),  # Regularization parameter C
            'model__solver': ['liblinear']  # Optimization algorithm that supports L1 and L2
        },
        {
            'model__penalty': ['l1', 'l2', None, 'elasticnet'],  # L1, L2, none, or elasticnet regularization
            'model__C': np.logspace(0, 1, 10, base=0.001),  # Regularization parameter C
            'model__solver': ['saga']  # Optimization algorithm that supports elasticnet
        }
    ]
    # Creating a pipeline that includes all data processing steps and the model
    pipe_model = Pipeline(steps=steps)
    # Performing a grid search (GridSearchCV) to find the best hyperparameters
    pipe_v1 = GridSearchCV(pipe_model,
                           param_grid=grid_features,
                           scoring='roc_auc',  # Evaluation metric (area under the ROC curve)
                           cv=5)  # 5-fold cross-validation
    # Fitting the model to the training data
    pipe_v1.fit(X, y)
# Calling the function
train_model_v0(X, y, LogisticRegression(max_iter=10000),
               ce.backward_difference.BackwardDifferenceEncoder)
I am getting an error and cannot figure out why it occurs.
Here is the traceback:
868 results = self._format_results(
869 all_candidate_params, n_splits, all_out, all_more_results
870 )
872 return results
--> 874 self._run_search(evaluate_candidates)
876 # multimetric is determined here because in the case of a callable
877 # self.scoring the return type is only known after calling
878 first_test_score = all_out[0]["test_scores"]
...
376 f"Below are more details about the failures:\n{fit_errors_summary}"
377 )
--> 378 warnings.warn(some_fits_failed_message, FitFailedWarning)
TypeError: issubclass() arg 2 must be a class, a tuple of classes, or a union
Here is a snippet of the data I am using: [image not included]
Answers: no answers yet
Comments
error_score="raise"
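The suggestion in the comment can be sketched as follows. By default, GridSearchCV records a failed fit as a NaN score and only warns about it; passing error_score="raise" makes the first failing fit raise its own exception, which usually points more directly at the cause. This is a minimal sketch on synthetic data, not the asker's pipeline: the names X_demo, y_demo and bad_grid are made up for illustration, and the penalty/solver pair is chosen to be invalid on purpose. It does not claim to reproduce the TypeError in the traceback above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical, not the churn dataset from the question)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Deliberately invalid combination: the lbfgs solver does not support an L1 penalty
bad_grid = {"model__penalty": ["l1"], "model__solver": ["lbfgs"]}

search = GridSearchCV(
    pipe,
    param_grid=bad_grid,
    scoring="roc_auc",
    cv=3,
    error_score="raise",  # re-raise the exception from the failing fit instead of scoring it as NaN
)

failure = None
try:
    search.fit(X_demo, y_demo)
except Exception as exc:
    # With error_score="raise", the underlying exception from the fit surfaces here
    failure = exc
    print(type(exc).__name__, exc)
```

In the question's code, the equivalent change would be adding error_score="raise" to the GridSearchCV(...) call inside train_model_v0 and rerunning, then reading the exception that comes out of the failing fit.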