FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan

Asked by: unstuck · Asked: 8/20/2021 · Last edited by: unstuck · Updated: 10/16/2021 · Views: 26598

Q:

I am trying to tune the learning rate and max_depth parameters of an XGB regression model:

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

param_grid = [
    # trying learning rates from 0.01 to 0.2
    {'eta ':[0.01, 0.05, 0.1, 0.2]},
    # and max depth from 4 to 10
    {'max_depth': [4, 6, 8, 10]}
  ]

xgb_model = XGBRegressor(random_state = 0)
grid_search = GridSearchCV(xgb_model, param_grid, cv=5,
                           scoring='neg_root_mean_squared_error',
                           return_train_score=True)

grid_search.fit(final_OH_X_train_scaled, y_train)

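final_OH_X_train_scaled is the training dataset, which contains only numeric features.

y_train holds the training labels, which are also numeric.

This returns the error: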

FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan.

I have seen other similar questions, but have not found an answer yet.

I also tried:

param_grid = [
    # trying learning rates from 0.01 to 0.2
    # and max depth from 4 to 10
    {'eta ': [0.01, 0.05, 0.1, 0.2], 'max_depth': [4, 6, 8, 10]}   
  ]

But it produces the same error.

Edit: here is a sample of the data:

final_OH_X_train_scaled.head()

y_train.head()

Edit 2:

The data sample can be reconstructed with:

final_OH_X_train_scaled = pd.DataFrame([[0.540617 ,1.204666 ,1.670791 ,-0.445424 ,-0.890944 ,-0.491098 ,0.094999 ,1.522411 ,-0.247443 ,-0.559572 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0], 
                   [0.117467 ,-2.351903 ,0.718969 ,-0.119721 ,-0.874705 ,-0.530832 ,-1.385230 ,2.126612 ,-0.947731 ,-0.156967 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0 ,0.0 ,0.0], 
                   [0.901138 ,-0.208256 ,-0.019134 ,0.265250 ,-0.889128 ,-0.467753 ,0.169306 ,-0.973256 ,0.056164 ,-0.671978 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0],
                   [2.074639 ,0.100602 ,-1.645121 ,0.929598 ,0.811911 ,1.364560 ,0.337242 ,0.435187 ,-0.388075 ,1.279959 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0], 
                   [2.198099 ,-0.496254 ,-0.917933 ,-1.418407 ,-0.975889 ,1.044495 ,0.254181 ,1.335285 ,2.079415 ,2.071974 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0 ,0.0 ,0.0]],
                  columns=['cont0' ,'cont1' ,'cont2' ,'cont3' ,'cont4' ,'cont5' ,'cont6' ,'cont7' ,'cont8' ,'cont9' ,'31' ,'32' ,'33' ,'34' ,'35' ,'36' ,'37' ,'38' ,'39' ,'40'])

python scikit-learn xgboost

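1 upvote TC Arlen 8/20/2021

Nothing looks obviously wrong to me. Can you post a few rows of your final_OH_X_train_scaled and y_train data so that we can reproduce and debug? There may be a problem with your data.

0 upvotes unstuck 8/20/2021

@TCArlen Thank you very much for the feedback. Please see my edit above.

0 upvotes TC Arlen 8/20/2021

Great, thanks. However, to check and reproduce/debug this on my machine, I need the training rows and labels as code/data so that I can run it myself. Can you post them as data rather than as a screenshot?

0 upvotes TC Arlen 8/20/2021

The data in the link is not the transformed data shown in the screenshot above, from final_OH_X_train_scaled.head(). Please put those values into code, as in this example question: stackoverflow.com/questions/68732791/... Do you see how the dataframe there is constructed from code, so that the example is reproducible on another machine? Thanks.

0 upvotes unstuck 8/20/2021

OK, please see above.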

A:

4 upvotes TC Arlen 8/20/2021 #1

I was able to reproduce the problem: the fit fails because there is an extra space in your eta parameter name! Instead of:

{'eta ':[0.01, 0.05, 0.1, 0.2]},...

Change it to:

{'eta':[0.01, 0.05, 0.1, 0.2]},...

Unfortunately, the error message is not very helpful.
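Since the warning does not point at the offending key, a quick check of the grid before fitting can catch this kind of typo. The following is only a minimal sketch: keys_with_stray_whitespace is a hypothetical helper name, not part of scikit-learn or XGBoost, and it looks only for leading or trailing whitespace in the parameter names.

```python
def keys_with_stray_whitespace(param_grid):
    """Return the grid keys that have leading or trailing whitespace.

    Accepts either a single dict or a list of dicts, the two forms
    that GridSearchCV takes for param_grid.
    """
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    return [key for grid in grids for key in grid if key != key.strip()]


# The grid from the question, including the trailing space in 'eta '
param_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    {'max_depth': [4, 6, 8, 10]},
]

bad_keys = keys_with_stray_whitespace(param_grid)
print(bad_keys)  # ['eta ']
```

An empty list means no key has stray whitespace; it does not prove that every name is one the estimator accepts.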

0 upvotes heschmat 10/16/2021 #2

Similarly, if for example you set the grid for a LogisticRegression to something like

grid_lr = {
    'cls__class_weight': [None, 'balanced'],
    'cls__C': [0, .001, .01, .1, 1]
}

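you will get a similar error; the reason is that C can only take positive float values. So simply double-checking the names and values of your hyperparameters is enough to solve this problem.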
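One way to see the underlying exception instead of the generic warning is to pass error_score='raise' to GridSearchCV, so that a failing fit is re-raised rather than scored as nan. The sketch below is an illustration under assumptions, not the answerer's code: it uses synthetic data from make_classification and a bare LogisticRegression rather than a pipeline, so the key is 'C' instead of 'cls__C'.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only to make the example self-contained
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# C=0 is invalid: C must be a positive float
grid = {'C': [0, .001, .01, .1, 1]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    grid,
    cv=3,
    error_score='raise',  # re-raise the fit error instead of recording nan
)

caught = None
try:
    search.fit(X, y)
except Exception as exc:  # the exact exception class varies by scikit-learn version
    caught = exc
    print(type(exc).__name__, exc)
```

Once the offending value is removed from the grid, error_score can be left at its default so that one bad candidate does not abort a long search.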
