Shap explainer error after XGBoost Classifier

Asked by: Robin Bjoern Platte Asked: 11/17/2023 Updated: 11/17/2023 Views: 24

Q:

I am trying to use shap to understand my XGBoost model.

I feed the model a DataFrame in which most columns take the values -1, 0, or 1. One column holds continuous numeric values.

The code looks like this:

    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    n_estimators = 200000
    early_stopping_rounds = 10000
    max_depth = 50
    #----- train: BUY ------
    X,y = getData(trainingStartDate, trainingEndDate, "buy")
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)   
    buy_xgb = xgb.XGBClassifier(objective='binary:logistic',
        missing=1, 
        eval_metric='aucpr',
        early_stopping_rounds=early_stopping_rounds,
        n_estimators=n_estimators,
        max_depth=max_depth)
    print(f"Training BUY model with {round(sum(y_train)/len(y_train)*100)}% targets")
    buy_xgb.fit(X_train, y_train, verbose=False, eval_set=[(X_test, y_test)]) 

The model works perfectly when I use it to predict new values.

When I run

    explainer_buy = shap.TreeExplainer(buy_xgb)

I get the following error:

    File /opt/conda/lib/python3.10/site-packages/shap/explainers/_tree.py:1778, in XGBTreeModelLoader.get_trees(self, data, data_missing)
       1774 self.features[i,j] = self.node_sindex[i][j] & ((np.uint32(1) << np.uint32(31)) - np.uint32(1))
       1775 if self.node_cleft[i][j] >= 0:
       1776     # Xgboost uses < for thresholds where shap uses <=
       1777     # Move the threshold down by the smallest possible increment
    -> 1778     self.thresholds[i, j] = np.nextafter(self.node_info[i][j], - np.float32(np.inf))
       1779 else:
       1780     self.values[i,j] = self.node_info[i][j]

    FloatingPointError: underflow encountered in nextafter
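The failing line shifts each split threshold down by one float32 step with np.nextafter. NumPy ignores underflow by default, so getting a FloatingPointError rather than a silent result suggests that the error policy was set to 'raise' somewhere in the session (for example via np.seterr); that is an assumption, since the question does not show it. The sketch below reproduces only the arithmetic, without shap or XGBoost. The helper names shift_threshold_down and raises_underflow are hypothetical, and whether the underflow flag is actually set for a subnormal result can depend on the platform.

```python
import numpy as np


def shift_threshold_down(threshold, under="ignore"):
    """Return the float32 just below `threshold`, as the shap line does.

    `under` is forwarded to np.errstate: "raise" turns an underflow flag
    into FloatingPointError, "ignore" matches NumPy's default policy.
    """
    with np.errstate(under=under):
        return np.nextafter(np.float32(threshold), -np.float32(np.inf))


def raises_underflow(threshold):
    """Report whether the shift raises FloatingPointError under under='raise'."""
    try:
        shift_threshold_down(threshold, under="raise")
    except FloatingPointError:
        return True
    return False


# A threshold of exactly 0.0 moves down to a negative subnormal float32.
shifted_zero = shift_threshold_down(0.0)
# An ordinary threshold such as 0.5 stays in the normal float32 range.
shifted_half = shift_threshold_down(0.5)

print("error policy in effect:", np.geterr())
print("0.0 ->", shifted_zero, "| raises under 'raise':", raises_underflow(0.0))
print("0.5 ->", shifted_half, "| raises under 'raise':", raises_underflow(0.5))
```

If this is indeed the cause, constructing the explainer inside a `with np.errstate(under='ignore'):` block might be enough to get past the error. This has not been tested against the model in the question.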

I have already tried with fewer estimators (20k). It still fails with the same error.

Any ideas on how to solve this?

python numpy xgboost shap

Comments

0 upvotes Sergey Bushmanov 11/25/2023
Please provide a minimal reproducible example.

A: No answers yet