Asked by: Robin Bjoern Platte · Asked: 11/17/2023 · Updated: 11/17/2023 · Views: 24
Shap explainer error after XGBoost Classifier
Q:
I am trying to understand my XGBoost model using shap.
The model is fed a DataFrame in which most columns only take the values -1, 0, 1; one column holds continuous numeric values.
The code looks like this:
import xgboost as xgb
from sklearn.model_selection import train_test_split

n_estimators = 200000
early_stopping_rounds = 10000
max_depth = 50

# ----- train: BUY ------
# getData, trainingStartDate and trainingEndDate are defined elsewhere
X, y = getData(trainingStartDate, trainingEndDate, "buy")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)

buy_xgb = xgb.XGBClassifier(objective='binary:logistic',
                            missing=1,
                            eval_metric='aucpr',
                            early_stopping_rounds=early_stopping_rounds,
                            n_estimators=n_estimators,
                            max_depth=max_depth)

print(f"Training BUY model with {round(sum(y_train)/len(y_train)*100)}% targets")
buy_xgb.fit(X_train, y_train, verbose=False, eval_set=[(X_test, y_test)])
The model works fine when predicting new values.
But when I try explainer_buy = shap.TreeExplainer(buy_xgb)
I get the following error:
File /opt/conda/lib/python3.10/site-packages/shap/explainers/_tree.py:1778, in XGBTreeModelLoader.get_trees(self, data, data_missing)
1774 self.features[i,j] = self.node_sindex[i][j] & ((np.uint32(1) << np.uint32(31)) - np.uint32(1))
1775 if self.node_cleft[i][j] >= 0:
1776 # Xgboost uses < for thresholds where shap uses <=
1777 # Move the threshold down by the smallest possible increment
-> 1778 self.thresholds[i, j] = np.nextafter(self.node_info[i][j], - np.float32(np.inf))
1779 else:
1780 self.values[i,j] = self.node_info[i][j]
FloatingPointError: underflow encountered in nextafter
I have already tried it with fewer estimators (20k). It still doesn't work.
Any ideas on how to solve this?
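(Not an answer from the thread, just a sketch of the likely mechanism.) The traceback shows shap's XGBoost tree loader nudging each split threshold down with np.nextafter(threshold, -inf); for a float32 threshold near zero the result is subnormal, which signals an IEEE underflow. That only becomes a FloatingPointError if numpy has been put into raise mode (e.g. np.seterr(all="raise") somewhere in the session), so one possible workaround is to suppress underflow checking locally:

```python
import numpy as np

# Reproduce the failure mode: with numpy set to raise on underflow,
# nextafter on a tiny float32 threshold trips the check, because the
# subnormal result signals an IEEE underflow.
np.seterr(under="raise")
reproduced = False
try:
    np.nextafter(np.float32(1e-40), -np.float32(np.inf))
except FloatingPointError:
    reproduced = True
print("reproduced underflow error:", reproduced)

# Workaround sketch: locally ignore underflow around the offending call.
with np.errstate(under="ignore"):
    nudged = np.nextafter(np.float32(1e-40), -np.float32(np.inf))
print("nudged threshold is still positive:", nudged > 0)
```

Wrapping the explainer construction the same way, i.e. building shap.TreeExplainer(buy_xgb) inside a with np.errstate(under="ignore"): block, should avoid the exception; alternatively, track down and remove whatever np.seterr call switched numpy into raise mode.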
A: No answers yet