Asked by: lenhhoxung Asked: 11/8/2023 Updated: 11/16/2023 Views: 34
Using SHAP to explain xgboost: Check failed: end <= model.BoostedRounds() (309 vs. 133) : Out of range for tree layers
Question:
I am using SHAP to explain my xgboost model with the following code:
explainer = shap.TreeExplainer(model)
explainer.shap_values(pd_df)
# explainer(xgboost.DMatrix(pd_df, label=label))
But it throws the following error:
XGBoostError Traceback (most recent call last)
/app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/shap/explainers/_tree.py in shap_values(self, X, y, tree_limit, approximate, check_additivity, from_call)
357 try:
--> 358 phi = self.model.original_model.predict(
359 X, iteration_range=(0, tree_limit), pred_contribs=True,
/app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/core.py in predict(self, data, output_margin, pred_leaf, pred_contribs, approx_contribs, pred_interactions, validate_features, training, iteration_range, strict_shape)
2295 dims = c_bst_ulong()
-> 2296 _check_call(
2297 _LIB.XGBoosterPredictFromDMatrix(
/app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/core.py in _check_call(ret)
280 if ret != 0:
--> 281 raise XGBoostError(py_str(_LIB.XGBGetLastError()))
282
XGBoostError: [06:41:59] /workspace/src/gbm/gbtree.h:125: Check failed: end <= model.BoostedRounds() (309 vs. 133) : Out of range for tree layers.
Stack trace:
[bt] (0) /app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/lib/libxgboost.so(+0x45a59a) [0x7f315de3b59a]
[bt] (1) /app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/lib/libxgboost.so(+0x47123f) [0x7f315de5223f]
[bt] (2) /app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/lib/libxgboost.so(+0x47166b) [0x7f315de5266b]
[bt] (3) /app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/lib/libxgboost.so(+0x4c463b) [0x7f315dea563b]
[bt] (4) /app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/xgboost/lib/libxgboost.so(XGBoosterPredictFromDMatrix+0x2be) [0x7f315db4de2e]
[bt] (5) /lib64/libffi.so.6(ffi_call_unix64+0x4c) [0x7f371af33dec]
[bt] (6) /lib64/libffi.so.6(ffi_call+0x1f5) [0x7f371af33715]
[bt] (7) /usr/local/lib/python3.9/lib-dynload/_ctypes.cpython-39-x86_64-linux-gnu.so(+0x1286f) [0x7f371b14886f]
[bt] (8) /usr/local/lib/python3.9/lib-dynload/_ctypes.cpython-39-x86_64-linux-gnu.so(+0xc2fb) [0x7f371b1422fb]
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
<ipython-input-20-40f7d265c006> in <cell line: 1>()
----> 1 explainer.shap_values(pd_df)
/app/dataiku/DSS_DATA_DIR/code-envs/python/env/lib/python3.9/site-packages/shap/explainers/_tree.py in shap_values(self, X, y, tree_limit, approximate, check_additivity, from_call)
365 "See https://github.com/slundberg/shap/issues/580."
366 )
--> 367 raise ValueError(emsg) from e
368
369 if check_additivity and self.model.model_output == "raw":
ValueError: This reshape error is often caused by passing a bad data matrix to SHAP. See https://github.com/slundberg/shap/issues/580.
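The key part of the trace is the failed check itself. The traceback shows SHAP calling `predict` with `iteration_range=(0, tree_limit)`, so the first number in the message (309) is the requested end of that range and the second (133) is the number of boosted rounds the model actually holds. A minimal sketch for pulling the two numbers out of the message text, where `parse_round_check` is a hypothetical helper name and not part of SHAP or XGBoost:

```python
import re

# Matches the numbers in "Check failed: end <= model.BoostedRounds() (A vs. B)".
_CHECK_RE = re.compile(r"end <= model\.BoostedRounds\(\) \((\d+) vs\. (\d+)\)")


def parse_round_check(message):
    """Return (requested_end, boosted_rounds) from the error text, or None if absent."""
    match = _CHECK_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


msg = ("Check failed: end <= model.BoostedRounds() (309 vs. 133) : "
       "Out of range for tree layers.")
requested_end, boosted_rounds = parse_round_check(msg)
print(f"requested end: {requested_end}, rounds in model: {boosted_rounds}")
```

This only makes the mismatch explicit. It does not by itself explain why SHAP asked for more rounds than the model contains.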
The model's predictions work fine:
(model.predict(pd_df) == label).mean()
>> 0.8375209380234506
XGBoost version: 2.0.0, SHAP version: 0.43.0
What could be the cause?
Answer:
0 votes
forgetful_coder
11/16/2023
#1
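Sorry, I can't comment because I don't have enough reputation. Are you by any chance doing multi-class classification with eval_set and early stopping?
Replacing the fit statements with: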
model = xgboost.XGBClassifier(objective="binary:logistic", max_depth=4, n_estimators=10, early_stopping_rounds=5)
model.fit(X_train, Y_train, eval_set=[(X_train, Y_train), (X_test, Y_test)], verbose=False)
throws your error: XGBoostError: [16:49:37] /workspace/src/gbm/gbtree.h:125: Check failed: end <= model.BoostedRounds() (30 vs. 10) : Out of range for tree layers.
If you revert to the original (or simply drop early_stopping), it should be fine:
model = xgboost.XGBClassifier(objective="binary:logistic", max_depth=4, n_estimators=10)
model.fit(X_train, Y_train)
shap_values = shap.TreeExplainer(model).shap_values(X_test)
shap.summary_plot(shap_values, X_test)
Of course, this doesn't explain the actual cause, but it might get you unstuck.
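To check whether early stopping is involved in your case, it may help to compare the number of rounds stored in the booster with the best iteration recorded during training. This is only a rough sketch, assuming `model` is a fitted scikit-learn-style XGBoost estimator. `round_summary` is a name I made up, while `get_booster()`, `num_boosted_rounds()` and `best_iteration` come from the XGBoost Python API:

```python
def round_summary(model):
    """Collect the round counts relevant to the failing check.

    Assumes `model` is a fitted scikit-learn-style XGBoost estimator.
    """
    booster = model.get_booster()
    return {
        # Number of boosting rounds actually stored in the booster.
        "boosted_rounds": booster.num_boosted_rounds(),
        # Recorded by early stopping; None if the attribute is unavailable.
        "best_iteration": getattr(model, "best_iteration", None),
    }

# Usage, with your own fitted model:
# print(round_summary(model))
```

If `best_iteration` is set, early stopping was active during `fit`, which is the situation that reproduced the error for me.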