Asked by: Analysa Marie · Asked: 11/16/2023 · Last edited by: Analysa Marie · Updated: 11/16/2023 · Views: 16
How can I copy over all weights and trees from xgboost.spark.SparkXGBClassifier to xgboost.sklearn.XGBClassifier?
Q:
Long story short, I trained an XGBoost classifier model in Spark using xgboost.spark.SparkXGBClassifier:
from pyspark.ml import Pipeline
from xgboost.spark import SparkXGBClassifier

spark_xgboost_classifier = SparkXGBClassifier(
    features_col=numeric_features,
    label_col=target,
    num_workers=1,
    missing=0.0,
    max_depth=5,
    use_gpu=True,
)
pipeline = Pipeline(stages=[spark_xgboost_classifier])
model = pipeline.fit(train.limit(2500000))
# Spark ML save(): writes a directory in Spark's own model layout, not a single XGBoost JSON file
model.stages[0].save('model.json')
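I would like to use the Python package explainerdashboard, but it requires the sklearn wrapper xgboost.sklearn.XGBClassifier (judging by the error I got when I tried to use the booster directly).
So now I'm wondering: do I have to train a new model? Can I load the model directly into xgboost.sklearn.XGBClassifier from the JSON file? Do I need to copy over all the weights, parameters, and trees? I'm just very confused about where to go from here.
- I tried using the booster: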
import explainerdashboard
import xgboost as xgb

booster: xgb.Booster = model.stages[0].get_booster()
explainer = explainerdashboard.ClassifierExplainer(
    booster,
    test.select(numeric_features).limit(10000).toPandas(),
    test.limit(50000).select(target).toPandas(),
)
But it raised the error: ValueError: For xgboost models, currently only the scikit-learn compatible wrappers xgboost.sklearn.XGBClassifier and xgboost.sklearn.XGBRegressor are supported, so please use those instead of xgboost.Booster!
- Then I tried loading from the saved JSON file, but I couldn't actually load the model:
sklearn_model = xgb.sklearn.XGBClassifier(
    features_col=numeric_features,
    label_col=target,
    missing=0.0,
    max_depth=5,
)
sklearn_model.load_model('model.json')
But it gave me a strange, hard-to-understand error. It seems there is a problem reading from S3? I'm so lost!!
XGBoostError: [16:18:03] ../dmlc-core/src/io.cc:69: unknown filesystem protocol s3a://
Stack trace:
[bt] (0) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(+0x811ad5) [0x7fc537611ad5]
[bt] (1) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(+0x811cdb) [0x7fc537611cdb]
[bt] (2) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(+0x811ee5) [0x7fc537611ee5]
[bt] (3) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(+0x1947d9) [0x7fc536f947d9]
[bt] (4) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(+0x144770) [0x7fc536f44770]
[bt] (5) /databricks/python/lib/python3.10/site-packages/xgboost/lib/libxgboost.so(XGBoosterLoadModel+0x185) [0x7fc536f44b55]
[bt] (6) /lib/x86_64-linux-gnu/libffi.so.8(+0x7e2e) [0x7fc5d3e10e2e]
[bt] (7) /lib/x86_64-linux-gnu/libffi.so.8(+0x4493) [0x7fc5d3e0d493]
[bt] (8) /usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so(+0xa3e9) [0x7fc5d050b3e9]
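The message above comes from XGBoost's native I/O layer (the dmlc-core `io.cc` file named in the error), which in this build does not recognize the `s3a://` scheme. As a minimal sketch, assuming the way around it is to hand `load_model` a plain local path, the hypothetical helper `has_remote_scheme` below uses only the standard library to flag paths that carry a URI scheme before they reach XGBoost. The bucket name is made up for illustration.

```python
from urllib.parse import urlparse


def has_remote_scheme(path: str) -> bool:
    """Return True if `path` carries a URI scheme other than file:// (e.g. s3a://)."""
    scheme = urlparse(path).scheme
    # A one-letter "scheme" is a Windows drive letter such as C:\..., not a URI.
    return len(scheme) > 1 and scheme != "file"


# Hypothetical paths, for illustration only.
print(has_remote_scheme("s3a://bucket/model.json"))
print(has_remote_scheme("/tmp/model/booster.json"))
```

A path that is flagged here would need to be copied to local disk first, or the model saved locally to begin with.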
A: No answers yet
Comments
booster_: xgb.Booster = model.get_booster()
booster_.save_model('{}/model/booster/booster.json'.format(local_path))
sklearn_model = xgb.sklearn.XGBClassifier(features_col=numeric_features, label_col=target, missing=0.0, max_depth=5)
sklearn_model.load_model('{}/model/booster/booster.json'.format(local_path))
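The comment above exports the booster to a local path and loads that file into the sklearn wrapper. Below is a small sketch of the export step, assuming only that the object has a `save_model(path)` method, as `xgb.Booster` does. The helper name `save_booster_locally` and the directory names are hypothetical. The helper creates the target directory first, since the nested folder in the comment's path may not exist yet. A stub stands in for the real booster so the sketch runs without xgboost installed.

```python
import os
import tempfile


def save_booster_locally(booster, local_dir, filename="booster.json"):
    """Save `booster` under `local_dir` and return the resulting file path.

    `booster` is any object with a save_model(path) method, such as xgb.Booster.
    """
    os.makedirs(local_dir, exist_ok=True)  # create nested folders if missing
    path = os.path.join(local_dir, filename)
    booster.save_model(path)
    return path


class _StubBooster:
    """Stand-in used only to exercise the helper; not a real XGBoost object."""

    def save_model(self, path):
        with open(path, "w") as fh:
            fh.write("{}")


with tempfile.TemporaryDirectory() as tmp:
    out = save_booster_locally(_StubBooster(), os.path.join(tmp, "model", "booster"))
    saved_ok = os.path.isfile(out)

print(saved_ok)
```

With a real model, the booster from `model.stages[0].get_booster()` would take the place of the stub, and the returned path would then be passed to `sklearn_model.load_model(path)` on a freshly constructed `xgb.sklearn.XGBClassifier`.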