Asked by: mindstorm84 Asked: 7/29/2023 Updated: 7/29/2023 Views: 26
h2o GBM checkpointing error when using h2o automl for the base model
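Q:
I want to use a checkpoint to retrain my h2o model on a new set of observations, but I am running into an error: my code fails at the training step when the checkpoint is used. My original model was created with h2o automl, and I verified that aml.leader is a GBM model.
The error says that the max_depth field cannot be modified. However, I did not set the max_depth parameter anywhere in the gbm_continued definition.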
import h2o
from h2o.automl import H2OAutoML
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

# y (response column name) and predictors (list of feature column names) are defined elsewhere
# ds_file is my local dataset with 4k rows
ds = h2o.import_file(ds_file)
splits = ds.split_frame(ratios=[0.8], seed=1)
train = splits[0]
test = splits[1]
aml = H2OAutoML(max_runtime_secs=60, seed=1, project_name='test')
aml.train(y=y, training_frame=train, leaderboard_frame=test)
# verify that aml.leader is the GBM model
print(aml.leader)
# H2OGradientBoostingEstimator : Gradient Boosting Machine
# Model Key: GBM_1_AutoML_1_20230727_145804
# ds2_file is my local dataset with 30k rows
ds2 = h2o.import_file(ds2_file)
splits2 = ds2.split_frame(ratios=[0.8], seed=1)
train2 = splits2[0]
test2 = splits2[1]
# continue training from the AutoML leader on the new data
gbm_continued = H2OGradientBoostingEstimator(model_id='gbm_continued', checkpoint=aml.leader)
gbm_continued.train(x=predictors, y=y, training_frame=train2)
Here is the error message:
>>> gbm_continued.train(x=predictors, y = y, training_frame = train2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "h2o-dev/lib/lib/python3.8/site-packages/h2o/estimators/estimator_base.py", line 108, in train
self._train(parms, verbose=verbose)
File "dev_items/h2o-dev/lib/lib/python3.8/site-packages/h2o/estimators/estimator_base.py", line 187, in _train
model_builder_json = h2o.api("POST /%d/ModelBuilders/%s" % (rest_ver, self.algo), data=parms)
File "h2o-dev/lib/lib/python3.8/site-packages/h2o/h2o.py", line 124, in api
return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)
File "h2o-dev/lib/lib/python3.8/site-packages/h2o/backend/connection.py", line 498, in request
return self._process_response(resp, save_to)
File "h2o-dev/lib/lib/python3.8/site-packages/h2o/backend/connection.py", line 852, in _process_response
raise H2OResponseError(data)
h2o.exceptions.H2OResponseError: ModelBuilderErrorV3 (water.exceptions.H2OModelBuilderIllegalArgumentException):
timestamp = 1690566243266
error_url = '/3/ModelBuilders/gbm'
msg = 'Illegal argument(s) for GBM model: gbm_continued. Details: ERRR on field: _max_depth: Field _max_depth cannot be modified if checkpoint is specified!\nERRR on field: _ntrees: If checkpoint is specified then requested ntrees must be higher than 409'
dev_msg = 'Illegal argument(s) for GBM model: gbm_continued. Details: ERRR on field: _max_depth: Field _max_depth cannot be modified if checkpoint is specified!\nERRR on field: _ntrees: If checkpoint is specified then requested ntrees must be higher than 409'
http_status = 412
I found a related question on this topic, but it did not resolve the issue.
A:
1 upvote
Wendy
7/29/2023
#1
To resolve the error you are seeing, try the following:
# h2o.get_model expects a model ID string, so pass the leader's model_id
gbm_autoML = h2o.get_model(aml.leader.model_id)
gbm_continued = H2OGradientBoostingEstimator(model_id='gbm_continued',
                                             max_depth=gbm_autoML.actual_params['max_depth'],
                                             ntrees=gbm_autoML.actual_params['ntrees'] + 2,
                                             checkpoint=aml.leader)
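Continuing to train a GBM model means you are adding more trees to it. That is why I added 2 to the ntrees parameter. Feel free to change the 2 to any other value, as long as it is >= 1.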
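The error appears even though max_depth was never set because a new H2OGradientBoostingEstimator starts from the library defaults (max_depth=5, ntrees=50), and these most likely differ from the values AutoML chose for the leader. A minimal sketch of collecting the matching parameters is shown below. The helper name continuation_params and the LOCKED_PARAMS list are hypothetical. The list reflects one reading of the H2O checkpoint documentation and may differ between versions, and the example dictionary only stands in for aml.leader.actual_params.

```python
# Parameters assumed to be locked for tree models once a checkpoint is set.
# This list is an assumption; check the checkpoint docs for your H2O version.
LOCKED_PARAMS = ("max_depth", "min_rows", "nbins", "nbins_cats",
                 "nbins_top_level", "sample_rate", "build_tree_one_node")


def continuation_params(actual_params, extra_trees=2, locked=LOCKED_PARAMS):
    """Build keyword arguments for a GBM that continues from a checkpoint."""
    if extra_trees < 1:
        raise ValueError("extra_trees must be >= 1")
    # Copy each locked parameter that the checkpoint model reports.
    params = {name: actual_params[name] for name in locked if name in actual_params}
    # ntrees is the total tree count, so it must exceed the checkpoint's count.
    params["ntrees"] = actual_params["ntrees"] + extra_trees
    return params


# Stand-in for aml.leader.actual_params; the values are made up for illustration.
example = {"max_depth": 6, "min_rows": 1.0, "sample_rate": 0.8, "ntrees": 409}
kwargs = continuation_params(example, extra_trees=50)
print(kwargs)

# With a running cluster, the result would be used roughly like this:
# gbm_continued = H2OGradientBoostingEstimator(
#     model_id="gbm_continued", checkpoint=aml.leader, **kwargs)
# gbm_continued.train(x=predictors, y=y, training_frame=train2)
```

If the server then reports another field that "cannot be modified", adding that field to the locked tuple and retrying is one way to proceed.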
Hope this helps, and good luck.