Value type error when using XGBoost with dask distributed

Asked by: lara_toff  Asked: 1/19/2021  Last edited by: M_xlara_toff  Updated: 1/20/2021  Views: 365

Q:

Here is code that reproduces the error on my machine:

import numpy as np
import xgboost as xgb
import dask.array as da
import dask.distributed
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

X = da.from_array(np.random.randint(0,10,size=(10,10)))
Y = da.from_array(np.random.randint(0,10,size=(10,1)))

cluster = LocalCUDACluster(n_workers=4, threads_per_worker=1)
client = Client(cluster)

dtrain = xgb.dask.DaskDeviceQuantileDMatrix(client=client, data=X, label=Y)

params = {'tree_method': 'gpu_hist', 'objective': 'rank:pairwise', 'min_child_weight': 1, 'max_depth': 3, 'eta': 0.1}
# dtrain is the matrix built above; the traceback below shows the same call under the name trainLong.
watchlist = [(dtrain, 'train')]
reg = xgb.dask.train(client, params, dtrain, num_boost_round=10, evals=watchlist, verbose_eval=1)

Here is a summary of the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-ff1b0329f2f9> in <module>
      1 params = {'tree_method':'gpu_hist','objective':'rank:pairwise','min_child_weight':1,'max_depth':3,'eta':0.1}
      2 watchlist = [(trainLong, 'train')]
----> 3 regLong = xgb.dask.train(client, params, trainLong, num_boost_round=10,evals=watchlist,verbose_eval=1)

/usr/local/share/anaconda3/lib/python3.7/site-packages/xgboost/data.py in _device_quantile_transform()
    804         return _transform_dlpack(data), feature_names, feature_types
    805     raise TypeError('Value type is not supported for data iterator:' +
--> 806                     str(type(data)))
    807 
    808 

TypeError: Value type is not supported for data iterator:<class 'numpy.ndarray'>

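How is the device quantile matrix still being passed a numpy array?

I have also tried starting from a pandas DataFrame, converting it to a dask DataFrame, and then building the device quantile matrix from that...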
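
A possible explanation, offered as a sketch rather than a confirmed diagnosis: a dask array created with da.from_array from a NumPy array keeps NumPy blocks, so each worker hands a numpy.ndarray to the device quantile code path, which is the type named in the TypeError. The snippet below checks the block type on the CPU only. The helper names first_block_type and convert_blocks are hypothetical, and the commented-out CuPy lines assume a machine with CuPy and a GPU; they are not run here.

```python
import numpy as np
import dask.array as da


def first_block_type(arr):
    """Return the Python type of the first block of a dask array."""
    # Select block 0 along every axis so this works for any dimensionality.
    first = arr.blocks[(0,) * arr.ndim]
    return type(first.compute())


def convert_blocks(arr, asarray):
    """Apply `asarray` to every block (hypothetical helper around map_blocks)."""
    return arr.map_blocks(asarray)


X = da.from_array(np.random.randint(0, 10, size=(10, 10)))
print(first_block_type(X))  # <class 'numpy.ndarray'>: the blocks live in host memory

# Assumption: on a machine with CuPy and a GPU, the blocks could be moved
# to the device before building the DMatrix, for example:
#   import cupy
#   X_gpu = convert_blocks(X, cupy.asarray)
#   first_block_type(X_gpu)  # expected to be a CuPy array type
```

If the blocks do turn out to be NumPy arrays, converting both X and Y this way (or loading the data with a GPU dataframe library) before constructing DaskDeviceQuantileDMatrix would be the first thing to try. The same reasoning would apply to a dask DataFrame built from pandas, whose partitions are host-memory pandas objects.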

Python GPU dask distributed xgboost

A: No answers yet