How to ignore ID column in LGBMClassifier without dropping column?

Asked by: zeman Asked: 10/9/2023 Updated: 10/19/2023 Views: 43

Q:

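I need to keep my id column so I can make predictions on the test data, but the model uses id as a predictor, which I don't want. Thanks for your help!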

from lightgbm import LGBMClassifier

# Only the target is dropped here, so 'id' is still among the features
x_train = train.drop(columns=['tgt'])
y_train = train['tgt']
x_test = test_new.drop(columns=['tgt'])
y_test = test_new['tgt']

rand = LGBMClassifier(seed=seed, learning_rate=0.1, num_leaves=12, n_estimators=700)
# The scikit-learn wrapper takes features and labels directly, not an lgb.Dataset
rand.fit(x_train, y_train)
Tags: Python, data science, LightGBM

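Comments

0 votes mrw 10/11/2023
Try making the ID column the index of the DataFrame. That way it is not used for training, but you still have the information when you look at the DataFrame.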

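A:

0 votes comphilip 10/19/2023 #1

Try the ignore_column parameter. Note that the LightGBM parameter documentation describes ignore_column as applying only when data is loaded directly from a text file, so it may have no effect on an in-memory pandas DataFrame.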

# ignore_column is forwarded to LightGBM as a core parameter; 'id' is the column to skip
rand = LGBMClassifier(seed=seed, learning_rate=0.1, num_leaves=12, n_estimators=700, ignore_column="name:id")
# LGBMClassifier.fit takes features and labels directly, not an lgb.Dataset
rand.fit(x_train, y_train)

0 votes PV8 10/19/2023 #2

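You (almost) already have the solution in your code. Drop id only from the feature matrices; train and test_new themselves still contain it: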

# Exclude both the target and the identifier from the features;
# the original train / test_new frames are left unchanged
x_train = train.drop(columns=['tgt', 'id'])
y_train = train['tgt']
x_test = test_new.drop(columns=['tgt', 'id'])
y_test = test_new['tgt']

rand = LGBMClassifier(seed=seed, learning_rate=0.1, num_leaves=12, n_estimators=700)
# fit expects features and labels directly, not an lgb.Dataset
rand.fit(x_train, y_train)
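
To get the ids back next to the predictions afterwards, one option is a small helper like the sketch below. The function name `predict_with_ids` and the output column name `prediction` are hypothetical; the defaults `id` and `tgt` follow the column names used in the question.

```python
import pandas as pd

def predict_with_ids(model, frame, id_col="id", target_col="tgt"):
    """Predict on `frame` without feeding `id_col` to the model, then re-attach the ids."""
    # Drop the identifier (and the target, if present) from the features only
    features = frame.drop(columns=[id_col, target_col], errors="ignore")
    preds = model.predict(features)
    # `frame` itself is untouched, so the ids are still available here
    return pd.DataFrame({id_col: frame[id_col].to_numpy(), "prediction": preds})
```

Assuming `rand` was fitted on the same feature columns, `predict_with_ids(rand, test_new)` would return one row per test id with its predicted class.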