Asked by: Bhumit  Asked: 10/16/2023  Last edited by: desertnaut, Bhumit  Updated: 10/16/2023  Views: 37
I am trying to impute numerical and categorical values and convert the categorical values with OneHotEncoder using ColumnTransformer, but it's not working
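Q:
I am trying to fill a DataFrame with an imputer and then apply one-hot encoding to the categorical values. When I apply any algorithm to the transformed values, it throws the error shown below the code. If I perform the same tasks separately, without ColumnTransformer, it works fine. What am I doing wrong?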
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
# Define the columns for different imputation strategies
mean_col = ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustSpeed', 'WindSpeed9am', 'WindSpeed3pm', 'Humidity9am', 'Humidity3pm', 'Pressure9am', 'Pressure3pm', 'Temp9am', 'Temp3pm']
median_col = ['Cloud9am', 'Cloud3pm']
mf_col = ['WindGustDir', 'WindDir9am', 'WindDir3pm', 'RainToday']
ohm_cols = ['Location','WindGustDir','WindDir9am','WindDir3pm','RainToday']
# Create a ColumnTransformer
preprocessor = ColumnTransformer(
    transformers=[
        ('trnf1', SimpleImputer(strategy='mean'), mean_col),
        ('trnf2', SimpleImputer(strategy='median'), median_col),
        ('trnf3', SimpleImputer(strategy='most_frequent'), mf_col),
        ('trnf4', OneHotEncoder(drop='first', sparse=False), ohm_cols)
    ],
    remainder='drop'  # Drop columns not specified in transformers
)
# Fit and transform the training data
x_train_transformed = preprocessor.fit_transform(x_train)
# Transform the testing data
x_test_transformed = preprocessor.transform(x_test)
Error:
ValueError: could not convert string to float: 'SSW'
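A likely reading of this error, sketched on hypothetical toy data (only the column names are taken from the question): ColumnTransformer runs each transformer on the original input columns and places the results side by side, so the transformers do not feed into one another. The output of the 'most_frequent' imputer is therefore still made of strings such as 'SSW', and those strings sit in the final array next to the numeric columns. Any estimator that casts the array to float would then fail in the same way.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer

# Hypothetical toy frame standing in for x_train.
toy = pd.DataFrame({
    "MinTemp": [10.0, np.nan, 14.0],
    "WindGustDir": ["SSW", np.nan, "N"],
})

# Two imputers applied side by side, as in the question (encoder left out on purpose).
ct = ColumnTransformer([
    ("num", SimpleImputer(strategy="mean"), ["MinTemp"]),
    ("cat", SimpleImputer(strategy="most_frequent"), ["WindGustDir"]),
])
out = ct.fit_transform(toy)

# The imputed string column is passed through unchanged, so the result
# is an object array that still holds the wind-direction labels.
print(out.dtype)
print(out)

# Casting to float, which most estimators do internally, reproduces the failure.
try:
    out.astype(float)
    conversion_error = None
except ValueError as exc:
    conversion_error = str(exc)
print(conversion_error)
```

The same reasoning suggests that the OneHotEncoder in the question receives the raw columns, not the imputed ones, since it also runs on the original input.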
The code below performs the same tasks step by step and works fine:
im = SimpleImputer(strategy='mean')
x_train[mean_col] = im.fit_transform(x_train[mean_col])
x_test[mean_col] = im.transform(x_test[mean_col])
im_median = SimpleImputer(strategy='median')
x_train[median_col] = im_median.fit_transform(x_train[median_col])
x_test[median_col] = im_median.transform(x_test[median_col])
im_mf = SimpleImputer(strategy='most_frequent')
x_train[mf_col] = im_mf.fit_transform(x_train[mf_col])
x_test[mf_col] = im_mf.transform(x_test[mf_col])
ohm = OneHotEncoder(drop='first', sparse=False)
x_train_transformed = ohm.fit_transform(x_train1[ohm_cols])
x_test_transformed = ohm.transform(x_test1[ohm_cols])
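One possible way to make the ColumnTransformer version behave like the step-by-step one, assuming the intent is to impute the categorical columns and then encode the imputed values, is to chain the two steps in a Pipeline and hand that pipeline to the ColumnTransformer. The sketch below uses hypothetical toy data and a shortened column list. It also assumes that it is acceptable to impute Location together with the other categorical columns, and it uses sparse_threshold=0 on the ColumnTransformer to get a dense array, which sidesteps the `sparse` argument of OneHotEncoder (renamed to `sparse_output` in newer scikit-learn releases).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for x_train (column names borrowed from the question).
toy = pd.DataFrame({
    "MinTemp": [10.0, np.nan, 14.0, 12.0],
    "Cloud9am": [1.0, 7.0, np.nan, 3.0],
    "WindGustDir": ["SSW", np.nan, "N", "SSW"],
    "Location": ["Albury", "Sydney", "Albury", "Sydney"],
})

# Categorical columns: impute first, then one-hot encode the imputed values.
categorical = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(drop="first")),
])

preprocessor = ColumnTransformer(
    transformers=[
        ("mean", SimpleImputer(strategy="mean"), ["MinTemp"]),
        ("median", SimpleImputer(strategy="median"), ["Cloud9am"]),
        ("cat", categorical, ["WindGustDir", "Location"]),
    ],
    remainder="drop",
    sparse_threshold=0,  # always return a dense array
)

toy_transformed = preprocessor.fit_transform(toy)
print(toy_transformed.dtype, toy_transformed.shape)
```

Each column appears in exactly one transformer here, so no raw string column is left in the output and the result can be passed to an estimator as a numeric array.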
A: No answers yet
Comments
x_train
x_test