What is the optimal way to turn a category column into a set of columns for a linear regression algorithm?

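Question:

I am trying to turn a category column into a set of columns, one for each unique value in the original column, with a boolean indicating which category applies to each row.

My latest attempt involves this user-defined function: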

def cath_column (df, col):
    u_values = np.sort(df[col].unique())
    
    for v in u_values: 
        df[col+'.'+str(v)] = df[col] == v
    
    return df.copy()

It seems to work in a reasonable time, but pandas emits a SettingWithCopyWarning. The full warning reads:

/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

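I have read the documentation, but I am not sure it applies to this case, since I am not modifying values in the DataFrame but adding columns. I also tried adding the columns another way, using pandas.concat with axis=1, but its performance was very poor.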
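
A note on the two points above: this warning usually means that `df` itself was produced by slicing another DataFrame, so adding a column to it trips the check even though no existing value is modified. The sketch below assumes that is the situation here (the question does not show how `df` was created). It takes an explicit copy first and builds all indicator columns before a single `pandas.concat` call, rather than concatenating once per value. The names `indicator_columns`, `raw`, `subset`, `color`, and `price` are hypothetical.

```python
import pandas as pd

def indicator_columns(df, col):
    """Return a new frame with one boolean column per unique value of `col`."""
    out = df.copy()  # explicit copy: the caller's frame (possibly a slice) is untouched
    values = sorted(out[col].unique())
    # Build every indicator Series first, then concatenate once.
    indicators = {f"{col}.{v}": out[col] == v for v in values}
    return pd.concat([out, pd.DataFrame(indicators, index=out.index)], axis=1)

# Hypothetical data: `subset` is a slice of `raw`, the case assumed above.
raw = pd.DataFrame({"color": ["red", "blue", "red", "green"],
                    "price": [1.0, 2.0, 3.0, 4.0]})
subset = raw[raw["price"] > 1.0]
result = indicator_columns(subset, "color")
```

Because the function never assigns into the frame it was given, the caller's `subset` keeps only its original columns.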

Is there a better way to do this, or is it safe to ignore the SettingWithCopyWarning?
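
One commonly used built-in for this is `pandas.get_dummies`, which returns a new DataFrame instead of adding columns in place. The sketch below uses a small made-up frame (`df`, `color`, and `price` are hypothetical) and passes `prefix_sep="."` to match the `col.value` naming in the function above. `drop_first=True` drops one level per category, which is often done for linear regression with an intercept so the indicator columns are not perfectly collinear. Whether that is wanted here is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red", "green"],
                   "price": [1.0, 2.0, 3.0, 4.0]})

# One indicator column per unique value, named like color.red
encoded = pd.get_dummies(df, columns=["color"], prefix_sep=".")

# Same encoding with the first level (here "blue") dropped
encoded_reduced = pd.get_dummies(df, columns=["color"], prefix_sep=".",
                                 drop_first=True)
```

The dtype of the indicator columns depends on the pandas version (boolean in recent releases, integer in older ones), so cast explicitly if the regression code needs a specific type.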

Tags: pandas, settingwithcopywarning
