How to iterate over several columns in a data frame and assign a new column value based on info in several others [duplicate]

Asked by: pyphan  Asked: 10/28/2023  Last edited by: Panda Kim, pyphan  Updated: 10/28/2023  Views: 29

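Q:

Closed 27 days ago.

Newbie here, but very enthusiastic. I have a lot of survey data, and I need to derive a value for a newly created column in a DataFrame based on where a 0 or 1 appears in several other columns.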

import pandas as pd

dftest = pd.DataFrame(
    {'connection_0': ['0', '1', '0', '0', '1'],
     'connection_1': ['1', '0', '1', '1', '0']},
    index=['819', '820', '821', '822', '823'])

So this is what it looks like:

    connection_0 connection_1
819           0            1
820           1            0
821           0            1
822           0            1
823           1            0

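I want to add a column 'connect' in front, and set its value to 'none' if connection_0 is '1', or to 'patient' if connection_1 is '1'. I actually have more than two of these columns (connection_2 etc.), but I can deal with that later. I also have thousands of rows of data, so this is just a test DataFrame.

This is what I want: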

    connect connection_0 connection_1
819    patient         0            1
820    none            1            0
821    patient         0            1
822    patient         0            1
823    none            1            0

Here is my attempt. It sort of works, but it sets every 'connect' to 'none':

for index, row in dftest.iterrows():
    if row[0]=="1": row.connect=row.connect.replace('1', 'none')
    else:
        if row[0]=="0" & row[1]=="1":row.connect=row.connect.replace('1', 'patient')

But I get:

        connect connection_0 connection_1
819    none            0            1
820    none            1            0
821    none            0            1
822    none            0            1
823    none            1            0

I know I'm doing something really silly, so any help would be greatly appreciated!

python-3.x pandas dataframe

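A:

0 votes  Panda Kim  10/28/2023  #1

Code

If you need to apply a similar operation to more than two columns, np.select seems to be the best approach. You can increase the number of columns it handles by adding more cond# conditions.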

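import numpy as np
import pandas as pd
# One boolean condition per column; the first True condition decides the label
cond1 = dftest['connection_0'].eq('1')
cond2 = dftest['connection_1'].eq('1')
# Explicit string default (an addition to the original answer): some newer NumPy
# versions refuse to mix string choices with the implicit integer default 0
arr = np.select([cond1, cond2], ['none', 'patient'], default='')
out = pd.DataFrame(arr, columns=['connect'], index=dftest.index).join(dftest)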

out:

    connect connection_0    connection_1
819 patient 0               1
820 none    1               0
821 patient 0               1
822 patient 0               1
823 none    1               0
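To illustrate the remark about adding more conditions, here is a minimal sketch with a third column. Note that connection_2, the label 'caregiver' and the fallback 'unknown' are all made up for the example and are not from the question; swap in your real column names and labels.

```python
import numpy as np
import pandas as pd

# Hypothetical data: connection_2 and the label 'caregiver' are invented for illustration.
df = pd.DataFrame(
    {'connection_0': ['0', '1', '0', '0'],
     'connection_1': ['1', '0', '0', '0'],
     'connection_2': ['0', '0', '1', '0']},
    index=['819', '820', '821', '822'])

# One condition per column, listed in the same order as the labels.
conds = [df['connection_0'].eq('1'),
         df['connection_1'].eq('1'),
         df['connection_2'].eq('1')]
labels = ['none', 'patient', 'caregiver']

# The first matching condition wins; rows matching nothing get the string default.
connect = np.select(conds, labels, default='unknown')

# Put the new column in front, as asked in the question.
result = df.copy()
result.insert(0, 'connect', connect)
print(result)
```

Because np.select checks the conditions in order, a row with '1' in more than one column gets the label of the earliest column in the list.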

To make it more concise:

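import numpy as np
import pandas as pd
# Each row of the transposed boolean array is the condition for one column
arr = np.select(dftest.eq('1').T.values, ['none', 'patient'], default='')
out = pd.DataFrame(arr, columns=['connect'], index=dftest.index).join(dftest)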
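One thing to watch with the concise form: it compares every column of the frame with '1', so it assumes the frame holds only the connection columns. Real survey data probably has other fields too. A possible sketch, assuming the flag columns are all named connection_<number> (the 'age' column below is invented just to stand in for other survey fields):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the 'age' column is invented to stand in for other survey fields.
df = pd.DataFrame(
    {'age': ['1', '34', '51'],
     'connection_0': ['0', '1', '0'],
     'connection_1': ['1', '0', '1']},
    index=['819', '820', '821'])

# Keep only columns named connection_<number>, in their existing order.
flags = df.filter(regex=r'^connection_\d+$').eq('1')

# One boolean array per column, matched positionally with the labels.
conds = [flags[col].to_numpy() for col in flags.columns]
connect = np.select(conds, ['none', 'patient'], default='')
print(connect)
```

The labels are matched to the columns by position, so the list of labels has to follow the column order of the filtered frame.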

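If every row contains a '1' in one of the columns, you can also do it the following way.

Let's make a dictionary that maps each column to the value to assign when that column holds '1', and assign it to m.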

m = {'connection_0':'none', 'connection_1':'patient'}
dftest.eq('1').idxmax(axis=1).map(m).to_frame('connect').join(dftest)
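A caveat with the idxmax version: for a row with no '1' at all, idxmax still returns the first column label, so that row would silently get 'none'. If your data can contain such rows, one option is to mask them out. This is only a sketch, and the all-zero row below is a made-up case that is not in the question's data:

```python
import pandas as pd

df = pd.DataFrame(
    {'connection_0': ['0', '1', '0'],
     'connection_1': ['1', '0', '0']},   # the last row has no '1' (made-up case)
    index=['819', '820', '821'])

m = {'connection_0': 'none', 'connection_1': 'patient'}
flags = df.eq('1')

# idxmax returns the first column label for an all-False row, so mask those rows out.
connect = flags.idxmax(axis=1).map(m).where(flags.any(axis=1))
out = connect.to_frame('connect').join(df)
print(out)
```

Rows without any '1' end up with a missing value in 'connect' instead of a misleading label, and you can fill them afterwards with whatever makes sense for your survey.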

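0 votes  pyphan  10/28/2023

Perfect! Thank you so much, Panda Kim! It works like a charm with another condition added!

0 votes  Panda Kim  10/28/2023

@pyphan I'm glad it's solved, but then you should accept an answer for the next person.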
