Asked by: gregV  Asked: 8/22/2023  Updated: 8/23/2023  Views: 54
pandas: assign a column values by slice with method chaining
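Q:
In the toy example below, I am trying to add a status column based on the result of an outer merge. The challenge is to keep the method-chaining style best described in Tom's blog. The commented-out line is my attempt, but it does not work.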
import pandas as pd

# Create sample data frames A and B
A = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value': [1, 2, 3, 4]
})
B = pd.DataFrame({
    'key': ['C', 'D', 'E', 'F'],
    'value': [3, 4, 5, 6]
})

# Merge data frames A and B on the 'key' column and add an indicator column
merged = pd.merge(A, B, on='key', how='outer', indicator=True)

# add a status column
#{'both':'no change',
#'left_only': 'added',
#'right_only': 'removed'}
merged = (merged
    .assign(status = 'no change')
    #.assign(status = lambda x: x.loc[x._merge == 'left_only'], 'added')
    .drop('_merge', axis=1)
)
Answers:
2 votes
sammywemmy
8/22/2023
#1
Something like this, using map, should suffice. In general, because you are assigning to a slice, you need a conditional (np.where, np.select, Series.where, etc.):
(A
 .merge(B, on='key', how='outer', indicator=True)
 .assign(status = lambda f: f._merge.map({"left_only": "added",
                                          "both": "no change",
                                          "right_only": "removed"}))
)
1 vote
taller
8/22/2023
#2
Use DataFrame.apply to get the status.
merged = (merged
    .assign(status = merged.apply(lambda x:
        'added' if x._merge == "left_only" else "", axis=1))
    .drop('_merge', axis=1)
)
  key  value_x  value_y status
0   A      1.0      NaN  added
1   B      2.0      NaN  added
2   C      3.0      3.0
3   D      4.0      4.0
4   E      NaN      5.0
5   F      NaN      6.0
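As a possible extension of the apply approach, the sketch below passes a callable to assign so the chain does not refer back to an outer merged variable, and looks up all three labels from the dictionary given in the question's comment. The names labels and result are hypothetical, introduced only for this example.

```python
import pandas as pd

A = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": [1, 2, 3, 4]})
B = pd.DataFrame({"key": ["C", "D", "E", "F"], "value": [3, 4, 5, 6]})

# Mapping taken from the comment in the question.
labels = {"left_only": "added", "both": "no change", "right_only": "removed"}

result = (
    pd.merge(A, B, on="key", how="outer", indicator=True)
    # The outer lambda receives the merged frame; the inner one runs per row.
    .assign(status=lambda df: df.apply(lambda row: labels[row["_merge"]], axis=1))
    .drop("_merge", axis=1)
)
print(result)
```

Row-wise apply is usually slower than map or np.select on large frames, so this form is mostly useful when the per-row logic is too awkward to vectorize.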
1 vote
Scott Boston
8/22/2023
#3
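Here is a way to do it in one statement, using the "walrus" operator (:=), map with a predefined dictionary, and a string passed as the indicator argument of merge to set the column name: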
import pandas as pd

# Create sample data frames A and B
A = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value': [1, 2, 3, 4]
})
B = pd.DataFrame({
    'key': ['C', 'D', 'E', 'F'],
    'value': [3, 4, 5, 6]
})

d = {'both': 'no_change',
     'left_only': 'added',
     'right_only': 'removed'}

merged = (merged_out := pd.merge(A, B, on='key', how='outer', indicator='status'))\
    .assign(status=merged_out['status'].map(d))
merged
Output:
  key  value_x  value_y     status
0   A      1.0      NaN      added
1   B      2.0      NaN      added
2   C      3.0      3.0  no_change
3   D      4.0      4.0  no_change
4   E      NaN      5.0    removed
5   F      NaN      6.0    removed
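If the walrus operator feels heavy, a variation that may work is to keep indicator='status' and relabel the resulting categorical column inside assign. This is a sketch rather than part of the answer above; it assumes the indicator column is categorical (which is what merge produces) and uses Series.cat.rename_categories with the same dictionary d. The name result is made up for the example.

```python
import pandas as pd

A = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": [1, 2, 3, 4]})
B = pd.DataFrame({"key": ["C", "D", "E", "F"], "value": [3, 4, 5, 6]})

d = {"both": "no_change", "left_only": "added", "right_only": "removed"}

result = (
    pd.merge(A, B, on="key", how="outer", indicator="status")
    # The callable sees the merged frame, so no temporary name is needed.
    .assign(status=lambda df: df["status"].cat.rename_categories(d))
)
print(result)
```

The status column stays categorical here; append .astype(str) to the renamed column if plain strings are preferred.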