Asked by: Romeo Gherasim Asked: 4/27/2023 Updated: 4/27/2023 Views: 31
Concatenate rows in a pandas DataFrame (new version)
Q:
I have the following table.
import pandas as pd
# Define the input data
data = {
    'ID': [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3],
    'count': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,1,1,1,1,2,2,1,1,1,1,2],
    'priority': [1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,4,3,1,2,3,4,4],
    'item': ['A','B','C','D','A','B','C','D','A','B','C','D','A','B','C','D','A','B','C','D','D','C','A','B','C','D','D'],
    'c': ['XX','XX','XX','XX','YY-SS','YY','YY','YY','YY-SS','YY','YY','YY','XX','XX','XX','XX','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','TT-SS','ZZ','ZZ','ZZ','ZZ']
}
# Convert the input data to a Pandas DataFrame
df = pd.DataFrame(data)
If you have any ideas, please share. Thanks!
A:
2 votes
mozway
4/27/2023
#1
You can use a custom groupby.agg:
out = (df
       .sort_values(by='priority')  # optional
       .groupby(['ID', 'count'], as_index=False)
       .agg({'item': '-'.join, 'c': 'first'})
       .assign(FINAL=lambda d: d.pop('item')+'-'+d.pop('c'))
       .drop(columns='count')
      )
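Output (as posted, which matches the code without the optional sort_values step; with the sort, row 5 becomes C-D-ZZ):
   ID          FINAL
0   1     A-B-C-D-XX
1   1  A-B-C-D-YY-SS
2   1  A-B-C-D-YY-SS
3   1     A-B-C-D-XX
4   2     A-B-C-D-ZZ
5   2         D-C-ZZ
6   3  A-B-C-D-TT-SS
7   3         D-ZZ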
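To make the effect of the optional sort easier to see, here is a minimal sketch of the same idea wrapped in a function. The names build_final, sort_by_priority, and sample are made up for illustration and are not part of the answer above. It uses named aggregation in place of the dict plus pop, which should be equivalent here. The sample is only the ID 2 rows from the question.

```python
import pandas as pd

def build_final(df, sort_by_priority=True):
    """Join the items of each (ID, count) group and append the group's first 'c'.

    Hypothetical helper for illustration; not part of the original answer.
    """
    if sort_by_priority:
        # A stable sort keeps the original row order among equal priorities
        df = df.sort_values(by='priority', kind='stable')
    out = df.groupby(['ID', 'count'], as_index=False).agg(
        items=('item', '-'.join),   # e.g. 'A-B-C-D'
        first_c=('c', 'first'),     # first 'c' value in the group
    )
    out['FINAL'] = out['items'] + '-' + out['first_c']
    return out[['ID', 'FINAL']]

# Only the ID 2 rows from the question, to keep the example short
sample = pd.DataFrame({
    'ID':       [2, 2, 2, 2, 2, 2],
    'count':    [1, 1, 1, 1, 2, 2],
    'priority': [1, 2, 3, 4, 4, 3],
    'item':     ['A', 'B', 'C', 'D', 'D', 'C'],
    'c':        ['ZZ'] * 6,
})

print(build_final(sample))                          # second row: C-D-ZZ
print(build_final(sample, sort_by_priority=False))  # second row: D-C-ZZ
```

Whether to sort depends on whether the items should follow the priority column or the original row order. The second group (count 2) is the only one in this sample where the two differ.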