Asked by: test_zuck Asked: 10/19/2023 Last edited by: test_zuck Updated: 10/19/2023 Views: 35
Pandas increment DataFrame column based on condition
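Q:
I have the following DataFrame and want to increment the date value under a specific condition: if several rows share the same date, each repeated row should get the next date after the previous one; otherwise the date stays unchanged. Input DataFrame: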
dates |
---|
7/24/2023 |
7/24/2023 |
7/23/2023 |
7/23/2023 |
7/23/2023 |
7/17/2023 |
7/17/2023 |
7/17/2023 |
The expected output is:
dates | new_date |
---|---|
7/24/2023 | 7/24/2023 |
7/24/2023 | 7/25/2023 |
7/23/2023 | 7/23/2023 |
7/23/2023 | 7/24/2023 |
7/23/2023 | 7/25/2023 |
7/17/2023 | 7/17/2023 |
7/17/2023 | 7/18/2023 |
7/17/2023 | 7/19/2023 |
A:
1 upvote
mozway
10/19/2023
#1
You can use groupby.cumcount and to_timedelta with a daily unit (D), then add the result to the original dates:
df['dates'] = pd.to_datetime(df['dates'])
df['new_date'] = df['dates'].add(pd.to_timedelta(df.groupby('dates').cumcount(), unit='D'))
Output:
dates new_date
0 2023-07-24 2023-07-24
1 2023-07-24 2023-07-25
2 2023-07-23 2023-07-23
3 2023-07-23 2023-07-24
4 2023-07-23 2023-07-25
5 2023-07-17 2023-07-17
6 2023-07-17 2023-07-18
7 2023-07-17 2023-07-19
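To make the mechanics easier to follow, here is a self-contained sketch that rebuilds the input from the question and prints the intermediate counter. The variable name `offset` and the explicit `format` argument are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Input data copied from the question; the column name 'dates' follows the answer's code.
df = pd.DataFrame({
    "dates": ["7/24/2023", "7/24/2023", "7/23/2023", "7/23/2023",
              "7/23/2023", "7/17/2023", "7/17/2023", "7/17/2023"]
})
df["dates"] = pd.to_datetime(df["dates"], format="%m/%d/%Y")

# Position of each row within its group of equal dates: 0, 1, 2, ...
offset = df.groupby("dates").cumcount()

# Interpret the counter as a number of days and add it to the original dates.
df["new_date"] = df["dates"] + pd.to_timedelta(offset, unit="D")

print(offset.tolist())  # [0, 1, 0, 1, 2, 0, 1, 2]
print(df)
```

The first row of each date gets an offset of 0, so it keeps its original value; every further duplicate moves one more day forward.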
1 upvote
jezrael
10/19/2023
#2
Convert the column to datetimes, create a per-group counter with GroupBy.cumcount, convert it to timedeltas with to_timedelta, and add it to dates:
df['dates'] = pd.to_datetime(df['dates'])
df['new_date'] = df['dates'] + pd.to_timedelta(df.groupby('dates').cumcount(), unit='D')
print(df)
dates new_date
0 2023-07-24 2023-07-24
1 2023-07-24 2023-07-25
2 2023-07-23 2023-07-23
3 2023-07-23 2023-07-24
4 2023-07-23 2023-07-25
5 2023-07-17 2023-07-17
6 2023-07-17 2023-07-18
7 2023-07-17 2023-07-19
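One caveat worth noting: grouping by the date value counts every occurrence of a date, even if the same date shows up again later in a separate block of rows. If the counter should restart for each consecutive run instead (an assumption; the question's data never repeats a date in separate runs), a possible sketch is to group by a run identifier. The sample data and the name `run_id` below are hypothetical.

```python
import pandas as pd

# Hypothetical data: 7/24/2023 appears in two separate runs.
df = pd.DataFrame({
    "dates": pd.to_datetime(
        ["7/24/2023", "7/24/2023", "7/23/2023", "7/24/2023", "7/24/2023"],
        format="%m/%d/%Y",
    )
})

# A new run starts whenever the date differs from the previous row.
run_id = df["dates"].ne(df["dates"].shift()).cumsum()

# Count within each consecutive run rather than within each distinct date.
offset = df.groupby(run_id).cumcount()
df["new_date"] = df["dates"] + pd.to_timedelta(offset, unit="D")

print(offset.tolist())                         # [0, 1, 0, 0, 1]
print(df.groupby("dates").cumcount().tolist()) # [0, 1, 0, 2, 3]
```

The second printed list shows what plain grouping by `dates` would produce for the same data, so the difference between the two approaches is visible.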
If you need the original string format, add Series.dt.strftime:
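# Add the per-group day offsets, then format the result back to strings.
dates = pd.to_datetime(df['dates']) + pd.to_timedelta(df.groupby('dates').cumcount(), unit='D')
# '%#m' drops the leading zero on Windows; on Linux/macOS use '%-m' instead.
df['new_date'] = dates.dt.strftime('%#m/%d/%Y')
print(df)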
dates new_date
0 7/24/2023 7/24/2023
1 7/24/2023 7/25/2023
2 7/23/2023 7/23/2023
3 7/23/2023 7/24/2023
4 7/23/2023 7/25/2023
5 7/17/2023 7/17/2023
6 7/17/2023 7/18/2023
7 7/17/2023 7/19/2023
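The flags that suppress zero padding in strftime differ between platforms, so a format string that works on one system may not work on another. A platform-independent sketch is to build the string from the integer date components. The sample values and the name `formatted` are illustrative, and this version also drops the leading zero from the day, which may or may not be what you want.

```python
import pandas as pd

# Illustrative datetimes; the last one has a single-digit month and day.
new = pd.to_datetime(pd.Series(["2023-07-24", "2023-07-25", "2023-08-01"]))

# Assemble M/D/YYYY from integer components, with no strftime flags involved.
formatted = (
    new.dt.month.astype(str) + "/"
    + new.dt.day.astype(str) + "/"
    + new.dt.year.astype(str)
)

print(formatted.tolist())  # ['7/24/2023', '7/25/2023', '8/1/2023']
```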