Pandas increment Dataframe column based on condition

Asked by: test_zuck  Asked: 10/19/2023  Last edited by: test_zuck  Updated: 10/19/2023  Views: 35

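Q:

I have the following DataFrame, and I want to increment the date value for all rows under a specific condition: if rows share the same date, each repeated row gets the next date (one day later than the previous repeat); otherwise the date stays the same.

Input DataFrame:

dates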
7/24/2023
7/24/2023
7/23/2023
7/23/2023
7/23/2023
7/17/2023
7/17/2023
7/17/2023

The expected output is:

dates new_date
7/24/2023 7/24/2023
7/24/2023 7/25/2023
7/23/2023 7/23/2023
7/23/2023 7/24/2023
7/23/2023 7/25/2023
7/17/2023 7/17/2023
7/17/2023 7/18/2023
7/17/2023 7/19/2023

python-3.x pandas dataframe datetime

A:

1 upvote mozway 10/19/2023 #1

You can use groupby.cumcount and to_timedelta with a daily frequency (D), and add the result to the original date:

df['dates'] = pd.to_datetime(df['dates'])

df['new_date'] = df['dates'].add(pd.to_timedelta(df.groupby('dates').cumcount(), unit='D'))

Output:

       dates   new_date
0 2023-07-24 2023-07-24
1 2023-07-24 2023-07-25
2 2023-07-23 2023-07-23
3 2023-07-23 2023-07-24
4 2023-07-23 2023-07-25
5 2023-07-17 2023-07-17
6 2023-07-17 2023-07-18
7 2023-07-17 2023-07-19
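
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the column is named dates (as in the snippet above) and rebuilds the question's sample data by hand; the intermediate name offset is only for illustration.

```python
import pandas as pd

# Sample data copied from the question; the column name 'dates' is assumed.
df = pd.DataFrame({
    "dates": [
        "7/24/2023", "7/24/2023",
        "7/23/2023", "7/23/2023", "7/23/2023",
        "7/17/2023", "7/17/2023", "7/17/2023",
    ]
})

# Parse the month/day/year strings into datetime64 values.
df["dates"] = pd.to_datetime(df["dates"])

# cumcount() numbers the rows of each group 0, 1, 2, ... in order of appearance.
offset = df.groupby("dates").cumcount()

# Turn the counter into a number of days and add it to the original date.
df["new_date"] = df["dates"] + pd.to_timedelta(offset, unit="D")

print(df)
```

The first row of each date gets an offset of 0 days, so it keeps its original value; every further repeat moves one more day forward.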

1 upvote jezrael 10/19/2023 #2

Convert the column to datetimes, create a counter with GroupBy.cumcount, convert it to timedeltas with to_timedelta, and add it to the dates:

df['dates'] = pd.to_datetime(df['dates'])
df['new_date'] = df['dates'] + pd.to_timedelta(df.groupby('dates').cumcount(), unit='D')

print(df)
       dates   new_date
0 2023-07-24 2023-07-24
1 2023-07-24 2023-07-25
2 2023-07-23 2023-07-23
3 2023-07-23 2023-07-24
4 2023-07-23 2023-07-25
5 2023-07-17 2023-07-17
6 2023-07-17 2023-07-18
7 2023-07-17 2023-07-19
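
One thing to check against your real data: groupby('dates') counts every occurrence of a date in the whole frame, not only adjacent ones. If the same date can reappear later in a separate block and the counter should restart there, one possible variant is to group on a run identifier instead. This is only a sketch under that assumption; the name run_id and the sample data below are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data: 7/24 appears in two separate runs.
df = pd.DataFrame({"dates": pd.to_datetime(
    ["7/24/2023", "7/24/2023", "7/23/2023", "7/24/2023", "7/24/2023"]
)})

# A new run starts wherever the date differs from the previous row,
# so the cumulative sum of those change points labels each run.
run_id = df["dates"].ne(df["dates"].shift()).cumsum()

# The counter restarts at 0 at the beginning of every run.
df["new_date"] = df["dates"] + pd.to_timedelta(
    df.groupby(run_id).cumcount(), unit="D"
)

print(df)
```

With plain groupby('dates') the last two rows would continue the count from the first run (offsets 2 and 3) rather than starting over.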

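If you need the original format, add Series.dt.strftime:

# Leave df['dates'] as strings and build the shifted dates in a temporary Series.
dates = pd.to_datetime(df['dates']) + pd.to_timedelta(df.groupby('dates').cumcount(), unit='D')
# '%#m' drops the month's leading zero on Windows; on Linux/macOS use '%-m'.
df['new_date'] = dates.dt.strftime('%#m/%d/%Y')
print(df)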
       dates   new_date
0  7/24/2023  7/24/2023
1  7/24/2023  7/25/2023
2  7/23/2023  7/23/2023
3  7/23/2023  7/24/2023
4  7/23/2023  7/25/2023
5  7/17/2023  7/17/2023
6  7/17/2023  7/18/2023
7  7/17/2023  7/19/2023
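
Because the strftime flag for dropping a leading zero is platform-specific ('%#m' on Windows, '%-m' on Linux and macOS), code that has to run on both may prefer to build the string from the datetime components. The following is a sketch of that alternative, not the method used above; the names parsed and shifted and the sample dates (chosen to include a single-digit day) are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample with a single-digit day to show the unpadded output.
df = pd.DataFrame({"dates": ["7/24/2023", "7/24/2023", "7/5/2023", "7/5/2023"]})

# Parse the strings, then shift each repeat by its position within the group.
parsed = pd.to_datetime(df["dates"])
shifted = parsed + pd.to_timedelta(df.groupby("dates").cumcount(), unit="D")

# Assemble month/day/year by hand so no platform-specific flag is needed.
df["new_date"] = (
    shifted.dt.month.astype(str) + "/"
    + shifted.dt.day.astype(str) + "/"
    + shifted.dt.year.astype(str)
)

print(df)
```

Unlike '%d', this also leaves the day unpadded, which matches the style of the input strings when a date such as 7/5/2023 occurs.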