Subset Dataframe based on the Timestamp

Question:

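I'm new to Python, so I apologize if this is a silly question, but I can't figure out how to subset my dataframe based on the most recently started CASE ID so that all of that case's rows are kept.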

ID	CASE ID	CHANGE_TMS	STATUS
00000001	Case 1	15:12:10	Start
00000001	Case 1	15:14:45	End
00000001	Case 2	17:29:23	Start
00000001	Case 2	18:30:23	End

I am looking for the following result:

ID	CASE ID	CHANGE_TMS	STATUS
00000001	Case 2	17:29:23	Start
00000001	Case 2	18:30:23	End

This is what I have managed so far, but I'm not sure whether I'm using the index correctly:

import pandas as pd
file_path = r'C:\path'

# Read the csv file and extract the specified columns
columns_to_keep=['CUSTOMER_ID', 'CASE_ID', 'CHANGE_TMS','STATUS']
df = pd.read_csv(file_path, usecols=columns_to_keep)

# Convert the column to datetime type
df['CHANGE_TMS'] = pd.to_datetime(df['CHANGE_TMS'])

# Set the index
df.set_index('CASE_ID', inplace=True)

# Sort by customer and timestamp
df.sort_values(by=['CUSTOMER_ID', 'CHANGE_TMS'], inplace=True)
last_case_indices = df.groupby('CASE_ID')['CHANGE_TMS'].first()

print(last_case_indices)
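
For reference, here is a minimal sketch of one way the desired result could be produced. It is not a confirmed solution. It assumes the column names from the code above, that the status values are the literal strings `Start` and `End`, and that "latest" is meant per `CUSTOMER_ID`. The sample rows are inlined through `io.StringIO` in place of the real CSV file, and `starts`, `latest_idx`, `latest_cases` and `result` are illustrative names.

```python
import io

import pandas as pd

# Sample rows from the question, inlined instead of reading the real CSV file.
csv_text = """CUSTOMER_ID,CASE_ID,CHANGE_TMS,STATUS
00000001,Case 1,15:12:10,Start
00000001,Case 1,15:14:45,End
00000001,Case 2,17:29:23,Start
00000001,Case 2,18:30:23,End
"""
# Read CUSTOMER_ID as text so the leading zeros are preserved.
df = pd.read_csv(io.StringIO(csv_text), dtype={"CUSTOMER_ID": str})

# The sample holds time-only strings, so parse them with an explicit format
# (pandas fills in a default date of 1900-01-01).
df["CHANGE_TMS"] = pd.to_datetime(df["CHANGE_TMS"], format="%H:%M:%S")

# Assumption: a case begins at the row whose STATUS is exactly "Start".
starts = df[df["STATUS"] == "Start"]

# Index label of the latest "Start" row for each customer.
latest_idx = starts.groupby("CUSTOMER_ID")["CHANGE_TMS"].idxmax()

# The (customer, case) pairs that own those latest "Start" rows.
latest_cases = df.loc[latest_idx, ["CUSTOMER_ID", "CASE_ID"]]

# An inner merge keeps every row of those cases, "Start" and "End" alike.
result = df.merge(latest_cases, on=["CUSTOMER_ID", "CASE_ID"], how="inner")

print(result)
```

This sketch leaves `CASE_ID` as an ordinary column rather than setting it as the index, since the selection only needs it as a merge key.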

python pandas dataframe datetime indexing
