Asked by: Nabi Shaikh Asked: 4/14/2022 Last edited by: Nabi Shaikh Updated: 4/14/2022 Views: 24
Tagging similar categories with a repeated sequence of numbers in a pandas DataFrame
Q:
Below is the reproducible code:
import pandas as pd

colo = ['red', 'red', 'red', 'cross', 'cross', 'red', 'red', 'red', 'cross', 'cross', 'cross',
        'cross', 'cross', 'red', 'red', 'cross', 'red', 'cross', 'cross']
dt = pd.DataFrame()
dt['seq'] = list(range(len(colo)))
dt['col'] = colo
Expected output:
I need to create a new column, Expected_col, alongside seq and col.
A:
1 upvote
user7864386
4/14/2022
#1
Here is one way: use eq + diff + ne + cumsum to create the groups, then use boolean indexing to fill in the values:
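# Boolean mask of the 'red' rows
cond = dt['col'].eq('red')
# Among the red rows, a seq gap other than 1 starts a new run; cumsum numbers the runs 1, 2, 3, ...
s = dt.loc[cond, 'seq'].diff().ne(1).cumsum()
dt['Expected_col'] = dt['col']
# Reverse the numbering so that the first run gets the highest number
dt.loc[cond, 'Expected_col'] = 'RED' + (s.max() + 1 - s).astype(str)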
Output:
seq col Expected_col
0 0 red RED4
1 1 red RED4
2 2 red RED4
3 3 cross cross
4 4 cross cross
5 5 red RED3
6 6 red RED3
7 7 red RED3
8 8 cross cross
9 9 cross cross
10 10 cross cross
11 11 cross cross
12 12 cross cross
13 13 red RED2
14 14 red RED2
15 15 cross cross
16 16 red RED1
17 17 cross cross
18 18 cross cross
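To see why the eq + diff + ne + cumsum chain yields these labels, it can help to look at each intermediate step on its own. The following is a minimal, self-contained sketch of the same approach; the intermediate names (red_seq, gaps, starts, run_id, labels) are illustrative only and do not appear in the answer. It assumes, as in the question, that seq increases by 1 on every row, so consecutive red rows always differ by exactly 1.

```python
import pandas as pd

colo = ['red', 'red', 'red', 'cross', 'cross', 'red', 'red', 'red',
        'cross', 'cross', 'cross', 'cross', 'cross', 'red', 'red',
        'cross', 'red', 'cross', 'cross']
dt = pd.DataFrame({'seq': list(range(len(colo))), 'col': colo})

# Step 1 (eq): boolean mask that is True on the 'red' rows.
cond = dt['col'].eq('red')

# Step 2: seq values of the red rows only -> 0, 1, 2, 5, 6, 7, 13, 14, 16
red_seq = dt.loc[cond, 'seq']

# Step 3 (diff): distance to the previous red row -> NaN, 1, 1, 3, 1, 1, 6, 1, 2
gaps = red_seq.diff()

# Step 4 (ne): a gap other than 1 marks the first row of a new run.
# The leading NaN also compares as "not equal to 1", so the first run is flagged too.
starts = gaps.ne(1)

# Step 5 (cumsum): running count of run starts -> 1, 1, 1, 2, 2, 2, 3, 3, 4
run_id = starts.cumsum()

# Step 6: reverse the numbering (first run gets the highest number) and build the labels.
labels = 'RED' + (run_id.max() + 1 - run_id).astype(str)

# Step 7: boolean indexing writes the labels back onto the red rows only.
dt['Expected_col'] = dt['col']
dt.loc[cond, 'Expected_col'] = labels

print(dt)
```

Because labels keeps the original row index of the red rows, the assignment in step 7 aligns by index and leaves the cross rows untouched.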