pandas series mark all the rows between two values

Asked by: Cranjis Asked: 6/6/2023 Updated: 6/6/2023 Views: 56

Q:

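I have a series (a single column in a df) with 3 possible values:

Stable, Increase, Decrease

and I want to mark every region that runs from an "Increase" up to the next "Decrease". So for these values: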

Stable
Stable
Stable
Increase
Increase
Stable
Stable
Decrease
Stable
Increase
Stable
Decrease

I would get: -,-,-,+,+,+,+,-,-,+,+,-

What is the best way to do this?

Pandas DataFrame data science data wrangling

A:

1 vote PaulS 6/6/2023 #1

A possible solution:

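import numpy as np

# Keep only the Increase/Decrease markers, forward-fill them, and flag rows whose latest marker is 'Increase'.
np.where(s.where(s.eq('Increase') | s.eq('Decrease')).ffill().eq('Increase'),
         '+', '-')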

Output:

array(['-', '-', '-', '+', '+', '+', '+', '-', '-', '+', '+', '-'],
      dtype='<U1')
1 vote jezrael 6/6/2023 #2

Use Series.replace to replace only the Stable values with NaN, forward-fill the remaining values (here Increase and Decrease), then compare against Increase and set +/- with numpy.where:

df['new'] = np.where(df['col'].replace({'Stable': np.nan}).ffill().eq('Increase'), '+','-')
print (df)
         col new
0     Stable   -
1     Stable   -
2     Stable   -
3   Increase   +
4   Increase   +
5     Stable   +
6     Stable   +
7   Decrease   -
8     Stable   -
9   Increase   +
10    Stable   +
11  Decrease   -

Intermediate steps:

print (df.assign(repl=df['col'].replace({'Stable': np.nan}),
                 ffill=df['col'].replace({'Stable': np.nan}).ffill(),
                 comp=df['col'].replace({'Stable': np.nan}).ffill().eq('Increase'),
                 out=np.where(df['col'].replace({'Stable': np.nan}).ffill().eq('Increase'), '+','-')))
         col      repl     ffill   comp out
0     Stable       NaN       NaN  False   -
1     Stable       NaN       NaN  False   -
2     Stable       NaN       NaN  False   -
3   Increase  Increase  Increase   True   +
4   Increase  Increase  Increase   True   +
5     Stable       NaN  Increase   True   +
6     Stable       NaN  Increase   True   +
7   Decrease  Decrease  Decrease  False   -
8     Stable       NaN  Decrease  False   -
9   Increase  Increase  Increase   True   +
10    Stable       NaN  Increase   True   +
11  Decrease  Decrease  Decrease  False   -
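
The intermediate table above suggests the logic can be wrapped once and reused. The following is a rough sketch, not part of the original answer: `mark_between` and its parameters are hypothetical names, and it uses `Series.where` with `isin` rather than `replace`, so any label other than the start and stop labels is treated like Stable.

```python
import numpy as np
import pandas as pd


def mark_between(col, start="Increase", stop="Decrease", on="+", off="-"):
    """Mark rows from each `start` label up to, but not including, the next `stop`."""
    # Blank out everything that is neither a start nor a stop label.
    markers = col.where(col.isin([start, stop]))
    # After forward-filling, a row is inside a region if the latest marker was `start`.
    inside = markers.ffill().eq(start)
    return np.where(inside, on, off)


# Sample data copied from the question.
df = pd.DataFrame({"col": [
    "Stable", "Stable", "Stable", "Increase", "Increase", "Stable",
    "Stable", "Decrease", "Stable", "Increase", "Stable", "Decrease",
]})
df["new"] = mark_between(df["col"])
print(df)
```

Passing different `on` and `off` values (for example `True` and `False`) would give a boolean mask instead of the +/- strings.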
0 votes mozway 6/6/2023 #3

Map True/False onto Increase and Decrease, then ffill. Finally map to +/- with numpy.where:

s = df['col'].map({'Increase': True, 'Decrease': False}).ffill().fillna(False)
df['indicator'] = np.where(s, '+', '-')

As a one-liner:

df['indicator'] = np.where(df['col'].map({'Increase': True, 'Decrease': False})
                                    .ffill().fillna(False),
                           '+', '-')

Output:

         col indicator
0     Stable         -
1     Stable         -
2     Stable         -
3   Increase         +
4   Increase         +
5     Stable         +
6     Stable         +
7   Decrease         -
8     Stable         -
9   Increase         +
10    Stable         +
11  Decrease         -

Intermediates:

         col    map  ffill+fillna indicator
0     Stable    NaN         False         -
1     Stable    NaN         False         -
2     Stable    NaN         False         -
3   Increase   True          True         +
4   Increase   True          True         +
5     Stable    NaN          True         +
6     Stable    NaN          True         +
7   Decrease  False         False         -
8     Stable    NaN         False         -
9   Increase   True          True         +
10    Stable    NaN          True         +
11  Decrease  False         False         -
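
As the "map" column above shows, the mapped series mixes True, False and NaN, so it has object dtype. A small variant, offered as a sketch rather than as the answerer's method, compares the forward-filled values with True instead of calling `fillna(False)`; this yields a plain boolean mask directly. The names `mapped` and `inside` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col": [
    "Stable", "Stable", "Stable", "Increase", "Increase", "Stable",
    "Stable", "Decrease", "Stable", "Increase", "Stable", "Decrease",
]})

# Map the two marker labels to booleans; 'Stable' has no entry and becomes NaN.
mapped = df["col"].map({"Increase": True, "Decrease": False})

# Forward-fill, then compare with True so the leading NaN rows come out as False.
inside = mapped.ffill().eq(True)

df["indicator"] = np.where(inside, "+", "-")
print(df)
```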