Asked by: David Boshton  Asked: 11/16/2023  Last edited by: David Boshton  Updated: 11/23/2023  Views: 64
How to calculate a rolling function of only defined values in pandas/numpy? [closed]
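Q:
So far, nothing has helped.
- I have an array in which each row entry is a specific value at a specific time index for a specific column.
- Each row may have several non-NaN entries, but certainly not many.
- I want to compute a function that calculates a gradient from the defined values with respect to the index.
I can't work out how to do this with numpy or pandas.
Suggestions would be helpful. pandas dropna and skipna do not help either.
Example: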
import numpy as np
import pandas as pd

fa = np.random.randn(10,4)
mask = np.zeros(40, dtype=bool)
mask[:15] = True
np.random.shuffle(mask)
mask = mask.reshape(10,4)
fa[mask] = np.nan
fa
Out[40]:
array([[ nan, -0.57681061, nan, 0.23047461],
[ 0.26260072, -0.62024175, 0.35678478, nan],
[-0.5781359 , -0.17364336, nan, nan],
[-0.58982883, nan, 0.07114217, 1.03781762],
[-0.03906354, -0.49546887, nan, nan],
[-0.3988263 , 0.21794358, nan, -0.04167338],
[ 0.35731643, -0.80956629, -0.29624602, 2.59351753],
[-0.02804324, nan, nan, nan],
[ nan, 0.75344618, -0.52145898, nan],
[-0.45565981, 0.26946552, nan, 1.64095417]])
idx = pd.date_range("2018-01-01", periods=10, freq="S")
df = pd.DataFrame(fa, index=idx)
## Apply function
df.rolling(3).apply(lambda s: s.sum())
Out[52]:
                             0          1    2    3
2018-01-01 00:00:00        NaN        NaN  NaN  NaN
2018-01-01 00:00:01        NaN        NaN  NaN  NaN
2018-01-01 00:00:02        NaN  -1.370696  NaN  NaN
2018-01-01 00:00:03  -0.905364        NaN  NaN  NaN
2018-01-01 00:00:04  -1.207028        NaN  NaN  NaN
2018-01-01 00:00:05  -1.027719        NaN  NaN  NaN
2018-01-01 00:00:06  -0.080573  -1.087092  NaN  NaN
2018-01-01 00:00:07  -0.069553        NaN  NaN  NaN
2018-01-01 00:00:08  -0.126387        NaN  NaN  NaN
2018-01-01 00:00:09  -0.126387        NaN  NaN  NaN
## What would be good is:
2018-01-01 00:00:00        NaN        NaN        NaN       NaN
2018-01-01 00:00:01        NaN        NaN        NaN       NaN
2018-01-01 00:00:02        NaN  -1.370696        NaN       NaN
2018-01-01 00:00:03  -0.589829  -1.289354        NaN       NaN
2018-01-01 00:00:04  -0.039064  -0.451169        NaN       NaN
2018-01-01 00:00:05  -0.398826  -1.087092        NaN  1.226619
2018-01-01 00:00:06   0.357316        NaN   0.131681  3.589662
2018-01-01 00:00:07  -0.028043        NaN        NaN       NaN
2018-01-01 00:00:08        NaN   0.161823  -0.746563       NaN
2018-01-01 00:00:09  -0.455660   0.213345        NaN  4.192798
The latter was produced by running
df[n].dropna().rolling(3).apply(lambda s: s.sum())
on each column and then filling in the values by hand.
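The per-column dropna-then-rolling step above can be automated rather than filled in by hand. The following is only a minimal sketch, assuming that "rolling over defined values" means a window over the last N non-NaN entries of each column, with each result stamped on the timestamp of the window's last entry. The helper name rolling_over_defined and the demo frame are hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd


def rolling_over_defined(df, window, func):
    """Apply func over the last `window` non-NaN values of each column.

    Hypothetical helper: each column is handled on its own, so the
    NaN positions of one column do not affect the others.
    """
    out = {
        # raw=True passes a plain ndarray to func, which is usually faster.
        col: df[col].dropna().rolling(window).apply(func, raw=True)
        for col in df.columns
    }
    # Align the per-column results and restore the original row index,
    # so rows that were NaN in a column stay NaN in the result.
    return pd.DataFrame(out).reindex(df.index)


# Small demo frame (made-up values) on a one-second time index.
demo_idx = pd.date_range("2018-01-01", periods=6, freq=pd.Timedelta(seconds=1))
demo = pd.DataFrame(
    {
        "a": [1.0, np.nan, 2.0, 3.0, np.nan, 4.0],
        "b": [np.nan, 1.0, np.nan, np.nan, 2.0, 3.0],
    },
    index=demo_idx,
)
result = rolling_over_defined(demo, 3, np.sum)
print(result)
```

This keeps the exact semantics of the manual approach. It loops over columns in Python, which should be acceptable when there are few columns but may be slow for very wide frames.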
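Now, the actual function I want to run also uses the time index as an input, so it is a bit more complicated than this (otherwise it would be easy: just replace all the NaNs with 0s and we would be done).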
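For a function that needs the timestamps as well as the values, one possible approach is to drop the NaNs per column and build sliding windows over both the remaining values and their times. The sketch below assumes that "gradient" means the least-squares slope (value per second) over the last N defined points, which is an interpretation rather than something stated in the question. The helper name rolling_slope_defined and the demo data are hypothetical.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_slope_defined(df, window):
    """Least-squares slope over the last `window` non-NaN points per column.

    Hypothetical helper: time is measured in seconds from the first
    timestamp of the frame, and window is assumed to be at least 2.
    """
    out = {}
    for col in df.columns:
        s = df[col].dropna()
        if len(s) < window:
            # Not enough defined points for even one window.
            out[col] = pd.Series(np.nan, index=s.index)
            continue
        t = (s.index - df.index[0]).total_seconds().to_numpy()
        tw = sliding_window_view(t, window)             # times per window
        yw = sliding_window_view(s.to_numpy(), window)  # values per window
        tc = tw - tw.mean(axis=1, keepdims=True)
        yc = yw - yw.mean(axis=1, keepdims=True)
        # Slope of the best-fit line: cov(t, y) / var(t) for each window.
        slope = (tc * yc).sum(axis=1) / (tc ** 2).sum(axis=1)
        # Stamp each result on the last timestamp of its window.
        out[col] = pd.Series(slope, index=s.index[window - 1:])
    return pd.DataFrame(out).reindex(df.index)


# Demo: column "a" follows y = 2 * t with two gaps, "b" has one value only.
slope_idx = pd.date_range("2018-01-01", periods=6, freq=pd.Timedelta(seconds=1))
slope_demo = pd.DataFrame(
    {
        "a": [0.0, np.nan, 4.0, 6.0, np.nan, 10.0],
        "b": [np.nan, 1.0, np.nan, np.nan, np.nan, np.nan],
    },
    index=slope_idx,
)
slopes = rolling_slope_defined(slope_demo, 3)
print(slopes)
```

Because the windows are built from the actual timestamps of the defined points, irregular gaps left by the NaNs are accounted for in the slope. A different function of time and value could be swapped in at the point where the slope is computed.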
A: No answers yet