Asked by: jaried · Asked: 8/25/2022 · Last edited by: jaried · Updated: 8/25/2022 · Views: 66
How can I speed up the computation of a specific function?
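Q:
I have a DataFrame df. For each row, I need to count how many consecutive columns, starting from the first, have the same sign as the first column, and then multiply that count by the sign of the first column.
The function I need to speed up is calc_df. On my computer it runs as follows: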
%timeit calc_df(df)
6.38 s ± 170 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The output of my code is:
a_0 a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9
0 0.097627 0.430379 0.205527 0.089766 -0.152690 0.291788 -0.124826 0.783546 0.927326 -0.233117
1 0.583450 0.057790 0.136089 0.851193 -0.857928 -0.825741 -0.959563 0.665240 0.556314 0.740024
2 0.957237 0.598317 -0.077041 0.561058 -0.763451 0.279842 -0.713293 0.889338 0.043697 -0.170676
3 -0.470889 0.548467 -0.087699 0.136868 -0.962420 0.235271 0.224191 0.233868 0.887496 0.363641
4 -0.280984 -0.125936 0.395262 -0.879549 0.333533 0.341276 -0.579235 -0.742147 -0.369143 -0.272578
0 4.0
1 4.0
2 2.0
3 -1.0
4 -2.0
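To make the rule concrete, here is a minimal pure-Python sketch of the calculation for a single row. The helper name signed_run_length is hypothetical, and the two rows are the first and fourth rows of the sample above, rounded to two decimals for readability.

```python
def signed_run_length(row):
    """Count the leading values that share the sign of row[0], times that sign."""
    def sign(x):
        # -1, 0 or 1, mirroring np.sign for plain floats
        return (x > 0) - (x < 0)

    first = sign(row[0])
    count = 0
    for value in row:
        if sign(value) != first:
            break  # the run of matching signs ends here
        count += 1
    return count * first


# Rows 0 and 3 of the printed sample, rounded to two decimals.
row0 = [0.10, 0.43, 0.21, 0.09, -0.15, 0.29, -0.12, 0.78, 0.93, -0.23]
row3 = [-0.47, 0.55, -0.09, 0.14, -0.96, 0.24, 0.22, 0.23, 0.89, 0.36]
print(signed_run_length(row0))  # 4
print(signed_run_length(row3))  # -1
```

Row 0 starts with four positive values, giving 4; row 3 starts negative and the second value is already positive, giving -1. These match the 4.0 and -1.0 in the output above.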
My code is below. The function generate_data generates demo data of the same size as my real data.
import numpy as np
import pandas as pd
from numba import njit

np.random.seed(0)
pd.set_option('display.max_columns', None)
pd.set_option('expand_frame_repr', False)


# This function generates demo data.
def generate_data():
    col = [f'a_{x}' for x in range(10)]
    df = pd.DataFrame(data=np.random.uniform(-1, 1, [280000, 10]), columns=col)
    return df


@njit
def calc_numba(s):
    a = s[0]
    b = 1
    for sign in s[1:]:
        if sign == a:
            b += 1
        else:
            break
    b *= a
    return b


def calc_series(s):
    return calc_numba(s.to_numpy())


def calc_df(df):
    df1 = np.sign(df)
    df['count'] = df1.apply(calc_series, axis=1)
    return df


def main():
    df = generate_data()
    print(df.head(5))
    df = calc_df(df)
    print(df['count'].head(5))
    return


if __name__ == '__main__':
    main()
Answers:
1 upvote
mozway
8/25/2022
#1
You can use vectorized code here.
For example, using a mask:
df1 = np.sign(df)
m = df1.eq(df1.iloc[:,0], axis=0).cummin(1)
out = df1.where(m).sum(1)
Output (first 5 rows):
0 4.0
1 4.0
2 2.0
3 -1.0
4 -2.0
dtype: float64
Timing on the full data:
269 ms ± 37.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
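The mask approach above can be traced step by step on a tiny frame. This is only an illustrative sketch: the two-row frame and the intermediate names signs, same and mask are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, small enough to follow by eye.
df = pd.DataFrame(
    [[0.1, 0.4, -0.2, 0.3],
     [-0.5, 0.5, -0.1, -0.9]],
    columns=["a_0", "a_1", "a_2", "a_3"],
)

signs = np.sign(df)                        # -1.0 / 0.0 / 1.0 per cell
same = signs.eq(signs.iloc[:, 0], axis=0)  # True where a cell matches the first column's sign
mask = same.cummin(axis=1)                 # stays True only until the first mismatch in each row
out = signs.where(mask).sum(axis=1)        # masked cells become NaN and are skipped by sum

print(mask.to_numpy().tolist())
# [[True, True, False, False], [True, False, False, False]]
print(out.tolist())
# [2.0, -1.0]
```

The row-wise cummin is what turns "same sign as the first column" into "same sign as the first column, without interruption": once a False appears, every later position in that row is False too.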
A faster alternative:
df1 = np.sign(df)
m = df1.eq(df1.iloc[:,0], axis=0).cummin(1)
out = m.sum(1)*df1.iloc[:,0]
148 ms ± 27.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
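The two variants should give identical results, since summing k copies of the first column's sign is the same as k times that sign. A quick sanity check on random data, as a sketch (the 1000-row frame and the variable names are assumptions for the example, and no timing is implied):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(-1, 1, size=(1000, 10)))

signs = np.sign(df)
first = signs.iloc[:, 0]
mask = signs.eq(first, axis=0).cummin(axis=1)

masked_sum = signs.where(mask).sum(axis=1)  # first variant: sum the surviving signs
count_times_sign = mask.sum(axis=1) * first  # second variant: count True cells, then apply the sign

print(np.array_equal(masked_sum.to_numpy(), count_times_sign.to_numpy()))  # True
```

The second variant skips building the intermediate NaN-filled frame, which is a plausible reason for the lower timing reported above.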
You can probably do even better with pure NumPy (you would have to write an equivalent of cummin).
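For a boolean mask, a row-wise cummin can be expressed in NumPy with a ufunc accumulate. A minimal sketch on a made-up mask, showing two candidate spellings that should agree for boolean input:

```python
import numpy as np

# Hypothetical "same sign as first column" mask for two rows.
same = np.array([[True, True, False, True],
                 [True, False, True, True]])

# Running minimum along each row; for booleans, False < True.
via_minimum = np.minimum.accumulate(same, axis=1)

# Running logical AND along each row; same result for boolean input.
via_logical_and = np.logical_and.accumulate(same, axis=1)

print(via_minimum.tolist())
# [[True, True, False, False], [True, False, False, False]]
print(np.array_equal(via_minimum, via_logical_and))  # True
```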
0 upvotes
jaried
8/25/2022
#2
Someone gave me this answer:
Using your solution, with NumPy:
import numpy as np
import pandas as pd
from numba import njit

np.random.seed(0)
pd.set_option('display.max_columns', None)
pd.set_option('expand_frame_repr', False)


# This function generates demo data as a DataFrame.
def generate_data() -> pd.DataFrame:
    col = [f'a_{x}' for x in range(10)]
    np.random.seed(0)
    df = pd.DataFrame(data=np.random.uniform(-1, 1, [280000, 10]), columns=col)
    return df


# This function generates the same demo data as a plain NumPy array.
def generate_data_array() -> np.ndarray:
    np.random.seed(0)
    return np.random.uniform(-1, 1, [280000, 10])
%%timeit df = generate_data()
df1 = np.sign(df)
m = df1.eq(df1.iloc[:,0], axis=0).cummin(1)
out_df = m.sum(1)*df1.iloc[:,0]
76.6 ms ± 2.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit array = generate_data_array()
array2 = np.sign(array)
array3 = np.minimum.accumulate(np.equal(array2, np.expand_dims(array2[:,0], axis=1)), 1)
out_array = array3.sum(axis=1) * array2[:,0]
44.3 ms ± 1.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
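One further variation, not taken from the answers above and not timed here, is to skip the accumulate step and locate the first mismatch in each row with argmax. The function name signed_run_argmax is hypothetical, and the sketch assumes the input contains no NaN values.

```python
import numpy as np


def signed_run_argmax(arr: np.ndarray) -> np.ndarray:
    """Leading same-sign run length per row, times the first column's sign."""
    signs = np.sign(arr)
    mismatch = signs != signs[:, :1]      # column 0 is always False here
    # argmax returns the index of the first True, which equals the run length.
    # If a row has no mismatch at all, argmax returns 0, so fall back to the row width.
    run = np.where(mismatch.any(axis=1), mismatch.argmax(axis=1), arr.shape[1])
    return run * signs[:, 0]


demo = np.array([[0.1, 0.4, -0.2, 0.3],
                 [-0.5, 0.5, -0.1, -0.9],
                 [0.2, 0.3, 0.1, 0.9]])
print(signed_run_argmax(demo).tolist())
# [2.0, -1.0, 4.0]
```

Whether this is faster than np.minimum.accumulate on the 280000 x 10 data would need to be measured with %%timeit in the same way as the cells above.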