提问人:Jonny Studholme 提问时间:7/7/2023 最后编辑:Sebastian WoznyJonny Studholme 更新时间:7/17/2023 访问量:87
Python for 循环:数组解包与范围解构
Python for loop: array unpacking vs range deconstruction
问:
我目前正在学习用 Python 教授的数据结构和算法课程,讲师总是选择使用范围函数(下面的方法 2)解压缩数组(嗯,s),而不是我习惯在 python 中看到的那种打包。我很惊讶,因为在我看来方法 1 更具可读性。list
我是否缺少某些东西,使方法 2 更可取?我以为也许是速度,但这里的另一篇帖子提到,由于每个循环都必须“生成”一个新号码,因此呼叫范围实际上较慢。我在想的另一件事是,讲师只是想让代码更与语言无关,可以更容易地用其他语言复制。
代码示例:
给定哈希表索引位置:
index_location = [[ 'a' , 1 ], [ 'b' , 2 ], [ 'c' , 3 ], [ 'd' , 4 ]]
方法1:
for key, value in index_location:
print(key)
方法2:
for i in range(len(index_location)):
print(index_location[i][0])
答:
我不同意这只是一个意见问题的评论,因为它可以被测试。我确实同意归结为“这取决于你想做什么”的评论。如果你的任务只需要做一些事情,那么方法 2 的速度几乎快了 2 倍(你可以通过在每个测试循环中替换以下内容来测试这一点。但是,如果您需要访问列表中的元素(在示例中执行此操作),那么方法 1 会更快一些。这也有一个好处,就像在其他评论中一样,“更像 Pythonic”。len(your list)
tmp_lst.append
pass
import random
import time
rand_list = [random.sample(range(10), 2)]*100000
method1_times = []
method2_times = []
for i in range(50):
## Method 1
start = time.perf_counter()
tmp_lst1 = []
for key, value in rand_list:
tmp_lst1.append([key, value])
end = time.perf_counter()
method1_times.append(end-start)
## Method 2
start = time.perf_counter()
tmp_lst2 = []
for j in range(len(rand_list)):
tmp_lst2.append([rand_list[j][0], rand_list[j][1]])
end = time.perf_counter()
method2_times.append(end-start)
print(f"Method 1 Average: {round((sum(method1_times)/len(method1_times)), 5)}")
print(f"Method 2 Average: {round((sum(method2_times)/len(method2_times)), 5)}")
Method 1 Average: 0.03439
Method 2 Average: 0.03759
评论
for
我们可以使用 dis
来检查发出的字节码的差异:
import dis
def deconstruction_approach(index_location):
l = []
for key, value in index_location:
l.append(key)
dis.dis(deconstruction_approach)
4 0 RESUME 0
5 2 BUILD_LIST 0
4 STORE_FAST 1 (l)
6 6 LOAD_FAST 0 (index_location)
8 GET_ITER
>> 10 FOR_ITER 26 (to 64)
12 UNPACK_SEQUENCE 2
16 STORE_FAST 2 (key)
18 STORE_FAST 3 (value)
7 20 LOAD_FAST 1 (l)
22 LOAD_METHOD 0 (append)
44 LOAD_FAST 2 (key)
46 PRECALL 1
50 CALL 1
60 POP_TOP
62 JUMP_BACKWARD 27 (to 10)
6 >> 64 LOAD_CONST 0 (None)
66 RETURN_VALUE
def range_approach(index_location):
l = []
for i in range(len(index_location)):
l.append(index_location[i][0])
dis.dis(range_approach)
1 0 RESUME 0
2 2 BUILD_LIST 0
4 STORE_FAST 1 (l)
3 6 LOAD_GLOBAL 1 (NULL + range)
18 LOAD_GLOBAL 3 (NULL + len)
30 LOAD_FAST 0 (index_location)
32 PRECALL 1
36 CALL 1
46 PRECALL 1
50 CALL 1
60 GET_ITER
>> 62 FOR_ITER 35 (to 134)
64 STORE_FAST 2 (i)
4 66 LOAD_FAST 1 (l)
68 LOAD_METHOD 2 (append)
90 LOAD_FAST 0 (index_location)
92 LOAD_FAST 2 (i)
94 BINARY_SUBSCR
104 LOAD_CONST 1 (0)
106 BINARY_SUBSCR
116 PRECALL 1
120 CALL 1
130 POP_TOP
132 JUMP_BACKWARD 36 (to 62)
3 >> 134 LOAD_CONST 0 (None)
136 RETURN_VALUE
这里没什么疯狂的,少了几秒,这个名字不必在每次迭代中都加载,我们跳过运算符,用它们换取运算符。CALL
deconstruction_approach
index_location
BINARY_SUBSCR
UNPACK_SEQUENCE
似乎可以合理地假设它应该具有更高的性能。deconstruction_approach
分析:
import dis
import random
from performance_measurement import run_performance_comparison
def deconstruction_approach(index_location):
l = []
for key, value in index_location:
l.append(key)
def range_approach(index_location):
l = []
for i in range(len(index_location)):
l.append(index_location[i][0])
def setup(N):
data = [[""] * 2 for _ in range(N)]
for i in range(N):
data[i][0] = chr(ord("a") + i)
data[i][1] = random.randint(1, 100)
return [data]
run_performance_comparison(
approaches=[range_approach, deconstruction_approach],
data_size=[10000, 20000, 30000, 100000, 200_000, 300_000, 500_000],
setup=setup,
)
结果符合预期:
性能分析代码:
import timeit
from functools import partial
import matplotlib.pyplot as plt
from typing import List, Dict, Callable
from contextlib import contextmanager
import matplotlib.pyplot as plt
import matplotlib.transforms as mtransforms
import matplotlib.ticker as ticker
import numpy as np
@contextmanager
def data_provider(data_size, setup=lambda N: N, teardown=lambda: None):
data = setup(data_size)
yield data
teardown(*data)
def run_performance_comparison(approaches: List[Callable],
data_size: List[int],
*,
setup=lambda N: [N],
teardown=lambda *N: None,
number_of_repetitions=5,
title='Performance Comparison',
data_name='N',
yscale='log',
xscale='log'):
approach_times: Dict[Callable, List[float]] = {approach: [] for approach in approaches}
for N in data_size:
with data_provider(N, setup, teardown) as data:
print(f'Running performance comparison for {data_name}={N}')
for approach in approaches:
function = partial(approach, *data)
approach_time = min(timeit.Timer(function).repeat(repeat=number_of_repetitions, number=1))
approach_times[approach].append(approach_time)
for approach in approaches:
plt.plot(data_size, approach_times[approach], label=approach.__name__)
plt.yscale(yscale)
plt.xscale(xscale)
plt.xlabel(data_name)
plt.ylabel('Execution Time (seconds)')
plt.title(title)
plt.legend()
plt.show()
话虽如此,意图的清晰度和明确的意图对我来说是真正的倾斜,而不是性能。deconstruction_approach
上一个:如何提高该程序的可读性?
下一个:如何处理C语言中不断的错误检查?
评论
for idx, val in enumerate(X):
range()
range(len(list))