Performance of very large dictionaries when storing/loading via pickle

Question:

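I have a very large dictionary (millions of entries) that will also be accessed millions of times.
I am wondering about the low-level performance of such a dictionary, since it has to be resized many times while it is being built. I found the following useful question: "Improving performance of very large dictionary in Python", which explains that presizing a large dictionary is more efficient.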

Similarly, my dictionary is built once, line by line:

from pickle import dump

# fname, pickle_file_name and process() are defined elsewhere
out_dict = {}
# Parse the file
with open(fname, "r", encoding="utf-8") as f:
    for line in f:              # millions of lines
        key, value = process(line)
        out_dict[key] = value   # the dictionary grows line by line
# Pickle it for reuse
with open(pickle_file_name, "wb") as f:
    dump(out_dict, f)

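So this dictionary will not be the most efficient one.
But when I reload it via pickle in a later run, will the unpickled dictionary be built more efficiently, as if it had been presized to a known size, or will it be a perfect clone of the less efficient original dictionary?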
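
One way to look at this empirically is to round-trip a dictionary through pickle and compare the size of the hash table before and after. The following is only a minimal sketch, assuming CPython; the function name `roundtrip_sizes`, the parameter `n_items`, and the integer-to-string test data are made up for illustration and are not part of my real code. It does not settle how pickle rebuilds the dictionary internally; it only shows whether the restored object ends up with the same footprint as the original.

```python
import pickle
import sys


def roundtrip_sizes(n_items=100_000):
    """Build a dict incrementally, pickle it, unpickle it, compare footprints."""
    # Build the dictionary one entry at a time, as in the question.
    original = {}
    for i in range(n_items):
        original[i] = str(i)

    # Round-trip through pickle in memory instead of through a file.
    data = pickle.dumps(original, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(data)

    # sys.getsizeof reports the dict object's own memory (its hash table),
    # not the memory of the keys and values it refers to.
    return original == restored, sys.getsizeof(original), sys.getsizeof(restored)


if __name__ == "__main__":
    equal, size_before, size_after = roundtrip_sizes()
    print("equal contents:", equal)
    print("table size before pickling:", size_before)
    print("table size after unpickling:", size_after)
```

If the two sizes differ, the unpickled dictionary is not a byte-for-byte clone of the original table; if they match, that alone still does not prove how it was built.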

python dictionary hashtable pickle python-internals
