通过拆分特定字符的字符串来创建可能性列表

问：

我正在尝试创建用于 Karnaugh 地图求解器的“真值表”。Karnaugh Map 求解器将变量 ABCD 的设置视为一串 4 个数字，表示它们所设置的内容。例如，ABCD 可以是从“0000”到“1111”的任何内容。求解器采用最小项（ABCD 的设置，在真值表中等于 1），但它输出“*000”或其他带有 * 的示例，以表示 Kmap 已删除结果方程中的第一个变量。我正在将其用于迭代过程，因此我需要能够将这些星号填充的字符串列表转换为完全扩展的字符串列表：

[*000， 0*00] -> [0000， 1000， 0100]

我目前这样做的方式（使用 while 循环）有效，但对于我使用它的应用程序来说非常慢。它至少需要 10 分钟才能完成，因为我需要执行大约 1000 次此过程（我正在使用的数据集的每个部分一个）它快速进行到 430 左右，由于扩展这些复杂序列需要很长时间，它显着减慢。我还可以确认它不是我正在使用的 Karnaugh 地图实现，因为它可以立即解决相同的长序列。

我不确定有一种更 pythonic 的方法来做到这一点，它也解释了像“*11*”这样的东西，它们有多个需要扩展表达式的地方：

[*11*] -> [011*， 111*] -> [0110， 0111， 1110， 1111]

有什么想法是高效和pythonic的吗？

我的代码：

        ast = True
        while ast:
            new_string_result = copy.deepcopy(string_result)
            for i in range(len(string_result)):
                for c, char in enumerate(string_result[i]):
                    if char == "*":
                        # replace with 0 and 1
                        new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                        new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])

                if "*" in string_result[i]:    
                    # remove original (only once, even if multiple *)
                    # print("Removing ", string_result[i])
                    new_string_result.remove(string_result[i])
                    
            # print("Kmap result during fix iter: ", new_string_result)
            
            ast_found = False
            for i in range(len(new_string_result)):
                if "*" in new_string_result[i]:
                    ast_found = True
            
            # print(ast_found)
            ast = ast_found
            string_result = new_string_result

python 字符串扩展 karnaugh-map

参考代码

这是原始问题中的代码，并添加了一个语句以从输出中删除重复项。（如果不这样做，则展开会给出 384 个结果。break****

def kmap_orig(string_result):
    ast = True
    while ast:
        new_string_result = copy.deepcopy(string_result)
        for i in range(len(string_result)):
            for c, char in enumerate(string_result[i]):
                if char == "*":
                    # replace with 0 and 1
                    new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                    new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])
                    break

            if "*" in string_result[i]:    
                # remove original (only once, even if multiple *)
                # print("Removing ", string_result[i])
                new_string_result.remove(string_result[i])

        # print("Kmap result during fix iter: ", new_string_result)

        ast_found = False
        for i in range(len(new_string_result)):
            if "*" in new_string_result[i]:
                ast_found = True

        # print(ast_found)
        ast = ast_found
        string_result = new_string_result
    return string_result

关于此代码的一些评论：

list.remove()很慢。为了从列表中删除元素，它必须搜索列表的每个元素，并检查它们是否相等。找到匹配项后，它必须将该点之后的列表中的每个元素向下移动一个空格。如果可能，请避免这种情况。
您正在使用迭代列表，但该变量不用于除之外的任何用途。这通常比慢，也更复杂。for i in range(len(string_result)):istring_result[i]for elem in string_result:

测试数据

这是我用来测试其中每个的数据。它是一万个随机字符串，每个字符串长 10 个字符。*01

data = [''.join(random.choices(['0', '1', '*'], k=10)) for i in range(10000)]
correct_output = [set(kmap_orig([string])) for string in data]

（注意：在检查以下所有解决方案的正确性时，我忽略了重复项和解决方案的顺序。

迭代解决方案

这个速度大约快了 3 倍，主要是由于移除了内环。for

def kmap_iterative(string_result):
    changed = True
    while changed:
        changed = False
        new_string_result = []
        for string in string_result:
            if "*" in string:
                c = string.find("*")
                assert c != -1
                # replace with 0 and 1
                prefix = string[:c]
                suffix = string[c+1:]
                new_string_result.append(prefix + "0" + suffix)
                new_string_result.append(prefix + "1" + suffix)
                changed = True
            else:
                # no * is present, so add unchanged
                new_string_result.append(string)
        string_result = new_string_result
    return string_result

完成的其他优化：

与其查找两次，不如执行一次并保存到变量中。string[:c]
代码花费了大量时间来查找的值。跟踪是否对列表进行了任何修改，如果未进行任何修改，则退出列表会更快。缺点是它比必要的循环次数多一次。ast_found
由于内部循环本质上只是在字符串中查找 of 的第一个实例，因此请将其替换为标准库中的函数，该函数执行相同的操作。*
避免从空列表开始，并添加没有任何星号的元素。copy.deepcopy()list.remove()string_result

递归解决方案

这比迭代解决方案慢约 20%，但它是最短且（在我看来）最简单的解决方案。

def expand_string_recursive(string):
    if "*" not in string:
        yield string
    else:
        c = string.find("*")
        assert c != -1
        # replace with 0 and 1
        prefix = string[:c]
        suffix = string[c+1:]
        yield from expand_string_recursive(prefix + "0" + suffix)
        yield from expand_string_recursive(prefix + "1" + suffix)

def kmap_recursive(string_result):
    ret = []
    for string in string_result:
        ret.extend(expand_string_recursive(string))
    return ret

Itertools解决方案

这个解决方案背后的想法是，你要求的是 [0， 1] 的笛卡尔乘积，完成 n 次，其中 n 是星号的数量。标准库中有一些东西可以找到它。这也比迭代解决方案慢 20%。

def expand_string_itertools(string):
    blank_positions = [i for i, letter in enumerate(string) if letter == "*"]
    blanks = len(blank_positions)
    # Convert to list to allow mutability
    string = list(string)
    for product in itertools.product(["0", "1"], repeat=blanks):
        for position, replacement in zip(blank_positions, product):
            string[position] = replacement
        # Convert back to string
        yield ''.join(string)

def kmap_itertools(string_result):
    ret = []
    for string in string_result:
        ret.extend(expand_string_itertools(string))
    return ret

基准测试结果

kmap_orig
706 ms ± 5.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_iterative
201 ms ± 1.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_recursive
251 ms ± 2.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_itertools
243 ms ± 4.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

通过拆分特定字符的字符串来创建可能性列表 - Karnaugh Map

Creating a list of possibilities from splitting strings at a certain character - Karnaugh Map

评论