最大化相同元素之间的最小距离

Maximize the minimum distance between the same elements

提问人:João Pedro 提问时间:3/11/2023 最后编辑:João Pedro 更新时间:9/20/2023 访问量:283

问:

我有一个数组,例如:

arr = ['A', 'A', 'A', 'B', 'B']

我想对这个数组重新排序,使相同类型的元素之间的最小距离是最大值。例如,这是上述数组的最佳序列:

arr1 = ['A', 'B', 'A', 'B', 'A']

如何在 python 中做到这一点?我很难弄清楚......

编辑:我试图用遗传算法来解决它,使得每个个体都是一个整数序列,例如,[1,2,3,6,4,5],这意味着位置1和2将被交换,3和6,4和5。然而,它似乎停留在局部最小值上。

Python 算法 序列 组合学

评论

0赞 Daniel Hao 3/11/2023
如果列表是这样的怎么办 - 列表是否只有 2 个项目或更多?['A', 'B', 'B', 'C', 'C', 'A']
0赞 João Pedro 3/11/2023
该列表必须保留每个类别的元素数量,因此在我的示例中,它将是 3 个 A 和 2 个 B。
0赞 Samwise 3/11/2023
从一种幼稚的方法开始:弄清楚如何确定给定列表是否满足您的标准,然后测试列表的所有排列,看看哪一个满足。一旦你开始工作,你就可以开始考虑如何优化它。你在第一部分是否取得了任何进展,或者你是否完全被困在如何测试相同元素之间的“距离”上?
1赞 user3386109 3/11/2023
如果数组是 ,则数组的最佳序列是什么?['A', 'A', 'A', 'A', 'B', 'B']

答:

-2赞 itprorh66 3/11/2023 #1

如果我明白你想做什么。 给定一个元素列表,例如 ['A', 'A', 'A', 'B', 'B']

提供一个新列表,以便:

  1. 使用输入列表的所有元素
  2. 每个重复元素之间有一个最大距离

例如,从上面输入的 ['A', 'B', 'A', 'A'] 生成输出

以下代码实现此功能

from collections import Counter
def processList(arr: list) -> list:
    cd = []
    for k,v in Counter(arr).items():
        cd.append([k,v])
    cd = sorted(cd, key= lambda x: x[1], reverse=True)
    rslt = []
    while sum([x[1] for x in cd]) > 0:
        seq = []
        for i in range(len(cd)):
            if cd[i][1] > 0:
                seq.append(cd[i][0])
                cd[i][1] -= 1
        rslt.extend(seq)
    return rslt

因此,使用您的示例作为测试:

processList(['A', 'B', 'A', 'B', 'A']) 

收益 率:

['A', 'B', 'A', 'B', 'A']

评论

2赞 Kelly Bundy 3/11/2023
processList('AAABC')失败。
0赞 itprorh66 3/11/2023
请注意,“AAABC”不是一个列表,这与所提供解决方案的基本假设相反
0赞 itprorh66 3/12/2023
processList('AAABC')生成输出 = ['A', 'B', 'C', 'A', 'A']。对于什么是可接受的,什么是不可接受的,没有明确的定义,所以请解释为什么你得出这个结论是失败的。
0赞 Kelly Bundy 3/12/2023
字符串,列表,对于显示问题并不重要。标题说“最大化相同元素之间的最小距离”。正文显示“相同类型元素之间的最小距离是最大值”。正如我的例子所示,你没有这样做。您可以将两个值并排放置,但可以对输入进行排序,以便没有两个相同的值彼此相邻,这具有更大的最小距离。A
0赞 itprorh66 3/12/2023
很公平,很好
-1赞 itprorh66 4/1/2023 #2

我不知道是否有人仍然有兴趣回答这个问题,但由于我的第一次天真尝试是不正确的,并且被适当地降级了,它让我的设计汁液开始了。所以这里有一个更完整的答案。

据我了解,问题: 给定一系列符号,例如“AAABB” 其中 A 和 B 是符号 找到序列的重新排序,使相似符号之间的最小间隔最大化。
因此,在上面的示例中,解决方案将是“ABABA”

import heapq as hpq
from copy import copy
from dataclasses import dataclass
from itertools import permutations
from functools import total_ordering


@dataclass
@total_ordering
class SolutionItem:
    arr: str
    sol: list[str]

    def __lt__(self, other):
        return len(self.arr) < len(other.arr)


def min_separation(in_arr, all_symbs):
    # compute the mininimum separation between symbols
    for s in all_symbs:
        locs = [i for i, x in enumerate(in_arr) if x == s]
        if len(locs) > 1:
            return min([locs[x] - locs[x-1] - 1 for x in range(1, len(locs))])
        return 0


def maximize_min_separation(arr):
    """ find the organization of symbols in arr to produce the maximum
        possible minimum separation between equal symbols
    """
    all_symbs = set(arr)
    h = []
    best = min_separation(arr, all_symbs)
    best_sol = arr
    hpq.heappush(h, (best, SolutionItem(arr, [])))
    while len(h) > 0:
        mindist, itm = hpq.heappop(h)
        cur_arr = itm.arr
        cur_sol = itm.sol
        if len(cur_arr) > 0:
            options = set(cur_arr)
            nxt_arr = copy(cur_arr)
            for s in options:
                k = nxt_arr.find(s)
                nxt_arr = nxt_arr[:k] + nxt_arr[k+1:]

            for opt in permutations(options):
                if cur_sol:
                    nxt_sol_frt = list(opt)
                    nxt_sol_frt.extend(cur_sol)
                    nxt_sol_bck = copy(cur_sol)
                    nxt_sol_bck.extend(list(opt))
                    frt_itm = SolutionItem(nxt_arr, nxt_sol_frt)
                    bck_itm = SolutionItem(nxt_arr, nxt_sol_bck)
                    hpq.heappush(h, (-min_separation(frt_itm.sol, all_symbs),
                                     frt_itm))
                    hpq.heappush(h, (-min_separation(bck_itm.sol, all_symbs),
                                     bck_itm))
                else:
                    frt_itm = SolutionItem(nxt_arr, list(opt))
                    hpq.heappush(h, (-min_separation(list(opt),
                                                     all_symbs), frt_itm))
        else:
            # come here for a solution
            if mindist < best:
                best_sol = cur_sol
    return ''.join(best_sol)

执行多个测试用例会产生以下结果:maximize_min_separation

Case: AAA --> AAA
Case: AAB --> ABA
Case: AABBB --> BABAB
Case: AABBC --> CBABA
Case: AABBBCC --> BCABACB
Case: AABCBAD --> BACABDA

这似乎满足了要求。

评论

0赞 Kelly Bundy 4/1/2023
BACABDA 不是最佳选择,ABCABDA 更好。
0赞 Daniele Rugginenti 7/29/2023
ABCABDA 并不完美,ABCADBA 和 ABDACBA 更好
1赞 Daniele Rugginenti 7/29/2023 #3

这里有一个“对输出的暴力攻击”解决方案。它不能提供完美的结果,它在更大的列表中效果更好,通常比在线找到的其他解决方案更快。如果您需要速度而不是精度,这很有用。

from collections import defaultdict

def find_nearest_none(arr, position):
    if arr[position] is None:
        return position

    step = 1
    while True:
        left_pos = position - step
        right_pos = position + step

        if right_pos < len(arr) and arr[right_pos] is None:
            return right_pos
        elif left_pos >= 0 and arr[left_pos] is None:
            return left_pos
        elif left_pos < 0 and right_pos >= len(arr):
            # Both left and right positions are out of bounds
            return False

        step += 1

def max_distance_list(nums):
    num_occurrences = {}
    t = len(nums)

    # Works on a lit that doesn't contains None
    out = [None] * t
    
    # Group data by occurence count
    for num in nums:
        num_occurrences[num] = num_occurrences.get(num, 0) + 1
    num_occurrences = sorted(num_occurrences.items(), key=lambda item: item[1], reverse=True)

    # Create a dict from touple list
    # grouped_data_dict new format => {10: [3], 4: [1, 5], 2: [44, 4, 6], 1: [7, 8, 9, 11, 123, 22]})
    grouped_data = defaultdict(list)
    for key, value in num_occurrences:
        grouped_data[value].append(key)  
    
    start_pos = 0   # start position
    it = 0          # iterations

    # iterate over the grouped data, beginning from the items that occurr most
    for x, y in dict(grouped_data).items():

        # iterate over list of items with same occurrences
        for z in y:
        
            # calculate the optimal separation between them
            sep = (t+1) / x # keep it as a float and round it up later
            pos = start_pos; # 0,1,2 increase for each uniq element in array
            it += 1;  
            # Iterate over every occurrence of every element
            for i in range(x): 
                
                # Write the last occurrence nearest the end as possible
                if i == x-1:
                    pos = t-it;

                # get the nearest free position to the oprimal one
                free_pos = find_nearest_none(out, int(round(pos,0)))

                # write the element in the out array
                out[free_pos] = z

                # Increment position by optimal (float) separation
                pos+=sep;

            start_pos+=1;
    return out

您可以在图像中看到瑕疵率,还可以看到速度

Test on 2k random numbers

一些测试

    [my_func] elements: 2000  -- Tot time: 0.0010275840759277344
    [1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 44, 4, 5, 6, 1, 4, 5, 3, 6, 7, 8, 9, 11, 123, 22, 44, 1, 5, 5] 
    --> 
    [3, 1, 5, 3, 44, 4, 3, 6, 22, 3, 1, 5, 3, 123, 11, 9, 3, 1, 5, 3, 8, 7, 3, 6, 4, 3, 44, 5, 1, 3]
    AABCBAD --> ['A', 'B', 'D', 'A', 'C', 'B', 'A']
    AAA --> ['A', 'A', 'A']
    AAB --> ['A', 'B', 'A']
    AABBC --> ['A', 'B', 'C', 'B', 'A']
    AABBBCC --> ['B', 'A', 'C', 'B', 'C', 'A', 'B']
1赞 briplomo1 9/20/2023 #4

这似乎有效并且是相似的。我将把这个留在这里供参考。我有一个面试问题需要这个。这是我的解决方案。这需要第二个参数 k,这是重复值之间的最小期望距离。该算法尽最大努力用 k 个索引分隔重复项。我相信您可以通过使 k = lengthOfArray/maxOcurrenceOfValue 来获得最大间距。

def maxTime(deliveries, x):
    ans = []
    count = {}
    for n in deliveries:
        count[n] = count.setdefault(n, 0)+1
    keys = sorted([k for k in count.keys()], key=lambda x: count[x], reverse=True)
    while len(count) >0:
        ind = 0
        l = 0
        while l < x and ind < len(keys):
            ans.append(keys[ind])
            l+=1
            count[keys[ind]] -=1
            
            if count[keys[ind]] == 0:
                count.pop(keys[ind])
                keys.pop(ind)
            else:
                ind+=1
    print(ans)

maxTime(["A","A","A","A","B", "B","C", "C", "D", "E"], 3) # math.ceil(len/maxOccur) = 3
# -> ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'D', 'E', 'A']
maxTime(["A","A","A","A","B", "B","C", "C", "D", "E"], 5) # alternative k
# -> ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'A', 'A']
maxTime(["A","A","B","A","B","B","C","C","E","D"],4) # math.ceil(len/maxOccur) = 4 
# -> ['A', 'B', 'C', 'E', 'A', 'B', 'C', 'D', 'A', 'B']