Asked by: mrgn  Asked: 1/2/2023  Last edited by: mrgn  Updated: 1/3/2023  Views: 143
Find consecutive and nonconsecutive ordered sequences of items in a list
Q:
I have two lists:
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
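I want to count how many times lookup_list appears in my_list, using the following logic:
- The order should be 1 -> 2 -> 3
- In my_list, the lookup_list items don't have to be next to each other: 1,4,2,1,5,3 -> should generate a match, since a 2 comes after a 1 and a 3 comes after a 2.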
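The matches based on this logic:
1st match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1] (items at indices 0, 1, 2)
2nd match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1] (items at indices 6, 7, 11)
3rd match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1] (items at indices 9, 10, 15)
4th match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1] (items at indices 14, 16, 17)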
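lookup_list is dynamic; it could be defined as [1,2] or [1,2,3,4], etc. How can I solve this? All the answers I have found are about finding matches where 1,2,3 appear next to each other in order, like this one: Find matching sequence of items in a list
I can find the count of consecutive sequences with the code below, but it doesn't count the nonconsecutive sequences: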
from collections import Counter
from nltk import ngrams

lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
# Count every consecutive window of len(lookup_list) items
all_counts = Counter(ngrams(my_list, len(lookup_list)))
counts = {k: all_counts[k] for k in [tuple(lookup_list)]}
counts
>>> {(1, 2, 3): 2}
I tried using pandas rolling window functions, but they don't have a custom reset option.
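For reference, the consecutive count above can also be reproduced with plain list slicing, without nltk. This is only a minimal sketch; the name count_consecutive is introduced here for illustration and is not part of the original question.

```python
def count_consecutive(lookup_list, my_list):
    """Count windows of my_list that equal lookup_list exactly (overlaps included)."""
    n = len(lookup_list)
    target = list(lookup_list)
    # Slide a window of length n over my_list and compare it with the target.
    return sum(
        list(my_list[i:i + n]) == target
        for i in range(len(my_list) - n + 1)
    )


lookup_list = [1, 2, 3]
my_list = [1, 2, 3, 4, 5, 2, 1, 2, 2, 1, 2, 3, 4, 5, 1, 3, 2, 3, 1]
print(count_consecutive(lookup_list, my_list))
```

Like the ngrams version, this only sees adjacent items, so it still misses the nonconsecutive matches.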
A:
1 vote
Andrej Kesely
1/2/2023
#1
The function find_matches() returns the indices at which the matches of lookup_list occur:
def find_matches(lookup_list, lst):
    buckets = []  # partial matches, each a list of indices found so far

    def _find_bucket(i, v):
        for b in buckets:
            if lst[b[-1]] == lookup_list[len(b) - 1] and v == lookup_list[len(b)]:
                b.append(i)
                if len(b) == len(lookup_list):
                    # match complete: stop tracking it and report it
                    buckets.remove(b)
                    return b
                break
        else:
            # no bucket was waiting for v; start a new one if v opens a match
            if v == lookup_list[0]:
                buckets.append([i])

    rv = []
    for i, v in enumerate(lst):  # iterate over the argument, not the global my_list
        b = _find_bucket(i, v)
        if b:
            rv.append(b)
    return rv

lookup_list = [1, 2, 3]
my_list = [1, 2, 3, 4, 5, 2, 1, 2, 2, 1, 2, 3, 4, 5, 1, 3, 2, 3, 1]
print(find_matches(lookup_list, my_list))
Prints:
[[0, 1, 2], [6, 7, 11], [9, 10, 15], [14, 16, 17]]
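If only the number of matches is needed rather than the indices, the same bucket idea can be sketched with one counter per stage. This is a hedged variant, not the answer author's code: the name count_matches is hypothetical, and the sketch assumes lookup_list is non-empty and contains distinct, hashable items.

```python
def count_matches(lookup_list, lst):
    """Count ordered, possibly non-adjacent matches without storing indices.

    Assumption: lookup_list is non-empty and its items are distinct and hashable.
    """
    n = len(lookup_list)
    stage_of = {item: k for k, item in enumerate(lookup_list)}
    # waiting[k] = number of partial matches that have matched k items so far
    waiting = [0] * n
    complete = 0
    for v in lst:
        k = stage_of.get(v)
        if k is None:
            continue  # v is not part of lookup_list
        if k > 0:
            if waiting[k] == 0:
                continue  # no partial match is waiting for v
            waiting[k] -= 1  # one partial match consumes v
        # k == 0 opens a new partial match; either way, advance one stage
        if k + 1 == n:
            complete += 1
        else:
            waiting[k + 1] += 1
    return complete


lookup_list = [1, 2, 3]
my_list = [1, 2, 3, 4, 5, 2, 1, 2, 2, 1, 2, 3, 4, 5, 1, 3, 2, 3, 1]
print(count_matches(lookup_list, my_list))  # 4
```

With distinct items every value belongs to exactly one stage, which is why counters are enough here; with repeated items in lookup_list the index-tracking version above is the safer choice.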
2 votes
Olvin Roght
1/2/2023
#2
def find_all_sequences(source, sequence):
    def find_sequence(source, sequence, index, used):
        for i in sequence:
            while True:
                index = source.index(i, index + 1)
                if index not in used:
                    break
            yield index

    first, *rest = sequence
    index = -1
    used = set()
    while True:
        try:
            index = source.index(first, index + 1)
            indexes = index, *find_sequence(source, rest, index, used)
        except ValueError:
            break
        else:
            used.update(indexes)
            yield indexes
Usage:
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
print(*find_all_sequences(my_list, lookup_list), sep="\n")
Output:
(0, 1, 2)
(6, 7, 11)
(9, 10, 15)
(14, 16, 17)
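The generator function find_all_sequences() yields tuples with the indexes of each sequence match. It runs a loop that stops when a list.index() call raises ValueError. The inner generator function find_sequence() yields the index of every remaining sequence item, skipping indexes already used by an earlier match.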
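The stopping condition relies on list.index(value, start), which resumes the search from a given position and raises ValueError when nothing is left. A small illustration on a made-up list (values and the variable names are chosen here only for the example):

```python
values = [1, 2, 3, 1]

first = values.index(1)               # search from the beginning
second = values.index(1, first + 1)   # resume just past the previous hit

try:
    values.index(1, second + 1)       # nothing left to find
    exhausted = False
except ValueError:
    exhausted = True                  # this is the signal that ends the loop

print(first, second, exhausted)  # 0 3 True
```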
According to this benchmark, my method is about 60% faster than Andrej Kesely's answer.
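Comments
0 votes
mrgn
1/2/2023
Sorry, I misread the output. In your function, the 3rd match should be ((9, 1), (10, 2), (15, 3)), because index 11 is already counted in the previous match.
1 vote
Olvin Roght
1/3/2023
@mrgn, sorry, I missed this requirement. I have edited my answer with a fix.
0 votes
Yusuf Ipek
1/2/2023
#3
Here is a recursive solution: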
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]

def find(my_list, continue_from_index):
    if continue_from_index > (len(my_list) - 1):
        return 0
    last_found_index = 0
    found_indizes = []
    first_occuring_index = 0
    found = False
    for l in lookup_list:
        for m_index in range(continue_from_index, len(my_list)):
            # compare values with ==, not identity with `is`
            if my_list[m_index] == l and m_index >= last_found_index:
                if not found:
                    found = True
                    first_occuring_index = m_index
                last_found_index = m_index
                found_indizes.append(str(m_index))
                break
    if len(found_indizes) == len(lookup_list):
        # a full match was found: continue searching after its first item
        return find(my_list, first_occuring_index+1) + 1
    return 0

print(find(my_list, 0))
Comments
-1 votes
Shariff Mohammad
1/2/2023
#4
my_list = [5, 6, 3, 8, 2, 1, 7, 1]
lookup_list = [8, 2, 7]
counter = 0
result = False
for i in my_list:
    if i in lookup_list:
        counter += 1
if counter == len(lookup_list):
    result = True
print(result)
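Note that the snippet above only counts membership, so it gives the same result when lookup_list is reordered to [7, 2, 8]. A minimal sketch of an order-aware check for a single occurrence (a yes/no test, not a count of matches) could use a shared iterator; the name is_subsequence is introduced here for illustration only.

```python
def is_subsequence(lookup_list, my_list):
    """Return True if the items of lookup_list appear in my_list in order."""
    it = iter(my_list)
    # `item in it` advances the iterator past the first hit, so each
    # following item must be found later in my_list.
    return all(item in it for item in lookup_list)


my_list = [5, 6, 3, 8, 2, 1, 7, 1]
print(is_subsequence([8, 2, 7], my_list))  # True
print(is_subsequence([7, 2, 8], my_list))  # False
```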
Comments