Count the frequency of a list of words within another list

Asked by: Onik Rahman Asked: 4/22/2022 Updated: 4/22/2022 Views: 553

Q:

I have a list of lists, and I want to see how often each one occurs in a sentence:

words = [plates, will]
sentence = [the, plates, will, still, shift, and, the, clouds, will, still, spew,]

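I want to count how many times a group of words is mentioned in a list. So for the list words, [plates, will] is mentioned only 1 time in the sentence.

I have a whole column that I want to iterate over.

The ideal output is:

words             frequency
[plates, will]    1
[still, spew]     1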

I tried this:

for word in word:
    if word in sentence:
        counts[word] += 1
    else:
        counts[word] = 1

[[word.count() for word in b if word in row] for row in b]

Any help getting the correct output?

Tags: Python, for-loop, multidimensional-array, NLP, nested-lists

Comments

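1 upvote Patrick Artner 4/22/2022
Your lists contain no strings ... at all. See How to Ask and minimal reproducible example.
0 upvotes Patrick Artner 4/22/2022
for word in word: makes no sense.
0 upvotes asha 4/22/2022
Which frequency do you want to see? The frequency of each word (e.g. the number of times "plates" is mentioned) or the frequency of the word combination (e.g. the number of times "plates, will" is mentioned)? Please clarify, because in the first example "will" is mentioned more than once.
0 upvotes Patrick Artner 4/22/2022
@Jake No, their lists contain variable names that are never defined. There are no strings at all. Lists of strings look like looks_like = ["T","h","i","s"] or like = ['T','h','i','s']
0 upvotes Onik Rahman 4/22/2022
I meant df['word''] @PatrickArtner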

Answers:

0 upvotes terrafox 4/22/2022 #1

This isn't a one-liner, but it does the job as I understand your request.

words = ["plates", "will"]
sentence = ["the", "plates", "will", "still", "shift", "and"]

count = 0
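# Go through each possible starting position in the sentence;
# the + 1 makes sure the last window at the end of the list is checked too
for si in range(len(sentence) - len(words) + 1):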
    match = True
    # Compare if the following words match
    for wi, word in enumerate(words):
        # Break if one word is wrong
        if sentence[si + wi] != word:
            match = False
            break
    if match:
        count += 1
print(count)
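
The same sliding-window idea can be wrapped in a function so it can be applied to several word groups, which seems to be what the question's "whole column" needs. This is only a sketch: the names count_sequence and word_groups are made up for illustration, and it assumes a group only counts when its words appear next to each other, in order.

```python
def count_sequence(sentence, words):
    """Count how often `words` occurs as a contiguous, ordered run in `sentence`."""
    n = len(words)
    if n == 0:
        return 0
    # Compare every window of length n against the word group.
    # If the sentence is shorter than the group, the range is empty and the count is 0.
    return sum(
        sentence[i:i + n] == words
        for i in range(len(sentence) - n + 1)
    )


sentence = ["the", "plates", "will", "still", "shift", "and",
            "the", "clouds", "will", "still", "spew"]

# Hypothetical stand-in for the column of word groups from the question
word_groups = [["plates", "will"], ["still", "spew"], ["will", "still"]]

for group in word_groups:
    print(group, count_sequence(sentence, group))
```

With the full sentence from the question, [plates, will] and [still, spew] each come out as 1, while [will, still] comes out as 2 because it occurs twice.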

0 upvotes Phantoms 4/22/2022 #2

I think a solution with Counter is simpler.

from collections import Counter

words = ['plates', 'will']
sentence = ['the', 'plates', 'will', 'still', 'shift', 'and', 'the', 'clouds', 'will', 'still', 'spew',]

word_counts = Counter(sentence)

for word in words:
    print(word, word_counts[word])
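
Note that the loop above counts each word on its own, so it reports plates 1 and will 2 rather than a single count for the pair. If the combination is what matters, one possible variation is to feed Counter with tuples of consecutive words instead of single words. This is a sketch under that assumption, and the name ngram_counts is just illustrative.

```python
from collections import Counter

words = ['plates', 'will']
sentence = ['the', 'plates', 'will', 'still', 'shift', 'and',
            'the', 'clouds', 'will', 'still', 'spew']

n = len(words)

# Zipping n staggered slices of the sentence yields every run of
# n consecutive words as a tuple, e.g. ('the', 'plates'), ('plates', 'will'), ...
ngram_counts = Counter(zip(*(sentence[i:] for i in range(n))))

# Counter keys are tuples here, so the word list must be converted for lookup
print(words, ngram_counts[tuple(words)])                    # ['plates', 'will'] 1
print(['will', 'still'], ngram_counts[('will', 'still')])   # ['will', 'still'] 2
```

Building the Counter once is handy when many word groups of the same length have to be looked up against the same sentence.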