Asked by: Shayan  Asked: 11/10/2023  Last edited by: Shayan  Updated: 11/10/2023  Views: 76
Finding the nouns in a sentence given the context in Python
Q:
How can I find the nouns in a sentence with respect to the context? I am using the nltk library as follows:
import nltk

text = 'I bought a vintage car.'
text = nltk.word_tokenize(text)
result = nltk.pos_tag(text)
result = [i for i in result if i[1] == 'NN']
# result = [('vintage', 'NN'), ('car', 'NN')]
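The problem with this script is that it treats vintage as a noun, which can be true, but in this context it is an adjective.
How can we accomplish this task?
Addendum: using textblob, we get "vintage car" as a noun: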
!python -m textblob.download_corpora
from textblob import TextBlob
txt = "I bought a vintage car."
blob = TextBlob(txt)
print(blob.noun_phrases) #['vintage car']
A:
2 upvotes
petezurich
11/10/2023
#1
Using spacy might solve your task. Try this:
import spacy

nlp = spacy.load("en_core_web_lg")

def analyze(text):
    doc = nlp(text)
    for token in doc:
        print(token.text, token.pos_)

analyze("I bought a vintage car.")
print()
analyze("This old wine is a vintage.")
Output:
I PRON
bought VERB
a DET
vintage ADJ <- correctly identified as adjective
car NOUN
. PUNCT
This DET
old ADJ
wine NOUN
is AUX
a DET
vintage NOUN <- correctly identified as noun
. PUNCT
1 upvote
Ro.oT
11/10/2023
#2
You can use spacy and noun_chunks to separate the nouns:
import spacy  # tested with version 3.6.1

nlp = spacy.load('en_core_web_sm')
doc = nlp('I bought a vintage car.')
noun = []
for chunk in doc.noun_chunks:
    for tok in chunk:
        if tok.pos_ == "NOUN":
            noun.append(tok.text)
print(noun)
Prints:
['car']
However, if you want to extract both nouns and adjectives, you can try the following, as suggested here:
noun_adj_pairs = {}
for chunk in doc.noun_chunks:
    adj = []
    noun = ""
    for tok in chunk:
        if tok.pos_ == "NOUN":
            noun = tok.text
        if tok.pos_ == "ADJ" or tok.pos_ == "CCONJ":  # accounts for both adjectives and conjunctions
            adj.append(tok.text)
    if noun:
        noun_adj_pairs.update({noun: " ".join(adj)})
print(noun_adj_pairs)
Prints:
{'car': 'vintage'}
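One thing to keep in mind with the dictionary: because it is keyed by the noun, a noun that occurs in two chunks keeps only the adjectives of the last one. A possible variant, sketched here under assumptions, collects one tuple per chunk instead. `noun_adj_list` is a hypothetical name, and the chunks are hand-written (text, pos) pairs standing in for `doc.noun_chunks`, not real model output.

```python
# Hypothetical variant of the loop above: one (noun, adjectives) tuple per chunk,
# so a repeated noun is not overwritten as it would be in a dict.
def noun_adj_list(chunks):
    """chunks: iterable of chunks, each an iterable of (text, pos) pairs."""
    pairs = []
    for chunk in chunks:
        adjs = [text for text, pos in chunk if pos in ("ADJ", "CCONJ")]
        nouns = [text for text, pos in chunk if pos == "NOUN"]
        if nouns:
            # As in the loop above, the last NOUN in the chunk is used as the noun.
            pairs.append((nouns[-1], " ".join(adjs)))
    return pairs

# Hand-written chunks with assumed tags (not produced by a model).
chunks = [
    [("a", "DET"), ("red", "ADJ"), ("car", "NOUN")],
    [("a", "DET"), ("blue", "ADJ"), ("car", "NOUN")],
]
print(noun_adj_list(chunks))  # [('car', 'red'), ('car', 'blue')]
```

With spaCy, the input could be built as `[[(t.text, t.pos_) for t in chunk] for chunk in doc.noun_chunks]`.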