Issue with stemCompletion in R

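Question:

Dear Stack Overflow community,

I have run into a problem while trying to complete a stemmed corpus with the function stemCompletion from the tm package (https://cran.r-project.org/web/packages/tm/tm.pdf) in R.

I have used this function successfully in the past. However, it no longer works, not even on the datasets I used back then.

As input to the function I use a preprocessed VCorpus. Everything works (tolower, removePunctuation, stripWhitespace, removeNumbers, removeWords, stemDocument) up to the stemCompletion step.

Here is part of the code I use: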

# Load Data
Data <-read.csv("AMAZON_FASHION_5.csv", header=TRUE, sep = ";", dec = ",", colClasses = "character", fill = TRUE)

# View Data
View(Data)

Data <-data.frame(Data)

# Define Corpus
Data$reviewText <- iconv(Data$reviewText, "WINDOWS-1252", sub="byte")
Text <- VCorpus(VectorSource(Data$reviewText))
writeLines(strwrap(as.character(Text[[69]])))

#Output:
#was terribly disappointed the pants were way too large in the legs my husband looked
#like he was wearing blown up clown pants

###then some code to preprocess the data is performed

# Create a PlainTextDocument
Text <- tm_map(Text, PlainTextDocument)

# Create a copy of object "Text" to use later as a dictionary for stemming completion
Text.copy <- Text

# Stem document 
Text_stemmed <- tm_map(Text, stemDocument, language = "english")

# Show comment Nr.69
writeLines(strwrap(as.character(Text_stemmed[[69]])))

#Output:
#terribl disappoint pant way larg leg husband look like wear blown clown pant

Text_comp <- stemCompletion(Text_stemmed, dictionary=Text.copy, type = "prevalent")
# Show comment Nr.69
writeLines(strwrap(as.character(Text_comp[[69]])))

#Output:
#character(0)

Can anyone help? What might be going wrong here?

I previously tried running stemCompletion without the PlainTextDocument step. However, that produced the following output:

writeLines(strwrap(as.character(Text_comp[[69]])))
69

Somehow the stemCompletion function seems to return an object of class character, because I cannot call the functions

meta(Text_comp)
inspect(Text_comp)

on this object:

Error in UseMethod("meta", x) : 
  no applicable method for 'meta' applied to an object of class "character"
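
To illustrate the class issue described above, here is a minimal sketch of stemCompletion called on a plain character vector of stems, which is the input type the tm documentation describes for its first argument. The stems and the dictionary words are toy values taken from comment Nr. 69 and are assumptions for illustration only, not my real data.

```r
library(tm)

# Toy stems and dictionary (illustrative values, not the real dataset)
stems      <- c("terribl", "disappoint", "pant")
dictionary <- c("terribly", "disappointed", "pants")

# stemCompletion takes a character vector of stems and returns a
# character vector named by those stems, not a corpus
result <- stemCompletion(stems, dictionary = dictionary, type = "prevalent")

print(class(result))
print(names(result))
```

Since the return value is a character vector, corpus functions such as meta() and inspect() have no method for it, which matches the error shown above.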

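I also tried the approach by Zhao (https://drive.google.com/file/d/1JSlWQLPrAUrtdLrGFuS8kckxhqHp885f/view; also mentioned in the Stack Overflow post "Issue with stemCompletion of Corpus for text mining in R (tm package)"). However, this did not produce the expected result either.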
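
For reference, a commonly suggested workaround is to complete the stems one document at a time and write the result back into the corpus. The following is only a sketch under assumptions: the helper name complete_text and the two toy corpora are hypothetical, and it is not claimed to be identical to the code in the linked file.

```r
library(tm)

# Hypothetical helper: split one document's text into stems, complete
# them against the dictionary, and paste the words back together
complete_text <- function(text, dictionary) {
  stems <- unlist(strsplit(text, "\\s+"))
  stems <- stems[nzchar(stems)]
  completed <- stemCompletion(stems, dictionary = dictionary, type = "prevalent")
  # Fall back to the stem itself when no completion was found
  completed <- ifelse(is.na(completed) | completed == "", stems, completed)
  paste(completed, collapse = " ")
}

# Toy corpora standing in for Text.copy and Text_stemmed (assumed values)
original <- VCorpus(VectorSource(c("terribly disappointed pants",
                                   "husband looked like clown")))
stemmed  <- VCorpus(VectorSource(c("terribl disappoint pant",
                                   "husband look like clown")))

# content_transformer keeps each element a text document, so the
# result is still a corpus rather than a bare character vector
completed <- tm_map(stemmed, content_transformer(complete_text),
                    dictionary = original)

writeLines(as.character(completed[[1]]))
```

Wrapping the helper in content_transformer is the design choice that matters here: the completion runs on the text of each document, while the corpus structure around it is preserved.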

Tags: r, nlp, tm, stemming
