R: remove data frame rows whose entries contain only numbers

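Question:

I am reading a data frame from an online csv file, but the person who created the file accidentally entered numbers in a column that should contain only city names. Here is a sample of the table, cities.data: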

City        Population   Foo   Bar
Seattle     10           foo1  bar1
98125       20           foo2  bar2
Kent 98042  30           foo3  bar3
98042 Kent  30           foo4  bar4

Desired output after removing the rows whose City column contains only digits:

City        Population   Foo   Bar
Seattle     10           foo1  bar1
Kent 98042  30           foo3  bar3
98042 Kent  30           foo4  bar4

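I want to remove the rows where the City column contains only digits. Both Kent 98042 and 98042 Kent are fine because they include a city name, but 98125 is not a city, so that row should be dropped.

I can't use is.numeric because the numbers are read from the csv file as strings. I tried a regular expression: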

cities.data <- cities.data[which(grepl("[0-9]+", cities.data) == FALSE)]

But this removes every row that contains any digit, not just the rows that contain only digits, for example:

City        Population   Foo   Bar
Seattle     10           foo1  bar1

"Kent 98042" was removed even though I want to keep that row. Any suggestions? Thank you!

Tags: r, regex, dataframe, filter, dplyr

df = read.table(text = "
City        Population   Foo   Bar
Seattle     10           foo1  bar1
98125       20           foo2  bar2
Kent98042  30           foo3  bar2
", header=T, stringsAsFactors=F)

library(dplyr)

df %>% filter(is.na(as.numeric(City)))

#        City Population  Foo  Bar
# 1   Seattle         10 foo1 bar1
# 2 Kent98042         30 foo3 bar2

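The idea is that as.numeric, applied to a character variable, returns NA unless the value consists only of a number. Rows where the result is NA are therefore the ones to keep.

If you want to use base R, you can use: df[is.na(as.numeric(df$City)),]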
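A minimal sketch of the same coercion idea applied to the question's sample, including the city names that contain spaces. The data frame `cities` and the variable names are hypothetical, chosen for illustration; `suppressWarnings()` is added only to hide the "NAs introduced by coercion" warning that as.numeric emits here.

```r
# Hypothetical sample mirroring the question; built directly so that
# city names containing spaces are preserved.
cities <- data.frame(
  City = c("Seattle", "98125", "Kent 98042", "98042 Kent"),
  Population = c(10, 20, 30, 30),
  stringsAsFactors = FALSE
)

# as.numeric() gives NA for any string it cannot parse as a number,
# so is.na() marks the rows that are not purely numeric.
not_numeric <- is.na(suppressWarnings(as.numeric(cities$City)))
kept <- cities[not_numeric, ]
print(kept)

# Caveat: strings such as "1e5" also parse as numbers, so this test
# would drop them as well.
sci_is_na <- is.na(suppressWarnings(as.numeric("1e5")))
print(sci_is_na)  # FALSE
```

If the column might hold values in scientific notation that should be kept, a digits-only regular expression is a stricter test than coercion.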

Answer #2 (Jan, 12/2/2017, 1 upvote)

With plain R:

df <- data.frame(City = c('Seattle', '98125', 'Kent 98042'),
                 Population = c(10, 20, 30),
                 Foo = c('foo1', 'foo2', 'foo3'))
df2 <- df[-grep('^\\d+$', df$City),]
df2

This yields

        City Population  Foo
1    Seattle         10 foo1
3 Kent 98042         30 foo3

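The idea is to look for ^\d+$ (digits only) and remove the matching rows from the set. Note the anchors on both sides: ^ and $ force the whole string to be digits, so "Kent 98042" does not match.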
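One thing to watch with the negative-index form is the case where nothing matches: grep then returns an empty index, and negating it selects zero rows instead of all of them. The sketch below is a possible variant that uses grepl with logical negation instead; the data frames `cities` and `no_digits` are hypothetical examples, not part of the original answer.

```r
# Hypothetical sample mirroring the question.
cities <- data.frame(
  City = c("Seattle", "98125", "Kent 98042", "98042 Kent"),
  Population = c(10, 20, 30, 30),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per row; the anchors ^ and $ make the
# pattern match only strings made up entirely of digits.
digits_only <- grepl("^[0-9]+$", cities$City)
kept <- cities[!digits_only, ]
print(kept)

# Edge case: a column with no digit-only entries.
no_digits <- data.frame(City = c("Seattle", "Kent"), stringsAsFactors = FALSE)

# -grep(...) becomes an empty index here, which selects no rows at all.
via_grep <- no_digits[-grep("^[0-9]+$", no_digits$City), , drop = FALSE]
# !grepl(...) is all TRUE here, which keeps every row.
via_grepl <- no_digits[!grepl("^[0-9]+$", no_digits$City), , drop = FALSE]

print(nrow(via_grep))   # 0
print(nrow(via_grepl))  # 2
```

The logical form behaves the same whether or not any row matches, which is why it may be the safer choice when the data is not known in advance.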
