Selecting data frame rows based on partial string match in a column

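Question:

I want to select rows from a data frame based on a partial match of a string in a column, e.g. column "x" contains the string "hsa". Using sqldf, if it had a like syntax, I would do something like: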

select * from <> where x like 'hsa'.

Unfortunately, sqldf does not support that syntax.

Or similarly:

selectedRows <- df[ , df$x %like% "hsa-"]

Which of course doesn't work.

Can somebody please help me with this?

Tags: r, regex, string, match, subset

mtcars[grep("Merc", rownames(mtcars)), ]
#              mpg cyl  disp  hp drat   wt qsec vs am gear carb
# Merc 240D   24.4   4 146.7  62 3.69 3.19 20.0  1  0    4    2
# Merc 230    22.8   4 140.8  95 3.92 3.15 22.9  1  0    4    2
# Merc 280    19.2   6 167.6 123 3.92 3.44 18.3  1  0    4    4
# Merc 280C   17.8   6 167.6 123 3.92 3.44 18.9  1  0    4    4
# Merc 450SE  16.4   8 275.8 180 3.07 4.07 17.4  0  0    3    3
# Merc 450SL  17.3   8 275.8 180 3.07 3.73 17.6  0  0    3    3
# Merc 450SLC 15.2   8 275.8 180 3.07 3.78 18.0  0  0    3    3

And another example, using the iris dataset and searching for the string "osa":

irisSubset <- iris[grep("osa", iris$Species), ]
head(irisSubset)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

For your problem, try:

selectedRows <- conservedData[grep("hsa-", conservedData$miRNA), ]
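
A minimal sketch of the same idea with grepl(), which may be easier to combine with other conditions because it returns one TRUE/FALSE per row. The conservedData frame below is a hypothetical stand-in with made-up values (the asker's real data is not shown), and fixed = TRUE is an optional choice that treats "hsa-" as a literal string rather than a regular expression. Note that the condition sits before the comma, in the row position, which is where the attempt in the question went wrong.

```r
# Hypothetical stand-in for the asker's data; values are invented for illustration.
conservedData <- data.frame(
  miRNA = c("hsa-let-7a", "mmu-miR-21", "hsa-miR-155", NA),
  score = c(0.91, 0.42, 0.77, 0.10),
  stringsAsFactors = FALSE
)

# grepl() returns a logical vector with one element per row.
# fixed = TRUE matches "hsa-" literally instead of as a regex.
keep <- grepl("hsa-", conservedData$miRNA, fixed = TRUE)

# Put the condition in the row position (before the comma).
selectedRows <- conservedData[keep, ]
print(selectedRows)
```

To match only values that start with the string, a regular expression such as "^hsa-" (without fixed = TRUE) should work in the same place.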

library(stringr)
library(dplyr)

CO2 %>%
  filter(str_detect(Treatment, "non"))

   Plant        Type  Treatment conc uptake
1    Qn1      Quebec nonchilled   95   16.0
2    Qn1      Quebec nonchilled  175   30.4
3    Qn1      Quebec nonchilled  250   34.8
4    Qn1      Quebec nonchilled  350   37.2
5    Qn1      Quebec nonchilled  500   35.3
...

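This filters the sample CO2 dataset (which ships with R) for rows where the Treatment variable contains the substring "non". You can adjust whether str_detect looks for fixed matches or uses regular expressions; see the documentation for the stringr package.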
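
As a rough illustration of that last point, the sketch below uses stringr's fixed() and regex() modifiers with str_detect() on the same CO2 data. It assumes stringr is installed; the variable names (treatment, literal, insensitive) are only for illustration, and base subsetting is used instead of dplyr to keep the example short.

```r
library(stringr)

# Treatment is a factor in CO2; convert it to character explicitly.
treatment <- as.character(CO2$Treatment)

# fixed() matches the pattern as literal characters, not as a regex.
literal <- CO2[str_detect(treatment, fixed("non")), ]

# regex() allows options such as case-insensitive matching;
# "^" anchors the match to the start of the string.
insensitive <- CO2[str_detect(treatment, regex("^NON", ignore_case = TRUE)), ]

nrow(literal)
nrow(insensitive)
```

Both calls should select the same "nonchilled" rows here; the difference only matters when the search string contains regex metacharacters or when case should be ignored.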
