R: How can I check that the characters in one column have the same value in another column?

Asked by: Payam Rahmani  Asked: 11/27/2022  Last edited by: TarJae  Updated: 11/29/2022  Views: 154

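Q:

I am a beginner in R and I need to learn how to write the code for this. As you can see in my data frame, I want to check whether eggs in the commodity column have the same unit in all rows.

Data frame: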

df <- structure(list(commodity = c("eggs", "lentils (green)", "oil (vegetable)", 
"rice", "sugar (white)", "eggs", "lentils (green)", "oil (vegetable)", 
"rice", "sugar (white)", "eggs"), unit = c("1.8 kg", "900 g", 
"810 g", "kg", "kg", "1.8 kg", "900 g", "810 g", "kg", "kg", 
"1.8 kg")), class = "data.frame", row.names = c(NA, -11L))

         commodity   unit
1             eggs 1.8 kg
2  lentils (green)  900 g
3  oil (vegetable)  810 g
4             rice     kg
5    sugar (white)     kg
6             eggs 1.8 kg
7  lentils (green)  900 g
8  oil (vegetable)  810 g
9             rice     kg
10   sugar (white)     kg
11            eggs 1.8 kg

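I don't know how to approach this.

Tags: r duplicates equality

Comments

0 upvotes Payam Rahmani 11/28/2022
Thanks for your answer, but I want to know whether, for example, eggs have the same unit in all rows, and likewise for the other items.
0 upvotes Mike 11/29/2022
Can you share the expected output?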

Answers:

1 upvote TarJae 11/27/2022 #1

One way could be to first create a column that keeps only the letters from your unit, and then use distinct():

library(dplyr)

df %>% 
  mutate(unit1 = gsub("[^a-zA-Z]", "", unit)) %>% 
  distinct(unit1)
  unit1
1    kg
2     g
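
The follow-up comment asks for a check per commodity rather than across the whole column. The same letters-only idea can be grouped by commodity; the following is only a sketch, assuming that "same unit" means the same letters after the quantity is stripped. The names per_commodity, n_units and same_unit are illustrative and do not come from the answer above.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly with rep()
df <- data.frame(
  commodity = rep(c("eggs", "lentils (green)", "oil (vegetable)",
                    "rice", "sugar (white)"), length.out = 11),
  unit = rep(c("1.8 kg", "900 g", "810 g", "kg", "kg"), length.out = 11)
)

per_commodity <- df %>%
  # keep letters only, e.g. "1.8 kg" becomes "kg"
  mutate(unit1 = gsub("[^a-zA-Z]", "", unit)) %>%
  group_by(commodity) %>%
  # count distinct unit names per commodity; exactly one means consistent
  summarise(n_units = n_distinct(unit1),
            same_unit = n_units == 1)

per_commodity
```

Each commodity gets one row, and same_unit is TRUE when all of its rows share one unit name. Filtering with filter(!same_unit) would list only the inconsistent commodities.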

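Comments

0 upvotes Payam Rahmani 11/28/2022
Thanks for your answer, but I want to know whether, for example, eggs have the same unit in all rows, and likewise for the other items.

1 upvote akrun 11/27/2022 #2

In base R, we can use: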

length(unique(trimws(df$unit, whitespace = "[0-9.]+\\s+"))) == 1
[1] FALSE

If you want to check a subset of the elements:

with(df, length(unique(trimws(unit[grepl("eggs", commodity)], 
    whitespace = "[0-9.]+\\s+"))) == 1)
[1] TRUE

If we want to check all the elements:

library(dplyr)
library(stringr)
df %>% 
  group_by(item = str_extract(commodity, "^\\w+(?=\\s*)")) %>% 
  summarise(isUnitSame = n_distinct(str_extract(unit, "[a-z]+$"))==1)

Output:

# A tibble: 5 × 2
  item    isUnitSame
  <chr>   <lgl>     
1 eggs    TRUE      
2 lentils TRUE      
3 oil     TRUE      
4 rice    TRUE      
5 sugar   TRUE     
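
For readers who prefer to stay in base R, the per-commodity check can also be sketched with tapply(). This is an illustrative variant, not part of the answer above; it assumes a unit is a unit name optionally preceded by a quantity, and the names unit_name, same_unit and same_full are made up for the example.

```r
# Same data as in the question, rebuilt compactly with rep()
df <- data.frame(
  commodity = rep(c("eggs", "lentils (green)", "oil (vegetable)",
                    "rice", "sugar (white)"), length.out = 11),
  unit = rep(c("1.8 kg", "900 g", "810 g", "kg", "kg"), length.out = 11)
)

# Strip a leading quantity such as "1.8 " or "900 " so only the unit name remains
unit_name <- sub("^[0-9.]+\\s*", "", df$unit)

# For each commodity, TRUE if every row carries the same unit name
same_unit <- tapply(unit_name, df$commodity,
                    function(u) length(unique(u)) == 1)

# Stricter variant: compare the full unit string, quantity included
same_full <- tapply(df$unit, df$commodity,
                    function(u) length(unique(u)) == 1)

same_unit[["eggs"]]  # check a single commodity by name
```

The result is a named logical vector with one entry per commodity, so same_unit[["eggs"]] answers the question for eggs alone. Unlike the grouped version above, this keeps the full commodity name, for example "lentils (green)", as the group key.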

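Comments

0 upvotes Payam Rahmani 11/28/2022
Thanks for your answer, but I want to know whether, for example, eggs have the same unit in all rows, and likewise for the other items.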