Finding all columns in data frame of a certain value by row

Asked by: confusedindividual  Asked: 4/3/2022  Last edited by: Maël  Updated: 4/4/2022  Views: 793

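Q:

For each row of a data frame, I am trying to find the first column that contains a specific number and the last column that contains that same value. See the example data and desired output below for the case where the number is 4.

Example data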

ID WZ_1 WZ_2 WZ_3 WZ_4
1  5    4    4    3 
2  4    4    3    3
3  4    4    4    4 

Example output

ID First Last 
1  WZ_2  WZ_3
2  WZ_1  WZ_2
3  WZ_1  WZ_4 

Tags: r dataframe data-manipulation

A:

2 upvotes Sweepy Dodo 4/3/2022 #1
library(data.table)

# dummy data
# use setDT(df) if yours isn't a datatable already
df <- data.table(id = 1:3
                 , a = c(4,4,0)
                 , b = c(0,4,0)
                 , c = c(4,0,4)
                 ); df
   id a b c
1:  1 4 0 4
2:  2 4 4 0
3:  3 0 0 4

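# find 1st & last column with target value
# note: .SD includes id here, so positions from which() line up with names(df);
# an id equal to the target value would also be matched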
df[, .(id
       , first = apply(.SD, 1, \(i) names(df)[min(which(i==4))])
       , last = apply(.SD, 1, \(i) names(df)[max(which(i==4))])
       )
   ]

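Comments

1 upvote confusedindividual 4/3/2022
Thank you so much! I really appreciate it.
1 upvote zephryl 4/3/2022
@Alaina If this answers your question, please mark it as the answer. (If you found it helpful, you can also upvote it.)
1 upvote Sweepy Dodo 4/3/2022
Thanks, zephryl, AndrewGB and Alaina.

0 upvotes AndrewGB 4/3/2022 #2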

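Here is a tidyverse option: I put the data into long format, then filter to keep only the rows where the value is 4, and of those only the first and last occurrence per ID. Then I create a new column to denote whether a row is the first or the last occurrence, and pivot back to wide format.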

library(tidyverse)

df %>% 
  pivot_longer(-ID) %>% 
  group_by(ID) %>% 
  filter(value == 4) %>% 
  filter(row_number()==1 | row_number()==n()) %>% 
  mutate(col = c("First", "Last")) %>% 
  pivot_wider(names_from = "col", values_from = "name") %>% 
  select(-value)

Output

     ID First Last 
  <int> <chr> <chr>
1     1 WZ_2  WZ_3 
2     2 WZ_1  WZ_2 
3     3 WZ_1  WZ_4 
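
One caveat with the pipeline above: mutate(col = c("First", "Last")) needs exactly two rows left per ID, so it errors when a row contains the target value only once. A minimal sketch of a variant that tolerates that case, assuming it is acceptable to use summarise() with first() and last() instead of the pivot back to wide format. The sample data here is modified for illustration (ID 2 contains a single 4) and is not the asker's data:

```r
library(dplyr)
library(tidyr)

# illustrative data: ID 2 has only one 4
df <- data.frame(ID = 1:3,
                 WZ_1 = c(5L, 4L, 4L), WZ_2 = c(4L, 3L, 4L),
                 WZ_3 = c(4L, 3L, 4L), WZ_4 = c(3L, 3L, 4L))

res <- df %>%
  pivot_longer(-ID) %>%          # one row per ID/column pair, in column order
  filter(value == 4) %>%         # keep only cells equal to the target
  group_by(ID) %>%
  summarise(First = first(name), # first matching column name per ID
            Last  = last(name))  # last matching column name per ID

res
```

With a single match, First and Last are the same column. IDs with no 4 at all drop out of the result, because filter() removes all of their rows.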

Data

df <- structure(list(ID = 1:3, WZ_1 = c(5L, 4L, 4L), WZ_2 = c(4L, 4L, 
4L), WZ_3 = c(4L, 3L, 4L), WZ_4 = c(3L, 3L, 4L)), class = "data.frame", row.names = c(NA, 
-3L))

2 upvotes Maël 4/3/2022 #3

With max.col:

data.frame(ID = df$ID,
           First = names(df)[max.col(df == 4, ties.method = "first")],
           Last = names(df)[max.col(df == 4, ties.method = "last")])

  ID First Last
1  1  WZ_2 WZ_3
2  2  WZ_1 WZ_2
3  3  WZ_1 WZ_4

Data

df <- read.table(header= T, text= "ID WZ_1 WZ_2 WZ_3 WZ_4
1  5    4    4    3 
2  4    4    3    3
3  4    4    4    4 ")

Comments

0 upvotes AndrewGB 4/4/2022
It also works with dplyr: df %>% mutate(First = names(.)[max.col(. == 4, ties.method = "first")], Last = names(.)[max.col(. == 4, ties.method = "last")])