Splitting a data.frame into a list of smaller data.frames, each containing a pair of types

Asked by: Simon Harmel  Asked: 10/20/2021  Last edited by: Simon Harmel  Updated: 10/20/2021  Views: 69

Q:

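I was wondering how to split my data (below) so that I get a list of smaller data.frames, each containing a unique pair of type values?

My desired_output is shown below.

Please note that this is just toy data, so type can be any other variable. Also note that if a particular type has only one row (like type == 4), I want to exclude it with a warning saying: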

type 4 has just one row thus is excluded.

m=
"
  obs   type
    1   1
    2   1
    3   a
    4   a
    5   3
    6   3
    7   4
"
data <- read.table(text = m, h=T)


desired_output <-list(
  
  data.frame(obs=1:4,   type=c(1,1,"a","a")),
  
  data.frame(obs=c(1,2,5,6),   type=c(1,1,3,3)),
  
  data.frame(obs=3:6,   type=c("a","a",3,3))
)

# warning: type 4 has just one row thus is excluded.

Tags: r, dataframe, function, dplyr, tidyverse


A:

5 upvotes  Ronak Shah  10/20/2021  #1

Here is a base R function -

return_list_data <- function(data, type) {
  unique_counts <- table(data[[type]])
  single_count <- names(unique_counts[unique_counts == 1])
  if(length(single_count)) {
    warning(sprintf('%s %s has just one row thus is excluded.', type, toString(single_count)))
  }
  multiple_count <- names(unique_counts[unique_counts > 1])
  
  combn(multiple_count, 2, function(x) {
    data[data[[type]] %in% x, ]
  }, simplify = FALSE)  
}

This returns -

return_list_data(data, 'type')

#[[1]]
#  obs type
#1   1    1
#2   2    1
#5   5    3
#6   6    3

#[[2]]
#  obs type
#1   1    1
#2   2    1
#3   3    a
#4   4    a

#[[3]]
#  obs type
#3   3    a
#4   4    a
#5   5    3
#6   6    3

#Warning message:
#In return_list_data(data, "type") :
#  type 4 has just one row thus is excluded.

If no type has only a single row, no warning is generated, e.g. return_list_data(data[-7, ], 'type').
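The steps inside return_list_data can be traced one at a time on the toy data. The following is only an illustrative sketch: the data.frame is rebuilt by hand here (assuming type is read as character), and the intermediate variable pairs is a name introduced for this example, not part of the answer.

```r
# Toy data from the question, rebuilt by hand (assumption: type is character)
data <- data.frame(obs = 1:7,
                   type = c("1", "1", "a", "a", "3", "3", "4"),
                   stringsAsFactors = FALSE)

# Step 1: count the rows for each type
unique_counts <- table(data[["type"]])

# Step 2: keep only the types that occur more than once
multiple_count <- names(unique_counts[unique_counts > 1])

# Step 3: enumerate every unordered pair of the remaining types
# ('pairs' is a hypothetical name used only in this sketch)
pairs <- combn(multiple_count, 2, simplify = FALSE)

print(multiple_count)
print(length(pairs))  # one list element per pair of types
```

Each element of pairs is a length-2 character vector, which is what the anonymous function in return_list_data receives as x before subsetting with %in%.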

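Comments

0 upvotes  Ronak Shah  10/20/2021
Sure, type can be a variable, but your question did not include that information.
0 upvotes  Ronak Shah  10/20/2021
You can change the last line to setNames(combn(multiple_count, 2, function(x) { data[data[[type]] %in% x, ] }, simplify = FALSE), combn(multiple_count, 2, paste, collapse = '-'))
0 upvotes  Simon Harmel  11/27/2021
Hi Ronak, do you happen to know an answer to this function question?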
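
The naming suggestion from the comments can be put together as a complete function. This is a minimal sketch under stated assumptions: the name return_named_list_data and the intermediate variable subsets are hypothetical, introduced only for this example, and the toy data is rebuilt by hand with type as character.

```r
# Hypothetical variant of return_list_data that names each list element
# after the pair of types it contains (names joined with "-")
return_named_list_data <- function(data, type) {
  unique_counts <- table(data[[type]])
  single_count <- names(unique_counts[unique_counts == 1])
  if (length(single_count)) {
    warning(sprintf("%s %s has just one row thus is excluded.",
                    type, toString(single_count)))
  }
  multiple_count <- names(unique_counts[unique_counts > 1])

  # One data.frame per pair of types
  subsets <- combn(multiple_count, 2, function(x) {
    data[data[[type]] %in% x, ]
  }, simplify = FALSE)

  # Label each element with its pair, built in the same combn order
  setNames(subsets, combn(multiple_count, 2, paste, collapse = "-"))
}

# Toy data from the question (assumption: type is character)
data <- data.frame(obs = 1:7,
                   type = c("1", "1", "a", "a", "3", "3", "4"),
                   stringsAsFactors = FALSE)

res <- suppressWarnings(return_named_list_data(data, "type"))
print(names(res))
```

Note that combn stops with an error when fewer than two types have more than one row, so a caller may want to check length(multiple_count) first.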
1 upvote  Park  10/20/2021  #2

You may try using dplyr,

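library(dplyr)

df1 <- read.table(text = m, h=T)
fun <- function(df1){
  # keep only the types that have more than one row
  df2 <- df1 %>%
    group_by(type) %>%
    filter(n() > 1)

  # each column of df3 holds one pair of types
  df3 <- combn(unique(df2$type), 2) %>% as.data.frame

  # one (grouped) tibble per pair
  df4 <- lapply(df3, function(x){
    df2 %>%
      filter(type %in% x)
  })
  # types with a single row, reported in one warning
  war <- df1 %>%
    group_by(type) %>%
    filter(n() <= 1) %>%
    pull(type) %>%
    unique
  if (length(war) > 0){
    warning(paste("type", toString(war), "has just one row thus is excluded"))
  }
  return(df4)
}
fun(df1)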

Result:

$V1
# A tibble: 4 x 2
# Groups:   type [2]
    obs type 
  <int> <chr>
1     1 1    
2     2 1    
3     3 a    
4     4 a    

$V2
# A tibble: 4 x 2
# Groups:   type [2]
    obs type 
  <int> <chr>
1     1 1    
2     2 1    
3     5 3    
4     6 3    

$V3
# A tibble: 4 x 2
# Groups:   type [2]
    obs type 
  <int> <chr>
1     3 a    
2     4 a    
3     5 3    
4     6 3 

Warning message:
In fun(df1) : type 4 has just one row thus is excluded