Pivot longer multiple columns while pivot wider others

Asked by: principe  Asked: 2/21/2021  Updated: 10/30/2023  Views: 1028

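Q:

Hello, I have a dataset with 3-5 rows per group, as shown below, and I would like to put some columns into a longer format and others into a wider format.

The first dataset below is the original format, and I would like to convert it to the second one. I tried pivot_wider with cols = c("Jan", "Feb"), but I could not pivot the Type column longer at the same time.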

data <- as.data.frame(matrix(ncol=5, nrow=6))
colnames(data) <- c("names", "group", "Type", "Jan", "Feb")
data$names <- c("P1", "P1", "P1", "P2", "P2", "P2")
data$group <- "S"
data$Type <- c("Beg", "Middle", "End", "Beg", "Middle", "End")
data$Jan <- c(1, 2, 3, 10, 5, 15)
data$Feb <- c(5, 5, 10, 5, 2, 7)

    
   names group Type     Jan  Feb
1   P1    S    Beg       1   5
2   P1    S    Middle    2   5
3   P1    S    End       3   10
4   P2    S    Beg       10  5
5   P2    S    Middle    5   2
6   P2    S    End       15  7


data_transformed <- as.data.frame(matrix(ncol=6, nrow=4))
colnames(data_transformed) <- c("names", "group", "Month", "Beg", "Middle", "End")
data_transformed$names <- c("P1", "P1", "P2", "P2")
data_transformed$group <- "S"
data_transformed$Month <- c("Jan", "Feb")
data_transformed$Beg <- c(1, 5, 10, 5)
data_transformed$Middle <- c(2, 5, 5, 2)
data_transformed$End <- c(3, 10, 15, 7)

  names group Month   Beg Middle End
1   P1  S     Jan      1    2    3
2   P1  S     Feb      5    5    10
3   P2  S     Jan      10   5    15
4   P2  S     Feb      5    2    7
r dplyr pivot data-manipulation

A:

3 votes  akrun  2/21/2021  #1

Here we need pivot_longer + pivot_wider: first reshape the columns 'Jan' and 'Feb' to long format, then reshape the long result to a wider format, with the new column names taken from 'Type'.

library(dplyr)
library(tidyr)
data %>%
     pivot_longer(cols = Jan:Feb, names_to = 'Month') %>% 
     pivot_wider(names_from = Type, values_from = value)

Output:

# A tibble: 4 x 6
#  names group Month   Beg Middle   End
#  <chr> <chr> <chr> <dbl>  <dbl> <dbl>
#1 P1    S     Jan       1      2     3
#2 P1    S     Feb       5      5    10
#3 P2    S     Jan      10      5    15
#4 P2    S     Feb       5      2     7
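
To make the two steps in this answer easier to follow, here is a minimal sketch that runs them separately and keeps the intermediate long table. The object names `long` and `wide` are illustrative only and are not part of the original answer; the data frame is rebuilt with data.frame() so the block runs on its own.

```r
library(tidyr)

# Same toy data as in the question, rebuilt so the block is self-contained
data <- data.frame(
  names = c("P1", "P1", "P1", "P2", "P2", "P2"),
  group = "S",
  Type  = c("Beg", "Middle", "End", "Beg", "Middle", "End"),
  Jan   = c(1, 2, 3, 10, 5, 15),
  Feb   = c(5, 5, 10, 5, 2, 7)
)

# Step 1: stack Jan and Feb into a Month/value pair (6 rows become 12)
long <- pivot_longer(data, cols = c(Jan, Feb),
                     names_to = "Month", values_to = "value")

# Step 2: spread the Type values into their own columns (12 rows become 4)
wide <- pivot_wider(long, names_from = Type, values_from = value)

nrow(long)  # 12
nrow(wide)  # 4
```

After step 1 every row is identified by names, group, Type and Month, so step 2 can use Type for the new column names and leave one row per names/group/Month combination.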

Or with recast from reshape2:

library(reshape2)
# melt on Jan/Feb, then cast Type into columns
recast(data, measure.var = c("Jan", "Feb"),
     names + group + variable ~ Type, value.var = 'value')

1 vote  ThomasIsCoding  2/21/2021  #2

A data.table option using dcast + melt:

library(data.table)

dcast(
  melt(
    setDT(data),
    id.vars = c("names", "group", "Type"),
    variable.name = "Month"
  ),
  names + group + Month ~ Type
)

   names group Month Beg End Middle
1:    P1     S   Jan   1   3      2
2:    P1     S   Feb   5  10      5
3:    P2     S   Jan  10  15      5
4:    P2     S   Feb   5   7      2
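
1 vote  G. Grothendieck  10/29/2023  #3

This is a few years late, but the procs package (which may not have existed on CRAN at the time) has proc_transpose, which can transpose by group.

The code below specifies the columns to group by (by=). The id column (id=) is the input column whose values become the column names of the output data frame. It does not need to be specified here because, once the two grouping columns are removed, only one character column remains, and the default is to assume that a single character column is the id column. The name of the new output column that holds the input column names (name=) is set to "Month", but it could be omitted if the default, NAME, were sufficient.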

library(procs)

proc_transpose(data, by = c("names", "group"), name = "Month")
##   names group Month Beg Middle End
## 1    P1     S   Jan   1      2   3
## 2    P1     S   Feb   5      5  10
## 3    P2     S   Jan  10      5  15
## 4    P2     S   Feb   5      2   7