Asked by: HappyPy  Asked: 11/10/2023  Last edited by: ThomasIsCodingHappyPy  Updated: 11/10/2023  Views: 114
split character vector with each distinct element having an equal amount of bins
Q:
x <- rep(c("A","B","C"),times=c(6,8,3))
"A" "A" "A" "A" "A" "A" "B" "B" "B" "B" "B" "B" "B" "B" "C" "C" "C"
I'm struggling to create a vector that assigns the elements of each letter to 3 bins:
(A A A A A A B B B B B B B B C C C)
x_bin = 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
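In this example, A can be divided into 3 bins by combining every 2 values. B can be divided into 3 bins by combining 3, 3 and 2 values. And C can only be divided into 3 bins of 1 value each.
Is there a function that allows me to do this? I tried cut, but it only works for numeric data and it did not cut the way I wanted.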
dplyr
cut
A:
6 votes
r2evans
11/10/2023
#1
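We can use ave to group by letter, then let rep(1:3, length.out=) fill each group to the right length. This guarantees that the numbered bins (within each letter) are either equal in size or differ by no more than 1.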
ave(rep(1L, length(x)), x, FUN = function(z) rep(1:3, length.out = length(z)))
# [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 1 2 3
If you want all the 1s first, then the 2s, and so on, we can sort:
ave(rep(1L, length(x)), x, FUN = function(z) sort(rep(1:3, length.out = length(z))))
# [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
Verification:
ave(rep(1L, length(x)), x, FUN = function(z) sort(rep(1:3, length.out = length(z)))) |>
all.equal(x_bin)
# [1] TRUE
Data
x <- rep(c("A","B","C"),times=c(6,8,3))
x_bin <- c(1, 1, 2, 2, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 1, 2, 3)
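The sort variant above can be wrapped in a small helper so that the number of bins becomes a parameter. This is only a sketch: the name bin_within and its n_bins argument are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (name and signature are assumptions, not from the answer):
# numbers the elements of each group 1..n_bins, with bin sizes that differ
# by at most 1 within a group.
bin_within <- function(x, n_bins = 3L) {
  ave(
    rep(1L, length(x)), x,
    FUN = function(z) sort(rep(seq_len(n_bins), length.out = length(z)))
  )
}

x <- rep(c("A", "B", "C"), times = c(6, 8, 3))
bin_within(x)
# [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
```

Because ave groups by value rather than by position, the helper should also work when equal letters are not adjacent in x.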
Comments
3 votes
ThomasIsCoding
11/10/2023
sort is a good idea, cheers!
5 votes
ThomasIsCoding
11/10/2023
#2
- Try rep within ave
ave(
seq_along(x),
x,
FUN = \(v) {
rep(1:3,
each = ceiling(length(v) / 3),
length.out = length(v)
)
}
)
- Or, another trick with matrix within ave
ave(
seq_along(x),
x,
FUN = \(v)
col(matrix(nrow = ceiling(length(v) / 3), ncol = 3))[seq_along(v)]
)
which should give
1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
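The each = idea and the sort idea agree on the example data, but they do not appear to be interchangeable for every group size. A quick check with a made-up group of four elements (not taken from the question) suggests that the each = variant can leave the last bin empty, while sort(rep(...)) still uses all three bins:

```r
n <- 4  # hypothetical group size, not taken from the question

# each = ceiling(4 / 3) = 2, so the sequence 1 1 2 2 3 3 is cut off after 4
by_each <- rep(1:3, each = ceiling(n / 3), length.out = n)
by_each
# [1] 1 1 2 2

# recycle 1 2 3 to length 4 first, then sort
by_sort <- sort(rep(1:3, length.out = n))
by_sort
# [1] 1 1 2 3
```

Which behaviour is preferable depends on whether every bin must be non-empty or whether bins should be filled to capacity first.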
Comments
3 votes
r2evans
11/10/2023
each= is another good idea :-)
3 votes
PGSA
11/10/2023
#3
x <- rep(c("A","B","C"),times=c(6,8,3))
xdf <- data.frame(x = x)
library(tidyverse)
xdf |> group_by(x) |> mutate(bin = rep(1:3, length.out = n())) |> arrange(x, bin)
gives
x bin
<chr> <int>
1 A 1
2 A 1
3 A 2
4 A 2
5 A 3
6 A 3
7 B 1
8 B 1
9 B 1
10 B 2
11 B 2
12 B 2
13 B 3
14 B 3
15 C 1
16 C 2
17 C 3
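The arrange(x, bin) step above reorders the rows, which is harmless here because x is already ordered by letter. If the original row order matters, one possible variation (a sketch, not the original answer's code) is to sort the bin labels inside mutate instead, assuming dplyr is installed:

```r
library(dplyr)

xdf <- data.frame(x = rep(c("A", "B", "C"), times = c(6, 8, 3)))

res <- xdf |>
  group_by(x) |>
  # sort the recycled labels within each group; rows themselves are not moved
  mutate(bin = sort(rep(1:3, length.out = n()))) |>
  ungroup()

res$bin
# [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
```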
3 votes
Maël
11/10/2023
#4
Another way is rle + rep:
with(rle(x),
sapply(seq(length(values)),
\(z) rep(1:3,
each = ceiling(lengths[z] / 3),
length.out = lengths[z]))
) |>
unlist()
#[1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
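Note that rle works on runs of consecutive equal values, so the approach above relies on x being ordered by letter, as it is in the question. The sketch below uses a made-up vector x2 in which "A" is not contiguous, and compares a run-based numbering with a value-based one using ave:

```r
x2 <- c("A", "A", "B", "A")  # hypothetical input, not from the question

r <- rle(x2)
r$lengths
# [1] 2 1 1

# run-based: the trailing "A" is treated as its own group
by_run <- unlist(lapply(
  r$lengths,
  \(n) rep(1:3, each = ceiling(n / 3), length.out = n)
))
by_run
# [1] 1 2 1 1

# value-based: all three "A" entries are binned together
by_value <- ave(
  rep(1L, length(x2)), x2,
  FUN = \(z) sort(rep(1:3, length.out = length(z)))
)
by_value
# [1] 1 2 1 3
```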
3 votes
Yuriy Saraykin
11/10/2023
#5
times <- c(6,8,3)
x <- rep(c("A","B","C"),times=times)
CUT <- ceiling(times / 3)
x_bin <- unlist(sapply(CUT, function(x) rep(seq(3), each = x)))
x_bin
#> [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 3 1 2 3
Created on 2023-11-10 with reprex v2.0.2
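One thing to watch in the output above: it has 18 values while x has 17, because rep(seq(3), each = 3) yields nine values for the eight B entries. A minimal sketch of one way to trim each group to its own length, pairing CUT with times through Map and length.out (a variation, not the original answer's code):

```r
times <- c(6, 8, 3)
CUT <- ceiling(times / 3)

# k = values per bin, n = size of the group; length.out truncates to n
x_bin <- unlist(Map(
  function(k, n) rep(seq(3), each = k, length.out = n),
  CUT, times
))
x_bin
#> [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
length(x_bin) == sum(times)
#> [1] TRUE
```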
4 votes
G. Grothendieck
11/10/2023
#6
1) We can use ave/cut like this:
ave(x == x, x, FUN = \(x) cut(seq_along(x), 3))
## [1] 1 1 2 2 3 3 1 1 1 2 2 3 3 3 1 2 3
2) Another possibility is unlist/tapply/cut:
unlist(tapply(x, x, \(x) cut(seq_along(x), 3, FALSE)))
## A1 A2 A3 A4 A5 A6 B1 B2 B3 B4 B5 B6 B7 B8 C1 C2 C3
## 1 1 2 2 3 3 1 1 1 2 2 3 3 3 1 2 3
Update
Made a minor improvement to (1) and added (2).
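The B group in both results above is 1 1 1 2 2 3 3 3 rather than the 1 1 1 2 2 2 3 3 asked for in the question. That follows from cut splitting the index range into intervals of equal width, not into groups of near-equal count. A small sketch comparing the bin sizes for a group of eight:

```r
n <- 8  # size of the B group in the question

# equal-width intervals over 1..8: indices 1-3, 4-5, 6-8
cut_sizes <- as.vector(table(cut(seq_len(n), 3, labels = FALSE)))
cut_sizes
# [1] 3 2 3

# recycled and sorted labels: larger bins come first
rep_sizes <- as.vector(table(sort(rep(1:3, length.out = n))))
rep_sizes
# [1] 3 3 2
```

Both splits keep bin sizes within 1 of each other here; they differ only in which bin ends up smaller.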
3 votes
jay.sf
11/10/2023
#7
Try this
> table(x) |> Map(\(...) sort(rep_len(...)), list(1:3), length.out=_) |> unlist()
[1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
The number of bins, n=3, is defined in list(1:3).
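The one-liner above relies on the _ placeholder of the native pipe, which, as far as I know, needs R 4.2 or later. A rough equivalent without the placeholder, written as a sketch with lapply over the group counts (not the original answer's code):

```r
x <- rep(c("A", "B", "C"), times = c(6, 8, 3))

# table(x) gives the size of each group; build sorted bin labels per group
res <- unlist(
  lapply(table(x), \(n) sort(rep_len(1:3, n))),
  use.names = FALSE
)
res
# [1] 1 1 2 2 3 3 1 1 1 2 2 2 3 3 1 2 3
```

Like the original, this returns the bins in the order of the sorted group names, so it lines up with x only when x is already ordered by letter.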
Comments