Asked by: Max  Asked: 9/23/2023  Updated: 9/24/2023  Views: 47
How do I construct an "If loop" inside of a for loop for rows and columns?
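Q:
I have a data frame in R that contains fold change (fc) measurements plus columns named fc_0.1, fc_0.2, and so on up to fc_2. I am trying to put a 1 in each row of each column if the value in the first column is equal to or greater than the threshold given in that column's name. I originally did this very tediously by spelling out every column, and now I am trying to shorten the code and generalize it. I have gotten this for loop to do a lot of the work, but I cannot integrate the if statements into the loop. It would be really great if I didn't have to write an if statement for every 0.1 increment.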
fc <- c(0.82,0.03,1.52,0.14,0.61,1.88,0.29,1.91,0.32,0.46,1.76,0.54,1.45,2.01,0.71,1.45,0.90,1.01,1.17,1.68,1.21,1.37)
data<- data.frame(abs(fc)) # create a data frame fold change
z<- data.frame(rep.int(0, 22))
z[2:20] <- z[[1]] # create a data frame of zeros
data <- data.frame(data,z) # add the two data frames together
names(data) <- c("fc",paste0("fc_", seq(0.1, 2, by = 0.1))) # give names to the columns
data
for(i in 1:nrow(data)){
for(j in 2:ncol(data)){
if(data[i,1]>=0.1){
data[i,2]=1
}
if(data[i,1]>=0.2){
data[i,3]=1
}
if(data[i,1]>=0.3){
data[i,4]=1
}
if(data[i,1]>=0.4){
data[i,5]=1
}
if(data[i,1]>=0.5){
data[i,6]=1
}
if(data[i,1]>=0.6){
data[i,7]=1
}
if(data[i,1]>=0.7){
data[i,8]=1
}
if(data[i,1]>=0.8){
data[i,9]=1
}
if(data[i,1]>=0.9){
data[i,10]=1
}
if(data[i,1]>=1.0){
data[i,11]=1
}
}
}
data
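If I want to fill all of the columns, I have to write an if statement for every 0.1 increment. That makes the code very long, and it does not scale if I increase the number of thresholds.
I tried a third for loop to feed the numbers tested by the if statements in for each column.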
for(i in 1:nrow(data)){
for(j in 2:ncol(data)){
for(k in seq(0.1, 2, by = 0.1)){
if(data[i,1]>=k){
data[i,j]=1
}
}
}
}
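This fills every row and every column with 1 regardless of the value in the first column (except for 0.03, which mysteriously is still filled with 0). I can't think of a way to simplify my code, but I know it can be done somehow. Please help.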
A:
It may look obscure, but this is more idiomatic R:
data[,-1] <- (+outer(data[[1]], as.numeric(sub("fc_", "", names(data)[-1])), `<`))
data
# fc fc_0.1 fc_0.2 fc_0.3 fc_0.4 fc_0.5 fc_0.6 fc_0.7 fc_0.8 fc_0.9 fc_1 fc_1.1 fc_1.2 fc_1.3 fc_1.4 fc_1.5 fc_1.6 fc_1.7 fc_1.8 fc_1.9 fc_2
# 1 0.82 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
# 2 0.03 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 3 1.52 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
# 4 0.14 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 5 0.61 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 6 1.88 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
# 7 0.29 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 8 1.91 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 9 0.32 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 10 0.46 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 11 1.76 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
# 12 0.54 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# 13 1.45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
# 14 2.01 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 15 0.71 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
# 16 1.45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
# 17 0.90 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
# 18 1.01 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
# 19 1.17 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
# 20 1.68 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
# 21 1.21 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
# 22 1.37 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
Walk-through:
as.numeric(sub(..)) just extracts the column names and converts them to numbers, giving c(0.1, 0.2, 0.3, 0.4, ..., 2).
data[[1]] could just as well be data$fc or similar; it simply returns the actual numbers you need, the fc values.
outer(val1, val2, `<`) does an outer calculation of every value in val1 against every value in val2. It returns a logical matrix of dimension length(val1) x length(val2):
outer(data[[1]], as.numeric(sub("fc_", "", names(data)[-1])), `<`)
Since we really want 1s and 0s, we wrap it in +(..), which relies on R's implicit conversion from logical to integer:
+(outer(data[[1]], as.numeric(sub("fc_", "", names(data)[-1])), `<`))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
# [1,] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
# [2,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [3,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
# [4,] 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [5,] 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [6,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
# [7,] 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [8,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# [9,] 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [10,] 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [11,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
# [12,] 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# [13,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
# [14,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# [15,] 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
# [16,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
# [17,] 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
# [18,] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
# [19,] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
# [20,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
# [21,] 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
# [22,] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
Then we assign the result back into the frame (all columns except column 1) with data[,-1] <- ...
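One thing to watch: the `<` in the one-liner marks the cells where fc is below the column's threshold, whereas the question asks for a 1 where fc is equal to or greater than it. A minimal sketch of that variant only swaps the operator. The frame toy and its values are made up for illustration and are not the question's data.

```r
# Small stand-in frame (hypothetical values, not the data from the question)
toy <- data.frame(fc = c(0.82, 0.03, 0.90), fc_0.1 = 0, fc_0.5 = 0, fc_0.9 = 0)

# Thresholds parsed from the column names, as in the one-liner
thresholds <- as.numeric(sub("fc_", "", names(toy)[-1]))

# `>=` instead of `<`: 1 where fc is at or above the column's threshold
toy[, -1] <- +outer(toy$fc, thresholds, `>=`)
toy
```

Parsing the thresholds from the column names, rather than regenerating them with seq(), keeps the comparison tied to exactly the numbers shown in the headers.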
Some thoughts: in R, it is usually much faster to operate on whole vectors than on individual elements. For example,
1:10 < 5
# [1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
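performs ten comparisons, but it is much simpler to express and (at least in my head) easier to understand. This logic extends to matrices and the like (though I don't use that here). We do rely on it implicitly in the call to outer, through `<`. Internally, outer produces two much longer vectors. To take a smaller example,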
vec1 <- 1:3
vec2 <- 11:14
outer(vec1, vec2, function(a, b) { browser(); a + b; })
# Browse[2]>
a
# [1] 1 2 3 1 2 3 1 2 3 1 2 3
# Browse[2]>
b
# [1] 11 11 11 12 12 12 13 13 13 14 14 14
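Note that a has been expanded to length length(vec1)*length(vec2), with vec1 repeated as a whole length(vec2) times; b repeats each value of vec2 length(vec1) times, so its values change more slowly.
With that, a and b (the first two arguments of our anonymous function) are the same length, so the function simply computes a + b, the element-wise sum of the two vectors.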
# Browse[2]>
a + b
# [1] 12 13 14 13 14 15 14 15 16 15 16 17
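This output is then re-dimensioned by outer into the matrix one would expect, with dimensions taken from the lengths of the original arguments. (Sometimes the matrix is not needed, but it is trivial to flatten it with c(outer(..)).)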
# Browse[2]>
c # <--- continues out of the debugger
# [,1] [,2] [,3] [,4]
# [1,] 12 13 14 15
# [2,] 13 14 15 16
# [3,] 14 15 16 17
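The expansion seen in the debugger can be reproduced by hand. This is only a sketch of the behavior observed above, not a copy of outer's source; the names a, b and manual are chosen here for illustration.

```r
vec1 <- 1:3
vec2 <- 11:14

# Expand both inputs to length(vec1) * length(vec2), as seen in the debugger
a <- rep(vec1, times = length(vec2))  # whole vector repeated: 1 2 3 1 2 3 ...
b <- rep(vec2, each = length(vec1))   # each value repeated: 11 11 11 12 12 12 ...

# One vectorised addition, then reshape to length(vec1) rows by length(vec2) columns
manual <- matrix(a + b, nrow = length(vec1), ncol = length(vec2))

# Compare against outer() itself
all(manual == outer(vec1, vec2, `+`))
```

The function passed to outer is called once on the two long vectors, which is why it has to be vectorised.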
Data
data <- structure(list(fc = c(0.82, 0.03, 1.52, 0.14, 0.61, 1.88, 0.29, 1.91, 0.32, 0.46, 1.76, 0.54, 1.45, 2.01, 0.71, 1.45, 0.9, 1.01, 1.17, 1.68, 1.21, 1.37), fc_0.1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.6 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.7 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.8 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_0.9 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.6 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.7 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.8 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_1.9 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), fc_2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = "data.frame", row.names = c(NA, -22L))
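As an aside on the three-loop attempt in the question: the innermost loop over k does not depend on j, so every cell in a row is set to 1 as soon as fc reaches the smallest k (0.1). That is why only the 0.03 row stays at 0. A sketch of a loop that pairs each column with its own threshold follows; toy and thresholds are illustrative names and values, and the test uses `>=` as the question describes.

```r
# Small stand-in frame (hypothetical values chosen for illustration)
toy <- data.frame(fc = c(0.82, 0.03, 0.14), fc_0.1 = 0, fc_0.2 = 0, fc_0.9 = 0)

# One threshold per non-fc column, parsed from the column names
thresholds <- as.numeric(sub("fc_", "", names(toy)[-1]))

for (i in seq_len(nrow(toy))) {
  for (j in 2:ncol(toy)) {
    # Column j is paired with threshold j - 1; no third loop is needed
    if (toy[i, 1] >= thresholds[j - 1]) {
      toy[i, j] <- 1
    }
  }
}
toy
```

This does the same work as the vectorised versions, just one cell at a time, so it is mainly useful for seeing where the original loop went wrong.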
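Here is a short and (I think) readable dplyr approach using across. It says that we want to 1) take data, then 2) change every column except fc into 3) the numeric version of a test (1 for TRUE, 0 for FALSE) that compares the number parsed from the current column name with the value in fc.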
library(dplyr)
library(readr) # parse_number() is exported by readr, not dplyr
data |>
mutate(across(-fc, ~1*(parse_number(cur_column()) > fc)))
Result
fc fc_0.1 fc_0.2 fc_0.3 fc_0.4 fc_0.5 fc_0.6 fc_0.7 fc_0.8 fc_0.9 fc_1 fc_1.1 fc_1.2 fc_1.3 fc_1.4 fc_1.5 fc_1.6 fc_1.7 fc_1.8 fc_1.9 fc_2
1 0.82 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
2 0.03 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
3 1.52 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
4 0.14 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
5 0.61 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
6 1.88 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
7 0.29 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
8 1.91 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
9 0.32 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 0.46 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
11 1.76 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
12 0.54 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
13 1.45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
14 2.01 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
15 0.71 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
16 1.45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
17 0.90 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
18 1.01 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
19 1.17 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
20 1.68 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
21 1.21 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
22 1.37 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
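For readers without the tidyverse installed, the column-by-column idea behind across can be sketched in base R. This is an analogue under assumptions, not the answer's own code: toy is a made-up frame, and as.numeric(sub(...)) stands in for parse_number(cur_column()).

```r
# Small stand-in frame (hypothetical values, not the data from the question)
toy <- data.frame(fc = c(0.82, 0.03, 1.52), fc_0.5 = 0, fc_1 = 0, fc_2 = 0)

# For each non-fc column, parse the number out of its name and compare it with fc
toy[-1] <- lapply(names(toy)[-1], function(nm) {
  threshold <- as.numeric(sub("fc_", "", nm))  # stands in for parse_number(cur_column())
  1 * (threshold > toy$fc)                     # same test as in the dplyr answer
})
toy
```

Like the dplyr version, this keeps the `>` direction of the answer, so a 1 means the column's threshold is above fc.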
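Here is a tidyverse approach using reshaping. Number the rows, pivot longer, compare the fc value with the number in each column name, then pivot wider again: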
library(tidyverse)
data |>
mutate(row = row_number()) |> # not necessary if `fc` values guaranteed unique
pivot_longer(-c(row, fc)) |>
mutate(value = 1 * (parse_number(name) > fc)) |>
pivot_wider(names_from = name, values_from = value)
# A tibble: 22 × 22
fc row fc_0.1 fc_0.2 fc_0.3 fc_0.4 fc_0.5 fc_0.6 fc_0.7 fc_0.8 fc_0.9 fc_1 fc_1.1 fc_1.2 fc_1.3 fc_1.4 fc_1.5 fc_1.6 fc_1.7 fc_1.8 fc_1.9 fc_2
<dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.82 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
2 0.03 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
3 1.52 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1
4 0.14 4 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
5 0.61 5 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
6 1.88 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
7 0.29 7 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
8 1.91 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
9 0.32 9 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
10 0.46 10 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
# ℹ 12 more rows
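The comment on the row_number() line matters for this particular data: the fc values are not unique, so as far as I understand pivot_wider could not tell the repeated rows apart without the row id. A quick base R check on the fc vector from the question (dup_positions is a name chosen here for illustration):

```r
fc <- c(0.82, 0.03, 1.52, 0.14, 0.61, 1.88, 0.29, 1.91, 0.32, 0.46, 1.76,
        0.54, 1.45, 2.01, 0.71, 1.45, 0.90, 1.01, 1.17, 1.68, 1.21, 1.37)

# Positions of values that already appeared earlier in the vector
dup_positions <- which(duplicated(fc))
dup_positions      # 16
fc[dup_positions]  # 1.45, first seen at position 13
```

If the helper column is not wanted in the final result, it can be dropped after pivoting, for example with select(-row).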