Make a new column based on a condition in an existing column in R?

Asked by: Ahsk Asked: 2/25/2023 Last edited by: Ahsk Updated: 2/26/2023 Views: 60

Q:

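I have daily weather records. I need to create new columns based on the existing columns mean_rh, mean_temp and mean_ws. For each year, I need to calculate the exposure duration for each unique temperature/humidity/wind speed value. For example, I want to know how many days a mean_ws of 0.06 was recorded in 2006.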

Sample of the original data:

[image: sample of the original data]

I can see that 0.06 was recorded on only one day.

Here is a reproducible example of the data above:

library(dplyr)
library(tidyr)  # provides spread(), used further down

# create example data frame
df <- data.frame(
  year = c(2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006),
  date = c("11/9/06", "12/9/06", "13/9/06", "14/9/06", "15/9/06", "16/9/06", "17/9/06", "18/9/06", "19/9/06", "20/9/06", "21/9/06", "22/9/06", "23/9/06", "24/9/06", "25/9/06", "26/9/06", "27/9/06", "28/9/06", "29/9/06", "30/9/06"),
  mean_rh = c(69.49, 69.05, 68.47, 65.99, 65.53, 67.07, 72.42, 91.79, 90.13, 76.88, 67.68, 61.11, 70.19, 66.88, 76.63, 89.09, 84.97, 80.08, 79.82, 82.81),
  mean_temp = c(18.05, 20.23, 20.93, 20.58, 18.64, 18.74, 19.13, 15.98, 14.62, 14.06, 17.03, 19.76, 18.64, 18.27, 16.96, 16.31, 14.97, 15.69, 16.34, 16.02),
  mean_ws = c(0.84, 0.33, 0.46, 0.79, 2.25, 1.95, 0.13, 0.06, 1.10, 0.90, 1.10, 1.20, 0.33, 0.36, 0.27, 0.66, 0.22, 0.62, 0.33, 0.18)
)

# convert the date column to a date format
df$date <- as.Date(df$date, format = "%d/%m/%y")

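Taking temperature as an example, I want something like this (the total number of days for each temperature in each year):

[image: desired output, days per temperature value per year]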

I am using the following code, but when I check the results against the original data, my calculations are incorrect. My code needs fixing.

# calculate the exposure duration for each unique `mean_temp` value
df2 <- df %>%
  group_by(year, mean_temp) %>%
  summarise(exposure_duration = difftime(max(date), min(date), units = "days")) %>%
  ungroup()

# spread the data to have separate columns for each unique `mean_temp` value
df3 <- df2 %>%
  spread(key = mean_temp, value = exposure_duration, fill = 0)

The code above is only for mean_temp, but I also want to group_by mean_ws and mean_rh. Thanks!

Tags: R, dataframe, tidyr, manipulation, data mining

A:

0 votes MarBlo 2/26/2023 #1

As I understand the task: "I want to know how many days a mean_ws of 0.06 was recorded in 2006?"

This can be done with xtabs. Since your data only contains records for 2006, I have changed the year of the first eight rows to 2005.

df <- data.frame(
  year = c(
    2005, 2005, 2005, 2005, 2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006),
  date = c("11/9/06", "12/9/06", "13/9/06", "14/9/06", "15/9/06", "16/9/06", "17/9/06", "18/9/06", "19/9/06", "20/9/06", "21/9/06", "22/9/06", "23/9/06", "24/9/06", "25/9/06", "26/9/06", "27/9/06", "28/9/06", "29/9/06", "30/9/06"),
  mean_rh = c(69.49, 69.05, 68.47, 65.99, 65.53, 67.07, 72.42, 91.79, 90.13, 76.88, 67.68, 61.11, 70.19, 66.88, 76.63, 89.09, 84.97, 80.08, 79.82, 82.81),
  mean_temp = c(18.05, 20.23, 20.93, 20.58, 18.64, 18.74, 19.13, 15.98, 14.62, 14.06, 17.03, 19.76, 18.64, 18.27, 16.96, 16.31, 14.97, 15.69, 16.34, 16.02),
  mean_ws = c(0.84, 0.33, 0.46, 0.79, 2.25, 1.95, 0.13, 0.06, 1.10, 0.90, 1.10, 1.20, 0.33, 0.36, 0.27, 0.66, 0.22, 0.62, 0.33, 0.18)
)

# convert the date column to a date format
df$date <- as.Date(df$date, format = "%d/%m/%y")


xtabs(~year + mean_temp, data = df)
#>       mean_temp
#> year   14.06 14.62 14.97 15.69 15.98 16.02 16.31 16.34 16.96 17.03 18.05 18.27
#>   2005     0     0     0     0     1     0     0     0     0     0     1     0
#>   2006     1     1     1     1     0     1     1     1     1     1     0     1
#>       mean_temp
#> year   18.64 18.74 19.13 19.76 20.23 20.58 20.93
#>   2005     1     1     1     0     1     1     1
#>   2006     1     0     0     1     0     0     0
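
For readers who prefer to stay in dplyr/tidyr, the same per-year counts can probably be obtained by counting rows instead of taking date differences. In the question's code, difftime(max(date), min(date)) measures the span between the first and last date of a group, so a value seen on a single day gives 0 rather than 1. The sketch below is only an illustration: the small data frame is made up, and the helper count_days() and the column names value and days are hypothetical names introduced here, not part of the original post.

```r
library(dplyr)
library(tidyr)

# Made-up illustrative data (NOT the asker's full data set)
df <- data.frame(
  year      = c(2005, 2005, 2005, 2006, 2006, 2006),
  mean_temp = c(18.64, 18.64, 20.23, 18.64, 14.62, 14.62),
  mean_ws   = c(0.33, 0.06, 0.33, 0.06, 0.33, 0.33)
)

# Hypothetical helper: number of days (rows) per year and distinct value
# of the chosen column
count_days <- function(data, column) {
  data %>%
    count(year, value = .data[[column]], name = "days")
}

temp_days <- count_days(df, "mean_temp")

# Wide layout: one row per year, one column per distinct value,
# filled with 0 where a value never occurred in that year
temp_wide <- temp_days %>%
  pivot_wider(names_from = value, values_from = days, values_fill = 0)

# Apply the same counting to several variables at once
vars <- c("mean_temp", "mean_ws")
all_counts <- lapply(setNames(vars, vars), function(v) count_days(df, v))

# Days in 2006 on which mean_ws was 0.06
ws_2006 <- all_counts$mean_ws %>%
  filter(year == 2006, value == 0.06)
```

With real data, mean_rh can simply be added to vars. Note that exact matching on decimal values such as 0.06 only works when the stored values are exactly those numbers; rounding the columns first may be safer for measured data.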