Asked by: John Steed Asked: 11/9/2023 Last edited by: John Steed Updated: 11/9/2023 Views: 29
Row-wise calculations in data.table long data format
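Q:
Hi Stack Overflow community,
I'm a beginner, so thank you for your patience. I'm learning data.table and would like to use it for a row-wise calculation. Please see the attached image of the data, which has 6 columns (the image shows only a small part of the data).
I want to calculate the unemployment rate, i.e. "Unemployed total" / "Labour force total". Because the data is in long format, both the unemployed total and the labour force total sit inside the "variable" column. I want the unemployment rate by "sex", "region" and "data_type".
My question is: how can I do this calculation with data.table rather than dplyr? Ideally I'd like the unemployment rate results as rows.
For example
Thanks. I've also attached code that will help you generate the original dataset.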
rm(list=ls())
#Bring in all installed packages
library(readabs)
library(tidyverse)
library(dplyr)
library(stringr)
library(lubridate)
library(data.table)
library(tidyr)
library(fy)
library(psych)
library(plyr)
#**************************************************************************************************************
#Labour force indicators
abs_labour_force_base <- read_abs(cat_no = '6202.0',
                                  tables = 12,
                                  series_id = NULL,
                                  metadata = TRUE,
                                  show_progress_bars = FALSE,
                                  retain_files = FALSE,
                                  check_local = FALSE
)
#split out series column
abs_labour_force_base <- separate_series(abs_labour_force_base)
setDT(abs_labour_force_base)
# create raw data set
d <-
  lf_monthly_qld <- abs_labour_force_base[
    series_1 %in% c("Unemployed total", "Labour force total") &
      series_2 %in% c("Males", "Female") &
      series_3 %in% c("Australia", "Victoria") &
      series_type %in% c("Seasonally Adjusted", "Trend") &
      date > as_date("2023-01-08"), ]
keep_cols = c("date", "series_1", "series_2", "series_3", "value", "series_type")
d <- d[, ..keep_cols]
colnames(d) <- c("date", "variable", "sex", "region", "value", "data_type")
I haven't tried much because I'm not sure what the best way to do this is.
A:
1 upvote
Hugh
11/9/2023
#1
Use dcast:
d <-
  abs_labour_force_base[
    series_1 %in% c("Unemployed total", "Labour force total") &
      series_2 %in% c("Males", "Females") &
      series_3 %in% c("Australia", "Victoria") &
      series_type %in% c("Seasonally Adjusted", "Trend") &
      date > "2023-01-08"]
Ans <- dcast(d, date + series_type + series_2 + series_3 ~ series_1, value.var = "value")
Ans[, "UnemploymentRate" := `Unemployed total` / `Labour force total`]
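Since the question asks for the rate as rows, here is a minimal sketch on a toy table with made-up numbers (not ABS data), using the renamed columns from the question. It shows two options: dcast to wide, divide, and rbind the result back as long-format rows; or stay long and divide within each group. The toy table and the "Unemployment rate" label are assumptions for illustration only.

```r
library(data.table)

# Made-up numbers, purely for illustration (not ABS data)
toy <- data.table(
  date      = as.Date("2023-02-01"),
  variable  = rep(c("Unemployed total", "Labour force total"), times = 2),
  sex       = rep(c("Males", "Females"), each = 2),
  region    = "Victoria",
  value     = c(50, 1000, 36, 800),
  data_type = "Trend"
)

# Option 1: go wide, divide, then express the result as long-format rows
wide <- dcast(toy, date + sex + region + data_type ~ variable, value.var = "value")
rate_rows <- wide[, .(date,
                      variable = "Unemployment rate",  # illustrative label
                      sex, region,
                      value = `Unemployed total` / `Labour force total`,
                      data_type)]
toy_with_rate <- rbind(toy, rate_rows)  # same column order as toy

# Option 2: stay long and divide within each group
rate_by_group <- toy[, .(value = value[variable == "Unemployed total"] /
                                 value[variable == "Labour force total"]),
                     by = .(date, sex, region, data_type)]
```

Option 2 assumes each group contains exactly one row for each of the two variables; if a group is missing one, the division returns a zero-length result for that group, so the dcast route is easier to inspect.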
Note the small error in your original question: it should be series_2 %in% c("Males", "Females"), not "Female". You can use %ein% from the hutils package to avoid this kind of error.
library(hutils)
abs_labour_force_base[series_2 %ein% c("Males", "Female")]
#> Error: `rhs` contained Female, but this value was not found in `lhs = series_2`. All values of `rhs` must be in `lhs`. Ensure you have specified `rhs` correctly.
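If you would rather not add a dependency, the same idea can be sketched in base R. The helper below, strict_in, is a hypothetical name and is not part of hutils; it only approximates the behaviour shown above by stopping when a requested value never occurs in the data.

```r
# Hypothetical helper (not part of hutils): behaves like %in%, but raises an
# error when any requested value never occurs in x.
strict_in <- function(x, allowed) {
  missing_values <- setdiff(allowed, x)
  if (length(missing_values) > 0) {
    stop("Values not found in data: ", paste(missing_values, collapse = ", "))
  }
  x %in% allowed
}

sex <- c("Males", "Females", "Males")

# Correct spelling: returns a logical vector, just like %in%
ok <- strict_in(sex, c("Males", "Females"))

# Typo ("Female"): the error is caught here so the message can be inspected
typo_result <- tryCatch(
  strict_in(sex, c("Males", "Female")),
  error = function(e) conditionMessage(e)
)
```

Inside a data.table filter this would be used as abs_labour_force_base[strict_in(series_2, c("Males", "Females"))].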
Comments
0 upvotes
John Steed
11/9/2023
Thank you, that's very kind of you, and thanks for the prompt reply.