Asked by: Sri Sreshtan · Asked: 5/15/2020 · Updated: 5/15/2020 · Views: 350
While converting a character variable to integer, there is a message saying: NAs introduced by coercion. How to avoid this error?
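Q:
I tried to use the as.integer function to convert a character variable to an integer variable. However, when I run the code, the output contains NA values. The code is as follows: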
library(tidyverse)
coal_data <- read.csv("http://594442.youcanlearnit.net/coal.csv", skip = 2)
coal_data %>% glimpse()
colnames(coal_data)[1] <- "region"
coal_long <- gather(coal_data, 'year', 'coal_consumption', -region)
coal_long %>% glimpse()
coal_long %>% separate(year, into = c("x", "year"), sep = "X")%>%
select(-x)%>% glimpse()
class(coal_long$year)
coal_long$year <- as.integer(coal_long$year)
The output is as follows:
coal_long %>% glimpse()
Rows: 6,960
Columns: 3
$ region <fct> "North America", "Bermuda", "Canada", "Greenland", "Mexico",...
$ year <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ coal_consumption <chr> "16.45179", "0", "0.96156", "0.00005", "0.10239", "0", "15.3...
The expected output is the year as an integer. Thank you very much in advance for looking into this.
A:
1 upvote
Jeff Bezos
5/15/2020
#1
You need to remove the letters from coal_long$year before converting to integer. Try something like this.
coal_long$year # X1980 X1981 X1982 X1983, etc.
as.integer(str_remove(coal_long$year, "X"))
Here is a more general approach, which extracts the first run of digits from each string before converting.
as.integer(str_extract(coal_long$year, "\\d+"))
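As a minimal sketch of why the NA values appear and how stripping the prefix avoids them, the following uses a small made-up vector in place of coal_long$year. It relies on base R's sub() and regmatches() rather than stringr's str_remove() and str_extract() (swapped in here only so the snippet runs without loading any package).

```r
# Hypothetical stand-in for coal_long$year (year labels carrying an "X" prefix)
year_raw <- c("X1980", "X1981", "X1982")

# Direct conversion fails: "X1980" is not a number, so every element becomes NA.
# R would also warn "NAs introduced by coercion"; the warning is suppressed here.
year_bad <- suppressWarnings(as.integer(year_raw))

# Strip the leading "X" first, then convert
year_ok <- as.integer(sub("^X", "", year_raw))

# More general: keep only the first run of digits in each string.
# Note: regmatches() drops elements with no match instead of returning NA.
year_digits <- as.integer(regmatches(year_raw, regexpr("[0-9]+", year_raw)))

print(year_bad)
print(year_ok)
print(class(year_ok))
```

The message is a warning rather than an error: the conversion still runs, but any string that cannot be parsed as a number is replaced with NA.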
Comments
0 upvotes
Sri Sreshtan
5/15/2020
Thank you for the code. It helps to get the year. However, the class of the coal_long$year variable is still character. It has not changed to integer.
0 upvotes
Jeff Bezos
5/16/2020
Did you reassign coal_long$year? When I run class(as.integer(str_remove(coal_long$year, "X"))), I get "integer".
0 upvotes
Sri Sreshtan
5/16/2020
Yes, sir, it worked after assigning the result back to coal_long$year. Thank you.
2 upvotes
thorepet
5/15/2020
#2
After removing the X from the year column, you need to reassign the result to coal_long.
coal_long <- coal_long %>%
separate(year, into = c("x", "year"), sep = "X") %>%
select(-x) %>%
glimpse()
coal_long$year <- as.integer(coal_long$year)
coal_long %>% glimpse()
Rows: 6,960
Columns: 3
$ region <fct> "North America", "Bermuda", "Canada", "Greenland", "Mexico", "Saint Pierre and Miquelon", "United States", "Cent…
$ year <int> 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980…
$ coal_consumption <chr> "16.45179", "0", "0.96156", "0.00005", "0.10239", "0", "15.38779", "0.42011", "0", "0", "0.03476", "--", "0", "0…
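The reassignment point can be illustrated on a toy data frame. This is a sketch only: `toy` and its values are made up, and base R's transform() stands in for the separate()/select() pipeline, since both return a modified copy rather than changing the original object.

```r
# Made-up two-row stand-in for coal_long
toy <- data.frame(region = c("Canada", "Mexico"),
                  year = c("X1980", "X1981"),
                  stringsAsFactors = FALSE)

# Calling a transforming function without assigning the result leaves `toy` untouched
invisible(transform(toy, year = sub("^X", "", year)))
unchanged <- toy$year   # the "X" prefix is still there

# Assigning the result back is what actually updates the data frame
toy <- transform(toy, year = as.integer(sub("^X", "", year)))

print(unchanged)
print(class(toy$year))
```

This is the same reason the original question's separate() call had no lasting effect: its result was only piped into glimpse() and never stored.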
3 upvotes
Chuck P
5/15/2020
#3
You might as well make coal_consumption a double while you're at it...
library(tidyverse)
coal_data <- read.csv("http://594442.youcanlearnit.net/coal.csv", skip = 2, na.strings = "--")
colnames(coal_data)[1] <- "region"
coal_long <- gather(coal_data, 'year', 'coal_consumption', -region)
coal_long %>% glimpse()
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <chr> "X1980", "X1980", "X1980", "X1980", "X1980", "X1980"…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
coal_long <- coal_long %>% separate(year, into = c("x", "year"), sep = "X") %>%
select(-x) %>% glimpse()
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <chr> "1980", "1980", "1980", "1980", "1980", "1980", "198…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
class(coal_long$year)
#> [1] "character"
coal_long$year <- as.integer(str_remove(coal_long$year, "X"))
glimpse(coal_long)
#> Rows: 6,960
#> Columns: 3
#> $ region <chr> "North America", "Bermuda", "Canada", "Greenland", "…
#> $ year <int> 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980, 1980…
#> $ coal_consumption <dbl> 16.45179, 0.00000, 0.96156, 0.00005, 0.10239, 0.0000…
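To see the effect of na.strings in isolation, here is a small sketch that reads an inline, made-up CSV through the text argument of read.csv() instead of the remote file. The rows and values are invented for illustration.

```r
# Made-up CSV fragment; "--" marks a missing value, as in the coal data
csv_text <- "region,X1980,X1981
Canada,0.96156,0.99047
Aruba,--,0.00012"

# Without na.strings, the "--" forces the whole X1980 column to be read as text
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# With na.strings = "--", the marker is read as NA and the column stays numeric
clean <- read.csv(text = csv_text, na.strings = "--", stringsAsFactors = FALSE)

print(class(raw$X1980))
print(class(clean$X1980))
print(clean$X1980)
```

Handling the missing-value marker at read time avoids a second coercion step (and a second round of NA warnings) on coal_consumption later.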
Comments
1 upvote
Sri Sreshtan
5/15/2020
Thank you very much for the code. It worked.
0 upvotes
Chuck P
5/15/2020
No problem, happy to help.