ggplot2 warning "Removed x rows containing missing values" when drop = FALSE

Question:

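I am using ggplot2 to create a side-by-side bar chart. My code produces the correct plot with `scale_x_discrete(drop = T)`. However, I have a level with a value of 0 that I want to include on the x axis. When I set `scale_x_discrete(drop = F)`, I get the warning `Removed x rows containing missing values (geom_bar).` and another category that has non-zero values shows up as zero on the plot.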

Here is a reproduction of my data:

library("tidyverse")

df <- data.frame(
  location = c(rep("in", 231), rep("out", 83)),
  status = c(rep("normal", 73), rep("mild", 42), rep("moderate", 20), rep("fever", 4),
             rep("normal", 70), rep("mild", 41), rep("moderate", 62), rep("fever", 2)))

df$status <- factor(df$status, levels = c("normal", "mild", "moderate", "severe", "fever"))


df %>%
  ggplot(aes(x = status,
             y = ..count../tapply(..count.., ..x.., sum)[..x..],
             fill = location)) +
  geom_bar(position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(drop=F) +
  NULL

I have been researching this for a while but really cannot work out how to fix it.

r ggplot2

#calculate totals and then calculate the %
df %>% group_by(status, location) %>% summarise(value=n()) %>%   
  group_by(status) %>% mutate(result=value/sum(value)) %>%
  ggplot(aes(x = status,
             y = result,
             fill = location)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(drop=F)

Note that this now uses geom_col rather than geom_bar, because the heights are computed before plotting.
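As a cross-check on what the pipeline above computes, the same per-status shares can be sketched in base R with `table()` and `prop.table()`. The names `counts` and `shares` are introduced here for illustration only. Unlike the dplyr summary, which has no row for `severe`, this table keeps the empty level, and its shares come out as `NaN` because they are 0/0.

```r
# Same data as in the question
df <- data.frame(
  location = c(rep("in", 231), rep("out", 83)),
  status = c(rep("normal", 73), rep("mild", 42), rep("moderate", 20), rep("fever", 4),
             rep("normal", 70), rep("mild", 41), rep("moderate", 62), rep("fever", 2)))
df$status <- factor(df$status, levels = c("normal", "mild", "moderate", "severe", "fever"))

# Cross-tabulate status by location; the unused level "severe" is kept as a zero row
counts <- table(df$status, df$location)

# Divide every row by its row total, i.e. the share of each location within a status
shares <- prop.table(counts, margin = 1)
round(shares, 4)
```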

library("tidyverse")

df <- data.frame(
  location = c(rep("in", 231), rep("out", 83)),
  status = c(rep("normal", 73), rep("mild", 42), rep("moderate", 20), rep("fever", 4),
             rep("normal", 70), rep("mild", 41), rep("moderate", 62), rep("fever", 2)))

df$status <- factor(df$status, levels = c("normal", "mild", "moderate", "severe", "fever"))

Plot `..count..`

df %>%
  ggplot(aes(x = status,
             y = ..count..,
             fill = location)) +
  geom_bar(position = "dodge") +
  scale_x_discrete(drop=F)

`..count..` has no entry for missing categories. We can infer this from the fact that only one value is shown for `normal`, i.e. `..count..` is the vector

..count.. <- c(143, 64, 19, 20, 62, 4, 2)

Plot `..x..`

df %>%
  ggplot(aes(x = status,
             y = ..x..,
             fill = location)) +
  geom_bar(position = "dodge") +
  scale_x_discrete(drop=F)

As with `..count..`, `..x..` has no entry for missing categories, i.e. `..x..` is the vector

..x.. <- c(1, 2, 2, 3, 3, 5, 5)
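The two vectors can also be reproduced outside ggplot2. The following is a base R sketch, not a description of how ggplot2 computes them internally: it tabulates the data, keeps only the status/location combinations that actually occur, and uses the integer code of the factor level as the x position. The names `tab`, `nz`, `count` and `x` are chosen here for illustration.

```r
# Same data as in the question
df <- data.frame(
  location = c(rep("in", 231), rep("out", 83)),
  status = c(rep("normal", 73), rep("mild", 42), rep("moderate", 20), rep("fever", 4),
             rep("normal", 70), rep("mild", 41), rep("moderate", 62), rep("fever", 2)))
df$status <- factor(df$status, levels = c("normal", "mild", "moderate", "severe", "fever"))

# One row per status/location combination, including the empty ones
tab <- as.data.frame(table(status = df$status, location = df$location))

# Keep only combinations that occur, ordered by status and then location
nz <- tab[tab$Freq > 0, ]
nz <- nz[order(nz$status, nz$location), ]

count <- nz$Freq               # plays the role of ..count..
x <- as.integer(nz$status)     # plays the role of ..x.. when all levels are kept

count
x
```

Position 4 (`severe`) never appears in `x`, while `fever` keeps position 5.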

Why the code does not work

As a first step, I compute `tapply(..count.., ..x.., sum)`, which gives us a vector of length 4 (the total counts for the non-missing status categories):

tapply(..count.., ..x.., sum)
#>   1   2   3   5 
#> 143  83  82   6

Now, extracting elements via `[..x..]` gives

tapply(..count.., ..x.., sum)[..x..]
#>    1    2    2    3    3 <NA> <NA> 
#>  143   83   83   82   82   NA   NA

or, for the full expression,

..count.. / tapply(..count.., ..x.., sum)[..x..]
#>      1      2      2      3      3   <NA>   <NA> 
#> 1.0000 0.7711 0.2289 0.2439 0.7561     NA     NA

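Hence, your code produces two `NA`s for the last two bars, which explains the warning `Removed 2 rows containing missing values (geom_bar).` The reason is that with `..x.. <- c(1, 2, 2, 3, 3, 5, 5)` we try to extract the 5th element twice from `tapply(..count.., ..x.., sum)`, which is a vector of length 4, and so we get back `NA`.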

With `drop=TRUE` everything works fine, because in that case `..x.. <- c(1, 2, 2, 3, 3, 4, 4)` while `..count..` stays the same.
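To make the `drop=TRUE` case concrete, here is the same calculation in base R with the positions renumbered to 1 to 4. The names `count`, `x_drop`, `totals` and `shares` stand in for the ggplot2 variables and are used here for illustration only.

```r
count  <- c(143, 64, 19, 20, 62, 4, 2)   # same counts as above
x_drop <- c(1, 2, 2, 3, 3, 4, 4)         # positions when the empty level is dropped

# Totals per position: a vector of length 4 with names "1" to "4"
totals <- tapply(count, x_drop, sum)

# Every position from 1 to 4 exists, so positional indexing finds a total for each bar
shares <- count / totals[x_drop]
round(shares, 4)
```

Because the largest position (4) equals the length of `totals`, no index runs past the end and no `NA` appears.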

Solution

The problem can be solved by converting `..x..` to a character vector. In that case we extract elements by name:

library("tidyverse")

df <- data.frame(
  location = c(rep("in", 231), rep("out", 83)),
  status = c(rep("normal", 73), rep("mild", 42), rep("moderate", 20), rep("fever", 4),
             rep("normal", 70), rep("mild", 41), rep("moderate", 62), rep("fever", 2)))

df$status <- factor(df$status, levels = c("normal", "mild", "moderate", "severe", "fever"))

# Convert ..x.. to character
df %>%
  ggplot(aes(x = status,
             y = ..count.. / tapply(..count.., ..x.., sum)[as.character(..x..)],
             fill = location)) +
  geom_bar(position = "dodge") +
  scale_x_discrete(drop=F)

Created on 2020-03-23 by the reprex package (v0.3.0)
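The difference between the two lookups can be checked in base R without drawing a plot. This is a small sketch using the vectors from the explanation above; `count`, `x`, `totals`, `by_pos` and `by_name` are illustrative names, not ggplot2 variables.

```r
count <- c(143, 64, 19, 20, 62, 4, 2)
x     <- c(1, 2, 2, 3, 3, 5, 5)          # positions when the empty level is kept

# Totals per position: length 4, with names "1", "2", "3", "5"
totals <- tapply(count, x, sum)

# Positional lookup: index 5 is past the end of a length-4 vector, so it yields NA
by_pos <- totals[x]

# Lookup by name: "5" is one of the names, so the fever total is found
by_name <- totals[as.character(x)]

round(count / by_pos, 4)
round(count / by_name, 4)
```

The first result ends in two `NA`s, matching the warning, while the second gives a share for every bar.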
