Scaling issue on raincloud plot

Asked by: user21215346 Asked: 7/27/2023 Last edited by: user21215346 Updated: 7/27/2023 Views: 32

Q:

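I am trying to create a raincloud plot that shows scores by sex, but it is grouping each point according to the score. I would like it to look like this picture, where petal length is grouped by species rather than by the length itself.

[image: like this] [image: score&sex]

I have been using this code with other data sets, so I am not sure what the problem is.

I also checked whether the score scale is continuous or discrete, and it is continuous.

Here is the code I am using in R: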

  dplyr::group_by(sex) %>%
  dplyr::mutate(
    mean = mean(score),
    se = sd(score) / sqrt(length(score)),
    sex_y = paste0(sex, "\n(", n(), ")")
  ) %>%
  ungroup() %>%
  ggplot(aes(x = NIH_score, y = sex_y)) +
  stat_slab(aes(fill = sex)) +
  geom_point(aes(color = sex),shape = 16,
             position = ggpp::position_jitternudge(height = 0.125, width = 0, 
                                             y = -0.125,
                                             nudge.from = "jittered")) +
  scale_fill_brewer(palette = "Set1", aesthetics = c("fill", "color")) +
  geom_errorbar(aes(
    xmin = mean - 1.96 * se,
    xmax = mean + 1.96 * se
  ), width = 0.2) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 3.0) +
  theme_bw(base_size = 10) +
  theme(legend.position = "top") +
  labs(title = "Raincloud plot with ggdist", x = "score")
Tags: ggplot2 scale r factor

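A:

1 vote Allan Cameron 7/27/2023 #1

It's not that your data are being grouped by x-axis value. The bandwidth of the kernel density estimator is simply too small, so the density is drawn as a separate narrow spike at each score.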
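To see the effect in isolation, here is a minimal base R sketch using stats::density() on made-up integer scores. The data and the helper count_peaks() are hypothetical and only for illustration; they are not taken from the question. With a small adjust the estimate should show one bump per distinct score, and with a larger adjust the bumps merge.

```r
# Made-up integer scores on a 2-8 scale (an assumption, not the asker's data)
set.seed(1)
x <- sample(2:8, 200, replace = TRUE)

# Hypothetical helper: count local maxima of a density estimate,
# ignoring anything lower than 1% of the tallest point (numerical noise)
count_peaks <- function(d, min_rel_height = 0.01) {
  y <- d$y
  n <- length(y)
  mid <- y[2:(n - 1)]
  is_peak <- mid > y[1:(n - 2)] & mid >= y[3:n]
  sum(is_peak & mid > min_rel_height * max(y))
}

narrow <- density(x, adjust = 0.1)  # bandwidth = 0.1 x the default
wide   <- density(x, adjust = 2)    # bandwidth = 2 x the default

# Compare the bandwidths actually used and the number of bumps
c(narrow_bw = narrow$bw, wide_bw = wide$bw)
c(narrow_peaks = count_peaks(narrow), wide_peaks = count_peaks(wide))
```

The adjust argument is a multiplier on the default bandwidth, so the two estimates differ only in how much each observation is smoothed.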

Let's recreate your problem using essentially the same code, but with some made-up data:

library(tidyverse)
library(ggdist)

set.seed(1)
df <- tibble(NIH_score = sample(2:8, 200, TRUE),
             sex = sample(c("Male", "Female"), 200, TRUE),
             score = NIH_score)

df  %>%
  dplyr::group_by(sex) %>%
  dplyr::mutate(
    mean = mean(score),
    se = sd(score) / sqrt(length(score)),
    sex_y = paste0(sex, "\n(", n(), ")")
  ) %>%
  ungroup() %>%
  ggplot(aes(x = NIH_score, y = sex_y)) +
  stat_slab(aes(fill = sex), adjust = 0.1) +
  geom_point(aes(color = sex),shape = 16,
             position = ggpp::position_jitternudge(height = 0.125, width = 0, 
                                                   y = -0.125,
                                                   nudge.from = "jittered")) +
  scale_fill_brewer(palette = "Set1", aesthetics = c("fill", "color")) +
  geom_errorbar(aes(
    xmin = mean - 1.96 * se,
    xmax = mean + 1.96 * se
  ), width = 0.2) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 3.0) +
  theme_bw(base_size = 10) +
  theme(legend.position = "top") +
  labs(title = "Raincloud plot with ggdist", x = "score")

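[image: the resulting plot with adjust = 0.1, reproducing the problem]

However, if we increase the bandwidth by setting the adjust argument of stat_slab to 2, we get:

[image: the same plot with adjust = 2]

It's not clear what in your setup or data is causing such a narrow bandwidth (since neither is included in your question), but you should be able to fix it by increasing adjust.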
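For reference, here is a minimal sketch of that change, stripped down to the slab layer and reusing the made-up data from the example above rather than your real data. The value 2 is only what looked reasonable for this fake data, so treat it as a starting point and tune it for your own scores.

```r
library(ggplot2)
library(ggdist)

# Same made-up data as in the example above (not the asker's real data)
set.seed(1)
df <- data.frame(NIH_score = sample(2:8, 200, TRUE),
                 sex = sample(c("Male", "Female"), 200, TRUE))

# adjust multiplies the default kernel bandwidth; values above 1 smooth more
p <- ggplot(df, aes(x = NIH_score, y = sex)) +
  stat_slab(aes(fill = sex), adjust = 2) +
  scale_fill_brewer(palette = "Set1") +
  theme_bw(base_size = 10)

p
```

The other layers from the full plot (points, error bars, mean marker) can be added back unchanged, since only the slab depends on the bandwidth.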

Comments

0 votes user21215346 7/27/2023
Yes, that worked! Thanks Allan!