Extract model parameters for nls model using for loop and graph results

Question:

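I have a set of 10 subjects (a subset of 3 subjects is used below as a sample dataset). I managed to create a for loop that goes through the 3 subjects and fits each one to a sigmoidal curve using nlsLM (package minpack.lm). nlsLM is the same as nls, except that it uses the Levenberg-Marquardt algorithm instead of the Gauss-Newton algorithm used by the base nls function.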
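For reference, the sigmoid fitted in the code below can be written out as follows. The symbols follow the parameter names used in the code, with x standing for state; the two properties on the right assume slope > 0.

```latex
y(x) = \frac{\text{plateau}}{1 + e^{\,\text{slope}\,(S_{50} - x)}},
\qquad
y(S_{50}) = \frac{\text{plateau}}{2},
\qquad
\lim_{x \to \infty} y(x) = \text{plateau}
```

So plateau is the upper asymptote, S50 is the x value at which the response reaches half the plateau, and slope controls how steeply the curve rises around S50.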

I also managed to extract the model parameters (plateau, S50, slope) and put them into a data frame. My code is below:

library(tidyverse)   # read_csv(), %>% and filter()
library(minpack.lm)  # nlsLM()

dfRC <- read_csv("data/sampleRCdf.csv") # read in the raw data file

# list of subjects of interest to loop through
sub <- unique(dfRC$subID)

# define start to obtain the number and name of fit parameters
start <- list(plateau=1, S50=1, slope=1)

# create empty data.frame to store IDs and parameters
params.pre <- data.frame(matrix(nrow = length(sub), ncol = 1+length(start)))
names(params.pre) <- c("sub", names(start))

# for loop that goes by subject (i); it is a single loop, not nested
for (i in seq_along(sub)) {
  # create data frame for sub "i" (overwritten on every iteration)
  individual_DFs <- dfRC %>% filter(subID %in% sub[i])

  # fit model for each sub "i"
  fitpre.loop <- nlsLM(mepAMP_pre ~ plateau / (1 + exp(slope * (S50 - state))),
                       data = individual_DFs,
                       start = start, trace = TRUE)

  # store IDs
  params.pre[i, 1] <- sub[i]
  # store fit parameters (params.pre, not the undefined `params`)
  params.pre[i, 2:ncol(params.pre)] <- coef(fitpre.loop)
}

params.pre

params.pre gives:

sub plateau     S50         slope
101 3.579751    6.505194    0.6930363   
202 2.506159    3.538753    0.8300668   
303 1.971020    5.888228    0.4806047

This is for the following dataset (where 101, 202, and 303 are my sample subjects), where 101.y.pre is mepAMP_pre and x is state in the for loop above:

x = c(1,2,3,4,5,6,7,8,9,10,11)

101.y.pre <- c(0.38117575, 0.11300747, 0.37239499, 0.51321839, 0.56851893, 
              1.73259296, 2.08146847, 2.80090151, 3.04446933, 2.67647473, 3.87695509)

202.y.pre <- c(0.263931535, 0.554056564, 0.903243066, 1.758670072, 1.512232414,
              2.382228869, 2.744255537, 1.943985522, 2.642561877, 2.880719751, 2.139018852)

303.y.pre <- c(0.197647216, 0.095434883, 0.523944806, 0.625025631, 0.92489588, 
              0.898288637, 0.918388724, 1.433502882, 2.127665395, 1.649622992, 1.642610593)

I have a few questions.

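1. As a sanity check, I ran each subject through nlsLM individually and confirmed that the output parameters match the output of my for loop. They do! So it works. However, I'm a bit confused about the individual_DFs data frame. I thought it would show all 3 subjects, but when I open it, it only shows subject 303. The code clearly works, so why does only 1 subject show up in the data frame?
2. You'll notice that I named the parameter output params.pre and used the mepAMP_pre values for the model. This is because I have both pre- and post-treatment data for each subject. The .csv file has the column headers "subID", "state", "mepAMP_pre", and "mepAMP_post". Is there a way to use a for loop to go through both the pre and post values, output plateau_pre, plateau_post, S50_pre, S50_post, slope_pre, slope_post, and collect them all in a single data frame like params.pre, with the different subjects as rows as I have now?
3. Any suggestions on how to plot this? I managed to plot the pre fit for one subject using the following code: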

df101 <- data.frame(x, fit101.y)

p.sample <- ggplot(data = df101, aes(x = x, y = fit101.y)) +
  geom_point() +
  geom_smooth(method = "nls", 
              data = df101,
              formula = fit101.y ~ plateau / (1 + exp(slope*(S50 - x))), start = list( plateau=1, S50=1, slope=1), 
              se = FALSE)
p.sample

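I would like to overlay the pre and post curves for each subject in a grid-like pattern of figures. Or, perhaps, plot all the pre curves on one graph and all the post curves on another. Any help with this would be greatly appreciated!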
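On the first question above: individual_DFs is reassigned on every pass of the loop, so after the loop finishes it holds only the last subject (303). The fits are unaffected because each model is estimated before the next reassignment. If the per-subject data frames are needed afterwards, one option is to keep them in a list. The following is a minimal sketch under that assumption; dfRC_toy and its values are hypothetical stand-ins, not the real data.

```r
# Toy stand-in for dfRC (hypothetical values, not the real data)
dfRC_toy <- data.frame(
  subID = rep(c(101, 202, 303), each = 2),
  state = rep(1:2, times = 3),
  mepAMP_pre = c(0.4, 0.1, 0.3, 0.6, 0.2, 0.1)
)
sub <- unique(dfRC_toy$subID)

# One list slot per subject, so nothing gets overwritten
individual_DFs <- vector("list", length(sub))
names(individual_DFs) <- sub

for (i in seq_along(sub)) {
  # keep only the rows belonging to subject i
  individual_DFs[[i]] <- dfRC_toy[dfRC_toy$subID == sub[i], ]
}

names(individual_DFs)         # one entry per subject
sapply(individual_DFs, nrow)  # number of rows stored for each subject
```

Base R's split(dfRC, dfRC$subID) builds the same kind of list in one call.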

---------------EDIT---------------

Thanks for the help, Allan! Here is my full data frame (output from dput(dfRC)):

structure(list(subID = c(101, 101, 101, 101, 101, 101, 101, 101, 
101, 101, 101, 202, 202, 202, 202, 202, 202, 202, 202, 202, 202, 
202, 303, 303, 303, 303, 303, 303, 303, 303, 303, 303, 303), 
    state = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 2, 3, 4, 
    5, 6, 7, 8, 9, 10, 11, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), 
    mepAMP_pre = c(0.38117575, 0.11300747, 0.37239499, 0.51321839, 
    0.56851893, 1.73259296, 2.08146847, 2.80090151, 3.04446933, 
    2.67647473, 3.87695509, 0.263931535, 0.554056564, 0.903243066, 
    1.758670072, 1.512232414, 2.382228869, 2.744255537, 1.943985522, 
    2.642561877, 2.880719751, 2.139018852, 0.197647216, 0.095434883, 
    0.523944806, 0.625025631, 0.92489588, 0.898288637, 0.918388724, 
    1.433502882, 2.127665395, 1.649622992, 1.642610593), mepAMP_post = c(0.126321776, 
    0.566816552, 0.374254417, 0.199486984, 0.510302018, 1.03651474, 
    1.697137046, 2.090100867, 3.448320717, 2.095180146, 2.897606435, 
    0.018846444, 0.041664734, 0.51243325, 0.961881685, 0.998366952, 
    2.082848001, 2.713030559, 3.373811346, 2.839989549, 3.283945894, 
    3.052075374, 0.232427913, 0.895619231, 1.194016429, 1.721528554, 
    2.249776715, 2.756416541, 4.716890788, 4.16244235, 4.757734573, 
    4.965043759, 4.732616496)), row.names = c(NA, -33L), spec = structure(list(
    cols = list(subID = structure(list(), class = c("collector_double", 
    "collector")), state = structure(list(), class = c("collector_double", 
    "collector")), mepAMP_pre = structure(list(), class = c("collector_double", 
    "collector")), mepAMP_post = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), delim = ","), class = "col_spec"), problems = <pointer: 0x10d8e15d0>, class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"))

I tried running the code you included. The updated params.pre works great, but the plot looks odd and I'm not sure where I went wrong.

Code:

params.pre <- do.call("rbind", lapply(split(dfRC, dfRC$subID), function(d) {
  mod <- nlsLM(mepAMP_pre ~ plateau / (1 + exp(slope*(S50 - state))), 
               data = d, start = list(plateau = 1, S50 = 1, slope = 1))
  as.data.frame(c(list(sub = d$subID[1]), as.list(coef(mod))))
}))

ggplot(dfRC, aes(state, mepAMP_pre, color = subID)) +
  geom_point() +
  geom_smooth(method = nlsLM, se = FALSE,
              formula = mepAMP_pre ~ plateau / (1 + exp(slope*(S50 - state))),
              method.args = list(start = list(plateau = 1, S50 = 1, slope = 1)))

The figure it outputs: (image not preserved)

Tags: r, for-loop, ggplot2, nls
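The answer code that follows uses dfRC_long, whose construction is not shown in the text as it survives here. A minimal sketch of how such a long-format table could be built is given below, assuming it has one row per subject, state, and condition, with the columns prepost and mepAMP that the later code refers to. The toy dfRC values are hypothetical, and the factor conversions are assumptions chosen so that colour = subID gives one curve per subject and "pre" sorts before "post".

```r
library(dplyr)
library(tidyr)

# Toy stand-in for dfRC (hypothetical values); skip this with the real data
dfRC <- data.frame(
  subID       = rep(c(101, 202), each = 3),
  state       = rep(1:3, times = 2),
  mepAMP_pre  = c(0.4, 0.1, 0.3, 0.6, 0.2, 0.1),
  mepAMP_post = c(0.1, 0.5, 0.4, 0.2, 0.5, 1.0)
)

dfRC_long <- dfRC %>%
  # stack the two measurement columns into one value column
  pivot_longer(cols = c(mepAMP_pre, mepAMP_post),
               names_to = "prepost",
               names_prefix = "mepAMP_",   # leaves "pre" / "post"
               values_to = "mepAMP") %>%
  # assumption: discrete subject IDs and a pre-then-post ordering
  mutate(subID = factor(subID),
         prepost = factor(prepost, levels = c("pre", "post")))

head(dfRC_long)
```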

ggplot(dfRC_long, aes(state, mepAMP, colour = subID)) +
  geom_point() +
  geom_smooth(method = nlsLM, se = FALSE, 
              formula = y ~ plateau / (1 + exp(slope*(S50 - x))),
              method.args = list(start = list(plateau = 1, S50 = 1, slope = 1))
             ) +
  facet_grid(.~prepost)

Similarly, you can skip the explicit loop and generate the table using some tidyverse functions:

dfRC_long %>%
  group_by(subID, prepost) %>%
  group_map(.f = ~ nlsLM(mepAMP ~ plateau / (1 + exp(slope*(S50 - state))), 
                         data = .x, 
                         start = list(plateau = 1, S50 = 1, slope = 1)) %>%
              coef() %>%
              t() %>%
              as.data.frame() %>%
              mutate(pre_or_post = .y$prepost, .before = 1) %>%
              mutate(subID = .y$subID, .before = 2)) %>%
  bind_rows() %>%
  arrange(pre_or_post, subID)
#>   pre_or_post subID  plateau      S50     slope
#> 1         pre   101 3.579751 6.505194 0.6930363
#> 2         pre   202 2.506159 3.538753 0.8300668
#> 3         pre   303 1.971020 5.888228 0.4806047
#> 4        post   101 2.874621 6.538601 0.9221484
#> 5        post   202 3.225695 5.356826 0.9406343
#> 6        post   303 5.084059 5.094672 0.6321269

Created on 2023-08-16 with reprex v2.0.2
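The question also asked for one row per subject with columns such as plateau_pre and plateau_post. One way to get there from the table above is to reshape it to wide format with tidyr::pivot_wider. This is a sketch rather than part of the original answer; the params data frame below is retyped from the printed output above, and the names params and params_wide are illustrative.

```r
library(tidyr)

# Parameter table retyped from the printed output above
params <- data.frame(
  pre_or_post = c("pre", "pre", "pre", "post", "post", "post"),
  subID   = c(101, 202, 303, 101, 202, 303),
  plateau = c(3.579751, 2.506159, 1.971020, 2.874621, 3.225695, 5.084059),
  S50     = c(6.505194, 3.538753, 5.888228, 6.538601, 5.356826, 5.094672),
  slope   = c(0.6930363, 0.8300668, 0.4806047, 0.9221484, 0.9406343, 0.6321269)
)

# One row per subject; each parameter gets a _pre and a _post column
params_wide <- pivot_wider(params,
                           id_cols = subID,
                           names_from = pre_or_post,
                           values_from = c(plateau, S50, slope))
params_wide
```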

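Sorry to revive this. I just tried running it on a larger sample of subjects (7 subjects so far), and now I get the following error when computing the model parameters (params): Warning: lmdif: info = -1. Number of iterations has reached `maxiter' == 50. Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates. Any idea what might be going wrong?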

Allan Cameron (8/24/2023, 1 upvote):

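@A.R., it indicates that for one of your data sets, the function was unable to find optimal values for the parameters (at least with the given starting values). This is common with nls fits. Your options are to first check that the data are correct (and roughly follow the model), try different starting parameters, or try a different model.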
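As a rough illustration of the "different starting parameters" suggestion, the sketch below derives starting values from each data set, raises the iteration limit through nls.lm.control(), and wraps the fit in tryCatch so that one failing subject does not stop the whole run. The helper name safe_fit, the choice of starting values, and maxiter = 200 are all assumptions made for illustration, not part of the original answer; the data are subject 101's pre values from the question.

```r
library(minpack.lm)

# Subject 101, pre-treatment values copied from the question
d <- data.frame(
  state = 1:11,
  mepAMP = c(0.38117575, 0.11300747, 0.37239499, 0.51321839, 0.56851893,
             1.73259296, 2.08146847, 2.80090151, 3.04446933, 2.67647473,
             3.87695509)
)

# Hypothetical helper: returns the fitted model, or NULL if the fit fails
safe_fit <- function(d) {
  start <- list(plateau = max(d$mepAMP),  # rough guess: largest response
                S50 = median(d$state),    # rough guess: middle of the x range
                slope = 1)
  tryCatch(
    nlsLM(mepAMP ~ plateau / (1 + exp(slope * (S50 - state))),
          data = d, start = start,
          control = nls.lm.control(maxiter = 200)),
    error = function(e) {
      message("fit failed: ", conditionMessage(e))
      NULL
    }
  )
}

fit <- safe_fit(d)
if (!is.null(fit)) coef(fit)
```

Applied per group, a NULL result points to the subject and condition whose data or starting values need a closer look.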
