Rename values in a list based on a dataframe

Asked by: Ashley  Asked: 6/16/2023  Last edited by: Ashley  Updated: 6/16/2023  Views: 83

Q:

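I am using the function "ps_venn" to compare the taxa present in different samples of a phyloseq object. This function outputs a nested list of the taxa in each sample and in each intersection of samples.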

Because this is metabarcoding data, the taxa have long, complicated names. I also have a dataframe of the taxon lineages.

I want to rename the taxa in the list based on the names in the "class" column.

I have no experience working with lists in R, so I would appreciate any guidance. Thank you!

Edit: This is the function I am using on the phyloseq object:

cvenn.met <- ps_venn(combined_c, group = "method", weight = FALSE, plot = FALSE)

Here is the beginning of the output list:

list(OBB = c("329966c334544d14f9985b98b813f40f", 
"8e1c87829579917f8f77f7fe7a30156a"
), OES = "2f86f2cb2e0879ebd39e60982959c8bd", QBB = c("9be3f72560b678f6bbd584632672818a", 
"3a7f78f620f4733bf2344867beae26aa", "ca57149144a6a6dfdb6e14465d3e2123", 
"8612ebe6094b2f7fd25985e5c0c36226", "5a3ed6459016f5c9398eab8f051940a0", 
"c3f2f11de98c6f64740ea772e202bcbc"), QPP = "8d7b15445ca448bec893311b47510e00", 
    QPS = c("407465934116d64a8d61c12cee90b0b0", "768ed20a18290c921ed30f24e458e25a", 
    "b7a099fb2ea20e4a13fa7c52820eeb6c"), MIC__OBB__OES__OMT__QBB__QBT__QPP__QPS = c("74bda332d0a3174634f9b496b1da8d0c", 
    "cd9cac265a41b06843d41aaa1893efd5", "72e5af8afb3fdd3323fede4d49e97bda", 
    "d785682c0a83be9275e095d76cabbe36", "fd736d603728e963c8c47487a8e48755", 
    "875f4582d8e1dc661efbda0d8bb11c22", "b8615ae8b54a17ee118afe8718d7ee11"
    ))

Here is the beginning of my taxonomy dataframe:

structure(list(X = c("0021706b1ca315556a24b6d5df927e5b", "0038f2eedf8cc7893a7a9a4330aa477c", 
"003ba56d29607b45d8599085b8b69afa", "004610d70fb6092436394ca4b09bf6fb", 
"004af7f8f83f24fb7b51d8335583e14a", "0053fa60aebebf5f5e6008c70425230c"
), domain = c("Eukaryota", "Eukaryota", "Eukaryota", "Eukaryota", 
"Eukaryota", "Eukaryota"), supergroup = c("Haptista", "TSAR", 
"TSAR", "TSAR", "Obazoa", "Obazoa"), division = c("Haptophyta", 
"Alveolata", "Rhizaria", "Alveolata", "Opisthokonta", "Opisthokonta"
), subdivision = c("Haptophyta_X", NA, "Radiolaria", "Dinoflagellata", 
NA, "Metazoa"), class = c("Prymnesiophyceae", NA, "RAD-B", "Syndiniales", 
NA, "Arthropoda"), order = c("Prymnesiales", NA, "RAD-B_X", "Dino-Group-II", 
NA, "Crustacea"), family = c("Chrysochromulinaceae", NA, "RAD-B_X_Group-IVd", 
"Dino-Group-II-Clade-2", NA, "Maxillopoda"), genus = c("Chrysochromulina", 
NA, "RAD-B_X_Group-IVd_X", "Dino-Group-II-Clade-2_X", NA, NA), 
    species = c(NA_character_, NA_character_, NA_character_, 
    NA_character_, NA_character_, NA_character_), Consensus = c(0.625, 
    0.75, 0.714, 0.714, 1, 0.6)), row.names = c(NA, 6L), class = "data.frame")

I would like to rename the text strings in the list using the corresponding taxonomy from the "class" column.

r list replace nested-lists phyloseq

Comments

0 upvotes stefan_aus_hannover 6/16/2023
Can you provide a reproducible data example?
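0 upvotes divibisan 6/16/2023
Could you provide a minimal example so we can see the data structure? It is not clear from these images what you are trying to do or how the data is structured. Please make minimal versions of the dataframes and output them as text with dput, which you can paste into the question. It would also help if you could show an example of the desired output.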
0 upvotes Ashley 6/16/2023
Thanks, I have tried to add more information to the post. Hope this helps!
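0 upvotes Andre Wildberg 6/16/2023
@Ashley It is best to give your data frames names; otherwise each answer has to come up with its own, so every answer differs and is hard to reproduce consistently. Also, some of the classes associated with tax$X are NA. What should happen in that case?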
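1 upvote Ashley 6/16/2023
@AndreWildberg Thank you, I will do that in the future. For this example I should have chosen a better subset of the taxonomy. Phyloseq has an underlying method for filtering taxa, so in this case my actual Venn diagram output does not include any taxa without a class name. Your answer below works very well. Thank you!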

A:

0 upvotes Andre Wildberg 6/16/2023 #1

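If the output is called out and the taxonomy dataframe is called tax, use lapply, with setdiff to keep the strings that have no match in tax (shown here on a dataset modified for demonstration purposes).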

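# Demo only: overwrite one ID in out so that it matches a row of tax
out[[3]][2] <- "003ba56d29607b45d8599085b8b69afa"
# Per element: class names of the matched IDs first, then the unmatched IDs unchanged
lapply(out, \(x) c(tax$class[tax$X %in% x], setdiff(x, tax$X)))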
$OBB
[1] "329966c334544d14f9985b98b813f40f" "8e1c87829579917f8f77f7fe7a30156a"

$OES
[1] "2f86f2cb2e0879ebd39e60982959c8bd"

$QBB
[1] "RAD-B"                            "9be3f72560b678f6bbd584632672818a"
[3] "ca57149144a6a6dfdb6e14465d3e2123" "8612ebe6094b2f7fd25985e5c0c36226"
[5] "5a3ed6459016f5c9398eab8f051940a0" "c3f2f11de98c6f64740ea772e202bcbc"

$QPP
[1] "8d7b15445ca448bec893311b47510e00"

$QPS
[1] "407465934116d64a8d61c12cee90b0b0" "768ed20a18290c921ed30f24e458e25a"
[3] "b7a099fb2ea20e4a13fa7c52820eeb6c"

$MIC__OBB__OES__OMT__QBB__QBT__QPP__QPS
[1] "74bda332d0a3174634f9b496b1da8d0c" "cd9cac265a41b06843d41aaa1893efd5"
[3] "72e5af8afb3fdd3323fede4d49e97bda" "d785682c0a83be9275e095d76cabbe36"
[5] "fd736d603728e963c8c47487a8e48755" "875f4582d8e1dc661efbda0d8bb11c22"
[7] "b8615ae8b54a17ee118afe8718d7ee11"
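
Note that the call above puts the matched class names first and the unmatched IDs after them, so the original positions are not kept. If positions matter, or if some matched rows have NA in the class column, a variant based on match() may be easier to reason about. The following is only a minimal sketch under stated assumptions: the small out and tax objects are shortened stand-ins for the question's data, and rename_ids is a hypothetical helper name, not a function from phyloseq or any other package.

```r
# Illustrative stand-ins for the question's objects (IDs shortened, not real data)
out <- list(
  A = c("id3", "id9", "id1"),
  B = "id2"
)
tax <- data.frame(
  X     = c("id1", "id2", "id3"),
  class = c("Prymnesiophyceae", NA, "RAD-B"),
  stringsAsFactors = FALSE
)

# rename_ids() is a hypothetical helper written for this sketch
rename_ids <- function(ids, lookup) {
  # Class for each ID, in the same order as ids;
  # NA when the ID is absent from lookup$X or its class is missing
  cls <- lookup$class[match(ids, lookup$X)]
  # Keep the original ID wherever no class name is available
  ifelse(is.na(cls), ids, cls)
}

renamed <- lapply(out, rename_ids, lookup = tax)
renamed
```

Here "id9" has no row in tax and "id2" has an NA class, so both stay as they are, while "id3" and "id1" are replaced in place by their class names.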