How to convert a multivariate abundance matrix into an occurrence table in R?

Asked by: kjtheron Asked: 2/26/2021 Last edited by: kjtheron Updated: 3/1/2021 Views: 311

Q:

I have a multivariate abundance matrix (species abundances by site):

Site<-c("1","2","3","4","5","6")
Long<-c("30.01565","29.99297","29.98867","29.95418","29.96438","29.93963")
Lat<-c("-29.71932","-29.69708","-29.70216","-29.65436","-29.66999","-29.66700")
Sp_A<-c("0","0","3","3","0","8")
Sp_B<-c("3","7","1","0","0","0")
Sp_C<-c("1","0","5","0","3","6")
Sp_D<-c("5","4","3","3","4","4")
data<-cbind(Site,Long,Lat,Sp_A,Sp_B,Sp_C,Sp_D)
     Site Long       Lat         Sp_A Sp_B Sp_C Sp_D
[1,] "1"  "30.01565" "-29.71932" "0"  "3"  "1"  "5" 
[2,] "2"  "29.99297" "-29.69708" "0"  "7"  "0"  "4" 
[3,] "3"  "29.98867" "-29.70216" "3"  "1"  "5"  "3" 
[4,] "4"  "29.95418" "-29.65436" "3"  "0"  "0"  "3" 
[5,] "5"  "29.96438" "-29.66999" "0"  "0"  "3"  "4" 
[6,] "6"  "29.93963" "-29.66700" "8"  "0"  "6"  "4" 

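I need to take this matrix and create a new table/matrix. The new table should have 4 columns: Species, Long, Lat and Presence. For this table I do not care about the abundance values, only whether a given species is present at a location. The new table will therefore contain multiple rows for each species, one per location, marking where it is present and where it is absent. The table should look like the following, shown here for Sp_A only: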

##    Species   Long        Lat         Presence
## 1  Sp_A      30.01565    -29.71932   0 #Binary meaning absent
## 2  Sp_A      29.99297    -29.69708   0
## 3  Sp_A      29.98867    -29.70216   1 #Binary meaning present
## 4  Sp_A      29.95418    -29.65436   1
## 5  Sp_A      29.96438    -29.66999   0
## 6  Sp_A      29.93963    -29.66700   1

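In other words, for each species in the multivariate abundance matrix, I want to create a separate record for each observation. How can I automate this process in R with a function? My data formatting skills are very rusty. Any suggestions would be greatly appreciated.

r matrix tidyverse data-manipulation

Comments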

0 votes AnilGoyal 2/26/2021
Why are your values stored in the matrix as strings/characters?
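
A small sketch of what this comment is pointing at, assuming a shortened two-row version of the question's vectors: cbind() on character vectors returns a character matrix, so the counts are strings until they are converted. The as.is = TRUE argument is an assumption on my part, chosen so that any non-numeric column would stay character rather than become a factor.

```r
# Shortened version of the question's data (two sites, one species)
Long <- c("30.01565", "29.99297")
Sp_A <- c("0", "3")

# cbind() on character vectors gives a character matrix
m <- cbind(Long, Sp_A)
is.character(m)

# Convert to a data frame and let R infer numeric column types
df <- type.convert(as.data.frame(m), as.is = TRUE)
str(df)
```

After the conversion, comparisons such as df$Sp_A > 0 operate on numbers instead of strings.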

A:

1 vote Ronak Shah 2/26/2021 #1

Maybe something like this?

library(dplyr)
library(tidyr)

data %>%
  select(-Site) %>%
  pivot_longer(cols = starts_with('Sp'), values_to = 'Presence') %>%
  mutate(Presence = pmin(Presence, 1)) %>%
  arrange(name)

#    Long   Lat name  Presence
#   <dbl> <dbl> <chr>    <dbl>
# 1  30.0 -29.7 Sp_A         0
# 2  30.0 -29.7 Sp_A         0
# 3  30.0 -29.7 Sp_A         1
# 4  30.0 -29.7 Sp_A         1
# 5  30.0 -29.7 Sp_A         0
# 6  29.9 -29.7 Sp_A         1
# 7  30.0 -29.7 Sp_B         1
# 8  30.0 -29.7 Sp_B         1
# 9  30.0 -29.7 Sp_B         1
#10  30.0 -29.7 Sp_B         0
# … with 14 more rows

Data

data<-data.frame(Site,Long,Lat,Sp_A,Sp_B,Sp_C,Sp_D)
# as.is = TRUE avoids the warning recent R versions emit when it is unspecified
data <- type.convert(data, as.is = TRUE)
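
A possible variant of the pipeline above, in case the output should match the four columns requested in the question exactly (Species, Long, Lat, Presence). This is only a sketch: names_to = "Species", the as.integer(Presence > 0) step and the final column reordering are my additions, and the data frame is rebuilt inline with numeric columns so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# The question's data, rebuilt with numeric columns
data <- data.frame(
  Site = 1:6,
  Long = c(30.01565, 29.99297, 29.98867, 29.95418, 29.96438, 29.93963),
  Lat  = c(-29.71932, -29.69708, -29.70216, -29.65436, -29.66999, -29.66700),
  Sp_A = c(0, 0, 3, 3, 0, 8),
  Sp_B = c(3, 7, 1, 0, 0, 0),
  Sp_C = c(1, 0, 5, 0, 3, 6),
  Sp_D = c(5, 4, 3, 3, 4, 4)
)

occ <- data %>%
  select(-Site) %>%
  # every column except the coordinates is treated as a species
  pivot_longer(cols = -c(Long, Lat),
               names_to = "Species", values_to = "Presence") %>%
  # any abundance above zero counts as present (1), zero as absent (0)
  mutate(Presence = as.integer(Presence > 0)) %>%
  select(Species, Long, Lat, Presence) %>%
  arrange(Species)

print(occ, n = 6)
```

Using Presence > 0 instead of pmin(Presence, 1) gives the same result for whole-number counts, and also maps fractional abundances between 0 and 1 to 1.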

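Comments

0 votes kjtheron 2/26/2021
Thanks for the quick reply, Ronak. In the example dataset above my species names are oversimplified. Each species name starts with a different letter, so cols = starts_with('Sp') does not work.
0 votes Ronak Shah 2/26/2021
So how do you know which columns hold the species information? You can also use column numbers, as in cols = 3:6.
0 votes kjtheron 2/26/2021
The first 3 columns of the matrix hold the location and ID data, and the rest of the matrix is species information. cols = 3:6 should work. Let me try.
1 vote Ronak Shah 2/26/2021
You can also use cols = -(1:2) to ignore the first 2 columns. (I have already removed Species with select.)
0 votes kjtheron 2/26/2021
Some of my other loaded libraries conflict with select. However, your solution works well. Thanks a lot, Ronak :)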
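
The two points from this comment thread can be sketched together: selecting the species columns by position when their names share no prefix, and writing dplyr::select explicitly so another loaded package cannot mask it. The species names Acacia and Bidens below are made up for illustration, and the values are just the first three sites of the question's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the first 3 columns are ID and coordinates,
# the remaining columns are species with unrelated (made-up) names
data <- data.frame(
  Site   = 1:3,
  Long   = c(30.01565, 29.99297, 29.98867),
  Lat    = c(-29.71932, -29.69708, -29.70216),
  Acacia = c(0, 0, 3),
  Bidens = c(3, 7, 1)
)

occ <- data %>%
  # negative positions: pivot everything except the first 3 columns
  pivot_longer(cols = -(1:3),
               names_to = "Species", values_to = "Presence") %>%
  mutate(Presence = pmin(Presence, 1)) %>%
  # explicit namespace avoids clashes with select() from other packages
  dplyr::select(-Site) %>%
  arrange(Species)

occ
```

Here Site is kept until after the reshape, so the positional index is -(1:3) rather than the -(1:2) used once Site has already been dropped.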