Make an adjacency matrix from a data frame when rows are equal

Asked by: Discovery2020 Asked: 9/10/2023 Last edited by: MarkDiscovery2020 Updated: 9/10/2023 Views: 70

Q:

I have a data frame like this:

mydf <- data.frame(Country = c('USA','Brazil','China','Italy','Ghana','Brazil','USA','China','USA'),
                   Pattern = c('XXZ','XXX','XYX','XXZ','XXX','XXX','XYZ','XXX','XYZ'),
                   Value = c(1,2,5,4,1,2,3,1,6))

I need an undirected adjacency matrix that links rows whose Pattern is equal and adds up their Value.

For example:

    From     To   Pattern  Value
    Brazil  Brazil  XXX    4
    Brazil  China   XXX    3
    Brazil  Ghana   XXX    3
Tags: R, iGraph, adjacency matrix, chord diagram

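Comments

1 upvote Discovery2020 9/10/2023
You are right, corrected.
0 upvotes Mark 9/10/2023
Why isn't China-Ghana in the matrix?
0 upvotes Discovery2020 9/10/2023
@Mark, my list is just an example, not a complete one. China and Ghana is one of the possible combinations and should be among the rows.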

A:

0 upvotes Mark 9/10/2023 #1

It sounds like you want this:

library(tidyverse)

mydf %>%
  split(.$Pattern) %>%
  map_dfr(~ .x %>%
       mutate(n = row_number()) %>%
       cross_join(., .) %>%
       filter(n.x != n.y, Country.x <= Country.y) %>%
       reframe(Value = `Value.x` + Value.y, .by = c(Country.x, Country.y)) %>%
       rename(From = Country.x, To = Country.y) %>%
       distinct(), .id = 'Pattern')

Output:

  Pattern   From     To Value
1     XXX Brazil  Ghana     3
2     XXX Brazil Brazil     4
3     XXX Brazil  China     3
4     XXX  China  Ghana     2
5     XXZ  Italy    USA     5
6     XYZ    USA    USA     9
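
The result above is an edge list, while the question asks for an adjacency matrix. As a minimal sketch (not part of the original answer), the edge list can be folded into a symmetric matrix in base R. The object names `edges`, `nodes` and `adj` are hypothetical, the edge list is typed in by hand from the output above, and pairs that occur under several patterns are assumed to be summed into one cell.

```r
# Edge list copied by hand from the output above (hypothetical name `edges`).
edges <- data.frame(
  Pattern = c("XXX", "XXX", "XXX", "XXX", "XXZ", "XYZ"),
  From    = c("Brazil", "Brazil", "Brazil", "China", "Italy", "USA"),
  To      = c("Ghana", "Brazil", "China", "Ghana", "USA", "USA"),
  Value   = c(3, 4, 3, 2, 5, 9),
  stringsAsFactors = FALSE
)

# One shared, sorted set of node names gives a square matrix.
nodes <- sort(unique(c(edges$From, edges$To)))
adj <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))

# Add each edge to both triangles; self-loops are written only once.
for (i in seq_len(nrow(edges))) {
  from <- edges$From[i]
  to   <- edges$To[i]
  adj[from, to] <- adj[from, to] + edges$Value[i]
  if (from != to) {
    adj[to, from] <- adj[to, from] + edges$Value[i]
  }
}

adj
```

Countries that never share a pattern with another row do not appear in `edges`, so they get no row or column here; adding them to `nodes` by hand would keep them in the matrix as all-zero entries.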

0 upvotes jay.sf 9/10/2023 #2

Using combn in by, with case handling.

by(d, d$Pattern, \(x) {
  u <- x$Country
  out <- if (length(u) > 1) {
    combn(u, 2, FUN=\(z) {
      s <- x[x$Country %in% z, ]
      if (length(table(z)) > 1) {
        s <- unique(s)
      }
      with(s, data.frame(Country[1], Country[2], Pattern[1], sum(Value)))
    }, simplify=FALSE) |> do.call(what='rbind') |> unique()
  } else {
    with(x, data.frame(Country, NA, Pattern, Value))
    # with(x, data.frame(Country, NA, Pattern, NA))  ## alternatively
  }
  setNames(out, c('From', 'To', 'Pattern', 'Value'))
}) |> c(make.row.names=FALSE) |> do.call(what='rbind')
#     From     To Pattern Value
# 1 Brazil  Ghana     XXX     3
# 2 Brazil Brazil     XXX     4
# 3 Brazil  China     XXX     3
# 4  Ghana  China     XXX     2
# 5    USA  Italy     XXZ     5
# 6  China   <NA>     XYX     5
# 7    USA    USA     XYZ     9

Data:

d <- structure(list(Country = c("USA", "Brazil", "China", "Italy", 
"Ghana", "Brazil", "USA", "China", "USA"), Pattern = c("XXZ", 
"XXX", "XYX", "XXZ", "XXX", "XXX", "XYZ", "XXX", "XYZ"), Value = c(1, 
2, 5, 4, 1, 2, 3, 1, 6)), class = "data.frame", row.names = c(NA, 
-9L))
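
For comparison, the pairing idea behind both answers can also be sketched as a base R self-join on Pattern with merge(). This is an illustrative alternative, not either author's code. The helper column `id` and the objects `pairs`, `swap` and `edges` are hypothetical names. Like the answers, it collapses identical pair/value rows with unique(), which assumes that repeated rows for the same country within a pattern carry the same Value. Patterns with a single row (XYX here) are dropped rather than kept with NA.

```r
d <- data.frame(
  Country = c("USA", "Brazil", "China", "Italy", "Ghana",
              "Brazil", "USA", "China", "USA"),
  Pattern = c("XXZ", "XXX", "XYX", "XXZ", "XXX", "XXX", "XYZ", "XXX", "XYZ"),
  Value   = c(1, 2, 5, 4, 1, 2, 3, 1, 6),
  stringsAsFactors = FALSE
)

# Row id (hypothetical helper column) so a row is never paired with itself.
d$id <- seq_len(nrow(d))

# Self-join: every row meets every row that has the same Pattern.
pairs <- merge(d, d, by = "Pattern", suffixes = c(".x", ".y"))

# Keep each unordered pair of distinct rows once.
pairs <- pairs[pairs$id.x < pairs$id.y, ]

# Undirected edges: put the alphabetically smaller country in From.
swap <- pairs$Country.x > pairs$Country.y
pairs$From  <- ifelse(swap, pairs$Country.y, pairs$Country.x)
pairs$To    <- ifelse(swap, pairs$Country.x, pairs$Country.y)
pairs$Value <- pairs$Value.x + pairs$Value.y

# Collapse identical edges that arise from repeated countries.
edges <- unique(pairs[, c("From", "To", "Pattern", "Value")])
rownames(edges) <- NULL
edges
```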