Asked by: Malta Asked: 7/13/2017 Last edited by: MaëlMalta Updated: 8/3/2023 Views: 985
Create a group index for values connected directly and indirectly
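Q:
I want to generate an index that groups observations based on two columns. However, I want each group to be made up of observations that share at least one value with another observation in the same group.
In the data below, I want to check whether the values in "G1" and "G2" are connected directly (they appear on the same row) or indirectly through other intermediate values. The desired grouping variable is shown as "g".
For example, A is linked directly to Z (row 1) and X (row 2). A is linked indirectly to B via X (A -> X -> B), and further to Y via X and B (A -> X -> B -> Y).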
dt <- data.frame(id = 1:10,
G1 = c("A","A","B","B","C","C","C","D","E","F"),
G2 = c("Z","X","X","Y","W","V","U","s","T","T"),
g = c(1,1,1,1,2,2,2,3,4,4))
dt
# id G1 G2 g
# 1 1 A Z 1
# 2 2 A X 1
# 3 3 B X 1
# 4 4 B Y 1
# 5 5 C W 2
# 6 6 C V 2
# 7 7 C U 2
# 8 8 D s 3
# 9 9 E T 4
# 10 10 F T 4
I tried group_indices from dplyr, but without success.
A:
20 votes
zx8754
7/13/2017
#1
Using igraph, get the component membership of each value, then map it back onto the names:
library(igraph)
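# assumption: df1 is the question's data (dt) without the desired column g
df1 <- dt[, c("id", "G1", "G2")]
# convert to graph (G1 and G2 as edge endpoints, id as an edge attribute),
# then get the cluster membership ids
g <- graph_from_data_frame(df1[, c(2, 3, 1)])
myGroups <- components(g)$membership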
myGroups
# A B C D E F Z X Y W V U s T
# 1 1 2 3 4 4 1 1 1 2 2 2 3 4
# then map on names (as.character guards against G1 being stored as a factor)
df1$group <- myGroups[as.character(df1$G1)]
df1
# id G1 G2 group
# 1 1 A Z 1
# 2 2 A X 1
# 3 3 B X 1
# 4 4 B Y 1
# 5 5 C W 2
# 6 6 C V 2
# 7 7 C U 2
# 8 8 D s 3
# 9 9 E T 4
# 10 10 F T 4
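For readers who cannot install igraph, the same grouping can be approximated in base R. The following is only a minimal sketch, not part of the answer above: `group_connected` is a hypothetical helper name, and it assumes there are no missing values and that a value in G1 is never meant to be the same entity as an identically named value in G2 (igraph would merge such names into one vertex). It starts with one label per row and repeatedly gives each row the smallest label found among rows sharing its G1 or G2 value, until nothing changes.

```r
# Question's data, without the desired column g
dt <- data.frame(id = 1:10,
                 G1 = c("A","A","B","B","C","C","C","D","E","F"),
                 G2 = c("Z","X","X","Y","W","V","U","s","T","T"),
                 stringsAsFactors = FALSE)

# Hypothetical helper (not from igraph or dplyr): label propagation over two columns
group_connected <- function(a, b) {
  label <- seq_along(a)                        # one label per row to start
  repeat {
    new_label <- ave(label, a, FUN = min)      # rows sharing a G1 value take the smallest label
    new_label <- ave(new_label, b, FUN = min)  # rows sharing a G2 value take the smallest label
    if (identical(new_label, label)) break     # fixed point: no label changed
    label <- new_label
  }
  match(label, unique(label))                  # renumber 1, 2, 3, ... by first appearance
}

dt$group <- group_connected(dt$G1, dt$G2)
dt$group
# [1] 1 1 1 1 2 2 2 3 4 4
```

On this example the result agrees with the desired column g. For large data the igraph approach is likely to be faster, since this loop may need many passes over long chains of indirect links.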