Transform a path of nodes into a tree structure list of lists

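Asked by: Kamaloka  Asked: 11/17/2023  Updated: 11/17/2023  Views: 37

Q:

I have a data.frame in which one column represents the path of nodes, and I would like to transform it into a tree. Is there a simple function that does this?

Here is a simple example: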

data <- data.frame(
  Name = c("A", "A1", "A2", "A1a", "A1b", "A2a", "A2b", "A2c"),
  Path = c("1", "1,1", "1,2", "1,1,1", "1,1,2", "1,2,1", "1,2,2", "1,2,3")
)

I would like to transform it into:


nodes <- list(
  list(
    text = "A",
    li_attr = list(id = "1"),
    state = list(opened = TRUE),
    children = list(
      list(
        text = "A1",
        li_attr = list(id = "1,1"),
        state = list(opened = TRUE),
        children = list(
          list(
            text = "A1a",
            li_attr = list(id = "1,1,1")),
          list(
            text = "A1b",
            li_attr = list(id = "1,1,2"))
        )
      ),
      list(
        text = "A2",
        li_attr = list(id = "1,2"),
        state = list(opened = TRUE),
        children = list(
          list(
            text = "A2a",
            li_attr = list(id = "1,2,1")),
          list(
            text = "A2b",
            li_attr = list(id = "1,2,2")),
          list(
            text = "A2c",
            li_attr = list(id = "1,2,3"))
        )
      )
    )
  )
)

A:

1 upvote  Allan Cameron  11/17/2023  #1

There is no existing function that does exactly what you are asking for. You will have to write a recursive function to do it:

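f <- function(path, name, parent_node = NULL) {
  # Split each path into its comma-separated components
  nodes <- strsplit(path, ',')
  # Row positions of the nodes at this depth (first match of each distinct
  # leading component); in the example data these positions coincide with
  # the path labels 1, 2, 3, ...
  parents <- match(unique(sapply(nodes, `[`, 1)), path)
  lapply(parents, function(i) {
    # Build the node; when nested, the id is prefixed with the parent's id
    li <- list(Text = name[i],
         li_attr = list(id = if(is.null(parent_node)) parents[i]
                          else paste(parent_node, parents[i], sep = ',')))
    # Rows whose leading component equals this node's label (itself included)
    kids <- sapply(nodes, \(x) x[1] == parents[i])
    if(length(kids) > 0) {
      if(sum(kids) > 1) {
        li$opened = TRUE
        # Descendants only: drop the node's own single-component row
        n <- which(kids & lengths(nodes) > 1)
        # Recurse on the descendants with the leading component removed
        li$children <-
        f(sapply(nodes[n], \(x) paste0(x[-1], collapse = ',')), name[n],
          parent_node = li$li_attr$id)
      }
    }
    return(li)
  })
}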

We call it like this:

result <- f(data$Path, data$Name)

The result looks like this:

list(
  list(Text = "A", 
       li_attr = list(id = 1L), 
       opened = TRUE, 
       children = list(
         list(Text = "A1", 
              li_attr = list(id = "1,1"), 
              opened = TRUE, 
              children = list(
                list(Text = "A1a", 
                     li_attr = list(id = "1,1,1")), 
                list(Text = "A1b", 
                     li_attr = list(id = "1,1,2"))
                )
              ), 
         list(Text = "A2", 
              li_attr = list(id = "1,2"), 
              opened = TRUE, 
              children = list(
                list(Text = "A2a", 
                     li_attr = list(id = "1,2,1")), 
                list(Text = "A2b", 
                     li_attr = list(id = "1,2,2")), 
                list(Text = "A2c", 
                     li_attr = list(id = "1,2,3"))
                )
              )
         )
       )
  )

This looks like a typical structure for JSON, in which case you can convert it to JSON using jsonlite::toJSON(result), which produces:

[
  {
    "Text": ["A"],
    "li_attr": {"id": [1]},
    "opened": [true],
    "children": [
      {
        "Text": ["A1"],
        "li_attr": {
          "id": ["1,1"]
        },
        "opened": [true],
        "children": [
          {
            "Text": ["A1a"],
            "li_attr": {"id": ["1,1,1"]}
          },
          {
            "Text": ["A1b"],
            "li_attr": {"id": ["1,1,2"]}
          }
        ]
      },
      {
        "Text": ["A2"],
        "li_attr": {"id": ["1,2"]},
        "opened": [true],
        "children": [
          {
            "Text": ["A2a"],
            "li_attr": {"id": ["1,2,1"]}
          },
          {
            "Text": ["A2b"],
            "li_attr": {"id": ["1,2,2"]
            }
          },
          {
            "Text": ["A2c"],
            "li_attr": {"id": ["1,2,3"]}
          }
        ]
      }
    ]
  }
]
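
The one-element arrays in the output above (for example "Text": ["A"]) come from jsonlite's default of encoding length-one vectors as arrays. If plain scalars are preferred, toJSON has an auto_unbox argument. A minimal sketch on a hypothetical one-node list (node_list is made up for illustration and is not the result object from the answer):

```r
library(jsonlite)

# Hypothetical one-node tree, shaped like the nested lists above
node_list <- list(list(text = "A", li_attr = list(id = "1")))

# auto_unbox = TRUE writes length-one vectors as scalars instead of arrays
json <- toJSON(node_list, auto_unbox = TRUE)
print(json)
```

Whether the consumer of the JSON expects scalars or arrays depends on the target library, so this is worth checking before switching.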
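2 upvotes  I_O  11/17/2023  #2

{data.tree} helps with handling hierarchical data structures. In your case:

  • add a pathString to your data frame (a slash-separated string, like a directory path, in which the last character of the variable Name corresponds to the endpoint and each preceding character to an upstream folder); the data frame is then converted to a tree with as.Node: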
library(data.tree)
library(dplyr)

the_treedata <- 
    data |>
    rowwise() |>
    mutate(pathString = strsplit(Name, '') |> unlist() |> paste(collapse = '/'))
## > the_treedata
## # A tibble: 8 x 3
## # Rowwise: 
##   Name  Path  pathString
##   <chr> <chr> <chr>     
## 1 A     1     A         
## 2 A1    1,1   A/1       
## 3 A2    1,2   A/2       
## 4 A1a   1,1,1 A/1/a     
## 5 A1b   1,1,2 A/1/b  
  • convert to a data tree:
the_tree <- the_treedata |> as.Node()
  • traverse the tree and Get, as a list, the result of applying a custom function to each node:
the_list <- 
    the_tree$Get(\(node) list(text = node$Name,
                              li_attr = list(id = node$Path),
                              state = list(opened = TRUE),
                              children = Map(node$children,
                                             f = \(child) list(text = child$Name,
                                                               state = list(opened = TRUE),
                                                               li_attr = list(id = child$Path)
                                                               )
                                             )
                              ),
                 filterFun = isNotLeaf, ## leaf nodes are already captured via the `children` attribute of their parent nodes
                 simplify = FALSE
                 )

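Comments

0 upvotes  Kamaloka  12/18/2023
I am running into a problem with data like this: data <- data.frame(Name = c("A", "A1", "A2", "A1a", "A1b", "A2a", "A2b", "A2c", "B"), Path = c("1", "1,a", "1,b", "1,a,1", "1,a,2", "1,b,1", "1,b,2", "1,b,3", "2")). Here, "1,a" and "1,b" are associated with parent "2", whereas they should be under parent "1". Do you know of any way to fix it?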
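
For data like that in the comment above, where path components are arbitrary labels rather than consecutive integers, one possible approach is to derive each row's parent path by stripping its last component and then attach children by matching on that parent path. The following is a minimal base-R sketch under the assumption that every parent path also appears as its own row; build_tree is a hypothetical helper and not part of either answer.

```r
# Hypothetical helper: build nested nodes by matching each row's parent path,
# rather than relying on row positions or numeric labels.
build_tree <- function(path, name, parent = NA_character_) {
  # Parent path of each row: drop the last comma-separated component.
  # Top-level rows (no comma) get NA as their parent.
  parent_of <- ifelse(grepl(",", path, fixed = TRUE),
                      sub(",[^,]*$", "", path),
                      NA_character_)
  # Rows that are direct children of the node currently being expanded
  is_child <- if (is.na(parent)) is.na(parent_of) else parent_of %in% parent
  lapply(which(is_child), function(i) {
    node <- list(text = name[i], li_attr = list(id = path[i]))
    # Recurse: this row's full path is the parent path of its children
    kids <- build_tree(path, name, parent = path[i])
    if (length(kids) > 0) {
      node$state <- list(opened = TRUE)
      node$children <- kids
    }
    node
  })
}

# The data from the comment above
data <- data.frame(
  Name = c("A", "A1", "A2", "A1a", "A1b", "A2a", "A2b", "A2c", "B"),
  Path = c("1", "1,a", "1,b", "1,a,1", "1,a,2", "1,b,1", "1,b,2", "1,b,3", "2")
)

tree <- build_tree(data$Path, data$Name)
str(tree, max.level = 2)
```

The sketch recomputes the parent paths on every recursive call, which keeps it short but scales quadratically with the number of rows; for large tables it may be worth computing them once and passing them down.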