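How to convert a list of lists into a Polars DataFrame with a column of type list[struct[n]]?

Question:

I have a list of lists. Each individual list may have a different length. Every element of each list is a tuple.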

list1 = [("a", 1), ("b", 2)]
list2 = [("c", 3), ("d", 4), ("e", 5)]

I want to combine all of them into a single Polars DataFrame column of type list[struct[2]].

When I print the DataFrame, I should see:

    column_name

    list[struct[2]]

    [{"a",1}, {"b",2}]
    [{"c",3}, {"d",4}, {"e",5}]

All I have managed so far is to get a column of type struct[2], using the code below:

    import polars as pl

    list1 = ["a", "b", "c"]
    list2 = [1, 2, 3]

    df = pl.DataFrame({
        "col1": list1,
        "col2": list2
    })

    print(df)
    # Pack every column of each row into a single struct column.
    dfs = df.select(pl.struct(pl.all()).alias("my_struct"))
    print(dfs)

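But this is far from what I want to achieve.

Solved: I solved the problem with the following code. It looks like a struct in Polars means the same thing as a dictionary in plain Python.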

import polars as pl

list1 = [("a", 1), ("b", 2)]
list2 = [("c", 3), ("d", 4), ("e", 5)]
list_of_lists = [list1, list2]
# Turn each tuple into a dict with fixed field names f1 and f2; Polars reads dicts as structs.
lofl_as_structs = [[dict(f1=pair[0], f2=pair[1]) for pair in lst] for lst in list_of_lists]
df = pl.DataFrame({"column_name": lofl_as_structs})
print(df)

Result:

shape: (2, 1)
┌─────────────────────────────┐
│ column_name                 │
│ ---                         │
│ list[struct[2]]             │
╞═════════════════════════════╡
│ [{"a",1}, {"b",2}]          │
│ [{"c",3}, {"d",4}, {"e",5}] │
└─────────────────────────────┘

Follow-up question:

I would like to do the above slightly differently, by specifying a schema like this:

df = pl.DataFrame(lofl_as_structs,schema={'column_name': pl.List(pl.Struct([pl.Field('f1', pl.Utf8), pl.Field('f2', pl.Int64)]))})

This gives the error:

    raise ShapeError("the row data does not match the number of columns")
polars.exceptions.ShapeError: the row data does not match the number of columns

Any clues about what to change in the schema to get rid of this error?

Tags: dataframe, list, tuples, python-polars

shape: (1, 1)
┌──────────────────────┐
│ a                    │
│ ---                  │
│ list[struct[2]]      │
╞══════════════════════╡
│ [{1,null}, {null,2}] │ # [{"a": 1, "b": None}, {"a": None, "b": 2}]
└──────────────────────┘

Polars determines the schema to be: [ {"a": int, "b": int } ]

>>> df.schema
OrderedDict([('a', List(Struct([Field('a', Int32), Field('b', Int32)])))])

This basically means: every struct in the column must have the same field names (keys).

If we take your starting lists, "a", "b", "c", "d" and "e" are all keys.

list1 = [("a", 1), ("b", 2)]
list2 = [("c", 3), ("d", 4), ("e", 5)]

>>> dict(list1)
{'a': 1, 'b': 2}
>>> dict(list2)
{'c': 3, 'd': 4, 'e': 5}

If you want the structure you have shown, what you are actually saying is that you want this:

list1 = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
list2 = [{"key": "c", "value": 3}, {"key": "d", "value": 4}, {"key": "e", "value": 5}]

That is, your starting keys must become actual values.
