a bytes-like object is required, not 'str' while parsing XML files

Question:

I am trying to parse an XML file like the one below. I want to extract the information about each kategorie, i.e. its id, parent_id, and so on:

<?xml version="1.0" encoding="UTF-8"?>
<test timestamp="20210113">
<kategorien>
    <kategorie id="1" parent_id="0">
        Sprache
    </kategorie>
</kategorien>
</test>

This is what I am trying:

fields = ['id', 'parent_id']

with open('output.csv', 'wb') as fp:
    writer = csv.writer(fp)
    writer.writerow(fields)
    tree = ET.parse('./file.xml')
    # from your example Locations is the root and Location is the first level
    for elem in tree.getroot():
        writer.writerow([(elem.get(name) or '').encode('utf-8') 
            for name in fields])

But I get this error:

in <module>
    writer.writerow(fields)
TypeError: a bytes-like object is required, not 'str'

This happens even though I already use encode('utf-8') in my code. How do I get rid of this error?

python-3.x xml utf-8 encoding

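I see two problems. First, you don't need to do the encoding yourself. Open the file without the "b" binary flag and skip the .encode call; the file object will do the encoding for you. The error you see comes from the list ['id', 'parent_id'], which contains unencoded strings: writer.writerow(fields) tries to write str data to a file that was opened in binary mode. If you don't open the file in binary mode in the first place, that stops being a problem.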
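To make the first point concrete, here is a minimal sketch that reproduces the error and the fix without touching the disk. It assumes in-memory streams (io.BytesIO and io.StringIO) as stand-ins for files opened with 'wb' and 'w'; the variable names are illustrative only.

```python
import csv
import io

fields = ['id', 'parent_id']

# Binary stream (stand-in for open(..., 'wb')):
# csv.writer passes a str to write(), which a binary stream rejects.
binary_buf = io.BytesIO()
try:
    csv.writer(binary_buf).writerow(fields)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Text stream (stand-in for open(..., 'w', newline='')):
# the same call succeeds and no manual .encode() is needed.
text_buf = io.StringIO(newline='')
csv.writer(text_buf).writerow(fields)
csv_text = text_buf.getvalue()

print(error_message)
print(repr(csv_text))
```

With a real file, the text-mode file object encodes the strings on the way out, which is why the explicit .encode('utf-8') calls become unnecessary.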

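Second, you are iterating over the wrong elements. Add a print(elem) to your loop and you will see: the root's direct children are the kategorien wrapper, not the kategorie elements. Instead, you can use findall with a pseudo-XPath to get the elements you want.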

import csv
import xml.etree.ElementTree as ET

fields = ['id', 'parent_id']

# Text mode; newline='' lets the csv module control the line endings.
with open('output.csv', 'w', newline='', encoding='utf-8') as fp:
    writer = csv.writer(fp)
    writer.writerow(fields)
    tree = ET.parse('./file.xml')
    # <test> is the root; the <kategorie> elements sit inside <kategorien>
    for elem in tree.getroot().findall('kategorien/kategorie'):
        # A missing attribute is written as an empty string.
        writer.writerow([(elem.get(name) or '') for name in fields])
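
As a rough illustration of the second point, the sketch below parses the sample document from the question out of an in-memory bytes literal and compares iterating over the root with calling findall. Reading the element text (Sprache) is an extra that goes beyond the two attributes asked about, and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

xml_data = b"""<?xml version="1.0" encoding="UTF-8"?>
<test timestamp="20210113">
<kategorien>
    <kategorie id="1" parent_id="0">
        Sprache
    </kategorie>
</kategorien>
</test>"""

root = ET.fromstring(xml_data)

# Iterating over the root only yields its direct children,
# i.e. the <kategorien> wrapper, which has no id or parent_id.
direct_children = [child.tag for child in root]

# findall with a path relative to the root reaches the <kategorie> elements.
rows = [
    (elem.get('id'), elem.get('parent_id'), (elem.text or '').strip())
    for elem in root.findall('kategorien/kategorie')
]

print(direct_children)
print(rows)
```

This is why the original loop would have written only empty strings: elem.get('id') on the kategorien element returns None.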
