Parsing XML in Python but namespaces are not the same in the output file

Asked by: Dani | Asked: 1/24/2023 | Last edited by: Gino Mempin | Updated: 1/24/2023 | Views: 75

Q:

I am trying to modify two parameters in the file below (this is just a sample; the real file I need to modify is much larger).

<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="urn:cmp:ran:cmp_data_container:5_0_326_21">
  <id>1</id>
  <ManagedElement xmlns="urn:std:sa5:managed-element:5_0_326_21">
    <id>1</id>
    <attributes>
      <priorityLabel>1</priorityLabel>
    </attributes>
    <GNBDUFunction xmlns="urn:std:sa5:gnbdufunction:5_0_326_21">
      <id>1</id>
      <attributes>
        <userLabel/>
        <resourceType/>
        <rRMPolicyMemberList>
          <idx>1</idx>
          <mcc>262</mcc>
          <mnc>01</mnc>
          <sNSSAI>50519801</sNSSAI>
        </rRMPolicyMemberList>
        <gNBId>83030094</gNBId>
        <gNBIdLength>28</gNBIdLength>
        <gNBDUId>1</gNBDUId>
        <gNBDUName>mBTS_121</gNBDUName>
      </attributes>
    </GNBDUFunction>
  </ManagedElement>
</data>

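However, the output .xml file declares the namespaces only at the top of the file, and as far as I can tell they are then mapped to prefixes such as ns0, ns1, and so on throughout the file.

Output file: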


<ns0:data xmlns="urn:std:sa5:gnbdufunction:5_0_326_21" xmlns:ns0="urn:cmp:ran:cmp_data_container:5_0_326_21" xmlns:ns1="urn:std:sa5:managed-element:5_0_326_21">
  <ns0:id>1</ns0:id>
  <ns1:ManagedElement>
    <ns1:id>1</ns1:id>
    <ns1:attributes>
      <ns1:priorityLabel>1</ns1:priorityLabel>
    </ns1:attributes>
    <GNBDUFunction>
      <id>1</id>
      <attributes>
        <userLabel />
        <resourceType />
        <rRMPolicyMemberList>
          <idx>1</idx>
          <mcc>262</mcc>
          <mnc>01</mnc>
          <sNSSAI>50519801</sNSSAI>
        </rRMPolicyMemberList>
        <gNBId>9999999</gNBId>
        <gNBIdLength>9</gNBIdLength>
        <gNBDUId>1</gNBDUId>
        <gNBDUName>mBTS_121</gNBDUName>
      </attributes>
    </GNBDUFunction>
  </ns1:ManagedElement>
</ns0:data>


Code:

import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')
root = tree.getroot()

for lvl1 in root:
    for lvl2 in lvl1:
        extracted = lvl2.tag.split('}')[0].strip('{')
        ET.register_namespace('', extracted)
        aux = '{'+extracted+'}'+'GNBDUFunction'

        if lvl2.tag == aux:
            for lvl3 in lvl2:
                extracted = lvl3.tag.split('}')[0].strip('{')
                ET.register_namespace('', extracted)
                aux = '{'+extracted+'}'+'attributes'
    
                if lvl3.tag == aux:
                    for lvl4 in lvl3:
                        extracted = lvl4.tag.split('}')[0].strip('{')
                        ET.register_namespace('', extracted)
                        gnbid = '{'+extracted+'}'+'gNBId'
                        gnbidlength = '{'+extracted+'}'+'gNBIdLength'
    
                        if lvl4.tag == gnbid:
                            lvl4.text = str(9999999)
                        if lvl4.tag == gnbidlength:
                            lvl4.text = str(9)

tree.write('example_output.xml')

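The problem is that I need the output .xml file to have the same format as the original file.

I also tried registering all the namespaces manually, just to check whether registering namespaces was the right track, but the output was the same.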

ET.register_namespace('', "urn:3gpp:sa5:3gpp-nr-nrm-nrnetwork-rrmpolicy:5_0_326_27")
ET.register_namespace('ns0', "urn:mavenir:ran:mavenir_data_container:5_0_326_27")
ET.register_namespace('ns1', "urn:3gpp:sa5:_3gpp-common-managed-element:5_0_326_27")
ET.register_namespace('ns2', "urn:3gpp:sa5:_3gpp-nr-nrm-gnbdufunction:5_0_326_27")
ET.register_namespace('ns3', "urn:3gpp:sa5:_3gpp-nr-nrm-nrcelldu:5_0_326_27")
ET.register_namespace('ns4', "urn:3gpp:sa5:_3gpp-nr-nrm-bwp:5_0_326_27")
ET.register_namespace('ns5', "urn:3gpp:sa5:_3gpp-nr-nrm-ep:5_0_326_27")
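
A likely reason that registering every namespace changes nothing: xml.etree.ElementTree keeps a single global table mapping each URI to one prefix, and register_namespace drops any earlier entry that used the same prefix. Only the last URI registered with '' can therefore be the default namespace, every other URI falls back to a generated nsN prefix, and the serializer writes all declarations on the root element. The minimal sketch below illustrates this with two placeholder URIs (urn:example:a and urn:example:b are made up for the demonstration and do not come from the real file).

```python
import xml.etree.ElementTree as ET

# Placeholder document: two nested default namespaces, like the real file.
SAMPLE = (
    '<data xmlns="urn:example:a">'
    '<inner xmlns="urn:example:b"><leaf>1</leaf></inner>'
    '</data>'
)

root = ET.fromstring(SAMPLE)

# Both calls use the empty prefix. The second call replaces the first,
# so only urn:example:b stays bound to the default namespace.
ET.register_namespace('', 'urn:example:a')
ET.register_namespace('', 'urn:example:b')

serialized = ET.tostring(root, encoding='unicode')
print(serialized)
# The root is written as ns0:data, and both xmlns declarations
# appear on the root element instead of where they were declared.
```

This matches the output file shown earlier, where the last namespace registered in the loop (the GNBDUFunction one) became the default and the other two were written as ns0 and ns1.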

The output I want looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="urn:cmp:ran:cmp_data_container:5_0_326_21">
  <id>1</id>
  <ManagedElement xmlns="urn:std:sa5:managed-element:5_0_326_21">
    <id>1</id>
    <attributes>
      <priorityLabel>1</priorityLabel>
    </attributes>
    <GNBDUFunction xmlns="urn:std:sa5:gnbdufunction:5_0_326_21">
      <id>1</id>
      <attributes>
        <userLabel/>
        <resourceType/>
        <rRMPolicyMemberList>
          <idx>1</idx>
          <mcc>262</mcc>
          <mnc>01</mnc>
          <sNSSAI>50519801</sNSSAI>
        </rRMPolicyMemberList>
        <gNBId>9999999</gNBId>
        <gNBIdLength>9</gNBIdLength>
        <gNBDUId>1</gNBDUId>
        <gNBDUName>mBTS_121</gNBDUName>
      </attributes>
    </GNBDUFunction>
  </ManagedElement>
</data>

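If there is a way to get this, please let me know.

P.S. I know this is probably not the best way to solve the problem. This is my first project of this kind.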

python xml parsing xml-namespaces xml.etree

Comments

2 upvotes | Jack Fleeting | 1/25/2023
You can avoid this problem by using lxml instead of ElementTree.
1 upvote | mzjn | 1/25/2023
Yes, use lxml. Similar question: stackoverflow.com/q/45990761/407651
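
For illustration, the lxml suggestion in the comments could look roughly like the sketch below. It assumes the third-party lxml package is installed, uses a trimmed copy of the sample from the question as an inline string, and introduces a hypothetical NEW_VALUES dictionary for the two replacement values. lxml keeps each xmlns declaration on the element where it was parsed, so no register_namespace calls are needed.

```python
from lxml import etree

# Trimmed copy of the sample from the question (assumption: the real
# file has the same nesting of default namespaces).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="urn:cmp:ran:cmp_data_container:5_0_326_21">
  <ManagedElement xmlns="urn:std:sa5:managed-element:5_0_326_21">
    <GNBDUFunction xmlns="urn:std:sa5:gnbdufunction:5_0_326_21">
      <attributes>
        <gNBId>83030094</gNBId>
        <gNBIdLength>28</gNBIdLength>
      </attributes>
    </GNBDUFunction>
  </ManagedElement>
</data>
"""

# Hypothetical mapping: local tag name -> new text value.
NEW_VALUES = {"gNBId": "9999999", "gNBIdLength": "9"}

root = etree.fromstring(SAMPLE)

for elem in root.iter():
    # Skip comments and processing instructions, whose tag is not a string.
    if not isinstance(elem.tag, str):
        continue
    # QName splits "{uri}name" so the match ignores the namespace URI.
    local_name = etree.QName(elem).localname
    if local_name in NEW_VALUES:
        elem.text = NEW_VALUES[local_name]

result = etree.tostring(root, xml_declaration=True, encoding="UTF-8").decode("utf-8")
print(result)
```

To work on files instead, etree.parse('example.xml') followed by tree.write('example_output.xml', xml_declaration=True, encoding='UTF-8') should behave the same way. Matching on the local name is a simplification: if the same local name appears under several namespaces in the larger file, comparing the full qualified tag would be safer.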

A: No answers yet