Can't get find to access element in XPath in python xml.etree.ElementTree

Asked by: amackley Asked: 9/21/2023 Last edited by: amackley Updated: 9/21/2023 Views: 22

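Q:

I've been going around in circles. My task is to read an XML document, parse each record to find an ID, and use that ID to run some SQL. I then compare the values in the XML with the values from SQL. If the SQL value is different (and not null), we update the XML that gets sent back to another server.

Everything works, except that when I try to get the value of a field in the XML record, it isn't found.

I created two XML mapping dictionaries: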

xml_sql_mapping_sa = {
    'ncaaId': 'RZECAST_KEYNCAAID',
    'schoolSid': 'SPRIDEN_ID',
    'birthDate': 'BIRTHDATE',
    'email': 'EMAIL',
    'ethnicCode': 'ETHNICODE',
    'firstName': 'FIRSTNAME',
    'lastName': 'LASTNAME',
    'MI': 'MI',
    'gender': 'GENDER',
    'primaryPhone': 'PRIMARY_PHONE'
}

# Define a mapping between XML fields and SQL columns for <parent> elements
xml_sql_mapping_parent = {
    'email': 'PARENT_EMAIL',
    'primaryPhone': 'PARENT_PHONE'
}

Here is the structure of the XML:

<students>
<sa birthDate="2005-##-####" email="[email protected]" ethnicCode="5" firstName="L____" gender="F" lastName="A____d" ncaaId="211123456908" primaryPhone="208-555-5555" schoolSid="020111126">
<saDetail fulltimeEnrollmentTermAny="S1" fulltimeEnrollmentTermHere="S1" fulltimeEnrollmentYearAny="2024" fulltimeEnrollmentYearHere="2024" internationalFlag="N"/>
<address address1="678 Address" city="theCity" country="US" postalCode="83204" state="ID"/>
<parent email="[email protected]" name="first Name" primaryPhone="25555541522"/>
<saPreFte hoursCode="ADVANCED_PLACEMENT"/>
<saPreFte degreeApplicableHours="27.0" earnedHours="27.0" hoursCode="CREDIT_BEFORE_FULL_TIME"/>
<saPreFte hoursCode="CREDIT_BY_EXAM"/>
<saPreFte hoursCode="SUMMER_BRIDGE"/>
<saYear academicYear="2024">
<saYearSport sportCode="WGO"/>
<saYearEligible financialAidCertDate="2023-08-09" medicalDate="2023-08-08"/>
<saYearTerm termCode="S1"/>
<saYearTerm termCode="S2"/>
<saYearTerm termCode="SU"/>
<saYearPtd classYear="1"/>
</saYear>
</sa>
</students>

Here is the relevant code:

            row = cursor.fetchone()

            # Check if there are rows to update XML
            if row:
                # Get the column names from cursor description
                column_names = [desc[0] for desc in cursor.description]

                # Create a dictionary to map column names to values (uppercase column names)
                row_data = dict(zip(map(str.upper, column_names), row))
                print(row_data)

                # Determine whether to update <sa> or <parent> elements based on tag
                if record.tag == 'sa':
                    xml_sql_mapping = xml_sql_mapping_sa
                elif record.tag == 'parent':
                    xml_sql_mapping = xml_sql_mapping_parent
                else:
                    print(f"Unsupported XML element tag: {record.tag}")
                    continue

                # Loop through the XML fields and update if necessary
                for xml_field, sql_column in xml_sql_mapping.items():
                    if sql_column in row_data:
                        sql_value = row_data[sql_column]
                        xml_element = record.find(xml_field)
                        print(f"The xml_field is {xml_field}.")
                        print(f"XML element is {xml_element}")

                        # Check if the XML element exists
                        if xml_element is not None:
                            # Check if the SQL value is not None and different from XML value
                            if sql_value is not None and xml_element.text != sql_value:
                                xml_element.text = sql_value
                        else:
                            print(f"XML element {xml_field} not found in the <{record.tag}> element.")

            else:
                print(f"No record found for student ID: {v_student_ID}")

        finally:
            # Close the cursor for each student
            cursor.close()

Everything works except that the XML element isn't found. Here is part of my print output. (Data has been changed.)

SchoolSid: 020194486
{'RZECAST_KEYNCAAID': '151104', 'SPRIDEN_ID': '123456', 'BIRTHDATE': '2004-05-23', 'EMAIL': '[email protected]', 'ETHNICODE': '5', 'FIRSTNAME': 'A____', 'LASTNAME': 'A___', 'MI': None, 'GENDER': 'F', 'PREFERREDNAME': None, 'PRIMARY_PHONE': '406-555-5555'}
The xml_field is ncaaId.
XML element is None
XML element ncaaId not found in the <sa> element.
The xml_field is schoolSid.
XML element is None 

The relevant line of code is this one: xml_element = record.find(xml_field)

xml_element should be populated with the value from the XML, but it doesn't find anything.

Tags: python xml parsing elementtree


A:

1 vote Daniel Haley 9/21/2023 #1
The xml_field is ncaaId.
XML element is None
XML element ncaaId not found in the <sa> element.

ncaaId is not an element. It's an attribute of the sa element.
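To make the distinction concrete, here is a minimal sketch using a trimmed-down copy of the XML from the question. The attribute values and the example.com address are placeholders, not data from the original post.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the XML in the question; values are placeholders.
xml_text = """
<students>
  <sa ncaaId="211123456908" schoolSid="020111126" firstName="L">
    <parent email="parent@example.com" primaryPhone="25555541522"/>
  </sa>
</students>
"""

root = ET.fromstring(xml_text)
record = root.find("sa")  # <sa> is a child element of <students>, so find() works

# find() searches child elements by tag or path; there is no <ncaaId> child element.
child = record.find("ncaaId")

# get() reads an attribute of the element itself.
ncaa_id = record.get("ncaaId")

# <parent> really is a child element, so find() locates it,
# and its own attributes are then read with get().
parent = record.find("parent")
parent_email = parent.get("email")

print(child, ncaa_id, parent_email)
```

In other words, find() walks down to child elements, while get() looks sideways at the attributes on the element you already have.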

So you probably want to do something like this (and maybe rename the variable too, since you aren't selecting an element):

xml_element = record.get(xml_field)
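Carried through the update loop from the question, the change might look like the sketch below. The helper name update_attributes and the sample row_data are made up for illustration; Element.get() and Element.set() are the standard ElementTree calls for reading and writing an attribute. The sketch assumes SQL values can be compared as strings, since attribute values are always strings.

```python
import xml.etree.ElementTree as ET

def update_attributes(record, mapping, row_data):
    """Overwrite attributes on `record` whose SQL value is non-null and differs.

    Hypothetical helper; returns the list of attribute names that were changed.
    """
    changed = []
    for xml_field, sql_column in mapping.items():
        if sql_column not in row_data:
            continue
        sql_value = row_data[sql_column]
        if sql_value is None:
            continue  # never overwrite XML with a null SQL value
        xml_value = record.get(xml_field)  # None if the attribute is absent
        if xml_value is None:
            continue  # mirror the question: only touch attributes that exist
        # Attribute values are strings, so compare (and store) as strings.
        if xml_value != str(sql_value):
            record.set(xml_field, str(sql_value))
            changed.append(xml_field)
    return changed

# Placeholder data shaped like the question's record and SQL row.
record = ET.fromstring('<sa ncaaId="211123456908" firstName="L" gender="F"/>')
mapping = {
    "ncaaId": "RZECAST_KEYNCAAID",
    "firstName": "FIRSTNAME",
    "MI": "MI",
    "gender": "GENDER",
}
row_data = {"RZECAST_KEYNCAAID": "151104", "FIRSTNAME": "L", "MI": None, "GENDER": "F"}

changed = update_attributes(record, mapping, row_data)
print(changed, record.attrib)
```

For the parent mapping, the same helper could presumably be called on the child element returned by record.find('parent'), since email and primaryPhone are attributes of parent in the sample XML.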

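Comments

0 votes amackley 9/21/2023
Thanks Daniel. Should have seen that.
0 votes fracjackmac 9/24/2023
From "ElementTree.py", for those wondering why "get()" is the answer: def get(self, key, default=None): """Get element attribute. <<<< Equivalent to attrib.get, but some implementations may handle this more efficiently. key is what attribute to look for, and default is what to return if the attribute was not found. Returns a string containing the attribute value, or the default if attribute was not found.""" return self.attrib.get(key, default)
0 votes Daniel Haley 9/25/2023
@fracjackmac - To add to your 'why "get()" is the answer'... it's because the attributes of an element (.attrib) are a dictionary, and .get() is the dictionary method that returns the value for key (like you stated). The main reason I use it instead of record.attrib[xml_field] is that, with the default of None, I don't have to handle a KeyError exception in a try/except if the attribute doesn't exist.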
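The difference described in these comments can be seen in a short sketch. The one-attribute record below is a placeholder, not data from the question.

```python
import xml.etree.ElementTree as ET

# Placeholder record with a single attribute.
record = ET.fromstring('<sa ncaaId="211123456908"/>')

# Element.get() returns the same value as looking it up in the attribute dictionary.
same = record.get("ncaaId") == record.attrib.get("ncaaId")

# Missing attribute: get() returns the default (None unless one is supplied).
missing = record.get("MI")
fallback = record.get("MI", "")

# Direct indexing raises KeyError for a missing attribute.
try:
    record.attrib["MI"]
    raised = False
except KeyError:
    raised = True

print(same, missing, repr(fallback), raised)
```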