Parsing the multilevel XML File with Java - Dom Parser

Asked by: Andrei Rus | Asked: 12/9/2022 | Last edited by: xerx593 | Updated: 12/17/2022 | Views: 166

Q:

I have this XML file with 3 categories: employee_list, position_details and employee_info.

<?xml version="1.0" encoding="UTF-8"?>
<employee>
    <employee_list>
        <employee ID="1">
            <firstname>Andrei</firstname>
            <lastname>Rus</lastname>
            <age>23</age>
            <position-skill ref="Java"/>
            <detail-ref ref="AndreiR"/>
        </employee>

        <employee ID="2">
            <firstname>Ion</firstname>
            <lastname>Popescu</lastname>
            <age>25</age>
            <position-skill ref="Python"/>
            <detail-ref ref="IonP"/>
        </employee>

        <employee ID="3">
            <firstname>Georgiana</firstname>
            <lastname>Domide</lastname>
            <age>33</age>
            <position-skill ref="C"/>
            <detail-ref ref="GeorgianaD"/>
        </employee>

    </employee_list>

    <position_details>
        <position ID="Java">
            <role>Junior Developer</role>
            <skill_name>Java</skill_name>
            <experience>1</experience><!-- years of experience -->
        </position>

        <position ID="Python">
            <role>Developer</role>
            <skill_name>Python</skill_name>
            <experience>3</experience> 
        </position>

        <position ID="C">
            <role>Senior Developer</role>
            <skill_name>C</skill_name>
            <experience>5</experience>
        </position>
    </position_details>

    <employee_info>
        <detail ID="AndreiR">
            <username>AndreiR</username>
            <residence>Timisoara</residence>
            <yearOfBirth>1999</yearOfBirth>
            <phone>0</phone>
        </detail>

        <detail ID="IonP">
            <username>IonP</username>
            <residence>Timisoara</residence>
            <yearOfBirth>1997</yearOfBirth>
            <phone>0</phone>
        </detail>

        <detail ID="GeorgianaD">
            <username>GeorgianaD</username>
            <residence>Arad</residence>
            <yearOfBirth>1989</yearOfBirth>
            <phone>0</phone>
        </detail>
    </employee_info>
</employee>

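I want to write Java code for all 3 categories, but so far I have only managed to get through the first one (employee_list). When I try to retrieve the information from the position_details or employee_info category, the program fails to find the information belonging to each category.

I wrote the Java code for the 3 categories, and the result looks like this: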

package Dom;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class main {

    public static void main(String[] args) {
        try {
            File xmlDoc = new File("employees.xml");
            DocumentBuilderFactory dbFact = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuild = dbFact.newDocumentBuilder(); 
            Document doc = dBuild.parse(xmlDoc);
            
            // Read the root: doc.getDocumentElement() locates it, getNodeName() returns its name
            System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
            System.out.println("-----------------------------------------------------------------------------");
            
            // Read the matching employee elements into a NodeList
            NodeList nList = doc.getElementsByTagName("employee");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            
            
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Person id#: " + eElement.getAttribute("id"));
                    System.out.println("Person Last Name: " + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Person First name: " + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Person Age: " + eElement.getElementsByTagName("age").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            System.out.println("=============================================================================================");
            
            nList = doc.getElementsByTagName("position");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Role: " + eElement.getElementsByTagName("role").item(0).getTextContent());
                    System.out.println("Skill: "+ eElement.getElementsByTagName("skill_name").item(0).getTextContent());
                    System.out.println("Experience: "+ eElement.getElementsByTagName("experience").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            System.out.println("=============================================================================================");
            
            nList = doc.getElementsByTagName("detail");
            System.out.println("Total Category inside = " + nList.getLength());
            System.out.println("-----------------------------------------------------");
            for(int i = 0 ; i<nList.getLength();i++) {
                Node nNode = nList.item(i);
                //System.out.println("Node name: " + nNode.getNodeName()+" " + (i+1));
                if(nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Person with username: " +  eElement.getElementsByTagName("username").item(0).getTextContent());
                    System.out.println("Username: " + eElement.getElementsByTagName("username").item(0).getTextContent());
                    System.out.println("Residence: "+ eElement.getElementsByTagName("residence").item(0).getTextContent());
                    System.out.println("Year of birth: "+ eElement.getElementsByTagName("yearOfBirth").item(0).getTextContent());
                    System.out.println("Phone: "+ eElement.getElementsByTagName("phone").item(0).getTextContent());
                    System.out.println("--------------------------------------------------------------------------");
                }
            }
            
            
        }catch(Exception e) {
            // Report parse/IO failures instead of silently swallowing them
            e.printStackTrace();
        }
        
    }

}

Output:

Root element: employee
-----------------------------------------------------------------------------
Total Category inside = 4
-----------------------------------------------------
Person id#: 
Person Last Name: Rus
Person First name: Andrei
Person Age: 23
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Rus
Person First name: Andrei
Person Age: 23
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Popescu
Person First name: Ion
Person Age: 25
--------------------------------------------------------------------------
Person id#: 
Person Last Name: Domide
Person First name: Georgiana
Person Age: 33
--------------------------------------------------------------------------
=============================================================================================
Total Category inside = 3
-----------------------------------------------------
Role: Junior Developer
Skill: Java
Experience: 1
--------------------------------------------------------------------------
Role: Developer
Skill: Python
Experience: 3
--------------------------------------------------------------------------
Role: Senior Developer
Skill: C
Experience: 5
--------------------------------------------------------------------------
=============================================================================================
Total Category inside = 3
-----------------------------------------------------
Person with username: AndreiR
Username: AndreiR
Residence: Timisoara
Year of birth: 1999
Phone: 0
--------------------------------------------------------------------------
Person with username: IonP
Username: IonP
Residence: Timisoara
Year of birth: 1997
Phone: 0
--------------------------------------------------------------------------
Person with username: GeorgianaD
Username: GeorgianaD
Residence: Arad
Year of birth: 1989
Phone: 0
--------------------------------------------------------------------------

Is it possible to group the output a little more, in the following form for each person:

PersonId
firstname
lastname
age
role
skill_name
experience
username
residence
yearOfBirth
phone

Tags: java, parsing, xml-parsing, domparser


A:

0 votes | skreutzer | 12/17/2022 | #1

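But you're basically done: you wrote code for all three categories. "Fails to find the information according to each category" probably means that the output is not the desired one. The reason could be that Document.getElementsByTagName() searches globally for all elements with the name passed as the argument. Since your root element is named employee as well, it is included as an additional Node in your NodeList from doc.getElementsByTagName("employee"). The root, by the way, has no ID attribute of its own (note: attribute names are case-sensitive, so getAttribute("id") does not match ID on the inner elements either). Hence "Total Category inside = 4". If you call getElementsByTagName("lastname") on that first Node/Element, which is the root, then sure enough there is a <lastname/> element underneath it, just not as a direct child but two levels down, inside the first "actual"/desired <employee/> element. That is why the first record is printed twice.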
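
The effect can be reproduced in isolation. The following is a minimal sketch, assuming a trimmed-down copy of the XML with a single nested employee; the class name GlobalSearchDemo and the helper parse are made up for illustration and are not part of the question's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GlobalSearchDemo {

    // Trimmed-down stand-in for employees.xml: the root and one nested <employee>.
    static final String XML =
          "<employee>"
        + "<employee_list>"
        + "<employee ID=\"1\"><firstname>Andrei</firstname><lastname>Rus</lastname></employee>"
        + "</employee_list>"
        + "</employee>";

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);

        // Document-wide search: the root <employee> is matched too, and comes first.
        NodeList all = doc.getElementsByTagName("employee");
        System.out.println("Matches: " + all.getLength());

        Element root = (Element) all.item(0);
        Element inner = (Element) all.item(1);

        // The root has no ID attribute, so getAttribute returns an empty string.
        System.out.println("Root ID: '" + root.getAttribute("ID") + "'");
        // Searching below the root still finds the nested employee's <lastname>.
        System.out.println("Root lastname: "
                + root.getElementsByTagName("lastname").item(0).getTextContent());

        // Attribute names are case-sensitive: "id" does not match ID="1".
        System.out.println("Inner id: '" + inner.getAttribute("id") + "'");
        System.out.println("Inner ID: '" + inner.getAttribute("ID") + "'");
    }
}
```

With one nested employee the search should report two matches, which corresponds to the 4 (root plus 3 employees) in the question's output.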

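So what you probably want to do is search for the element name not globally, but in the local context of its category, just as you already do successfully elsewhere inside your loops. Perhaps simply change

NodeList nList = doc.getElementsByTagName("employee");

to
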
NodeList nList = doc.getElementsByTagName("employee_list");
nList = ((Element)nList.item(0)).getElementsByTagName("employee");

for the employee_list category, and likewise for the other categories.
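
To avoid repeating the two-step lookup for each category, it could be wrapped in a small helper. This is only a sketch under assumptions: the helper name elementsIn, the class ScopedLookupDemo and the shortened inline XML are invented for illustration, and each category is assumed to occur at most once in the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ScopedLookupDemo {

    // Shortened stand-in for employees.xml with all three categories.
    static final String XML =
          "<employee>"
        + "<employee_list>"
        + "<employee ID=\"1\"><lastname>Rus</lastname></employee>"
        + "<employee ID=\"2\"><lastname>Popescu</lastname></employee>"
        + "</employee_list>"
        + "<position_details>"
        + "<position ID=\"Java\"><role>Junior Developer</role></position>"
        + "</position_details>"
        + "<employee_info>"
        + "<detail ID=\"AndreiR\"><residence>Timisoara</residence></detail>"
        + "</employee_info>"
        + "</employee>";

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: returns the <tag> elements found below the first
    // <category> element, or an empty list if that category is missing.
    static List<Element> elementsIn(Document doc, String category, String tag) {
        List<Element> result = new ArrayList<>();
        NodeList categories = doc.getElementsByTagName(category);
        if (categories.getLength() == 0) {
            return result;
        }
        // Element.getElementsByTagName only looks at descendants of the category,
        // so the identically named root element is never part of the result.
        NodeList matches = ((Element) categories.item(0)).getElementsByTagName(tag);
        for (int i = 0; i < matches.getLength(); i++) {
            result.add((Element) matches.item(i));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        for (Element e : elementsIn(doc, "employee_list", "employee")) {
            System.out.println("employee ID: " + e.getAttribute("ID"));
        }
        for (Element e : elementsIn(doc, "position_details", "position")) {
            System.out.println("position ID: " + e.getAttribute("ID"));
        }
        for (Element e : elementsIn(doc, "employee_info", "detail")) {
            System.out.println("detail ID: " + e.getAttribute("ID"));
        }
    }
}
```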

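To group the records better, you don't need to output/print them immediately. You can copy/store the values obtained from the DOM in the members of objects of a class you define, or create a more generic class that stores the field values in a contained List or Map or something similar. That way, you can iterate/loop over the objects or lists you created and output/print the field values in whatever order you like.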
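
One possible way to do the grouping, sketched with the generic Map variant: it assumes that the ref attribute of position-skill and detail-ref always points at the ID of an existing position or detail element, which the sample file suggests but the question does not state. The names GroupedEmployeesDemo, indexById, text, ref and group are made up for illustration, and the inline XML is a one-employee excerpt.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.*;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GroupedEmployeesDemo {

    // One-employee excerpt of employees.xml.
    static final String XML =
          "<employee><employee_list>"
        + "<employee ID=\"1\"><firstname>Andrei</firstname><lastname>Rus</lastname><age>23</age>"
        + "<position-skill ref=\"Java\"/><detail-ref ref=\"AndreiR\"/></employee>"
        + "</employee_list><position_details>"
        + "<position ID=\"Java\"><role>Junior Developer</role><skill_name>Java</skill_name>"
        + "<experience>1</experience></position>"
        + "</position_details><employee_info>"
        + "<detail ID=\"AndreiR\"><username>AndreiR</username><residence>Timisoara</residence>"
        + "<yearOfBirth>1999</yearOfBirth><phone>0</phone></detail>"
        + "</employee_info></employee>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Text of the first <tag> element below parent.
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Value of the ref attribute on the first <tag> element below parent.
    static String ref(Element parent, String tag) {
        return ((Element) parent.getElementsByTagName(tag).item(0)).getAttribute("ref");
    }

    // Maps each <tag> element inside <category> by its ID attribute.
    static Map<String, Element> indexById(Document doc, String category, String tag) {
        Map<String, Element> byId = new HashMap<>();
        Element scope = (Element) doc.getElementsByTagName(category).item(0);
        NodeList items = scope.getElementsByTagName(tag);
        for (int i = 0; i < items.getLength(); i++) {
            Element e = (Element) items.item(i);
            byId.put(e.getAttribute("ID"), e);
        }
        return byId;
    }

    // Builds one ordered record per employee, joined with its position and detail.
    static List<Map<String, String>> group(Document doc) {
        Map<String, Element> positions = indexById(doc, "position_details", "position");
        Map<String, Element> details = indexById(doc, "employee_info", "detail");
        List<Map<String, String>> records = new ArrayList<>();
        Element list = (Element) doc.getElementsByTagName("employee_list").item(0);
        NodeList employees = list.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element emp = (Element) employees.item(i);
            // LinkedHashMap keeps the insertion order, i.e. the desired print order.
            Map<String, String> rec = new LinkedHashMap<>();
            rec.put("PersonId", emp.getAttribute("ID"));
            for (String f : new String[] {"firstname", "lastname", "age"}) {
                rec.put(f, text(emp, f));
            }
            // Assumption: every ref resolves; a dangling ref would yield null here.
            Element pos = positions.get(ref(emp, "position-skill"));
            for (String f : new String[] {"role", "skill_name", "experience"}) {
                rec.put(f, text(pos, f));
            }
            Element det = details.get(ref(emp, "detail-ref"));
            for (String f : new String[] {"username", "residence", "yearOfBirth", "phone"}) {
                rec.put(f, text(det, f));
            }
            records.add(rec);
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        for (Map<String, String> rec : group(parse(XML))) {
            rec.forEach((field, value) -> System.out.println(field + ": " + value));
            System.out.println("--------------------");
        }
    }
}
```

A dedicated class with typed fields would work just as well; the Map is used here only because it keeps the sketch short.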