DOM XML Parsing, dBuilder.parse(), normalize()

by - January 31, 2018


I used DOM to parse part of the data in an XML instance.

public class Ex_xml_parsing {

    public static void main(String[] args) {

        try {

            File xmlFile = new File("D:/Workspace/_JAVA/Ex_xml_parsing/src/test.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance(); // create the factory instance
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); // create the builder
            Document doc = dBuilder.parse(xmlFile); // parse the XML file into a DOM object

dBuilder.parse(xmlFile)

DocumentBuilder.parse(File f): Parse the content of the given file as an XML document and return a new DOM Document object. An IllegalArgumentException is thrown if the File is null. Throws SAXException if a parse error occurs and IOException if an I/O error occurs.

(In short: it parses the content of the XML document and returns it as a DOM Document object.)
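
DocumentBuilder also has a parse(InputStream) overload, which is handy when the XML does not live in a file. The following is a minimal sketch, not part of the original post: the class name ParseOverloadDemo, the helper rootName, and the inline XML string are all made up for illustration, and the checked exceptions are wrapped in an unchecked one only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; the name and the inline XML are illustrative only.
class ParseOverloadDemo {

    // Parses an XML string through the parse(InputStream) overload
    // and returns the name of the root element.
    static String rootName(String xml) {
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(in);
            return doc.getDocumentElement().getNodeName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints "company"
        System.out.println(rootName("<company><staff id=\"1001\"/></company>"));
    }
}
```

Whether the source is a File or an InputStream, the result is the same kind of Document object, so the rest of the code does not need to change.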

doc.getDocumentElement().normalize(); // normalize the parsed document

doc.getDocumentElement().normalize(): Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are neither adjacent Text nodes nor empty Text nodes.

This basically means that the following XML element

<foo>hello
wor
ld</foo>
could be represented like this in a denormalized node:

Element foo
    Text node: ""
    Text node: "hello\n"
    Text node: "wor\n"
    Text node: "ld"
When normalized, the node will look like this:

Element foo
    Text node: "hello\nwor\nld"

Normalization only merges adjacent Text nodes and drops empty ones; the line breaks inside the text are kept.

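
To see the effect of normalize() directly, one option is to build a denormalized element by hand and count its children before and after the call. This is only a sketch under assumptions: the class name NormalizeDemo and the helper buildDenormalizedFoo are hypothetical, and the element is assembled with createTextNode rather than produced by a parser, since whether a parser actually splits text into several nodes depends on the implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; names are illustrative only.
class NormalizeDemo {

    // Builds a <foo> element whose text is deliberately split into
    // one empty Text node and three adjacent Text nodes.
    static Element buildDenormalizedFoo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element foo = doc.createElement("foo");
            doc.appendChild(foo);
            foo.appendChild(doc.createTextNode(""));
            foo.appendChild(doc.createTextNode("hello\n"));
            foo.appendChild(doc.createTextNode("wor\n"));
            foo.appendChild(doc.createTextNode("ld"));
            return foo;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
    }

    public static void main(String[] args) {
        Element foo = buildDenormalizedFoo();
        String textBefore = foo.getTextContent();

        // prints "Before: 4 child nodes"
        System.out.println("Before: " + foo.getChildNodes().getLength() + " child nodes");

        foo.normalize(); // merge adjacent Text nodes, remove empty ones

        // prints "After: 1 child nodes"
        System.out.println("After: " + foo.getChildNodes().getLength() + " child nodes");

        // prints "Same text: true" because normalize() does not change the text itself
        System.out.println("Same text: " + textBefore.equals(foo.getTextContent()));
    }
}
```

Note that getTextContent() returns the same string either way, so normalize() mainly matters for code that walks child nodes one by one.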


- Full code

 
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;

/*
DOM XML Parser Example
Ref - http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
*/
public class Ex_xml_parsing {

    public static void main(String[] args) {

        try {

            File xmlFile = new File("D:/Workspace/_JAVA/Ex_xml_parsing/src/test.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance(); // create the factory instance
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); // create the builder
            Document doc = dBuilder.parse(xmlFile); // parse the XML file into a DOM object

            doc.getDocumentElement().normalize(); // normalize the parsed document

            // Print the name of the root element (the top-level node)
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            System.out.println("\n");
            // Find every element named "staff" and get them back as a NodeList
            NodeList nList = doc.getElementsByTagName("staff");

            // Print one entry per item in the list
            for (int temp = 0; temp < nList.getLength(); temp++) {

                Node nNode = nList.item(temp); // NodeList.item(int index): returns the index-th item in the collection
                // getNodeType(): a code representing the type of the underlying object
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    // ELEMENT_NODE: the node is an Element
                    Element eElement = (Element) nNode;
                    // getElementsByTagName(String name): returns a NodeList of all descendant
                    // Elements with a given tag name, in document order
                    // getTextContent(): returns the text content of this node and its descendants

                    System.out.println("Staff id : " + eElement.getAttribute("id"));
                    System.out.println("First Name : " + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Last Name : " + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Nick Name : " + eElement.getElementsByTagName("nickname").item(0).getTextContent());
                    System.out.println("Salary : " + eElement.getElementsByTagName("salary").item(0).getTextContent());
                }
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

- Output

Root element :company


Staff id : 1001
First Name : yong
Last Name : mook kim
Nick Name : mkyong
Salary : 100000
Staff id : 2001
First Name : low
Last Name : yin fong
Nick Name : fong fong
Salary : 200000


- test.xml

<?xml version="1.0" encoding="utf-8"?>
<company>
    <staff id="1001">
        <firstname>yong</firstname>
        <firstname>kong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2001">
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>
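
In test.xml the first staff element has two firstname children, but the code only reads item(0), so "kong" never shows up in the output. A possible way to collect every match is sketched below. It is an illustration rather than part of the original code: the class AllFirstNamesDemo, the helpers textsOf and parse, and the shortened inline XML are all assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names and the shortened XML are illustrative only.
class AllFirstNamesDemo {

    // A cut-down copy of test.xml that keeps only the id and firstname data.
    static final String XML =
          "<company>"
        + "<staff id=\"1001\"><firstname>yong</firstname><firstname>kong</firstname></staff>"
        + "<staff id=\"2001\"><firstname>low</firstname></staff>"
        + "</company>";

    // Returns the text of every descendant element of parent with the given
    // tag name, in document order. An absent tag yields an empty list.
    static List<String> textsOf(Element parent, String tag) {
        List<String> texts = new ArrayList<>();
        NodeList matches = parent.getElementsByTagName(tag);
        for (int i = 0; i < matches.getLength(); i++) {
            texts.add(matches.item(i).getTextContent());
        }
        return texts;
    }

    // Parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped in an unchecked exception to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        NodeList staffList = doc.getElementsByTagName("staff");
        for (int i = 0; i < staffList.getLength(); i++) {
            Element staff = (Element) staffList.item(i);
            // prints "1001 -> [yong, kong]" and then "2001 -> [low]"
            System.out.println(staff.getAttribute("id") + " -> " + textsOf(staff, "firstname"));
        }
    }
}
```

Looping over the NodeList also avoids the NullPointerException that item(0).getTextContent() would throw when a tag is missing, because a missing tag simply produces an empty list.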
