DOM XML Parsing, dBuilder.parse(), normalize()

by Unknown - January 31, 2018

I tried parsing part of the data in an XML instance using DOM.


public class Ex_xml_parsing {
    public static void main(String[] args) {
        try {
            File xmlFile = new File("D:/Workspace/_JAVA/Ex_xml_parsing/src/test.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance(); // create the factory instance
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); // create the builder
            Document doc = dBuilder.parse(xmlFile); // parse the XML file into a DOM object


dBuilder.parse(xmlFile)

DocumentBuilder.parse(File f): Parse the content of the given file as an XML document and return a new DOM Document object. An IllegalArgumentException is thrown if the File is null. Throws SAXException if a parse error occurs and IOException if an I/O error occurs.
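parse() has other overloads besides the File one, including parse(InputStream). The sketch below is a minimal illustration of that overload and of the SAXException thrown for input that is not well-formed. The class name ParseSketch and the helper parseString are hypothetical names made up for this example; they are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class ParseSketch {

    // Hypothetical helper: parses XML held in a String via the parse(InputStream) overload.
    static Document parseString(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return builder.parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        Document ok = parseString("<company><staff id=\"1001\"/></company>");
        System.out.println(ok.getDocumentElement().getNodeName()); // prints "company"

        try {
            // The <staff> tag is never closed, so the document is not well-formed.
            parseString("<company><staff></company>");
        } catch (SAXException e) {
            System.out.println("not well-formed: " + e.getMessage());
        }
    }
}
```

Catching SAXException and IOException separately, rather than a blanket Exception, makes it easier to tell a broken document apart from a missing or unreadable file.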

doc.getDocumentElement().normalize(); // normalize the parsed document

doc.getDocumentElement().normalize(): Puts all Text nodes in the full depth of the sub-tree underneath this Node, including attribute nodes, into a "normal" form where only structure (e.g., elements, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are neither adjacent Text nodes nor empty Text nodes.

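This basically means that the following XML element

<foo>hello
wor
ld</foo>

could be represented like this in a denormalized node:

Element foo
Text node: ""
Text node: "hello\n"
Text node: "wor\n"
Text node: "ld"

When normalized, adjacent Text nodes are merged and empty ones are removed, so the node will look like this:

Element foo
Text node: "hello\nwor\nld"

Note that normalize() only merges Text nodes. It does not strip the line breaks inside the text.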
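To see the effect directly, here is a minimal sketch that builds a foo element by hand from four Text fragments (one of them empty) and counts its children before and after normalize(). The class name NormalizeSketch and the helper buildFragmentedFoo are hypothetical names for this illustration. Whether a parser actually hands back fragmented text like this depends on the implementation, so the fragments are created manually here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {

    // Hypothetical helper: builds <foo> with four separate Text children, one of them empty.
    static Element buildFragmentedFoo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element foo = doc.createElement("foo");
        doc.appendChild(foo);
        foo.appendChild(doc.createTextNode(""));
        foo.appendChild(doc.createTextNode("hello\n"));
        foo.appendChild(doc.createTextNode("wor\n"));
        foo.appendChild(doc.createTextNode("ld"));
        return foo;
    }

    public static void main(String[] args) throws Exception {
        Element foo = buildFragmentedFoo();
        System.out.println(foo.getChildNodes().getLength()); // 4 Text nodes before

        foo.normalize(); // merge adjacent Text nodes, drop empty ones

        System.out.println(foo.getChildNodes().getLength()); // 1 Text node after
        System.out.println("hello\nwor\nld".equals(foo.getTextContent())); // true
    }
}
```

getTextContent() returns the same string before and after the call; what changes is only how many Text nodes carry it.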


- Full code


import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/*
DOM XML Parser Example
Ref - http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
*/
public class Ex_xml_parsing {

    public static void main(String[] args) {
        try {
            File xmlFile = new File("D:/Workspace/_JAVA/Ex_xml_parsing/src/test.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance(); // create the factory instance
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder(); // create the builder
            Document doc = dBuilder.parse(xmlFile); // parse the XML file into a DOM object
            doc.getDocumentElement().normalize(); // normalize the parsed document

            // Print the name of the root element (the top-level node)
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            System.out.println("\n");

            // Find every element named "staff" and return them as a NodeList
            NodeList nList = doc.getElementsByTagName("staff");

            // Print one record per item in the list
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp); // NodeList.item(int index): returns the indexth item in the collection

                // getNodeType(): a code representing the type of the underlying object
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    // ELEMENT_NODE: the node is an Element
                    Element eElement = (Element) nNode;

                    // getElementsByTagName(String name): returns a NodeList of all descendant
                    // Elements with a given tag name, in document order
                    // getTextContent(): returns the text content of this node and its descendants
                    System.out.println("Staff id : " + eElement.getAttribute("id"));
                    System.out.println("First Name : " + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Last Name : " + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Nick Name : " + eElement.getElementsByTagName("nickname").item(0).getTextContent());
                    System.out.println("Salary : " + eElement.getElementsByTagName("salary").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


- Output

Root element :company

Staff id : 1001
First Name : yong
Last Name : mook kim
Nick Name : mkyong
Salary : 100000
Staff id : 2001
First Name : low
Last Name : yin fong
Nick Name : fong fong
Salary : 200000

- test.xml

<?xml version="1.0" encoding="utf-8"?>
<company>
    <staff id="1001">
        <firstname>yong</firstname>
        <firstname>kong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2001">
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>
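In this test.xml, staff 1001 has two firstname elements, but the code only reads item(0), so "kong" never appears in the output. Also, if a tag were missing entirely, item(0) would return null and the getTextContent() call would throw a NullPointerException. A minimal sketch of one way to handle both cases follows. The class AllValuesSketch and the helpers textsOf and firstStaff are hypothetical names for this illustration, and the XML is inlined as a String so the example runs without the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class AllValuesSketch {

    // Hypothetical helper: collects the text of every descendant element named `tag`.
    // Returns an empty list (instead of failing) when no such element exists.
    static List<String> textsOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent());
        }
        return out;
    }

    // Hypothetical helper: parses the XML string and returns its first <staff> element.
    static Element firstStaff(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return (Element) doc.getElementsByTagName("staff").item(0);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff id=\"1001\">"
                + "<firstname>yong</firstname><firstname>kong</firstname>"
                + "</staff></company>";
        Element staff = firstStaff(xml);
        System.out.println(textsOf(staff, "firstname"));  // [yong, kong]
        System.out.println(textsOf(staff, "middlename")); // []
    }
}
```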

Tags : XML

chuckolet's Blog
