Friday, July 25, 2014

DOM (Document Object Model)


What is DOM?


DOM (Document Object Model) is another way to read information from an XML file. The previous post covered the differences between SAX and DOM, so they are not repeated here. Other libraries such as JDOM and DOM4J are easier to use than the DOM API, but you have to add them to your project as extra libraries. We will discuss them later.


What is the tree structure of DOM?


A key feature of DOM is that it represents the XML document as a tree, known as the DOM tree, and reads data from that tree.

<?xml version="1.0" encoding="UTF-8"?>
<company>   
    <staff id="1">
        <firstname>John</firstname>
        <lastname>Lodne</lastname>
        <nickname>JL</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2">
        <firstname>Ann</firstname>
        <lastname>Linda</lastname>
        <nickname>Lin</nickname>
        <salary>200000</salary>
    </staff>
</company>


Here is the DOM tree that corresponds to the XML file above.




The DOM tree consists of nodes, and you can traverse it in the same way as the ordinary tree structures (binary tree, B-tree, multiway tree, etc.) from data structures. DOM is a large topic with a lot of theory behind it, but this post is about how to use DOM with Java.
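To make the tree idea concrete, here is a minimal sketch of a depth-first walk over a DOM tree. The class name `DomTreeWalk` and its helper methods are made up for this illustration, and the XML is a shortened version of the company file above, passed in as a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the original project.
class DomTreeWalk {

    // Parses an XML string into a DOM Document (checked exceptions are wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first walk: append each element name, indented by its depth in the tree.
    static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
        }
        // Visit the children from left to right, like any other tree traversal.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    // Returns the element outline of the whole document, starting at the root element.
    static String render(String xml) {
        StringBuilder out = new StringBuilder();
        walk(parse(xml).getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<company><staff id=\"1\"><firstname>John</firstname>"
                   + "<lastname>Lodne</lastname></staff></company>";
        System.out.print(render(xml));
    }
}
```

Running it prints `company`, then `staff` indented one level, then `firstname` and `lastname` indented two levels. Text nodes such as "John" are visited too, but only element names are printed.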


DOM API Packages

  • org.w3c.dom
  • javax.xml.parsers

org.w3c.dom Package

  • This package consists of a set of interfaces defined by the W3C.

javax.xml.parsers Package

  • This package contains the DocumentBuilderFactory and DocumentBuilder classes, which are specific to DOM.
  • It also contains SAX-specific classes such as SAXParserFactory and SAXParser.
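As a rough sketch of how the two packages work together: the factory and builder classes from javax.xml.parsers create the parser, and the result comes back as org.w3c.dom interfaces. The class name `DomPackagesDemo` and the method `rootOf` are invented for this example, and the XML is read from a string instead of a file to keep it short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class DomPackagesDemo {

    // Parses the XML text and returns its root element.
    static Element rootOf(String xml) {
        try {
            // javax.xml.parsers: factory -> builder (the actual DOM parser)
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();

            // org.w3c.dom: the parsed result is exposed through W3C interfaces
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = rootOf("<company><staff id=\"1\"/></company>");
        System.out.println("Root element: " + root.getTagName());
    }
}
```

Document and Element are interfaces. The concrete classes behind them come from whichever parser implementation the factory picks, so your code only depends on the W3C types.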


    Node types in DOM


    Everything in an XML document is a node. DOM defines many node types (see the W3C resource for the full list), but this post mainly uses two:
    • Element node: represents an element
    • Text node: represents the text content of an element
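The difference between the two node types is easiest to see by counting them. The sketch below uses invented names (`NodeTypesDemo`, `parseRoot`, `countChildTypes`) and a small staff fragment. It assumes the default, non-validating parser settings, under which the whitespace between tags is also kept as text nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class NodeTypesDemo {

    // Parses the XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns {elementCount, textCount} for the direct children of the given element.
    static int[] countChildTypes(Element parent) {
        int elements = 0;
        int texts = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            short type = children.item(i).getNodeType();
            if (type == Node.ELEMENT_NODE) {
                elements++;
            } else if (type == Node.TEXT_NODE) {
                texts++;
            }
        }
        return new int[] {elements, texts};
    }

    public static void main(String[] args) {
        Element staff = parseRoot(
                "<staff>\n  <firstname>John</firstname>\n  <lastname>Lodne</lastname>\n</staff>");
        int[] counts = countChildTypes(staff);
        System.out.println("Element children: " + counts[0] + ", text children: " + counts[1]);

        // The text "John" is not the element itself, it is a text node inside it.
        Node text = staff.getElementsByTagName("firstname").item(0).getFirstChild();
        System.out.println("Is text node: " + (text.getNodeType() == Node.TEXT_NODE));
        System.out.println("Value: " + text.getNodeValue());
    }
}
```

This is why the reading code in this post checks for `Node.ELEMENT_NODE` before casting: a child list can mix element nodes with whitespace-only text nodes.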


      How does DOM work?

      1. The DOM parser parses the entire XML document.
      2. It loads the document into memory.
      3. It builds the tree.
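Because the finished tree stays in memory, you can move through it in any direction after parsing, including upwards from a child to its parent. Here is a small sketch of that idea. The names `DomInMemoryDemo`, `parse` and `pathToRoot` are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class DomInMemoryDemo {

    // Parses the XML text into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Climbs from an element up to the root element and builds a path like /a/b/c.
    static String pathToRoot(Node node) {
        StringBuilder path = new StringBuilder();
        // The parent of the root element is the Document node, which ends the loop.
        for (Node n = node; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            path.insert(0, "/" + n.getNodeName());
        }
        return path.toString();
    }

    public static void main(String[] args) {
        Document document = parse("<company><staff><salary>100000</salary></staff></company>");
        Node salary = document.getElementsByTagName("salary").item(0);
        System.out.println(pathToRoot(salary)); // prints /company/staff/salary
    }
}
```

A SAX handler reports events once, in document order, and keeps no tree, so going back to a parent like this would require you to track the state yourself.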


        Disadvantages of DOM

        1. It consumes a lot of memory, because the whole document is held in memory (SAX uses less memory).
        2. It is generally slower than SAX.

        How to read an XML file using DOM

        Students.xml


        <?xml version="1.0" encoding="UTF-8"?>
        <school>   
            <student id="1">
                <firstname>John</firstname>
                <lastname>Lodne</lastname>
                <age>20</age>
                <city>A</city>
            </student>
            <student id="2">
                <firstname>Ann</firstname>
                <lastname>Linda</lastname>
                <age>23</age>
                <city>B</city>
            </student>
        </school>
        
        

        DomTest.java


        package DomPkg;
        
        import java.io.File;
        import java.util.Scanner;
        import javax.xml.parsers.DocumentBuilder;
        import javax.xml.parsers.DocumentBuilderFactory;
        import org.w3c.dom.Document;
        import org.w3c.dom.Element;
        import org.w3c.dom.Node;
        import org.w3c.dom.NodeList;
        
        public class DomTest {
            
          public static void main(String[] args) {
                
            // try-with-resources closes the Scanner when the block ends
            try (Scanner scn = new Scanner(System.in)) {
              System.out.print("Enter your file path: ");
              String filePath = scn.nextLine();
                    
              // Create the DOM parser and load the whole file into memory as a tree.
              // parse(String) treats its argument as a URI, so wrap the path in a File.
              DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
              DocumentBuilder db = dbFactory.newDocumentBuilder();
              Document document = db.parse(new File(filePath));
              
              // Merge adjacent text nodes and remove empty ones
              document.getDocumentElement().normalize();
                   
              System.out.println("-----------------------");
              System.out.println("Root element: " + document.getDocumentElement().getNodeName());
              System.out.println("-----------------------");
              
              // Collect every <student> element in the document
              NodeList list = document.getElementsByTagName("student");
                    
              for (int i = 0; i < list.getLength(); i++) {
                Node node = list.item(i);
                System.out.println("Current Element: " + node.getNodeName());
                        
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                   Element element = (Element) node;
                           
                   // Each child element holds its value in a text node; getTextContent() returns it
                   System.out.println("Firstname: " + element.getElementsByTagName("firstname").item(0).getTextContent());
                   System.out.println("Lastname: " + element.getElementsByTagName("lastname").item(0).getTextContent());
                   System.out.println("Age: " + element.getElementsByTagName("age").item(0).getTextContent());
                   System.out.println("City: " + element.getElementsByTagName("city").item(0).getTextContent());
                   System.out.println("-----------------------");
                }
              }
            } catch (Exception e) {
                // Covers a missing file, malformed XML and parser configuration errors
                e.printStackTrace();
            }
          }
        }
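The program above reads the child elements but not the `id` attribute of each student. As a small add-on sketch, attributes can be read with `Element.getAttribute`. The class name `StudentIds` and the method `summarize` are invented for this example, and the XML is passed as a string so the block runs on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class StudentIds {

    // Returns one line per <student>, combining its id attribute and first name.
    static List<String> summarize(String xml) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        List<String> lines = new ArrayList<>();
        NodeList students = document.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            // getElementsByTagName only returns element nodes, so the cast is safe
            Element student = (Element) students.item(i);
            // getAttribute returns an empty string when the attribute is missing
            String id = student.getAttribute("id");
            String firstname = student.getElementsByTagName("firstname").item(0).getTextContent();
            lines.add("id=" + id + " firstname=" + firstname);
        }
        return lines;
    }

    public static void main(String[] args) {
        String xml = "<school>"
                   + "<student id=\"1\"><firstname>John</firstname></student>"
                   + "<student id=\"2\"><firstname>Ann</firstname></student>"
                   + "</school>";
        for (String line : summarize(xml)) {
            System.out.println(line);
        }
    }
}
```

In DomTest.java the same call would be `element.getAttribute("id")` inside the loop.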
        
        


        Here is the link to download the NetBeans project file. You can download it and run.


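        For Students.xml, the program prints the following after you enter the file path:

        -----------------------
        Root element: school
        -----------------------
        Current Element: student
        Firstname: John
        Lastname: Lodne
        Age: 20
        City: A
        -----------------------
        Current Element: student
        Firstname: Ann
        Lastname: Linda
        Age: 23
        City: B
        -----------------------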



        There are many ways to read an XML file with DOM. Another approach walks the entire XML file and reports each node together with its node type. This is the approach recommended on the Oracle site.
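A minimal sketch of that kind of full traversal is shown here. It is my own illustration of the idea, not the code from the Oracle site, and the names `DomNodeDump`, `typeName` and `dump` are invented. Whitespace-only text nodes are skipped to keep the listing readable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original project.
class DomNodeDump {

    // Maps the numeric node type to a readable label.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: return "ELEMENT";
            case Node.TEXT_NODE:    return "TEXT";
            case Node.COMMENT_NODE: return "COMMENT";
            default:                return "OTHER(" + node.getNodeType() + ")";
        }
    }

    // Recursively lists a node, its attributes and all of its descendants.
    static void dump(Node node, StringBuilder out) {
        boolean blankText = node.getNodeType() == Node.TEXT_NODE
                && node.getNodeValue().trim().isEmpty();
        if (!blankText) {
            out.append(typeName(node)).append(' ').append(node.getNodeName());
            if (node.getNodeValue() != null) {
                out.append(" = ").append(node.getNodeValue().trim());
            }
            out.append('\n');
        }
        // getAttributes() returns null for everything except elements
        NamedNodeMap attrs = node.getAttributes();
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append("ATTRIBUTE ").append(attr.getNodeName())
                   .append(" = ").append(attr.getNodeValue()).append('\n');
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(children.item(i), out);
        }
    }

    // Parses the XML text and returns the listing for the whole document.
    static String dumpXml(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            dump(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(dumpXml(
                "<school>\n  <student id=\"1\">\n    <age>20</age>\n  </student>\n</school>"));
    }
}
```

Unlike `getElementsByTagName`, this version does not need to know the tag names in advance, so it works for any XML file.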




        Hope you enjoyed it. Let's meet in the next post.