Friday, July 18, 2014

SAX (Simple API for XML)

What is SAX?


  • SAX is a simple, event-based API for reading XML data: the parser reports each part of the document to your code as an event.
  • It takes a single pass over the document, from start to finish.
  • Once parsing starts, the parser moves forward through the document to the end; it does not go back to an earlier part.

SAX Processing steps

  1. Create an event handler.
  2. Create the SAX parser.
  3. Assign the event handler to the parser.
  4. Parse the document while sending each event to the handler.
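The four steps above can be sketched in a few lines of Java. This is only a minimal illustration: the class name SaxStepsSketch, the method elementNames and the inline XML string are made up for this example, and the document is read from a string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class SaxStepsSketch {

    // Returns the element names in the order the parser reports them.
    static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();

        // Step 1: create an event handler.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        // Step 2: create the SAX parser.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        // Steps 3 and 4: give the handler to the parser and parse the
        // document; every start tag is sent to the handler as an event.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>John</firstname></staff></company>";
        System.out.println(elementNames(xml));
    }
}
```

With the JAXP convenience method used here, steps 3 and 4 happen in the same parse() call.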



SAX vs DOM

SAX API Packages


The SAX API mainly consists of the following packages.

  • org.xml.sax
  • org.xml.sax.ext
  • org.xml.sax.helpers
  • javax.xml.parsers

org.xml.sax Package


This package contains the basic interfaces of the SAX API. The four main handler interfaces are:

  • ContentHandler
  • ErrorHandler
  • DTDHandler
  • EntityResolver

The ContentHandler Interface

This interface provides the callback methods that a SAX parser invokes while it parses an XML file. You implement it to receive parsing event notifications. Its most commonly used methods are the following.

  • startDocument() - Invoked when the parser starts parsing the document.
  • endDocument() - Invoked when the parser finishes parsing the document.
  • startElement() - Invoked when the parser encounters a start tag.
  • endElement() - Invoked when the parser encounters an end tag.
  • characters() - Invoked when the parser encounters character data.
  • ignorableWhitespace() - Invoked when the parser encounters ignorable white space in element content.
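To see the order in which these callbacks arrive, the following sketch records each event in a list. The class name ContentEventsSketch and the sample document are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class ContentEventsSketch {

    // Parses the XML text and returns a log of the ContentHandler events.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one run of text in several calls.
                log.add("text:" + new String(ch, start, length));
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<note><to>Ann</to></note>")) {
            System.out.println(event);
        }
    }
}
```

The log begins with startDocument and ends with endDocument, and the start and end events of an inner element always fall between those of its parent.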

The ErrorHandler Interface

This interface defines the methods that handle errors that might occur during parsing. It has three methods.

  • warning() - Invoked to report a warning during processing.
  • error() - Invoked when a recoverable error occurs while parsing the XML.
  • fatalError() - Invoked when a fatal (non-recoverable) error occurs while parsing the XML.
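As a rough illustration of these callbacks, the sketch below collects every reported problem in a list. The class name ErrorHandlerSketch and the malformed sample document are invented for this example; a document that is not well-formed, such as one with a missing end tag, is reported through fatalError().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class ErrorHandlerSketch {

    // Parses the XML text and returns the problems the parser reported.
    static List<String> problems(String xml) throws Exception {
        List<String> problems = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning at line " + e.getLineNumber());
            }

            @Override
            public void error(SAXParseException e) {
                problems.add("error at line " + e.getLineNumber());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                problems.add("fatalError at line " + e.getLineNumber());
                throw e; // parsing cannot continue after a fatal error
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Already recorded by fatalError(); nothing more to do here.
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // The <b> element is never closed, so the document is not well-formed.
        System.out.println(problems("<a><b></a>"));
    }
}
```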

The DTDHandler Interface

This interface defines methods that receive notifications when a parser processes the DTD (Document Type Definition) of an XML file. Notations describe the format of data, such as binary data, that the XML parser cannot parse itself.

  • notationDecl() - Gives notification of a notation declaration.
  • unparsedEntityDecl() - Gives notification of the declaration of an unparsed entity, that is, an entity the XML parser does not parse.
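The following sketch shows both callbacks firing for a small internal DTD. The class name DtdHandlerSketch, the notation name gif and the entity name logo are all invented for this example; the file logo.gif is never opened because unparsed entities are only declared, not read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class DtdHandlerSketch {

    // Returns the notation and unparsed entity declarations found in the DTD.
    static List<String> declarations(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                seen.add("notation:" + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                seen.add("unparsed:" + name + " (" + notationName + ")");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Made-up document with one notation and one unparsed entity.
        String xml = "<!DOCTYPE doc ["
                + "<!ELEMENT doc EMPTY>"
                + "<!NOTATION gif SYSTEM \"image/gif\">"
                + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>"
                + "]><doc/>";
        System.out.println(declarations(xml));
    }
}
```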

The EntityResolver Interface

This interface declares a single method, resolveEntity(), which is invoked when the parser must locate external data identified by a URI (Uniform Resource Identifier). A URI is either a URL (Uniform Resource Locator) or a URN (Uniform Resource Name): a URL specifies the location of a file, while a URN identifies a document by name.
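A typical use of resolveEntity() is to supply the content of an external entity yourself instead of letting the parser fetch it. The sketch below does this for a made-up URN; the class name EntityResolverSketch, the entity name greeting and the URN urn:example:greeting are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class EntityResolverSketch {

    // Parses the XML text and returns all character data it contains.
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Made-up mapping: serve this entity from memory.
                if (systemId != null && systemId.endsWith("greeting")) {
                    return new InputSource(new StringReader("hello"));
                }
                return null; // let the parser resolve anything else itself
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc ["
                + "<!ENTITY greeting SYSTEM \"urn:example:greeting\">"
                + "]><doc>&greeting;</doc>";
        System.out.println(textOf(xml));
    }
}
```

Returning null from resolveEntity() tells the parser to fall back to its default behaviour for that entity.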


org.xml.sax.ext Package

This package defines SAX extensions that can be used for advanced SAX processing, such as accessing the DTD declarations or the lexical information of an XML document. Its two main handler interfaces are:

  • DeclHandler - Declares methods that provide notifications about DTD declarations.
  • LexicalHandler - Declares methods that provide notifications about lexical events, such as comments and CDATA sections.
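Unlike the core handlers, a LexicalHandler is registered through a parser property. The sketch below collects the comments of a document this way; the class name LexicalHandlerSketch is invented, and DefaultHandler2 (also in org.xml.sax.ext) is used as a convenient base class that already implements LexicalHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical class name, used only for this sketch.
class LexicalHandlerSketch {

    // Returns the text of every comment in the document.
    static List<String> comments(String xml) throws Exception {
        List<String> comments = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                comments.add(new String(ch, start, length).trim());
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Lexical events are only delivered if this property is set.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return comments;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(comments("<a><!-- note --><b/></a>"));
    }
}
```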

org.xml.sax.helpers Package


This package contains the helper classes of the SAX API. The DefaultHandler class in this package implements the ContentHandler, DTDHandler, EntityResolver and ErrorHandler interfaces with default implementations, so you only need to override the methods you actually need.
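As a small illustration of overriding only what you need, the sketch below extends DefaultHandler and overrides a single method to count elements. The class name StaffCounter and the helper countStaff are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class StaffCounter extends DefaultHandler {

    private int count;

    // The only overridden callback; every other event keeps the
    // default behaviour inherited from DefaultHandler.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        if ("staff".equals(qName)) {
            count++;
        }
    }

    int getCount() {
        return count;
    }

    // Convenience method: parse the XML text and return the number of staff elements.
    static int countStaff(String xml) throws Exception {
        StaffCounter counter = new StaffCounter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        return counter.getCount();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff id=\"1\"/><staff id=\"2\"/></company>";
        System.out.println(countStaff(xml));
    }
}
```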

javax.xml.parsers Package

This is another important package. It contains the classes that let Java applications process XML documents, and it provides parsers for both SAX and DOM. For SAX, the two classes are:

  • SAXParser - Parses an XML file and sends the events to a handler.
  • SAXParserFactory - Creates and configures SAXParser objects.
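The factory is also the place where parser options are set before a SAXParser is created. The sketch below toggles namespace awareness and reports what startElement() receives for the root element; the class name FactoryConfigSketch, the prefix c and the namespace URI urn:example:company are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class FactoryConfigSketch {

    // Describes the names reported for the root element of the document.
    static String describeRoot(String xml, boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware); // configure before creating the parser
        SAXParser parser = factory.newSAXParser();

        final String[] result = new String[1];
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (result[0] == null) { // keep only the root element
                    result[0] = "uri=" + uri + " localName=" + localName + " qName=" + qName;
                }
            }
        });
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<c:company xmlns:c=\"urn:example:company\"/>";
        System.out.println(describeRoot(xml, true));
        System.out.println(describeRoot(xml, false));
    }
}
```

Factories are not namespace aware by default, which is why the example in this post compares against qName rather than localName.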

How to read an XML file using SAX?


This can be done in many ways. You can read and print either the whole XML file or only the parts you need. The following example reads the file and prints the information as a list.

Employee.xml


<?xml version="1.0" encoding="UTF-8"?>
<company>   
    <staff id="1">
        <firstname>John</firstname>
        <lastname>Lodne</lastname>
        <nickname>JL</nickname>
        <salary>100000</salary>
    </staff>
    <staff id="2">
        <firstname>Ann</firstname>
        <lastname>Linda</lastname>
        <nickname>Lin</nickname>
        <salary>200000</salary>
    </staff>
    <staff id="3">
        <firstname>Donna</firstname>
        <lastname>Flor</lastname>
        <nickname>Don</nickname>
        <salary>150000</salary>
    </staff>
</company>

SaxXmlRead.java


package SaxTest;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class SaxXmlRead extends DefaultHandler{
 
  public void readXml(){
    try {
      SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();   
      SAXParser saxParser = saxParserFactory.newSAXParser();

      DefaultHandler defaultHandler = new DefaultHandler(){

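         /* Each flag is "oTag" (open tag) while the parser is inside the
            element and "cTag" (close tag) otherwise.
            fnTag = firstname, lnTag = lastname,
            nnTag = nickname, sTag = salary */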
        
        String fnTag="cTag";
        String lnTag="cTag";
        String nnTag="cTag";
        String sTag="cTag";

        public void startElement(String uri, String localName, String qName,Attributes attributes) throws SAXException{
          if (qName.equalsIgnoreCase("firstname")) {
            fnTag = "oTag";
          }
          if (qName.equalsIgnoreCase("lastname")) {
            lnTag = "oTag";
          }
          if (qName.equalsIgnoreCase("nickname")) {
            nnTag = "oTag";
          }
          if (qName.equalsIgnoreCase("salary")) {
            sTag = "oTag";
          }
        }

        public void characters(char ch[], int start, int length)throws SAXException {
          if (fnTag.equals("oTag")) {
            System.out.println("First Name : " + new String(ch, start, length));
          }
          if (lnTag.equals("oTag")) {
            System.out.println("Last Name : " + new String(ch, start, length));
          }
          if (nnTag.equals("oTag")) {
            System.out.println("Nick Name : " + new String(ch, start, length));
          }
          if (sTag.equals("oTag")) {
            System.out.println("Salary : " + new String(ch, start, length));
            System.out.println("----------------------");
          }
        }

        public void endElement(String uri, String localName, String qName)throws SAXException {
          if (qName.equalsIgnoreCase("firstName")) {
            fnTag = "cTag";
          }
          if (qName.equalsIgnoreCase("lastName")) {
            lnTag = "cTag";
          }
          if (qName.equalsIgnoreCase("nickname")) {
            nnTag = "cTag";
          }
          if (qName.equalsIgnoreCase("salary")) {
            sTag = "cTag";
          }
        }
      };

      saxParser.parse("C:\\Users\\Ravi\\Documents\\NetBeansProjects\\Sax\\src\\SaxTest\\Employee.xml", defaultHandler);
        
    }catch (Exception e) {
       e.printStackTrace();
    }
  }
}



SaxXmlIpm.java


package SaxTest;

public class SaxXmlIpm {
    public static void main(String args[]){
        SaxXmlRead xml = new SaxXmlRead();
        xml.readXml();
    }
}


I created a package called SaxTest and added all three files to it. You need to change the file path of the XML file in SaxXmlRead.java. (You can find the path by right-clicking the XML file and opening its properties.)



Then run SaxXmlIpm.java. It prints the first name, last name, nickname and salary of each staff member, with a separator line after each one.






You can also download the NetBeans project file, import it into NetBeans and run it easily.




Now you have an idea of how to get information from an XML file as a list. Here is another project file that shows how to print a whole XML file using SAX. You can download it and run it.
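For readers without the project file, printing a whole document can be sketched roughly as follows. This is not the code from that project, only a minimal illustration: the class name SaxEchoSketch is made up, the XML is read from a string, and special characters such as & are not escaped again on output.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this sketch.
class SaxEchoSketch {

    // Rebuilds the document text from the SAX events.
    static String echo(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                out.append('<').append(qName);
                for (int i = 0; i < attributes.getLength(); i++) {
                    out.append(' ').append(attributes.getQName(i))
                       .append("=\"").append(attributes.getValue(i)).append('"');
                }
                out.append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length); // text and whitespace between tags
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.append("</").append(qName).append('>');
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff id=\"1\"><firstname>John</firstname></staff></company>";
        System.out.println(echo(xml));
    }
}
```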