
Java XML Parsers

Last Updated : 27 Jun, 2024

XML is a versatile data format used for storing and transporting structured information. In Java, it is widely used for configuration files and data interchange. To work with XML documents, Java provides a set of parsers that read XML content and expose it to the program for reading and editing. Any Java developer who works with XML should be familiar with them.

This article covers four commonly used Java XML APIs:

  • DOM (Document Object Model)
  • SAX (Simple API for XML)
  • StAX (Streaming API for XML)
  • JAXB (Java Architecture for XML Binding)

Each one serves different needs, from simple data extraction to complex document manipulation.

This article introduces each of them and describes its key features and use cases.

XML File Used in the Examples

The Java programs below use the following XML file:

example.xml
<?xml version="1.0" encoding="UTF-8"?>
<Test>
  <case id="1">
    <domain>Java</domain>
    <count>39</count>
  </case>

  <case id="2">
    <domain>C/C++</domain>
    <count>45</count>
  </case>
</Test>


Types of XML Parsers

1. DOM (Document Object Model) Parser

Overview

The DOM parser reads the entire XML document and builds an in-memory tree representation, which can then be traversed and manipulated through the standard DOM API.

Features

  • Tree View: Represents the XML document as a tree of nodes.
  • Random Access: Any node can be accessed and modified at any time.
  • Rich API: Provides methods for traversing, modifying, and querying the document.
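As one illustration of the querying side of the API, the sketch below runs an XPath expression against a parsed DOM tree. It is a minimal sketch under some assumptions: the class name `DomQuerySketch` and the helper `domainForCase` are hypothetical, and the XML is an inline copy of example.xml so that the code runs without a file on disk.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomQuerySketch {
    // Inline copy of example.xml (assumption: used instead of reading the file).
    static final String XML =
        "<Test>"
      + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
      + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
      + "</Test>";

    // Returns the <domain> text of the <case> whose id attribute equals `id`.
    static String domainForCase(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The expression is built by concatenation only to keep the sketch short.
            return xpath.evaluate("/Test/case[@id='" + id + "']/domain", doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(domainForCase(XML, "2")); // prints "C/C++"
    }
}
```

Because the whole tree is in memory, the same `Document` can be queried any number of times without re-reading the input.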

Use Cases

  • Complex XML Documents: Useful for documents whose nodes need to be accessed and changed often.
  • In-Memory Operations: Ideal for applications that need the entire XML structure in memory in order to manipulate it.

Pros and Cons

  • Pros: Easy to use, with robust navigation and modification abilities.
  • Cons: Memory-intensive, since the whole tree is held in memory; inefficient for large documents.

Example

Java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;

public class DomParserExample {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse("example.xml");

            // Select every <domain> element in example.xml
            NodeList nodeList = document.getElementsByTagName("domain");
            for (int i = 0; i < nodeList.getLength(); i++) {
                Node node = nodeList.item(i);
                System.out.println(node.getTextContent());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

Java
C/C++
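
The random access and modification abilities described earlier can be sketched as follows. This is an illustrative sketch, not part of the original example: the class `DomEditSketch` and the method `bumpCounts` are hypothetical names, and the XML is again an inline copy of example.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomEditSketch {
    // Inline copy of example.xml (assumption: used instead of reading the file).
    static final String XML =
        "<Test>"
      + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
      + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
      + "</Test>";

    // Adds `delta` to every <count> in the tree, then re-reads the tree
    // and returns the sum of the updated values.
    static int bumpCounts(String xml, int delta) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList cases = doc.getElementsByTagName("case");
            for (int i = 0; i < cases.getLength(); i++) {
                Element c = (Element) cases.item(i);
                Element count = (Element) c.getElementsByTagName("count").item(0);
                int updated = Integer.parseInt(count.getTextContent().trim()) + delta;
                count.setTextContent(String.valueOf(updated)); // edit the node in place
                System.out.println("case " + c.getAttribute("id") + " -> " + updated);
            }

            // Second pass over the same in-memory tree to confirm the edits stuck.
            int total = 0;
            NodeList counts = doc.getElementsByTagName("count");
            for (int i = 0; i < counts.getLength(); i++) {
                total += Integer.parseInt(counts.item(i).getTextContent().trim());
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("total = " + bumpCounts(XML, 1)); // total = 86
    }
}
```

The second loop works only because DOM keeps the whole document in memory; a streaming parser would have to read the input again.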

2. Simple API for XML (SAX) parser

Overview

The SAX (Simple API for XML) parser is event-driven and reads the document serially. Unlike the DOM parser, it does not load the entire document into memory; instead, it reads the document sequentially and generates events, such as the start and end of an element, which custom event handlers can act upon.

Features

  • Event-Driven: Parses the document and raises events for elements, attributes, and text.
  • Low Memory Usage: Processes the document without holding all of it in memory.
  • Fast Performance: Quick for large documents due to sequential access.

Use Cases

  • Large XML Documents: Suitable for large documents where processing is needed for only some pieces.
  • Streaming Requirements: Ideal for applications that work with XML data in a streaming fashion.

Pros and Cons

  • Pros: Low memory footprint, fast processing.
  • Cons: Harder to work with, because the handler must track its own state; no random access to elements.

Example

Java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.Attributes;

public class SaxParserExample {
    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse("example.xml", new MyHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

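class MyHandler extends DefaultHandler {
    // Called when the parser reaches an opening tag
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        System.out.println("Start Element: " + qName);
    }

    // Called when the parser reaches a closing tag
    @Override
    public void endElement(String uri, String localName, String qName) {
        System.out.println("End Element: " + qName);
    }

    // Called for text content, including the whitespace between elements,
    // which is why the output contains seemingly empty "Characters:" lines
    @Override
    public void characters(char[] ch, int start, int length) {
        System.out.println("Characters: " + new String(ch, start, length));
    }
}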

Output:

Start Element: Test
Characters:

Start Element: case
Characters:

Start Element: domain
Characters: Java
End Element: domain
Characters:

Start Element: count
Characters: 39
End Element: count
Characters:

End Element: case
Characters:


Start Element: case
Characters:

Start Element: domain
Characters: C/C++
End Element: domain
Characters:

Start Element: count
Characters: 45
End Element: count
Characters:

End Element: case
Characters:


End Element: Test
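
Printing events is rarely the goal; in practice a handler keeps a little state so it can turn events into data. The sketch below illustrates that idea under some assumptions: `SaxCollectSketch`, `CaseHandler`, and `countsByDomain` are hypothetical names, and the XML is an inline copy of example.xml. Text is buffered because SAX may deliver one text node through several `characters` calls.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCollectSketch {
    // Inline copy of example.xml (assumption: used instead of reading the file).
    static final String XML =
        "<Test>"
      + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
      + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
      + "</Test>";

    // Keeps just enough state to pair each <domain> with the <count> that follows it.
    static class CaseHandler extends DefaultHandler {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String domain;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start buffering text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("domain")) {
                domain = text.toString().trim();
            } else if (qName.equals("count")) {
                counts.put(domain, Integer.parseInt(text.toString().trim()));
            }
        }
    }

    // Parses the XML and returns domain -> count in document order.
    static Map<String, Integer> countsByDomain(String xml) {
        try {
            CaseHandler handler = new CaseHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.counts;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countsByDomain(XML)); // prints {Java=39, C/C++=45}
    }
}
```

The `domain` field is the state-tracking burden that the event-driven model places on the application: the parser itself never remembers which element came before.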

3. StAX (Streaming API for XML) Parser

Overview

StAX is a pull-parsing model for XML. The application pulls events from the parser, such as the start and end of elements, when it needs them, which gives it direct control over the parsing process.

Features

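  • Pull-Based: The application controls parsing by requesting the next event when it is ready for it.
  • Low Memory Usage: Like SAX, it streams through the document instead of building a tree, so it uses far less memory than DOM.
  • Reading and Writing: StAX covers both directions of XML processing, with XMLStreamReader for reading and XMLStreamWriter for writing. The reader itself is forward-only and cannot move backward through the document.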
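
Besides reading, StAX also has a writing side, `XMLStreamWriter`, which emits XML as a forward-only stream of calls. The sketch below is illustrative only: the class `StaxWriteSketch` and the method `writeCase` are hypothetical, and the element names are borrowed from example.xml.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class StaxWriteSketch {
    // Builds a single <case> element shaped like the ones in example.xml.
    static String writeCase(String id, String domain, int count) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

            w.writeStartElement("case");
            w.writeAttribute("id", id);

            w.writeStartElement("domain");
            w.writeCharacters(domain); // special characters such as '<' are escaped
            w.writeEndElement();

            w.writeStartElement("count");
            w.writeCharacters(String.valueOf(count));
            w.writeEndElement();

            w.writeEndElement(); // closes <case>
            w.close();
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeCase("1", "Java", 39));
    }
}
```

Elements are closed in the reverse order they were opened, so the writer needs no tree in memory, mirroring how the reader works.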

Use Cases

  • Balance of Memory and Ease of Use: Suited to applications that want streaming memory behaviour with a simpler programming model than SAX callbacks.
  • Complex Processing Logic: Ideal when the processing logic is complex, since the application decides when to advance and can stop early.

Pros and Cons

  • Pros: Good balance of memory usage and control; flexible.
  • Cons: Forward-only, with no random access; the application has to write the event loop and track state itself.

Example

Java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamConstants;
import java.io.FileInputStream;

public class StaxParserExample {
    public static void main(String[] args) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // A byte stream lets the parser pick up the encoding from the XML declaration
            XMLStreamReader reader = factory.createXMLStreamReader(new FileInputStream("example.xml"));

            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        System.out.println("Start Element: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        System.out.println("End Element: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (reader.hasText()) {
                            System.out.println("Characters: " + reader.getText().trim());
                        }
                        break;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

Start Element: Test
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: Java
End Element: domain
Characters:
Start Element: count
Characters: 39
End Element: count
Characters:
End Element: case
Characters:
Start Element: case
Characters:
Start Element: domain
Characters: C/C++
End Element: domain
Characters:
Start Element: count
Characters: 45
End Element: count
Characters:
End Element: case
Characters:
End Element: Test
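
The control that pull parsing gives the application is easiest to see when the loop stops early. The sketch below looks up one value and returns as soon as it is found, without reading the rest of the document. It is a minimal sketch under some assumptions: `StaxLookupSketch` and `countFor` are hypothetical names, and the XML is an inline copy of example.xml.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxLookupSketch {
    // Inline copy of example.xml (assumption: used instead of reading the file).
    static final String XML =
        "<Test>"
      + "<case id=\"1\"><domain>Java</domain><count>39</count></case>"
      + "<case id=\"2\"><domain>C/C++</domain><count>45</count></case>"
      + "</Test>";

    // Returns the <count> of the first <case> whose <domain> matches, or -1 if none does.
    static int countFor(String xml, String wantedDomain) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                boolean matched = false;
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue; // only element starts matter here
                    }
                    String name = reader.getLocalName();
                    if (name.equals("domain")) {
                        // getElementText() reads the text and moves to the end tag
                        matched = reader.getElementText().trim().equals(wantedDomain);
                    } else if (name.equals("count") && matched) {
                        return Integer.parseInt(reader.getElementText().trim()); // stop early
                    }
                }
                return -1;
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countFor(XML, "Java")); // prints 39
    }
}
```

With SAX, the parser would keep firing callbacks until the end of the input unless the handler threw an exception; here the application simply stops asking for events.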

4. JAXB – Java Architecture for XML Binding

Overview

JAXB lets Java developers map Java objects to XML representations (marshalling) and convert XML back into Java objects (unmarshalling). It greatly simplifies conversion in both directions.

Features

  • Object-XML Mapping: Converts Java objects to XML and XML back to Java objects.
  • Annotations: Annotations map Java classes and fields to XML elements and attributes.
  • Binding: The binding between Java objects and XML is handled automatically.

Use Cases

  • Data Binding: Ideal for applications that convert frequently between Java objects and XML.
  • Configuration Files: Useful for applications that keep their configuration in XML.

Pros and Cons

  • Pros: Simplifies object-XML conversion and reduces boilerplate code.
  • Cons: Less control over XML parsing compared to other methods.

Example

Java
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.StringReader;
import java.io.StringWriter;

// Note: javax.xml.bind was removed from the JDK in Java 11; on newer JDKs
// a JAXB dependency must be added to the project.
public class JaxbExample {
    public static void main(String[] args) {
        try {
            JAXBContext context = JAXBContext.newInstance(Person.class);

            // Marshalling - Convert Java object to XML
            Person person = new Person("John", 30);
            StringWriter writer = new StringWriter();
            Marshaller marshaller = context.createMarshaller();
            // Pretty-print the XML instead of emitting it on a single line
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            marshaller.marshal(person, writer);
            System.out.println("XML Output:");
            System.out.println(writer.toString().trim());

            // Unmarshalling - Convert XML to Java object
            StringReader reader = new StringReader(writer.toString());
            Unmarshaller unmarshaller = context.createUnmarshaller();
            Person unmarshalledPerson = (Person) unmarshaller.unmarshal(reader);
            System.out.println("Java Object:");
            System.out.println(unmarshalledPerson);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// Marks Person as a root element; without it, marshalling fails at runtime
@XmlRootElement
class Person {
    private String name;
    private int age;

    // Default constructor is required for JAXB
    public Person() {}

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Getters and setters are required for JAXB
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    @Override
    public String toString() {
        return "Person{name='" + name + "', age=" + age + '}';
    }
}

Output:

XML Output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
    <age>30</age>
    <name>John</name>
</person>
Java Object:
Person{name='John', age=30}

Conclusion

Java's strength in handling XML lies in its rich set of tools for parsing and processing it. The DOM parser is a good fit when the whole document needs to be held and edited in memory. The SAX parser works well in low-memory, high-performance settings. The StAX parser offers similar streaming behaviour while keeping the application in control of the parsing process. JAXB simplifies object-XML mapping, which makes it well suited to applications that convert between Java objects and XML frequently. Choosing among them means weighing the application's needs against document size, available memory, and the complexity of the XML processing involved. Knowing how these parsers differ will help you manage XML well in your Java applications.

