Parse XML File in Bash Script



XML stands for Extensible Markup Language. It is a widely used format for exchanging data between systems. Many applications store their configuration in XML files, and the document formats of well-known office suites such as Microsoft Office are based on XML.

  • XML is popular because it is plain text, which makes it easy to work with and independent of any platform or programming language.
  • Unlike HTML, XML doesn't come with predefined tags that you need to remember. You define tags of your own that describe your data and arrange them in a tree-like structure.

In Linux, there are a lot of tools and programs that allow us to parse and read the content of XML files. Throughout this article, we will see some examples of how we can deal with XML files.

Why Parse an XML File?

Parsing or reading an XML file helps us understand its content and structure. This operation helps us in two ways: first, by searching for specific elements, and second, by extracting information.

Parse an XML File using XMLStarlet

To read or parse an XML file, we need a dedicated tool. Linux offers several, and one of them is XMLStarlet, a command-line utility that can select, validate, format, and edit XML content.

Installing XMLStarlet in Linux

XMLStarlet doesn't come installed on Linux. You need to install it first. Since it's available in the repositories of Ubuntu, Fedora, and Arch, you will just need to use the package manager to install it easily.

For Debian / Ubuntu / Mint:

sudo apt install xmlstarlet

For RedHat / Fedora:

sudo dnf install xmlstarlet

To check that the installation succeeded, use the --version option to print the current version of the tool:

xmlstarlet --version

This should print the version installed on your system (possibly followed by the versions of the libraries it was built against), for example:

1.6.1
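
Inside a script, it can help to check that the tool is available before calling it. The following is a minimal sketch; the function name have_xmlstarlet is our own and not part of XMLStarlet, and it assumes the binary is installed under the name xmlstarlet.

```shell
#!/usr/bin/env bash
# Sketch: check for xmlstarlet before using it in a script.
# have_xmlstarlet is a hypothetical helper name chosen for this example.

# Succeeds (exit status 0) if the xmlstarlet command is found on PATH.
have_xmlstarlet() {
    command -v xmlstarlet >/dev/null 2>&1
}

if have_xmlstarlet; then
    # Show only the first line of the version output.
    echo "xmlstarlet found: $(xmlstarlet --version | head -n 1)"
else
    echo "xmlstarlet not found; install it with your package manager" >&2
fi
```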

Basic Usage of XMLStarlet

As we said before, XMLStarlet is a powerful tool with a lot of functionality that we can use when we deal with XML format documents.

Let's say, for example, we have an XML file called example.xml that we need to parse.

Select a Specific Element

The example file that we created to demonstrate the usage of XMLStarlet looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
   <book category="fiction">
      <title lang="en">Learn BashScript</title>
      <author>TutorialsPoint</author>
      <year>1925</year>
      <price>10.99</price>
   </book>

   <book category="non-fiction">
      <title lang="en">Learn Docker</title>
      <author>TutorialsPoint</author>
      <year>2011</year>
      <price>14.99</price>
   </book>

   <book category="children">
      <title lang="en">Guide to Linux</title>
      <author>TutorialsPoint</author>
      <year>2023</year>
      <price>7.99</price>
   </book>
</bookstore>

In this example, the XML file is already well formatted. If your file is not, you can use the fo (format) command to pretty-print the document:

xmlstarlet fo example.xml


Using the tool, we can get a specific element, for example:

xmlstarlet sel -t -v "//author" example.xml

Let's look at the options we used here:

  • sel stands for "select", allowing us to select elements from the file.
  • -t stands for "template", which tells the tool how we want the output to be built.
  • -v tells the tool that we want the value of the selected element. In this example, the element is author.
  • // is XPath syntax that matches elements at any depth in the document, not only directly under the root.
  • author is the name of the element we are searching for.

The last argument, example.xml, is the name of the XML file that we want to parse.

Because the author element appears three times in our example, the output should be:

TutorialsPoint
TutorialsPoint
TutorialsPoint
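
In a Bash script, the selected values usually need to end up in variables. The sketch below is one way to do that, under a few assumptions: it writes a shortened copy of the sample data to a temporary file, and the function names list_books and title_in_category are hypothetical names of our own. It uses the sel options -m (iterate over matches), -o (print literal text), and -n (print a newline) in addition to -v.

```shell
#!/usr/bin/env bash
# Sketch: read values from an XML file into shell variables.
# The temporary file holds a shortened copy of the tutorial's sample data.
xml_file="$(mktemp)"
cat > "$xml_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
   <book category="fiction">
      <title lang="en">Learn BashScript</title>
      <price>10.99</price>
   </book>
   <book category="non-fiction">
      <title lang="en">Learn Docker</title>
      <price>14.99</price>
   </book>
</bookstore>
EOF

# Print one "title|price" line per <book> element.
list_books() {
    xmlstarlet sel -t -m "//book" -v "title" -o "|" -v "price" -n "$1"
}

# Print the title of every book whose category attribute equals $2.
# Note: $2 is pasted into the XPath as-is, so it must not contain quotes.
title_in_category() {
    xmlstarlet sel -t -v "//book[@category='$2']/title" "$1"
}

if command -v xmlstarlet >/dev/null 2>&1; then
    # Split each line on "|" into two variables.
    list_books "$xml_file" | while IFS='|' read -r title price; do
        echo "Title: $title, Price: $price"
    done
fi
```

The "|" separator is an arbitrary choice; any character that cannot appear in the data works equally well.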

Edit an XML File

The tool can also add elements to an existing XML file, for example:

xmlstarlet ed -L -s "//bookstore" -t elem -n "course" -v "" \
   -s "//course[last()]" -t elem -n "title" -v "Operating Systems" \
   example.xml

Let's go through the options we used here:

  • ed: Tells XMLStarlet that we want to edit a file.
  • -L: Edits the file in place instead of printing the result to standard output.
  • -s "//bookstore": Adds a subnode (child) to the selected element, here bookstore.
  • -t elem: Sets the type of the new node to element.
  • -n "course": Sets the name of the new element.
  • -v "": Sets the value of the new element; course is created empty.
  • -s "//course[last()]": Selects the last course element, the one we just added, as the parent of the next new node.
  • -n "title": Names the new child element title.
  • -v "Operating Systems": Sets the value of the title element.

If we run this command, it should add a new element called course with a child element called title and the value "Operating Systems".

Now if you open the file again, you will see our new element course added to the file:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
   <book category="fiction">
      <title lang="en">Learn BashScript</title>
      <author>TutorialsPoint</author>
      <year>1925</year>
      <price>10.99</price>
   </book>
   
   <book category="non-fiction">
      <title lang="en">Learn Docker</title>
      <author>TutorialsPoint</author>
      <year>2011</year>
      <price>14.99</price>
   </book>
   <book category="children">
      <title lang="en">Guide to Linux</title>
      <author>TutorialsPoint</author>
      <year>2023</year>
      <price>7.99</price>
   </book>
   <course>
      <title>Operating Systems</title>
   </course>
</bookstore>

You should note that we have a new element called course under the bookstore element. XML is tree-like, so course is a child of the bookstore element, and title is a child of the course element.
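
Besides adding elements, the ed command can also change or remove existing ones. The sketch below uses -u (update) and -d (delete) and leaves out -L, so the modified document is printed to standard output and the file itself stays untouched. The sample file is a shortened scratch copy, and update_price and delete_category are hypothetical helper names chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: update and delete elements without modifying the file (no -L).
xml_file="$(mktemp)"
cat > "$xml_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
   <book category="fiction">
      <title lang="en">Learn BashScript</title>
      <price>10.99</price>
   </book>
   <book category="non-fiction">
      <title lang="en">Learn Docker</title>
      <price>14.99</price>
   </book>
</bookstore>
EOF

# Print the document with the price of the book titled $2 set to $3.
update_price() {
    xmlstarlet ed -u "//book[title='$2']/price" -v "$3" "$1"
}

# Print the document without the books whose category equals $2.
delete_category() {
    xmlstarlet ed -d "//book[@category='$2']" "$1"
}

if command -v xmlstarlet >/dev/null 2>&1; then
    update_price "$xml_file" "Learn Docker" "12.99"
fi
```

Previewing the result this way before adding -L is a cautious habit, since an in-place edit overwrites the original file.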

Schema Validation

Using XMLStarlet, we can validate an XML file (example.xml) against a DTD with the following command:

xmlstarlet val -d schema.dtd example.xml

The DTD (Document Type Definition) defines the structure and rules for an XML file. This validation checks whether the file follows the specified rules or structure defined in the DTD.

In this example, the DTD is schema.dtd, and we check whether the document example.xml follows it. To validate against an XML Schema (XSD) file instead, use the -s option in place of -d.
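
To make this concrete, here is a sketch of what a DTD for the bookstore data might look like, together with a small validation helper. The DTD content is our own assumption about the intended rules, not something defined by the original file, and is_valid is a hypothetical function name. The helper passes -q so that val prints nothing and relies on the exit status, which is expected to be non-zero when the document is invalid.

```shell
#!/usr/bin/env bash
# Sketch: validate a document against a DTD and branch on the result.
work_dir="$(mktemp -d)"

# Assumed rules: a bookstore holds one or more books, each with four children.
cat > "$work_dir/schema.dtd" <<'EOF'
<!ELEMENT bookstore (book+)>
<!ELEMENT book (title, author, year, price)>
<!ATTLIST book category CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title lang CDATA #IMPLIED>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>
EOF

cat > "$work_dir/example.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
   <book category="fiction">
      <title lang="en">Learn BashScript</title>
      <author>TutorialsPoint</author>
      <year>1925</year>
      <price>10.99</price>
   </book>
</bookstore>
EOF

# Succeeds if the XML file $2 follows the DTD file $1.
is_valid() {
    xmlstarlet val -q -d "$1" "$2"
}

if command -v xmlstarlet >/dev/null 2>&1; then
    if is_valid "$work_dir/schema.dtd" "$work_dir/example.xml"; then
        echo "example.xml follows schema.dtd"
    else
        echo "example.xml does not follow schema.dtd" >&2
    fi
fi
```

Under these assumed rules, a document containing the course element added in the editing section would be reported as invalid, because the DTD only allows book elements inside bookstore.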

Conclusion

In this tutorial, we demonstrated how to parse an XML file in a Bash script using the XMLStarlet tool. We saw how to select elements, how to edit a file to add a new element, and how to validate a file against a DTD.

Updated on: 2024-11-04T11:22:31+05:30
