Reading and Writing XML Files in Python
Last Updated :
10 Aug, 2024
Extensible Markup Language, commonly known as XML is a language designed specifically to be easy to interpret by both humans and computers altogether. The language defines a set of rules used to encode a document in a specific format. In this article, methods have been described to read and write XML files in python.
Note: In general, the process of reading the data from an XML file and analyzing its logical components is known as Parsing. Therefore, when we refer to reading a xml file we are referring to parsing the XML document.
In this article, we would take a look at two libraries that could be used for the purpose of xml parsing. They are:
- BeautifulSoup used alongside the lxml parser
- Elementtree library.
Using BeautifulSoup alongside with lxml parser
For the purpose of reading and writing the xml file we would be using a Python library named BeautifulSoup. In order to install the library, type the following command into the terminal.
pip install beautifulsoup4
Beautiful Soup supports the HTML parser included in Python’s standard library, but it also supports a number of third-party Python parsers. One is the lxml parser (used for parsing XML/HTML documents). lxml could be installed by running the following command in the command processor of your Operating system:
pip install lxml
Firstly we will learn how to read from an XML file. We would also parse data stored in it. Later we would learn how to create an XML file and write data to it.
Reading Data From an XML File
There are two steps required to parse a xml file:-
- Finding Tags
- Extracting from tags
Example:
XML File used:

Python3
from bs4 import BeautifulSoup
# Reading the data inside the xml
# file to a variable under the name
# data
with open('dict.xml', 'r') as f:
data = f.read()
# Passing the stored data inside
# the beautifulsoup parser, storing
# the returned object
Bs_data = BeautifulSoup(data, "xml")
# Finding all instances of tag
# `unique`
b_unique = Bs_data.find_all('unique')
print(b_unique)
# Using find() to extract attributes
# of the first instance of the tag
b_name = Bs_data.find('child', {'name':'Frank'})
print(b_name)
# Extracting the data stored in a
# specific attribute of the
# `child` tag
value = b_name.get('test')
print(value)
OUTPUT:

Writing an XML File
Writing a xml file is a primitive process, reason for that being the fact that xml files aren’t encoded in a special way. Modifying sections of a xml document requires one to parse through it at first. In the below code we would modify some sections of the aforementioned xml document.
Example:
Python3
from bs4 import BeautifulSoup
# Reading data from the xml file
with open('dict.xml', 'r') as f:
data = f.read()
# Passing the data of the xml
# file to the xml parser of
# beautifulsoup
bs_data = BeautifulSoup(data, 'xml')
# A loop for replacing the value
# of attribute `test` to WHAT !!
# The tag is found by the clause
# `bs_data.find_all('child', {'name':'Frank'})`
for tag in bs_data.find_all('child', {'name':'Frank'}):
tag['test'] = "WHAT !!"
# Output the contents of the
# modified xml file
print(bs_data.prettify())
Output:

Using Elementtree
Elementtree module provides us with a plethora of tools for manipulating XML files. The best part about it being its inclusion in the standard Python’s built-in library. Therefore, one does not have to install any external modules for the purpose. Due to the xmlformat being an inherently hierarchical data format, it is a lot easier to represent it by a tree. The module provides ElementTree provides methods to represent whole XML document as a single tree.
In the later examples, we would take a look at discrete methods to read and write data to and from XML files.
Reading XML Files
To read an XML file using ElementTree, firstly, we import the ElementTree class found inside xml library, under the name ET (common convension). Then passed the filename of the xml file to the ElementTree.parse() method, to enable parsing of our xml file. Then got the root (parent tag) of our xml file using getroot(). Then displayed (printed) the root tag of our xml file (non-explicit way). Then displayed the attributes of the sub-tag of our parent tag using root[0].attrib. root[0] for the first tag of parent root and attrib for getting it’s attributes. Then we displayed the text enclosed within the 1st sub-tag of the 5th sub-tag of the tag root.
Example:
Python3
# importing element tree
# under the alias of ET
import xml.etree.ElementTree as ET
# Passing the path of the
# xml document to enable the
# parsing process
tree = ET.parse('dict.xml')
# getting the parent tag of
# the xml document
root = tree.getroot()
# printing the root (parent) tag
# of the xml document, along with
# its memory location
print(root)
# printing the attributes of the
# first tag from the parent
print(root[0].attrib)
# printing the text contained within
# first subtag of the 5th tag from
# the parent
print(root[5][0].text)
Output:

Writing XML Files
Now, we would take a look at some methods which could be used to write data on an xml document. In this example we would create a xml file from scratch.
To do the same, firstly, we create a root (parent) tag under the name of chess using the command ET.Element(‘chess’). All the tags would fall underneath this tag, i.e. once a root tag has been defined, other sub-elements could be created underneath it. Then we created a subtag/subelement named Opening inside the chess tag using the command ET.SubElement(). Then we created two more subtags which are underneath the tag Opening named E4 and D4. Then we added attributes to the E4 and D4 tags using set() which is a method found inside SubElement(), which is used to define attributes to a tag. Then we added text between the E4 and D4 tags using the attribute text found inside the SubElement function. In the end we converted the datatype of the contents we were creating from ‘xml.etree.ElementTree.Element’ to bytes object, using the command ET.tostring() (even though the function name is tostring() in certain implementations it converts the datatype to `bytes` rather than `str`). Finally, we flushed the data to a file named gameofsquares.xml which is a opened in `wb` mode to allow writing binary data to it. In the end, we saved the data to our file.
Example:
Python3
import xml.etree.ElementTree as ET
# This is the parent (root) tag
# onto which other tags would be
# created
data = ET.Element('chess')
# Adding a subtag named `Opening`
# inside our root tag
element1 = ET.SubElement(data, 'Opening')
# Adding subtags under the `Opening`
# subtag
s_elem1 = ET.SubElement(element1, 'E4')
s_elem2 = ET.SubElement(element1, 'D4')
# Adding attributes to the tags under
# `items`
s_elem1.set('type', 'Accepted')
s_elem2.set('type', 'Declined')
# Adding text between the `E4` and `D5`
# subtag
s_elem1.text = "King's Gambit Accepted"
s_elem2.text = "Queen's Gambit Declined"
# Converting the xml data to byte object,
# for allowing flushing data to file
# stream
b_xml = ET.tostring(data)
# Opening a file under the name `items2.xml`,
# with operation mode `wb` (write + binary)
with open("GFG.xml", "wb") as f:
f.write(b_xml)
Output:

Similar Reads
Reading and Writing YAML File in Python
YAML (YAML Ain't Markup Language) is a human-readable data serialization standard that is commonly used for configuration files and data exchange between languages with different data structures. YAML is preferred in many cases due to its simplicity and readability compared to other formats like JSO
3 min read
Reading and Writing CSV Files in Python
CSV (Comma Separated Values) format is one of the most widely used formats for storing and exchanging structured data between different applications, including databases and spreadsheets. CSV files store tabular data, where each data field is separated by a delimiter, typically a comma. Python provi
4 min read
Reading and Writing to text files in Python
Python provides built-in functions for creating, writing, and reading files. Two types of files can be handled in Python, normal text files and binary files (written in binary language, 0s, and 1s). Text files: In this type of file, Each line of text is terminated with a special character called EOL
8 min read
Reading and Writing JSON to a File in Python
The full form of JSON is Javascript Object Notation. It means that a script (executable) file which is made of text in a programming language, is used to store and transfer the data. Python supports JSON through a built-in package called JSON. To use this feature, we import the JSON package in Pytho
3 min read
Reading and Writing lists to a file in Python
Reading and writing files is an important functionality in every programming language. Almost every application involves writing and reading operations to and from a file. To enable the reading and writing of files programming languages provide File I/O libraries with inbuilt methods that allow the
5 min read
Read File As String in Python
Python provides several ways to read the contents of a file as a string, allowing developers to handle text data with ease. In this article, we will explore four different approaches to achieve this task. Each approach has its advantages and uses cases, so let's delve into them one by one. Read File
3 min read
Read and Write TOML Files Using Python
TOML file stand for (Tom's Obvious, Minimum Language). The configuration files can be stored in TOML files, which have the .toml extension. Due to its straightforward semantics, which strives to be "minimal," it is supposed to be simple to read and write. It is also made to clearly map to a dictiona
6 min read
Writing CSV files in Python
CSV (Comma Separated Values) is a simple file format used to store tabular data, such as spreadsheets or databases. Each line of the file represents a data record, with fields separated by commas. This format is popular due to its simplicity and wide support. Ways to Write CSV Files in PythonBelow a
11 min read
Writing to file in Python
Writing to a file in Python means saving data generated by your program into a file on your system. This article will cover the how to write to files in Python in detail. Creating a FileCreating a file is the first step before writing data to it. In Python, we can create a file using the following t
4 min read
Reading binary files in Python
Reading binary files means reading data that is stored in a binary format, which is not human-readable. Unlike text files, which store data as readable characters, binary files store data as raw bytes. Binary files store data as a sequence of bytes. Each byte can represent a wide range of values, fr
6 min read