Understanding XML: Features and Uses
XML is a markup language similar to HTML, but it has no predefined tags.
Instead, you define your own tags designed specifically for your needs. This is a
powerful way to keep data in a format that can be stored, searched, and shared.
Main Benefit
We can use it to take data from a program such as Microsoft SQL Server, convert it into
XML, and then share that XML with other programs and platforms.
- What makes XML truly powerful is its international acceptance.
Commonly used XML editors: Editix, Liquid Studio, Notepad++, Oxygen XML Editor, Stylus Studio, Sublime Text,
Visual Studio Code, XMLBlueprint, XMLPad, XML Spy, XML ValidatorBuddy
Who created XML?
• In the late 1990s, a group of people including Jon Bosak, Tim Bray, James Clark and others
created XML (eXtensible Markup Language).
Difference between HTML and XML (HTML | XML):
4. HTML is static because it is used to display data. | XML is dynamic because it is used to transport data.
6. HTML has its own predefined tags. | In XML, we can define tags according to our needs.
• XML documents must contain a root element and it is "the parent" of all other elements.
• Elements in an XML document form a document tree.
• The tree starts at the root and branches to the lowest level of the tree.
• All elements can have sub elements (child elements).
• The terms parent, child, and sibling are used to describe the relationships between elements.
• Children on the same level are called siblings (brothers or sisters).
• All elements can have text content and attributes (just like in HTML).
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
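The tree described above can be walked programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the sample text inside the elements and the helper name `walk` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document: same shape as the skeleton above, with sample text added.
XML_TEXT = """
<root>
  <child>
    <subchild>first</subchild>
    <subchild>second</subchild>
  </child>
</root>
"""

def walk(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    pairs = [(depth, element.tag)]
    for child in element:  # iterating an element yields its direct children
        pairs.extend(walk(child, depth + 1))
    return pairs

root = ET.fromstring(XML_TEXT)
tree_pairs = walk(root)

# Print the tree with one level of indentation per depth.
for depth, tag in tree_pairs:
    print("  " * depth + tag)

# The two <subchild> elements share the same parent, so they are siblings.
siblings = [element.tag for element in root.find("child")]
```

The root sits at depth 0, its child at depth 1, and the two siblings at depth 2.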
(B) Example of XML: basic XML structure that represents a book's details.
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book>
    <title>XML Basics</title>
    <author>Abhi</author>
    <year>2024</year>
    <price>299</price>
  </book>
  <book>
    <title>Learn XML</title>
    <author>Avi</author>
    <year>2023</year>
    <price>200</price>
  </book>
</bookstore>
• Root element: <bookstore>. All elements in the document are contained within <bookstore>.
• The <book> element has 4 children: <title>, <author>, <year> and <price>.
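As a rough illustration of how a program might consume the bookstore document, the sketch below loads it with Python's standard `xml.etree.ElementTree` and turns each `<book>` into a dictionary. The variable names are hypothetical, and the prices are assumed to be whole numbers, as they are in the example.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book>
    <title>XML Basics</title>
    <author>Abhi</author>
    <year>2024</year>
    <price>299</price>
  </book>
  <book>
    <title>Learn XML</title>
    <author>Avi</author>
    <year>2023</year>
    <price>200</price>
  </book>
</bookstore>
"""

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(BOOKSTORE_XML.encode("utf-8"))

# Each <book> becomes a dict mapping child tag name to its text content.
books = [{child.tag: child.text for child in book} for book in root.findall("book")]

# Element text is always a string, so convert before doing arithmetic.
total_price = sum(int(book["price"]) for book in books)

print(root.tag, len(books), total_price)
```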
XML Attributes
• XML elements can have attributes that add information about the element.
• An attribute is placed after the tag name in the element's start tag.
• We can add more than one attribute to a single element.
• XML attributes enhance the properties of the elements.
• XML attribute values must always be quoted (with single or double quotes).
XML Comments
XML comments are just like HTML comments: they are written between <!-- and -->. Although XML is
known as self-describing data, XML comments are sometimes necessary.
<?xml version="1.0"?>
<college>
  <student>
    <firstname>Anu</firstname>
    <lastname>Bhatt</lastname>
    <contact>07899044992</contact>
    <email>[Link]@[Link]</email>
    <address>
      <city>Haldwani</city>
      <state>Uttarakhand</state>
      <pin>201206</pin>
    </address>
  </student>
</college>
Line 1: XML declaration (defines the XML version, 1.0).
Line 2: Root element (college).
Line 3: Inside the root element, there is one more element: student.
<student>: contains 5 branches: <firstname>, <lastname>, <contact>, <email>, <address>.
Line 8: The <address> branch contains 3 sub-branches named <city>, <state> and <pin>.
Note: A DOM parser represents the XML document as a tree structure.
XML with a Complex Structure
<?xml version="1.0" encoding="UTF-8"?>
<university>
  <student id="001">
    <name>Anna Lee</name>
    <courses>
      <course code="CS101" credits="3">Computer Science</course>
      <course code="MA102" credits="4">Mathematics</course>
    </courses>
    <graduationYear>2024</graduationYear>
  </student>
  <student id="002">
    <name>Mark Liu</name>
    <courses>
      <course code="PH103" credits="3">Physics</course>
      <course code="CS101" credits="3">Computer Science</course>
    </courses>
    <graduationYear>2023</graduationYear>
  </student>
</university>
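To show how the attributes in this document might be read, here is a small sketch using Python's standard `xml.etree.ElementTree`. It sums the `credits` attribute for each student; the variable names are illustrative and not part of the original example.

```python
import xml.etree.ElementTree as ET

UNIVERSITY_XML = """
<university>
  <student id="001">
    <name>Anna Lee</name>
    <courses>
      <course code="CS101" credits="3">Computer Science</course>
      <course code="MA102" credits="4">Mathematics</course>
    </courses>
    <graduationYear>2024</graduationYear>
  </student>
  <student id="002">
    <name>Mark Liu</name>
    <courses>
      <course code="PH103" credits="3">Physics</course>
      <course code="CS101" credits="3">Computer Science</course>
    </courses>
    <graduationYear>2023</graduationYear>
  </student>
</university>
"""

root = ET.fromstring(UNIVERSITY_XML)

credits_by_student = {}
for student in root.findall("student"):
    student_id = student.get("id")  # attribute values are always strings
    # iter("course") finds <course> elements at any depth below the student.
    total = sum(int(course.get("credits")) for course in student.iter("course"))
    credits_by_student[student_id] = total

print(credits_by_student)
```

Note that the `id` values keep their leading zeros because attributes are read as text, not as numbers.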
Types of XML:
There are two ways to describe the structure of an XML document: XML Schemas and DTDs.
Types of DTD- Internal and External DTD:
• XML documents can contain a DTD, which can either be embedded within the
document itself (known as an internal DTD) or stored in a separate file (an
external DTD).
• Internal DTDs can result in larger XML documents, while external DTDs keep
them smaller.
Difference between external and internal data:
• Internal data is the information generated from within a business, including areas
such as operations, maintenance, personnel, and finance.
• External data comes from the market, such as surveys, questionnaires, research, and
customer feedback.
How to make an XML file without using an internal or external DTD
For this, simply avoid including any DOCTYPE declaration in your XML content.
In XML, DTDs are optional. Unless you explicitly add a DOCTYPE declaration, your XML file will not reference any DTD.
1) Internal Declaration
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (pnote+)>
  <!ELEMENT pnote (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <pnote>
    <to>Raju</to>
    <from>Ravi</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
  </pnote>
</note>
• !DOCTYPE note: defines that the root element of this document is note.
• !ELEMENT note: declares the root element, which contains one or more <pnote> elements; <pnote> is the parent of the remaining elements.
• !ELEMENT pnote: defines that the pnote element contains 4 elements: "to,from,heading,body".
• !ELEMENT to: the to element is of type #PCDATA (parsed character data).
….and so on.
2) External DTD
DTD file: [Link]
<!ELEMENT note (pnote+)>
<!ELEMENT pnote (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
XML file: [Link]
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "[Link]">
<note>
  <pnote>
    <to>Raju</to>
    <from>Ravi</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
  </pnote>
  <pnote>
    <to>Raju</to>
    <from>Ravi</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
  </pnote>
</note>
In this example:
1. DOCTYPE declaration: refers to an external DTD file.
2. <!DOCTYPE note: the root element of the document is note.
3. The external DTD file is referred to here (in [Link]).
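Python's standard `xml.etree.ElementTree` parser is non-validating, so it will not enforce a DTD. The sketch below is therefore only a hand-written approximation of one rule from the DTD above, the content model `(to,from,heading,body)` for `<pnote>`; the function names and the deliberately wrong sample document are assumptions made for illustration. Real DTD validation requires a validating parser.

```python
import xml.etree.ElementTree as ET

# The order required by <!ELEMENT pnote (to,from,heading,body)>.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]

def pnote_matches_model(pnote):
    """True if the <pnote> has exactly the expected children, in order."""
    return [child.tag for child in pnote] == EXPECTED_CHILDREN

def check_note(xml_text):
    """True if every <pnote> under the root follows the content model."""
    root = ET.fromstring(xml_text)
    return all(pnote_matches_model(pnote) for pnote in root.findall("pnote"))

GOOD_NOTE = """
<note>
  <pnote>
    <to>Raju</to>
    <from>Ravi</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
  </pnote>
</note>
"""

# Hypothetical invalid document: <from> appears before <to>.
BAD_NOTE = """
<note>
  <pnote>
    <from>Ravi</from>
    <to>Raju</to>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend</body>
  </pnote>
</note>
"""

print(check_note(GOOD_NOTE), check_note(BAD_NOTE))
```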
XML CSS
Purpose of CSS in XML:
• CSS (Cascading Style Sheets) can be used to add style and display information to an XML
document. It can format the whole XML document.
Linking an XML file with CSS, syntax:
<?xml-stylesheet type="text/css" href="[Link]"?>
XML CSS example: an XML file using CSS and a DTD: [Link]
DTD file: [Link]
[Link]:
<?xml version="1.0"?>
Understanding Namespaces
• As defined by the W3C Namespaces in XML Recommendation, an XML namespace is a collection of XML
elements and attributes identified by an Internationalized Resource Identifier (IRI); this collection is often
referred to as an XML "vocabulary."
• More generally, a namespace is a declarative region that provides a scope to the identifiers (the names of types, functions,
variables, etc.) inside it. Namespaces are used to organize code into logical groups and to prevent name
collisions, which can occur especially when your code base includes multiple libraries. In XML, the same idea
prevents collisions between element names that come from different vocabularies.
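A short sketch of a name collision resolved by namespaces, using Python's standard `xml.etree.ElementTree`. The two namespace URIs, the prefixes `b` and `f`, and the element contents are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define an element called "table".
# The namespace URIs below are made-up examples.
NS_XML = """
<shop xmlns:b="http://example.com/books" xmlns:f="http://example.com/furniture">
  <b:table>Table of contents</b:table>
  <f:table>Oak dining table</f:table>
</shop>
"""

# Prefix-to-URI mapping used when searching; prefixes here need not match the document's.
NAMESPACES = {
    "b": "http://example.com/books",
    "f": "http://example.com/furniture",
}

root = ET.fromstring(NS_XML)
book_table = root.find("b:table", NAMESPACES)
furniture_table = root.find("f:table", NAMESPACES)

# ElementTree stores the expanded name as {namespace-uri}localname.
print(book_table.tag)
print(furniture_table.tag)
```

Although both elements are written as `table`, their expanded names differ, so a program can tell them apart.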
XML Schema – XSD (XML Schema Definition)
• The XML Schema language is also referred to as XML Schema Definition (XSD).
• XSD is used to define the possible structure and contents of an XML format.
• A validating parser can then check whether an XML instance document conforms to an XSD
schema or a set of schemas.
• Similar to a DTD, an XML Schema is used to check whether a given XML document is
"valid". The document must first be "well formed", which any XML parser checks independently of a schema.
• XML Schema is an alternative to DTD.
• An XML document is considered "valid" if it is well formed and is successfully validated
against an XML Schema.
Using an XSD takes two steps: creating the XSD file, then creating the XML file, which follows the rules in the XSD
file.
Optionally, validate the XML against the XSD to ensure it adheres to
the schema.
1. Creating an XSD File: [Link] that defines the structure of a book catalog:
<?xml version="1.0"?>
<xs:schema xmlns:xs="[Link]">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="author" type="xs:string"/>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="genre" type="xs:string"/>
              <xs:element name="price" type="xs:decimal"/>
              <xs:element name="publish_date" type="xs:date"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
This code is an XML Schema Definition (XSD) used to define the structure of XML documents.
Simple explanation of what this XSD defines:
2. `<xs:schema xmlns:xs="[Link]">`: This line starts the XML Schema definition and defines
the namespace for the schema, which is a way of uniquely identifying the XML Schema elements and attributes.
3. `<xs:element name="catalog">`: an element named "catalog". This is the root element of the XML document
that will be validated against this schema.
4. `<xs:complexType>`: This specifies that the "catalog" element contains other elements inside it (it is not a
simple element with just text).
5. `<xs:sequence>`: This means that the child elements of "catalog" should appear in a specific sequence.
6. `<xs:element name="book" maxOccurs="unbounded">`: Inside "catalog", there can be multiple "book" elements.
The `maxOccurs="unbounded"` means there can be any number of "book" elements.
7. `<xs:complexType>`: each "book" element also contains other elements inside it.
8. `<xs:sequence>`: the child elements of each "book" element must appear in a specific order.
9. `<xs:element name="author" type="xs:string"/>`: "author" element, and its content should be a string.
10. `<xs:element name="title" type="xs:string"/>`: "title" element, and its content should be a string.
11. `<xs:element name="genre" type="xs:string"/>`: "genre" element, and its content should be a string.
12. `<xs:element name="price" type="xs:decimal"/>`: "price" element, and its content should be a decimal number.
13. `<xs:element name="publish_date" type="xs:date"/>`: "publish_date" element, and its content should be a
date.
14. `<xs:attribute name="id" type="xs:string" use="required"/>`: Each "book" element must have an attribute
named "id", and it should be a string. This attribute is required for every "book" element.
15. `</xs:schema>`: This line closes the schema definition.
In summary, this schema defines a structure for an XML document where the root element is "catalog", which can
contain multiple "book" elements. Each "book" must have specific child elements ("author", "title", "genre", "price",
and "publish_date") and a required "id" attribute.
2. Creating an XML File That Conforms to the XSD (`[Link]` schema)
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns:xsi="[Link]instance"
         xsi:noNamespaceSchemaLocation="[Link]">
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
  </book>
</catalog>
Explanation:
2. `<catalog xmlns:xsi="[Link]instance" xsi:noNamespaceSchemaLocation="[Link]">`: This is
the root element of the XML document.
- `xmlns:xsi="[Link]instance"`: Declares the XML Schema Instance namespace,
which is used for schema-related attributes.
- `xsi:noNamespaceSchemaLocation="[Link]"`: Indicates the location of the XML Schema (XSD) file that
defines the structure and rules for this XML document. Here, it's named "[Link]".
3. `<book id="bk101">`: This is a "book" element with an attribute `id="bk101"`. It represents a single book entry in
the catalog.
4. Inside the `<book>` element, there are several child elements:
- `<author>Gambardella, Matthew</author>`: The author of the book.
- `<title>XML Developer's Guide</title>`: The title of the book.
- `<genre>Computer</genre>`: The genre of the book.
- `<price>44.95</price>`: The price of the book.
- `<publish_date>2000-10-01</publish_date>`: The publication date of the book.
5. `<book id="bk102">`: Another "book" element with a different ID (`id="bk102"`). This represents another book entry in
the catalog.
7. `</catalog>`: This closes the "catalog" element, ending the XML document.
In summary, this XML document is a catalog that contains information about two books, including details like the author,
title, genre, price, and publication date for each book. The schema file referenced (`[Link]`) would define the rules and
structure for these elements.
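Python's standard library has no XSD validator, so the following is not schema validation. It is a hand-written sketch, under that stated limitation, that approximates the rules the catalog schema expresses: a `catalog` root, at least one `book`, a required `id` attribute, the five children in order, a decimal `price`, and an ISO-format `publish_date`. The function names, the message strings, and the invalid sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal, InvalidOperation

# Child order required by the <xs:sequence> inside each "book".
EXPECTED_CHILDREN = ["author", "title", "genre", "price", "publish_date"]

def check_book(book):
    """Return a list of problems found in one <book> element (empty if none)."""
    problems = []
    if book.get("id") is None:
        problems.append("missing required attribute 'id'")
    if [child.tag for child in book] != EXPECTED_CHILDREN:
        problems.append("child elements missing or out of order")
        return problems  # type checks below assume the children exist
    try:
        Decimal(book.findtext("price"))
    except InvalidOperation:
        problems.append("price is not a decimal number")
    try:
        date.fromisoformat(book.findtext("publish_date"))
    except ValueError:
        problems.append("publish_date is not a valid date")
    return problems

def check_catalog(xml_text):
    """Return all problems found in a catalog document (empty list if none)."""
    root = ET.fromstring(xml_text)
    if root.tag != "catalog":
        return ["root element is not 'catalog'"]
    books = root.findall("book")
    if not books:
        return ["catalog contains no 'book' element"]
    problems = []
    for book in books:
        problems.extend(check_book(book))
    return problems

GOOD_CATALOG = """
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
</catalog>
"""

# Hypothetical invalid document: no id attribute and a non-numeric price.
BAD_CATALOG = """
<catalog>
  <book>
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>cheap</price>
    <publish_date>2000-12-16</publish_date>
  </book>
</catalog>
"""

print(check_catalog(GOOD_CATALOG))
print(check_catalog(BAD_CATALOG))
```

A validating parser working from the XSD itself would cover many more cases; the point here is only to make the schema's rules concrete.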
XML Entities
• Entities are a way of representing special characters. The ENTITY statement is used to define entities in the DTD, for use in both the
XML document associated with the DTD and the DTD itself.
• An ENTITY provides an abbreviated entry to place in your XML document.
• For example, the < and > symbols are used for tags. You cannot type a less-than sign directly into element content, because the
parser would read it as the start of a tag, and the greater-than sign is usually escaped as well. Instead, you need to use entities (&lt; and &gt;).
Some of the popular XML entities (character | description | entity name | entity reference):
" | Quotation mark (double quote) | quot | &quot;
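As a small illustration of entity escaping, the sketch below uses `escape` and `unescape` from Python's standard `xml.sax.saxutils` module. By default these handle `&`, `<` and `>`; the extra mapping for the double quote is passed explicitly. The sample sentence is made up.

```python
from xml.sax.saxutils import escape, unescape

raw = 'Use <b> for "bold" & <i> for italics'

# escape() replaces &, < and > by default; the dict adds the double quote.
escaped = escape(raw, {'"': "&quot;"})

# unescape() reverses the process; the dict maps the entity back to the character.
restored = unescape(escaped, {"&quot;": '"'})

print(escaped)
print(restored)
```

The ampersand is escaped too, since it introduces every entity reference.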
(1) simpleType: An element of this type can contain text; it does not contain
other elements and cannot be left empty.
For example, the elements to, from, heading and body are simpleType elements.
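Difference between DTD and XSD (DTD | XSD):
5) Offers only basic control over the order and number of child elements. | Offers finer control over the order and number of child elements (for example, with minOccurs and maxOccurs).
8) The extension of the document type file is .dtd. | The extension of the schema file is .xsd.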
Advantages of using XML Schema over DTD
Example:
<?xml version="1.0"?>
<message>Hello <world>!</message>
XML Parsers
- An XML parser is a software library or package that provides interfaces for client applications
to work with an XML document.
- It is designed to read the XML and create a way for programs to use XML.
- It checks that the document is well formed and, if it is a validating parser, validates the document.
A SAX parser implements the SAX API. This API is event-based and less intuitive.
Advantages
1) It is simple and memory efficient.
2) It is very fast and works for huge documents.
Disadvantages
1) It is event-based so its API is less intuitive.
2) Clients never know the full information because the data is broken into pieces.
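The event-based style can be sketched with Python's standard `xml.sax` module. The handler class `TitleCollector` and the sample document are hypothetical; the parser calls the handler's methods as it meets each start tag, piece of text, and end tag, and the handler has to keep its own state between calls.

```python
import xml.sax

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

XML_BYTES = b"""<bookstore>
  <book><title>XML Basics</title><price>299</price></book>
  <book><title>Learn XML</title><price>200</price></book>
</bookstore>"""

handler = TitleCollector()
xml.sax.parseString(XML_BYTES, handler)
print(handler.titles)
```

No tree is built at any point, which is why the approach stays memory efficient and why the client only ever sees the data in pieces.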
[Link]. | SAX PARSER | DOM PARSER
01. | It is called the Simple API for XML parsing. | It is called the Document Object Model.
03. | Generally faster for a single sequential pass over the document. | Generally slower to load, but faster for repeated or random access once the tree is built.
04. | Best for larger files. | Best for smaller files.
05. | Internal structure can’t be created by SAX Parser. | Internal structure can be created by DOM Parser.
07. | Backward navigation is not possible. | Backward and forward navigation is possible.
08. | Memory-efficient. | Memory-intensive, so less suitable for large XML documents.
09. | Only a small part of the XML file is loaded in memory at a time. | It loads the whole XML document in memory.
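For contrast with SAX, here is a minimal DOM sketch using Python's standard `xml.dom.minidom`. Because the whole document is held in memory as a tree, the program can move backward and upward from any node. The sample document is made up and written without whitespace between tags so that sibling navigation lands on elements rather than on text nodes.

```python
from xml.dom import minidom

# No whitespace between tags, so every sibling is an element node.
doc = minidom.parseString(
    "<bookstore><book><title>XML Basics</title><price>299</price></book></bookstore>"
)

price = doc.getElementsByTagName("price")[0]

title = price.previousSibling   # backward navigation to the preceding sibling
book = price.parentNode         # upward navigation to the parent
next_of_title = title.nextSibling  # forward navigation returns to <price>

print(title.tagName, title.firstChild.data, book.tagName)
```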
XSL and XSLT
• XSL stands for eXtensible Stylesheet Language. It is a styling language for XML, just as CSS
is a styling language for HTML.
• XSLT stands for XSL Transformations. It is used to transform XML documents into other
formats (like transforming XML into HTML).
• The World Wide Web Consortium (W3C) developed XSL as an XML-based stylesheet language
for transforming and styling XML documents.
• An XSL document specifies how a browser should render an XML document.
• Dynamic HTML (DHTML) is not a markup or programming language; it is a term that combines
the features of various web development technologies to make web pages dynamic
and interactive.
• DHTML was introduced by Microsoft with the release of the 4th version of
IE (Internet Explorer) in 1997.
Difference between HTML and DHTML (HTML | DHTML):
4. Does not contain any server-side scripting code. | May contain server-side scripting code.
5. Files are stored with the .html or .htm extension. | Files are stored with the .dhtm extension.
6. A simple page created by a user without using scripts or styles is called an HTML page. | A page created by a user using HTML, CSS, DOM, and JavaScript technologies is called a DHTML page.
7. This markup language does not need database connectivity. | This concept needs database connectivity because it interacts with users.