Java Course
Module 15: XML
Module Objective
At the end of this module, participants will be able to:
Explain the concept and use of XML
Describe the XML tree structure
Identify and adhere to XML syntax rules
Use XML elements and attributes
Describe how XML documents are validated
Use XML schemas for XML document validation
XML Overview
XML is a markup language for documents containing structured
information
XML is used to describe, store, and transport data
XML is pure information wrapped in user-defined tags
XML was created so that richly structured documents could be
used over the internet
XML Overview
Design Goals for XML
XML shall be straightforwardly usable over the Internet.
XML shall support a wide variety of applications.
XML shall be compatible with SGML.
It shall be easy to write programs that process XML documents.
The number of optional features in XML is to be kept to the absolute
minimum, ideally zero.
XML documents should be human-legible and reasonably clear.
The XML design should be prepared quickly.
The design of XML shall be formal and concise.
XML documents shall be easy to create.
Terseness in XML markup is of minimal importance.
XML Tree Structure
Elements in XML documents form a logical tree structure
The XML tree structure illustrates the hierarchy and locality of the
elements in an XML document
XML tree structure can help in showing which elements are the
descendants and ancestors of each element
** Refer to the Classroom.xml sample code
XML Tree Structure
Sample code:
<?xml version="1.0" ?>
<classroom>
  <teacher>
    <first_name>Victoria</first_name>
    <last_name>Brooke</last_name>
    <gender>Female</gender>
    <age>30</age>
  </teacher>
  <student>
    <first_name>Michael</first_name>
    <last_name>Rogers</last_name>
    <gender>Male</gender>
    <age>18</age>
  </student>
</classroom>
XML Tree Structure
XML Tree of previous code:
CLASSROOM
  TEACHER
    First Name
    Last Name
    Gender
    Age
  STUDENT
    First Name
    Last Name
    Gender
    Age
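The same tree can be walked programmatically. The following is a minimal sketch, assuming the DOM parser that ships with the JDK (javax.xml.parsers); the class name TreeWalk and the helper outline are hypothetical names chosen for illustration, and the XML is a shortened version of the classroom sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TreeWalk {
    // Hypothetical helper: returns one line per element, indented by depth.
    static String outline(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    // Visits an element, then each of its child elements (its descendants).
    private static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text, comments and other non-element nodes
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<classroom>"
                + "<teacher><first_name>Victoria</first_name><age>30</age></teacher>"
                + "<student><first_name>Michael</first_name><age>18</age></student>"
                + "</classroom>";
        // Prints classroom at depth 0, teacher and student at depth 1,
        // and their child elements at depth 2.
        System.out.print(outline(xml));
    }
}
```

The root element is the ancestor of every other element, so the walk starts at getDocumentElement() and recurses downward.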
XML Syntax Rules
All XML documents should begin with an XML declaration. The XML
declaration looks like a processing instruction and identifies the
document as being XML.
Example:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
There are no predefined tags in XML; users define their own
tags
XML documents must have exactly one root element, also known as
the document element.
Basic syntax for XML elements:
Syntax :
<element_name>element value</element_name>
Examples:
<first_name>Jason</first_name>
<book_title>My Favorite Book</book_title>
XML Syntax Rules
All XML elements must have a corresponding closing tag
Invalid:
<some_tag>some value.
Valid :
<some_tag>some value placed here</some_tag>
XML tags are case sensitive
Invalid:
<Item_Name>7 Tonner Rice</item_name>
Valid :
<item_name>7 Tonner Rice</item_name>
XML Syntax Rules
XML elements must be properly nested
Invalid:
<parent_element>
<child_element>Some value here</parent_element>
<child_element2>Some value here
</child_element>
</child_element2>
Valid :
<parent_element>
<child_element>Some value here</child_element>
<child_element2>Another value here</child_element2>
</parent_element>
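A parser rejects any document that breaks these rules. The following is a minimal sketch of a well-formedness check, assuming the JDK's built-in DOM parser; the class name WellFormedCheck and the method isWellFormed are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Returns true when the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was violated
        }
    }

    public static void main(String[] args) throws Exception {
        // Missing closing tag
        System.out.println(isWellFormed("<some_tag>some value."));
        // Opening and closing tags differ in case
        System.out.println(isWellFormed("<Item_Name>7 Tonner Rice</item_name>"));
        // Improperly nested elements
        System.out.println(isWellFormed(
                "<parent_element><child_element>Some value here</parent_element></child_element>"));
        // Properly closed and nested
        System.out.println(isWellFormed(
                "<parent_element><child_element>Some value here</child_element></parent_element>"));
    }
}
```

The first three calls print false and the last prints true.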
XML Syntax Rules
XML attribute values must be placed within quotes
Invalid:
<some_tag attribute1=attributeValue>.</some_tag>
Valid :
<some_tag attribute1="attributeValue">.</some_tag>
Make use of entity references for special characters
Entity reference   Character   Description
&lt;               <           Less than symbol
&gt;               >           Greater than symbol
&amp;              &           Ampersand
&apos;             '           Apostrophe
&quot;             "           Quotation mark
XML Syntax Rules
Example using entity references:
<?xml version="1.0" ?>
<quote>
<author>Sam Ewing </author>
<paragraph>&quot;Success has a simple formula: do your best, and people may
like it.&quot;</paragraph>
</quote>
The value of <paragraph> when accessed will be:
"Success has a simple formula: do your best, and people may like it."
** Refer to the Quote.xml sample code
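When a parser reads a document, each entity reference in the source is replaced by the character it stands for. The following is a minimal sketch, assuming the JDK's built-in DOM parser; the class name EntityDemo and the method paragraphText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDemo {
    // Returns the text of the first <paragraph> element, with entity
    // references already resolved by the parser.
    static String paragraphText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("paragraph").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<quote><author>Sam Ewing</author>"
                + "<paragraph>&quot;Success has a simple formula: do your best, "
                + "and people may like it.&quot;</paragraph></quote>";
        // The &quot; references come back as plain quotation marks.
        System.out.println(paragraphText(xml));
    }
}
```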
XML Elements
Elements are used to classify data in an XML document to make
the data understandable.
Elements can have any legal XML name and are usually named to
describe the data they hold.
Elements can contain other elements, usually to include more
details.
XML Elements
Elements can contain attributes, which carry additional
information.
An element is defined by its opening and closing tags.
<myElement> This sentence is found inside the element -myElement- </myElement>
Opening tag: <myElement>
Information stored in the element: This sentence is found inside the element -myElement-
Closing tag: </myElement>
XML Attributes
XML attributes provide additional information about the element to
which they belong.
Attributes often hold information that is not part of the data itself
but is used in manipulating the data the element holds.
Attributes are commonly used for identification purposes, such as
when there is more than one element of the same type.
XML Attributes
XML Attribute Sample:
<?xml version="1.0" ?>
<class_list year="3" section="B">
  <student id="B-0001">
    <first_name>Anna</first_name>
    <last_name>Sanders</last_name>
    <gender>Female</gender>
  </student>
  <student id="B-0002">
    <first_name>John</first_name>
    <last_name>dela Cruz</last_name>
    <gender>Male</gender>
  </student>
</class_list>
** Refer to the classlist.xml sample code
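Attributes used for identification can be read back through the DOM API. The following is a minimal sketch, assuming the JDK's built-in DOM parser and a shortened class list with quoted attribute values; the class name AttributeDemo and the method studentIds are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributeDemo {
    // Collects the id attribute of every <student> element, in document order.
    static List<String> studentIds(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList students = doc.getElementsByTagName("student");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            ids.add(student.getAttribute("id"));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Single quotes are also legal around XML attribute values and
        // avoid escaping inside a Java string literal.
        String xml = "<class_list year='3' section='B'>"
                + "<student id='B-0001'><first_name>Anna</first_name></student>"
                + "<student id='B-0002'><first_name>John</first_name></student>"
                + "</class_list>";
        System.out.println(studentIds(xml)); // prints [B-0001, B-0002]
    }
}
```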
Validating XML Documents
Validation of XML documents is done against a DTD (Document
Type Definition) or an XSD (XML Schema Definition)
Different types of XML documents in terms of validity
Broken XML documents
Well-formed XML documents
Valid XML documents
Validating XML Documents
Broken XML documents are XML documents that violate the syntax
rules
Well-formed XML documents are XML documents that fully
comply with the syntax rules
Valid XML documents are XML documents that are well-formed and comply with a DTD/XSD
Validating XML Documents
A DTD defines the structure of an XML document with the list of
legal elements and attributes.
XML Schema Definition (XSD) is the XML-based alternative to
DTDs. It serves the same purpose as a DTD but is more powerful
and extensible.
DTD is the older standard. XSD is likely to replace DTD for
validating XML documents.
** Refer to the person_DTD.xml sample code
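The difference between a well-formed document and a valid one shows up once DTD validation is switched on. The following is a minimal sketch, assuming the JDK's built-in DOM parser and a small DTD embedded in the document itself; the class name DtdValidation, the method isValid, and the person DTD shown here are all hypothetical and are not taken from the person_DTD.xml sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class DtdValidation {
    // Hypothetical DTD: a person holds a first_name followed by a last_name.
    static final String DTD = "<!DOCTYPE person ["
            + "<!ELEMENT person (first_name, last_name)>"
            + "<!ELEMENT first_name (#PCDATA)>"
            + "<!ELEMENT last_name (#PCDATA)>"
            + "]>";

    // Returns true only when the document is well-formed AND obeys its DTD.
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final boolean[] valid = {true};
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                valid[0] = false; // validity errors are reported here
            }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return false; // not even well-formed
        }
        return valid[0];
    }

    public static void main(String[] args) throws Exception {
        String good = DTD + "<person><first_name>Anna</first_name>"
                + "<last_name>Sanders</last_name></person>";
        String bad = DTD + "<person><first_name>Anna</first_name></person>";
        System.out.println(isValid(good)); // true
        System.out.println(isValid(bad));  // false: well-formed, but last_name is missing
    }
}
```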
Validating XML Documents
DTD vs XML Schema (XSD)
XSD is extensible to accept future additions
XSD is more powerful than its predecessor
XSD makes use of XML syntax
XSD supports data types
XSD supports namespaces
** Refer to the person_XSD.xml and person.xsd sample code
Validating XML Documents
XSD is written similarly to XML as XSD makes use of XML
syntax, hence most of the rules of XML apply to XSD
XML Schema has data types and namespaces, unlike DTDs
XML Schema data types include
String
Date
Numeric
Many others
Validating XML Documents
The <schema> element is the root element for XSD
Syntax : <xs:schema>
</xs:schema>
The <schema> element can have attributes, including the
default namespace to be used
Validating XML Documents
Attributes of elements are defined within the elements where the
attributes belong
Syntax :
<xs:attribute name="name_of_attribute" type="data_type"/>
Example :
<xs:element name="dog">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="breed" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
** Refer to the Samples.xml and XSD_Samples.xsd sample code
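A schema like the one above can be applied from Java through the javax.xml.validation API. The following is a minimal sketch, assuming the schema is wrapped in an <xs:schema> root bound to the standard XML Schema namespace; the class name XsdValidation, the constant DOG_XSD, and the method isValid are hypothetical, and the sample documents are not taken from Samples.xml.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class XsdValidation {
    // The dog element: text content plus an optional breed attribute.
    // Single quotes are used for attribute values to avoid escaping in Java.
    static final String DOG_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='dog'>"
            + "<xs:complexType><xs:simpleContent>"
            + "<xs:extension base='xs:string'>"
            + "<xs:attribute name='breed' type='xs:string'/>"
            + "</xs:extension>"
            + "</xs:simpleContent></xs:complexType>"
            + "</xs:element>"
            + "</xs:schema>";

    // Returns true when the XML document validates against the given schema.
    static boolean isValid(String xsd, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document breaks a schema rule
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(DOG_XSD, "<dog breed='Beagle'>Spike</dog>")); // true
        System.out.println(isValid(DOG_XSD, "<dog color='brown'>Spike</dog>"));  // false: color is not declared
    }
}
```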
Validating XML Documents
Elements in schemas define the structure and properties of
elements in XML documents
Elements in schemas are divided into two types
Simple Elements
Complex Elements
Validating XML Documents
Simple elements contain only text; they cannot contain other
elements or attributes
In XML :
<first_name>Michael Angelo</first_name>
Syntax :
<xs:element name="my_name" type="the_datatype"/>
Example :
<xs:element name="first_name" type="xs:string"/>
Simple elements can have default or fixed values
Syntax :
<xs:element name="my_name" type="the_datatype" default="default_value"/>
<xs:element name="my_name" type="the_datatype" fixed="fixed_value"/>
Examples :
<xs:element name="current_year" type="xs:integer" default="2008"/>
<xs:element name="legal_age" type="xs:integer" fixed="18"/>
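The effect of a fixed value can be observed by validating documents against the legal_age declaration. The following is a minimal sketch, assuming the declaration is placed inside an <xs:schema> root and checked with javax.xml.validation; the class name FixedValueCheck, the constant SCHEMA, and the method accepts are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class FixedValueCheck {
    // A simple element of type xs:integer whose value is fixed at 18.
    static final String SCHEMA =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='legal_age' type='xs:integer' fixed='18'/>"
            + "</xs:schema>";

    // Returns true when the document validates against SCHEMA.
    static boolean accepts(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(SCHEMA)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(accepts("<legal_age>18</legal_age>"));  // true
        System.out.println(accepts("<legal_age>21</legal_age>"));  // false: differs from the fixed value
        System.out.println(accepts("<legal_age>old</legal_age>")); // false: not an integer
    }
}
```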
Validating XML Documents
Complex elements refer to elements that can have attributes
and can also be one of the following:
Elements that contain only text (with attributes)
Elements that are empty
Elements containing other elements
Elements that contain both text and other elements
Validating XML Documents
Elements that contain only text but have attributes are
considered complex elements
XML :
<dog dog_tag_number="100012">Spike</dog>
XSD :
<xs:element name="dog">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="dog_tag_number" type="xs:integer" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Validating XML Documents
Elements that are empty refer to elements that hold no text but
may contain attributes
XML :
<computer serial_ID="103A-0212-00A7-101B" />
XSD :
<xs:element name="computer">
  <xs:complexType>
    <xs:attribute name="serial_ID" type="xs:string"/>
  </xs:complexType>
</xs:element>
Validating XML Documents
Elements that serve only to contain other elements are
considered complex elements
XML :
<car>
  <color>red</color>
  <wheels>4</wheels>
</car>
XSD :
<xs:element name="car">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="color" type="xs:string"/>
      <xs:element name="wheels" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Validating XML Documents
Elements that contain both text and other elements are
considered complex elements
XML :
<letter> Dear Mrs.
<name>Erika Daniels</name>. Your child,
<child_name>Michael</child_name>, has done something at school.
Please come to the principal's office anytime tomorrow,
<schedule>2008-08-21</schedule>.
</letter>
XSD :
<xs:element name="letter">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"/>
<xs:element name="schedule" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Validating XML Documents
Indicators control how elements can be used
Order Indicators
All
Choice
Sequence
Occurrence Indicators
maxOccurs
minOccurs
Group Indicators
Group name
attributeGroup name
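Order and occurrence indicators can be seen at work by validating a few documents. The following is a minimal sketch, assuming a hypothetical classroom schema in which a sequence holds exactly one teacher followed by one to three students; the class name OccurrenceCheck, the constant SCHEMA, and the method accepts are illustrative names only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class OccurrenceCheck {
    // xs:sequence fixes the order; minOccurs and maxOccurs bound the count.
    // When omitted, both minOccurs and maxOccurs default to 1.
    static final String SCHEMA =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='classroom'>"
            + "<xs:complexType><xs:sequence>"
            + "<xs:element name='teacher' type='xs:string'/>"
            + "<xs:element name='student' type='xs:string' minOccurs='1' maxOccurs='3'/>"
            + "</xs:sequence></xs:complexType>"
            + "</xs:element>"
            + "</xs:schema>";

    // Returns true when the document validates against SCHEMA.
    static boolean accepts(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(SCHEMA)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String teacher = "<teacher>Victoria</teacher>";
        String student = "<student>Michael</student>";
        // One teacher, two students: allowed
        System.out.println(accepts("<classroom>" + teacher + student + student + "</classroom>"));
        // No student: fewer than minOccurs
        System.out.println(accepts("<classroom>" + teacher + "</classroom>"));
        // Student before teacher: violates the sequence order
        System.out.println(accepts("<classroom>" + student + teacher + "</classroom>"));
    }
}
```

The first call prints true; the other two print false.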
Validating XML Documents
The <any> element allows other elements not specified in the
schema, which makes it extensible
Similarly, the <anyAttribute> element allows other attributes not
specified in the schema within the designated element
Validating XML Documents
Element names can be substituted by defining a
substitutionGroup in the XML schema
Substitution can be useful when developers and users speak
different languages, where it may be feasible to change element
names for easier understanding
Validating XML Documents
Restrictions set the acceptable values for elements and
attributes in XML documents
Some restrictions on values that can be applied are
Set of values
Series of values
Whitespace character
Length Restrictions
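Two of these restrictions can be tried out with a small schema. The following is a minimal sketch, assuming a set of values is expressed with xs:enumeration and a series of values with xs:pattern; the class name RestrictionCheck, the constant SCHEMA, the method accepts, and the gender and student_id declarations are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class RestrictionCheck {
    static final String SCHEMA =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            // Set of values: only the listed strings are accepted.
            + "<xs:element name='gender'><xs:simpleType>"
            + "<xs:restriction base='xs:string'>"
            + "<xs:enumeration value='Male'/><xs:enumeration value='Female'/>"
            + "</xs:restriction></xs:simpleType></xs:element>"
            // Series of values: one uppercase letter, a hyphen, four digits.
            + "<xs:element name='student_id'><xs:simpleType>"
            + "<xs:restriction base='xs:string'>"
            + "<xs:pattern value='[A-Z]-[0-9]{4}'/>"
            + "</xs:restriction></xs:simpleType></xs:element>"
            + "</xs:schema>";

    // Returns true when the document validates against SCHEMA.
    static boolean accepts(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(SCHEMA)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(accepts("<gender>Female</gender>"));         // true
        System.out.println(accepts("<gender>Unknown</gender>"));        // false: not in the set
        System.out.println(accepts("<student_id>B-0001</student_id>")); // true
        System.out.println(accepts("<student_id>0001</student_id>"));   // false: does not match the pattern
    }
}
```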