Introduction to XML
Extensible Markup Language
• Introduction
• SGML (Standard Generalized Markup Language) is a meta-markup language: a language for
defining markup languages. It can describe a wide variety of document types.
• SGML was developed in the early 1980s and approved as an ISO standard in 1986.
• HTML was developed using SGML in the early 1990s - specifically for Web
documents.
• Three problems with HTML:
• 1. HTML describes the general form and layout of information without
considering its meaning.
• 2. It has a fixed set of tags and attributes, so the given tags must fit every kind of
document and there is no way to find particular information.
• 3. There are no restrictions on the arrangement or order in which tags appear in a
document.
What is XML
• XML stands for eXtensible Markup Language.
• A markup language is used to provide information about a
document.
• Tags are added to the document to provide the extra
information.
• XML was designed to describe data, not to display data.
• XML tags are not predefined. You must define your own tags.
• HTML tags tell a browser how to display the document.
• XML tags give a reader some idea what some of the data
means.
What is XML Used For?
• XML documents are used to transfer data from one
place to another, often over the Internet.
• XML applications (sometimes loosely called subsets) are
tag sets designed for particular uses.
• A number of fields have their own tag sets. These
include chemistry, mathematics, and book publishing.
• Many of these tag sets are published as open specifications,
some by the World Wide Web Consortium (W3C), and are
available for anyone's use.
How Can XML be Used?
• If you need to display dynamic data in your HTML document, it
will take a lot of work to edit the HTML each time the data
changes.
• With XML, data can be stored in separate XML files. This way
you can concentrate on using HTML/CSS for display and
layout, and be sure that changes in the underlying data will not
require any changes to the HTML.
• With a few lines of JavaScript code, you can read an external
XML file and update the data content of your web page.
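The separation of data from layout described above can be sketched in a few lines. The slide mentions JavaScript; the sketch below uses Python's standard library instead, and the `prices` data, the element names, and the `render_rows` helper are all hypothetical, chosen only for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical data file contents; in practice this would be read from disk
# or fetched over HTTP.
PRICES_XML = """<?xml version="1.0"?>
<prices>
  <item><name>Coffee</name><cost>2.50</cost></item>
  <item><name>Tea</name><cost>1.75</cost></item>
</prices>"""

# The HTML layout is fixed; only the rows change when the data changes.
PAGE_TEMPLATE = "<table>\n{rows}\n</table>"


def render_rows(xml_text: str) -> str:
    """Turn each <item> in the XML data into one HTML table row."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("item"):
        name = html.escape(item.findtext("name", default=""))
        cost = html.escape(item.findtext("cost", default=""))
        rows.append(f"<tr><td>{name}</td><td>{cost}</td></tr>")
    return "\n".join(rows)


page = PAGE_TEMPLATE.format(rows=render_rows(PRICES_XML))
print(page)
```

Changing a price means editing only the XML data; the template and the rendering function stay the same.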
Advantages of XML
• XML is text (Unicode) based.
– Files can be read and edited with any text editor, on any platform.
– Can be transmitted efficiently, since text compresses well.
• XML documents can be modularized. Parts can
be reused.
Example of an HTML Document
<html>
<head><title>Example</title></head>
<body>
<h1>This is an example of a page.</h1>
<h2>Some information goes here.</h2>
</body>
</html>
Example of an XML Document
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
Difference Between HTML and XML
• HTML tags have a fixed meaning and
browsers know what it is.
• XML tags are different for different
applications, and users know what they
mean.
• HTML tags are used for display.
• XML tags are used to describe documents
and data.
XML Rules
• Tags are enclosed in angle brackets.
• Tags come in pairs with start-tags and
end-tags.
• Tags must be properly nested.
– <name><email>…</name></email> is not allowed.
– <name><email>…</email></name> is.
• Elements that have no content and no end-tag must be
terminated by a ‘/’ before the closing angle bracket.
– <br /> is an XHTML example.
More XML Rules
• Tags are case sensitive.
– <address> is not the same as <Address>
• Names beginning with XML, in any combination of cases,
are reserved and may not be used as tag names.
• Tag names may not contain ‘<’ or ‘&’; in text content these
characters must be written as &lt; and &amp;.
• Tag names follow rules similar to Java identifiers, except
that hyphens, periods, and a single colon are also
allowed. They must begin with a letter or an underscore
and may not contain white space.
• Documents must have a single root element that
contains all the other elements.
Encoding
• XML (like Java) uses Unicode to represent characters.
• Unicode has several encodings. The most common one
used in the West is UTF-8.
• UTF-8 is a variable-length code. Each character is
encoded in 1 to 4 bytes.
• The first 128 characters in Unicode are ASCII, and UTF-8
encodes each of them in a single byte.
• The Unicode code points between 128 and 255 are
some of the more common characters used in western
Europe, such as ã, á, å, or ç. UTF-8 encodes each of them in 2 bytes.
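The variable-length property is easy to observe directly. The following is a small sketch using Python's built-in string encoding; the four sample characters and the `utf8_length` helper are chosen here for illustration and are not part of the original slides.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


# One sample character for each possible UTF-8 length.
samples = ["A", "ç", "€", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), hex {encoded.hex()}")

# In ISO-8859-1 (Latin-1) the same ç is a single byte.
latin1_bytes = "ç".encode("latin-1")
print(len(latin1_bytes))
```

An ASCII letter takes one byte, a Latin-1 letter such as ç takes two, the euro sign takes three, and a character outside the Basic Multilingual Plane takes four.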
Well-Formed Documents
• An XML document is said to be well-formed if it
follows all the rules.
• An XML parser is used to check that all the rules
have been obeyed.
• Web browsers come with XML parsers, and have done so
since Internet Explorer 5 and Netscape 7.
• Parsers are also available for free download
over the Internet.
• Java has included XML parsing support in its standard library since version 1.4.
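A parser can be used as a well-formedness checker simply by asking it to parse the text and seeing whether it reports an error. The sketch below uses Python's standard `xml.etree.ElementTree` module; the `is_well_formed` helper and the test strings are illustrative names, not something defined in the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<address><name>Alice Lee</name></address>"
bad_nesting = "<name><email>alee@aol.com</name></email>"  # tags overlap
bad_case = "<address>Alice Lee</Address>"                 # case mismatch
two_roots = "<name>Alice</name><email>alee@aol.com</email>"  # no single root

for label, doc in [("good", good), ("bad nesting", bad_nesting),
                   ("case mismatch", bad_case), ("two roots", two_roots)]:
    print(f"{label}: {is_well_formed(doc)}")
```

Each of the three rejected strings breaks one of the rules listed earlier: proper nesting, case sensitivity, and the single root element.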
XML Example Revisited
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
• Markup for the data helps understanding of its purpose.
• A flat text file is not nearly so clear.
Alice Lee
alee@aol.com
212-346-1234
1985-03-22
• The last line looks like a date, but what is it for?
Expanded Example
<?xml version="1.0"?>
<address>
<name>
<first>Alice</first>
<last>Lee</last>
</name>
<email>alee@aol.com</email>
<phone>123-45-6789</phone>
<birthday>
<year>1983</year>
<month>07</month>
<day>15</day>
</birthday>
</address>
XML Files are Trees
address
    name
        first
        last
    email
    phone
    birthday
        year
        month
        day
XML Trees
• An XML document has a single root node.
• The tree is a general ordered tree.
– A parent node may have any number of
children.
– Child nodes are ordered, and may have
siblings.
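The tree structure can be made visible by walking a parsed document. This is a minimal sketch with Python's `xml.etree.ElementTree`, using the expanded address example; the `outline` function is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET

ADDRESS_XML = """<?xml version="1.0"?>
<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""


def outline(node, depth=0):
    """Return the element names in document order, indented by tree depth."""
    lines = ["    " * depth + node.tag]
    for child in node:  # iterating an Element yields its children in order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(ADDRESS_XML)
print("\n".join(outline(root)))
```

The recursion visits the single root first and then each child in document order, which is exactly the ordered-tree property described above.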
Validity
• A well-formed document has a tree structure and
obeys all the XML rules.
• A particular application may add more rules in
either a DTD (document type definition) or in a
schema. A document that is well-formed and also
obeys these rules is said to be valid.
• Many specialized DTDs and schemas have
been created to describe particular areas.
• These range from disseminating news bulletins
to chemical formulas.
• DTDs were developed first, so they are not as
comprehensive as schemas.
Document Type Definitions
• A DTD describes the tree structure of a
document and something about its data.
• There are two data types, PCDATA and
CDATA.
– PCDATA is parsed character data; it is used for the text content of elements.
– CDATA is character data, not usually parsed; it is used for attribute values.
• A DTD determines how many times a
node may appear, and how child nodes
are ordered.
Parsing
• Parsing is breaking a block of data into smaller chunks by following a set
of rules, so that it can be more easily interpreted, managed, or
transmitted by a computer. Spreadsheet programs, for
example, parse data to fit it into cells of a certain size.
Document Type Definitions
• The form of an element declaration for
elements that contain elements:
• <!ELEMENT element_name (list of names of child elements)>
• The form of an attribute declaration:
• <!ATTLIST element_name attribute_name
attribute_type [default_value]>
• Ex. <!ATTLIST airplane places CDATA "4">
DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
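To show what these declarations constrain, the sketch below copies the child sequences from the DTD into a Python dictionary and checks a parsed document against them. This is not a DTD validator (Python's standard library does not include one); `CONTENT_MODEL` and `find_violations` are hypothetical names, and only the simple comma-separated sequences used in this DTD are handled.

```python
import xml.etree.ElementTree as ET

# Child sequences copied from the DTD above; an empty tuple stands for #PCDATA.
CONTENT_MODEL = {
    "address": ("name", "email", "phone", "birthday"),
    "name": ("first", "last"),
    "birthday": ("year", "month", "day"),
    "first": (), "last": (), "email": (), "phone": (),
    "year": (), "month": (), "day": (),
}


def find_violations(node):
    """Return messages for elements whose children differ from the declared sequence."""
    if node.tag not in CONTENT_MODEL:
        return [f"<{node.tag}> is not declared"]
    problems = []
    expected = CONTENT_MODEL[node.tag]
    actual = tuple(child.tag for child in node)
    if actual != expected:
        problems.append(f"<{node.tag}> has children {actual}, expected {expected}")
    for child in node:
        problems.extend(find_violations(child))
    return problems


VALID = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>alee@aol.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""

# Same data, but <phone> and <email> are in the wrong order.
SWAPPED = VALID.replace(
    "<email>alee@aol.com</email>\n  <phone>123-45-6789</phone>",
    "<phone>123-45-6789</phone>\n  <email>alee@aol.com</email>",
)

print(find_violations(ET.fromstring(VALID)))
print(find_violations(ET.fromstring(SWAPPED)))
```

Both documents are well-formed, but only the first follows the order that the DTD declares for address.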
INTERNAL AND EXTERNAL DTDs
• Internal DTD Example:
• External DTD Example (assuming that the DTD
is stored in the file named planes.dtd):
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
NAMESPACES
• It is often convenient to construct XML documents that include
tag sets that are defined for and used by other documents.
• When a tag set is available and appropriate for a particular XML
document, it is better to use it than to invent a new
collection of element types.
• A problem with using different markup vocabularies in the same
document is that collisions can result between names that are
defined in two or more of those tag sets.
• An example of this situation is having a <table> tag for a
category of furniture and a <table> tag from XHTML for
information tables.
NAMESPACES
• An XML namespace is a collection of element and attribute
names used in XML documents. The name of a namespace
usually has the form of a uniform resource identifier (URI).
• The form of a namespace declaration for an element is
• <element_name xmlns[:prefix] = "URI">
• The square brackets indicate that what is within them is
optional. The prefix, if included, is the name that must be
attached to the names in the declared namespace.
• <html xmlns = "http://www.w3.org/1999/xhtml">
NAMESPACES
• The next example declares two namespaces. The first is
declared to be the default namespace; the second defines the
prefix, cap:
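As a stand-in illustration of a default namespace combined with a cap prefix, here is a minimal sketch parsed with Python's `xml.etree.ElementTree`. Both namespace URIs (under example.org), the element names, and the `NS` mapping are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: unprefixed names belong to the default namespace,
# names prefixed with cap: belong to the second namespace.
STATES_XML = """<?xml version="1.0"?>
<states xmlns="http://example.org/states"
        xmlns:cap="http://example.org/state-capitals">
  <state>
    <name>South Dakota</name>
    <cap:name>Pierre</cap:name>
  </state>
</states>"""

# Prefixes used in the search paths below; they need not match the document's.
NS = {
    "s": "http://example.org/states",
    "c": "http://example.org/state-capitals",
}

root = ET.fromstring(STATES_XML)
state = root.find("s:state", NS)

# Two elements are both called "name" but do not collide.
print(state.find("s:name", NS).text)
print(state.find("c:name", NS).text)
print(root.tag)  # ElementTree writes qualified names as {uri}local-name
```

The parser distinguishes the two name elements by namespace URI, not by prefix, which is how the collision described earlier is avoided.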
XML SCHEMAS
• XML schemas are similar to DTDs, i.e., schemas are used to
define the structure of a document.
• DTDs have several disadvantages:
• The syntax of DTDs is unrelated to XML, so they
cannot be analyzed with an XML processor.
• It is difficult for programmers to deal with two
different syntaxes.
• DTDs do not support data types for the content of a tag. All
content is specified as text.
Schemas
• Schemas are themselves XML documents.
• They were standardized after DTDs and provide
more information about the document.
• They have a number of data types including
string, decimal, integer, boolean, date, and time.
• They divide elements into simple and complex
types.
• They also determine the tree structure and how
many children a node may have.
DEFINING A SCHEMA
• Schemas themselves are written with the use of a collection of
tags from a namespace that is, in effect, a schema of
schemas.
• The name of this namespace is
http://www.w3.org/2001/XMLSchema.
• Every schema has schema as its root element. The
namespace specification appears as follows:
• xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
• The name of the namespace defined by a schema must be
specified with the targetNamespace attribute of the schema
element.
– targetNamespace = "http://cs.uccs.edu/planeSchema"
DEFINING A SCHEMA INSTANCE
• An instance document normally defines its default namespace to be the one
defined in its schema.
• For example, if the root element is planes, we could have
<planes xmlns = "http://cs.uccs.edu/planeSchema" ... >
• The second attribute specification in the root element of an instance
document declares the standard namespace for instances, which includes the name
XMLSchema-instance. It is usually bound to the prefix xsi so that the
schemaLocation attribute can be used.
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
• Third, the instance document must specify the filename of the schema in
which the default namespace is defined. This is accomplished with the
schemaLocation attribute, which takes two values: the namespace of the
schema and the filename of the schema.
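The three attribute specifications can be seen together in one root element. The sketch below parses such an element with Python's `xml.etree.ElementTree` and splits the schemaLocation value into its two parts; the filename planes.xsd is an assumption, since the slides do not name the schema file.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# planes.xsd is a hypothetical schema filename used only for this sketch.
INSTANCE_XML = """<?xml version="1.0"?>
<planes xmlns="http://cs.uccs.edu/planeSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://cs.uccs.edu/planeSchema planes.xsd">
</planes>"""

root = ET.fromstring(INSTANCE_XML)

# Namespaced attributes are keyed as "{namespace-uri}local-name".
location = root.get(f"{{{XSI}}}schemaLocation")

# The attribute value is a whitespace-separated pair: namespace, then file.
namespace, schema_file = location.split()
print(namespace)
print(schema_file)
```

The first value of schemaLocation matches the default namespace of the root element, which is how a processor connects the instance to its schema.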
Schema for First address Example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="phone" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
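Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below loads the schema above with Python's `xml.etree.ElementTree`, lists the declared child elements, and performs a rough check of an instance. It is not an XSD validator (the standard library has none): `declared_children` and `check_instance` are hypothetical helpers, and the date test only approximates the xs:date type.

```python
import xml.etree.ElementTree as ET
from datetime import date

XS = "{http://www.w3.org/2001/XMLSchema}"

# Passed as bytes because the document carries its own encoding declaration.
SCHEMA_XML = b"""<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="phone" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>"""


def declared_children(schema_root):
    """Return (name, type) pairs from the first xs:sequence, in schema order."""
    sequence = schema_root.find(f".//{XS}sequence")
    return [(el.get("name"), el.get("type"))
            for el in sequence.findall(f"{XS}element")]


def check_instance(instance_root, children):
    """Check the child order, and that xs:date values parse as ISO dates."""
    if [child.tag for child in instance_root] != [name for name, _ in children]:
        return False
    for child, (_, type_name) in zip(instance_root, children):
        if type_name == "xs:date":
            try:
                date.fromisoformat(child.text)
            except (TypeError, ValueError):
                return False
    return True


children = declared_children(ET.fromstring(SCHEMA_XML))
good = ET.fromstring(
    "<address><name>Alice Lee</name><email>alee@aol.com</email>"
    "<phone>212-346-1234</phone><birthday>1985-03-22</birthday></address>")
bad_date = ET.fromstring(
    "<address><name>Alice Lee</name><email>alee@aol.com</email>"
    "<phone>212-346-1234</phone><birthday>22 March 1985</birthday></address>")

print(children)
print(check_instance(good, children), check_instance(bad_date, children))
```

A DTD could express the order of the four children but not the requirement that birthday holds a date; the type attribute is what adds that.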
Explanation of Example Schema
<?xml version="1.0" encoding="ISO-8859-1" ?>
• ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
• http://www.w3.org/2001/XMLSchema is the namespace of the schema standard.
<xs:element name="address">
<xs:complexType>
• This states that address is a complex type element.
<xs:sequence>
• This states that the following elements form a sequence and must
come in the order shown.
<xs:element name="name" type="xs:string"/>
• This says that the element, name, must be a string.
<xs:element name="birthday" type="xs:date"/>
• This states that the element, birthday, is a date. Dates are always of
the form yyyy-mm-dd.
XSLT
• XSL = Style Sheets for XML
• XML does not use predefined tags (we can use any
tag names we like), so a browser has no built-in
understanding of what each tag means.
• A <table> tag could mean an HTML table, a piece of
furniture, or something else, and a browser does
not know how to display it.
• XSL describes how the XML document should be
displayed.
XSLT
• The eXtensible Stylesheet Language (XSL) is a family of
recommendations for defining the presentation and
transformations of XML documents.
• It consists of three related standards:
– XSL Transformations (XSLT),
– XML Path Language (XPath), and
– XSL Formatting Objects (XSL-FO).
• XSLT is used to transform one XML document into another,
often an HTML document.
• A program is used that takes as input one XML document and
produces as output another.
• If the resulting document is in HTML, it can be viewed by a web
browser.
• This is a good way to display XML data.
XSLT
• XPath is a language for expressions, which are often used to
identify parts of XML documents, such as specific elements that
are in specific positions in the document or elements that have
particular attribute values.
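Both kinds of selection can be tried with the limited XPath subset that Python's `xml.etree.ElementTree` supports. The planes data below, including the make attribute and the year values, is invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, used only to have something to select from.
PLANES_XML = """<planes>
  <plane make="Cessna"><year>1977</year></plane>
  <plane make="Piper"><year>1965</year></plane>
  <plane make="Cessna"><year>1982</year></plane>
</planes>"""

root = ET.fromstring(PLANES_XML)

# Select by position: the second <plane> child (XPath positions start at 1).
second = root.find("plane[2]")
print(second.get("make"))

# Select by attribute value: every <plane> whose make is Cessna.
cessnas = root.findall("plane[@make='Cessna']")
print([plane.findtext("year") for plane in cessnas])
```

An XSLT processor implements the full XPath language; the subset used here covers only simple paths and predicates.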
OVERVIEW OF XSLT
• XSLT processors take both an XML document and an
XSLT document as input.
• The XSLT document is the program to be executed; the
XML document is the input data to the program.
• An XSLT document consists primarily
of one or more templates.
• One XSLT model of processing XML
data is called the template-driven model.
A Style Sheet to Transform address.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="address">
<html><head><title>Address Book</title></head>
<body>
<xsl:value-of select="name"/>
<br/><xsl:value-of select="email"/>
<br/><xsl:value-of select="phone"/>
<br/><xsl:value-of select="birthday"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The Result of the Transformation
Alice Lee
alee@aol.com
123-45-6789
1983-7-15
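To make the template's behavior concrete, the sketch below imitates it by hand in Python. It is not an XSLT processor (the standard library does not include one): `value_of` and `transform` are hypothetical functions that mimic only the single template in the style sheet above, and the input is the first, flat address example, so the values differ from the result shown above.

```python
import html
import xml.etree.ElementTree as ET

ADDRESS_XML = """<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>"""


def value_of(context, name):
    """Mimic <xsl:value-of select="name"/>: all text inside the first match."""
    node = context.find(name)
    return "" if node is None else "".join(node.itertext())


def transform(address):
    """Mimic the template with match="address" from the style sheet."""
    fields = ["name", "email", "phone", "birthday"]
    body = "\n<br/>".join(html.escape(value_of(address, f)) for f in fields)
    return ("<html><head><title>Address Book</title></head>\n"
            f"<body>\n{body}\n</body>\n</html>")


output = transform(ET.fromstring(ADDRESS_XML))
print(output)
```

The fixed HTML in the template is copied to the output unchanged, and each value-of instruction is replaced by text taken from the input document.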
Parsers
• There are two principal models for
parsers.
• SAX – Simple API for XML
– Uses a call-back method
– Similar to Java event listeners
• DOM – Document Object Model
– Creates a parse tree
– Requires a tree traversal
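The two models can be compared side by side on the same document. The sketch below uses Python's standard `xml.sax` and `xml.dom.minidom` modules rather than the Java APIs the slide alludes to; the `TagRecorder` class is a hypothetical handler written for this illustration.

```python
import xml.sax
from xml.dom import minidom
from xml.sax.handler import ContentHandler

ADDRESS_XML = b"""<address>
<name>Alice Lee</name>
<email>alee@aol.com</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>"""


class TagRecorder(ContentHandler):
    """SAX: the parser calls back into this object as it reads each start-tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# SAX streams through the document once; nothing is kept unless we store it.
recorder = TagRecorder()
xml.sax.parseString(ADDRESS_XML, recorder)
print(recorder.names)

# DOM builds the whole parse tree first, which is then traversed or searched.
document = minidom.parseString(ADDRESS_XML)
email = document.getElementsByTagName("email")[0]
email_text = email.firstChild.data
print(email_text)
```

SAX suits large documents that are read once from start to finish, while DOM is more convenient when the program needs to move around the tree or modify it.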
