XML-lesson Note

xml basics.


XML__IT_Grade 12

Introduction to XML

XML stands for Extensible Markup Language and is a text-based markup language derived
from Standard Generalized Markup Language (SGML).
What is Markup?
XML is a markup language that defines a set of rules for encoding documents in a format
that is both human-readable and machine-readable. So, what exactly is a markup
language? Markup is information added to a document that enhances its meaning in
certain ways, in that it identifies the parts and how they relate to each other. More
specifically, a markup language is a set of symbols that can be placed in the text of a
document to demarcate and label the parts of that document.
The following example shows how XML markup looks when embedded in a piece of text:

<message>
<text>Hello, world!</text>
</message>
This snippet includes the markup symbols, or tags, such as <message>...</message>
and <text>...</text>. The tags <message> and </message> mark the start and the end
of the XML code fragment. The tags <text> and </text> surround the text Hello, world!.
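To see how software reads this markup, here is a minimal sketch in Python using the standard library module xml.etree.ElementTree. The choice of Python and the variable names are assumptions made for illustration; they are not part of the original lesson.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the example, held in a Python string.
snippet = "<message><text>Hello, world!</text></message>"

# Parse the string and get the outermost element.
message = ET.fromstring(snippet)

# The tag names label the parts; the text sits between <text> and </text>.
text_element = message.find("text")
print(message.tag)
print(text_element.text)
```

The parser hands back the labels (message, text) separately from the content (Hello, world!), which is the distinction between markup and text described above.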
Is XML a Programming Language?
A programming language consists of grammar rules and its own vocabulary, which are used
to create computer programs. These programs instruct the computer to perform specific
tasks. XML does not qualify as a programming language because it does not perform any
computation or express algorithms. It is usually stored in a simple text file and is processed by
special software that is capable of interpreting XML.
The following is a complete XML document:
<?xml version="1.0"?>
<contact-info>
<name>abel dawit</name>
<company>xmlTutorial</company>
<phone>(011) 123-4567</phone>
</contact-info>
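As a rough illustration of "special software that is capable of interpreting XML", the sketch below uses Python's standard xml.etree.ElementTree module to turn the document into a dictionary. The XML only carries the data; the program does the work. The variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0"?>
<contact-info>
<name>abel dawit</name>
<company>xmlTutorial</company>
<phone>(011) 123-4567</phone>
</contact-info>"""

# Parse the document; the root element is <contact-info>.
root = ET.fromstring(document)

# Map each child element's name to the text it contains.
contact = {child.tag: child.text for child in root}
print(contact["name"])
print(contact["phone"])
```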
XML Declaration
An XML document can optionally have an XML declaration. It is written as follows:
<?xml version="1.0" encoding="UTF-8"?>
Here version is the XML version and encoding specifies the character encoding used in
the document.
Syntax Rules for XML Declaration
The XML declaration is case sensitive and must begin with "<?xml", where "xml"
is written in lower case.
If present, the XML declaration must be the first statement in the XML document.
A transport protocol such as HTTP can override the value of encoding that you put in the XML declaration.
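The first two rules can be tried out with a short Python sketch. It relies on the standard xml.etree.ElementTree parser; the helper name parses and the <note/> element are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Declaration at the very start, written in lower case: accepted.
declaration_first = '<?xml version="1.0" encoding="UTF-8"?><note/>'

# A line break before the declaration means it is no longer first: rejected.
declaration_late = '\n<?xml version="1.0" encoding="UTF-8"?><note/>'

# "XML" in upper case is not a valid way to start the declaration: rejected.
upper_case = '<?XML version="1.0" encoding="UTF-8"?><note/>'

for sample in (declaration_first, declaration_late, upper_case):
    print(parses(sample))
```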
Tags and Elements
An XML file is structured by several XML-elements, also called XML-nodes or XML-tags.
The names of XML-elements are enclosed in angle brackets < > as shown below:
<element>
Syntax Rules for Tags and Elements
Element Syntax: Each XML-element needs to be closed, either with an end tag that matches
its start tag, as shown below:
<element>....</element>
or, for an element with no content, with a single self-closing tag:
<element/>
Nesting of Elements: An XML-element can contain multiple XML-elements as its children,
but the child elements must not overlap, i.e., an end tag of an element must have the
same name as that of the most recent unmatched start tag.
The following example shows incorrect nested tags:
<?xml version="1.0"?>
<contact-info>
<company>TutorialsPoint
</contact-info>
</company>
The following example shows correct nested tags:
<?xml version="1.0"?>
<contact-info>
<company>TutorialsPoint</company>
</contact-info>
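The nesting rule can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the helper name nesting_error is hypothetical.

```python
import xml.etree.ElementTree as ET

def nesting_error(xml_text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None

# </contact-info> arrives while <company> is still open: the tags overlap.
overlapping = "<contact-info><company>TutorialsPoint</contact-info></company>"

# <company> is closed before its parent: the tags are properly nested.
nested = "<contact-info><company>TutorialsPoint</company></contact-info>"

print(nesting_error(overlapping))
print(nesting_error(nested))
```

For the overlapping input the parser reports a mismatched tag, because the end tag it meets does not have the name of the most recent unmatched start tag.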
Root Element: An XML document can have only one root element. For example, the following
is not a correct XML document, because both the x and y elements occur at the top level
without a root element:
<x>...</x>
<y>...</y>
The following example shows a correctly formed XML document:
<root>
<x>...</x>
<y>...</y>
</root>
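A short Python sketch of the same point, assuming the standard xml.etree.ElementTree parser: two top-level elements are rejected, while the same elements wrapped in a single root are accepted. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

fragments = "<x>...</x><y>...</y>"

# Two elements at the top level: the parser stops after the first one.
try:
    ET.fromstring(fragments)
    top_level_error = None
except ET.ParseError as error:
    top_level_error = str(error)

# The same elements inside one root element form a correct document.
wrapped = ET.fromstring("<root>" + fragments + "</root>")
child_names = [child.tag for child in wrapped]

print(top_level_error)
print(child_names)
```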
Case Sensitivity: The names of XML-elements are case-sensitive. That means the names
in the start and end tags need to be in exactly the same case.
For example, <contact-info> is different from <Contact-Info>.
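Case sensitivity shows up both when looking elements up and when matching tags. The sketch below uses Python's standard xml.etree.ElementTree module; the element contents "first" and "second" are made up for the example.

```python
import xml.etree.ElementTree as ET

# <Name> and <name> are two different elements that can sit side by side.
document = "<contact-info><Name>first</Name><name>second</name></contact-info>"
root = ET.fromstring(document)
upper = root.find("Name").text
lower = root.find("name").text

# An end tag in a different case does not close the start tag.
try:
    ET.fromstring("<contact-info></Contact-Info>")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(upper, lower, mismatch_rejected)
```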
XML Attributes
An attribute specifies a single property for the element, using a name/value pair. An
XML-element can have one or more attributes. For example:
<a href="http://www.tutorialspoint.com/">Tutorialspoint!</a>
Here href is the attribute name and http://www.tutorialspoint.com/ is the attribute value.
Syntax Rules for XML Attributes
Attribute names in XML (unlike HTML) are case sensitive.
That is, HREF and href are considered two different XML attributes.
The same attribute cannot appear twice on one element. The following example shows incorrect syntax because
the attribute b is specified twice:
<a b="x" c="y" b="z">....</a>
Attribute names are written without quotation marks, whereas attribute values must always appear in
quotation marks. The following example demonstrates incorrect XML syntax:
<a b=x>....</a>
In the above syntax, the attribute value is not enclosed in quotation marks.
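The attribute rules above can be exercised with a small Python sketch built on the standard xml.etree.ElementTree parser. The helper try_parse is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET

def try_parse(xml_text):
    """Return the parsed element, or None if the text is not well-formed."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None

# Attribute names are case-sensitive: href is present, HREF is not.
link = try_parse('<a href="http://www.tutorialspoint.com/">Tutorialspoint!</a>')
href_value = link.get("href")
upper_value = link.get("HREF")

# The same attribute given twice, or a value without quotes, is rejected.
duplicate = try_parse('<a b="x" c="y" b="z">....</a>')
unquoted = try_parse('<a b=x>....</a>')

# Single quotes around the value are as good as double quotes.
single_quoted = try_parse("<a b='x'>....</a>")

print(href_value, upper_value, duplicate, unquoted)
```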
The names of XML-elements and XML-attributes are case-sensitive, which means the names
in the start and end tags need to be written in the same case.
To avoid character encoding problems, all XML files should be saved as Unicode UTF-8 or UTF-16 files.
Whitespace characters like blanks, tabs and line-breaks between XML-elements and
between XML-attributes are generally treated as insignificant.
Some characters are reserved by the XML syntax itself. Hence, they cannot be used
directly. To use them, some replacement-entities are used, which are listed below:
Not Allowed Character    Replacement Entity    Character Description
<                        &lt;                  less than
>                        &gt;                  greater than
&                        &amp;                 ampersand
'                        &apos;                apostrophe
"                        &quot;                quotation mark
An XML declaration should abide by the following rules:
• If the XML declaration is present in the XML, it must be placed as the first line in
the XML document.
• If the XML declaration is included, it must contain the version number attribute.
• The parameter names and values are case-sensitive.
• The names are always in lower case.
• The order of placing the parameters is important. The correct order is: version,
encoding and standalone.
• Either single or double quotes may be used.
• The XML declaration has no closing tag, i.e., there is no </?xml>.
Example XML declaration with all parameters defined:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
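A parser can report what a declaration says. The following sketch assumes Python's standard xml.parsers.expat module and its XmlDeclHandler callback; the function name read_declaration and the <note/> element are hypothetical.

```python
import xml.parsers.expat

def read_declaration(xml_bytes):
    """Return the version, encoding and standalone values of the declaration."""
    found = {}

    def on_declaration(version, encoding, standalone):
        # standalone is 1 for "yes", 0 for "no" and -1 when it is left out.
        found["version"] = version
        found["encoding"] = encoding
        found["standalone"] = standalone

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_bytes, True)
    return found

# All three parameters, in the required order.
full = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="no" ?><note/>')

# Only the mandatory version, written with single quotes.
minimal = read_declaration(b"<?xml version='1.0'?><note/>")

# encoding placed before version breaks the ordering rule.
try:
    read_declaration(b'<?xml encoding="UTF-8" version="1.0"?><note/>')
    wrong_order_rejected = False
except xml.parsers.expat.ExpatError:
    wrong_order_rejected = True

print(full)
print(minimal)
print(wrong_order_rejected)
```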
Let us learn about one of the most important parts of XML, the XML tags. XML tags form
the foundation of XML. They define the scope of an element in XML. They can also be used
to insert comments, declare settings required for parsing the environment, and to insert
special instructions.
We can broadly categorize XML tags as follows:
Start Tag
The beginning of every non-empty XML element is marked by a start-tag. The following is an example of a start-tag:
<address>
End Tag
Every element that has a start tag should end with an end-tag. The following is an example of an end-tag:
</address>
Note that end tags include a solidus ("/") before the name of the element.
Empty Tag
The text that appears between start-tag and end-tag is called content. An element which
has no content is termed as empty. An empty element can be represented in two ways: as
a start-tag immediately followed by an end-tag, for example <address></address>, or as
a single empty-element tag, for example <address/>.
XML tags must be closed in an appropriate order, i.e., an XML tag opened inside another
element must be closed before the outer element is closed. For example:
<outer_element>
<internal_element>
This tag is closed before the outer_element
</internal_element>
</outer_element>
XML elements can be defined as the building blocks of an XML document. Elements can behave as
containers to hold text, elements, attributes, media objects, or all of these.
Each XML document contains one or more elements, the scope of which is either
delimited by start and end tags or, for empty elements, by an empty-element tag.
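An empty element can be written either with a start tag and an end tag or with a single empty-element tag. A minimal Python sketch, assuming the standard xml.etree.ElementTree module, suggests that a parser treats both spellings as the same element; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The same empty element, written in the two possible ways.
long_form = ET.fromstring("<address></address>")
short_form = ET.fromstring("<address/>")

# Neither form has text content or child elements.
both_empty = (
    long_form.text is None and short_form.text is None
    and len(long_form) == 0 and len(short_form) == 0
)

# Written back out, the two forms are indistinguishable.
same_output = ET.tostring(long_form) == ET.tostring(short_form)

print(both_empty, same_output)
```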
The following is the syntax to write an XML element:
<element-name attribute1 attribute2>
...content
</element-name>
Where,
element-name is the name of the element. Its name and case must match in the start and end tags.
attribute1, attribute2 are attributes of the element, separated by white space.
An attribute defines a property of the element. It associates a name with a value,
which is a string of characters. An attribute is written as:
name = "value"
The name is followed by an = sign and a string value inside double (" ") or single (' ') quotes.
The following is an example of an XML document using various XML elements:
<?xml version="1.0"?>
<contact-info>
<address category="residence">
<name>Haile G/selassie</name>
<company>Haile resort and hotel</company>
<phone>(011) 123-4567</phone>
</address>
</contact-info>
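The same structure can also be built from code instead of typed by hand, which shows how an element acts as a container for an attribute, text and child elements. This is a sketch using Python's standard xml.etree.ElementTree module; the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

# The root element contains one <address> element with a category attribute.
contact_info = ET.Element("contact-info")
address = ET.SubElement(contact_info, "address", category="residence")

# <address> in turn contains three child elements that hold text.
ET.SubElement(address, "name").text = "Haile G/selassie"
ET.SubElement(address, "company").text = "Haile resort and hotel"
ET.SubElement(address, "phone").text = "(011) 123-4567"

# Serialize the tree; every start tag gets its matching end tag.
serialized = ET.tostring(contact_info, encoding="unicode")
print(serialized)
```

Because the library writes the end tags itself, a slip such as closing an element with the wrong tag cannot occur in output produced this way.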
The following rules must be followed for XML elements:
• An element name can contain any alphanumeric characters. The only punctuation
marks allowed in names are the hyphen (-), underscore (_) and period (.).
• Names are case sensitive. For example, Address, address, and ADDRESS are
different names.
• The names in the start and end tags of an element must be identical.
• An element, which is a container, can contain text or elements as seen in the above example.
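The naming rule can be approximated with a regular expression. The sketch below is a deliberately simplified, ASCII-only check written in Python; the pattern and the function name is_simple_name are assumptions for illustration. It also applies one restriction not spelled out in the list: a name may not begin with a digit, a hyphen or a period. Real XML additionally allows many non-ASCII letters and the colon, which is used for namespaces.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, period, underscore or hyphen.
SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def is_simple_name(name):
    """Return True if name fits the simplified ASCII-only naming rule."""
    return SIMPLE_NAME.fullmatch(name) is not None

for candidate in ("contact-info", "first.name", "_id", "2nd", "my name", "a&b"):
    print(candidate, is_simple_name(candidate))
```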
XML – Comments
This section explains how comments work in XML documents. XML comments are similar to HTML
comments. Comments are added as notes or lines for understanding the purpose of XML code.
Syntax
An XML comment has the following syntax:
<!-- Your comment -->
A comment starts with <!-- and ends with -->. You can add textual notes as comments
between these delimiters. The string "--" must not appear inside a comment.
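A short Python sketch, assuming the standard xml.etree.ElementTree parser with its default settings, shows that a comment is a note for the reader and not part of the data, and that a double hyphen inside a comment is rejected. The <note> document is made up for the example.

```python
import xml.etree.ElementTree as ET

# A well-formed comment sits between <!-- and -->.
with_comment = "<note><!-- reminder for the reader --><to>Abel</to></note>"
root = ET.fromstring(with_comment)

# With default settings the comment does not appear among the child elements.
child_names = [child.tag for child in root]

# "--" inside the comment body makes the document ill-formed.
try:
    ET.fromstring("<note><!-- not -- allowed --></note>")
    double_hyphen_rejected = False
except ET.ParseError:
    double_hyphen_rejected = True

print(child_names, double_hyphen_rejected)
```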
XML – DTDs
The XML Document Type Definition, commonly known as DTD, is a way to describe an XML
language precisely. DTDs check the vocabulary and the validity of the structure of XML documents
against the grammatical rules of the appropriate XML language.
An XML DTD can be either specified inside the document, or it can be kept in a separate
document and then linked separately.
Syntax
Basic syntax of a DTD is as follows:
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
In the above syntax,
• The DTD starts with the <!DOCTYPE delimiter.
• The element tells the parser to parse the document from the specified root element.
• The DTD identifier is an identifier for the document type definition, which may be the
path to a file on the system or a URL to a file on the internet. If the DTD points to an external path, it is
called an External Subset.
• The square brackets [ ] enclose an optional list of entity declarations
called the Internal Subset.
Internal DTD
A DTD is referred to as an internal DTD if the elements are declared within the XML file. For a document
that uses only an internal DTD, the standalone attribute in the XML declaration can be set to yes. This means the declaration works
independently of an external source.
Syntax
The following is the syntax of an internal DTD:
<!DOCTYPE root-element [element-declarations]>
where root-element is the name of the root element and element-declarations is where you
declare the elements.
Example
The following is a simple example of an internal DTD:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
Let us go through the above code:
• Start Declaration - Begin the document with the following XML declaration:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
• DTD - Immediately after the XML header, the document type declaration follows,
commonly referred to as the DOCTYPE:
<!DOCTYPE address [
The DOCTYPE declaration has an exclamation mark (!) at the start of the element name.
The DOCTYPE informs the parser that a DTD is associated with this XML document.
• DTD Body - The DOCTYPE declaration is followed by the body of the DTD, where you
declare elements, attributes, entities, and notations.
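To make concrete what the DTD above demands, here is a hand-written Python check that mimics its rules: the root must be address, and it must contain name, company and phone in that order, each holding only text. This is only a sketch. It is not a DTD validator, and the parser in Python's standard library reads the DOCTYPE without validating against it. The function name matches_address_dtd is hypothetical.

```python
import xml.etree.ElementTree as ET

# Content model taken from <!ELEMENT address (name,company,phone)>.
EXPECTED_CHILDREN = ["name", "company", "phone"]

def matches_address_dtd(xml_text):
    """Roughly check a document against the rules of the address DTD."""
    root = ET.fromstring(xml_text)
    if root.tag != "address":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) means text only, so no child may contain further elements.
    return all(len(child) == 0 for child in root)

valid = """<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>"""

# phone comes before company here, which the content model does not allow.
wrong_order = ("<address><name>Tanmay Patil</name>"
               "<phone>(011) 123-4567</phone>"
               "<company>TutorialsPoint</company></address>")

print(matches_address_dtd(valid))
print(matches_address_dtd(wrong_order))
```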
External DTD
In an external DTD, elements are declared outside the XML file. They are accessed by
specifying the system attribute, which may be either a legal .dtd file or a valid URL.
For a document that refers to an external DTD, the standalone attribute in the XML declaration is set to no.
This means the declaration includes information from the external source.
The following is the syntax for an external DTD:
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with the .dtd extension.
Example
The following example shows external DTD usage:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>abel dawit</name>
<company>TutorialsPoint</company>
<phone>(011) 123-456667</phone>
</address>
The content of the DTD file address.dtd is as shown:
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
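A parser sees the DOCTYPE line as a pointer to the external file. The sketch below uses the StartDoctypeDeclHandler callback of Python's standard xml.parsers.expat module to read that pointer; it does not open address.dtd or validate anything. The function name read_doctype is an assumption made for this example.

```python
import xml.parsers.expat

def read_doctype(xml_bytes):
    """Return the root name and system identifier given in the DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["root"] = name
        found["system_id"] = system_id
        found["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)
    return found

document = b"""<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>abel dawit</name>
<company>TutorialsPoint</company>
<phone>(011) 123-456667</phone>
</address>"""

doctype = read_doctype(document)
print(doctype)
```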
XML – Schemas
XML Schema is commonly known as XML Schema Definition (XSD). It is used to describe
and validate the structure and the content of XML data. An XML schema defines the elements, attributes, and
data types.
The schema element supports namespaces. It is similar to a database schema that describes the data in a
database. You need to declare a schema in your XML document as follows:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

Example
The following example shows how to use schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="contact">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="company" type="xs:string" />
<xs:element name="phone" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The basic idea behind XML Schemas is that they describe the legitimate format that an
XML document can take.
Elements
As described in the section on elements, elements are the building blocks of an XML document. An
element can be defined within an XSD as follows:
<xs:element name="x" type="y"/>
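Since a schema is itself an XML document, it can be read like any other. The following Python sketch, assuming the standard xml.etree.ElementTree module, lists every xs:element declaration in the contact schema together with its type. It only reads the schema and does not validate a document against it; the names XS and declared are illustrative.

```python
import xml.etree.ElementTree as ET

# ElementTree writes a namespaced tag as {namespace-uri}local-name.
XS = "{http://www.w3.org/2001/XMLSchema}"

schema_text = """<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="contact">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="company" type="xs:string" />
<xs:element name="phone" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# Collect the name and type of each declared element, in document order.
declared = [
    (element.get("name"), element.get("type"))
    for element in schema.iter(XS + "element")
]

for name, type_name in declared:
    print(name, type_name)
```

The contact element has no type attribute of its own because its type is given inline by the xs:complexType that follows.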
XML – Tree Structure
An XML document is always descriptive. The tree structure is often referred to as the XML
Tree and plays an important role in describing any XML document easily.
The tree structure contains root (parent) elements, child elements, and so on. By using the
tree structure, you can get to know all succeeding branches and sub-branches starting
from the root. The parsing starts at the root, then moves down the first branch to an
element, takes the first branch from there, and so on to the leaf nodes.
Example
The following example demonstrates a simple XML tree structure:
<?xml version="1.0"?>
<Company>
<Employee>
<FirstName>Tanmay</FirstName>
<LastName>Patil</LastName>
<ContactNo>1234567890</ContactNo>
<Email>[email protected]</Email>
<Address>
<City>Bangalore</City>
<State>Karnataka</State>
<Zip>560212</Zip>
</Address>
</Employee>
</Company>
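The walk from the root down to the leaf nodes can be sketched as a recursive function. This Python example uses the standard xml.etree.ElementTree module on a shortened version of the Company document; the function name walk and the shortened sample are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

document = """<Company>
<Employee>
<FirstName>Tanmay</FirstName>
<LastName>Patil</LastName>
<Address>
<City>Bangalore</City>
<State>Karnataka</State>
</Address>
</Employee>
</Company>"""

def walk(element, depth=0):
    """Return (depth, tag) pairs, visiting each branch before the next one."""
    visited = [(depth, element.tag)]
    for child in element:
        visited.extend(walk(child, depth + 1))
    return visited

tree = walk(ET.fromstring(document))

# Indent each name by its depth to make the tree shape visible.
for depth, tag in tree:
    print("  " * depth + tag)
```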

HTML vs. XML

• HTML is the markup language that helps you to create and design web content.
It has a variety of tags and attributes for defining the layout and structure of the
web document. It is designed to display data in a formatted manner. An HTML
document has the extension .htm or .html.
• XML is a markup language that is designed to store data. It is popularly used for
the transfer of data. It is case sensitive. XML offers you the ability to define markup
elements and generate a customized markup language. The basic unit in XML is
known as an element, and the extension of an XML file is .xml.

Table 4.2 compares common features of HTML and XML.


Parameter: Type of language
  XML: XML is a framework for specifying markup languages.
  HTML: HTML is a predefined markup language.
Parameter: Structural details
  XML: They are provided.
  HTML: They are not provided.
Parameter: Purpose
  XML: Transfer of data.
  HTML: Display / presentation of data.
Parameter: Nesting
  XML: It should be done appropriately.
  HTML: It does not have any effect on the code.
Parameter: Driven by
  XML: XML is content driven.
  HTML: HTML is format driven.
Parameter: Size
  XML: Documents are mostly lengthy in size, especially when an element-centric
       approach is used in formatting.
  HTML: The syntax is very brief and yields formatted text.
Parameter: Learning curve
  XML: It is very hard as you need to learn technologies like XPath, XML
       Schema, DOM, etc.
  HTML: It is a simple technology stack that is familiar to developers.
Parameter: Coding errors
  XML: No coding errors are allowed.
  HTML: Small errors are ignored.
Parameter: Extension
  XML: .xml, e.g. Page1.xml
  HTML: .html or .htm, e.g. Page1.html
Parameter: Whitespace
  XML: White spaces can be used in your code.
       Example: <Name> Motherland Ethiopia</Name>
       Output: Motherland Ethiopia
  HTML: White spaces cannot be used in your code.
       Example: <p>Motherland Ethiopia</p>
       Output: Motherland Ethiopia
Parameter: Language type
  XML: It is case sensitive.
       Example: <Name>Lucy</Name>
  HTML: It is case insensitive.
       Example: <STRong>Lucy</strONG>
Parameter: Tag
  XML: Tags are defined as per the need of the programmer.
       Example: <book><Author><Title>
  HTML: It has its own predefined tags.
       Example: <body><b><i>
Parameter: End of tags
  XML: The closing tag is essential in a well-formed XML document.
       Example: <Person><student><Name>Your name</Name></student></Person>
  HTML: The closing tag is not always required.
       Example: <body><P> This is paragraph </body>
Parameter: Quotes
  XML: Quotation marks are required around XML attribute values.
       Example: <Department><number type="int"> 101 </number></Department>
  HTML: Quotation marks are not required for the values of attributes.
       Example: <body bgcolor=#00ff00><p> This is paragraph </body>
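The difference in error tolerance can be observed directly. The sketch below feeds the same loosely written markup to Python's standard html.parser.HTMLParser and to the XML parser in xml.etree.ElementTree. The class name StartTagCollector is hypothetical, and html.parser is only one example of a tolerant HTML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Unquoted attribute value and a <p> that is never closed.
markup = "<body bgcolor=#00ff00><p> This is paragraph </body>"

class StartTagCollector(HTMLParser):
    """Record every start tag the HTML parser recognizes."""

    def __init__(self):
        super().__init__()
        self.started = []

    def handle_starttag(self, tag, attrs):
        self.started.append((tag, attrs))

# The HTML parser carries on despite the small errors.
collector = StartTagCollector()
collector.feed(markup)
collector.close()

# The XML parser refuses the same text.
try:
    ET.fromstring(markup)
    xml_rejected = False
except ET.ParseError:
    xml_rejected = True

print(collector.started)
print(xml_rejected)
```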

Advantages of using XML: It makes documents transportable, separates
data from HTML, and makes the platform change process flexible.
Disadvantages of using XML: It requires a processing application, its syntax is
sometimes confusing, it has no intrinsic data types, and its syntax is redundant.
Website publishing
Website publishing is the process of publishing the website’s original content
on the Internet, or specifically on a remote server.
Websites are published by uploading website content/files onto the remote
server which is provided by a hosting company or a web host.
SSL and HTTPS are protocols that provide security options that keep your
site safe and secure. HTTPS is encrypted and helps prevent interception and
tampering while the content is in transit. The website
requires an SSL certificate to enable HTTPS.

.3. Publishing Website
