
Zongmin Ma and Li Yan (Eds.)
Soft Computing in XML Data Management
Studies in Fuzziness and Soft Computing, Volume 255

Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: [email protected]

Further volumes of this series can be found on our homepage: springer.com

Vol. 238. Atanu Sengupta, Tapan Kumar Pal
Fuzzy Preference Ordering of Interval Numbers in Decision Problems, 2009
ISBN 978-3-540-89914-3

Vol. 239. Baoding Liu
Theory and Practice of Uncertain Programming, 2009
ISBN 978-3-540-89483-4

Vol. 240. Asli Celikyilmaz, I. Burhan Türksen
Modeling Uncertainty with Fuzzy Logic, 2009
ISBN 978-3-540-89923-5

Vol. 241. Jacek Kluska
Analytical Methods in Fuzzy Modeling and Control, 2009
ISBN 978-3-540-89926-6

Vol. 242. Yaochu Jin, Lipo Wang
Fuzzy Systems in Bioinformatics and Computational Biology, 2009
ISBN 978-3-540-89967-9

Vol. 243. Rudolf Seising (Ed.)
Views on Fuzzy Sets and Systems from Different Perspectives, 2009
ISBN 978-3-540-93801-9

Vol. 244. Xiaodong Liu and Witold Pedrycz
Axiomatic Fuzzy Set Theory and Its Applications, 2009
ISBN 978-3-642-00401-8

Vol. 245. Xuzhu Wang, Da Ruan, Etienne E. Kerre
Mathematics of Fuzziness – Basic Issues, 2009
ISBN 978-3-540-78310-7

Vol. 246. Piedad Brox, Iluminada Castillo, Santiago Sánchez Solano
Fuzzy Logic-Based Algorithms for Video De-Interlacing, 2010
ISBN 978-3-642-10694-1

Vol. 247. Michael Glykas
Fuzzy Cognitive Maps, 2010
ISBN 978-3-642-03219-6

Vol. 248. Bing-Yuan Cao
Optimal Models and Methods with Fuzzy Quantities, 2010
ISBN 978-3-642-10710-8

Vol. 249. Bernadette Bouchon-Meunier, Luis Magdalena, Manuel Ojeda-Aciego,
José-Luis Verdegay, Ronald R. Yager (Eds.)
Foundations of Reasoning under Uncertainty, 2010
ISBN 978-3-642-10726-9

Vol. 250. Xiaoxia Huang
Portfolio Analysis, 2010
ISBN 978-3-642-11213-3

Vol. 251. George A. Anastassiou
Fuzzy Mathematics: Approximation Theory, 2010
ISBN 978-3-642-11219-5

Vol. 252. Cengiz Kahraman, Mesut Yavuz (Eds.)
Production Engineering and Management under Fuzziness, 2010
ISBN 978-3-642-12051-0

Vol. 253. Badredine Arfi
Linguistic Fuzzy Logic Methods in Social Sciences, 2010
ISBN 978-3-642-13342-8

Vol. 254. Weldon A. Lodwick, Janusz Kacprzyk (Eds.)
Fuzzy Optimization, 2010
ISBN 978-3-642-13934-5

Vol. 255. Zongmin Ma, Li Yan (Eds.)
Soft Computing in XML Data Management, 2010
ISBN 978-3-642-14009-9

Zongmin Ma and Li Yan (Eds.)

Soft Computing in XML Data Management
Intelligent Systems from Decision Making
to Data Mining, Web Intelligence
and Computer Vision

Editors
Zongmin Ma
College of Information Science and Engineering
Northeastern University
3-11 Wenhua Road
Shenyang, Liaoning 110819
China
E-mail: [email protected]

Li Yan
School of Software
Northeastern University
3-11 Wenhua Road
Shenyang, Liaoning 110819
China

ISBN 978-3-642-14009-9 e-ISBN 978-3-642-14010-5

DOI 10.1007/978-3-642-14010-5

Studies in Fuzziness and Soft Computing ISSN 1434-9922

Library of Congress Control Number: 2010929475


© 2010 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilm or in any other
way, and storage in data banks. Duplication of this publication or parts thereof is
permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from
Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this
publication does not imply, even in the absence of a specific statement, that such
names are exempt from the relevant protective laws and regulations and therefore
free for general use.
Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
Printed on acid-free paper
springer.com
Preface

Being the de-facto standard for data representation and exchange over the Web,
XML (Extensible Markup Language) allows the easy development of applications
that exchange data over the Web. This creates a set of data management
requirements involving XML. XML and related standards have been extensively
applied in many business, service, and multimedia applications. As a result, a
large volume of data is managed today directly in XML format.
With the wide and in-depth use of XML in diverse application domains,
some particular data management needs emerge in concrete applications and
challenge current XML technology. The situation is very similar to the one in which
some database models and special-purpose database systems were developed so that
databases could manage diverse data well. In data- and
knowledge-intensive application systems, one of the challenges can be
generalized as the need to handle imprecise and uncertain information in XML
data management by applying fuzzy logic, probability, and more generally soft
computing. Currently, two kinds of situations are roughly identified in soft
computing for XML data management: applying soft computing to the intelligent
processing of classical XML data, and applying soft computing to the representation
and processing of imprecise and uncertain XML data. For the former, soft
computing can be used for flexible querying of XML documents as well as XML data
mining, XML duplicate detection, and so on. Additionally, it is crucial for Web-based
intelligent information systems to explicitly represent and process imprecise
and uncertain XML data with soft computing. This is because XML has been
extensively applied in many application domains that may involve a great deal of
imprecision and vagueness. Imprecise and uncertain data can be found, for
example, in the integration of data sources and in data generation by nontraditional
means (e.g., automatic information extraction and data acquisition by sensors and
RFID). XML has also been an important component of the Semantic Web
framework, and the Semantic Web provides Web data with well-defined meaning,
enabling computers and people to work better in cooperation.
Soft computing has been a crucial means of implementing machine
intelligence. Therefore, soft computing cannot be ignored if we are to bridge the
gap between human-understandable soft logic and machine-readable hard logic. We
believe that soft computing can play an important and positive role in
XML data management. Currently, the research and development of soft
computing in XML data management are attracting increasing attention.
This book covers in depth the fast-growing topic of techniques, tools
and applications of soft computing in XML data management. It shows how
XML data management (e.g., modeling, querying, integration) can be approached with
a soft computing focus. This book aims to provide a single account of current studies in
soft computing approaches to XML data management. The objective of the book
is to provide state-of-the-art information to researchers, practitioners, and
graduate students in Web intelligence, while at the same time serving the
information technology professional faced with non-traditional applications that
make the application of conventional approaches difficult or impossible.
This book, which consists of twelve chapters, is organized into three major
sections. The first section containing the first four chapters discusses the issues of
uncertainty in XML. The next four chapters, covering the flexibility in XML data
management supported by soft computing, comprise the second section. The third
section focuses on the developments and applications of soft computing in XML
data management in the final four chapters.
Chapter 1 proposes a general XML Schema definition for representing and
managing fuzzy information in XML documents. Different aspects of fuzzy
information are represented by starting from proposals coming from the classical
database context. Their datatype classifications are extended and integrated in
order to propose a complete and general approach for representing fuzzy
information in XML documents by using XML Schema. In particular, a fuzzy
XML Schema Definition is described taking into account fuzzy datatypes and
elements needed to fully represent fuzzy information.
Chapter 2 aims to satisfy the need of modeling complex objects with
imprecision and uncertainty in the fuzzy XML model and the fuzzy nested
relational database model. After presenting the fuzzy DTD model and the fuzzy
nested relational database model based on possibility distributions, the formal
approach is developed in order to map a fuzzy DTD model to a fuzzy nested
relational database schema.
Chapter 3 describes a fuzzy XML schema to represent an implementation of a
fuzzy relational database that allows for similarity relations and fuzzy sets. A flat
translation algorithm is provided to translate from the fuzzy database
implementation to a fuzzy XML document that conforms to the suggested fuzzy
XML schema. The proposed algorithm is implemented within VIREX. An
illustrative example is presented to show the power of VIREX in
converting fuzzy relational data into fuzzy XML.
Chapter 4 aims at automatically integrating data sources, using very simple
knowledge rules to rule out most of the nonsense possibilities, combined with
storing the remaining possibilities as uncertainty in the database and resolving
these during querying by means of user feedback. For this purpose, the chapter
introduces this “good is good-enough” integration approach and explains the
uncertainty model that is used to capture the remaining integration possibilities. It
is shown that using this strategy, the time necessary to integrate documents
drastically decreases, while the accuracy of the integrated document increases
over time.
Chapter 5 focuses on the retrieval of XML data from multiple heterogeneous
sources and proposes a new approach enabling the retrieval of meaningful answers
from different sources, by exploiting vague querying and approximate join
techniques. It consists essentially of first applying transformations to the original
query to obtain relaxed versions of it, each matching the schema adopted at a
single source, then using the relaxed queries to retrieve partial answers from each
source, and finally combining them using information about the retrieved objects. The
approach is experimentally validated and has proved effective in a P2P setting.
Chapter 6 presents a fuzzy-set-based extension to XQuery which allows users to
express preferences on XML documents and retrieves documents discriminated by
their satisfaction degree. This extension consists of the new xs:truth built-in data
type, intended to represent gradual truth degrees, as well as the xml:truth attribute
to handle satisfaction degrees in nodes of fuzzy XQuery expressions. The XQuery
language is extended to declare fuzzy terms and use them in query expressions.
Additionally, several kinds of expressions, such as FLWOR, are fuzzified. An evaluation
mechanism is presented in order to avoid superfluous calculation of truth degrees.
Chapter 7 describes the design and implementation of a fuzzy nested querying
system for XML databases. The research involved is outlined and examined to
decide on the most fitting solution that incorporates fuzziness into a user interface
intended to be attractive to naive users. The findings are applied via the
implementation of a prototype which covers the intended scope of a demonstration
of fuzzy nested querying. This prototype is integrated into VIREX (a user-friendly
system allowing users to view and use relational data as XML) and includes an
easy-to-use graphical interface that allows the user to apply fuzziness in order
to search XML documents more easily.
Chapter 8 focuses on fuzzy duplicate detection in XML data, a crucial task in
many applications such as data cleaning and data integration. Four algorithms
that have been proposed for XML fuzzy duplicate detection are described and
compared along two main dimensions, effectiveness and efficiency. A comparative
experimental evaluation performed on both artificial and real-world data is also
presented, showing the performance of these four algorithms.
Chapter 9 proposes a machine-readable fuzzy-EPC representation in XML
based on the EPC Markup Language (EPML) to conceptually represent fuzzy
business process models. It reports on the design of the Fuzzy-EPC compliant
schema and shows major syntactical extensions. A realistic example (sales order
checks) is sketched, showing that Fuzzy-EPML is able to serve as an adequate
interchange format for fuzzy business process models.
Chapter 10 aims to design and develop an XML based framework to represent
and merge the statistical information of clinical trials in XML documents. This
framework considers any valid clinical trial including trials with partial
information, and merges statistical information automatically with the potential to
add a component to extract clinical trials information automatically. A method is
developed to analyze inconsistencies among a collection of clinical trials and if
necessary to exclude any trials that are deemed to be illegible. Moreover, two sets
of clinical trials, trials on Type 2 diabetes and on neurocognitive outcomes after
off-pump versus on-pump coronary revascularisation, are used to illustrate the
framework.
Chapter 11 presents the main characteristics of a new fuzzy database, Aliança
(Alliance). The system is the union of fuzzy logic techniques, a relational database
management system and a fuzzy meta-knowledge base defined in XML. Aliança
accepts a wide range of data types, including all information already handled by
traditional databases, and incorporates different forms of representing
fuzzy data. The system uses XML to represent meta-knowledge. The use of XML
makes it easy to maintain and understand the structure of imprecise information.
Aliança is also designed to allow easy upgrading of traditional database systems.
The Aliança fuzzy database architecture brings interaction with
databases closer to the usual way in which humans reason.
Chapter 12 presents SUNRISE (System for Unified Network Routing, Indexing
and Semantic Exploration) for XML data sharing. Aiming at semantic
interoperability in heterogeneous networks, SUNRISE is a PDMS (Peer Data
Management System) infrastructure, which leverages the semantic approximations
originating from schemas’ heterogeneity for an effective and efficient organization
and exploration of the network. SUNRISE implements soft computing techniques
which cluster peers in Semantic Overlay Networks according to their own
contents, and promote the routing of queries towards the semantically best
directions in the network.

Acknowledgements

We wish to thank all of the authors for their insights and excellent contributions to
this book and would like to acknowledge the help of all involved in the collation
and review process of the book. Thanks go to all those who provided constructive
and comprehensive reviews. Thanks go to Janusz Kacprzyk, the series editor of
Studies in Fuzziness and Soft Computing, and Thomas Ditzinger, the senior editor
of Applied Sciences and Engineering of Springer-Verlag, for their support in the
preparation of this volume. The idea of editing this volume stems from our initial
research work which is supported by the National Natural Science Foundation of
China (60873010), the Fundamental Research Funds for the Central Universities
(N090504005 & N090604012) and Program for New Century Excellent Talents in
University (NCET-05-0288).

Northeastern University, China                                      Zongmin Ma
April 2010                                                          Li Yan
Contents

Part I: Uncertainty in XML

An XML Schema for Managing Fuzzy Documents
Barbara Oliboni, Gabriele Pozzani

Formal Translation from Fuzzy XML to Fuzzy Nested Relational Database Schema
Li Yan, Jian Liu, Z.M. Ma

Human Centric Data Representation: From Fuzzy Relational Databases into Fuzzy XML
Keivan Kianmehr, Tansel Özyer, Anthony Lo, Jamal Jida,
Alnaar Jiwani, Yasin Alimohamed, Krista Spence, Reda Alhajj

Data Integration Using Uncertain XML
Ander de Keijzer

Part II: Flexibility in XML Data Management

Exploiting Vague Queries to Collect Data from Heterogeneous XML Sources
Bettina Fazzinga

Fuzzy XQuery
Marlene Goncalves, Leonid Tineo

Attractive Interface for XML: Convincing Naive Users to Go Online
Keivan Kianmehr, Jamal Jida, Allan Chan, Nancy Situ, Kim Wong,
Reda Alhajj, Jon Rokne, Ken Barker

An Overview of XML Duplicate Detection Algorithms
Pável Calado, Melanie Herschel, Luís Leitão

Part III: Developments and Applications

Fuzzy-EPC Markup Language: XML Based Interchange Formats for Fuzzy Process Models
Oliver Thomas, Thorsten Dollmann

An XML Based Framework for Merging Incomplete and Inconsistent Statistical
Information from Clinical Trials
Jianbing Ma, Weiru Liu, Anthony Hunter, Weiya Zhang

Aliança: A Proposal for a Fuzzy Database Architecture Incorporating XML
Raquel D. Rodrigues, Adriano J. de O. Cruz, Rafael T. Cavalcanti

Leveraging Semantic Approximations in Heterogeneous XML Data Sharing Networks:
The SUNRISE Approach
Federica Mandreoli, Riccardo Martoglia, Wilma Penzo,
Simona Sassatelli, Giorgio Villani

Author Index


Part I: Uncertainty in XML

An XML Schema for Managing Fuzzy Documents

Barbara Oliboni and Gabriele Pozzani

Abstract. Topics related to fuzzy data have been investigated in the classical
database research field, and in recent years they have also attracted interest in
the XML data context. In this work, we consider issues related to the representation
and management of fuzzy data by using XML documents. We propose to represent
different aspects of fuzzy information by starting from proposals coming from the
classical database context. We extend and integrate their datatype classifications in
order to propose a complete and general approach for representing fuzzy
information in XML documents by using XML Schema. In particular, we describe a fuzzy
XML Schema Definition taking into account fuzzy datatypes and elements needed
to fully represent fuzzy information.

Barbara Oliboni
Department of Computer Science, University of Verona, Italy
e-mail: [email protected]
Gabriele Pozzani
Department of Computer Science, University of Verona, Italy
e-mail: [email protected]

Z. Ma & L. Yan (Eds.): Soft Computing in XML Data Management, STUDFUZZ 255, pp. 3–34.
springerlink.com © Springer-Verlag Berlin Heidelberg 2010

1 Introduction
Issues related to the representation, processing, and management of information in
a flexible way appear in several research areas (e.g., artificial intelligence, databases
and information systems, data mining, and knowledge representation). Requirements
related to fuzziness come from the observation that human reasoning is not
as exact and precise as computation in personal computers usually is. Humans do not
follow precise, invariable rules. Moreover, in some applications data come with
errors or are inherently imprecise since their values are subjective (e.g., values
representing customer satisfaction degrees). Thus, it has been natural for researchers
to try to incorporate flexible features into software. Hence, several proposals deal with
problems related to the representation and processing of imprecise data. Many of
them start from theories formulated by Zadeh [36].
Zadeh formalized notions related to fuzziness and uncertain data representation
by presenting a theory about fuzzy sets, possibility theory, and similarity relations.
These notions are the basic ones used in many proposals related to the representation
of imprecise data in classical databases for making them more flexible [13, 21, 25,
24, 27, 28, 40, 41].
As an example, fuzzy databases allow one to represent the uncertainty of physical
measures or subjective human preferences. On the other hand, fuzzy processing
of data allows one to answer queries by returning not only exactly matching data but
also data similar to the requested ones. In this way, the system is able to get around
errors in query formulation coming from user misunderstanding or from incomplete
information representation.
Among all proposals about fuzzy databases, we consider the GEFRED one [21],
which is based on generalized fuzzy domains and relations and allows one to
represent possibility distributions, similarity relations, linguistic labels and all other fuzzy
concepts and datatypes. The GEFRED model was extended by Galindo et al. [13]
to define a complete database system capable of managing fuzzy information. To
extend the GEFRED model, they define a fuzzy ER conceptual model, a fuzzy
relational database and an extended SQL language (FSQL) able to manage fuzzy
data.
In this work, we consider the model proposed by Galindo et al., and in particular
their fuzzy data types classification, as a starting point for classifying data types
needed to represent fuzzy information in XML documents.
Since XML is establishing itself as a standard for representing and exchanging
information on the net, topics related to the modeling of fuzzy data are
of considerable interest in the XML data context as well. A few proposals in the
literature deal with the representation of fuzzy information in XML documents
[14, 19, 20, 26] by considering different aspects.
In our proposal, we adopt the data types classification defined in [13] for the
relational database context, and adapt it to the XML data context. In order to
manage data types, differently from other related approaches, we choose to use XML
Schema [32] instead of DTD [23]. DTD is included in the XML 1.0 standard [23],
and thus it is widely used and supported in applications. However, DTD has some
limitations: it does not support newer XML features (e.g., namespaces), it lacks
some expressivity, and it uses a non-XML syntax to describe the grammar. All
these limitations are overcome by XML Schema [32]. XML Schema can be used
to express a set of rules to which an XML document must conform in order to be
considered “valid” (with respect to that schema), and provides an object-oriented
approach to the definition of XML elements and datatypes. Moreover, it is
compatible with other XML technologies like Web services, XQuery (for XML document
querying) and XSLT (for XML document presentation). Thus, we propose a general
approach for representing fuzzy information in XML documents by using XML
Schema. We describe a fuzzy XML Schema definition taking into account fuzzy
data types and elements needed to fully represent fuzzy information.
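
To make the point about datatypes concrete, the following minimal Python sketch
shows the kind of rule that XML Schema can state and DTD cannot: a simple type
restricted to the unit interval. The type name membershipDegree and the helper
functions are hypothetical and are not taken from the schema proposed in this
chapter; the check is done by hand with the standard library and is not a real
XML Schema validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: a decimal restricted to [0, 1].
# The type name "membershipDegree" is an assumption made for this sketch.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:simpleType name="membershipDegree">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="1"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def read_bounds(schema_text, type_name):
    """Read the minInclusive/maxInclusive facets of a named simple type."""
    ns = {"xs": XS}
    root = ET.fromstring(schema_text)
    simple_type = root.find(f"xs:simpleType[@name='{type_name}']", ns)
    restriction = simple_type.find("xs:restriction", ns)
    low = float(restriction.find("xs:minInclusive", ns).get("value"))
    high = float(restriction.find("xs:maxInclusive", ns).get("value"))
    return low, high


def conforms(value_text, bounds):
    """Hand-rolled range check standing in for real schema validation."""
    try:
        value = float(value_text)
    except ValueError:
        return False
    low, high = bounds
    return low <= value <= high


bounds = read_bounds(SCHEMA, "membershipDegree")
print(conforms("0.7", bounds), conforms("1.5", bounds))
```

Because the schema is itself an XML document, it can be read with the same tools
as the data it describes, which is one of the advantages over the non-XML syntax
of DTD mentioned above.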
Our proposal for an XML Schema able to represent fuzzy data can be used by
any organization or system managing uncertain data. These users may need to
exchange fuzzy information among different subsystems, locally or over
the net, and the use of fuzzy XML documents may represent a good solution.
Moreover, fuzzy XML documents can be used by these systems as a storage method
for collected fuzzy data. Since there are currently no DBMSs implementing fuzzy
capabilities, and the development of a fuzzy extension for an existing DBMS may
require too much effort, fuzzy XML documents can represent a simple way to store and
manage fuzzy information, as already happens for classical data. Our proposal can
help in organizing these data by providing a common and complete reference Schema
for representing fuzzy data.
This work is structured as follows: in Section 2 we present some background
notions useful to better understand the context of this proposal. In Section 3 we present
our proposal of an XML Schema definition introducing new fuzzy datatypes and
elements needed to represent fuzzy information in an XML document. In Section 4 we
give an example of an XML document satisfying the proposed Schema, by
considering information managed by a weather station. In Section 5 we further extend the
proposed Schema, allowing the representation of some information useful during the
fuzzy processing of an XML document. Some examples of this fuzzy processing
information are illustrated in Section 6. In Section 7 we discuss how a classical
XML document can be changed in order to comply with our fuzzy XML Schema
proposal and be able to represent fuzzy data. In Section 8 we give a brief description
of other approaches presented in the literature about representation and querying of
fuzzy XML documents. Finally, in Section 9 we sketch some conclusions and future
research directions.

2 Background
In this section we briefly report some background notions on fuzziness, on relational
databases dealing with fuzzy data, and on XML.
Several proposals deal with the representation of uncertain data in databases. The
relational approach [6, 7, 8] introduced the NULL value in order to represent
unknown attribute values (i.e., no value is applicable or all values in the domain
are possible). The NULL value introduces a three-valued logic. Later on, for example in
the Umano-Fukami model [27, 28], the NULL value was further differentiated by introducing
the fuzzy values UNKNOWN, UNDEFINED and NULL. UNKNOWN means that any
value in the domain is possible, UNDEFINED means that none of the values in the
domain is possible, and NULL (which is different from the null pointer) means that we do
not know anything; in other words, the value may be either undefined or unknown.
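
The three special values can be sketched in Python as follows. This is only one
possible reading of the description above, assuming that each value is encoded
as a mapping from candidate values to possibility degrees; the function names
and the example domain are hypothetical.

```python
def unknown(domain):
    """UNKNOWN: every value in the domain is fully possible."""
    return {value: 1.0 for value in domain}


def undefined(domain):
    """UNDEFINED: no value in the domain is possible."""
    return {value: 0.0 for value in domain}


# NULL: total ignorance, the attribute may be either UNKNOWN or UNDEFINED.
# Encoding it over the two labels is an assumption made for this sketch.
NULL = {"UNKNOWN": 1.0, "UNDEFINED": 1.0}


def possible_values(distribution):
    """Return the values whose possibility degree is greater than zero."""
    return {value for value, degree in distribution.items() if degree > 0}


# Hypothetical attribute domain used only for illustration.
sky = ["sunny", "cloudy", "rainy"]
print(possible_values(unknown(sky)))    # all three values
print(possible_values(undefined(sky)))  # empty set
```

Under this encoding, UNKNOWN and UNDEFINED are distributions over the attribute
domain, whereas NULL is a distribution over the two situations themselves.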
However, more systematic approaches to fuzzy databases started from the notion
of fuzzy set and other related notions.
The definition of fuzzy set, introduced by Zadeh in [36], is based on the classical
notion of set and extends it to introduce flexibility. In the classical definition, a set S
6 B. Oliboni and G. Pozzani

on a domain D is defined as a boolean function μ : D → {0, 1} that says us whether


an object in D belongs (1) or not (0) to S; μ is called the membership function of S.
The membership function associated to a fuzzy set F is a function μF : D → [0, 1]
valued in the real unit interval [0, 1]. Thus, in a fuzzy set, each object in D belongs to
the set with a certain degree; this means that each object is related to a membership
degree.
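The difference between the two membership functions can be sketched in Python. This is only an illustration: the fuzzy set "tall" and its thresholds (160 cm and 190 cm) are hypothetical and do not come from the proposal.

```python
def crisp_membership(x, members):
    """Classical set: 1 if x belongs to the set, 0 otherwise."""
    return 1 if x in members else 0


def tall_membership(height_cm):
    """Hypothetical fuzzy set 'tall': the degree grows linearly from 160 cm to 190 cm."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30


print(crisp_membership(3, {1, 2, 3}))  # an element either belongs or does not
print(tall_membership(175))            # an element belongs with a degree in [0, 1]
```

In the crisp case the result is always 0 or 1, while in the fuzzy case any value of the unit interval may be returned.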
In 1971 Zadeh introduced the notion of similarity relation [37]: given a set of
objects, a similarity relation defines the similarity degree between any pair of objects,
i.e., how similar two objects are to each other. By using similarity
relations, users can retrieve not only a requested object but also the similar ones,
introducing fuzziness in queries. The use of similarity relations inside the relational model
was introduced in the Buckles-Petry Model [4] to bring fuzzy capabilities to relational
databases.
Moreover, in [38], Zadeh extended fuzzy set theory by introducing possibility
theory, an alternative to probability theory. This notion was further extended
by Dubois and Prade in [11] and subsequent work. A possibility distribution is based
on the relationship between the notions of linguistic variable and fuzzy set. A possibility
distribution is determined by the question “Is x A?” where A is a fuzzy set on domain
X and x is a variable on X. The use of possibility theory in the relational model
was introduced in three main models: the Prade-Testemale model [25, 24], the
Umano-Fukami model [27, 28] and the Zemankova-Kandel model [40, 41].
All the above fuzzy approaches and models have been joined in the GEFRED model
of Medina, Pons and Vila [21]. The GEFRED model is based on generalized fuzzy
domains and relations, which extend classical domains and relations and allow one
to represent possibility distributions, similarity relations, linguistic labels and other
fuzzy concepts and datatypes.
The GEFRED model was extended by Galindo et al. [13] by defining a com-
plete database system able to manage fuzzy information. Extending the GEFRED
model they define a fuzzy ER conceptual model, a fuzzy relational database and an
extended SQL language (FSQL) capable of managing fuzzy data. In particular they
define new fuzzy datatypes that allow one to store fuzzy values in database tables
and fuzzy degrees which allow one to incorporate other uncertainty information
with several meanings. Moreover they store some meta-data about fuzzy objects in
auxiliary tables called Fuzzy Metaknowledge Base (FMB).
In this work, we will start from the GEFRED model for defining a suitable
approach to represent fuzzy information in XML documents.
XML (eXtensible Markup Language) [23] is a markup language introduced as
a simplified subset of SGML (Standard Generalized Markup Language) [16] by
the World Wide Web Consortium (W3C) [29]. XML is the de facto standard for
describing and exchanging data between different systems and applications over the
Internet. XML is extensible because it supports user-defined elements and datatypes.
The grammar for tags in an XML document is defined in a DTD (Document Type
Definition) [23] to which the XML document must refer. The elements in an XML
document, related to a given DTD, must respect the DTD itself.
An XML Schema for Managing Fuzzy Documents 7

DTD is included in the XML 1.0 standard, and thus it is widely used and supported
in applications. However, DTD has some limitations: it does not support newer
XML features (e.g., namespaces), it has limited expressivity, and it uses a
non-XML syntax to describe the grammar.
All these limitations are overcome by the XML Schema [32] (also called XML
Schema Definition, XSD). XML Schema can be used to express a set of rules to
which an XML document must conform in order to be considered “valid” (with
respect to that schema). XML Schema provides an object oriented approach to the
definition of XML elements and datatypes. Moreover it is compatible with other
XML technologies like Web services, XQuery (for XML documents querying) [31]
and XSLT (for XML documents presentation) [33].
Our proposal deals with the representation of fuzzy data in XML documents,
is based on the extended version of the GEFRED model proposed by Galindo
et al. [13], and uses XML Schema.

3 XML Schemata for Fuzzy Information


In this section we propose a fuzzy XML Schema Definition containing the new
fuzzy datatypes and elements needed to represent fuzzy information, according to
the extended GEFRED relational data model [13]. In particular, we define appro-
priate XML schemata for fuzzy datatypes and degrees and for the related auxiliary
information stored in the Fuzzy Metaknowledge Base (FMB).
The definition of an XML Schema may be divided into several related schemata.
Each Schema may refer to other schemata by introducing a different namespace for
each of them. Namespaces allow one to refer to and use objects defined in different
schemata by specifying their locations. Moreover, namespaces allow one to distinguish
between different elements with the same name but with different definitions, locations,
and semantics. Each namespace corresponds to a different XML Schema, so that
the system can retrieve the correct definition for each element. Fig. 1
depicts relationships among the XML schemata constituting the proposed overall
schema. Each line represents a reference to a Schema inside another one. Note that
the Schema base.xsd is defined just once but it is referenced by all the other
second-level schemata.
In XML documents, data are represented in a structured way and their structure is
defined by related XML schemata. For example, if we consider an XML document
obtained from a database, its XML Schema may define that tuples are represented in
elements called record and that they are arranged in an element named after the
table. In this work we focus only on the structure of fuzzy information, supposing
the user already has a general XML Schema defining the structure of other crisp
parts of the document.
In the following sections we analyse all parts of the XML Schema we propose
for managing fuzzy information.

[Figure: FleXchema.xsd references FuzzyOrdType.xsd, FuzzyNonOrdSimType.xsd, FuzzyNonOrdType.xsd, degrees.xsd, FMB.xsd, and processing.xsd; each of these references base.xsd]
Fig. 1 Reference relations between proposed XML schemata

3.1 The Root Schema


FleXchema.xsd is the main file of the proposed schema. It defines the general
structure of fuzzy datatypes, FMB (see Section 3.7), and processing information (see
Section 5), recalling definitions given in several different files that we will analyse
in the following sections.
First of all, we introduce the definitions of the four fuzzy datatypes that our XML
Schema proposal allows one to represent:
1. classical crisp (non fuzzy) data marked to be processed with fuzzy operations,
represented by datatype ClassicType;

<xs:complexType name="ClassicType">
<xs:sequence>
<xs:any namespace="http://www.w3.org/2001/XMLSchema"
minOccurs="1" maxOccurs="1" />
</xs:sequence>
<xs:attribute name="info" type="xs:IDREF" use="required" />
<xs:attribute name="type" type="xs:string" fixed="T1"
use="required" />
</xs:complexType>

2. imprecise data over an ordered underlying domain, represented by datatype
FuzzyOrdType (see Section 3.3);

<xs:complexType name="FuzzyOrdType">
<xs:sequence>
<xs:any namespace="http://stars.sci.univr.it/FuzzyOrdType"
minOccurs="1" maxOccurs="1" />
</xs:sequence>
<xs:attribute name="info" type="xs:IDREF" use="required" />
<xs:attribute name="type" type="xs:string" fixed="T2"
use="required"/>
</xs:complexType>

3. imprecise data over a discrete nonordered domain and related by a similarity
relation, represented by datatype FuzzyNonOrdSimType (see Section 3.4);

<xs:complexType name="FuzzyNonOrdSimType">
<xs:sequence>
<xs:any minOccurs="1" maxOccurs="1"
namespace="http://stars.sci.univr.it/FuzzyNonOrdSimType"/>
</xs:sequence>
<xs:attribute name="info" type="xs:IDREF" use="required" />
<xs:attribute name="type" type="xs:string" fixed="T3"
use="required"/>
</xs:complexType>

4. imprecise data over a discrete nonordered domain and not related by a similarity
relation, represented by datatype FuzzyNonOrdType (see Section 3.5).

<xs:complexType name="FuzzyNonOrdType">
<xs:sequence>
<xs:any
namespace="http://stars.sci.univr.it/FuzzyNonOrdType"
minOccurs="1" maxOccurs="1" />
</xs:sequence>
<xs:attribute name="info" type="xs:IDREF" use="required" />
<xs:attribute name="type" type="xs:string" fixed="T4"
use="required"/>
</xs:complexType>

Each datatype is defined as an XML complexType with two required attributes.
The first attribute (info) is an IDREF referring to the element in the FMB part of
the document containing the meta-information (see Section 3.7) about the
fuzzy object concerned. The second attribute (type) is a fixed string encoding the datatype of
the considered element. The possible codings we define for the datatypes are:
• T1 for ClassicType datatype;
• T2 for FuzzyOrdType datatype;
• T3 for FuzzyNonOrdSimType datatype;
• T4 for FuzzyNonOrdType datatype.

This fixed attribute allows us to distinguish between the different fuzzy classes of
datatypes. Some fuzzy datatypes (e.g., possdistr, null, unknown) are defined
in several classes and we may need a way to distinguish them in order to process
them in different ways.
Finally, each datatype contains a subelement representing the actual fuzzy data.
These subelements are defined by using the any XML element, and each one allows
one to insert an element selected from a different, referenced namespace. Each namespace
is defined in another external XML Schema. In particular, the any subelement
in ClassicType refers to the basic XML Schema provided by the W3C [29]. In this
way, it is possible to specify any value of the classical crisp datatypes (e.g. strings,
integers, timestamps). Subelements in the other three datatypes refer to namespaces
defined in different XML schemata proposed by us and explained in the following
sections.
To better understand how these definitions may be used, let us consider the following
example. It represents a classical crisp datum containing the name of a customer,
where type="T1" means that the name is crisp data, and info="ABC"
means that the related meta-information is contained in the FMB element with ID
ABC.
<name type="T1" info="ABC">
John
</name>
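As a minimal sketch of how a consumer might use the two attributes, the following Python fragment parses the element above with the standard library and dispatches on the type code. The dictionary and variable names are our own illustrative choices; only the T1 to T4 codes and the attribute names come from the schema.

```python
import xml.etree.ElementTree as ET

# Codes of the fixed "type" attribute, mapped to the complexType names of the root schema.
FUZZY_CLASSES = {
    "T1": "ClassicType",
    "T2": "FuzzyOrdType",
    "T3": "FuzzyNonOrdSimType",
    "T4": "FuzzyNonOrdType",
}

element = ET.fromstring('<name type="T1" info="ABC">John</name>')

datatype = FUZZY_CLASSES[element.get("type")]  # which class of fuzzy datatype this is
fmb_id = element.get("info")                   # ID of the meta-information in the FMB
value = element.text.strip()                   # the crisp content itself

print(datatype, fmb_id, value)
```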

Up to now, we have defined datatypes able to represent the structure of the fuzzy
information. Finally, the main Schema introduces elements defining the structure
of new particular parts of a fuzzy XML document. These elements delineate the
structure of the FMB and processing information. FMB is a sequence of (in some
cases, optional) elements, each one describing a different kind of meta-information (see
Section 3.7). Meta-information includes label definitions, default margins for
approximate values, and similarity relations.

<xs:element name="FMB">
<xs:complexType>
<xs:sequence minOccurs="0" maxOccurs="1">
<xs:element ref="xsfmb:fcl" minOccurs="1" maxOccurs="1"/>
<xs:element ref="xsfmb:labelDefs" minOccurs="0"
maxOccurs="1"/>
<xs:element ref="xsfmb:fam" minOccurs="0" maxOccurs="1"/>
<xs:element ref="xsfmb:simRelDefs" minOccurs="0"
maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>

Finally, the root Schema file, FleXchema.xsd, defines the processInfo
element. It is a sequence of (optional) qualifier and quantifier definitions. We will
describe their definition and usage in Section 5. In particular we will see that they
are useful during the fuzzy information processing.

<xs:element name="processInfo" minOccurs="0" maxOccurs="1">
<xs:complexType>
<xs:sequence>
<xs:element ref="xsproc:qualifiers" minOccurs="0"
maxOccurs="1"/>
<xs:element ref="xsproc:quantifiers" minOccurs="0"
maxOccurs="1"/>
</xs:sequence>
</xs:complexType>
</xs:element>

3.2 Basic Datatypes


In the base namespace four basic datatypes, needed in all other namespaces, are
defined.
The simpleType probType represents the type of a probabilistic datum; hence it
is defined as a decimal value in the range [0, 1].
<xs:simpleType name="probType">
<xs:restriction base="xs:decimal">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="1"/>
</xs:restriction>
</xs:simpleType>
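The restriction above can be mirrored outside a schema validator. The following Python sketch, whose function name is our own, first checks the lexical form of a decimal (optional sign, digits, optional fractional part, no exponent) and then the two inclusive bounds. It is an approximation for illustration, not a replacement for schema validation.

```python
import re
from decimal import Decimal

# Lexical form of a decimal literal: optional sign, digits, optional fraction, no exponent.
_DECIMAL_LITERAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_valid_prob(text):
    """Return True if text is a decimal literal in the closed range [0, 1]."""
    literal = text.strip()  # surrounding whitespace is ignored
    if not _DECIMAL_LITERAL.fullmatch(literal):
        return False
    return Decimal(0) <= Decimal(literal) <= Decimal(1)


print(is_valid_prob("0.9"), is_valid_prob("1.5"))
```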

The datatype labelRefType represents a reference to the ID of a label
definition contained in the FMB. It is essentially a renaming of the IDREF datatype
(defined by W3C) given in order to clarify the meaning of some attributes used in
other XML schemata.
<xs:complexType name="labelRefType">
<xs:attribute name="label_id" type="xs:IDREF"/>
</xs:complexType>

The datatype ftype is the set of integer values in the range [1, 7]. It is used in the
FMB definition in order to keep information about the fuzzy type of a fuzzy object
(see Section 3.7).
<xs:simpleType name="ftype">
<xs:restriction base="xs:positiveInteger">
<xs:minInclusive value="1"/>
<xs:maxInclusive value="7"/>
</xs:restriction>
</xs:simpleType>

Finally, datatype any defines a shorthand for the any element defined by the
W3C and referring to any element and type already defined in the W3C namespace.

<xs:complexType name="any">
<xs:sequence>
<xs:any namespace="http://www.w3.org/2001/XMLSchema"
minOccurs="1" maxOccurs="1" />
</xs:sequence>
</xs:complexType>

3.3 Fuzzy Data over an Ordered Domain


The FuzzyOrdType.xsd file contains the definition of the fuzzy datatypes representing
imprecise data over an ordered underlying domain. As happens in most systems
allowing null values, the null value can be compared with any other type of
data. The same happens in our proposal, where values unknown, undefined, and
null are defined both on ordered underlying domains and on non-ordered underlying
domains. Hence, their definitions are present in this namespace, defining fuzzy
ordered datatypes, and in the following ones, defining fuzzy non-ordered datatypes.
The duplication of these definitions is needed because in some cases we have to
process these special values in a different way on the basis of their datatype class
(i.e., on the underlying domain).
<xs:element name="unknown" />
<xs:element name="undefined" />
<xs:element name="null" />

For the same reason, in FuzzyOrdType we also allow one to introduce any
crisp data (on an ordered domain).
<xs:element name="crisp" type="xsb:any" />

The namespace with prefix xsb refers to the XML Schema base.xsd reported
in the previous section.
We define that fuzzy data over an ordered domain can include:
• Linguistic labels. A label is used through an IDREF to its definition. This definition,
given as a name and possibly a trapezoidal form, is reported in the FMB
part of the XML document (see Section 3.7). The choice to use IDREFs, storing
label definitions in the FMB, reduces the data redundancy in XML documents
but, on the other hand, requires a more complex data processing for querying
XML data.
<xs:element name="label" type="xsb:labelRefType"/>

• Trapezoidal values. Trapezoidal values allow us to represent continuous possibility
distributions defined by four decimal values [α, β, γ, δ] (see Fig. 2(a)). Values
between β and γ have possibility degree equal to one, values less than or equal
to α or greater than or equal to δ have possibility degree equal to zero, and
values in the ranges [α, β] and [γ, δ] have possibility degrees defined respectively by
the lines connecting the two values. We will see that labels also have a trapezoidal
definition; however, trapezoidal values allow us to define a trapezoidal distribution
without having a label for it. Note that trapezoidal distributions are a general
case of interval values and triangular distributions.

[Figure: three panels, each with the possibility degree from 0 to 1 on the vertical axis: (a) Trapezoidal distribution over α, β, γ, δ; (b) Interval over lb and ub; (c) Triangular distribution around d with the given margin]
Fig. 2 Continuous possibility distributions on an ordered domain

<xs:element name="trapezoidal">
<xs:complexType>
<xs:sequence>
<xs:element name="alpha" type="xs:decimal"/>
<xs:element name="beta" type="xs:decimal"/>
<xs:element name="gamma" type="xs:decimal"/>
<xs:element name="delta" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>

• Intervals. Intervals are special cases of trapezoidal values where α = β and γ = δ.
They are then defined by two decimal values, in the Schema named lb and ub,
such that all values in the range [lb, ub] (see Fig. 2(b)) have possibility degree
equal to one, while all other values have possibility degree equal to zero.
<xs:element name="interval">
<xs:complexType>
<xs:sequence>
<xs:element name="lb" type="xs:decimal" />
<xs:element name="ub" type="xs:decimal" />
</xs:sequence>
</xs:complexType>
</xs:element>

• Approximate values. Approximate values represent triangular possibility distributions.
They are defined by a central value d and a margin value around the
central value (see Fig. 2(c)). Hence, a triangular distribution is a special case of a
trapezoidal one where β = γ and where α and δ are equidistant from the central
value. Only value d has possibility degree equal to one. All values outside the
range [d − margin, d + margin] have possibility degree equal to zero. In an approximate
value the margin can be omitted; in this case we use the default margin
stored in the FMB (see Section 3.7).

<xs:element name="approxvalue">
<xs:complexType>
<xs:sequence>
<xs:element name="d" type="xs:decimal" />
<xs:element name="margin" type="xs:decimal"
minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>

• Possibility distributions. The XML element possdistr allows one to define a
discrete possibility distribution represented as a finite set (with no upper bound on
its cardinality) of pairs (p, d) meaning that value d has possibility degree equal
to p. We do not wrap each pair inside an ad hoc element because we can recognize
pairs correctly by reading elements two by two. The d value may be of any
datatype on an ordered domain; however, the system must check that all values
inside the same possibility distribution have the same type. The possibility degree p
has type probType, defined in the base namespace, whose possible values,
we recall, are in the range [0, 1].
<xs:element name="possdistr">
<xs:complexType>
<xs:sequence maxOccurs="unbounded" minOccurs="1">
<xs:element name="p" type="xsb:probType"/>
<xs:element name="d" type="xsb:any"/>
</xs:sequence>
</xs:complexType>
</xs:element>
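The continuous distributions of this section can be evaluated with one function, since intervals and approximate values are special cases of the trapezoid. The following Python sketch follows the definitions given above; the function names are our own, and the sample trapezoid [24, 25, 26, 27] is purely illustrative.

```python
def trapezoidal(alpha, beta, gamma, delta):
    """Return the possibility function of the trapezoid [alpha, beta, gamma, delta]."""
    def possibility(x):
        if beta <= x <= gamma:
            return 1.0                            # core of the distribution
        if x <= alpha or x >= delta:
            return 0.0                            # outside the support
        if x < beta:
            return (x - alpha) / (beta - alpha)   # rising edge between alpha and beta
        return (delta - x) / (delta - gamma)      # falling edge between gamma and delta
    return possibility


def interval(lb, ub):
    """Interval [lb, ub]: the trapezoid with alpha = beta and gamma = delta."""
    return trapezoidal(lb, lb, ub, ub)


def approxvalue(d, margin):
    """Triangle centred on d: the trapezoid with beta = gamma = d."""
    return trapezoidal(d - margin, d, d, d + margin)


temperature = trapezoidal(24, 25, 26, 27)
print(temperature(24.5), temperature(25.5), temperature(27))
```

The core test comes first so that the degenerate edges of intervals and triangles never lead to a division by zero.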

3.4 Fuzzy Data over a Nonordered Domain with Similarity Relations
The datatype FuzzyNonOrdSimType defines the possible values of fuzzy objects over
a nonordered domain. As we said in the previous section, possible values in this
datatype include unknown, undefined, and null values, defined exactly as for
the ones on an ordered domain.
This datatype allows one to define possibility distributions composed of pairs
(p, d) where d is a label whose possibility degree is p. The d XML element is
defined as a reference to a label whose definition is contained in the FMB. Note
that, since the underlying domain is nonordered, these labels do not have a trapezoidal
definition (this constraint must be checked by the system). Moreover, values (represented
by labels) are related by a similarity relation. For this reason the XML element
possdistr in this Schema also has a required IDREF attribute (simRel)
referring to a similarity relation defined in the FMB.

<xs:element name="possdistr">
<xs:complexType>
<xs:sequence maxOccurs="unbounded" minOccurs="1">
<xs:element name="p" type="xsb:probType"/>
<xs:element name="d" type="xsb:labelRefType"/>
</xs:sequence>
<xs:attribute name="simRel" type="xs:IDREF" use="required" />
</xs:complexType>
</xs:element>
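Since pairs are not wrapped in their own element, a reader has to regroup the flat sequence p, d, p, d, and so on. The following Python sketch shows one way to do it. The instance fragment is hypothetical: the prefix t3, the label ids and the degrees are illustrative, and we assume that the child elements are namespace-qualified.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment; the namespace URI is the one named in the
# xs:any wildcard of FuzzyNonOrdSimType, everything else is illustrative.
SAMPLE = """
<t3:possdistr xmlns:t3="http://stars.sci.univr.it/FuzzyNonOrdSimType" simRel="SR1">
  <t3:p>0.8</t3:p><t3:d label_id="S"/>
  <t3:p>0.3</t3:p><t3:d label_id="C"/>
</t3:possdistr>
"""


def read_pairs(possdistr):
    """Group the flat p, d, p, d, ... children into (degree, label id) pairs."""
    children = list(possdistr)
    if len(children) % 2 != 0:
        raise ValueError("possdistr must contain an even number of children")
    pairs = []
    for p, d in zip(children[0::2], children[1::2]):
        pairs.append((float(p.text), d.get("label_id")))
    return pairs


root = ET.fromstring(SAMPLE)
print(root.get("simRel"), read_pairs(root))
```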

3.5 Fuzzy Data over a Nonordered Domain without Similarity Relations
The datatype FuzzyNonOrdType is very similar to the previous one, FuzzyNonOrdSimType.
It represents fuzzy values over a nonordered domain, including
unknown, undefined, and null values, and possibility distributions. However,
unlike the previous datatype, in this case values in a possibility distribution
are not related by a similarity relation. For this reason, the element possdistr
does not include an attribute referring to a similarity relation definition in the FMB.
Hence, possibility distributions are defined just on labels without a trapezoidal definition.
The use of these labels depends only on the application and its semantics.
<xs:element name="possdistr">
<xs:complexType>
<xs:sequence maxOccurs="unbounded" minOccurs="1">
<xs:element name="p" type="xsb:probType"/>
<xs:element name="d" type="xsb:labelRefType"/>
</xs:sequence>
</xs:complexType>
</xs:element>

3.6 Fuzzy Degrees


Another way to incorporate uncertainty in classical databases consists in the use of
degrees. The most common use of a degree is the membership degree associated
to each instance of a tuple. The membership degree says how much the instance
belongs to the tuple. However, other kinds of degree have been proposed in the
literature. For example, the tuple degree may represent the fulfillment degree of a
condition [21], the importance degree [2], the possibility degree or the uncertainty
degree [28]. Any fuzzy data model makes a different choice in the interpretation of
degrees.
In [13], Galindo et al. classify the degrees with respect to their use instead of with
respect to their meaning. A first classification distinguishes between associated and
nonassociated degrees. The former apply their value to one or more attributes, while the
latter (FuzzyNonAssDegree) represent imprecise information without associating
it with another attribute. Moreover, Galindo et al. classify the associated degrees
into degrees associated to one attribute, to a set of attributes, and to a whole tuple
(FuzzyInstDegree).
Since degrees associated to one attribute are a particular case of degrees associated
to a set of attributes where the set is a singleton, we chose to represent only the
latter (FuzzyAttrDegree). Thus, our Schema allows the definition of three kinds of
degrees: FuzzyAttrDegree, FuzzyInstDegree, and FuzzyNonAssDegree.
• FuzzyAttrDegree introduces fuzzy degrees associated to one or more attributes
of an entity instance. They are defined as an extension of the probType introduced
in the base namespace. Thus, they include a possibility value (in the
range [0, 1]). Moreover, in order to keep information about the attributes to which
a degree is associated, it has an IDREFS attribute (refTo) that refers to the IDs
of these elements. These ID references refer to the FMB definition of the elements
(see Section 3.7). In order to retrieve the actual values to which the degree
is associated we must find the sibling elements of the degree in this tuple that
have the same IDREF. Note that this query is supported by XPath [30]. Finally,
each associated degree includes an info IDREF attribute referring to the
meta-information in the FMB about its definition.
<xs:complexType name="fuzzyAttrDegree">
<xs:simpleContent>
<xs:extension base="xsb:probType">
<xs:attribute name="refTo" type="xs:IDREFS"
use="required" />
<xs:attribute name="info" type="xs:IDREF"
use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>

• FuzzyInstDegree represents degrees associated to the whole instance of an entity;
thus they do not need to refer to something, and are just reported as a child of the
instance with which they are associated. Their definition is equal to the one for
degrees associated to attributes but without the attribute refTo.
<xs:complexType name="fuzzyInstDegree">
<xs:simpleContent>
<xs:extension base="xsb:probType">
<xs:attribute name="info" type="xs:IDREF"
use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>

Table 1 ftype encoding

ftype  fuzzy object
1      classical crisp data
2      fuzzy data over an ordered domain
3      fuzzy data over a nonordered domain with similarity relation
4      fuzzy data over a nonordered domain without similarity relation
5      degree associated to attributes
6      instance degree
7      non associated degree

• FuzzyNonAssDegree represents degrees that are associated neither to attributes
nor to an instance. They are reported inside instances of an entity, but their
meaning is not fixed in advance, but can be specified by the user in the string
element meaning. Like the other kinds of degrees, non-associated degrees also
include a possibility value F and an IDREF attribute needed to retrieve the
meta-information about degrees in the FMB. The choice to include the meaning inside
degrees, instead of inside their meta-information, allows the user to retrieve
the meaning of degrees more easily, reducing the data processing complexity.
<xs:complexType name="fuzzyNonAssDegree">
<xs:sequence>
<xs:element name="F" type="xsb:probType"/>
<xs:element name="meaning" type="xs:string"/>
</xs:sequence>
<xs:attribute name="info" type="xs:IDREF" use="required" />
</xs:complexType>
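The lookup of the elements to which an associated degree applies can be sketched with the XPath subset offered by the Python standard library. The record below is a simplified, hypothetical fragment without namespace prefixes, and the function name is our own.

```python
import xml.etree.ElementTree as ET

# Simplified record without namespace prefixes (an assumption made to keep the sketch short).
RECORD = ET.fromstring("""
<record>
  <temp type="T2" info="Te1"><unknown/></temp>
  <accuracy refTo="Te1" info="D1">1</accuracy>
  <weather type="T3" info="W1"><undefined/></weather>
</record>
""")


def associated_elements(record, degree):
    """Return the siblings of degree whose info IDREF appears in its refTo list."""
    targets = degree.get("refTo", "").split()  # IDREFS is a whitespace-separated list
    found = []
    for ref in targets:
        matches = record.findall(f"*[@info='{ref}']")
        found.extend(e for e in matches if e is not degree)
    return found


accuracy = RECORD.find("accuracy")
print([e.tag for e in associated_elements(RECORD, accuracy)])
```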

3.7 The Fuzzy Metaknowledge Base


The Fuzzy Metaknowledge Base (FMB) of an XML document contains the meta-
information about all fuzzy objects defined and used in the document.
The main FMB information is contained in the fcl (fuzzy column list) element,
which reports basic and common information about all elements that may contain
fuzzy data. Information about each fuzzy object is contained in an fc (fuzzy
column) element inside fcl. Among this information we note:
• len reports the maximum length for possibility distributions in such an element (it is
valid only for elements whose type includes possibility distributions);
• ftype reports the type (from 1 to 7) of the fuzzy object (see Table 1);
• com is a user comment;
• um specifies the unit of measure;
• sym specifies for FuzzyNonOrdSimType data whether they use a symmetric
or an asymmetric similarity relation.
Elements len, com, um, and sym are optional.

<xs:element name="fcl">
<xs:complexType>
<xs:sequence minOccurs="1" maxOccurs="unbounded">
<xs:element ref="fc"/>
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:element name="fc">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="ftype" type="xsb:ftype"/>
<xs:element name="len" type="xs:positiveInteger"
minOccurs="0"/>
<xs:element name="com" type="xs:string" minOccurs="0" />
<xs:element name="um" type="xs:string" minOccurs="0" />
<xs:element name="sym" type="xs:boolean" minOccurs="0" />
</xs:sequence>
<xs:attribute name="id" type="xs:ID"/>
</xs:complexType>
</xs:element>
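As an illustration of how an fc entry may be read, the following Python sketch decodes the ftype code using the encoding of Table 1. The fc fragment (its id, name and unit) and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# ftype codes as listed in Table 1.
FTYPE = {
    1: "classical crisp data",
    2: "fuzzy data over an ordered domain",
    3: "fuzzy data over a nonordered domain with similarity relation",
    4: "fuzzy data over a nonordered domain without similarity relation",
    5: "degree associated to attributes",
    6: "instance degree",
    7: "non associated degree",
}

# Hypothetical fc entry; the id, name and unit are illustrative.
FC = ET.fromstring(
    '<fc id="Te1"><name>temp</name><ftype>2</ftype><um>Celsius</um></fc>'
)


def describe(fc):
    """Summarise an fc element as (id, name, fuzzy object kind, unit or None)."""
    code = int(fc.findtext("ftype"))
    if code not in FTYPE:
        raise ValueError(f"ftype must be in 1..7, got {code}")
    return fc.get("id"), fc.findtext("name"), FTYPE[code], fc.findtext("um")


print(describe(FC))
```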

Since these are the main elements, they have an ID that identifies the fuzzy object.
As we explained in the previous sections, any fuzzy element has an IDREF to the
ID associated to its auxiliary information. These IDs are also used in other auxiliary
elements to give further type-specific information. For example the user may specify
the default margin for approximate values. The margins are stored in elements of
type fam (fuzzy approximate much) together with the value much that defines the
minimum distance needed to consider two values to be very different.

<xs:element name="fam">
<xs:complexType>
<xs:sequence>
<xs:element name="margin" type="xs:nonNegativeInteger"/>
<xs:element name="much" type="xs:positiveInteger"/>
</xs:sequence>
<xs:attribute name="id" type="xs:IDREF"/>
</xs:complexType>
</xs:element>
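The fallback to the default margin can be sketched as follows. The FMB fragment is deliberately reduced to a single fam element (a real FMB would also contain fcl), and all ids and numbers are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, reduced FMB fragment and approximate values.
FMB = ET.fromstring('<FMB><fam id="Te1"><margin>2</margin><much>10</much></fam></FMB>')
WITH_MARGIN = ET.fromstring("<approxvalue><d>20</d><margin>0.5</margin></approxvalue>")
NO_MARGIN = ET.fromstring("<approxvalue><d>20</d></approxvalue>")


def support(approx, info_id, fmb):
    """Return (d - margin, d + margin), taking the margin from the FMB when omitted."""
    d = float(approx.findtext("d"))
    margin_text = approx.findtext("margin")
    if margin_text is None:
        # No explicit margin: use the default stored for this fuzzy object.
        margin_text = fmb.findtext(f"fam[@id='{info_id}']/margin")
    if margin_text is None:
        raise LookupError(f"no default margin for {info_id!r}")
    margin = float(margin_text)
    return (d - margin, d + margin)


print(support(WITH_MARGIN, "Te1", FMB), support(NO_MARGIN, "Te1", FMB))
```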

The FMB contains also the definition of similarity relations used in the XML
document. Definitions of all similarity relations are wrapped in the simRelDefs
element. Inside it, each similarity relation is contained in one simRel element
having an id attribute that uniquely identifies the relation inside the document
and a name. A similarity relation is defined by a set of triples (sim), each one
composed of two IDREFs (fid1 and fid2) referring to the two related labels and
a value (degree), in range [0, 1], that specifies the similarity degree between them.
Obviously, labels may appear in several similarity relations, and two labels may be
related with different degrees in different similarity relations.

<xs:element name="simRelDefs">
<xs:complexType>
<xs:sequence minOccurs="1" maxOccurs="unbounded">
<xs:element ref="simRel" />
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:element name="simRel">
<xs:complexType>
<xs:sequence minOccurs="1" maxOccurs="unbounded">
<xs:element ref="sim" />
</xs:sequence>
<xs:attribute name="id" type="xs:ID" />
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
</xs:element>

<xs:element name="sim">
<xs:complexType>
<xs:sequence>
<xs:element name="fid1" type="xs:IDREF" />
<xs:element name="fid2" type="xs:IDREF" />
<xs:element name="degree" type="xsb:probType" />
</xs:sequence>
</xs:complexType>
</xs:element>
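A similarity degree can then be looked up from these definitions. The following Python sketch is written under stated assumptions: the relation content and the degree 0.4 are illustrative, the symmetric flag stands for the sym information of the FMB, and treating a label as fully similar to itself and unlisted pairs as having degree zero are our own conventions.

```python
import xml.etree.ElementTree as ET

# Hypothetical similarity relation between two labels.
SIMRELDEFS = ET.fromstring("""
<simRelDefs>
  <simRel id="SR1" name="weather similarity">
    <sim><fid1>S</fid1><fid2>C</fid2><degree>0.4</degree></sim>
  </simRel>
</simRelDefs>
""")


def similarity_degree(defs, rel_id, a, b, symmetric=True):
    """Return the degree relating labels a and b in the relation rel_id."""
    if a == b:
        return 1.0  # assumption: every label is fully similar to itself
    relation = defs.find(f"simRel[@id='{rel_id}']")
    if relation is None:
        raise LookupError(f"unknown similarity relation {rel_id!r}")
    for sim in relation.findall("sim"):
        pair = (sim.findtext("fid1"), sim.findtext("fid2"))
        if pair == (a, b) or (symmetric and pair == (b, a)):
            return float(sim.findtext("degree"))
    return 0.0  # assumption: unlisted pairs are not similar at all


print(similarity_degree(SIMRELDEFS, "SR1", "C", "S"))
```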

Finally, labelDefs stores label definitions, each one inside a labelinfo
element. Each label has an ID, used to refer to this label, a name (required) and a
trapezoidal definition made up of four decimal subelements.
<xs:element name="labelDefs">
<xs:complexType>
<xs:sequence minOccurs="1" maxOccurs="unbounded">
<xs:element ref="labelinfo" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="labelinfo">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:sequence minOccurs="0">
<xs:element name="alpha" type="xs:decimal" />
<xs:element name="beta" type="xs:decimal" />
<xs:element name="gamma" type="xs:decimal" />
<xs:element name="delta" type="xs:decimal" />
</xs:sequence>
</xs:sequence>
<xs:attribute name="label_id" type="xs:ID" />
</xs:complexType>
</xs:element>

Note that the trapezoidal distribution is required only for labels defined over ordered
domains. However, this constraint (as any other one) must be checked by the system
since it cannot be expressed directly in the XML Schema.
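A check of this kind could be sketched as follows. How the system collects the label ids used on ordered domains is outside the scope of the sketch, so they are simply passed as an argument; the label definitions and the function name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical label definitions: one with a trapezoid, one without.
LABELDEFS = ET.fromstring("""
<labelDefs>
  <labelinfo label_id="HOT"><name>hot</name>
    <alpha>25</alpha><beta>30</beta><gamma>40</gamma><delta>45</delta>
  </labelinfo>
  <labelinfo label_id="S"><name>sunny</name></labelinfo>
</labelDefs>
""")


def labels_missing_trapezoid(defs, ordered_label_ids):
    """Return the ids, among those used on ordered domains, that lack a trapezoid."""
    missing = []
    for label_id in ordered_label_ids:
        info = defs.find(f"labelinfo[@label_id='{label_id}']")
        # An undefined label, or one without alpha, violates the constraint.
        if info is None or info.find("alpha") is None:
            missing.append(label_id)
    return missing


print(labels_missing_trapezoid(LABELDEFS, ["HOT", "S"]))
```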

4 Example
In this section we give a simple example of an XML document satisfying the proposed
XML Schema, by considering information managed by a weather station. The
document represents tomorrow's forecast and in particular the temperature and the
weather at different times in the day.
Each forecast is contained in a record element. The referred time in a record is
a classical information but it is represented by using a fuzzy element, marking it to
be processed by fuzzy querying. The temperature is a numerical datum represented
with a FuzzyOrdType element (because it is based on an ordered domain), while
possible weathers are represented by a FuzzyNonOrdSimType element because they
are based on a nonordered domain. We associate a degree (accuracy) to the tem-
perature for representing the accuracy of the forecasted temperature. Moreover, at
each time several forecasts are calculated by using different meteorological models
(e.g., LAM and GCM [22]). Thus, in each record a degree (precision) represents
the precision of the forecast calculated by the model at the considered time.
In this work, we focused only on the description of new elements enabling the
representation of fuzzy information in XML documents. However, each document
also has other classical elements and it must have its own schema. The XML Schema
for the considered example has to define elements tomorrowForecast (containing
all records), record, and so on, possibly by referring to the proposed fuzzy
elements. The following listing reports the definition of the record element in
the Schema associated to the document for the weather station. We see that fuzzy
objects have types referring to the proposed ones.
<xs:element name="record">
<xs:complexType>
<xs:sequence>
<xs:element name="model" type="xs:string" />
<xs:element name="time" type="fuzzy:ClassicType" />
<xs:element name="temp" type="fuzzy:FuzzyOrdType" />
<xs:element name="accuracy" type="dgr:fuzzyAttrDegree" />
<xs:element name="weather" type="fuzzy:FuzzyNonOrdSimType"/>
<xs:element name="precision" type="dgr:fuzzyNonAssDegree" />
</xs:sequence>
</xs:complexType>
</xs:element>

The following document portion reports a record for the 5 o'clock forecast
calculated by the LAM model. The temperature is unknown, i.e., every value is
possible (hence, its accuracy is one), while the weather is undefined. The precision
element has value zero, due to the lack of information about temperature and weather.
Note that, since this degree is not associated with any attribute or instance (i.e., it
has type FuzzyNonAssDegree), it also contains its own meaning.
An XML Schema for Managing Fuzzy Documents 21

<record>
<model>LAM</model>
<time type="T1" info="T0">
<hm>05:00:00</hm>
</time>
<temp type="T2" info="Te1">
<t2:unknown />
</temp>
<accuracy refTo="Te1" info="D1"> 1 </accuracy>
<weather type="T3" info="W1">
<t3:undefined />
</weather>
<precision info="P1">
<dgr:F> 0 </dgr:F>
<dgr:meaning>model forecast precision</dgr:meaning>
</precision>
</record>

At the same time, the GCM model may report the temperature as a trapezoidal
distribution [24, 25, 26, 27] with an accuracy of 0.9, while the possible weather is
represented by a possibility distribution based on a similarity relation SR1. In the
example, with a percentage of 80%, tomorrow the weather will be sunny (referred to by
the label “S”), while with a percentage of 30% it will be cloudy (referred to by the
label “C”). Recall that label and similarity relation definitions are contained in
the FMB.
<record>
<model>GCM</model>
<time type="T1" info="T0">
<hm>05:00:00</hm>
</time>
<temp type="T2" info="Te1">
<t2:trapezoidal>
<t2:alpha>24</t2:alpha>
<t2:beta>25</t2:beta>
<t2:gamma>26</t2:gamma>
<t2:delta>27</t2:delta>
</t2:trapezoidal>
</temp>
<accuracy refTo="Te1" info="D1"> 0.9 </accuracy>
<weather type="T3" info="W1">
<t3:possdistr simRel="SR1">
<t3:p>1</t3:p>
<t3:d label_id="S"></t3:d>
</t3:possdistr>
</weather>
<precision info="P1">
<dgr:F> 0.86 </dgr:F>
<dgr:meaning>model forecast precision</dgr:meaning>
</precision>
</record>
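To make the meaning of the four trapezoid parameters concrete, the following is a minimal Python sketch of the usual trapezoidal membership function. It is an illustration only and not part of the proposed Schema; the function name trapezoid_membership is hypothetical, and the corner values are those of the GCM record above.

```python
def trapezoid_membership(x, alpha, beta, gamma, delta):
    """Return the possibility degree of x under the trapezoid [alpha, beta, gamma, delta]."""
    if x < alpha or x > delta:
        return 0.0          # outside the support
    if beta <= x <= gamma:
        return 1.0          # inside the core
    if x < beta:
        return (x - alpha) / (beta - alpha)   # rising edge
    return (delta - x) / (delta - gamma)      # falling edge


# Corner values taken from the GCM record above.
gcm_temp = (24, 25, 26, 27)
for value in (23, 24.5, 25.5, 26.5):
    print(value, trapezoid_membership(value, *gcm_temp))
```

Triangular distributions (beta equal to gamma) and crisp intervals (alpha equal to beta, gamma equal to delta) are handled by the same function, since the core test is evaluated before either edge.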

In other cases, temperature may be represented also by an approximate value
28 ± 0.5.
22 B. Oliboni and G. Pozzani

<temp type="T2" info="Te1">
<t2:approxvalue>
<t2:d>28</t2:d>
<t2:margin>0.5</t2:margin>
</t2:approxvalue>
</temp>

Otherwise, temperature may be represented by a label with a trapezoidal definition
(contained in the FMB).
<temp type="T2" info="Te1">
<t2:label label_id="k4" />
</temp>

The FMB portion of the XML document reports auxiliary information about fuzzy
elements. As described in Section 3.7, the fc element contains the basic information
about them. For example, the fc element for the temperature may be the following:
<fmb:fc id="Te1">
<fmb:name>temp</fmb:name>
<fmb:ftype>2</fmb:ftype>
<fmb:com>the expected temperature</fmb:com>
<fmb:um>Celsius degrees</fmb:um>
</fmb:fc>

where Te1 is the unique ID identifying the temp fuzzy object. Hence, it is used
inside the document to link data with auxiliary information and vice versa.
In the FMB, we may also retrieve the definitions of the labels with ID S (representing
sunny weather), C (representing cloudy weather), and k4 (representing a possible
value for the temperature).
<fmb:labelDefs>
<fmb:labelinfo label_id="S">
<fmb:name>sunny</fmb:name>
</fmb:labelinfo>
<fmb:labelinfo label_id="C">
<fmb:name>cloudy</fmb:name>
</fmb:labelinfo>
<fmb:labelinfo label_id="k4">
<fmb:name>temperature4</fmb:name>
<fmb:alpha>27.5</fmb:alpha>
<fmb:beta>29</fmb:beta>
<fmb:gamma>30</fmb:gamma>
<fmb:delta>30.5</fmb:delta>
</fmb:labelinfo>
</fmb:labelDefs>
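As an illustration of how a processor might resolve a label reference such as k4 against the FMB, the following Python sketch reads the labelDefs fragment above with the standard xml.etree.ElementTree module. The namespace URI urn:example:fmb and the function name load_labels are assumptions made for this sketch; the paper does not fix them here.

```python
import xml.etree.ElementTree as ET

FMB_NS = "urn:example:fmb"  # hypothetical namespace URI, assumed for this sketch

FMB_XML = f"""
<fmb:labelDefs xmlns:fmb="{FMB_NS}">
  <fmb:labelinfo label_id="S"><fmb:name>sunny</fmb:name></fmb:labelinfo>
  <fmb:labelinfo label_id="C"><fmb:name>cloudy</fmb:name></fmb:labelinfo>
  <fmb:labelinfo label_id="k4">
    <fmb:name>temperature4</fmb:name>
    <fmb:alpha>27.5</fmb:alpha>
    <fmb:beta>29</fmb:beta>
    <fmb:gamma>30</fmb:gamma>
    <fmb:delta>30.5</fmb:delta>
  </fmb:labelinfo>
</fmb:labelDefs>
"""


def load_labels(xml_text):
    """Map each label_id to (name, trapezoid), with trapezoid None for pure linguistic labels."""
    ns = {"fmb": FMB_NS}
    root = ET.fromstring(xml_text)
    labels = {}
    for info in root.findall("fmb:labelinfo", ns):
        name = info.findtext("fmb:name", namespaces=ns)
        corners = [info.findtext(f"fmb:{tag}", namespaces=ns)
                   for tag in ("alpha", "beta", "gamma", "delta")]
        # Labels over nonordered domains carry no trapezoid corners.
        has_trapezoid = all(c is not None for c in corners)
        trapezoid = tuple(float(c) for c in corners) if has_trapezoid else None
        labels[info.get("label_id")] = (name, trapezoid)
    return labels


labels = load_labels(FMB_XML)
print(labels["k4"])
```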

Labels representing sunny and cloudy weather are defined over a nonordered
domain; thus they are pure linguistic labels and do not have a trapezoidal
definition. The label used to represent a temperature, instead, is also defined by a
trapezoidal distribution. Labels S and C are also related by a similarity relation
defined inside a simRel element. This similarity relation is identified by the ID SR1
and also has a name. Inside each sim element we may retrieve a pair of objects and
their similarity degree. In the reported example, sunny and cloudy are similar with a
degree of 0.3.
<fmb:simRelDefs>
<fmb:simRel id="SR1" name="SimilarityRelation1">
<fmb:sim>
<fmb:fid1>S</fmb:fid1>
<fmb:fid2>C</fmb:fid2>
<fmb:degree>0.3</fmb:degree>
</fmb:sim>
</fmb:simRel>
</fmb:simRelDefs>
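A similarity relation such as SR1 can be held in memory as a symmetric lookup table. The Python sketch below is only an illustration: the function name build_similarity is hypothetical, and treating unlisted pairs as having degree 0 is an assumption of this sketch rather than something stated by the Schema.

```python
def build_similarity(pairs):
    """Build a similarity function from (fid1, fid2, degree) triples."""
    table = {}
    for first, second, degree in pairs:
        # frozenset makes the lookup independent of argument order (symmetry).
        table[frozenset((first, second))] = degree

    def similarity(x, y):
        if x == y:
            return 1.0  # a similarity relation is reflexive
        # Assumption of this sketch: pairs not listed in the FMB have degree 0.
        return table.get(frozenset((x, y)), 0.0)

    return similarity


# The single pair defined by SR1 in the example above.
sr1 = build_similarity([("S", "C", 0.3)])
print(sr1("S", "C"), sr1("C", "S"), sr1("S", "S"))
```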

Finally, the FMB contains the default margin for approximate values representing
temperatures. Moreover, it defines the threshold beyond which two temperatures are
considered very different. In the example these two parameters have values 1 and 5,
respectively.
<fmb:fam id="Te1">
<fmb:margin>1</fmb:margin>
<fmb:much>5</fmb:much>
</fmb:fam>

5 Fuzzy Information for Processing Documents


As discussed in Section 8, some approaches to fuzzy databases (including those in the
XML context) extend query languages by introducing fuzzy features into them. A
possible way to incorporate fuzziness in queries is to define quantifiers and
qualifiers. In this section we present our proposal for representing them in an XML
document. Moreover, we continue the example from the previous section by presenting
definitions of some quantifiers and qualifiers about weather information.

5.1 Representing Fuzzy Quantifiers and Qualifiers


A qualifier is a fuzzy constant in the context of a particular attribute or degree. It is
similar to a linguistic label, but it is used in queries to set linguistic thresholds
and to make them more understandable. Moreover, qualifiers allow one to tune
queries simply by modifying their definitions.
Qualifier definitions are wrapped together in the qualifiers element. Inside it,
a single qualifier definition is reported in a qualDef element. Each qualifier
has: an id attribute that identifies it, a name that represents the qualifier in queries,
and a value in the range [0, 1].
<xs:element name="qualifiers">
<xs:complexType>
<xs:sequence minOccurs="1" maxOccurs="unbounded">
<xs:element ref="qualDef" />
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:element name="qualDef">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="qualifier" type="xsb:probType"/>
</xs:sequence>
<xs:attribute name="id" type="xs:ID"/>
</xs:complexType>
</xs:element>

Fuzzy quantifiers [17, 18, 34, 39] are linguistic labels that allow us to represent
uncertain quantities. They may be used in queries to express the approximate number
of elements fulfilling a given condition. Quantifiers may be absolute or relative.
Absolute quantifiers express quantities over the total number of objects in a set
(e.g., “approximately between 25 and 35”, “close to 0”). Hence, absolute quantifiers
range over R. Relative quantifiers express the proportion between the number of
objects in a set that comply with the stated condition and the total number of objects
in the set. In other words, relative quantifiers measure to what extent a certain
condition is fulfilled (e.g., “the majority”, “about half of”). For this reason relative
quantifiers take values in the range [0, 1].
Absolute and relative quantifiers may be represented in the same form by using a
trapezoidal representation [α, β, γ, δ] and keeping information about their type.
Another classification divides quantifiers into those based on product and those
based on sum. Moreover, they may have zero, one, or two arguments. A general
definition of fuzzy quantifiers with respect to their arguments and operations is
the following:
• quantifiers without arguments are defined simply by their trapezoidal distribution
[α, β, γ, δ];
• quantifiers with one argument x:
– based on product: [x · α, x · β, x · γ, x · δ];
– based on sum: [x + α, x + β, x + γ, x + δ];
• quantifiers with two arguments x and y:
– based on product: [x · α, x · β, y · γ, y · δ];
– based on sum: [x + α, x + β, y + γ, y + δ].
Note that, in some cases, the distribution of a relative quantifier may not lie entirely
inside the range [0, 1]. This problem can be addressed by considering only the
intersection of the trapezoidal distribution associated with the quantifier and the
interval [0, 1].
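The definitions above can be sketched in Python as follows. The function names quantifier_trapezoid and relative_membership are hypothetical; the op codes "S", "P", and "-" mirror the values sum, product, and no arguments used in the text. Restricting a relative quantifier to [0, 1] is modeled here by returning degree 0 outside that interval, which is one possible reading of the intersection mentioned above.

```python
def quantifier_trapezoid(base, op="-", args=()):
    """Instantiate the trapezoid of a quantifier with zero, one, or two arguments."""
    alpha, beta, gamma, delta = base
    if len(args) == 0:
        return (alpha, beta, gamma, delta)
    x = args[0]
    # With one argument, x is applied to all four corners.
    y = args[1] if len(args) == 2 else x
    if op == "P":   # based on product
        return (x * alpha, x * beta, y * gamma, y * delta)
    if op == "S":   # based on sum
        return (x + alpha, x + beta, y + gamma, y + delta)
    raise ValueError("op must be 'S' or 'P' when arguments are given")


def trapezoid_membership(x, alpha, beta, gamma, delta):
    """Standard trapezoidal membership function."""
    if x < alpha or x > delta:
        return 0.0
    if beta <= x <= gamma:
        return 1.0
    if x < beta:
        return (x - alpha) / (beta - alpha)
    return (delta - x) / (delta - gamma)


def relative_membership(x, trapezoid):
    """Membership for a relative quantifier, restricted to the interval [0, 1]."""
    if x < 0.0 or x > 1.0:
        return 0.0
    return trapezoid_membership(x, *trapezoid)


print(quantifier_trapezoid((1, 2, 3, 4), "P", (2,)))
print(quantifier_trapezoid((1, 2, 3, 4), "S", (10, 20)))
```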
In our Schema proposal, all this information about a quantifier definition is
contained in a quantDef element. Each quantifier is internally identified by a
unique id, while it is used by referring to its name. Moreover, a quantifier definition
has the following subelements:
• args ∈ {0, 1, 2} specifies the number of arguments;
• AR specifies whether the quantifier is absolute (A) or relative (R);
• SP specifies whether the quantifier is based on sum (S) or product (P). When the
quantifier has no arguments, a ‘-’ is provided.
Finally, all kinds of quantifiers have a trapezoidal definition provided by the four
elements alpha, beta, gamma, and delta.
<xs:element name="quantDef">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="args">
<xs:simpleType>
<xs:restriction base="xs:nonNegativeInteger">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="2"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="AR">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="A"/>
<xs:enumeration value="R"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="SP">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="S"/>
<xs:enumeration value="P"/>
<xs:enumeration value="-"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="alpha" type="xs:decimal"/>
<xs:element name="beta" type="xs:decimal"/>
<xs:element name="gamma" type="xs:decimal"/>
<xs:element name="delta" type="xs:decimal"/>
</xs:sequence>
<xs:attribute name="id" type="xs:ID"/>
</xs:complexType>
</xs:element>

Although quantifiers and qualifiers are information used during the processing
phase of XML documents and are not really data, it may be useful to represent
them inside documents. In fact, processing is a very important issue for fuzzy
databases and fuzzy information. Consider cases in which XML documents
are exchanged between several users. In these cases, it may also be useful to
exchange processing information in order to share not only data but also semantics
and processing operators. In this way, different users can query a document and
obtain the same results. However, a user remains free to use his or her own qualifier
and quantifier definitions instead of those in the document.

6 An Example of Quantifiers and Qualifiers


Continuing the example about information managed by a weather station presented
in Section 4, we may define some quantifiers and qualifiers. Their definitions are
reported in the last part of an XML document.
The forecast temperature is fuzzy data and may be processed by fuzzy queries.
Hence, an absolute quantifier Hot, without arguments and defined by the distribution
[30, 35, 72, 72] (expressed in degrees Celsius), may be used in queries to classify
temperatures overlapping it as “hot”.
<proc:quantifiers>
<proc:quantDef id="H13">
<proc:name>Hot</proc:name>
<proc:args>0</proc:args>
<proc:AR>A</proc:AR>
<proc:SP>-</proc:SP>
<proc:alpha>30</proc:alpha>
<proc:beta>35</proc:beta>
<proc:gamma>72</proc:gamma>
<proc:delta>72</proc:delta>
</proc:quantDef>
</proc:quantifiers>

On the other hand, we may define a qualifier High, with value 0.8, that may be
used as a threshold in queries about temperature. It may be used to constrain query
results to comply with the query condition with a fulfillment degree greater than
80%.
<proc:qualifiers>
<proc:qualDef id="H12">
<proc:name>High</proc:name>
<proc:qualifier>0.8</proc:qualifier>
</proc:qualDef>
</proc:qualifiers>

Note that, in fuzzy queries, quantifiers and qualifiers may be used together to
constrain results. Considering, for example, the queries about temperature cited
above, we may retrieve records whose temperature is Hot with a High fulfillment
degree (i.e., the temperature overlaps the trapezoidal distribution defining the
quantifier Hot by at least 80%).
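The text does not fix how the overlap between a temperature and Hot is measured. As a hedged illustration, the Python sketch below assumes the possibility measure, i.e., the maximum over x of the minimum of the two membership degrees, which for two trapezoids can be computed in closed form. The names possibility and is_hot_with_high are hypothetical, and the comparison with the qualifier uses "at least" as in the sentence above.

```python
HOT = (30, 35, 72, 72)   # trapezoid of the quantifier Hot
HIGH = 0.8               # value of the qualifier High


def possibility(a, b):
    """Possibility that trapezoids a and b match: sup over x of min(mu_a(x), mu_b(x))."""
    if a[1] <= b[2] and b[1] <= a[2]:
        return 1.0                      # the two cores intersect
    # Otherwise one core lies entirely to the left of the other.
    left, right = (a, b) if a[2] < b[1] else (b, a)
    overlap = left[3] - right[0]        # how far the supports overlap
    if overlap <= 0:
        return 0.0                      # disjoint supports
    # Height at which left's falling edge crosses right's rising edge.
    return overlap / ((left[3] - left[2]) + (right[1] - right[0]))


def is_hot_with_high(temperature):
    """True when the temperature is Hot with a fulfillment degree of at least High."""
    return possibility(temperature, HOT) >= HIGH


# The GCM temperature and the label k4 from Section 4.
print(possibility((24, 25, 26, 27), HOT))
print(possibility((27.5, 29, 30, 30.5), HOT))
```

Under this assumption neither the GCM temperature nor the label k4 would be retrieved by the query, while a temperature such as [30, 32, 34, 36] would be. A different overlap measure (e.g., one based on areas) would give different degrees.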

7 Incorporating Fuzziness in Classical XML Documents


In this section we show how classical XML documents, and their schemata, can be
modified to integrate our fuzzy XML Schema. This modification allows uncertain data
to be represented in addition to the classical data already present.
The first step of this integration consists of modifying the Schema of the original
document to use the fuzzy datatypes we defined. In particular, the Schema of the
original document must declare new namespaces and import our proposed Schema.
The namespace declarations can be made with definitions similar to the
following ones:
xmlns:fuzzy="first-2"
xmlns:degree="degrees"
...
<xs:import namespace="first-2" schemaLocation="./first-2.xsd" />
<xs:import namespace="degrees" schemaLocation="./degrees.xsd" />

Then, the designer must decide which data must be represented with a fuzzy datatype
and over which kind of domain, ordered or nonordered, the data of interest are defined.
Once the domains have been decided, each original element must be redefined by
changing its type to one of the proposed fuzzy types. Data over an ordered domain
must be declared with type FuzzyOrdType; data over a nonordered domain with
an associated similarity relation must be declared with type FuzzyNonOrdSimType;
and, finally, data over a nonordered domain without an associated similarity
relation must be declared with type FuzzyNonOrdType.
For instance, let us consider an XML element age representing the age of a
person.
The original definition of this element may be something like:
<xs:element name="age" type="xs:integer" />

On the other hand, one possible fuzzy definition may be:
<xs:element name="age" type="fuzzy:FuzzyOrdType" />

After this change, the age can be represented by using any kind of element defined
for datatype FuzzyOrdType, e.g., interval, trapezoidal distribution, approximate
value, and so on (see Section 3.3).
Similar considerations and changes apply to all other elements that
the designer wants to be able to carry fuzzy information. Changes to different
elements differ only in the fuzzy datatype the designer needs to use to represent
them: ClassicType, FuzzyOrdType, FuzzyNonOrdSimType, or FuzzyNonOrdType.
The second step of the translation of a classical XML document into its fuzzy
version consists of modifying the document itself. Of course, every use of
an element whose definition has been changed must be replaced according to its
new definition.
Continuing the example here introduced, the usage of the age element changes
from:
<age>32</age>

to something like:
<age type="T2" info="a1">
<t2:interval>
<t2:lb>31</t2:lb>
<t2:ub>34</t2:ub>
</t2:interval>
</age>
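The second step can be partly automated. The following Python sketch, based on the standard xml.etree.ElementTree module, rewrites a crisp element into the interval form shown above. It is only an illustration: the namespace URI urn:example:t2 and the function name to_fuzzy_interval are assumptions of this sketch, and the interval bounds must still be chosen by the designer.

```python
import xml.etree.ElementTree as ET

T2_NS = "urn:example:t2"  # hypothetical namespace URI, assumed for this sketch
ET.register_namespace("t2", T2_NS)


def to_fuzzy_interval(elem, lower, upper, info_id):
    """Return a fuzzy (type T2) version of a crisp element as an interval [lower, upper]."""
    fuzzy = ET.Element(elem.tag, {"type": "T2", "info": info_id})
    interval = ET.SubElement(fuzzy, f"{{{T2_NS}}}interval")
    ET.SubElement(interval, f"{{{T2_NS}}}lb").text = str(lower)
    ET.SubElement(interval, f"{{{T2_NS}}}ub").text = str(upper)
    return fuzzy


crisp = ET.fromstring("<age>32</age>")
fuzzy = to_fuzzy_interval(crisp, 31, 34, "a1")
print(ET.tostring(fuzzy, encoding="unicode"))
```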

Note that the transition, from a classical XML document to a fuzzy one based
on our Schema, allows one not only to change the definition of the elements to a
fuzzy compliant version but also to enrich the XML document by using degrees,
quantifiers, and qualifiers.

8 Related Work and Discussion


In this section we briefly describe other proposals presented in the literature and
related to the representation and querying of fuzzy information in XML documents.
Fuzzy features may be incorporated into databases and XML data in two
main ways. The former allows the representation of fuzzy information directly in
data, e.g., by extending the data model with fuzzy datatypes. The latter obtains fuzzy
information by processing crisp data with query languages extended with
uncertain operators.
However, considering general and complete systems for fuzzy data management,
these two ideas are orthogonal and can be combined, obtaining three approaches to
fuzzy databases:
1. crisp querying of fuzzy information;
2. fuzzy querying of crisp information;
3. fuzzy querying of fuzzy information.
The proposal by Galindo et al. [13] is based on the last approach. They define new
datatypes in order to represent fuzzy information and, at the same time, they
extend the SQL [12] query language with fuzzy operators and capabilities.
In the following sections we introduce related work on the representation of fuzzy
XML data (Section 8.1) and related work on fuzzy querying of XML documents
(Section 8.2).

8.1 Representing XML Fuzzy Data


In [14], Gaurav et al. incorporated fuzzy data in XML documents by extending the
XML schemata associated with these documents. They observed that fuzziness may
be incorporated in the values and structures of XML elements. Hence, they extended
the definition of values and elements by introducing special elements representing
possibility distributions and similarity relations. Possibility distributions may be
introduced through the two elements <fuzzyValue> and <fuzzyDegree>. The
first one allows the specification of the possibility degree associated with a classical
value, while the second one allows the specification of the possibility with which a
sub-element belongs to its parent element.
The Schema proposed by Gaurav et al. introduces similarity relations by means of
the new element <SimilarityRelation>, which defines the pairs composing
the similarity relation. The <SimilarityRelationRef> attribute may be used
to refer to an already defined similarity relation.
Unlike our proposal, they do not allow the use of linguistic labels and generic
degrees; thus the example described in Section 4 cannot be fully implemented by using
the approach proposed in [14]. Since linguistic labels cannot be defined, Gaurav et al.
cannot define trapezoidal distributions (note that trapezoidal distributions can also
represent triangular distributions and intervals) with a unique name that can be
referred to at several points of a document. Thus, when a trapezoidal distribution is
used several times inside a document, Gaurav et al.'s proposal must specify the
distribution itself each time. Conversely, our solution permits associating a name
(i.e., a linguistic label) with a trapezoidal distribution, in order to refer to it by
that name instead of specifying the distribution values. This approach allows us to
reuse distribution definitions, reducing document size.
Gaurav et al. do not allow the representation of fuzzy degrees either. Thus, they cannot
associate fuzzy information with classical data. For instance, they cannot represent
fuzzy information similar to the accuracy of a forecast temperature or the precision
of a whole forecast, as we reported in the example in Section 4.
We note that all fuzzy constructs proposed by Gaurav et al. have a corresponding
representation in our Schema as well. A similarity relation, defined by Gaurav et al.
through the <SimilarityRelation> element, is defined in our proposal in the
FMB simRel element and is referred to by specifying its IDREF inside the element
possdistr of datatype FuzzyNonOrdSimType.
Elements <fuzzyValue> and <fuzzyDegree> defined by Gaurav et al.
represent possibility distributions and tuple degrees, respectively. Possibility dis-
tributions can be represented, by using our proposal, defining a possibility distri-
bution possdistr as specified in the FuzzyOrdType, FuzzyNonOrdSimType, and
FuzzyNonOrdType datatypes. Tuple degrees are represented in our proposal through
degrees associated to a whole tuple, by using FuzzyInstDegree.
In [20, 19], Ma et al. defined a model for representing fuzzy information by modifying
the DTD associated with an XML document. In particular, they modified the DTD by
wrapping the original element definitions inside the new element <Val poss="">,
which associates with the current element its possibility degree. The new element
<Dist>, composed of one or more <Val> elements, allows one to define a possibility
distribution in an XML document. Moreover, Ma et al. defined two types
of distribution: disjunctive and conjunctive. The former represents a set of possible
values of which only one is actually true at any moment; the latter represents
a set of fuzzy values, each of which is true with a different degree at any moment.
In [35], Ma et al. extend their previous work in order to incorporate fuzziness in
XML documents by using XML Schema. Hence, they define the <Val> and <Dist>
elements also using XML Schema, and then they explain how classical schemata can
be modified to incorporate their new fuzzy objects. However, notice that Ma et al.
introduce neither similarity relations, nor linguistic labels, nor other fuzzy datatypes.
Regarding the impossibility for Ma et al. to use linguistic labels, remarks similar to
those reported about [14] apply. Moreover, since Ma et al.'s proposal is not able
to represent similarity relations, it cannot represent data similar to the weather
situation we reported in Section 4. We note that the similarity of two values cannot be
inferred if these values are not numerical; thus our proposal can actually represent
more information than [35].
The constructs introduced by Ma et al., <Dist> and <Val>, correspond to, and can be
represented by, the possibility distributions on ordered or nonordered domains
defined in our proposal.
An approach similar to those reported in [20, 19], based on an extension of DTD,
is used in [26]. Turowski et al. introduced new DTDs defining the elements
that allow one to represent discrete fuzzy sets (which can represent possibility
distributions), continuous fuzzy sets, and linguistic variables, which can be associated
with fuzzy sets. However, they do not allow the use of similarity relations or degrees.
On the other hand, using fuzzy sets and variables, they also define the DTDs needed to
implement a fuzzy inference system able to infer the truths implied by some given
facts, by using user-defined rules.
Since Turowski et al.'s proposal cannot represent similarity relations, it suffers
from shortcomings similar to those reported for the previously discussed approaches.
On the other hand, we note that they can also represent generic continuous fuzzy
sets, which cannot be represented by our proposal. Special cases of continuous fuzzy
sets are trapezoidal and triangular distributions, and intervals. Our proposal can
represent these distributions, while it cannot explicitly represent distributions with a
generic trend. However, these distributions can be interpolated from discrete ones,
as Turowski et al. do. In our proposal the distinction between discrete and continuous
distributions is implicitly defined by the semantics of data, while in [26] it is
explicitly specified.
In our proposal we allow the user to represent all the aspects related to fuzzy in-
formation. In particular, we define all fuzzy datatypes (e.g. possibility distributions,
approximate values, intervals), fuzzy degrees (with several meanings) and labels al-
ready proposed separately in several proposals in the literature. Moreover, we define
XML schemata instead of DTDs to overcome limitations due to the use of DTDs.

8.2 Fuzzy Querying of XML Documents


Several proposals in the literature deal with fuzzy querying of XML documents. In
[3, 5, 9, 10], Campi et al. propose an extension of the XPath query language
[30] that adds new constructs in order to introduce fuzzy querying capabilities.
The XPath language is based on path expressions able to state the structure and
the value of the elements required by the user. With respect to path expressions,
Campi et al. take into account two kinds of fuzziness: fuzziness on structure and
fuzziness on values. With respect to the first one, users can submit queries without
specifying precisely the structure of the XML document and of the required
elements, while, with respect to values, queries do not look only for exact value
matching but also for similar values. These features are introduced by defining new fuzzy
path predicates (e.g., NEAR, ABOUT, and BESIDES). Fuzzy predicates allow one
to search for elements, attributes, and values similar to those actually required. For
example, the expression /proceedings/article[@year NEAR 2009]
retrieves article elements, children of an element proceedings, whose attribute
year has a value close to 2009. On the other hand, the user may retrieve article
elements that are close descendants of proceedings by using the expression
/proceedings{/NEAR}/article.
Fuzzy predicates can be partially satisfied by XML elements, with different degrees.
Hence, unlike classical XPath queries, fuzzy queries return a ranked
set of nodes. The ranks associated with elements represent the similarity of the
returned elements to the ones required by the query.
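The idea of a ranked result set can be illustrated with a small Python sketch. It does not reproduce the semantics defined by Campi et al.: the linear decay, the tolerance parameter, and the names near and rank_near are assumptions made only to show how partially satisfied predicates lead to a ranking.

```python
def near(value, target, tolerance):
    """Degree in [0, 1] to which value is close to target (assumed linear decay)."""
    distance = abs(value - target)
    return max(0.0, 1.0 - distance / tolerance)


def rank_near(items, key, target, tolerance):
    """Return (degree, item) pairs with a positive degree, best matches first."""
    scored = [(near(item[key], target, tolerance), item) for item in items]
    matches = [pair for pair in scored if pair[0] > 0.0]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


# Hypothetical article elements, reduced to their year attribute.
articles = [{"id": "a1", "year": 2007},
            {"id": "a2", "year": 2009},
            {"id": "a3", "year": 2001}]
for degree, article in rank_near(articles, "year", 2009, tolerance=4):
    print(article["id"], degree)
```

A crisp XPath predicate would return only the article of 2009; here the article of 2007 is also returned, with a lower rank, while the article of 2001 is discarded.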
Moreover, Campi et al. define a method allowing one to choose how the ranks for
a query are calculated. Users may associate with each part of a query a variable
whose value represents the degree of satisfaction of the conditions. Users may then
define how the ranks must be calculated by combining the values bound to the variables.
Finally, Campi et al.'s proposal allows users to use fuzzy quantifiers (e.g., tall)
and qualifiers (e.g., very) inside predicates (e.g., height = very tall).
A very similar approach to fuzzy querying is proposed by Goncalves and
Tineo [15].
Using a different approach, Amer-Yahia et al. [1] do not extend XPath expressions
with new predicates and operators, but introduce fuzziness through query
relaxations. They define four operations (e.g., axis generalization and leaf deletion)
on the structure of queries that, given a query, produce a relaxed version of it (i.e.,
a query containing the original one). Relaxations broaden the scope of the path
expressions provided in the original query. A ranking strategy associates a penalty with
each modification applied to a query through a relaxation operation. Penalties are
then used to calculate how well the retrieved elements satisfy the original query.
Note that, in all proposals about fuzzy querying in the literature, query results are
sets of ranked elements where ranks represent the fulfillment degrees of retrieved
elements with respect to the query conditions.

9 Conclusion
In this work, we proposed a general XML Schema definition for representing fuzzy
information in XML documents. In our proposal, we represent different aspects of
fuzzy information by adapting a data type classification already proposed for the
relational database context, and by integrating different kinds of fuzzy information
to compose a complete definition.
For future work we plan to start from documents valid with respect to the
XML Schema proposed in this paper and to study topics related to querying and
retrieval of fuzzy information. As we explained in Section 8, fuzzy information can
be queried by using fuzzy or crisp query languages. We note that the starting point
of our future research will be different from the one assumed by previous works
presented in the literature (see Section 8.2). Unlike other approaches, our work
will be based on fuzzy information rather than crisp information.
This difference will lead, in our opinion, to fewer modifications of existing query
languages. As a matter of fact, since fuzzy capabilities are already incorporated in the
document schema, query languages can exploit the structure of documents without
the need for ad hoc syntax constructs and features. In this case, we do not need
to enrich the query language but the query engine (i.e., the part of the system
responsible for interpreting and executing queries). On the other hand, some features
(e.g., qualifier and quantifier usage) will require some small modifications to the
query language. Thus, first of all, future work in this direction must determine which
features must be incorporated in a query language (e.g., XPath) for fuzzy XML
documents and which others need only a particular interpretation by the query engine.
After that, an extended query language with the desired fuzzy capabilities will be
designed.
Another possible research direction concerns how fuzzy XML documents may be
used for XML Schema versioning. Considering XML documents that are instances
of different versions of a given XML Schema, fuzzy XML may be used to represent
the uncertainty associated with the information contained in the documents. Moreover,
considering different versions of an XML Schema, our proposal may be used
to represent the uncertainty associated with the elements and attributes used in the
versions. Finally, fuzzy XML may represent the uncertainty associated with operations
and with sequences of operations that can be used to obtain a new version of an XML
Schema from other ones.

References
1. Amer-Yahia, S., Lakshmanan, L.V.S., Pandit, S.: FleXPath: flexible structure and
full-text querying for XML. In: ACM (ed.) Proceedings of the 2004 ACM SIGMOD
International Conference on Management of Data 2004, Paris, France, June 13–18,
pp. 83–94. ACM Press, New York (2004)
2. Bosc, D., Pivert, P.: Flexible queries in relational databases – the example of the division
operator. TCS: Theoretical Computer Science 171 (1997)
3. Braga, D., Campi, A., Damiani, E., Pasi, G., Lanzi, P.: FXPath: Flexible querying of
XML documents. In: Proceedings of EuroFuse 2002 (2002)
4. Buckles, B.P., Petry, F.E.: A fuzzy representation of data for relational databases. Fuzzy
Sets and Systems 7(3), 213–226 (1982)
5. Campi, A., Guinea, S., Spoletini, P.: A fuzzy extension for the XPath query language. In:
Larsen, H.L., Pasi, G., Ortiz-Arroyo, D., Andreasen, T., Christiansen, H. (eds.) FQAS
2006. LNCS (LNAI), vol. 4027, pp. 210–221. Springer, Heidelberg (2006)
6. Codd, E.F.: A relational model of data for large shared data banks. CACM: Communi-
cations of the ACM 13 (1970)
7. Codd, E.F.: Extending the database relational model to capture more meaning. ACM
Transactions on Database Systems 4(4), 397–434 (1979)
8. Codd, E.F.: The relational model for database management. Addison-Wesley Longman
Publishing Co. Inc., Boston (1990)
9. Damiani, E., Marrara, S., Pasi, G.: FuzzyXPath: Using fuzzy logic and IR features to
approximately query XML documents. In: Melin, P., Castillo, O., Aguilar, L.T.,
Kacprzyk, J., Pedrycz, W. (eds.) IFSA 2007. LNCS (LNAI), vol. 4529, pp. 199–208.
Springer, Heidelberg (2007)
10. Damiani, E., Marrara, S., Pasi, G.: A flexible extension of xpath to improve XML query-
ing. In: Myaeng, S.H., Oard, D.W., Sebastiani, F., Chua, T.S., Leong, M.K. (eds.) Pro-
ceedings of the 31st Annual International ACM SIGIR Conference on Research and De-
velopment in Information Retrieval, SIGIR 2008, Singapore, July 20-24, pp. 849–850.
ACM, New York (2008)
11. Dubois, D., Prade, H.: Possibility Theory: An Approach to Computerized Processing of
Uncertainty. Plenum Press, New York (1988)
12. Elmasri, R.A., Navathe, S.B.: Fundamentals of Database Systems. Addison-Wesley
Longman Publishing Co. Inc., Boston (1999)
13. Galindo, J., Urrutia, A., Piattini, M.: Fuzzy Databases: Modeling, Design, and Imple-
mentation. IGI Publishing (2006)
14. Gaurav, A., Alhajj, R.: Incorporating fuzziness in XML and mapping fuzzy relational
data into fuzzy XML. In: Haddad, H. (ed.) Proceedings of the 2006 ACM Symposium
on Applied Computing, pp. 456–460. ACM, New York (2006)
15. Goncalves, M., Tineo, L.: A new step towards flexible XQuery. Avances en sistemas e
Informática 4, 27–34 (2007)
16. ISO: ISO 8879:1986: Information processing — Text and office systems — Standard
Generalized Markup Language, SGML (1986),
http://www.iso.ch/cate/d16387.html
17. Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers. (I). interpretations. Fuzzy Sets
Syst. 95(1), 1–21 (1998)
18. Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers (II). reasoning and applications.
Fuzzy Sets Syst. 95(2), 135–146 (1998)
19. Ma, Z.: Fuzzy Database Modeling with XML (The Kluwer International Series on Ad-
vances in Database Systems). Springer-Verlag New York, Inc. (2005)
20. Ma, Z.M., Yan, L.: Fuzzy XML data modeling with the UML and relational data models.
DKE 63(3), 972–996 (2007)
21. Medina, J.M., Pons, O., Vila, M.A.: GEFRED: A generalized model of fuzzy relational
databases. Information Sciences 76(1-2), 87–109 (1994)
22. Nebeker, F.: Calculating the Weather: Meteorology in the 20th Century. International
Geophysics Series, vol. 60. Academic Press, London (1995)
23. Paoli, J., Bray, T., Sperberg-McQueen, C.M., Yergeau, F., Maler, E.: Extensible markup
language (XML) 1.0 (fourth edition). W3C recommendation, W3C (2006),
http://www.w3.org/TR/2006/REC-xml-20060816
24. Prade, H.: Lipski’s approach to incomplete information databases restated and general-
ized in the setting of Zadeh’s possibility theory. Information Systems 9(1), 27–42 (1984)
25. Prade, H., Testemale, C.: Generalizing database relational algebra for the treatment of
incomplete or uncertain information and vague queries. Information Sciences 34, 115–
143 (1984)
26. Turowski, K., Weng, U.: Representing and processing fuzzy information - an XML-based
approach. Knowl.-Based Syst. 15(1-2), 67–75 (2002)
27. Umano, M.: FREEDOM-O: A fuzzy database system. In: Gupta, M.M., Sanchez, E.
(eds.) Fuzzy Information and Decision Processes, pp. 339–349. North-Holland, Amster-
dam (1982)
28. Umano, M., Fukami, S.: Fuzzy relational algebra for possibility-distribution-fuzzy-
relational model of fuzzy data. J. Intell. Inf. Syst. 3(1), 7–27 (1994)
34 B. Oliboni and G. Pozzani

29. W3C: World-Wide Web Consortium (1994), http://www.w3.org/


30. XML Path Language (XPath) Version 1.0, W3C Recommendation (1999),
http://www.w3c.org/TR/xpath
31. XQuery 1.0: An XML Query Language, W3C Recommendation (2007),
http://www.w3.org/TR/xquery/
32. XSD: XML Schema Definition (2004), http://www.w3.org/XML/Schema
33. XSL Transformations (XSLT), W3C Recommendation (1999),
http://www.w3.org/TR/xslt
34. Yager, R.R.: Quantified propositions in a linguistic logic. International Journal of Man-
Machine Studies 19(2), 195–227 (1983)
35. Yan, L., Ma, Z.M., Liu, J.: Fuzzy data modeling based on XML schema. In: Proceedings
of the 2009 ACM Symposium on Applied Computing (SAC), Honolulu, Hawaii, USA,
March 9-12, pp. 1563–1567. ACM, New York (2009)
36. Zadeh, L.A.: Fuzzy sets. Information and Control 8(3), 338–353 (1965)
37. Zadeh, L.A.: Similarity relations and fuzzy orderings. Information Sciences 3, 177–200
(1971)
38. Zadeh, L.A.: Fuzzy sets as a basis for possibility. Fuzzy Sets and Systems 1, 3–28 (1978)
39. Zadeh, L.A.: A computational approach to fuzzy quantifiers in natural language. Com-
puters and Mathematics with Applications 9(1), 149–184 (1983)
40. Zemankova, M., Kandel, A.: Fuzzy Relational Databases — A Key to Expert Systems.
Verlag TUV Rheinland (1984)
41. Zemankova, M., Kandel, A.: Implementing imprecision in information systems. Infor-
mation Sciences 37(1-3), 107–141 (1985)
Formal Translation from Fuzzy XML to Fuzzy
Nested Relational Database Schema

Li Yan, Jian Liu, and Z.M. Ma

Li Yan
School of Software, Northeastern University, Shenyang, 110819, China

Jian Liu
School of Information Science & Engineering, Northeastern University, Shenyang, 110819,
China

Z.M. Ma
School of Information Science & Engineering, Northeastern University, Shenyang, 110819,
China

Z. Ma & L. Yan (Eds.): Soft Computing in XML Data Management, STUDFUZZ 255, pp. 35–54.
springerlink.com © Springer-Verlag Berlin Heidelberg 2010

Abstract. XML has become the de facto standard for information representation and
exchange over the web. At the same time, imprecise and uncertain data are inherent in
the real world. Although fuzzy information has been extensively investigated in
the context of the relational model, the classical relational database model and its
fuzzy extensions to date do not satisfy the need to model complex objects with
imprecision and uncertainty, especially when fuzzy relational databases are
created by mapping from fuzzy conceptual data models and the fuzzy XML data
model. Based on possibility distributions, this chapter concentrates on fuzzy
information modeling in the fuzzy XML model and the fuzzy nested relational
database model. In particular, a formal approach to mapping a fuzzy DTD model
to a fuzzy nested relational database (FNRDB) schema is developed.

1 Introduction

With the rapid development of the Internet, the requirement of managing
Web-based information has attracted much attention from both academia
and industry. XML is widely regarded as the next step in the evolution of the
World Wide Web and has become the de facto standard for information
representation and exchange. It aims at enhancing content on the World Wide Web.
XML and related standards are flexible enough to allow the easy development of
applications that exchange data over the web, such as e-commerce (EC) and supply
chain management (SCM). However, this flexibility makes it challenging to develop
an XML management system. To manage XML data, it is necessary to integrate XML
and databases [3]. Various databases, including relational, object-oriented, and
object-relational databases, have been used for mapping to and from XML documents.
At the same time, some data are inherently imprecise and uncertain since their
values are subjective in real-world applications. For example, consider values
representing the degree of satisfaction with a film: different people may have
different degrees of satisfaction. Information fuzziness has also been investigated
in the context of EC and SCM [25, 30, 31]. It has been shown that fuzzy set theory
is very useful in Web-based business intelligence.
Fuzzy information has been extensively investigated in the context of the relational
model [6, 24, 26, 28]. However, the classical relational database model and its fuzzy
extensions do not satisfy the need to model complex objects with imprecision and
uncertainty. The requirements of modeling complex objects together with information
imprecision and uncertainty can be found in many application domains (e.g.,
multimedia applications) and have challenged current database technology [2, 7].
In order to model uncertain data and complex-valued attributes as well as complex
relationships among objects, current efforts have concentrated on conceptual data
models [15, 16, 21, 33], the fuzzy nested relational data model (also known as the NF²
data model) [34], and fuzzy object-oriented databases [4, 10, 12, 13, 20]. There are
also efforts to conceptually design fuzzy databases using the fuzzy conceptual
data models [15, 16, 21, 33]. More recently, fuzzy object-relational databases have
been proposed [9], which combine the characteristics of fuzzy relational databases
and fuzzy object-oriented databases. Readers can refer to [17, 18] for recent surveys
of these fuzzy data models.
Although fuzzy values have been employed to model and handle imprecise
information in databases since Zadeh introduced the theory of fuzzy sets [35], relatively
little work has been carried out on extending XML towards the representation of
imprecise and uncertain concepts. Abiteboul et al. [1] provide a model for XML
documents and DTDs and a representation system for XML with incomplete
information. Representations of probabilistic data in XML are proposed in other
research papers, such as [14, 22, 27, 29]. In [11], data fuzziness in XML documents is
discussed directly in terms of fuzzy relational databases, without presenting an XML
representation model, and simple mappings from fuzzy relational databases to
fuzzy XML documents are also provided. Oliboni and Pozzani [23] propose an XML
Schema definition for representing fuzzy information. They adopt the data type
classification for the XML data context. A fuzzy XML data model based on XML DTD
(Document Type Definition) is proposed in [19], in which the mappings of the
fuzzy XML DTD from the fuzzy UML data model and to the fuzzy relational database
schema are discussed, respectively. In [32], a fuzzy XML data model based on XML
Schema is developed.
The classical relational database model and its fuzzy extensions do not satisfy the
need to model complex objects with imprecision and uncertainty. This also holds
when fuzzy relational databases are created by mapping from fuzzy conceptual data
models and the fuzzy XML data model. Being an extension of the relational data model,
the NF² database model is able to handle complex-valued attributes and may be better
suited to some complex applications such as office automation systems, information
retrieval systems, and expert database systems [34]. In [8], the fuzzy NF² database
model is proposed for managing uncertainties in images. This chapter, based on
possibility distributions, concentrates on fuzzy information modeling in the fuzzy
XML model and the fuzzy nested relational database model. In particular, a formal
approach to mapping a fuzzy DTD model to a fuzzy nested relational database
(FNRDB) schema is developed.
The remainder of this chapter is organized as follows. Section 2 discusses fuzzy
sets and possibility distributions. The fuzzy XML data model and fuzzy nested
relational databases are introduced in Section 3. In Section 4, the approaches to
mapping the fuzzy XML model to the fuzzy nested relational schema are
developed. Section 5 concludes this chapter.

2 Fuzzy Sets and Possibility Distributions

Different models have been proposed to handle different categories of data quality
(or lack thereof). Five basic kinds of imperfection have been identified in [5]:
inconsistency, imprecision, vagueness, uncertainty, and ambiguity.
Rather than giving formal definitions of these kinds of imperfect information, we
explain their meanings below.
Inconsistency is a kind of semantic conflict, meaning the same aspect of the
real world is irreconcilably represented more than once in a database or in several
different databases. For example, the age of George is stored as 34 and 37
simultaneously. Information inconsistency usually comes from information
integration.
Intuitively, imprecision and vagueness are relevant to the content of an attribute
value, which means that a choice must be made from a given range (interval or set) of
values without knowing which one to choose. In general, vague information is
represented by linguistic values. Assume, for example, that we do not know the exact
ages of two persons named Michael and John, and only know that the age of
Michael may be 18, 19, 20, or 21, and that John is old. Then the information about
Michael’s age is imprecise, denoted by the set of values {18, 19, 20, 21}. The
information about John’s age is vague, denoted by the linguistic value "old".
Uncertainty is related to the degree of truth of an attribute value. With
uncertainty, we can apportion some, but not all, of our belief to a given value or a
group of values. For example, the possibility that the age of Chris is 35 right now
may be 98%. Random uncertainty, described using probability theory, is not
considered in this chapter. Ambiguity means that some elements of the model lack
complete semantics, leading to several possible interpretations.
Generally, several different kinds of imperfection can co-exist with respect to
the same piece of information. For example, the age of Michael is a set of values
{18, 19, 20, 21} and their possibilities are 70%, 95%, 98%, and 85%, respectively.
Imprecision, uncertainty, and vagueness are three major types of imperfect
information and can be modeled with fuzzy sets [35] and possibility theory [36].
Many of the existing approaches dealing with imprecision and uncertainty are
based on the theory of fuzzy sets.
The concept of fuzzy sets was originally introduced by Zadeh [35]. Let U be a
universe of discourse and F be a fuzzy set in U. A membership function
μF: U → [0, 1]
is defined for F, where μF (u), for each u ∈ U, denotes the membership degree of u
in the fuzzy set F. Thus, the fuzzy set F is described as follows:
F = {μF (u1)/u1, μF (u2)/u2, ..., μF (un)/un}
Like a conventional set, the fuzzy set F consists of elements. Unlike a
conventional set, however, each element belongs to F only to a certain degree,
and this membership degree needs to be explicitly indicated. So in F, an element
(say ui) is associated with its membership degree (say μF (ui)), and they occur
together in the form μF (ui)/ui. When the membership degrees of all elements in F
are exactly 1, the fuzzy set F reduces to a conventional one.
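
The notation above can be illustrated with a minimal Python sketch. It assumes a finite universe of discourse and stores a fuzzy set as a mapping from each element to its membership degree. The function names and the example set "young" are hypothetical and are not part of the model in [19].

```python
from typing import Dict, Hashable

# A fuzzy set over a finite universe: element -> membership degree in [0, 1].
FuzzySet = Dict[Hashable, float]


def membership(fuzzy_set: FuzzySet, u: Hashable) -> float:
    """Return mu_F(u); elements that are not listed have degree 0."""
    return fuzzy_set.get(u, 0.0)


def is_valid(fuzzy_set: FuzzySet) -> bool:
    """Check that every membership degree lies in the interval [0, 1]."""
    return all(0.0 <= degree <= 1.0 for degree in fuzzy_set.values())


def is_crisp(fuzzy_set: FuzzySet) -> bool:
    """True when all listed degrees are exactly 1 (a conventional set)."""
    return all(degree == 1.0 for degree in fuzzy_set.values())


def to_notation(fuzzy_set: FuzzySet) -> str:
    """Render the set in the mu_F(u)/u notation used in the text."""
    pairs = ", ".join(f"{degree}/{u}" for u, degree in fuzzy_set.items())
    return "{" + pairs + "}"


# Hypothetical example: the fuzzy set "young" over a few ages.
young: FuzzySet = {18: 1.0, 25: 0.8, 35: 0.4, 50: 0.1}
print(to_notation(young))
```

Storing only the listed elements keeps the representation close to the written form F = {μF (u1)/u1, ..., μF (un)/un}; treating unlisted elements as having degree 0 is a design choice of this sketch.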
When the membership degree μF (u) above is interpreted as a measure of the
possibility that a variable X has the value u, where X takes values in U, a fuzzy
value is described by a possibility distribution πX (Zadeh, 1978).
πX = {πX (u1)/u1, πX (u2)/u2, ..., πX (un)/un}
Here, πX (ui), ui ∈ U, denotes the possibility that X takes the value ui. Let πX be
the possibility distribution representing the fuzzy value of a variable X. This means
that the value of X is fuzzy: X may take any one of the possible values u1, u2, ...,
un, and each candidate value (say ui) is associated with its possibility degree
(say πX (ui)).
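
As a small illustration, the following Python sketch encodes the possibility distribution for Michael's age, using the degrees given earlier in this section (70%, 95%, 98%, and 85%). The helper names are hypothetical. Taking the maximum over a crisp set of candidate values is the standard possibility measure from possibility theory and is an assumption of this sketch, not a construction taken from this chapter.

```python
from typing import Dict, Iterable, List

# Possibility distribution for Michael's age, with the degrees from the text.
michael_age: Dict[int, float] = {18: 0.70, 19: 0.95, 20: 0.98, 21: 0.85}


def possibility(pi: Dict[int, float], u: int) -> float:
    """pi_X(u): possibility that X takes the value u (0 if u is not listed)."""
    return pi.get(u, 0.0)


def possibility_of(pi: Dict[int, float], candidates: Iterable[int]) -> float:
    """Possibility that X lies in a crisp set: the maximum of pi_X over it."""
    return max((possibility(pi, u) for u in candidates), default=0.0)


def most_possible(pi: Dict[int, float]) -> List[int]:
    """Values of X that carry the highest possibility degree."""
    top = max(pi.values())
    return [u for u, degree in pi.items() if degree == top]


print(possibility(michael_age, 19))
print(possibility_of(michael_age, [18, 21]))
print(most_possible(michael_age))
```

Unlike probabilities, the possibility degrees need not sum to 1; here they sum to well over 1, which is consistent with the chapter's decision to leave probabilistic (random) uncertainty aside.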
Definition: A fuzzy set F of the universe of discourse U is convex if and only if
for all u1, u2 in U,
μF (λu1 + (1 − λ) u2) ≥ min (μF (u1), μF (u2))
where λ ∈ [0, 1].
Definition: A fuzzy set F of the universe of discourse U is called a normal
fuzzy set if ∃ u ∈ U, μF (u) = 1.
Definition: A fuzzy number is a fuzzy subset in the universe of discourse U that is
both convex and normal.
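
The convexity and normality conditions above can be checked mechanically for a finite, numerically ordered universe. The sketch below is a discrete analogue of the convexity definition: instead of all λ ∈ [0, 1], it tests every listed point lying between two other listed points. This restriction and the function names are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict


def is_normal(mu: Dict[float, float]) -> bool:
    """Normal: at least one element has membership degree exactly 1."""
    return any(degree == 1.0 for degree in mu.values())


def is_convex(mu: Dict[float, float]) -> bool:
    """Discrete convexity: for listed points a < b < c,
    mu(b) must be at least min(mu(a), mu(c))."""
    points = sorted(mu)
    for a, b, c in combinations(points, 3):
        if mu[b] < min(mu[a], mu[c]):
            return False
    return True


# Hypothetical fuzzy value "about 20": a single peak with degree 1.
about_20 = {18: 0.5, 19: 0.8, 20: 1.0, 21: 0.8, 22: 0.5}
# Hypothetical two-peaked set: the dip at 19 violates convexity.
two_peaks = {18: 1.0, 19: 0.2, 20: 0.9}
# Michael's age from the text: single-peaked, but no degree reaches 1.
michael_age = {18: 0.70, 19: 0.95, 20: 0.98, 21: 0.85}

for name, mu in [("about_20", about_20), ("two_peaks", two_peaks),
                 ("michael_age", michael_age)]:
    print(name, "convex:", is_convex(mu), "normal:", is_normal(mu))
```

Only the first example satisfies both conditions at once; the other two each fail one of them.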

3 Representation of Fuzzy Data in XML and Nested Relational Databases

This section focuses on fuzzy data modeling in the XML data model and the nested
relational model. First, we introduce some notions and notations of the fuzzy XML
model proposed in [19], and then we present an extension of the extended
possibility-based fuzzy nested relational databases.
Random documents with unrelated
content Scribd suggests to you:
Rocky Mountain National Park, 335-353.

North Park, 229, 231.

Ophir Loop, 176, 180.

Orton, Dr. Edward L., 338.

Ouray, 171, 172.

Ouzel, water, in winter, 272, 273.

Ouzel Lake, 158.

Paint-brush, 118.

Parks, mountain, their characteristics, 229-232;


origin, 233-235;
end, 235-237;
glacier meadows, 237-240;
as camping-places, 240-245.

Parks, National. See National Parks.

Penn, William, 329.

Pika, or cony, 110, 111, 344.

Pike, Zebulon M., 301.

Pike's Peak, situation, 295;


altitude, 295;
accessibility, 295, 296;
view, 296;
characteristics, 297;
attractions, 297-300;
history, 301, 302;
climate, 302-305;
summit, 305;
life zones, 305, 306;
bird-life, 306, 307;
big game, 307;
wild flowers and trees, 308, 309;
geology, 309, 310.

Pillars of Hercules, 300.

Pine, limber, 61-63.

Pine, lodge-pole, 125, 126, 140;


extension of area, 211;
seeding, 211-213;
spread dependent upon fire, 213;
elements of success, 214;
a forest pioneer, 218, 219;
hoarding of seed, 219, 220;
rapidity of growth, 220, 221;
overgrown cones, 221-223;
fruitfulness, 223;
release of seeds, 223;
character of stands, 224;
giving way to other species, 225;
dependence upon fire, 225, 226;
range, 226.

Pine, pitch, 214.

Pine, short-leaved (Pinus Montezumæ), 59.

Pine, Western yellow, two stumps, 125, 126;


a good fire-fighter, 129;
preserved by fire, 140.

Porcupine, 270.

Prospect Dome, 300.

Ptarmigan, 102, 113, 114;


in the winter snows, 271, 272;
food, 272.

Rabbit, snowshoe, 344.


Rabbits, 270.

Rats, mountain, 137.

Redwood, 128, 130;


a forest-fire record, 131-133.

Rocky Mountain National Park, location, area, and


topography, 335, 336;
geology, 337-340;
forests, 341, 342;
wild flowers, 342, 343;
animal life, 343-345;
roads and trails, 345;
streams, 346;
climate, 346;
scenery, 346-350;
lakes, 350, 351;
accessibility, 352;
visitors, 352, 353.

Rocky Mountains, Colorado, scenery of, 313, 314.

St. Vrain Moraine, 339, 349, 350.

San Cristoval Lake, 157.

San Juan Mountains, and return horses, 170,


171.

Scenery, of the Rocky Mountains, 313, 314;


conservation and destruction, 314-331;
in the United States, 316, 317, 321, 322;
a judicial decision, 324-326;
and forestry, 327-329;
literary and official recognition, 329, 330.

Schneider, Dr. Edward C., quoted, 303-305.

Seven Lakes, 309.

Sheep, mountain, 64;


a flock descending a mountain, 23-28;
as acrobats, 24;
fable as to landing on horns, 28, 29;
shape and size of horns, 29, 30;
a wild leap, 30-32;
accidents, 32, 33;
an agile ram, 33-35;
hoofs, 35;
size, color, and other characteristics, 35, 36;
species and range, 36, 37;
in winter, 37;
excursions to the lowlands, 37, 38;
composition of flocks, 38;
craving for salt, 38;
lambs, 38, 39;
near approach to, 40;
a ram killed by a barbed-wire fence, 40, 41;
the flock on Battle Mountain, 41-46;
fights, 44-46;
threatened extermination, 46;
at high altitudes, 105-107;
watching a forest fire, 142;
clings to the heights in snowy times, 266, 267.

Shoshone Peak, 339.

Silver Lake, 157.

Silverton, 171.

Snow, on summits of the Rocky Mountains, 103;


and animal life, 259-275;
a great snow, 268;
and the Chinook wind, 269.

Snow-slides, started by dynamite, 79;


a prospector outwitted, 79-84;
habits, 81;
observation of, 84-86;
classification, 87-90;
coasting on a slide, 91-94;
a large slide, 94-97.
Solitaire, Townsend's, 64, 154, 241.

South Cheyenne Cañon, 298-300.

South Park, 229.

Sparrow, white-crowned, 64, 102, 115, 154.

Specimen Mountain, 38, 335, 337, 343, 344.

Sprague Glacier, 153, 337.

"Springfield Republican," quoted, 230, 231.

Spruce, Douglas, 140.

Spruce, Engelmann, 61.

Squirrel, Frémont, or pine, 64, 344.

Squirrels, and deep snow, 270, 271;


hibernation, 271.

Stone's Peak, 338, 341.

Storm Peak, 349.

Switzerland, conservation of scenery in, 313-315,


323, 327.

Telluride, 171, 172, 174-176, 183.

Thatch-Top Mountain, mountain sheep on, 23-28.

Thunder Lake, 157, 158.

Timber-line, characteristics of, 49-58;


altitude, 50, 59, 60, 101;
determining factors, 58, 59;
temperature, 60;
species of trees at, 60, 61;
age of trees at, 61, 62;
animal life at, 63, 64;
flowers at, 65;
impressions at, 65, 66;
animal life above, 101, 102, 105-115;
flowers above, 116-120.

Trapper's Lake, 157.

Trees, species at timber-line, 60;


age at timber-line, 61, 62;
resistance to fire, 128-130;
methods of reproduction, 214-216;
tolerance and intolerance, 216-218;
of Pike's Peak, 308.
See also Timber-line.

Trout Lake, 157, 176.

Twin Lakes, 157.

Tyndall, John, 18.

Tyndall Glacier, 338.

Wasp, battle with a beetle, 111.

Weasels, above timber-line, 111, 112;


and chipmunk, 285.

Wild Basin, 338, 341;


description, 350.

Willow, arctic, 61.

Willow, propagation, 215.

Wolves, 266, 268.

Woodchuck, above timber-line, 109.

Woodpeckers, 273.
Yarding, 262-265.

Yellowstone National Park, establishment of, 329.

Zones, life, 305, 306.


The Riverside Press
CAMBRIDGE . MASSACHUSETTS
U.S.A
*** END OF THE PROJECT GUTENBERG EBOOK THE ROCKY
MOUNTAIN WONDERLAND ***

Updated editions will replace the previous one—the old editions will
be renamed.

Creating the works from print editions not protected by U.S.


copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying
copyright royalties. Special rules, set forth in the General Terms of
Use part of this license, apply to copying and distributing Project
Gutenberg™ electronic works to protect the PROJECT GUTENBERG™
concept and trademark. Project Gutenberg is a registered trademark,
and may not be used if you charge for an eBook, except by following
the terms of the trademark license, including paying royalties for use
of the Project Gutenberg trademark. If you do not charge anything
for copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the free


distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund
from the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only be


used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law
in the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name associated
with the work. You can easily comply with the terms of this
agreement by keeping this work in the same format with its attached
full Project Gutenberg™ License when you share it without charge
with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the
terms of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.

1.E. Unless you have removed all references to Project Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears,
or with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is derived


from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is posted


with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning
of this work.

1.E.4. Do not unlink or detach or remove the full Project


Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.

1.E.5. Do not copy, display, perform, distribute or redistribute this


electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the Project
Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or
expense to the user, provide a copy, a means of exporting a copy, or
a means of obtaining a copy upon request, of the work in its original
“Plain Vanilla ASCII” or other form. Any alternate format must
include the full Project Gutenberg™ License as specified in
paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,


performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or providing


access to or distributing Project Gutenberg™ electronic works
provided that:

• You pay a royalty fee of 20% of the gross profits you derive
from the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who


notifies you in writing (or by e-mail) within 30 days of receipt
that s/he does not agree to the terms of the full Project
Gutenberg™ License. You must require such a user to return or
destroy all copies of the works possessed in a physical medium
and discontinue all use of and all access to other copies of
Project Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of


any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™


electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend


considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except for


the “Right of Replacement or Refund” described in paragraph 1.F.3,
the Project Gutenberg Literary Archive Foundation, the owner of the
Project Gutenberg™ trademark, and any other party distributing a
Project Gutenberg™ electronic work under this agreement, disclaim
all liability to you for damages, costs and expenses, including legal
fees. YOU AGREE THAT YOU HAVE NO REMEDIES FOR
NEGLIGENCE, STRICT LIABILITY, BREACH OF WARRANTY OR
BREACH OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE TRADEMARK
OWNER, AND ANY DISTRIBUTOR UNDER THIS AGREEMENT WILL
NOT BE LIABLE TO YOU FOR ACTUAL, DIRECT, INDIRECT,
CONSEQUENTIAL, PUNITIVE OR INCIDENTAL DAMAGES EVEN IF
YOU GIVE NOTICE OF THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If you


discover a defect in this electronic work within 90 days of receiving
it, you can receive a refund of the money (if any) you paid for it by
sending a written explanation to the person you received the work
from. If you received the work on a physical medium, you must
return the medium with your written explanation. The person or
entity that provided you with the defective work may elect to provide
a replacement copy in lieu of a refund. If you received the work
electronically, the person or entity providing it to you may choose to
give you a second opportunity to receive the work electronically in
lieu of a refund. If the second copy is also defective, you may
demand a refund in writing without further opportunities to fix the
problem.

1.F.4. Except for the limited right of replacement or refund set forth
in paragraph 1.F.3, this work is provided to you ‘AS-IS’, WITH NO
OTHER WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of damages.
If any disclaimer or limitation set forth in this agreement violates the
law of the state applicable to this agreement, the agreement shall be
interpreted to make the maximum disclaimer or limitation permitted
by the applicable state law. The invalidity or unenforceability of any
provision of this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the Foundation,
the trademark owner, any agent or employee of the Foundation,
anyone providing copies of Project Gutenberg™ electronic works in
accordance with this agreement, and any volunteers associated with
the production, promotion and distribution of Project Gutenberg™
electronic works, harmless from all liability, costs and expenses,
including legal fees, that arise directly or indirectly from any of the
following which you do or cause to occur: (a) distribution of this or
any Project Gutenberg™ work, (b) alteration, modification, or
additions or deletions to any Project Gutenberg™ work, and (c) any
Defect you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new computers.
It exists because of the efforts of hundreds of volunteers and
donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project Gutenberg™’s
goals and ensuring that the Project Gutenberg™ collection will
remain freely available for generations to come. In 2001, the Project
Gutenberg Literary Archive Foundation was created to provide a
secure and permanent future for Project Gutenberg™ and future
generations. To learn more about the Project Gutenberg Literary
Archive Foundation and how your efforts and donations can help,
see Sections 3 and 4 and the Foundation information page at
www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the laws of the
state of Mississippi and granted tax exempt status by the Internal
Revenue Service. The Foundation’s EIN or federal tax identification
number is 64-6221541. Contributions to the Project Gutenberg
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.

The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.

Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.

Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.