Web Intelligence
Introduction
The World Wide Web (WWW, or simply the Web) is the leading information-exchange service of the Internet. It was created by Tim
Berners-Lee and his colleagues at CERN and introduced to the world in 1991. The
Web gives users access to a vast array of documents that are connected to each
other by means of hypertext or hyperlinks. A hypertext document with its
corresponding text and hyperlinks is written in HTML and is assigned an on-line
address, or URL. The Web operates within the Internet's basic client-server
architecture. Individual HTML files with unique electronic addresses are called Web
pages, and a collection of Web pages and related files (such as graphics files,
scripted programs, and other resources) sharing a set of similar addresses
(typically under one domain name) is called a Web site. The main or introductory page of a Web site
is usually called the site's home page. Users may access any page by typing in the
appropriate address, search for pages related to a topic of interest by using a search
engine, or move quickly between pages by clicking on hyperlinks incorporated into
them. Though introduced in 1991, the Web did not become truly popular until the
introduction of Mosaic, a browser with a graphical interface, in 1993. Subsequently,
browsers produced by Netscape and Microsoft have become predominant.
The Web is the part of the Internet that contains linked text, image, sound, and video documents.
Before the World Wide Web (WWW), information retrieval on the Internet was text-
based and required that users know basic UNIX commands. The World Wide Web
has gained popularity largely because of its ease of use (point-and-click graphical
interface) and multimedia capabilities, as well as its convenient access to other
types of Internet services (such as e-mail, Telnet, and Usenet).
Improvements in networking technology, the falling cost of computer hardware and
networking equipment, and increased bandwidth have helped the Web to contain
richer content. The Web is among the fastest media for transferring information and has
universal reach (crossing geographical and time boundaries). It is also easy to
access information from millions of Web sites using search engines (systems that
collect and index Web pages, and store searchable lists of these pages). The Web's
unified networking protocols make its use seamless, transparent, and portable. As
the Web has evolved, it has incorporated complementary new technologies for
developing online commerce and video on demand, to name a few.
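The description of search engines above (collect pages, index them, and store searchable lists) can be illustrated with a toy inverted index. This is only a minimal sketch: the page addresses, the page text, and the function names are assumptions made for illustration, and real search engines add crawling, ranking, and far larger data structures.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages: addresses and text are illustrative only.
PAGES = {
    "http://example.org/a": "Web mining finds patterns in Web logs",
    "http://example.org/b": "Search engines index Web pages",
    "http://example.org/c": "Intelligent agents search the Web",
}

INDEX = build_index(PAGES)
print(sorted(search(INDEX, "web search")))
```

The index is built once and then answers each query with set intersections, which is why lookups stay cheap even as the number of indexed pages grows.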
Individual documents are called Web pages, and a collection of related documents is
called a Web site. All Web documents are assigned a unique Internet address called
a Uniform Resource Locator (URL) by which they can be accessed by all Web
browsers. A URL (such as http://www.hq.nasa.gov/office/procurement/index.html)
identifies the communication protocol used by the site (http), its location [domain
name or server (www.hq.nasa.gov)], the path on the server (office/procurement),
and the file name, whose extension indicates the type of document (index.html).
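The parts of a URL listed above can be pulled apart programmatically. The following is a minimal sketch using Python's standard urllib.parse module on the NASA address; the variable names are chosen here for illustration.

```python
import posixpath
from urllib.parse import urlparse

url = "http://www.hq.nasa.gov/office/procurement/index.html"
parts = urlparse(url)

protocol = parts.scheme   # communication protocol
server = parts.netloc     # domain name of the server

# Split the path into the directory on the server and the file name.
directory, filename = posixpath.split(parts.path)

# The file extension hints at the type of document.
extension = posixpath.splitext(filename)[1].lstrip(".")

print(protocol, server, directory, filename, extension)
```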
The language used to create and link documents is called Hypertext Markup
Language (HTML). Markup is the process of adding information to a document that is
not part of the content but identifies the structure or elements. Markup languages
are not new. HTML is based on the Standard Generalized Markup Language (SGML).
Though the initial format for creating a Web site was pure HTML, Web pages can now
be combined with server-side technologies such as the Common
Gateway Interface (CGI), Active Server Pages (ASP), and JavaServer Pages (JSP),
which can be used to create dynamic and interactive Web pages as opposed to just
static HTML text. Dynamic Web pages allow developers to create forms for transactions
and data collection; perform searches on a database or on a particular Web site;
create counters and track the domain names of visitors; customize Web pages to
meet individual user preferences; create Web pages on the fly; and create
interactive Web sites.
XML, developed by the World Wide Web Consortium, is another derivative of SGML
and is rapidly becoming a standard information-exchange format for commercial
software such as office tools, messaging, and distributed databases. XML is a flexible
way to create common information formats and share both the format and the data
on the World Wide Web, intranets, and other Web-based services.
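As an illustration of sharing both the format and the data, the sketch below parses a small XML document with Python's standard xml.etree.ElementTree module. The element names (catalog, page, title, url) and the addresses are hypothetical and are not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the tags describe the structure,
# the text and attributes carry the data.
DOCUMENT = """<catalog>
  <page lang="en">
    <title>Home</title>
    <url>http://example.org/index.html</url>
  </page>
  <page lang="fr">
    <title>Accueil</title>
    <url>http://example.org/fr/index.html</url>
  </page>
</catalog>"""

root = ET.fromstring(DOCUMENT)

# Any program that knows the element names can recover the same records.
pages = [
    {
        "lang": page.get("lang"),
        "title": page.findtext("title"),
        "url": page.findtext("url"),
    }
    for page in root.findall("page")
]

for record in pages:
    print(record)
```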
What is Web intelligence?
Web intelligence (WI) exploits artificial intelligence (AI) and advanced information
technology (IT) on the Internet and the Web. It is the study and
research of the application of artificial intelligence and information technology on
the Web in order to create the next generation of products, services, and frameworks
based on the Internet.
The term was coined in a paper by Ning Zhong, Jiming Liu, Y.Y. Yao, and S. Ohsuga
presented at the Computer Software and Applications Conference (COMPSAC) in 2000. [1]
The 21st century is the age of the Internet and the World Wide Web. The Web
revolutionizes the way we gather, process, and use information. At the same time, it
also redefines the meanings and processes of business, commerce, marketing,
finance, publishing, education, research, development, as well as other aspects of
our daily life. Although individual Web-based information systems are constantly
being deployed, advanced issues and techniques for developing and for benefiting
from Web intelligence still remain to be systematically studied. Roughly speaking,
web intelligence exploits artificial intelligence and advanced information technology
on the Web and Internet. It is the key and the most urgent research field of IT for
business intelligence.
Web intelligence (WI) concerns the design and implementation of intelligent systems on the new
platform of the web and internet. Web intelligence can be characterized as the problems, theories,
methodologies, and techniques studied by WI researchers. The web provides a new and unique
platform for computer applications.
Web Intelligence (WI) is a new research paradigm aimed at exploring the fundamental
interactions between AI-engineering and Advanced Information Technology (AIT) on the
next generation of Web systems, services, and so on. Here AI-engineering is a general term
that refers to a new area slightly beyond traditional AI. Examples include brain informatics, human-level AI,
intelligent agents, and social network intelligence, as well as classical areas such as knowledge
engineering, representation, planning, discovery, and data mining. AIT
includes wireless networks, ubiquitous devices, social networks, and data/knowledge grids,
as well as cloud computing. WI research seeks to explore the most critical technology and
engineering needed to bring about the next generation of Web systems.
Size
Complexity
The web increases the availability and accessibility of information to a much larger community than
other computer applications.
The Web presents new and challenging research problems.
Existing theories and technologies need to be modified or enhanced.
A new sub-discipline devoted to Web related research and applications might have a
significant value.
Perspective of WI
WI may be viewed as applying results from these existing disciplines to a totally new
domain.
WI introduces new problems and challenges to the established disciplines.
WI may be viewed as an enhancement or an extension of AI and IT.
WI may become a sub-area of AI and IT or a child of a successful marriage of AI and
IT.
Mathematics: computation, logic, probability.
Applied Mathematics and Statistics: algorithms, nonclassical logics, decision theory,
information theory, measurement theory, utility theory, theories of uncertainty,
approximate reasoning.
Psychology: cognitive psychology, cognitive science, human-machine interaction,
user interface.
Linguistics: computational linguistics, natural language processing, machine
translation.
Information Technology: information science, databases, information retrieval
systems, knowledge discovery and data mining, expert systems, knowledge-based
systems, decision support systems, intelligent information agents.
IWIS (Intelligent Web Information Systems)
Four categories of AI systems (Russell and Norvig)
designing philosophy of AI systems
ability, functionality of AI systems
Systems that act rationally, act like humans, think rationally, or think like humans.
IWIS can be similarly classified.
A full understanding of an intelligent system involves explanations at various levels.
Conceptual formulation and mathematical modeling.
Physical system design and implementation.
Many levels of development can be defined.
Topics of WI
Prefix labeling: adding “Web” as a prefix to an existing topic.
--- Web digital library
--- Web information retrieval
--- Web agents
Postfix labeling: adding “Web” as a postfix to an existing topic.
--- Digital library on the Web
--- Information retrieval on the Web
Topic areas of WI:
--- Web Human-Media Engineering
--- Web Information System Environment and Foundation
--- Web Mining and Farming
--- Web Agents
--- Web Information Management
--- Web Information Retrieval
--- Web-Based Applications
Web Mining
Find patterns, knowledge, etc. from Web usage files or
Web documents.
--- Web log mining
--- Web structure mining
--- Web content mining
Applications:
--- E-commerce
--- Targeted marketing
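Web log mining, the first item above, can be sketched in a few lines. The example below assumes server logs in the Common Log Format and counts successful requests per page, along with the distinct hosts that visited each page. The log lines, host addresses, and the function name mine_log are made up for illustration; real usage mining would add session reconstruction and pattern discovery on top of this.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical log excerpt.
SAMPLE_LOG = """\
10.0.0.1 - - [10/Oct/2001:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
10.0.0.2 - - [10/Oct/2001:13:56:01 -0700] "GET /products.html HTTP/1.0" 200 5120
10.0.0.1 - - [10/Oct/2001:13:57:12 -0700] "GET /products.html HTTP/1.0" 200 5120
10.0.0.3 - - [10/Oct/2001:13:58:40 -0700] "GET /missing.html HTTP/1.0" 404 -
"""

def mine_log(text):
    """Count successful hits per page and collect the hosts that visited it."""
    page_hits = Counter()
    visitors = {}
    for line in text.splitlines():
        match = LOG_PATTERN.match(line)
        # Skip malformed lines and unsuccessful requests.
        if match is None or match.group("status") != "200":
            continue
        path = match.group("path")
        page_hits[path] += 1
        visitors.setdefault(path, set()).add(match.group("host"))
    return page_hits, visitors

HITS, VISITORS = mine_log(SAMPLE_LOG)
print(HITS.most_common(1))
print(sorted(VISITORS["/products.html"]))
```

Per-page visitor sets of this kind are one possible starting point for the applications listed above, such as targeted marketing.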
From the explicit physical links between Web documents and structures of Web
documents, extract implicit logical connections between documents or segments of
documents.
--- citation of papers
--- clustering of Web documents
--- identifying people with common interests
--- identifying emerging research topics and communities on the Web.
Generating logical, personal views of the Web.
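One simple way to turn explicit links into implicit logical connections is co-citation counting: two documents that are linked from the same page are assumed to be related. This technique is swapped in here as an illustration and is not necessarily the method the text has in mind; the link graph and file names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "hub1.html": ["a.html", "b.html", "c.html"],
    "hub2.html": ["a.html", "b.html"],
    "hub3.html": ["b.html", "c.html"],
}

def cocitation_counts(links):
    """Count how often each pair of documents is linked from the same page."""
    counts = Counter()
    for targets in links.values():
        # Sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(targets)), 2):
            counts[pair] += 1
    return counts

COUNTS = cocitation_counts(LINKS)
for pair, count in sorted(COUNTS.items()):
    print(pair, count)
```

Pairs with high counts can then be grouped, which is one route to the clustering and community identification mentioned in the list above.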
Web Information Retrieval
Search engines.
Multi-lingual retrieval.
Search agents.
Text analysis and text mining.
Ontology.
Semantic web and markup languages.
Granular Retrieval Model
A rough description or granulated view of the Web documents can be obtained by
ignoring some unimportant details.
Three languages should be supported in the granular retrieval model:
--- a “granulation” language allows a user to create a granulated view.
--- a “navigation” language allows a user to change views.
--- a “retrieval” language allows the user to perform an actual search.
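The three languages can be pictured as three operations over a document collection. The sketch below is only one possible reading of the model, not its definition: the functions granulate, navigate, and retrieve, the "topic" field, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical documents with a coarse "topic" attribute.
DOCS = [
    {"url": "http://example.org/ai/agents.html", "topic": "ai",
     "text": "intelligent web agents"},
    {"url": "http://example.org/ai/mining.html", "topic": "ai",
     "text": "web mining and discovery"},
    {"url": "http://example.org/db/index.html", "topic": "db",
     "text": "database indexing for web search"},
]

def granulate(docs, key):
    """Granulation: group documents into granules, ignoring finer detail."""
    view = defaultdict(list)
    for doc in docs:
        view[key(doc)].append(doc)
    return dict(view)

def navigate(view, granule):
    """Navigation: move from the coarse view into one granule."""
    return view.get(granule, [])

def retrieve(docs, term):
    """Retrieval: search the documents of the current view for a term."""
    return [doc["url"] for doc in docs if term in doc["text"].split()]

VIEW = granulate(DOCS, key=lambda doc: doc["topic"])
AI_DOCS = navigate(VIEW, "ai")
print(sorted(VIEW))
print(retrieve(AI_DOCS, "mining"))
```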
Intelligent Web Agents
Personalized Multimodal Interface.
Push and Pull.
Pattern Discovery and Self-Organization.
Information Gateway.
Reward.
Matchmaking.
Decision support.
Delegation.
Collaborative Work Support.
WI: Current status
The 2001 International Conference on Web
Intelligence.
IEEE COMPUTER -- Special Issue on WEB
INTELLIGENCE (to appear in October 2002).
Special issues on Web Intelligence in several leading journals.
A hardcover edited book on Web Intelligence.
WI: Future
It is difficult to predict the future without uncertainty.
The interest in WI is growing very fast.
WI may serve as a sub-discipline of computer science in its own right.
WI is attracting and will attract the best researchers in the field.
Steve Lawrence and C. Lee Giles,
NEC Research Institute (1998)
http://www.neci.nec.com/~lawrence/websize.html
An estimated lower bound on the size of the indexable Web is 320 million pages.
Complexity
Connectivity and diversity of Web documents.
A heterogeneous collection of structured, unstructured, semi-structured,
interrelated, and distributed Web documents consisting of text, images, and sounds.
Theories, methodologies and technologies of Web based information systems.
Many different types of search engines and agents.
Industry interest in WI
• Web Intelligence kis.maebashi-it.ac.jp/wi01/
• Web-Intelligence Home Page www.webintelligence.com/
• Intelligence on the Web www.fas.org/irp/intelwww.html
• WIN: home WEB INTELLIGENCE NETWORK,
smarter.net/
• CatchTheWeb - Web Research, Web Intelligence
Collaboration www.catchtheweb.com/
• Infonoia: Web Intelligence In Your Hands
www.infonoia.com/myagent/en/baseframe.html
ResearchIndex database
• Effective Personalization Based on Association Rule.. -
Bamshad Mobasher Honghua
Center for Web Intelligence, School of Computer Science,
International Conference on Information and Knowledge
Management (CIKM 2001).
• FOCI: A Personalized Web Intelligence System - Ah-Hwee
Tan Hwee-Leng
Proceedings of IJCAI workshop on Intelligent Techniques for Web
Personalisation, Seattle, pp. 14-19, August 2001.
• Integrating E-Commerce and Data Mining: Architecture
and.. - Suhail Ansari Ron Web intelligence can be
"Leaders will use metrics to fuel personalization" and that
"firms need web intelligence, not log analysis."
Analysis
System requirements
Description
Implementation
User guide
Functions/links
How to use
Screen shots