What are business intelligence systems?
Answer: Business intelligence (BI) systems are information systems that process operational
and other data to analyze past performance and to make predictions. The patterns,
relationships, and trends identified by BI systems are called business intelligence. As
information systems, BI systems have five standard components: hardware, software, data,
procedures, and people. The software component of a BI system is called a BI application.
How does business intelligence help marketers identify changes in the purchasing patterns of
customers?
Answer: Retailers know that important life events cause customers to change what they buy
and, for a short interval, to form new loyalties to new store brands. Before the advent of BI,
stores would watch the local newspapers for graduation, marriage, and baby announcements
and send ads in response, a slow, labor-intensive, and expensive process. By applying
business intelligence techniques to their sales data, however, companies can identify the
purchasing patterns associated with these life events and send ads for related products to
the customers who exhibit them.
What are the three primary activities in the business intelligence process?
Answer: The three primary activities in the business intelligence process are to acquire
data, perform analysis, and publish results.
Data acquisition is the process of obtaining, cleaning, organizing, relating, and cataloging
source data. Business intelligence analysis is the process of creating business intelligence and
includes three fundamental categories: reporting, data mining, and BigData. Publish results is
the process of delivering business intelligence to the knowledge workers who need it.
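As a rough illustration, the three activities can be sketched as a pipeline; the function bodies below are invented placeholders, not any particular product's steps:

```python
# Toy sketch of the three BI activities as a pipeline: acquire, analyze, publish.
# Data values and the report wording are invented for illustration.

def acquire():
    """Obtain and clean source data (data acquisition)."""
    raw = [" 10", "20 ", "bad", "30"]
    return [int(v) for v in (s.strip() for s in raw) if v.lstrip("-").isdigit()]

def analyze(data):
    """Perform analysis: here, a trivial summary statistic."""
    return {"count": len(data), "total": sum(data)}

def publish(results):
    """Deliver results to the knowledge workers who need them."""
    return f"Sales summary: {results['count']} records, total {results['total']}"

report = publish(analyze(acquire()))
print(report)  # → 'Sales summary: 3 records, total 60'
```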
Differentiate between push publishing and pull publishing.
Answer: Push publishing delivers business intelligence to users without any request from the
users; the BI results are delivered according to a schedule or as a result of an event or
particular data condition. Pull publishing requires the user to request BI results.
Explain the functions of a data warehouse.
Answer: The functions of a data warehouse are to:
1. Obtain data
2. Cleanse data
3. Organize and relate data
4. Catalog data
Programs read operational and other data and extract, clean, and prepare that data for
business intelligence processing. The prepared data are stored in a data warehouse database
using a data warehouse DBMS, which can be different from an organization's operational
DBMS. Data warehouses include data that are purchased from outside sources. Metadata
concerning the data—its source, its format, its assumptions and constraints, and other facts
about the data—are kept in a data warehouse metadata database. The data warehouse DBMS
extracts and provides data to BI applications.
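A toy sketch of the four functions listed above (obtain, cleanse, organize and relate, catalog); the field names and source file are invented:

```python
# Obtain: raw rows as extracted from an operational source (values invented).
raw_rows = [
    {"cust": " Alice ", "amount": "120.50", "region": "west"},
    {"cust": "Bob",     "amount": "n/a",    "region": "east"},   # bad value
    {"cust": "Carol",   "amount": "87.00",  "region": "east"},
]

def cleanse(rows):
    """Cleanse: drop rows with unusable values and normalize the rest."""
    clean = []
    for r in rows:
        try:
            clean.append({"cust": r["cust"].strip(),
                          "amount": float(r["amount"]),
                          "region": r["region"]})
        except ValueError:
            pass  # discard rows that cannot be repaired
    return clean

def organize(rows):
    """Organize and relate: group rows by a shared key, here the region."""
    by_region = {}
    for r in rows:
        by_region.setdefault(r["region"], []).append(r)
    return by_region

warehouse = organize(cleanse(raw_rows))

# Catalog: metadata about the data, its source, format, and constraints.
metadata = {"source": "orders_export.csv",
            "fields": ["cust", "amount", "region"],
            "constraint": "amount must be numeric"}

print(sorted(warehouse))  # → ['east', 'west']
```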
How is a data warehouse different from a data mart?
Answer: A data warehouse can be compared to a distributor in a supply chain. The data
warehouse takes data from the data manufacturers (operational systems and other sources),
cleans and processes the data, and locates the data on the shelves of the data warehouse. The
data analysts who work with a data warehouse are experts at data management, data cleaning,
data transformation, data relationships, and the like. The data warehouse then distributes the
data to data marts.
A data mart is a data collection, smaller than the data warehouse, that addresses the needs of
a particular department or functional area of the business. If the data warehouse is the
distributor in a supply chain, then a data mart is like a retail store in that supply chain.
Users of a data mart obtain data that pertain to a particular business function from the data
warehouse. Such users do not have the data management expertise that data warehouse
employees have, but they are knowledgeable analysts for a given business function.
What is data granularity?
Answer: Data granularity refers to the level of detail represented by data. Granularity can be
too fine or too coarse. In general, it is better to have too fine a granularity than too coarse. If
the granularity is too fine, the data can be made coarser by summing and combining. If the
granularity is too coarse, however, there is no way to separate the data into constituent parts.
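A small illustration, using hypothetical line-item sales: fine data can be summed to a coarser level, but coarse totals cannot be split back apart:

```python
# Finest granularity: one row per item sold (all values invented).
line_items = [
    {"store": "S1", "item": "pen",  "amount": 2.00},
    {"store": "S1", "item": "book", "amount": 15.00},
    {"store": "S2", "item": "pen",  "amount": 2.00},
]

# Make the data coarser by summing up to per-store totals.
store_totals = {}
for row in line_items:
    store_totals[row["store"]] = store_totals.get(row["store"], 0.0) + row["amount"]

print(store_totals)  # → {'S1': 17.0, 'S2': 2.0}, a coarser view of the same data
# Starting from store_totals alone, there is no way to recover the line items.
```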
What is clickstream data?
Answer: Clickstream data is the data that is captured from customers' clicking behavior.
Such data is very fine and includes everything a customer does at a Web site. Because the
data is so fine, data analysts must throw away millions and millions of clicks when a study
requires coarser data.
Explain the curse of dimensionality.
Answer: The curse of dimensionality is associated with data having too many attributes. For
example, if internal customer data is combined with purchased customer data, there can be
more than a hundred different attributes to consider, and it is hard to select only a few
from those available. The curse of dimensionality states that the more attributes there are,
the easier it is to build a model that fits the sample data but is worthless as a predictor.
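The effect can be demonstrated with purely random data; the customer counts and attribute counts below are arbitrary:

```python
# Sketch of the curse of dimensionality. With only a few random attributes, it
# is rare that one fits the sample outcome perfectly; with thousands, a spurious
# perfect fit becomes almost certain, and such a "model" predicts nothing.
import random

random.seed(1)
N_CUSTOMERS = 10
outcome = random.getrandbits(N_CUSTOMERS)  # sample outcome: 10 buy/no-buy bits

def chance_of_perfect_fit(n_attributes, trials=200):
    """Fraction of trials where some random attribute matches the outcome exactly."""
    hits = 0
    for _ in range(trials):
        if any(random.getrandbits(N_CUSTOMERS) == outcome
               for _ in range(n_attributes)):
            hits += 1
    return hits / trials

print(chance_of_perfect_fit(5))     # few attributes: perfect fits are rare
print(chance_of_perfect_fit(3000))  # many attributes: spurious perfect fits dominate
```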
What is data mining?
Answer: Data mining is the application of statistical techniques to find patterns and
relationships among data for classification and prediction. Data mining techniques emerged
from the combined disciplines of statistics, mathematics, artificial intelligence, and
machine learning. Most data mining techniques are sophisticated, and many are difficult to use well.
Such techniques are valuable to organizations, and some business professionals, especially
those in finance and marketing, have become expert in their use. Data mining techniques fall
into two broad categories: unsupervised and supervised.
What is unsupervised data mining?
Answer: With unsupervised data mining, analysts do not create a model or hypothesis before
running the analysis. Instead, they apply a data mining technique to the data and observe the
results. With this method, analysts create hypotheses after the analysis to explain the patterns
found. These findings are obtained solely by data analysis.
One common unsupervised technique is cluster analysis. With it, statistical techniques
identify groups of entities that have similar characteristics. A common use for cluster analysis
is to find groups of similar customers from customer order and demographic data.
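A minimal k-means sketch, a common cluster-analysis algorithm, applied to invented customer spend figures:

```python
# Unsupervised: no model or hypothesis is specified beforehand; the algorithm
# simply groups similar values. Data and its meaning are invented.

def kmeans_1d(values, k=2, iters=20):
    """Cluster 1-D values into k groups by iteratively refining centroids."""
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]  # crude initial picks
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

# Annual spend of seven hypothetical customers: two natural groups emerge.
spend = [120, 130, 125, 980, 1010, 150, 995]
groups = kmeans_1d(spend)
print(sorted(map(sorted, groups)))  # → [[120, 125, 130, 150], [980, 995, 1010]]
```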
Explain supervised data mining.
Answer: With supervised data mining, data miners develop a model prior to the analysis and
apply statistical techniques to data to estimate parameters of the model. For example, suppose
marketing experts in a communications company believe that cell phone usage on weekends
is determined by the age of the customer and the number of months the customer has had the
cell phone account. A data mining analyst would then run an analysis that estimates the
impact of customer and account age. One such analysis, which measures the impact of a set
of variables on another variable, is called a regression analysis.
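A simplified one-variable version of that regression (weekend minutes modeled from age alone), with invented sample data:

```python
# Supervised: the model (minutes = b0 + b1 * age) is specified before the
# analysis, then its parameters are estimated from data. All numbers invented.

ages    = [18, 22, 25, 31, 40, 47, 55]   # customer age
minutes = [95, 88, 80, 70, 52, 40, 25]   # weekend cell phone minutes

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(minutes) / n

# Ordinary least squares: slope and intercept that best fit the sample.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, minutes)) \
     / sum((x - mean_x) ** 2 for x in ages)
b0 = mean_y - b1 * mean_x

print(round(b1, 2))            # negative slope: usage falls as age rises here
print(round(b0 + b1 * 30, 1))  # model's predicted minutes for a 30-year-old
```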
What is BigData?
Answer: BigData is a term used to describe data collections that are characterized by huge
volume, rapid velocity, and great variety. Considering volume, BigData refers to data sets
that are at least a petabyte in size, and usually larger. Additionally, BigData has high velocity,
meaning that it is generated rapidly. BigData is varied. BigData may have structured data, but
it also may have free-form text, dozens of different formats of Web server and database log
files, streams of data about user responses to page content, and possibly graphics, audio, and
video files. BigData analysis can involve both reporting and data mining techniques. The
chief difference, however, is that BigData has volume, velocity, and variety characteristics
that far exceed those of traditional reporting and data mining. Examples of BigData include
the Google search index, the database of Facebook user profiles, and Amazon.com's product
list. These collections of data (or "datasets") are so large that the data cannot be stored in a
typical database, or even on a single computer. Instead, the data must be stored and processed
using a highly scalable database management system. Some of the most common BigData
software products include Apache Hadoop, IBM's Big Data Platform, Oracle NoSQL
Database, Microsoft HDInsight, and EMC Pivotal One.
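A toy, single-machine illustration of the MapReduce pattern that scalable systems such as Apache Hadoop use; real deployments distribute the map and reduce phases across many machines:

```python
# Word count, the classic MapReduce example, run locally for illustration.
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each document independently."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce: combine all counts emitted for the same word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

docs = ["big data big volume", "data velocity", "data variety"]
counts = reduce_phase(map_phase(docs))
print(counts["data"])  # → 3
print(counts["big"])   # → 2
```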
Differentiate between static reports and dynamic reports.
Answer: Static reports are business intelligence documents that are fixed at the time of
creation and do not change. A printed sales analysis is an example of a static report. In the
business intelligence context, most static reports are published as PDF documents. Dynamic
reports are business intelligence documents that are updated at the time they are requested. A
sales report that is current as of the time a user accesses it on a Web server is a dynamic
report. In almost all cases, publishing a dynamic report requires the business intelligence
application to access a database or other data source at the time the report is delivered to the
user.
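A minimal sketch of the difference; the report text and data source are invented:

```python
# A static report is fixed at creation; a dynamic report re-reads its data
# source each time it is requested. The "database" here is a plain dict.

sales_db = {"total": 100}  # stands in for the operational database

def make_static_report(db):
    snapshot = f"Total sales: {db['total']}"   # fixed at creation time
    return lambda: snapshot

def dynamic_report(db):
    return f"Total sales: {db['total']}"       # queries the source on each request

static = make_static_report(sales_db)
sales_db["total"] = 250                        # new sales arrive

print(static())                  # → 'Total sales: 100' (unchanged since creation)
print(dynamic_report(sales_db))  # → 'Total sales: 250' (current as of the request)
```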