Unit Two Exploring Business Intelligence: Business Driven Technology - Instructor's Manual
UNIT TWO
Exploring Business Intelligence
Information is powerful. Information tells an organization everything from how its current operations
are performing to estimating and strategizing how future operations might perform. New
perspectives open up when people have the right information and know how to use it. The ability to
understand, digest, analyze, and filter information is a key to success for any professional in any
industry.
This unit demonstrates the value an organization can uncover and create by learning how to
manage, access, analyze, and protect organizational information. The chapters in this unit include:
Chapter Six – Valuing Organizational Information
Chapter Seven – Storing Organizational Information - Databases
Chapter Eight – Accessing Organizational Information – Data Warehouse
Wiki technology is taking off and people are continually finding new uses for the technology. Wiki is
being used for collaboration among many businesses. Wiki is being used in education in a number
of ways to support learning including:
A teacher could post some key revision words for students to expand into definitions / pages
Students could work in groups on collaborative documents such as a group report
Course notes could be refined over the duration of the course by both students and teachers
Students could research new topics and contribute their findings
A wiki could be used as a portfolio showing development of a project
A teacher could start a writing prompt and have students add parts to create a comprehensive
class writing activity.
A teacher could start a story and students could create links off it which would allow the story
to follow different, interactive paths.
States and school districts can develop and edit curricula by allowing teachers to add in
activities and assessments
A wiki would be a great tool for collaboratively constructing answers to exam questions!
A great tool for a team of students involved in project work
Annotating each other's work
Wiki Business
Start-ups such as JotSpot are out to harness the power of wikis for businesses. JotSpot’s wiki-
based software lets companies create wikis for business processes. Opsware, a data center
automation software vendor, has used JotSpot to create in a few hours applications that might
have cost $50,000 to $100,000 to develop in Java. Opsware’s technical sales team uses one
JotSpot wiki to manage information such as proposals and status reports associated with pilot
projects for prospective customers.
“It’s a very rich database management system,” said Jason Rosenthal, vice president of
client services at Opsware. “It’s so quick and easy that a new user can learn to use it in 10 to 15
minutes.” The software also reduced the time it took the company to prepare a proof of concept
from five days to three, Rosenthal said, adding that wikis will revolutionize how companies share
information internally.
Cellphedia
Know how long the Brooklyn Bridge is? The date of Andy Warhol’s death? The height of the Sears
Tower? People using Cellphedia, a new cell-phone-based encyclopedia application, can find the
answers to these and plenty of other random questions in a simple text message. Cellphedia is like
Wikipedia on the go. Created as a thesis project by 33-year-old New York University Interactive
Telecommunications Program graduate student Limor Garcia, Cellphedia lets users sign up to
receive updates in one or more categories such as architecture, music, and technology. When a
question is asked about one of those subjects, users receive a text message with the query.
The first answer, which could come within minutes, is forwarded to whoever asked the
question and is posted to the Cellphedia site as well. Subsequent answers are sent only to the
person who asked the question. Users can rack up points as an incentive for being the first to
answer questions.
Jimmy Wales said he has seen Cellphedia but has not used the site. Wales thinks
Cellphedia sounds like a great idea, and Wikipedia is in talks with Nokia about creating a Wikipedia
client on Nokia cell phones.
For those of you who are big fans of Wikipedia, here is an interesting comedic segment from the
'Colbert Report' on Wikipedia. This video clip comes from youtube.com and lasts about 4 minutes.
You might find this useful to share with your students regarding the need to critically evaluate
information.
This link works - straight from Comedy Central.
http://www.comedycentral.com/motherload/player.jhtml?ml_video=72347&ml_collection=&ml_gateway=&ml_gateway_id=&ml_comedian=&ml_runtime=&ml_context=show&ml_
CHAPTER SIX
Valuing Organizational Information
This chapter provides an overview of information levels, formats and granularities. It also discusses
the differences between transactional and analytical information. The chapter concludes with a
discussion on the issues found in low quality information and how to obtain high quality information.
LEARNING OUTCOMES
6.1 Describe the broad levels, formats, and granularities of information.
Information levels include individual, department, and enterprise. Information formats include
document, presentation, spreadsheet, and database. Information granularities include detail,
summary, and aggregate.
6.3 List, describe, and provide an example of each of the five characteristics of high quality information.
Accuracy determines if all values are correct. Example – is the name spelled correctly?
Completeness determines if any values are missing. Example - is the address complete?
Consistency ensures that aggregate or summary information is in agreement with
detailed information. Example – do totals equal the true total of the individual fields?
Uniqueness ensures that each transaction, entity, and event is represented only once in
the information. Example – are there any duplicate customers?
Timeliness determines if the information is current with respect to the business
requirement. Example – is the information updated weekly?
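For instructors who want a hands-on demonstration, the five characteristics can be expressed as simple automated checks. The following is a minimal Python sketch and is not part of the textbook: the customer records, the field names, the "no digits in a name" accuracy rule, and the seven-day timeliness window are all hypothetical choices made only for illustration.

```python
from datetime import date, timedelta

# Hypothetical customer records; every field name and value is illustrative.
customers = [
    {"id": 1, "name": "Ann Lee", "address": "12 Main St", "updated": date(2024, 1, 8)},
    {"id": 2, "name": "B0b Ray", "address": "", "updated": date(2024, 1, 9)},
    {"id": 1, "name": "Ann Lee", "address": "12 Main St", "updated": date(2023, 6, 1)},
]
today = date(2024, 1, 10)  # fixed "current" date so the example is repeatable


def inaccurate(rows):
    """Accuracy: flag names containing digits (a stand-in for 'spelled correctly')."""
    return [i for i, r in enumerate(rows) if any(ch.isdigit() for ch in r["name"])]


def incomplete(rows):
    """Completeness: flag rows where any value is missing or blank."""
    return [i for i, r in enumerate(rows)
            if any(v is None or v == "" for v in r.values())]


def duplicates(rows):
    """Uniqueness: flag rows whose id has already appeared."""
    seen, dup = set(), []
    for i, r in enumerate(rows):
        if r["id"] in seen:
            dup.append(i)
        seen.add(r["id"])
    return dup


def stale(rows, today, max_age_days=7):
    """Timeliness: flag rows not updated within the assumed business window."""
    limit = today - timedelta(days=max_age_days)
    return [i for i, r in enumerate(rows) if r["updated"] < limit]


def consistent(detail, reported_total):
    """Consistency: does the summary total agree with the detail lines?"""
    return abs(sum(detail) - reported_total) < 0.005
```

Each function returns the positions of the offending rows, so students can see which record fails which characteristic. A real organization would replace the toy accuracy rule with a comparison against a trusted source.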
6.4 Assess the impact of low quality information on an organization and the benefits of high quality
information on an organization.
Using the wrong information can lead to making the wrong decision. Making the wrong
decision can cost time, money, and even reputations. Business decisions are only as good as
the information used to make the decision. Low quality information leads to low quality
business decisions. High quality information can significantly improve the chances of making
a good business decision and directly affect an organization’s bottom line.
CLASSROOM OPENER
GREAT BUSINESS DECISIONS – Julius Reuter Uses Carrier Pigeons to Transfer Information
In 1850, the idea that sending and receiving information could add business value was born. Julius
Reuter began a business that bridged the gap between Belgium and Germany. Reuter built one of
the first information management companies built on the premise that customers would be
prepared to pay for information that was timely and accurate.
Reuter used carrier pigeons to forward stock market and commodity prices from Brussels to
Germany. Customers quickly realized that with the early receipt of vital information they could
make fortunes. Those who had money at stake in the stock market were prepared to pay
handsomely for early information from a reputable source, even if it was a pigeon. Eventually,
Reuter’s business grew from 45 pigeons to over 200 pigeons.
Eventually the telegraph bridged the gap between Brussels and Germany, and Reuter’s brilliantly
conceived temporary monopoly came to an end.
CLASSROOM EXERCISE
Understanding Information’s Quality
Break your students into groups and ask them to compile a list of all of the issues found in the
following information. Ask your students to also list why most low quality information errors occur
and what an organization can do to help implement high quality information.
CORE MATERIAL
The core chapter material is covered in detail in the PowerPoint slides. Each slide contains detailed
teaching notes including exercises, class activities, questions, and examples. Please review the
PowerPoint slides for detailed notes on how to teach and enhance the core chapter material.
From the customer’s perspective Wikipedia entries are an example of analytical information.
They are using the information to research a topic, make a decision, or perform an analysis.
From Wikipedia’s perspective each entry is an example of transactional information since it is
their primary business to gain entries from individual contributors.
2. Describe the impact to Wikipedia if the information contained in its database is of low quality.
If Wikipedia contained information that was inaccurate its customers would discontinue using it
as a source for information. It could also find itself in legal trouble if it allows entries stating
inaccurate information about people, which is known as defamation of character. This point is
demonstrated in the case when Wikipedia had to start restricting access by tightening its rules
for submitting entries following the disclosure that it ran a piece falsely implicating a man in the
Kennedy assassination.
3. Review the five common characteristics of high quality information and rank them in order of
importance to Wikipedia.
Student answers to this question will vary depending on their personal views and experiences
with technology. The important part of the question is understanding the students’ justifications
for their order. Potential order of importance:
Timeliness – Wikipedia’s information must be timely. If users are receiving old and
outdated entries, or no entries for a new topic, they will not continue using Wikipedia. An
encyclopedia that is outdated is not very useful.
Accuracy – Wikipedia’s entries must be accurate, and if they are inaccurate the users can
change the definition to ensure it is accurate. An encyclopedia that is inaccurate is
useless.
Consistency – Wikipedia’s results must be consistent. Users will not trust the system if it
provides different definitions for the same entry. An encyclopedia that offers inconsistent
terms is not useful.
Completeness – Wikipedia’s entry results need to be complete. An encyclopedia that does
not contain vast amounts of information is not useful.
Uniqueness – Wikipedia’s customers want unique answers to each entry. Multiple answers
to a term will confuse the customer and they will not be able to know which answer is
correct. An encyclopedia cannot have multiple answers for each term.
Wikipedia began tightening its rules for submitting entries following the disclosure that it ran a
piece falsely implicating a man in the Kennedy assassination. Wikipedia now requires users to
register before they can create articles.
2. Explain the importance of quality information for the Alaska Department of Fish and Game.
If the department receives low quality information from fish counts then either too many fish
escape or too many are caught. Allowing too many salmon to swim upstream could deprive
fishermen of their livelihoods. Allowing too many to be caught before they swim upstream to
spawn could diminish fish populations, yielding devastating effects for years to come.
3. Review the five common characteristics of high quality information and rank them in order of
importance for the Alaska Department of Fish and Game.
Student answers to this question will vary depending on their personal views and experiences
with technology. The important part of the question is understanding the students’ justifications
for their order. Potential order of importance:
Timeliness – Without timely information the department cannot make fishing decisions
Accuracy – inaccurate information will lead to the department making the wrong decisions
Completeness – incomplete information will make it harder for the department to make
decisions regarding the amount of fish. Incomplete information probably occurs frequently
since part of the process, fish escapement, is performed manually
Consistency – information inconsistency probably occurs since the fish escapement is
performed manually
Uniqueness – a fish ticket could be mistakenly entered twice
4. Do the managers at the Alaska Department of Fish and Game actually have all of the
information they require to make an accurate decision? Explain the statement “it is never
possible to have all of the information required to make the best decision possible.”
No, the managers at the Alaska Department of Fish and Game will never have every single
piece of information. It would be almost impossible to count every single fish. However, they
have enough to make an accurate estimate as to the number of fish. If you wait to have every
single piece of information you would probably never make a decision. We typically receive
enough information to make an accurate decision. Of course, the more information you have,
the better the decision you can make, but if you wait to get every piece of information you will
take too long to make the decision.
CHAPTER SEVEN
Storing Organizational Information
This chapter focuses on the relational database model. It introduces students to entities, attributes,
primary keys, foreign keys, and the four components in a DBMS:
Data definition component – helps create and maintain the data dictionary and the structure of
the database
Data manipulation component – allows users to create, read, update, and delete information in
a database
Application generation component – includes tools for creating visually appealing and easy-to-
use applications
Data administration component – provides tools for managing the overall database
environment by providing facilities for backup, recovery, security, and performance
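To make the first two components concrete in class, the short sketch below uses Python's built-in sqlite3 module. It is only an illustration under stated assumptions: the `customer` table, its columns, and the sample names are hypothetical and do not come from the textbook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition: describe the structure of the database.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

QUERY = "SELECT customer_id, name FROM customer ORDER BY customer_id"

# Data manipulation: create ...
conn.execute("INSERT INTO customer (customer_id, name) VALUES (?, ?)", (1, "Sub Shop"))
conn.execute("INSERT INTO customer (customer_id, name) VALUES (?, ?)", (2, "Pizza Palace"))

# ... read ...
rows_after_create = conn.execute(QUERY).fetchall()

# ... update ...
conn.execute("UPDATE customer SET name = ? WHERE customer_id = ?", ("Pizza Place", 2))

# ... and delete.
conn.execute("DELETE FROM customer WHERE customer_id = ?", (1,))
rows_at_end = conn.execute(QUERY).fetchall()
```

The `CREATE TABLE` statement plays the role of the data definition component, while the insert, select, update, and delete statements correspond to the create, read, update, and delete operations of the data manipulation component.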
LEARNING OUTCOMES
7.1 Define the fundamental concepts of the relational database model.
The relational database model stores information in the form of logically related two-
dimensional tables. Entities, attributes, primary keys, and foreign keys are all fundamental
concepts included in the relational database model.
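A small demonstration can help students see how these concepts fit together. The sketch below uses Python's built-in sqlite3 module; the `customer` and `orders` tables, their attributes, and the sample rows are hypothetical and chosen only to show one primary key and one foreign key at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE customer (                 -- entity
    customer_id INTEGER PRIMARY KEY,    -- primary key: uniquely identifies a customer
    name        TEXT NOT NULL           -- attribute
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
    total       REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Raj');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 10.0), (102, 2, 7.5);
""")

# The foreign key is what logically relates the two tables.
totals = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# An order that points at a nonexistent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (103, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Here `totals` holds each customer's name with the sum of that customer's orders, and `orphan_rejected` records that the database refused an order for customer 99, who does not exist.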
by inserting a query, which the Web site then analyzes and custom builds a Web page in real-
time that satisfies the query.
7.5 Describe the two primary methods for integrating information across multiple databases.
Forward integration – takes information entered into a given system and sends it
automatically to all downstream systems and processes.
Backward integration – takes information entered into a given system and sends it
automatically to all upstream systems and processes.
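The difference between the two methods can be sketched in a few lines of Python. This is a simplified illustration, not an actual integration tool: the four system names, their upstream-to-downstream order, and the `integrate` helper are all assumptions made for the example.

```python
# Hypothetical systems, ordered from upstream to downstream.
CHAIN = ["sales", "order_entry", "fulfillment", "billing"]


def integrate(systems, source, key, value, direction):
    """Record a value in `source`, then copy it to every downstream system
    ('forward') or every upstream system ('backward').

    `systems` maps each system name to a dict standing in for its database.
    Returns the list of systems that received the copy.
    """
    idx = CHAIN.index(source)
    if direction == "forward":
        targets = CHAIN[idx + 1:]
    elif direction == "backward":
        targets = CHAIN[:idx]
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    systems[source][key] = value
    for name in targets:
        systems[name][key] = value
    return targets


systems = {name: {} for name in CHAIN}

# Forward: an address entered in order entry flows to fulfillment and billing.
forward_targets = integrate(systems, "order_entry", "address", "12 Main St", "forward")

# Backward: a phone number entered in fulfillment flows to sales and order entry.
backward_targets = integrate(systems, "fulfillment", "phone", "555-0100", "backward")
```

After both calls, the sales system knows the phone number but not the address, and billing knows the address but not the phone number, which is a useful discussion point about why organizations often want both directions.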
CLASSROOM OPENER
GREAT BUSINESS DECISIONS – Edgar Codd’s Relational Database Theory
Edgar Frank Codd was born at Portland, Dorset, in England. He studied mathematics and
chemistry at Exeter College, Oxford, before serving as a pilot in the Royal Air Force during the
Second World War. In 1948, he moved to New York to work for IBM as a mathematical
programmer. In 1953 Codd moved to Ottawa, Canada. A decade later he returned to the USA and
received his doctorate in computer science from the University of Michigan in Ann Arbor. Two
years later he moved to San Jose, California to work at IBM's Almaden Research Center.
In the 1960s and 1970s he worked out his theories of data arrangement, issuing his paper "A
Relational Model of Data for Large Shared Data Banks" in 1970, after an internal IBM paper one
year earlier. To his disappointment, IBM proved slow to exploit his suggestions until commercial
rivals started implementing them.
Initially, IBM refused to implement the relational model in order to preserve revenue from IMS/DB.
Codd then showed IBM customers the potential of the implementation of his model, and they in turn
pressured IBM. Then IBM included in its Future System project a System R subproject — but put in
charge of it were developers who were not thoroughly familiar with Codd's ideas, and isolated the
team from Codd. As a result, they did not use Codd's own Alpha language but created a non-
relational one, SEQUEL. Even so, SEQUEL was so superior to pre-relational systems that it was
copied, based on pre-launch papers presented at conferences, by Larry Ellison in his Oracle
DBMS, which actually reached market before SQL/DS — due to the then-already proprietary status
of the original moniker, SEQUEL had been renamed SQL.
Codd continued to develop and extend his relational model, sometimes in collaboration with Chris
Date. One of the normalized forms, the Boyce-Codd Normal Form, is named after Codd. Codd also
coined the term OLAP and wrote the twelve laws of online analytical processing, although these
were never truly accepted after it came out that his white paper on the subject was paid for by a
software vendor. Edgar F. Codd died of heart failure at his home in Williams Island, Florida at the
age of 79 on Friday, April 18, 2003.
CLASSROOM EXERCISE
Building an ER Diagram
Break your students into groups and ask them to create an entity relationship diagram similar to the
one in Figure 7.1 for a company or product of their choice. If the students are uncomfortable with
databases, you should recommend that they stick to a company similar to the TCCBCE, perhaps a
snack food producer, mountain bike equipment producer, or even a footwear producer. If your
students are more comfortable with databases, ask them to choose a company that would
challenge them such as a fast food restaurant, online book seller, or even a university’s course
registration system.
The important part of this exercise is for your students to begin to understand how the tables in a
database relate. Be sure their ER diagrams include primary keys and foreign keys. Have your
students present their ER diagrams to the class and ask the students to find any potential errors
with the diagrams.
CORE MATERIAL
The core chapter material is covered in detail in the PowerPoint slides. Each slide contains detailed
teaching notes including exercises, class activities, questions, and examples. Please review the
PowerPoint slides for detailed notes on how to teach and enhance the core chapter material.
3. Explain the difference between logical and physical views and why logical views are important
to Wikipedia’s customers.
A well-designed database should handle changes quickly and easily, and provide users with
different views. Physical views deal with the physical storage of information on a storage
device such as a hard disk. Logical views focus on how users logically access information to
meet particular business needs. A database has only one physical view and multiple logical
views. The separation between logical and physical views is what allows each user to access
database information differently. If Wikipedia’s customers had to access physical views of
information they would be confused and find the site difficult to use and understand. The site
provides a logical view for each customer’s queries.
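One way to show students the idea of several logical views over a single stored table is with SQL views. The sketch below uses Python's built-in sqlite3 module; the `article` table and the two view names are hypothetical and are not a description of how Wikipedia actually stores its entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One stored table: the information exists only once.
CREATE TABLE article (
    article_id INTEGER PRIMARY KEY,
    title      TEXT,
    author     TEXT,
    word_count INTEGER
);
INSERT INTO article VALUES
    (1, 'Database',  'ann', 1200),
    (2, 'Wiki',      'raj',  800),
    (3, 'Data mart', 'ann',  300);

-- Two logical views, each shaped for a different kind of user.
CREATE VIEW reader_view AS
    SELECT title FROM article;
CREATE VIEW editor_view AS
    SELECT author, COUNT(*) AS articles, SUM(word_count) AS words
    FROM article
    GROUP BY author;
""")

reader_rows = conn.execute("SELECT title FROM reader_view ORDER BY title").fetchall()
editor_rows = conn.execute(
    "SELECT author, articles, words FROM editor_view ORDER BY author"
).fetchall()
```

Readers see only titles, editors see per-author summaries, and neither needs to know how or where the rows are physically stored.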
2. What information is stored at your college? Is there any chance your information could be
hacked and stolen from your college?
All of your personal information is stored at your college from date of birth to social security
number. Absolutely, information can be stolen from any organization. Colleges have
numerous college students working at different locations across campuses who could easily
access personal information. This is one reason many colleges no longer use social security
numbers as student identification numbers.
4. Do you agree or disagree with changing laws to hold the company where the data theft
occurred accountable? Why or why not?
Student answers to this question will vary. The important part of their answer is the justification
as to why or why not the company should be held accountable. One comment to get your
students thinking would be should a bank be held liable if a gunman robs the bank? Is this the
same type of theft and situation?
5. What impact would holding the company liable where the data theft occurred have on large
organizations?
Companies would take greater actions to ensure the safety of customer information.
6. What impact would holding the company liable where the data theft occurred have on small
business?
Small businesses would have to spend more money ensuring the safety of customer data
and it might drain resources that are fundamental in keeping the business running.
CHAPTER EIGHT
Accessing Organizational Information – Data Warehouse
This chapter takes a step beyond databases and introduces students to data warehousing, data
warehousing tools, and data mining. These technologies allow organizations to gain vast amounts
of business intelligence.
LEARNING OUTCOMES
8.1 Describe the roles and purposes of data warehouses and data marts in an organization.
The primary purpose of data warehouses and data marts is to perform analytical processing
or OLAP. The insights into organizational information that can be gained from analytical
processing are instrumental in setting strategic directions and goals.
8.2 Compare the multidimensional nature of data warehouses (and data marts) with the two-
dimensional nature of databases.
Databases contain information in a series of two-dimensional tables, which means that you
can only ever view two dimensions of information at one time. In a data warehouse and data
mart, information is multidimensional; it contains layers of columns and rows. Each layer in a
data warehouse or data mart represents information according to an additional dimension.
Dimensions could include such things as products, promotions, stores, category, region, stock
price, date, time, and even the weather. The ability to look at information from different
dimensions can add tremendous business insight.
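The idea of viewing the same facts along different dimensions can be demonstrated with a short Python sketch. It is a toy stand-in for a cube, not a data warehouse product: the sales facts, the three dimension names, and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, store, promotion, units sold).
FACTS = [
    ("A", "Denver", "I", 10),
    ("A", "Denver", "II", 4),
    ("A", "Boston", "I", 7),
    ("B", "Denver", "I", 3),
    ("B", "Boston", "II", 6),
]
DIMENSIONS = ("product", "store", "promotion")


def roll_up(facts, *dims):
    """Total the units sold for each combination of the named dimensions,
    summing over every dimension that is not named."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # last element of each fact is the measure
    return dict(totals)


by_product = roll_up(FACTS, "product")
by_product_store = roll_up(FACTS, "product", "store")
by_promotion = roll_up(FACTS, "promotion")
grand_total = roll_up(FACTS)
```

Each call answers a different business question from the same facts, for example units by product, units by product and store, or units by promotion, which is the kind of slicing a cube makes routine.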
8.3 Identify the importance of ensuring the cleanliness of information throughout an organization.
An organization must maintain high quality information in the data warehouse. Information
cleansing and scrubbing is a process that weeds out and fixes or discards inconsistent,
incorrect, or incomplete information. Without high quality information the organization will be
unable to make good business decisions.
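A brief sketch can show what "fixes or discards" might look like in practice. The Python below is an illustration under stated assumptions, not a production cleansing tool: the record layout, the ten-digit phone rule, and the choice to treat name plus phone as the duplicate key are all hypothetical.

```python
def cleanse(records):
    """Return (clean, discarded): fix what can be fixed, discard the rest."""
    clean, discarded, seen = [], [], set()
    for rec in records:
        # Fix inconsistent spacing and capitalization in names.
        name = " ".join(rec.get("name", "").split()).title()
        # Fix inconsistent state codes.
        state = rec.get("state", "").strip().upper()
        # Keep only the digits of the phone number.
        phone = "".join(ch for ch in rec.get("phone", "") if ch.isdigit())

        # Incomplete or incorrect records that cannot be repaired are discarded.
        if not name or len(phone) != 10:
            discarded.append(rec)
            continue
        # A record that repeats an earlier customer is discarded as a duplicate.
        key = (name, phone)
        if key in seen:
            discarded.append(rec)
            continue
        seen.add(key)
        clean.append({"name": name, "state": state, "phone": phone})
    return clean, discarded


raw = [
    {"name": "  ann   lee ", "state": "co", "phone": "(303) 555-0101"},
    {"name": "Ann Lee", "state": "CO", "phone": "303-555-0101"},
    {"name": "Raj Patel", "state": "Ma", "phone": "555-0102"},
    {"name": "", "state": "NY", "phone": "212-555-0103"},
]
clean, discarded = cleanse(raw)
```

Of the four raw records, one survives in a standardized form and three are set aside: a duplicate, a record with an incomplete phone number, and a record with no name. Asking students whether discarded records should be deleted or sent back for correction makes a good follow-up discussion.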
8.4 Explain the relationship between business intelligence and a data warehouse.
A data warehouse is an enabler of business intelligence. The purpose of a data warehouse is
to pull all kinds of disparate information into a single location where it is cleansed and
scrubbed for analysis.
CLASSROOM OPENER
GREAT BUSINESS DECISIONS – Bill Inmon – The Father of the Data Warehouse
Bill Inmon is recognized as the "father of the data warehouse" and co-creator of the "Corporate
Information Factory." He has 35 years of experience in database technology management and
data warehouse design. He is known globally for his seminars on developing data warehouses and
has been a keynote speaker for every major computing association and many industry
conferences, seminars, and tradeshows.
As an author, Bill has written about a variety of topics on the building, usage, and maintenance of
the data warehouse and the Corporate Information Factory. He has written more than 650 articles,
many of which have been published in major computer journals such as Datamation,
ComputerWorld, DM Review and Byte Magazine. Bill currently publishes a free weekly newsletter
for the Business Intelligence Network, and has been a major contributor since its inception.
http://www.b-eye-network.com/home/
CLASSROOM EXERCISE
Analyzing Multiple Dimensions of Information
Jump! is a company that specializes in making sports equipment, primarily basketballs, footballs,
and soccer balls. The company currently sells to four primary distributors and buys all of its raw
materials and manufacturing materials from a single vendor. Break your students into groups and
ask them to develop a single cube of information that would give the company the greatest insight
into its business (or business intelligence).
Product A, B, C, and D
Distributor X, Y, and Z
Promotion I, II, and III
Sales
Season
Date/Time
Salesperson Karen and John
Vendor Smithson
CORE MATERIAL
The core chapter material is covered in detail in the PowerPoint slides. Each slide contains detailed
teaching notes including exercises, class activities, questions, and examples. Please review the
PowerPoint slides for detailed notes on how to teach and enhance the core chapter material.
2. Explain why Wikipedia must cleanse or scrub the information in its data warehouse.
Wikipedia must maintain high quality information in its data warehouse. Information cleansing
and scrubbing is a process that weeds out and fixes or discards inconsistent, incorrect, or
incomplete information. Without high quality information Wikipedia will be unable to offer
customers accurate and complete information.
3. Explain how a company could use information from Wikipedia to gain business intelligence.
Business intelligence comes from such things as environmental scanning and market analysis.
A company could use information from Wikipedia as external information in its data warehouse
that could help it analyze new trends and technologies.
2. Identify why information cleansing and scrubbing is critical to California Pizza Kitchen’s
business intelligence tool’s success.
Financial statements must be as accurate and complete as possible. There have been too
many instances in the past where shoddy financial statements have led to financial crises
such as Enron and WorldCom. It does not matter how good or how many BI tools California
Pizza Kitchen uses; if the core data is dirty the results will be inaccurate.
3. Illustrate why 100 percent accurate and complete information is impossible for Noodles &
Company to obtain.
Noodles & Company will never have 100 percent accurate and complete information. Perfect
information is pricey. Achieving perfect information is almost impossible. The more complete
and accurate an organization wants to get its information, the more it costs. The tradeoff for
perfect information lies in accuracy versus completeness. Accurate information
means it is correct, while complete information means there are no blanks. Most organizations
determine a percentage high enough to make good decisions at a reasonable cost, such as
85% accurate and 65% complete.
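To show students how such percentages might be measured, the sketch below computes both figures for a tiny data set. The definitions used here are assumptions made for the example (completeness as the share of non-blank fields, accuracy as the share of filled-in fields that match a verified copy), and the store records are hypothetical.

```python
def completeness_pct(rows, fields):
    """Percentage of fields that are filled in (no blanks)."""
    cells = [r.get(f) for r in rows for f in fields]
    filled = sum(1 for v in cells if v not in (None, ""))
    return 100.0 * filled / len(cells)


def accuracy_pct(rows, verified, fields):
    """Percentage of filled-in fields that match a verified reference copy."""
    filled = correct = 0
    for row, truth in zip(rows, verified):
        for f in fields:
            if row.get(f) not in (None, ""):
                filled += 1
                correct += row[f] == truth[f]
    return 100.0 * correct / filled


FIELDS = ("store", "sales", "manager")

# Hypothetical records as captured by the business.
rows = [
    {"store": "Denver", "sales": 1200, "manager": "Ann"},
    {"store": "Boston", "sales": 950, "manager": ""},
    {"store": "Austn", "sales": None, "manager": "Raj"},
    {"store": "Miami", "sales": 700, "manager": "Lee"},
]
# The same records after (costly) manual verification.
verified = [
    {"store": "Denver", "sales": 1200, "manager": "Ann"},
    {"store": "Boston", "sales": 950, "manager": "Kim"},
    {"store": "Austin", "sales": 800, "manager": "Raj"},
    {"store": "Miami", "sales": 700, "manager": "Lee"},
]

complete = completeness_pct(rows, FIELDS)
accurate = accuracy_pct(rows, verified, FIELDS)
```

Ten of the twelve fields are filled in and nine of those ten match the verified copy. The verified copy itself is the expensive part, which is the cost tradeoff the answer above describes.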
4. Describe how each of the companies above is using BI to gain a competitive advantage.
Ben & Jerry’s is using BI to improve quality. Customers know that a pint of Ben & Jerry’s ice
cream is of the highest quality.
California Pizza Kitchen and Noodles & Company are using BI to improve financial analysis
capabilities. Both companies can now receive more accurate and complete financial views of
their businesses.
UNIT TWO
CLOSING MATERIAL
3. Harrah’s was one of the first casino companies to find value in offering rewards to customers
who visit multiple Harrah’s locations. Describe the effects on the company if it did not build any
integrations among the databases located at each of its casinos.
Without database integration among its hotels and casinos, Harrah’s would be unable to
determine what a customer’s true value is to the company. For example, a customer who
spends $500,000 at one casino might be treated like royalty. This same customer could
visit another Harrah’s location, but since the information is not integrated, the new location
would have no idea that they had a high-rolling customer on the premises, and they might not
treat the customer accordingly.
4. Estimate the potential impact to Harrah’s business if there is a security breach in its customer
information.
Some customers have concerns regarding Harrah’s information collection strategy since they
want to keep their gambling information private. If there was a security violation and sensitive
customer information was compromised, Harrah’s would risk losing its customers’ trust and
their business.
5. Explain the business effects if Harrah’s fails to use data-mining tools to gather business
intelligence.
Having terabytes of data without any way to analyze the data makes the data useless.
Harrah’s must use data-mining tools to sift through the massive amounts of data in its
warehouse to uncover the business intelligence that has given it a competitive advantage over
its competitors.
6. Identify three different types of data marts Harrah’s might want to build to help it analyze its
operational performance.
Answers to this question will vary. Potential answers include (1) customers’ spending habits
across properties, (2) repeat customer spending habits at a single location, (3) dealer sales at
a location and across locations.
7. Predict what might occur if Harrah’s fails to clean or scrub its information before loading it into
its data warehouse.
Harrah’s must maintain high quality information in its data warehouse. Information cleansing
and scrubbing is a process that weeds out and fixes or discards inconsistent, incorrect, or
incomplete information. Without high quality information Harrah’s will be unable to make good
business decisions and operate its service-oriented strategy. Potential business effects
resulting from low quality information include:
Inability to accurately track customers
Difficulty identifying valuable customers
Inability to identify selling opportunities
Marketing to nonexistent customers
Difficulty tracking revenue due to inaccurate invoices
Inability to build strong customer relationships – which increases buyer power
Google previously had access to 3.3 billion pages. In 2004, its index covered 4.3 billion pages,
880 million images, 845 million bulletin board posts, plus book chapters and reviews, which
Google also searches. By 2005, after going public, the company indexed over 8 billion pages.
Images took the biggest jump, doubling in size because of the booming popularity in digital
cameras.
Google grows its index by using tens of thousands of computers to crawl the Web and find pages
to add. To grow the index, Google could not add more PC power, for fear of crashing certain
Web sites; instead, it formed a secondary index. The searches that retrieve millions of results go
into the main index; the more esoteric ones go into the second index. A larger index means a
better chance of people clicking on your ad, because with more results, more people will be
searching.
2. Describe the impact on Google’s business if the search information it presented to its
customers was of low quality.
Displaying links that do not work, links that have nothing to do with the query, or multiple
duplicate links will cause customers to switch to a different search engine. If Google’s
search results were of low quality, it would quickly lose business. Since providing search
results is Google’s primary line of business, it must display high quality search results.
3. Explain how the Web site RateMyProfessors.com solved its problem of poor information.
The developers of the Web site turned to Google’s API to create an automatic verification tool.
If Google finds enough mentions of a new professor or university that is to be added to the
database, the tool considers the information valid and posts it to the Web site.
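The verification rule in this answer can be illustrated with a minimal sketch. The case does not give the actual threshold or the API calls involved, so the names below (is_verified, count_mentions, min_mentions) and the sample counts are hypothetical stand-ins.

```python
def is_verified(professor, university, count_mentions, min_mentions=5):
    """Accept a new entry only if enough mentions pair the two names.

    count_mentions is a hypothetical callable standing in for a search lookup;
    min_mentions is an assumed threshold, not a figure from the case.
    """
    return count_mentions(professor, university) >= min_mentions


# Stand-in lookup with made-up counts, used only to exercise the function.
fake_counts = {
    ("Jane Doe", "State University"): 12,
    ("Prank Name", "State University"): 0,
}


def fake_lookup(professor, university):
    return fake_counts.get((professor, university), 0)


print(is_verified("Jane Doe", "State University", fake_lookup))     # True
print(is_verified("Prank Name", "State University", fake_lookup))   # False
```

Passing the lookup in as a parameter keeps the rule itself separate from whichever search service supplies the counts.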
4. Identify the different types of entity classes that might be stored in Google’s indexing database.
Entity classes could include:
DOCUMENT TITLE
SEARCH TERM
WORD
LOCATION
WEB PAGE
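For instructors who want to show how entity classes translate into a data structure, the sketch below models each of the five classes as a small Python record. Every attribute name and key is an assumption chosen for illustration; the case does not describe Google’s actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentTitle:
    title_id: int      # hypothetical primary key
    text: str


@dataclass(frozen=True)
class Location:
    location_id: int   # hypothetical primary key
    region: str


@dataclass(frozen=True)
class WebPage:
    url: str           # hypothetical primary key
    title_id: int      # foreign key to DocumentTitle
    location_id: int   # foreign key to Location


@dataclass(frozen=True)
class Word:
    word_id: int
    text: str


@dataclass(frozen=True)
class SearchTerm:
    term_id: int
    text: str


# Made-up rows showing how the entity classes relate to one another.
titles = {1: DocumentTitle(1, "Campus News")}
locations = {1: Location(1, "US")}
page = WebPage("http://example.com/news", title_id=1, location_id=1)

# Following the foreign keys recovers the related entities.
print(titles[page.title_id].text, locations[page.location_id].region)
```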
5. Identify how Google might use a data warehouse to improve its business.
Google could use a data warehouse to contain not only internal organization information, but
also external information such as market trends, competitor information, and industry trends.
Google could then analyze its business across markets, among its competitors, and throughout
different industries.
6. Explain why Google would need to cleanse the information in its data warehouse.
Google must maintain high-quality information in its data warehouse. Information cleansing and
scrubbing is a process that weeds out and fixes or discards inconsistent, incorrect, or
incomplete information. Without high-quality information, Google will be unable to make good
business decisions.
7. Identify a data mart that Google’s marketing and sales department might use to track and
analyze its AdWords revenue.
One potential data mart might include information broken down by industry (products,
telecommunications, health care, energy, travel, human services) and tracked against revenue
by companies. This would tell Google which industries are using AdWords and which
industries are untapped. It would also tell Google which customers in each industry are taking
advantage of AdWords and perhaps would benefit from a specialized marketing plan, and
which customers are not yet taking advantage of AdWords and might be interested in learning
about the product.
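A data mart of this kind can be sketched as a simple aggregation. The rows, customer names, and revenue figures below are made up for illustration, and the two helper functions are hypothetical names rather than part of any Google tool.

```python
from collections import defaultdict

# Made-up rows: (industry, customer, AdWords revenue in dollars).
sales = [
    ("travel", "Acme Tours", 1200.0),
    ("travel", "Blue Sky Air", 800.0),
    ("energy", "SunGrid", 500.0),
    ("health care", "CarePlus", 0.0),   # customer not yet using AdWords
]


def revenue_by_industry(rows):
    """Total AdWords revenue for each industry."""
    totals = defaultdict(float)
    for industry, _customer, revenue in rows:
        totals[industry] += revenue
    return dict(totals)


def untapped_customers(rows):
    """Customers with no AdWords revenue so far."""
    return [customer for _industry, customer, revenue in rows if revenue == 0]


print(revenue_by_industry(sales))
# {'travel': 2000.0, 'energy': 500.0, 'health care': 0.0}
print(untapped_customers(sales))
# ['CarePlus']
```

The first result shows which industries are using AdWords; the second points to customers who might be interested in learning about the product.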
2. INFORMATION TIMELINESS
Project Purpose: To understand the role frequency plays in a backup or update strategy.
Potential Solution: Potential answers can include:
Weather tracking system – must update in real-time to be able to track hurricanes,
tornados, etc.
Car dealership inventories – depending on the size and number of dealerships, it could
update hourly if many dealerships shared inventories, or nightly if there were only two
dealerships that kept in close contact
Vehicle tire sales forecasts – update weekly or monthly since the information is used for
forecasting
Interest rates – depending on what you are doing with the interest rates you might update
hourly, daily, or weekly (if you are only using them for an analysis)
Restaurant inventories – the size of the restaurant will probably play a factor in deciding
the frequency of updates: hourly for a large restaurant that cannot afford to run out of any
of its entrees, or daily for a small restaurant with loyal customers that can afford to tell
them it has run out of its specials.
Grocery store inventories – near real-time to avoid product overstocking and understocking
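The update frequencies above can be expressed as a small lookup table, which may help students see how an update strategy is configured. This is a minimal sketch: the system names and the exact intervals (for example, one minute for "near real-time") are assumptions, not values from the project.

```python
from datetime import datetime, timedelta

# Assumed refresh intervals based on the potential answers above.
REFRESH_INTERVAL = {
    "weather_tracking": timedelta(0),                  # real-time
    "grocery_inventory": timedelta(minutes=1),         # near real-time (assumed)
    "large_restaurant_inventory": timedelta(hours=1),
    "small_restaurant_inventory": timedelta(days=1),
    "tire_sales_forecast": timedelta(weeks=1),
}


def needs_update(system, last_updated, now):
    """True when the system's information is older than its refresh interval."""
    return now - last_updated >= REFRESH_INTERVAL[system]


now = datetime(2024, 1, 1, 12, 0)
half_hour_ago = now - timedelta(minutes=30)
print(needs_update("large_restaurant_inventory", half_hour_ago, now))   # False
print(needs_update("grocery_inventory", half_hour_ago, now))            # True
```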
4. INTEGRATING INFORMATION
Project Purpose: To understand the reasons for integrating information.
Potential Solution: Information levels include individual, department, and enterprise.
Information formats include document, presentation, spreadsheet, and database. Information
granularities include detail, summary, and aggregate. In a single organization you will find
many different levels, formats, and granularities of information. Correlating these different
types of information can help an organization analyze its information.
An integration allows separate systems to communicate directly with each other. Typical
organizations maintain many different systems that store information in different levels,
formats, and granularities. Integrating these systems can save an organization time and
money. Without integrations, an organization will (1) spend considerable time entering the
same information in multiple systems and (2) suffer from the low quality and inconsistency
typically embedded in redundant information. A forward integration takes information entered
into a given system and sends it automatically to all downstream systems and processes. A
backward integration takes information entered into a given system and sends it automatically
to all upstream systems and processes. Ideally, an organization wants to build forward and
backward integrations; however, this can be expensive to maintain.
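The difference between forward and backward integrations can be shown with a short sketch. The four system names and the integrate function are hypothetical, and each system is represented by a plain dictionary rather than a real application.

```python
# Systems listed in process order; later entries are downstream of earlier ones.
SYSTEMS = ["sales", "order_entry", "fulfillment", "billing"]


def integrate(stores, source, key, value, forward=True, backward=False):
    """Write value into the source system and propagate it automatically."""
    i = SYSTEMS.index(source)
    targets = [source]
    if forward:
        targets += SYSTEMS[i + 1:]   # all downstream systems
    if backward:
        targets += SYSTEMS[:i]       # all upstream systems
    for name in targets:
        stores[name][key] = value
    return targets


stores = {name: {} for name in SYSTEMS}

# Forward integration only: the address reaches downstream systems, not sales.
integrate(stores, "order_entry", "customer_42_address", "12 Elm St")
print("customer_42_address" in stores["billing"])   # True
print("customer_42_address" in stores["sales"])     # False
```

With both forward and backward set to True, information entered once reaches every system, which is the ideal case described above.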
Addressing any of the following four primary sources of low-quality information will reduce
Real People’s quality issues:
Online customers intentionally enter inaccurate information to protect their privacy
Information from different systems that have different information entry standards and
formats
Call center operators enter abbreviated or erroneous information by accident or to save
time
Third party and external information contains inconsistencies, inaccuracies, and errors
Payment method
Sales personnel involved with sale
Customer gender
Customer age
Promotion
Weather
One way the student can check the addresses is to compare them against the listing of accurate
addresses published by the United States Postal Service. Many organizations use this resource
to help ensure accurate addresses with correct zip codes.
Data mining is the process of analyzing data to extract information not offered by the raw data
alone. For example, Ruf Strategic Solutions helps organizations employ statistical approaches
within a large data warehouse to identify customer segments that display common traits. Marketers
can then target these segments with specially designed products and promotions.
Data mining can also begin at a summary information level (coarse granularity) and progress
through increasing levels of detail (drilling down), or the reverse (drilling up). To perform data
mining, users need data-mining tools. Data-mining tools use a variety of techniques to find patterns
and relationships in large volumes of information and infer rules from them that predict future
behavior and guide decision making. Data-mining tools for data warehouses and data marts
include query tools, reporting tools, multidimensional analysis tools, statistical tools, and intelligent
agents.
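Drilling down and drilling up can be demonstrated by summarizing the same detail rows at different granularities. The sketch below uses made-up sales figures and a hypothetical summarize function; it is not taken from any particular data-mining tool.

```python
from collections import defaultdict

# Made-up detail rows: (year, quarter, month, sales amount).
detail = [
    (2023, "Q1", "Jan", 100),
    (2023, "Q1", "Feb", 150),
    (2023, "Q2", "Apr", 200),
    (2024, "Q1", "Jan", 120),
]

LEVELS = ["year", "quarter", "month"]   # coarse to fine granularity


def summarize(rows, level):
    """Total sales grouped at the given level of granularity."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        totals[row[:depth]] += row[3]
    return dict(totals)


print(summarize(detail, "year"))
# {(2023,): 450, (2024,): 120}
print(summarize(detail, "quarter"))   # drilling down one level
# {(2023, 'Q1'): 250, (2023, 'Q2'): 200, (2024, 'Q1'): 120}
```

Moving from "year" to "quarter" to "month" is drilling down; moving in the other direction is drilling up.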
Student answers to this project will vary. The important part is their justification for each rating.
Sample ratings could include:
Student answers to these questions will vary depending on the students’ search preferences.
There are a number of companies competing with Google, including Ask.com, AltaVista, and
Yahoo. What qualifies as a good search engine depends on the types of searches and the user’s
preferences, which students will start to understand after performing this activity.