Introduction To BI in IT (Student Handout)
Introduction to Business Intelligence in Information Technology
Arup K Das
Doctorate, Information Management, MDI Gurugram
Head of Security & Compliance, Ericsson India Digital Services
Guest Faculty, FMS, University of Delhi
Welcome to Course Structure
▪ 14 sessions
✓ 11 core sessions
✓ 3 sessions – case study
▪ 2 hr 15 min session every Saturday (6:45 - 9:00 PM)
✓ Session-1: 60mins
✓ 15 mins (break)
✓ Session-2: 60mins
▪ Case Study Discussions
✓ 3 sessions x 50 marks
▪ Term End Exam
✓ 50 marks
Course Contents
MBAFT - 7609: BUSINESS INTELLIGENCE
Session # Topics
1 Introduction to Business Intelligence in Information Technology
2 Introduction to Business Intelligence Tools
3 Data Mining Tools and Their Applications
4 Business Analytics, Big Data Analytics and Data Science
5 Discussion of Case Study-1
6 Introduction to Data Warehousing
7 Business Intelligence & Data Warehousing Architecture
8 Business Intelligence in Organizations - Preparing a Customer Proposal
9 Business Intelligence in Organizations - Managing & Delivering a Customer Project
10 Data Privacy & Security
11 Discussion of Case Study-2
12 Introduction to Machine Learning
13 Robotic Process Automation
14 Discussion of Case Study-3
Course Contents with Timeplan
MBAFT - 7609: BUSINESS INTELLIGENCE
Session # Topics Dates (Tentative)
1 Introduction to Business Intelligence in Information Technology 23-Jan-21
2 Introduction to Business Intelligence Tools 30-Jan-21
3 Data Mining Tools and Their Applications 6-Feb-21
4 Business Analytics, Big Data Analytics and Data Science 13-Feb-21
5 Discussion of Case Study-1 15-19 Feb
6 Introduction to Data Warehousing 20-Feb-21
7 Business Intelligence & Data Warehousing Architecture 27-Feb-21
8 Business Intelligence in Organizations - Preparing a Customer Proposal 6-Mar-21
9 Business Intelligence in Organizations - Managing & Delivering a Customer Project 13-Mar-21
10 Data Privacy & Security 20-Mar-21
11 Discussion of Case Study-2 22-26 Mar
12 Introduction to Machine Learning 27-Mar-21
13 Robotic Process Automation 3-Apr-21
14 Discussion of Case Study-3 5-Apr-21
Introductions
Who am I …
Who are you …
Your name
Your educational qualifications
Your professional experience, if any
Your expectations from this course
Another Definition of BI
Business Intelligence is a set of methods, processes,
architectures, applications and technologies that gather and
transform raw data into meaningful and useful information
used to enable more effective strategic, tactical and
operational insights and decision-making (to drive business
performance).
Data
▪ Types of Data
✓ Numerical / Textual
✓ Structured / Unstructured
✓ Standard format / Proprietary format
✓ Internal / External
✓ System stored / File based
✓ Raw data / Simulated, Forecast, Estimated data
✓ Simple fact data / Calculated metrics data
✓ Information overloading
❑ Too much data & information, making it difficult to find anything meaningful
❑ Difficulty in organizing data for effective access & retrieval
✓ Data everywhere
❑ Data in separate systems & different sources
✓ Difficulty of access
❑ Data is inaccessible due to technical / administrative issues
Decision Making
▪ Decisions can be made based on
✓ Facts or data
✓ Simulation (models)
✓ Intuition, perception, sense
✓ Group negotiation
▪ Problem
✓ Gap between Data & Knowledge (useful information leading to decision)
✓ Management / Operation by intuition
✓ Lack of effective feedback & alignment systems
✓ Need for good analytical processing & models
Example-1
HelloFresh Centralized Digital Reporting
Company: HelloFresh
Example-2
Coke bottling company maximized operational efficiency
Company: Coca-Cola Bottling Company (CCBC), Coca-Cola’s largest independent
bottling partner
Solution: Coca-Cola’s BI team handles reporting for all sales and delivery
operations at the company. With their BI platform, the team automated manual
reporting processes, saving over 260 hours a year—more than six 40-hour work
weeks.
Report automation and other enterprise system integrations put customer
relationship management (CRM) data back into the hands of sales teams in the
field through mobile dashboards that provide timely, actionable information and a
distinct competitive advantage.
A self-service BI implementation fosters more effective collaborations between IT
and business users that maximize the expertise of participants. Analysts and IT
can focus on big-picture strategy and long-term innovations such as enterprise
data governance rather than manual research and reporting tasks.
Example-3
Chipotle created a unified view of restaurant operations
Company: Chipotle
Problem: Disparate data sources hindered teams from seeing a unified view of
restaurants.
Solution: Chipotle Mexican Grill is an American restaurant chain with more than
2,400 locations worldwide. Chipotle retired their traditional BI solution for a
modern, self-service BI platform. This allowed them to create a centralized view
of operations so they can track restaurant operational effectiveness at a national
scale.
Now that staff have more access to data, the speed of report delivery for
strategic projects has tripled from quarterly to monthly and saved thousands of
hours. “This was the ticket to take all metrics and understanding to that next
level,” explained Zach Sippl, Director of Business Intelligence.
Business Intelligence vs Business Analytics
Business Intelligence is descriptive, telling you what's
happening now and what happened in the past to get us to that
state. It helps us to solve our current problem, by analyzing our
current state of business affairs, and providing us with insights to
solve the problem.
Information Processing
Transactional Processing
▪ Focus on individual data item processing, e.g. data insertion,
modification, deletion & transmission
Analytical Processing
▪ Focus on reporting, analysis, transformation & decision
support
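The contrast between the two modes can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical "orders" table; the schema and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# Transactional processing: each statement touches individual records.
conn.execute("INSERT INTO orders (region, amount) VALUES ('North', 120.0)")
conn.execute("INSERT INTO orders (region, amount) VALUES ('South', 80.0)")
conn.execute("INSERT INTO orders (region, amount) VALUES ('North', 50.0)")
conn.execute("UPDATE orders SET amount = 100.0 WHERE id = 2")  # modification
conn.execute("DELETE FROM orders WHERE id = 3")                # deletion
conn.commit()

# Analytical processing: one query summarizes many records for reporting.
sales_by_region = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
print(sales_by_region)
```

The transactional statements each change one record, while the analytical query reads across all records to produce a summary for decision support.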
BI – Process Flow
▪ The collection of raw data from different sources, by different means & in different formats.
▪ Data Preparation: The organization and transformation of data into clean and common models and formats.
▪ The refined data is modeled and stored in a data management system for quality management, easy and fast access, and data profiling.
▪ The process involves analytical components, such as dimensional analysis, statistical analysis, business analytics & data mining, to extract information and knowledge.
▪ Results are presented and delivered in different human-comprehensible formats to support decisions. This also includes data exploration & reporting.
▪ Queries can also present results directly to users without intensive analysis. This is usually used for data exploration & descriptive reports.
Evolution of BI
2010s: Analytics, Big Data, Mobile BI, Personal BI, In-memory Database, Data Science
Critical Capabilities of a BI Platform
▪ Infrastructure
✓ BI Platform Administration: Platform scaling; Performance optimization; High availability
✓ Cloud BI: Platform as a service; Analytic application as a service
✓ Security & User Administration: Platform security; User administration & auditing
✓ Data Source Connectivity: Capability to connect to the data source
▪ Data Management
✓ Governance & Metadata Management: Robust & centralized way for administrators to search, re-use & publish metadata
✓ Self-Contained ETL & Data Storage: Platform capabilities for loading data into a self-contained storage area
✓ Self-Service Data Preparation: User-defined creation of views; Advanced features such as semantic auto-discovery, joins etc.
▪ Sharing of Findings
✓ Embedding Analytic Content: Software developer’s kit with APIs; Open APIs for creating analytic content
✓ Publishing Analytic Content: Publish, deploy & operationalize analytic content
✓ Collaboration and Social BI: Share & discuss information; Analysis & decision making through collaboration
BI System Components (at a glance)
1. Data
▪ Defines which data will be loaded into the system & analyzed
▪ Where do we need to store the information
▪ Technology Dependency
✓ MSSQL, MySQL, Oracle, Red Brick, DB2
✓ OLAP type data source
▪ Data summarization
▪ Using Database Queries
✓ SQL – MSSQL & MySQL
✓ PL/SQL – Oracle
2. Extract-Transform-Load (ETL)
▪ ETL is responsible for moving the source data into the Data Warehouse
▪ This is a complex step that involves modifications and calculations
on the data itself
▪ This is a critical step to make sure that the BI solution is effective
3. Data Warehousing
▪ A Data Warehouse is an analytically oriented, integrated, time-variant
and non-volatile collection of data that supports decision-making
processes
▪ It connects electronic data from different operational systems so that
the data can be queried and analyzed over time for business
decisions
▪ A Data Warehouse consists of large databases that aggregate data
collected from multiple sources.
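The terms integrated, time-variant and non-volatile can be made concrete with a small sketch. It assumes a toy warehouse held in a Python list; the source names ("crm", "billing"), customer IDs and balances are hypothetical.

```python
from datetime import date

# Toy warehouse: an append-only list of rows (an assumption for illustration).
warehouse = []

def load_snapshot(source, snapshot_date, balances):
    """Append one dated snapshot; existing rows are never updated or deleted."""
    for customer, balance in balances.items():
        warehouse.append({
            "source": source,                # integrated: many source systems
            "snapshot_date": snapshot_date,  # time-variant: every row is dated
            "customer": customer,
            "balance": balance,
        })

load_snapshot("crm", date(2021, 1, 31), {"C1": 100})
load_snapshot("billing", date(2021, 1, 31), {"C2": 250})
load_snapshot("crm", date(2021, 2, 28), {"C1": 140})

# Non-volatile: the January value of C1 is still available after the February load.
history = [
    (row["snapshot_date"].isoformat(), row["balance"])
    for row in warehouse
    if row["customer"] == "C1"
]
print(history)
```

Because rows are only appended, the same customer can be analyzed over time, which an operational system that overwrites the current balance would not allow.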
4. Analytical Engine
▪ Analytical Engine analyzes multi-dimensional data sets found in a Data
Warehouse to identify trends, outliers & patterns
▪ It applies Data Mining for extraction of patterns from data.
▪ Data Mining is becoming an increasingly important tool to transform data
into information. It is commonly used in a wide range of profiling practices,
for example marketing, surveillance, fraud detection & scientific discovery.
▪ Data Mining can be used to uncover patterns in data, but is often carried
out only on samples of data. The mining process will be ineffective if the
samples aren’t a good representation of the larger population.
▪ Data Mining can’t discover patterns that may be present in the larger
population, if those patterns aren’t present in the sample being “mined”.
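The effect of an unrepresentative sample can be shown with a short sketch. The population below is hypothetical: 1,000 transactions in which only the last 10 are fraudulent, so a sample taken from the start of the data never sees the pattern.

```python
# Hypothetical population: only transactions 990-999 are fraudulent.
population = [{"id": i, "fraud": i >= 990} for i in range(1000)]

def fraud_rate(records):
    """Share of records flagged as fraud."""
    return sum(r["fraud"] for r in records) / len(records)

# Unrepresentative sample: the first 100 records contain no fraud at all.
biased_sample = population[:100]

# Systematic sample: every 10th record, spread across the whole population.
representative_sample = population[9::10]

print(fraud_rate(population))             # rate in the full population
print(fraud_rate(biased_sample))          # the pattern is absent here
print(fraud_rate(representative_sample))  # the pattern is preserved here
```

Any mining done on the first sample would conclude that fraud does not occur, which is the limitation described in the bullets above.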
5. Presentation Layer
▪ It consists of Dashboards, Reports & alerts that present findings from the analysis
▪ It is technology agnostic and meant for the end-user. It is independent of how, when,
where and why the user accesses the information
▪ Interactive Dashboards
✓ Dashboard is a set of high-level reports on key metrics, typically for management system users
✓ There could be multiple reports on a single dashboard (like a car’s dashboard)
✓ It gives users an at-a-glance view of key trends & metrics. It can be customized to work for anyone in an
organization, for example a Sales Rep, Frontline Operations Manager, Middle-level Manager or Senior
Executive
✓ An Interactive dashboard allows users to take those dashboard reports and filter information to more
deeply analyze trends & results or to “drill down” into deeper and more detailed analysis of data
▪ Customizable Reports
✓ They present high-level findings and enable a user to drill down to find specific details. Most BI
systems come with report templates and/or provide the capability to create & customize reports.
▪ Alerts
✓ They notify users of changes selected as key to meeting user goals. They can be set to warn users of an
imminent event, of changes to data, or that new data needs to be entered into the system.
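As a rough sketch of how an alert rule might work, the function below checks each metric against an allowed range and also flags metrics with no data. The function name check_alerts, the metric names and the thresholds are all hypothetical; real BI platforms configure alerts through their own interfaces.

```python
def check_alerts(metrics, rules):
    """Return one message per metric that is missing or outside its range."""
    alerts = []
    for name, (low, high) in rules.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: new data needs to be entered")
        elif not low <= value <= high:
            alerts.append(f"{name}: value {value} outside [{low}, {high}]")
    return alerts

# Hypothetical rules: metric name -> (lowest acceptable, highest acceptable).
rules = {
    "daily_sales": (1000, 50000),
    "stock_level": (200, 10000),
    "returns": (0, 50),
}
# Today's metrics: sales are too low and the returns figure is missing.
metrics = {"daily_sales": 750, "stock_level": 900}

alerts = check_alerts(metrics, rules)
for message in alerts:
    print(message)
```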
Data Management
▪ Data Management refers to a special Database System
called Data Warehouse or Data Mart, that’s often used to
store Enterprise Data
✓ Purpose of Data Warehouse is to organize lots of stable data for ease of analysis & retrieval
✓ Heterogeneity
❖ Individual databases usually manage data in very different ways, even in the same organization (not to
mention external data sources which may be dramatically different)
Data Gathering & Integration
▪ Enterprise-level data comes from multiple different sources, but needs to be
combined & associated
✓ Operational Databases
✓ Spreadsheets
✓ Text, CSV
✓ PDF, Paper
▪ There is a need to bring together different data / information
✓ Autonomous (may not have the control & management of data)
✓ Distributed (from different systems & places)
✓ Different (in data model, format or platform)
▪ General processing steps – ETL
✓ Extraction – Accessing & extracting the data from the source systems, incl.
databases, flat files, spreadsheets etc.
✓ Transformation – Cleansing the data, i.e. changing the extracted data to a format &
structure that conforms to the destination data.
✓ Loading – Loading the data into the destination database and checking for data integrity.
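The three ETL steps can be sketched end to end with the Python standard library. This is a minimal illustration under stated assumptions: the CSV extract, the fact_orders table and the cleansing rules (trim names, drop rows without an amount, convert dates to ISO format) are hypothetical, and production ETL tools do considerably more.

```python
import csv
import io
import sqlite3

# Hypothetical source extract (a flat file held in memory for the example).
SOURCE_CSV = """order_id,customer,amount,order_date
1, alice ,120.50,23/01/2021
2,BOB,80,30/01/2021
3,carol,,06/02/2021
"""

def extract(text):
    """Extraction: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: cleanse and convert to the destination structure."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # drop records with a missing amount
        day, month, year = row["order_date"].split("/")
        clean.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            f"{year}-{month}-{day}",  # DD/MM/YYYY -> ISO date
        ))
    return clean

def load(conn, rows):
    """Loading: insert rows, then run a simple integrity check on the count."""
    conn.execute(
        "CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, amount REAL NOT NULL, order_date TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    loaded = conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
    if loaded != len(rows):
        raise ValueError("row count mismatch after load")
    return loaded

conn = sqlite3.connect(":memory:")
loaded_count = load(conn, transform(extract(SOURCE_CSV)))
loaded_rows = conn.execute(
    "SELECT * FROM fact_orders ORDER BY order_id"
).fetchall()
print(loaded_count, loaded_rows)
```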
Data Mining
▪ Data Mining (or Knowledge Discovery in Databases, KDD)
✓ Processes & techniques for seeking knowledge (relationship, trends,
patterns etc.) from a large amount of data
✓ Nontrivial, non-obvious & implicit knowledge
✓ Extremely large datasets
▪ Data Mining applications use sophisticated statistical &
mathematical techniques to find patterns and relationships among
data
✓ Classification, clustering, association, estimation, prediction, trending,
pattern etc.
▪ Common techniques
✓ Neural network, genetic algorithm, machine learning
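To give a feel for one of the tasks listed above, clustering, here is a minimal sketch of k-means on one-dimensional data. K-means is not named in the slides; it is chosen here only as a simple, well-known clustering method. The monthly spend figures and the starting centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for 1-D data with caller-supplied starting centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical monthly spend per customer.
spend = [12, 15, 14, 90, 95, 100]
centers, clusters = kmeans_1d(spend, centers=[0.0, 50.0])
print(centers)
print(clusters)
```

The algorithm separates low spenders from high spenders without being told which group any customer belongs to, which is what distinguishes clustering from classification.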
Data Presentation
▪ The last mile of BI is the presentation of data or analysis to
human users
✓ It makes use of visualization techniques to help human understanding
and sense making
▪ Data visualization is the visual & interactive exploration and
graphic representation of data of any size, type (structured or
unstructured) or origin
▪ Data visualization as a decision-making catalyst
✓ As organizations seek to empower non-technical users to make data-driven
decisions, they must consider the prowess of data visualization
in delivering digestible insights.
▪ Visualization can also be part of the analysis process (visual
analytics)
Data Presentation / Visualization Tool
▪ Reports
✓ A report is the presentation of detailed data arranged in defined layouts & formats
✓ Based on simple and direct queries, it usually involves simple analysis & transformation of
data (sorting, calculating, filtering, grouping, formatting etc.)
✓ Traditional reports contain detailed data in a tabular format and typically display numbers &
texts only with limited interactivity
✓ Modern reports can be interactive and visual but the focus is still on detailed data
▪ Dashboard
✓ It’s a visual display of the most important information needed to achieve one or more
objectives; consolidated or arranged on a single screen so the information can be monitored
at a glance
✓ It’s a set of visualization or presentation of data views organized in a single screen / page
Analysis Tools
▪ Descriptive Reporting
✓ Structured & fixed format reports
✓ Reports based on simple & direct queries
✓ Involves simple descriptive analysis & transformation of data, such as calculating,
sorting, filtering, grouping & formatting
✓ Ad hoc query & reporting
▪ OLAP (Online Analytical Processing)
✓ A multi-dimensional analysis & reporting application for aggregated data
✓ Great for discovering details from large quantities of data
▪ Business Analytics
✓ Practice of iterative, methodical exploration of an organization’s data with
emphasis on statistical analysis
▪ Data Mining
✓ Data Mining techniques are a blend of statistics & mathematics and also include
AI & ML
OLAP
▪ Multi-dimensional queries
✓ A dimension is a particular way (or an attribute) of describing & categorizing
data
✓ Such queries are usually arithmetic aggregation operations (e.g. sum, avg etc.)
on records grouped by multiple dimensions (attributes) at different aggregation
levels
▪ OLAP is a function / operation that is optimized to answer queries that are
multi-dimensional in nature
✓ OLAP solutions traditionally rely heavily on backend processing and dedicated IT
personnel
▪ Examples
✓ What is the Total Sales grouped by Product Line (dimension-1), Location
(dimension-2), Time (dimension-3) and … other dimensions
✓ Which segment of business provides the most revenue growth?
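A multi-dimensional query of the "Total Sales grouped by ..." kind can be sketched in plain Python. The sales records and the helper total_sales are hypothetical; a real OLAP engine pre-computes and indexes such aggregates rather than scanning a list.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"product_line": "Laptops", "location": "Delhi",  "quarter": "Q1", "amount": 500},
    {"product_line": "Laptops", "location": "Delhi",  "quarter": "Q2", "amount": 700},
    {"product_line": "Laptops", "location": "Mumbai", "quarter": "Q1", "amount": 300},
    {"product_line": "Phones",  "location": "Delhi",  "quarter": "Q1", "amount": 200},
    {"product_line": "Phones",  "location": "Mumbai", "quarter": "Q2", "amount": 400},
]

def total_sales(facts, dimensions):
    """Sum the amount measure, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)

# Two dimensions: product line x location.
by_line_location = total_sales(sales, ["product_line", "location"])
# Roll-up to a higher aggregation level: product line only.
by_line = total_sales(sales, ["product_line"])
# Highest level: no dimensions, i.e. the grand total.
grand_total = total_sales(sales, [])

print(by_line_location)
print(by_line)
print(grand_total)
```

Changing the list of dimensions moves between aggregation levels, which is the "different aggregation levels" idea in the bullets above.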
Reporting & Delivery
BI Users
BI Application Areas
▪ BI can be applied in all “businesses” (industries, functional areas
or domains) to drive “business” performance in both private and
public sector
✓ Private Sectors – Retail, Manufacturing, Real Estate, Financial, Sports,
Media, Entertainment, Publication etc.
✓ Public Sectors – Education, Government, Healthcare, Association etc.
▪ BI can be applied at different levels
✓ Strategic Level: Focused on high level organizational strategies &
directions
✓ Tactical Level: Focused on goals of an Organization Unit
✓ Operational Level: Focused on streamlining day-to-day operations
Sample BI Applications
▪ Business Management
✓ Strategic Planning
✓ Performance Management
✓ Process Intelligence
✓ Competitive Intelligence
✓ Benchmarking
▪ Marketing & Sales
✓ Customer Relationship Management
✓ Customer Behavior Analysis
✓ Targeted Marketing & Sales
✓ Customer Profiling
✓ Campaign Management
▪ IT Management
✓ Web Analytics
✓ Usage Analytics
✓ Security Management
▪ Supply Chain
▪ Logistics
✓ Supplier & Vendor Management
✓ Shipping & Inventory Management
▪ Healthcare Management
▪ Insurance
▪ City Planning
BI Vendors
▪ Microsoft: SQL Server, Power BI, SharePoint, Excel
Thank You