MIS Week 2


Management Information System

Saini Das
Vinod Gupta School of Management, IIT Kharagpur

Module 02: Foundations of Business Analytics


Lecture 01 : Databases & Information Management
File organization concepts

Computer system uses hierarchies

• Bit: Smallest unit of data a computer can handle
• Byte: Group of bits, represents a single character
• Field: Group of characters (bytes) forming a word, a group of words, or a complete number
• Record: Group of related fields
• File: Group of records of same type
• Database: Group of related files
File organization concepts (contd..)

• Record: Describes an entity

• Entity: Person, place, thing about which we store information


• Attribute: Each characteristic, or quality, describing entity
• E.g. the attributes Course and Grade belong to the entity STUDENT
The Data Hierarchy
Problems with the traditional files

• Data redundancy and inconsistency


• Data redundancy: Presence of duplicate data in multiple files
• Data inconsistency: Same attribute has different values

• Poor security

• Lack of data sharing and availability


The Database Approach to Data Management
• Database
• Collection of data organized to serve many applications by
centralizing data and controlling redundant data
• Database management system
• Interfaces between application programs and database
• Separates logical and physical views of data
• Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Enables central management and security
Human Resources Database with Multiple Views

A single human resources database provides many different views of data, depending on the
information requirements of the user. Illustrated here are two possible views, one of interest to a
benefits specialist and one of interest to a member of the company’s payroll department.

Relational DBMS
• Relational DBMS
• Represents data as two-dimensional tables called relations or files
• Each table contains data on an entity and its attributes
• Examples: Microsoft Access, Oracle Database, MySQL, Microsoft SQL Server
• Table: Grid of columns and rows
• Rows (tuples): Records for different entities
• Fields (columns): Represent attributes of the entity
• Primary key: Field in a table that uniquely identifies each record
• Foreign key: Primary key of one table used in a second table as a look-up field to identify records from the original table
Relational Database Tables

A relational database organizes data in the form of two-dimensional tables. Illustrated here are tables for
the entities SUPPLIER and PART showing how they represent each entity and its attributes.
Supplier_Number is a primary key for the SUPPLIER table and a foreign key for the PART table.
Relational Database Tables ….
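
The roles of the primary key and the foreign key can be sketched with Python's built-in sqlite3 module. This is only an illustration: the lowercase table and column names, the sample rows and the helper try_insert are assumptions, not taken from the slide's figure.

```python
import sqlite3

# In-memory database; all rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,  -- uniquely identifies each supplier
        supplier_name   TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,  -- uniquely identifies each part
        part_name       TEXT NOT NULL,
        supplier_number INTEGER NOT NULL
            REFERENCES supplier (supplier_number)  -- foreign key into SUPPLIER
    )
""")

conn.execute("INSERT INTO supplier VALUES (1, 'Supplier A')")
conn.execute("INSERT INTO part VALUES (137, 'Part X', 1)")


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it (hypothetical helper)."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second supplier with an existing primary key value is rejected.
duplicate_key_ok = try_insert("INSERT INTO supplier VALUES (1, 'Supplier B')")
# A part that refers to a supplier number that does not exist is rejected.
orphan_part_ok = try_insert("INSERT INTO part VALUES (150, 'Part Z', 99)")
```

Both inserts fail, which is the point: the primary key keeps records unique, and the foreign key keeps every PART row tied to a real SUPPLIER row.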
Operations of a Relational DBMS

Three basic operations used to develop useful sets of data
• SELECT: Creates a subset of data consisting of all records that meet stated criteria
• JOIN: Combines relational tables to provide the user with more information than is available in individual tables
• PROJECT: Creates a subset of columns in a table, creating tables with only the information specified
The Three Basic Operations of a Relational DBMS

Find suppliers for parts 137 or 150

The select, project, and join operations enable data from two different tables to be
combined and only selected attributes to be displayed.
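
As a rough sketch, the three operations can be written as small Python functions over lists of dictionaries. The function names, the column names and all sample rows are assumptions for illustration; only the part numbers 137 and 150 come from the slide.

```python
# Invented sample rows; only the part numbers 137 and 150 come from the slide.
suppliers = [
    {"Supplier_Number": 1, "Supplier_Name": "Supplier A"},
    {"Supplier_Number": 2, "Supplier_Name": "Supplier B"},
    {"Supplier_Number": 3, "Supplier_Name": "Supplier C"},
]
parts = [
    {"Part_Number": 137, "Part_Name": "Part X", "Supplier_Number": 1},
    {"Part_Number": 145, "Part_Name": "Part Y", "Supplier_Number": 2},
    {"Part_Number": 150, "Part_Name": "Part Z", "Supplier_Number": 3},
]


def select(rows, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """JOIN: combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(rows, columns):
    """PROJECT: keep only the named columns."""
    return [{c: row[c] for c in columns} for row in rows]


# Find suppliers for parts 137 or 150.
wanted_parts = select(parts, lambda p: p["Part_Number"] in (137, 150))
combined = join(wanted_parts, suppliers, "Supplier_Number")
result = project(
    combined,
    ["Part_Number", "Part_Name", "Supplier_Number", "Supplier_Name"],
)
```

A real DBMS decides the order of these operations itself; here they are applied one after another only to keep each step visible.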
Capabilities of DBMS

• Data manipulation language: Used to add, change, delete, retrieve data from database
• Structured Query Language (SQL)
• Microsoft Access user tools

• Many DBMS have report generation capabilities for creating polished reports
Example of an SQL Query

SELECT: Lists the fields to be included in the query result
FROM: Lists the tables from which the data is to be drawn
WHERE: Specifies the conditions that records must meet to be included in the result

Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150
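
The SQL statements themselves are not reproduced in this text, so the following is one possible way to write such a query, run through Python's sqlite3 module. The column names and every sample row are assumptions; only the part numbers 137 and 150 come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SUPPLIER (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
    CREATE TABLE PART (Part_Number INTEGER PRIMARY KEY, Part_Name TEXT,
                       Supplier_Number INTEGER REFERENCES SUPPLIER (Supplier_Number));
    -- Invented sample rows; only part numbers 137 and 150 come from the slide.
    INSERT INTO SUPPLIER VALUES (1, 'Supplier A'), (2, 'Supplier B'), (3, 'Supplier C');
    INSERT INTO PART VALUES (137, 'Part X', 1), (145, 'Part Y', 2), (150, 'Part Z', 3);
""")

query = """
    SELECT PART.Part_Number, PART.Part_Name,
           SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name   -- fields to display
    FROM PART, SUPPLIER                                       -- tables to draw from
    WHERE PART.Supplier_Number = SUPPLIER.Supplier_Number     -- join condition
      AND PART.Part_Number IN (137, 150)                      -- selection condition
    ORDER BY PART.Part_Number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Writing the selection as IN (137, 150) avoids the operator-precedence pitfall of mixing AND with OR without parentheses.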
References

• K. Laudon and J. Laudon (2016). Management Information Systems. Publisher: Pearson. Edition 14e.
• R. De. (2018). MIS Managing Information Systems in Business, Government and Society. Publisher: Wiley. Second Edition.

Data Warehouses & Business Intelligence
E-R (Entity-Relationship Diagrams)
• E-R Models are descriptions of the business requirements of data
from the user perspective.
• They are often the first step in database design.
• The diagrammatic tools used to create these models are called E-R diagrams.
• An E-R diagram consists of data entities, relationships between
these entities and attributes that describe the entities and
relationships.
• Relationships identify natural links or associations between
entities.
E-R (Entity-Relationship Diagrams) contd..

[Figure: Movie entity with attributes]

• The number of entity instances that can take part in a relationship represents the cardinality of the relationship.

[Figure: Cardinality of relationship. "Student Takes Course", annotated with the numbers 5, 75, 0 and 6]
Normalization
Streamlining complex groupings of data to minimize redundant
data elements.
Students Engg. Mechanics

Roll No.  Name    Dept.        HoD      Dept. Contact no.
101       Sachin  Electrical   Prof. X  1234567
102       Rahul   Mechanical   Prof. Y  4567899
103       Saurav  Electronics  Prof. Z  6789048
104       Virat   Mechanical   Prof. Y  4567899
105       Dhoni   Electrical   Prof. X  1234567
106       Anil    Mechanical   Prof. Y  4567899
Normalization (contd..)

Students Engg. Mechanics

Roll No.  Name    Dept. Id
101       Sachin  001
102       Rahul   002
103       Saurav  003
104       Virat   002
105       Dhoni   001
106       Anil    002

Department

Dept. Id  Dept.        HoD      Dept. Contact no.
001       Electrical   Prof. X  1234567
002       Mechanical   Prof. Y  4567899
003       Electronics  Prof. Z  6789048
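
The split shown on the two normalization slides can be reproduced with a short Python sketch. The rows are the ones from the slide; the variable names and the rule of numbering departments in order of first appearance are assumptions.

```python
# The unnormalized table from the slide: department details repeat on every student row.
flat_rows = [
    (101, "Sachin", "Electrical",  "Prof. X", "1234567"),
    (102, "Rahul",  "Mechanical",  "Prof. Y", "4567899"),
    (103, "Saurav", "Electronics", "Prof. Z", "6789048"),
    (104, "Virat",  "Mechanical",  "Prof. Y", "4567899"),
    (105, "Dhoni",  "Electrical",  "Prof. X", "1234567"),
    (106, "Anil",   "Mechanical",  "Prof. Y", "4567899"),
]

departments = {}  # Dept. Id -> (Dept., HoD, Dept. Contact no.)
dept_ids = {}     # Dept. name -> Dept. Id, assigned in order of first appearance
students = []     # (Roll No., Name, Dept. Id)

for roll_no, name, dept, hod, contact in flat_rows:
    if dept not in dept_ids:
        dept_id = f"{len(dept_ids) + 1:03d}"  # "001", "002", ...
        dept_ids[dept] = dept_id
        departments[dept_id] = (dept, hod, contact)
    students.append((roll_no, name, dept_ids[dept]))
```

After the split, each HoD name and contact number is stored once, so a change of HoD means updating one Department row instead of every matching student row.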
Using Databases to Improve Business Performance &
Decision Making

• Very large organizations with huge amounts of data felt the need for:
• Consolidating much of the data from various databases into a whole that
could be understood clearly.
• Focusing on the use of data for decision making, as opposed to simply for
running transactions.
Using Databases to Improve Business Performance &
Decision Making (contd..)

• This gave rise to special capabilities and tools required for analyzing large quantities of data:
• Data warehouses
• Data marts
• Data mining
Data Warehouses
• Stores current and historical data from many core operational
transaction systems
• Consolidates and standardizes information for use across enterprise,
but data cannot be altered.
• To create a data warehouse, data is extracted from transactional
tables, pre-processed to remove unwanted data types and then
loaded into tables in the data warehouse.
• Warehouses differ from transaction databases as users can run
complex queries on them.
• Data warehouses provide querying, analysis, and reporting tools.

Data Mart

• Subset of data warehouse with summarized or highly focused portion of firm’s data for use by specific population of users.
• Typically focuses on single subject or line of business.
• Used to identify problems and find solutions pertaining to a
particular domain.
• Example: Sales data mart.
Components of a Data Warehouse

The data warehouse extracts current and historical data from multiple operational systems inside the
organization. These data are combined with data from external sources and reorganized into a central
database designed for management reporting and analysis. The information directory provides users
with information about the data available in the warehouse.
Business Intelligence
• Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions
• E.g. Harrah’s Entertainment analyzes customer data to develop gambling profiles and identify its most profitable customers
• Principal tools to derive BI from data in a warehouse include:
• Software for database querying and reporting
• Online analytical processing (OLAP)
• Data mining
What drives Business Intelligence?
Online analytical processing (OLAP)
• Supports multidimensional data analysis
• Enables viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region,
time period) is different dimension
• E.g. How many cycles were sold in Eastern India in June?
• OLAP enables rapid, online answers to ad hoc queries
Multidimensional Data Model

• What were the actual sales of Bolts in Central India?
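
A multidimensional query of this kind can be sketched in Python as a filter-and-sum over fact rows. The three dimensions (product, region, measure), the function name aggregate and all sales figures below are assumptions for illustration; the slide's figure is not reproduced here.

```python
# Each fact is (product, region, measure, value). All values are invented sample data.
facts = [
    ("Bolts",   "Central", "actual",    120),
    ("Bolts",   "Central", "projected", 150),
    ("Bolts",   "East",    "actual",     80),
    ("Nuts",    "Central", "actual",     60),
    ("Washers", "West",    "actual",     40),
]
DIMENSIONS = ("product", "region", "measure")


def aggregate(facts, **filters):
    """Sum the values of all facts that match the given dimension filters."""
    total = 0
    for *coords, value in facts:
        cell = dict(zip(DIMENSIONS, coords))
        if all(cell[dim] == wanted for dim, wanted in filters.items()):
            total += value
    return total


# "What were the actual sales of Bolts in Central India?"
bolts_central_actual = aggregate(facts, product="Bolts", region="Central", measure="actual")
# Dropping a filter views the same data along fewer dimensions.
actual_in_central = aggregate(facts, region="Central", measure="actual")
```

OLAP tools typically precompute or index such aggregates so that ad hoc questions return quickly; this sketch simply scans the rows.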


Introduction to Data Mining
- W. Edwards Deming
“There is a striking correlation between an organization's analytics sophistication and its competitive performance.”

10 Insights: A first look at the new intelligent enterprise survey on winning with data, MIT Sloan Management Review, Vol 52, No 1, 2010

“Data Scientist: The Sexiest Job of the 21st Century”

Harvard Business Review 2012


Why mine data?

• Huge amounts of data being collected and warehoused
  • Walmart records 20 million items in transactions per day
  • Health care transactions: multi-gigabyte databases
• Source of competitive advantage


What is data mining and KDD?

• Knowledge discovery in databases (KDD) is the non-trivial process of identifying valid, potentially useful, understandable and ultimately actionable patterns in data.

• Data mining is a step in the KDD process of applying data analytics and discovery algorithms
Knowledge Discovery Process

[Figure: Knowledge discovery process. Labels: Raw Data, Integration, Data Warehouse, Target Data, Transformed Data, Patterns and Rules, Interpretation & Evaluation, Knowledge, Understanding]
DM is multidisciplinary

[Figure: Data Mining shown together with Statistics, Machine learning and Database systems]
Data Mining Applications

Typical Applications
• Customer Segmentation: What are my market segments and who are my customers in each segment?
• Propensity to Buy: Which customers are good candidates for our new long distance calling plans?
• Profitability Modeling & Profiling: What is the life time profitability of my customers?
• Customer Attrition: Which of my most valuable customers are at risk of leaving?
• Channel Optimization: What is the best channel to reach my customers by market segment?
• Fraud Detection: How can I tell which transactions are likely to be fraudulent?

• Personalize customer relationships.
• Targeting customers based on their needs.
• Increase high value customers based on current & future profitability.
• Prevent loss of high value customers and let go of lower value customers.
• Interact with customers based on their preference.
• Detect and prevent fraud to minimize loss.
• More product sales; higher satisfaction = greater loyalty; higher retention.
Industry wide applications of analytics
Some other applications….
• Marketing
• Which customers are likely to respond to this campaign?
• Which customers are likely to be profitable ?
• Who might want to buy this product ? (‘Cross-selling’)
• Which web-pages are customers visiting before buying products or
before leaving our site ? Which types of customers visit which pages ?
• Telecommunications
• Which customers are vulnerable to attrition (at risk of churning) ?
• Based on these symptoms, where are problems located in the network ?
• Finance and Insurance
• Which customers are credit risks / insurance risks ?
• Which claims or credit transactions are fraudulent ?
• Which stocks are likely to perform well in the next 3 months, and why ?
When should we buy and sell, given the likely performance and
transaction costs ?

Some other applications (Contd..)
• Healthcare
• Which patients may take longer to recover ?
• What is the likely cause of this illness ?
• Which patients are at risk of disease (and might benefit from
medication)? Pfizer pharmaceuticals used data mining to construct a
predictive model that was then embedded in their online cholesterol
health risk assessment, which tells patients their cholesterol risk score.
High risk patients can consult their doctors and request Lipitor, Pfizer’s
cholesterol medication.

• Retail
• Which products do customers buy together (or in sequence) & which do
they not buy together ? (‘Category management’.)
• What characterizes customers at various stores ?
• What items are bought for cash, on credit, or by check ?
• What type of customer buys this item, or this product type ?

Data Mining Applications (Contd..)

• Quality Control
• Which shipments are high-risk and need to be inspected ?
• Customer Support
• Which task schedule (ordering) is optimal (or good enough) ?
• Which customer service representative should I assign to a task ?
• What documents or people are likely to be helpful to the customer in
solving their problem ?

Real-life Applications…
At Flipkart…

• Forecast demand for each SKU.

• Predict customer cancellations and returns.

• Predict what a customer is likely to purchase in the future.

• Optimize the delivery system.



Data Analytics Tools and Techniques
Divorce360.com

Target predicts customer pregnancy

https://www.youtube.com/watch?v=XH1wQEgROg4

Categories of Data Analytics

Descriptive Analytics Applications

Diagnostic Analytics Applications

• Why customers liked your social media campaign or why they didn’t?
• Why certain products were popular at a certain time, at a certain place?

Predictive Analytics Applications

Prescriptive Analytics Applications
Analytics Tools & Techniques
Data Analytics
• Descriptive analytics: Data Visualization, Clustering
• Diagnostic analytics: Root cause analysis
• Predictive analytics: Regression, Forecasting, Classification (Decision tree, Neural nets, SVM, Logistic Regression), Association Rule Mining
• Prescriptive analytics: Optimization
Descriptive Analytics Techniques
• Data Visualization: Graphical representation of data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.
• Software for Data Visualization: MS Excel, Power BI, Tableau, QlikView
Descriptive Analytics Techniques (contd..)

• Clustering: Given a set of data points, each having a set of attributes, find clusters such that:
  • data points in one cluster are more similar to one another (intracluster distances are minimized)
  • data points in separate clusters are less similar to one another (intercluster distances are maximized)
• Application in Market Segmentation: Subdivide a market into distinct subsets of customers based on their geographical and lifestyle related information, where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.
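
The clustering idea can be sketched with k-means, one common clustering algorithm; the slides do not name a specific algorithm, so this choice is an assumption. The customer points, the starting centroids and the function names are invented for illustration.

```python
import math


def assign(points, centroids):
    """Group each point with its nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and moving centroids until nothing changes."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(c) if c else old for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Invented customers described by two numeric attributes.
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, segments = kmeans(customers, initial_centroids=[(0, 0), (10, 10)])
```

Each returned segment groups customers that lie close together, which is the sense in which a segment could be addressed with its own marketing mix.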
Predictive Analytics Techniques
• Regression analysis: Regression analysis is a statistical technique used to describe relationships among variables. The simplest case to examine is one in which a variable Y, referred to as the dependent or target variable, may be related to one variable X, called an independent or explanatory variable.

• Applications:
  • Predicting events that are yet to occur:
    • Demand analysis or number of units consumers will purchase in the next quarter based on past trends
    • Number of shoppers who will watch a particular advertisement.
    • Number of policy holders who will be involved in accidents in the next year.
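
The simplest case, one dependent variable Y and one explanatory variable X, can be sketched as a straight-line fit by ordinary least squares. The quarterly sales figures and the function name fit_line are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: units sold in four past quarters.
quarters = [1, 2, 3, 4]
units_sold = [110, 120, 130, 140]

intercept, slope = fit_line(quarters, units_sold)
forecast_q5 = intercept + slope * 5  # predicted demand for the next quarter
```

With these perfectly linear sample values the fitted line is y = 100 + 10x; real demand data would scatter around the line rather than sit on it.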
Predictive Analytics Techniques (contd..)
• Classification: Data defined in terms of attributes, one of which is the class or
target and others are predictor variables.
• Find a model for the class/target attribute as a function of the values of the other (predictor) attributes, such that previously unseen records can be assigned a class as accurately as possible. The given data is usually divided into training and test sets.
• Training Data: used to build the model
• Test data: used to validate the model (determine accuracy of the model).
• Classification Techniques:
• Decision tree
• Neural Networks
• Support Vector Machines
• Bayesian Classifier

Predictive Analytics Techniques (contd..)
• Application 1: Given old data about fraudulent customers and their background, predict whether new customer will commit fraud or not.

Predictor attributes: Refund, Marital Status, Taxable Income. Class/Target: Cheat.

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

[Figure: The training set is used to learn a model (the classifier). On the test set, the class predicted by the algorithm is compared with the actual class.]
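
How a classifier is checked against held-out records can be sketched as follows. The rows are the ones from the table; everything else is an assumption: the slide does not say which rows form the test set, and the rules in classify are written by hand as an example of what a decision-tree learner might produce, not learned by this code.

```python
# Rows from the slide's table: (Tid, Refund, Marital Status, Taxable Income in K, Cheat)
records = [
    (1, "Yes", "Single", 125, "No"),
    (2, "No", "Married", 100, "No"),
    (3, "No", "Single", 70, "No"),
    (4, "Yes", "Married", 120, "No"),
    (5, "No", "Divorced", 95, "Yes"),
    (6, "No", "Married", 60, "No"),
    (7, "Yes", "Divorced", 220, "No"),
    (8, "No", "Single", 85, "Yes"),
    (9, "No", "Married", 75, "No"),
    (10, "No", "Single", 90, "Yes"),
]

# Assumption: the last three rows are held out as the test set.
training_set, test_set = records[:7], records[7:]


def classify(refund, marital_status, income_k):
    """Hypothetical hand-written decision rules standing in for a learned model."""
    if refund == "Yes":
        return "No"
    if marital_status == "Married":
        return "No"
    return "Yes" if income_k >= 80 else "No"


def accuracy(rows):
    """Fraction of rows whose predicted class equals the actual class."""
    correct = sum(
        classify(refund, status, income) == actual
        for _, refund, status, income, actual in rows
    )
    return correct / len(rows)


training_accuracy = accuracy(training_set)
test_accuracy = accuracy(test_set)
```

Accuracy on the test set is the figure that matters, because those records played no part in choosing the rules.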
Predictive Analytics Techniques (contd..)
• Application 2: Given old data about customers and payments, predict new applicant’s loan eligibility.

[Figure: Data on previous customers (Age, Salary, Profession, Location, Customer type; class/target: Eligible/Not Eligible) is fed to a classifier. The resulting decision rules, such as Salary > 5 L, Prof. = Exec, Loc = Mumbai and Age > 40, are applied to the new applicant’s data to decide eligibility.]
Predictive Analytics Techniques (contd..)
• Association Rule Mining: Finding frequent patterns:
  • Analysing item sets in customers' baskets or transactions and identifying frequently occurring item sets, which can be the basis for recommendations.
  • Known as Market Basket Analysis in Retail.
• Applications:
  • Store layout: Place the frequently occurring items either in close proximity or as far apart as possible; give combo offers.
  • Warehousing and inventory management.
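
Finding frequently occurring item sets can be sketched by counting item pairs across baskets. Support and confidence are the usual measures in market basket analysis, although the slide does not name them; the baskets, the threshold and the function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs that appear together in at least
    min_support (a fraction) of all baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


pairs = frequent_pairs(baskets, min_support=0.4)
butter_to_bread = confidence(baskets, "butter", "bread")
```

A pair with high support and high confidence, such as butter and bread in this sample, is the kind of pattern that could drive a combo offer or a shelf-placement decision.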
THANK YOU!
