
Course Handout MBBA 6004 - 2018-19 - Sem IV

This document provides details about the MBBA 6004 Big Data Analytics course, including the course code, type, credits, prerequisites, and learning objectives. The course aims to impart knowledge of big data processing using Hadoop and Spark. It will provide an understanding of the Hadoop framework, HDFS, YARN, MapReduce, Pig, and Hive. Students will learn to process and analyze large datasets stored in HDFS. The course also maps the course outcomes to program outcomes to ensure students achieve the desired knowledge levels, and lists the recommended textbooks, supplementary readings, and online resources. Assessment includes assignments, internal tests, and an end-term examination to evaluate the course learning goals and objectives.

Course Handout MBBA 6004 Big Data Analytics

Galgotias University
Gautam Buddha Nagar, Greater Noida, Uttar Pradesh, India

Title Page

Course Title: Big Data Analytics

Course Code: MBBA 6004

Course Type: Elective

LTPC: 3003

Semester: IV

Academic Year: 2018-19

Program: MBA

No. of Batch: One

Classroom: A-015

Name of Faculty: Dr. Adarsh Garg

Designation: Professor

School: School of Business

Faculty Cabin: A-013C

Open Hours
Wednesday: 15:00-17:00 (2:00 hrs)

Web link: http://galgotiasuniversity.edu.in/business-administration-faculty-adarsh-garg.asp

Official Time Table

Day        8:30-9:20    9:20-10:10   10:10-11:00  11:00-11:50  11:50-12:40  Lunch (12:40-1:30)  1:30-2:20    2:20-3:10    3:10-4:00    04:00-04:50  4:00-4:50
Sunday     I1           E1           F1           TU1          1E           ES1                 E2           F2           TU4          2E           TU16
Monday     TU13         F1           E1           TU2          TU3          ES2                 F2           E2           TU5          TU6          I2
Tuesday    I1           A1           B1           C1           D11          ES3                 A2           B2           C2           D21          I2
Wednesday  TU14         D12          A1           B1           C1           ES4                 D22          A2           B2           C2           TU17
Thursday   J1           A1           B1           C1           D13          ES5                 A2           B2           C2           D23          J2

Course Content

MBBA 6004 Big Data Analytics    L T P C
Version 1.01                    3 0 0 3

Prerequisite: Basic concepts of Business Analytics; Statistics

Co-requisites: None

Course Background and Learning Objectives:

Use of information has become central for the survival and development of the human race. Today we
experience a true deluge of data which record and shape our lives, ranging from large global issues
such as climate change to the smallest local problem such as controlling a thermostat. The critical
screening and processing of Big Data has become a world-wide effort, requiring academic attention
from diverse disciplines. The challenge is to develop theoretical and innovative scientific and
technological solutions to cater to the needs of the industry, the society and the environment. Given
the wide gap between demand and supply of scientists, technologists and key experts in the domain of
Data Analytics today, the course has been initiated to prepare the interested young minds for the
academic analysis of such Big Data and its applications in the society today, from business concerns
to social practices and cultural change.

The course has been designed to impart in-depth knowledge of Big Data processing using Hadoop
and Spark. It provides an in-depth understanding of the Hadoop framework, including
HDFS, YARN, and MapReduce. Students will learn to use Pig and Hive to process and analyze large
datasets stored in HDFS. The course also provides an overview of the field of big data analytics so that
you can make informed business decisions in a distributed environment.

For you to get the most out of this subject, and for it to be a rewarding and fun learning experience for
all, I expect you to:

i. Attend the class sessions and come prepared – that is, having read the assigned readings.

ii. Have a positive attitude and be willing to engage in non-traditional learning formats.

iii. Participate openly and thoughtfully in classroom discussions.



iv. Challenge the ideas presented in your readings as well as those of the instructor and other
students – demonstrate your ability to think critically and to offer constructive alternatives.

v. Fulfil the requirements of this subject to the best of your ability. The more time and effort you put
into this subject, the more you’ll get out of it.

Step 0: At the end of the course, the student will be able to:

COs   Knowledge Level   Definition of CO
CO1   K3                Explain concrete understanding of Business problems/Issues/Opportunities
                        related to Big Data and to logically model and analyze diverse
                        decision-making scenarios.
CO2   K4                Interpret the knowledge of Analytics and apply analytics in Business.
CO3   K4                Compare the advanced analytical tools/decision-making tools/operations
                        research techniques to analyze business problems.
CO4   K4                Prioritise the use of advanced analytical tools/decision-making tools/
                        operations research techniques for a specific business scenario.
CO5   K4                Infer the importance of using NoSQL for large datasets.

Step 1: Preparation of course outcomes (COs) assessment table:

Assessment tools: Internal test (CAT1, CAT2, Others), End Term Exam, Assignment

COs    Knowledge level  CAT1  CAT2  Others  End Term Exam  Assignment  Target
1      K3               50    -     20      15             10          -
2      K4               50    -     20      15             10          -
3      K4               -     50    20      20             20          -
4      K4               -     50    20      20             30          -
5      K4               -     -     20      30             30          -
Total                   100   100   100     100            100         -

Efforts are to be made to achieve the following levels of knowledge, i.e. K3 and K4, through
this course. (K1-Remembering, K2-Understanding, K3-Applying, K4-Analyzing, K5-Evaluating,
K6-Creating)

Step 2: Course outcomes (COs) and Program Outcome Mapping

CO/PO Mapping
(S/M/L indicates strength of correlation) S-Strong, M-Medium, L-Low
COs    Programme Outcomes (POs)
       PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8

CO1 M

CO2 M

CO3 M

CO4 S

CO5 S

Step 3: Books/Journals and Websites

RECOMMENDED TEXT BOOK                                                   AUTHOR & PUBLICATION

Big Data, Big Analytics: Emerging Business Intelligence and Analytics   Michael Minelli, Michele Chambers, and Ambiga Dhiraj
Big Data and Analytics                                                  Seema Acharya, Subhashini Chellappan, Wiley

SUPPLEMENTARY READINGS

Big Data and Business Analytics                                         Jay Liebowitz, CRC Press
Data Analytics                                                          Anil Maheshwari, McGH
Hadoop: The Definitive Guide                                            Tom White, O'Reilly, 2012
Understanding Big Data                                                  Chris Eaton, Dirk deRoos et al., McGraw Hill, 2012
Learning Spark: Lightning-Fast Big Data Analysis                        Holden Karau
Big Data with R and Hadoop                                              Vignesh Prajapati, Packt

Online Resources
www.solver.com/xlminer-data-mining

https://rapidminer.com/
https://sourceforge.net/projects/weka

Step 4: Quizzes/Assignment/Project:

The components of evaluation are crucial for assessing the learning goals and objectives of
the course. The following components have been designed to assess them. CAT-I, CAT-II and the
Semester End Examination will assess learning goals 1-5 as follows:

Assessment Components and Marks 1 2 3 4 5

Assignments (20 Marks) √ √ √ √ √

CAT I, CAT II (30 Marks) √ √ √ √

End Term Examination (50 Marks) √ √ √ √ √

Assignments

This is an individual assessment component of evaluation consisting of conceptual, theoretical,
numerical and applied skills. Through assignments students are expected to apply business research
concepts and to develop business models in a decision-making setting to achieve the objectives of the
firm. Through these components students will also develop their creative and innovative thinking by
taking critical decisions into consideration.

Continuous Assessment Test (CAT I, CAT II)

This component of evaluation assesses the performance of students after the completion of 15/30
lectures. It is used to monitor students' performance continuously and to make them aware of their
mistakes and misunderstandings of the concepts.

End Term Examination (ETE)

End Term Examination is to assess students individually by keeping the overall learning goals and
objectives in mind. The questions are mostly contextual, numerical, analytical and situational.

Step 5: Course Outline (lecture-wise):

Lecture  Topics to be discussed                                      Readings

Module I: INTRODUCTION TO BIG DATA

This module introduces the concepts of Big Data and Big Data Analytics, with emphasis on applications
of Big Data in industry.

1        Overview of Big Data and Importance                         Text Book Ch 1
2-3      Distributed File System                                     Text Book Ch 2, Handouts
4        Drivers of Big Data - Four Vs                               Text Book Ch 1
5        Big Data Analytics                                          Text Book Ch 1
6        Big Data Applications - Industry Examples                   Text Book Ch 2

Module II: INTRODUCTION TO HADOOP AND HADOOP ARCHITECTURE

This module explains the concept of virtualization, the Hadoop ecosystem and MapReduce.

7        Concept of Virtualization                                   Text Book Ch 2 (IBM Software: Retail)
8-10     Big Data – Apache Hadoop & Hadoop EcoSystem                 IBM Software Analytics: Media & Entertainment
11-12    Overview of HDFS, Comparison with traditional Databases     Ref Book Ch 1
13-15    Understanding MapReduce - Map and Reduce                    Text Book Ch 4, Ref Book Ch 1
16-17    Installing Hadoop, making Single node/multi-node Clusters   Ref Book Ch 1 (Vignesh), Handouts

Module III: HDFS, HIVE AND HIVEQL, HBASE HDFS

This module provides a basic understanding of Hive, HiveQL and HBase.

18-19    Understanding Hive                                          Ref Book Ch 7 (Vignesh)
20-21    Understanding HiveQL                                        Ref Book Ch 7 (Vignesh)
22-23    Understanding HBase                                         Ref Book Ch 7 (Vignesh)

Module IV: SPARK

Fast data analysis is essential when working with enormous volumes of data. This module explains data
analysis with Spark.

24       Understanding Data analytics project Life Cycle
25-28    Introduction to Data Analysis with Spark                    Ref Book Karau Ch 1-4
29-32    Downloading Spark and Getting Started                       Handouts

Module V : NoSQL

This module explains the concept of NoSQL and its usage in industry.

33       Understanding NoSQL - advantages of NoSQL                   Handouts, Ref Book Joe Celko Ch 1-2, Text Book Ch 4
34-35    SQL vs NoSQL                                                Handouts, Ref Book Joe Celko Ch 1, Text Book Ch 4
36-37    Use of NoSQL in Industry                                    Handouts, Ref Book Joe Celko Ch 1, Text Book Ch 4
38-40    Revision and Project Presentations

Semester End term Examination
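
As a small illustration of the Map and Reduce steps named in lectures 13-15, the following is a minimal sketch in plain Python. It is not Hadoop code and is not taken from the course material: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are assumptions made for this example only. It mimics, in a single process, how a word count is split into a map step, a grouping (shuffle) step and a reduce step.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values collected for one key into a count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data big analytics", "data analytics with hadoop"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions would run in parallel on different nodes over data held in HDFS; this sketch only shows the logical flow of keys and values.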

2. Course Handout

Course Description:

This course provides a comprehensive introduction to the concepts, techniques and applications of
business intelligence (BI). The class will equip students with a managerial overview of business
intelligence, a basic understanding of statistics and economics foundations in BI, a general exposure
to real-world BI applications and trends, and hands-on practice with BI software.

Text Books:

RECOMMENDED TEXT BOOK                                                   AUTHOR & PUBLICATION

Big Data, Big Analytics: Emerging Business Intelligence and Analytics   Michael Minelli, Michele Chambers, and Ambiga Dhiraj
Big Data and Analytics                                                  Seema Acharya, Subhashini Chellappan, Wiley

SUPPLEMENTARY READINGS

Big Data and Business Analytics                                         Jay Liebowitz, CRC Press
Data Analytics                                                          Anil Maheshwari, McGH
Hadoop: The Definitive Guide                                            Tom White, O'Reilly, 2012
Understanding Big Data                                                  Chris Eaton, Dirk deRoos et al., McGraw Hill, 2012
Learning Spark: Lightning-Fast Big Data Analysis                        Holden Karau
Big Data with R and Hadoop                                              Vignesh Prajapati, Packt

Online Resources
www.solver.com/xlminer-data-mining

https://rapidminer.com/
https://sourceforge.net/projects/weka

Evaluation Scheme:

EC No.  Evaluation Component  Duration  Marks (50)  Date & Time                       Nature of Component
1.      CAT-1                 90 mins   30          Feb 3, 2019                       Closed Book
2.      CAT-2                 90 mins   30          Feb 22, 2019                      Closed Book
3.      Quiz-1                15 mins   10          Jan 15, 2019                      Closed Book
4.      Quiz-2                15 mins   10          Jan 30, 2019                      Closed Book
5.      Quiz-3                15 mins   10          Feb 13, 2019                      Closed Book
6.      Assignment(s)         -         20          Any time throughout the semester  Open Book

Teaching Pedagogy:

The pedagogy will be a combination of classroom discussions and lab sessions (held during students'
free hours) on Big Data (concepts and problem solving).

Chamber Consultation Hour: Chamber: C-A-013C

Wednesday: 15:00-17:00 (2:00 hrs)

Notices: All notices concerning this course will be communicated through email / WhatsApp.

Instructor-in-charge: Dr. Adarsh Garg


Sl. No.  Date From     Date To       No. of Sessions  CO Mapping  K-Level Mapping

1        Jan 3, 2019   Jan 10, 2019  08               CO1         K3
         Topics / Sub-Topics: Overview of Big Data and Importance; Distributed File System;
         Drivers of Big Data - Four Vs; Big Data Analytics; Big Data Applications - Industry Examples
         References (Text Book, Journal... (Page no. __ to __)): Text Book Ch 1, Ch 2

2        Jan 17, 2019  Jan 24, 2019  12               CO2         K4
         Topics / Sub-Topics: Concept of Virtualization; Big Data – Apache Hadoop & Hadoop EcoSystem;
         Overview of HDFS, Comparison with traditional Databases; Understanding MapReduce - Map and
         Reduce; Installing Hadoop, making Single node/multi-node Clusters
         References: Text Book Ch 2, Ch 4; Ref Book Ch 1 (Vignesh), Handouts

3        Jan 31, 2019  Feb 7, 2019   10               CO3         K4
         Topics / Sub-Topics: Understanding Hive; Understanding HiveQL; Understanding HBase
         References: Ref Book Ch 7 (Vignesh)

4        Feb 14, 2019  Feb 21, 2019  10               CO4         K4
         Topics / Sub-Topics: Understanding Data analytics project Life Cycle; Introduction to Data
         Analysis with Spark; Downloading Spark and Getting Started
         References: Ref Book Karau Ch 1-4

5        Feb 28, 2019  Feb 28, 2019  04               CO5         K4
         Topics / Sub-Topics: Understanding NoSQL - advantages of NoSQL; SQL vs NoSQL; Use of NoSQL in
         Industry; Revision and Project Presentations; Semester End term Examination
         References: Handouts, Ref Book Joe Celko Ch 1, Text Book Ch 4

4. Lecture Material

..\..\..\BIG DATA\bigdata-challenges-opportunities.pdf

..\..\Big Data\(Wiley CIO) Michael Minelli, Michele Chambers, Ambiga Dhiraj-Big Data, Big
Analytics_ Emerging Business Intelligence and Analytic Trends for Today's Businesses-Wiley
(2013).pdf

Suggested Course Materials


Software

Mahout: http://mahout.apache.org/
Hive: https://cwiki.apache.org/confluence/display/Hive/Home
Pig Latin: http://pig.apache.org/docs/r0.7.0/tutorial.html
Hadoop: http://hadoop.apache.org/
Cassandra: http://cassandra.apache.org/

Suggested Projects
Twitter Data Management
Anomaly Detection
Stream Mining for Tweets

Text Mining
Sentiment Analysis

Understanding of Knowledge Levels
