Mtech-Syllabus-Data Science - Sem2

Syllabus

Uploaded by

reshmaitagi

NITTE MEENAKSHI INSTITUTE OF TECHNOLOGY
(A Unit of Nitte Education Trust (R), Mangalore)
An Autonomous Institution

Department of Information Science and Engineering

Curriculum Handbook for M.Tech – Data Science
SEMESTER II
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Core


Course Title: Scalable Computing Course Code:19DS21
L-T-P:3-0-2 Credits: 04
Total Contact Hours:39 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Pre-requisites:
• Software systems, programming, data structures and algorithms
• Good programming skills (preferably in Java)
• Operating Systems
• Distributed Computing Systems
• Introduction to Cloud Computing
• Design and Analysis of Algorithms

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Describe the basic concepts and technologies of distributed systems.   L2
CO2   Illustrate the requirements and challenges when designing, building and managing distributed systems.   L2
CO3   Analyze different scalable distributed system designs.   L4
CO4   Analyze use cases for managing distributed file systems.   L4
CO5   Implement scalable distributed databases and analyze them.   L3

Teaching Methodology:
• Blackboard teaching and PPT
• Assignment

Assessment Methods:
• Open book test for 10 marks.
• Assignment evaluation for 10 marks.
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.
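
As a rough illustration of how these components could add up, here is a minimal sketch. It assumes that the 50 CIE marks are the sum of the open book test, the assignment and the averaged internals, and that the 100-mark final paper is simply halved to give the 50 SEE marks. The function names and the sample scores are hypothetical and are not part of the handbook.

```python
def cie_marks(open_book, assignment, internals):
    """Continuous Internal Evaluation out of 50 (assumed composition)."""
    if len(internals) != 3:
        raise ValueError("expected three internal test scores")
    # Keep the two highest of the three internal scores and average them.
    best_two = sorted(internals, reverse=True)[:2]
    return open_book + assignment + sum(best_two) / 2


def see_marks(final_exam):
    """Semester End Examination: 100-mark paper scaled to 50 (assumed halving)."""
    return final_exam / 2


# Made-up scores, for illustration only.
cie = cie_marks(open_book=8, assignment=9, internals=[22, 27, 25])
see = see_marks(final_exam=70)
print(cie, see, cie + see)
```

Under these assumptions the lowest internal score (22 in the sample) does not affect the result, which is the practical effect of the "best two" rule.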

Course Outcome to Programme Outcome Mapping

PO1 PO2 PO3 PO4 PO5 PO6


CO1 1 2 1 1
CO2 2 2 1 2 2
CO3 2 2 2 1
CO4 3 3 1 2 2
CO5 3 2 3 1 2 2
19DS21 2 2 2 1 2 2

COURSE CONTENT
Unit – I 8 Hrs
Distributed System Models and Enabling Technologies: Scalable Computing over the Internet, The Age of Internet
Computing, Scalable Computing Trends and New Paradigms, The Internet of Things and Cyber-Physical Systems,
Technologies for Network-Based Systems, Multicore CPUs and Multithreading Technologies, GPU Computing to
Exascale and Beyond, Memory, Storage, and Wide-Area Networking, Virtual Machines and Virtualization Middleware,
Data Centre Virtualization for Cloud Computing, System Models for Distributed and Cloud Computing, Clusters of
Cooperative Computers, Grid Computing Infrastructures, Peer-to-Peer Network Families, Cloud Computing over the
Internet, Software Environments for Distributed Systems and Clouds, Service-Oriented Architecture (SOA), Trends
toward Distributed Operating Systems, Parallel and Distributed Programming Models, Performance, Security, and
Energy Efficiency, Performance Metrics and Scalability Analysis, Fault Tolerance and System Availability, Network
Threats and Data Integrity, Energy Efficiency in Distributed Computing.
Unit – II 8 Hrs
Virtual Machines and Virtualization of Clusters and Data Centres: Implementation Levels of Virtualization,
Levels of Virtualization Implementation, Design Requirements and Providers, Virtualization Support at the OS Level,
Middleware Support for Virtualization, Virtualization Structures/Tools and Mechanisms, Hypervisor and Xen
Architecture, Binary Translation with Full Virtualization, Para-Virtualization with Compiler Support, Virtualization
of CPU, Memory, and I/O Devices, Hardware Support for Virtualization, CPU Virtualization, Memory Virtualization,
I/O Virtualization, Virtualization in Multi-Core Processors, Virtual Clusters and Resource Management, Physical
versus Virtual Clusters, Live VM Migration Steps and Performance Effects, Migration of Memory, Files, and Network
Resources, Dynamic Deployment of Virtual Clusters.

Containers and Serverless: Kernel namespaces and cgroups; Use cases: Docker, Kubernetes.
Unit – III 8 Hrs
Cloud Platform Architecture over Virtualized Data Centers: Cloud Computing and Service Models, Public, Private,
and Hybrid Clouds, Cloud Ecosystem and Enabling Technologies, Infrastructure-as-a-Service (IaaS),
Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). Public Cloud Platforms: GAE, AWS, and Azure.
Data Science in the cloud: AWS Machine Learning, Azure Machine Learning, IBM Bluemix, Sense.io, Domino Data Lab,
DataJoy, PythonAnywhere.
Unit – IV 7 Hrs
MapReduce and the New Software Stack: Distributed File Systems, MapReduce, Algorithms Using MapReduce,
Extensions to MapReduce, The Communication Cost Model, Complexity Theory for MapReduce.
Unit – V 8 Hrs
Analysing Big Data: The Challenges of Data Science, Introducing Apache Spark. Introduction to Data Analysis with
Scala and Spark: Scala for Data Scientists, The Spark Programming Model, Record Linkage, Getting Started: The Spark
Shell and SparkContext, Bringing Data from the Cluster to the Client, Shipping Code from the Client to the
Cluster, Structuring Data with Tuples and Case Classes, Aggregations, Creating Histograms, Summary Statistics for
Continuous Variables, Creating Reusable Code for Computing Summary Statistics, Simple Variable Selection and
Scoring.

Text Books:

1. Kai Hwang, G. C. Fox, J. J. Dongarra, “Distributed & Cloud Computing”, Morgan Kaufmann Publishers.
2. Jure Leskovec, Anand Rajaraman, Jeff Ullman, “Mining of Massive Datasets”, 2nd Edition, Cambridge
University Press. http://www.mmds.org/
3. Sandy Ryza, Uri Laserson, Josh Wills, Sean Owen, “Advanced Analytics with Spark”, 2nd Edition,
O'Reilly Media, ISBN: 9781491972946.
Semester: II Year: 2019-2020


Department: Information Science and Engineering Course Type: Core
Course Title: Neural Network & Deep Learning Course Code:19DS22
L-T-P:3-0-2 Credits: 04
Total Contact Hours:39 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Pre-requisites:
• Machine Learning-I, Data Mining

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Understand the basic concepts of artificial neural networks.   L2
CO2   Model neurons and neural networks, and analyze ANN learning and its applications.   L4
CO3   Develop different single-layer and multi-layer perceptron learning algorithms.   L3
CO4   Design another class of layered networks using deep learning principles.   L3

Teaching Methodology:
• Blackboard teaching and PPT
• Executable codes / live demonstration
• Programming assignment

Assessment Methods:
• Online certification from Coursera/edX, etc. for 10 marks
• Programming assignments evaluated using rubrics for 10 marks
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping

PO1 PO2 PO3 PO4 PO5 PO6


CO1 1 2 1 1
CO2 2 2 1 2 2
CO3 2 2 2 1
CO4 3 3 1 2 2
CO5 3 2 3 1 2 2
19DS21 2 2 2 1 2 2
COURSE CONTENT

Unit – I 8 Hrs
Introduction to Neural Networks: Neural Network, Human Brain, Models of Neuron, Neural networks viewed as
directed graphs, Biological Neural Network, Artificial neuron, Artificial Neural Network architecture, ANN learning,
analysis and applications, Historical notes.

Learning Processes: Introduction, Error correction learning, Memory-based learning, Hebbian learning, Competitive
learning, Boltzmann learning, credit assignment problem, learning with and without teacher, learning tasks, Memory
and Adaptation.
Unit – II 8 Hrs
Single Layer Perceptron: Introduction, Pattern Recognition, Linear classifier, Simple perceptron, Perceptron learning
algorithm, Modified perceptron learning algorithm, Adaptive linear combiner, Continuous perceptron, Learning in
continuous perceptron, Limitations of the perceptron.

Unit – III 8 Hrs


Multi-Layer Perceptron Networks: Introduction, MLP with 2 hidden layers, Simple layer of an MLP, Delta learning
rule of the output layer, Multilayer feed forward neural network with continuous perceptrons, Generalized delta learning
rule, Back propagation algorithm.

Unit – IV 7 Hrs
Introduction to Deep learning: Neuro architectures as necessary building blocks for the DL techniques, Deep
Learning & Neocognitron, Deep Convolutional Neural Networks, Recurrent Neural Networks (RNN)
Unit – V 8 Hrs
Feature extraction, Deep Belief Networks, Restricted Boltzmann Machines, Autoencoders, Training of Deep Neural
Networks, Applications and examples (Google, image/speech recognition), Deep Learning Tools: TensorFlow, Caffe,
Theano, Torch.

Text Books:
1. Neural Networks: A Comprehensive Foundation, Simon Haykin, 2nd Edition, 1999, Pearson
Prentice Hall, ISBN-13: 978-0-13-147139-9.
2. Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press, 2016.
Reference Books:

1. Introduction to Artificial Neural Systems, Jacek M. Zurada, 1992, West Publishing Company,
ISBN: 9780534954604
2. Learning & Soft Computing, Vojislav Kecman, 1st Edition, 2004, Pearson Education,
ISBN: 0-262-11255-8
3. Neural Network Design, M. T. Hagan, H. B. Demuth, M. Beale, 2002, Thomson Learning, ISBN-10:
0-9717321-1-6 / ISBN-13: 978-0-9717321-1-7
Online Materials

1. Deep learning courses by Coursera: https://www.coursera.org/courses?query=deep%20learning
2. https://www.classcentral.com/course/coursera-neural-networks-and-deep-learning-9058
3. https://www.classcentral.com/course/coursera-introduction-to-deep-learning-9606
4. https://www.deeplearningbook.org/
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Core


Course Title: Machine Learning-II Course Code:19DS23
L-T-P:3-0-2 Credits: 04
Total Contact Hours:39 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Prerequisite:
• Machine Learning-I

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Understand key concepts, tools and approaches for pattern recognition on complex data sets.   L2
CO2   Understand kernel methods for handling high-dimensional and non-linear patterns.   L2
CO3   Apply state-of-the-art algorithms such as Support Vector Machines and Bayesian networks.   L3
CO4   Solve real-world machine learning tasks from data to inference.   L2
CO5   Demonstrate the theoretical concepts and the motivations behind different learning frameworks.   L3

Teaching Methodology:
• Blackboard teaching / PowerPoint presentations
• Executable codes / live demonstration
• Programming assignment

Assessment Methods:
• Online certification from NPTEL/Coursera for 10 marks
• Programming assignments evaluated using rubrics for 10 marks
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping:


PO PO1 PO2 PO3 PO4 PO5 PO6

CO1 2 2 2
CO2 2 3 2
CO3 2 3 3 2 2
CO4 3 2 3 3 2
CO5 3 1 2 2 1 2
19DS23 2 2 3 2 1 2
COURSE CONTENT

Unit – I 8 Hrs
Instance based learning and learning set of rules: K- Nearest Neighbor Learning – Locally Weighted Regression –
Radial Basis Functions – Case Based Reasoning – Sequential Covering Algorithms – Learning Rule Sets – Learning
First Order Rules – Learning Sets of First Order Rules – Induction as Inverted Deduction – Inverting Resolution
(TextBook1)
Unit – II 8 Hrs
Support Vector machine: Maximum margin hyperplanes: Rationale for Maximum Margin, Linear SVM: Separable
Case: Linear Decision Boundary, Margin of a Linear Classifier, Learning a Linear SVM model, Linear SVM: Non-
separable Case, Nonlinear SVM: Attribute Transformation, Learning a Nonlinear SVM, Kernel Trick, Characteristics
of SVM. (Chapter 5.5 from TextBook-2)
Unit – III 8 Hrs
Transfer Learning: Introduction, Transfer in inductive learning: Inductive transfer, Bayesian transfer, Hierarchical
transfer, Transfer with Missing Data or Class Labels.
Transfer in reinforcement learning: Starting-Point Methods, Imitation Methods, Hierarchical Methods, Alteration
Methods, New RL Algorithms
Avoiding negative transfer: Rejecting Bad Information, Choosing a Source Task, Modelling Task Similarity;
Automatically mapping tasks: Equalizing Task Representations, Trying Multiple Mappings, Mapping by Analogy; The
future of transfer learning
Unit – IV 8 Hrs
Analytical learning: Introduction, Learning with perfect domain theories, Remarks on explanation-based learning,
Explanation-based learning of search control knowledge.
Combining Inductive and Analytical learning: Motivation, Inductive-analytical approaches to learning, Using prior
knowledge to initialize the hypothesis, Using prior knowledge to alter the search objective, Using prior knowledge to
augment search operators. (TextBook1)
Unit – V 7 Hrs
Reinforcement Learning: Introduction, The Learning Task, Q Learning, Nondeterministic Rewards and Actions,
Temporal Difference learning, Generalization from Examples, Relationship to Dynamic programming. (TextBook1)

Text Books:
1. Tom M. Mitchell, “Machine Learning”, McGraw-Hill Education (INDIAN EDITION), 2013.
2. Amanda Casari, Alice Zheng, “Feature Engineering for Machine Learning”, O’Reilly, 2018.

Reference Books:
1. Aurélien Géron, “Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and
Techniques to Build Intelligent Systems”, O’Reilly.

Online Materials
1. Machine Learning by Stanford University-Coursera
2. Machine Learning with TensorFlow on Google Cloud Platform Specialization-Coursera
3. Become a Machine Learning Engineer – Udacity.
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Elective


Course Title: Data Security and Privacy Course Code:19DSE243
L-T-P: 4-0-0 Credits: 04
Total Contact Hours:52 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Pre-requisites:
• Knowledge of databases and how they are managed.
• Fundamentals of algorithm design techniques.

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Analyze the vulnerabilities in any computing system and hence be able to design a security solution.   L4
CO2   Identify the security issues in a data network and resolve them.   L2
CO3   Evaluate security mechanisms using rigorous approaches.   L4
CO4   Understand privacy and anonymization.   L2

Teaching Methodology:
• Blackboard teaching / PowerPoint presentations

Assessment Methods:
• Seminar on data security for 10 marks
• Assignment based on data security and access control problems for 10 marks
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping

PO1 PO2 PO3 PO4 PO5 PO6


CO1 2 2 1
CO2 2 2 1 1
CO3 3 2 3 2
CO4 2 2 2 2
19DSE243 2 2 2 1 1 1
COURSE CONTENT

UNIT – I : DATA SECURITY FUNDAMENTALS 10 hrs


Computer Security Concepts, Intrusion Detection, Firewalls: Characteristics, Types. Classical Encryption Techniques:
Symmetric Cipher Model, Cryptography, Cryptanalysis and Brute-Force Attack, Substitution Techniques, Caesar
Cipher, Monoalphabetic Cipher, Polyalphabetic Cipher, One-Time Pad. Block Ciphers and the Data Encryption
Standard: Traditional block cipher structure, stream ciphers and block ciphers, Motivation for the Feistel cipher
structure, the Feistel cipher.
UNIT – II: Public-Key Cryptography 10 hrs
Principles of Public-Key Cryptosystems, Public-Key Cryptosystems, Applications for Public-Key Cryptosystems,
Requirements for Public-Key Cryptosystems, Public-Key Cryptanalysis. The RSA Algorithm: Description of the
Algorithm, Computational Aspects, the Security of RSA. Other Public-Key Cryptosystems: Diffie-Hellman Key Exchange,
The Algorithm, Key exchange protocols, Man-in-the-Middle Attack, Simple secret key distribution, Secret key distribution
with confidentiality and authentication, A hybrid scheme. Public-key certificates, X.509 certificates. Public key
infrastructure, PKIX Management Functions, PKIX Management Protocols.

UNIT – III : Authentication and Authorization 10 hrs

Authentication vs. Authorization, Authentication Methods – Password authentication, Public Key Cryptography, Biometric
authentication, Out-of-band. Authentication Protocols – SSL, Password Authentication Protocol (PAP), Kerberos, Email
authentication – PGP, Database authentication, Message authentication; secure hash functions and Authorization
Approaches to HMAC; public-key cryptography principles; public-key cryptography algorithms, digital signatures, key
management. Kerberos, X.509 directory authentication service. Authorization Definition, Multilayer authorization.

UNIT – IV: DATA PRIVACY AND ANONYMIZATION 12 hrs


Understanding Privacy: Social Aspects of Privacy, Legal Aspects of Privacy and Privacy Regulations, Effect of
Database and Data Mining technologies on privacy, Challenges raised by emerging technologies such as RFID,
biometrics, etc., Privacy Models.
Introduction to Anonymization, Anonymization models: k-anonymity, l-diversity, t-closeness, differential privacy,
Database as a service.
UNIT – V : DATA PRIVACY FOR DATA SCIENCE 10 hrs
Using technology for preserving privacy. Statistical Database Security, Inference Control, Secure Multi-party
Computation and Cryptography, Privacy-preserving Data Mining, Hippocratic Databases.
Emerging Applications: Social Network Privacy, Location Privacy, Query Log Privacy, Biomedical Privacy.

Text books:

1. Cryptography and Network Security: Principles and Practice, William Stallings, 6th Edition,
Pearson Education.
2. The Algorithmic Foundations of Differential Privacy, Cynthia Dwork and Aaron Roth. DOI:
10.1561/0400000042.
Reference books:

1. https://s3.amazonaws.com/assets.datacamp.com/production/course_6412/slides/chapter1.pdf
2. Privacy-Preserving Data Mining: Models and Algorithms, Charu C. Aggarwal, Philip S. Yu, Springer.
3. Principles of Information Security, Information Security Professional, Michael E. Whitman and
Herbert J. Mattord, 4th Edition, Thomson.
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Elective


Course Title: Big Data Analytics Course Code:19DSE251
L-T-P: 4-0-0 Credits: 04
Total Contact Hours:52 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Prerequisite:
• Database Management Systems

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Describe Big Data and its importance with its applications.   L2
CO2   Differentiate various big data technologies like Hadoop MapReduce, Pig, Hive, HBase and NoSQL.   L4
CO3   Apply tools and techniques to analyze Big Data.   L3
CO4   Design a solution for a given problem using suitable Big Data techniques.   L4

Teaching Methodology:
• Blackboard teaching / PowerPoint presentations
• Executable codes / live demonstration
• Programming assignment
Assessment Methods:
• Online certification for 10 marks
• Programming assignments evaluated using rubrics for 10 marks
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping

PO1 PO2 PO3 PO4 PO5 PO6


CO1 3 3 2
CO2 3 2 3 1
CO3 3 2 3 1
CO4 3 2 3 1
19DSE251 3 2 3 1
COURSE CONTENT

Unit – I 10 Hrs
INTRODUCTION TO BIG DATA: Big Data and its Importance – Four V’s of Big Data – Drivers for Big Data –
Introduction to Big Data Analytics – Big Data Analytics applications
BIG DATA TECHNOLOGIES: Hadoop’s Parallel World – Data discovery – Open source technology for Big Data
Analytics – Cloud and Big Data – Predictive Analytics – Mobile Business Intelligence and Big Data – Crowd Sourcing
Analytics – Inter- and Trans-Firewall Analytics – Information Management.
Unit – II 10 Hrs
PROCESSING BIG DATA: Integrating disparate data stores - Mapping data to the programming framework -
Connecting and extracting data from storage - Transforming data for processing - Subdividing data in preparation for
Hadoop Map Reduce.
Unit – III 10 Hrs
HADOOP MAPREDUCE: Employing Hadoop Map Reduce - Creating the components of Hadoop Map Reduce jobs
- Distributing data processing across server farms - Executing Hadoop Map Reduce jobs - Monitoring the progress of
job flows - The Building Blocks of Hadoop Map Reduce - Distinguishing Hadoop daemons - Investigating the Hadoop
Distributed File System - Selecting appropriate execution modes: local, pseudo-distributed, fully distributed.
Unit – IV 12 Hrs
BIG DATA TOOLS AND TECHNIQUES: Installing and Running Pig – Comparison with Databases – Pig Latin –
User-Defined Functions – Data Processing Operators – Installing and Running Hive – HiveQL – Tables – Querying
Data – User-Defined Functions – Oracle Big Data.
Unit – V 10 Hrs
ADVANCED ANALYTICS PLATFORM: Real-Time Architecture – Orchestration and Synthesis Using Analytics
Engines – Discovery using Data at Rest – Implementation of Big Data Analytics – Big Data Convergence – Analytics
Business Maturity Model.

Text Books:
1. Michael Minelli, Michele Chambers, Ambiga Dhiraj, “Big Data, Big Analytics: Emerging Business Intelligence and
Analytic Trends for Today’s Business”, 1st Edition, Wiley CIO Series, 2013.
2. Arvind Sathi, “Big Data Analytics: Disruptive Technologies for Changing the Game”, 1st Edition, IBM
Corporation, 2012.
3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with
Advanced Analytics”, 1st Edition, Wiley and SAS Business Series, 2012.
4. Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly, 2012.
Additional Reference Books:
1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN:
9788126551071, 2015.
2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
3. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.
4. Jay Liebowitz, “Big Data and Business Analytics”, CRC Press, 2013.
Online Materials
1. http://www.bigdatauniversity.com/
2. https://www.coursera.org/courses?query=big%20data%20analytics
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Elective


Course Title: Business Analytics Course Code:19DS252
L-T-P: 4-0-0 Credits: 04
Total Contact Hours:52 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Describe the significance of a global platform for data retrieval and processing among different business cultures of the world.   L2
CO2   Develop domain knowledge of various technologies and their application to facilitate managerial decision making / MIS.   L2
CO3   Enable communication for data-driven decision making.   L3
CO4   Implement cross-functional collaboration to enhance efficiency and productivity.   L3

Teaching Methodology:
• ICT-enabled classroom teaching
• Case study
• Practical / live assignment
• Interactive classroom discussions
Assessment Methods:
• Group discussion for 10 marks.
• Assignment evaluation for 10 marks.
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping

PO1 PO2 PO3 PO4 PO5 PO6


CO1 1 2 2 1
CO2 2 2 1 2 2
CO3 3 2 3 1
CO4 3 2 3 1 2 1
19DS241 3 2 2 1 2 1
COURSE CONTENT

Unit – I 12 Hrs
Introduction to Business Analytics: Why Analytics, Business Analytics: the Science of data driven decision making,
Descriptive Analysis, Predictive Analytics, Prescriptive Analytics, Big Data Analytics, Web and Social media
Analytics, Machine Learning Algorithms, Framework for data driven decision making, Analytics Capability Building,
Roadmap, Challenges, Types (Descriptive, Predictive and Prescriptive), Business Intelligence versus Business
Analytics, Transaction Processing v/s Analytic Processing, OLTP v/s OLAP, OLAP Operations, Data models for
OLTP
Unit – II 10 Hrs
Descriptive Analytics: Introduction, Data Types and Scales, Types of Data Measurement Scale, Population and
Sample.
Data Warehouse: Definition, characteristics, framework, Data lake, Business Reporting, Visual Analytics: Definition,
concepts, Different types of charts and graphs, Emergence of data visualization and visual analytics.
Unit – III 10 Hrs
Data Mining: Concepts and applications, Data mining process, Text & Web Analytics, Text analytics and text mining
overview, Text mining applications, Web mining overview, Sentiment analysis overview, Supply Chain and
Operations Analytics, Customer Analytics, Project Management, Decision Analysis, Process Analytics, Market
Intelligence.
Unit – IV 12 Hrs
Social Network Analysis: Overview of SNA, history and resources, Mathematical foundations, matrices and graph
theory, Whole versus personal networks, one-mode versus two-mode network data, Collecting network data, Informant
accuracy, Network visualizations, Cohesive subgroups, bottom-up and top-down approaches, Block models, Egocentric
SNA, design and applications
Unit – V 8 Hrs
Business Performance Management: Business performance management cycle, KPI, Dashboard Analytics in
Business Support Functions, Sales & Marketing Analytics, HR Analytics, Financial Analytics, Production and
operations analytics, Analytics in Industries: Telecom, Retail, Healthcare, Financial Services

Text Books:

1. U. Dinesh Kumar, “Business Analytics – The Science of Data Driven Decision Making”, Wiley 2017.
2. Ramesh Sharda, Dursun Delen, Efraim Turban, “Business Intelligence: A Managerial Perspective on
Analytics”, Pearson, 3e.
3. Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. A classic,
essential textbook on SNA.
Reference Books:
1. Jesper Thorlund & Gert H. N. Laursen, “Business Analytics for Managers: Taking Business Intelligence
Beyond”, Wiley
2. Sahil Raj, “Business Analytics”, Cengage
3. James R. Evans, “Business Analytics”, Pearson
4. https://www.bebr.ufl.edu/sites/default/files/ANG5420_Syllabus.pdf

List of Journals / Periodicals / Magazines / Newspapers / Web resources (Case Study):

• International Journal of Business Analytics
• International Journal of Business Analytics and Intelligence
• International Journal on Consumer and Business Analytics
• Analytics India Magazine
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Elective


Course Title: Social Network Analysis Course Code:19DSE253
L-T-P: 4-0-0 Credits: 04
Total Contact Hours:52 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Prerequisite:
• Fundamentals of Networks, Data Mining, Graph Theory
• Advanced Algorithms

Course Outcomes:
Students will be able to
COs   Course Learning Outcomes   BL
CO1   Understand the basics of social network models and analysis.   L2
CO2   Analyse social network models for community detection.   L4
CO3   Implement link prediction and event detection.   L3
CO4   Analyse social influence and contributing factors.   L4

Teaching Methodology:
• Blackboard teaching
• PowerPoint presentations

Assessment Methods:
• Rubrics for evaluation of case study: 20 marks
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping:

PO1 PO2 PO3 PO4 PO5 PO6


CO1 2 2 2
CO2 2 1
CO3 3 2 1 2
CO4 3 2 3
19DSE253 2 2 1 2
COURSE CONTENT

Unit – I 10 Hrs

Social Networks: An Introduction; Types of Networks: General Random Networks, Small World Networks,
Scale-Free Networks; Examples of Information Networks; Network Centrality Measures; Strong and Weak Ties; Homophily;
Walks: Random walk-based proximity measures, Other graph-based proximity measures; Clustering with
random-walk based measures.

Unit – II 12 Hrs

Community Detection: Algorithms for Community Detection: The Kernighan-Lin algorithm,
Agglomerative/Divisive algorithms, Spectral Algorithms, Multi-level Graph Partitioning, Markov Clustering;
Community Discovery in Directed Networks, Community Discovery in Dynamic Networks, Community Discovery in
Heterogeneous Networks, Evolution of Community.

Unit – III 12 Hrs

Link Prediction: Feature based Link Prediction, Bayesian Probabilistic Models, Probabilistic Relational Models,
Linear Algebraic Methods: Network Evolution based Probabilistic Model, Hierarchical Probabilistic Model, Relational
Bayesian Network. Relational Markov Network.

Unit – IV 10 Hrs

Event Detection: Classification of Text Streams, Event Detection and Tracking: Bag of Words, Temporal, location,
ontology based algorithms. Evolution Analysis in Text Streams, Sentiment analysis.

Unit – V 8 Hrs

Social Influence Analysis: Influence measures, Social Similarity - Measuring Influence, Influencing actions and
interactions. Influence maximization.

Text Books:
1. David Easley, Jon Kleinberg: Networks, Crowds and Markets: Reasoning about a highly connected
world, Cambridge Univ Press 2010
2. S.Wasserman, K.Faust: Social Network Analysis: Methods and Applications, Cambridge Univ Press,
1994
Semester: II Year: 2019-2020

Department: Information Science and Engineering Course Type: Core


Course Title: Natural Language & Text Mining Course Code: 19DSE254
L-T-P:4-0-0 Credits: 04
Total Contact Hours:52 hrs Duration of SEE: 3 hrs
SEE Marks: 50 CIE Marks: 50

Prerequisites:
• Fundamentals of Language Processing
Course outcomes:
Students will be able to:
COs   Course Learning Outcomes   BL
CO1   Describe the basics of Natural Language Processing.   L2
CO2   Analyze syntactic and semantic parsing techniques.   L2
CO3   Implement a rule-based system to tackle the morphology/syntax of a language.   L3
CO4   Describe the various issues of Natural Language Processing.   L2

Teaching Methodology:
• Blackboard teaching
• PowerPoint presentations
Assessment Methods:
• Three internal tests of 30 marks each will be conducted; the average of the best two will be taken.
• Rubrics for evaluation of case study: 20 marks
• A final examination of 100 marks will be conducted and evaluated for 50 marks.

Course Outcome to Programme Outcome Mapping:

PO 1 PO 2 PO 3 PO 4 PO 5 PO 6
CO1 3 2
CO2 3 2 1
CO3 3 2 2 1 2
CO4 3 3 2 2
19DSE254 3 2 2 2 2 2
COURSE CONTENT
Unit – I 11 Hrs

Classical Approaches to Natural Language Processing: Context, The Classical Toolkit: Text Preprocessing, Lexical
Analysis, Syntactic Parsing, Semantic Analysis, Natural Language Generation.
Text Preprocessing: Introduction, Challenges of Text Preprocessing, Character-Set Dependence, Language
Dependence, Corpus Dependence, Application Dependence, Tokenization, Tokenization in Space-Delimited
Languages, Tokenization in Unsegmented Languages, Sentence Segmentation, Sentence Boundary Punctuation,
The Importance of Context, Traditional Rule-Based Approaches.
Lexical Analysis: Introduction, Finite State Morphonology, Closing Remarks on Finite State Morphonology,
Finite State Morphology, Disjunctive Affixes, Inflectional Classes, and Exceptionality, Further Remarks on
Finite State Lexical Analysis, “Difficult” Morphology and Lexical Analysis, Isomorphism Problems, Contiguity
Problems, Paradigm-Based Lexical Analysis, Paradigmatic Relations and Generalization.

Unit – II 12 Hrs

Syntactic Parsing: Introduction, Background, Context-Free Grammars, Example Grammar, Syntax Trees, Other
Grammar Formalisms, Basic Concepts in Parsing, Parsing as Deduction, Deduction Systems, The CKY Algorithm,
Chart Parsing, Bottom-Up Left-Corner Parsing, Top-Down Earley-Style Parsing, Example Session.
Semantic Analysis: Basic Concepts and Issues in Natural Language Semantics, Theories and Approaches to Semantic
Representation, Logical Approaches, Discourse Representation Theory, Pustejovsky’s Generative Lexicon, Natural
Semantic Metalanguage, Object-Oriented Semantics, Relational Issues in Lexical Semantics, Sense Relations and
Ontologies, Roles, Fine-Grained Lexical-Semantic Analysis: Three Case Studies, Emotional Meanings: “Sadness”
and “Worry” in English, Ethnogeographical Categories: “Rivers” and “Creeks”, Functional Macro-Categories,
Prospectus and “Hard Problems”.
Unit – III 08 Hrs

Natural Language Generation: Introduction, Generation Compared to Comprehension, The Components of a
Generator, Components and Levels of Representation, Approaches to Text Planning, The Function of the Speaker,
Desiderata for Text Planning, Pushing vs. Pulling, Planning by Progressive Refinement of the Speaker’s Message,
Planning Using Rhetorical Operators, Text Schemas, The Linguistic Component, Surface Realization Components,
Relationship to Linguistic Theory, Chunk Size, Assembling vs. Navigating, Systemic Grammars, Functional
Unification Grammars, The Cutting Edge, Story Generation, Personality-Sensitive Generation, Conclusions.
Unit – IV 10 Hrs

Corpus Creation: Introduction, Corpus Size, Balance, Representativeness, and Sampling, Data Capture and
Copyright, Corpus Markup and Annotation, Multilingual Corpora, Multimodal Corpora.
Part-of-Speech Tagging: Introduction, Parts of Speech, Part-of-Speech Problem, The General Framework,
Part-of-Speech Tagging Approaches, Rule-Based Approaches, Markov Model Approaches, Maximum Entropy Approaches,
Other Statistical and Machine Learning Approaches, Methods and Relevant Work, Combining Taggers.

Unit – V 8 Hrs

Information Retrieval: Introduction, Indexing, Indexing Dimensions, Indexing Process, IR Models, Classical
Boolean Model, Vector-Space Models, Probabilistic Models, Query Expansion and Relevance Feedback, Advanced
Models, Evaluation and Failure Analysis, Evaluation Campaigns, Evaluation Measures, Failure Analysis, Natural
Language Processing and Information Retrieval, Morphology, Orthographic Variation and Spelling Errors, Syntax,
Semantics, Related Applications.
Text Analytics: Text analytics systems, Named entity recognition, Disambiguation, Document clustering: identification
of sets of similar text documents, Term frequency-inverse document frequency (TF-IDF), Analysis and Evaluation of
Current Graph-Based Text Mining Research, Coreference: Relationship, Case study on biomedical text mining.
Text Books:
1. Nitin Indurkhya, Fred J. Damerau, “Handbook of Natural Language Processing”, Chapman & Hall/CRC
Publications, 2nd Edition, 2010.
Reference Books:
1. Tanveer Siddiqui, U. S. Tiwary, “Natural Language Processing & Information Retrieval”, Oxford
University Press, 2008.
2. Anne Kao & Stephen R. Poteet, “Natural Language Processing and Text Mining”, Springer-Verlag, 2007.
