
NLP Course Outline 24-25

The Natural Language Processing (Text Analytics) course at Goa Institute of Management focuses on techniques for mining and analyzing text data, with hands-on experience in text preprocessing, classification, sentiment analysis, and topic modeling. The course aims to develop practical problem-solving skills and includes various evaluation methods such as assignments and exams. Key topics covered include Python for NLP, data annotation, feature engineering, web scraping, and transfer learning.

GOA INSTITUTE OF MANAGEMENT

Sanquelim Campus, Poriem, Sattari, Goa

Course: Natural Language Processing (Text Analytics)
Program: BDA
Term: 3
No. of Sessions: 24
Academic Year: 2024-2025
Course Instructor(s): Prof. Soumen
No. of Credits: 3
email: [email protected]

1. ABOUT THE COURSE:


a. Brief Description:
This course will cover the major techniques for mining and analyzing text data to discover
interesting patterns, extract useful knowledge, and support decision-making. Detailed analysis of
text data requires an understanding of natural language text, which is known to be a difficult task
for computers. However, a number of statistical approaches have been shown to work well for
the "shallow" but robust analysis of text data for pattern finding and knowledge discovery.
Students will get hands-on experience in core text mining techniques including text
preprocessing, text classification, sentiment and emotion analysis, and topic modeling that will
help them to become competent data scientists.

b. Teaching Learning Method:


A major emphasis of this course will be the development of skills through practical
problem-solving. Knowledge and understanding of principles will be developed through
textbooks, practical problems, cases, and lectures. Students will apply their analytical
skills to the core principles and concepts through hands-on exercises.

2. PROGRAMME LEARNING OUTCOMES (PLO) AND COURSE LEARNING OUTCOMES (CLO)

S.No. | Course Learning Outcomes (CLOs)* | Bloom’s learning level | PLOs mapped / Independent CLO**
1 | Graduating students will learn the NLP tools and techniques to analyze text data. | 2 | C.1. Graduating students will be able to illustrate the latest business practices.
2 | Graduating students will learn how to classify text data. | 3, 4, 5 | C.2. Graduating students will be able to apply the latest technology relevant to BDA.
3 | Graduating students will learn topic modeling for text data. | 3, 4, 5 | C.2. Graduating students will be able to apply the latest technology relevant to BDA.
4 | Graduating students will learn sentiment analysis on text data. | 3, 4, 5 | C.2. Graduating students will be able to apply the latest technology relevant to BDA.
5 | Graduating students will learn Web scraping and summarization of text data. | 3, 4, 5 | C.2. Graduating students will be able to apply the latest technology relevant to BDA.
6 | Graduating students will learn Text Similarity and Clustering. | 3, 4, 5 | C.2. Graduating students will be able to apply the latest technology relevant to BDA.

* Minimum 6 CLOs for a course with more than 2 credits and minimum 4 CLOs for a course with less than 2 credits.
** This should be as per the Curriculum map of the academic year in which the course is being conducted. All PLOs mapped to
the course should be included. Please state the PLO in full.
For CLOs that are course specific and not linked to any PLO, please state ‘Independent’.

Course outline format Version 2
Applicable date – 1st June, 2022

3. SYLLABUS
a. Synchronous (topics and sub-topics to be covered in class) –

Introduction
• Main Tasks of Text Data Mining
• Existing Challenges in Text Data Mining

Python for NLP


• NLP Tools and Libraries
• String literals
• String operations and methods
• Regular Expressions

Data Annotation and Preprocessing


• Tag removal, Stop word removal
• Stemming, Lemmatization
• POS tagging
• N-gram Language Model
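
The preprocessing topics listed above can be illustrated with a minimal sketch that uses only the Python standard library. The stop-word list, the helper names (`strip_tags`, `tokenize`, `remove_stop_words`, `ngrams`), and the sample sentence are hypothetical and chosen for illustration; they are not taken from the course material, and stemming, lemmatization, and POS tagging are left out because they normally rely on an NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def strip_tags(text):
    """Replace anything that looks like an HTML/XML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Lowercase the text and keep runs of the letters a-z as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-token tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


raw = "<p>The price of the phone is great and the camera is great</p>"
tokens = remove_stop_words(tokenize(strip_tags(raw)))
# Bigram counts are the raw statistics a bigram language model is estimated from.
bigram_counts = Counter(ngrams(tokens, 2))
```

Here `tokens` holds the cleaned word sequence and `bigram_counts` maps each adjacent word pair to its frequency.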

Feature engineering for Text Representation


• Bag-of-Words model
• CBOW and Skip-Gram Model
• TF-IDF Model
• Word2vec model
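
As a rough illustration of the Bag-of-Words and TF-IDF topics above, the sketch below computes both from scratch. It assumes one common weighting variant (term frequency normalized by document length, and inverse document frequency as the natural log of N / df); libraries often apply different smoothing. The function names and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Bag-of-Words: map each term to its raw count in one document."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        counts = bag_of_words(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized toy corpus.
docs = [
    ["good", "phone", "good", "camera"],
    ["bad", "phone"],
    ["good", "battery"],
]
weights = tf_idf(docs)
```

Under this variant a term that occurs in every document gets weight zero, while a term confined to one short document (such as "battery" here) gets the largest weight.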

Web Scraping
• HTML page understanding
• Information extraction from e-commerce website

Text Classification and Evaluation


• Naïve Bayes Model
• Logistic regression model
• Support Vector Machine

Topic Model
• Latent Semantic Analysis
• Latent Dirichlet Allocation

Sentiment Analysis and Opinion Mining
• Word-Level Sentiment Analysis and Sentiment Lexicon
• Aspect-Level Sentiment Analysis

Text Similarity and Clustering


• Document similarities
• Recommendation system
• Document Clustering
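
Document similarity, the first topic above, is often measured with cosine similarity between term-count vectors. The sketch below is one minimal way to do that with the standard library; the function name and the example token lists are hypothetical, and TF-IDF weights or embeddings could be substituted for the raw counts.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical tokenized documents sharing one of their two terms.
score = cosine_similarity(["good", "phone"], ["good", "camera"])
```

Scores range from 0 (no shared terms) to 1 (identical term proportions), so pairwise scores of this kind can feed a simple recommender or a clustering step.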

Information Extraction
• Named Entity Recognition
• Event Extraction

Automatic Text Summarization


• Extraction-Based Summarization
• Query-Based Automatic Summarization

Transfer Learning
• Universal Sentence Encoder
• Deep Averaging Network (DAN)
• BERT

b. Asynchronous (topics and subtopics for self-study through provided learning resources and
MOOCs) -

4. EVALUATION COMPONENTS:

S. No. | Evaluation method | Weight | When | CLOs to be assessed
1 | Assignment | 20% | Middle |
2 | Group Assignment | 20% | End |
3 | Mid-term | 20% | Middle | 1
4 | End-term | 40% | End | 2, 3, 4, 5, 6

* All CLOs are to be evaluated at least once in the Mid-term or End-term. Level 1 and 2 CLOs should be
evaluated only in the Mid-term.

5. TEXT BOOKS AND LEARNING MATERIALS


a. TEXT BOOKS
i. Chengqing Zong, Rui Xia, Jiajun Zhang; Text Data Mining, Springer, 2021.
ii. Alexandra George; Python Text Mining, BPB Publications, 2022.
iii. Benjamin Bengfort, Rebecca Bilbro, Tony Ojeda; Applied Text Analysis with Python:
Enabling Language-Aware Data Products with Machine Learning, O’Reilly, 2021.

b. REFERENCE BOOKS
i. Markus Hofmann, Andrew Chisholm; Text Mining and Visualization: Case Studies Using
Open-Source Tools, CRC Press.
ii. Murugan Anandarajan, Chelsey Hill, Thomas Nolan; Practical Text Analytics:
Maximizing the Value of Text Data, Springer International Publishing.
c. ADDITIONAL RESOURCES (JOURNALS, WEBSITE, VIDEO LINKS ETC.)
i. Notes and code will be shared
d. TECHNOLOGY AND SOFTWARE
i. Python
e. MOOC courses
i.
6. SESSION PLAN
SESSION NO | TOPIC | Faculty | Readings (Case, chapter, article etc.) | CLO directly covered | Mode (Online / Offline / Hybrid)
1-2 | Introduction | Prof. Soumen | | 1 | Offline
3-4 | Python for NLP | | | 1 | Offline
5-6 | Data Annotation and Preprocessing | | | 1 | Offline
7-8 | Feature engineering for Text Representation | | | 1 | Offline
9-12 | Web scraping | | | 5 | Offline
13-14 | Text Classification and Evaluation | | | 2 | Offline
15-16 | Topic Model | | | 3 | Offline
17 | Sentiment Analysis and Opinion Mining | | | 5 | Offline
18-19 | Text Similarity and Clustering | | | 6 | Offline
20-21 | Information Extraction and Automatic Text Summarization | | | 5 | Offline
22-24 | Transfer Learning | | | | Offline

7. Other -
