Code Course Title L T P C
10212EC229 AI IN NATURAL LANGUAGE PROCESSING 2 0 2 3
a) Course Category
Program Elective
b) Preamble
This course provides a general introduction to natural language processing, including the use
of finite-state automata for language processing, fundamentals of syntax including basic
parsing, advanced topics such as feature structures and realistic parsing methodologies, basic
concepts of semantic processing, and typical natural language processing applications.
c) Prerequisite
Fundamentals of Machine Learning
d) Related Courses
Deep Learning, Reinforcement Learning
e) Course Outcome
Upon successful completion of the course, students will be able to:
CO Nos.  Course Outcomes                                               Knowledge Level (Based on Revised Bloom's Taxonomy)
CO1      Describe the fundamentals and applications of natural         K2
         language processing
CO2      Apply morphological analysis, inflectional and derivational   K3
         morphology, tree structures for dictionaries, and
         part-of-speech tagging
CO3      Analyze the various approaches to syntax in natural           K4
         language processing
CO4      Analyze the distinction between semantics and discourse       K4
         in natural language processing
CO5      Design an NLP system for various applications by using        K3
         tools for sentiment classification and chatbot systems
f) Correlation of COs with POs
      PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10 PO11 PO12 PSO1 PSO2
CO1   3    2    -    -    1    -    1    -    1    1    -    2    -    -
CO2   3    2    1    -    1    -    1    -    1    1    -    2    -    -
CO3   3    2    2    2    1    -    1    -    1    1    -    2    -    -
CO4   3    2    2    2    1    -    1    -    1    1    -    2    -    -
CO5   3    2    2    2    2    1    1    -    1    1    -    2    -    -
Unit I INTRODUCTION 12
Introduction to NLP, Regular Expressions, Words, Corpora, Text Normalization, Minimum Edit
Distance, N-gram Language Models, Evaluating Language Models, Smoothing.
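As an illustration of the minimum edit distance topic in this unit, the following is a minimal sketch of the standard dynamic-programming computation. The function name min_edit_distance and the sub_cost parameter are choices made for this example and are not prescribed by the syllabus; insertion and deletion are assumed to cost 1 each.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Return the minimum edit distance between two strings.

    Assumptions for this sketch: insertion and deletion cost 1 each;
    substitution costs `sub_cost` (1 gives the usual Levenshtein
    distance, 2 treats a substitution as a delete plus an insert).
    """
    n, m = len(source), len(target)
    # dist[i][j] = cost of turning source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete every character of source[:i]
    for j in range(1, m + 1):
        dist[0][j] = j  # insert every character of target[:j]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                            # deletion
                dist[i][j - 1] + 1,                            # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[n][m]


if __name__ == "__main__":
    print(min_edit_distance("intention", "execution"))
    print(min_edit_distance("intention", "execution", sub_cost=2))
```

The table is filled row by row, so the running time is proportional to the product of the two string lengths.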
Unit II MORPHOLOGY AND PART OF SPEECH TAGGING 12
Linguistic Essentials - Lexical Syntax - Morphology and Finite State Transducers - English Word
Classes - The Penn Treebank Part-of-Speech Tagging - Named Entities and Named Entity Tagging -
Rule-Based Part-of-Speech Tagging - HMM Part-of-Speech Tagging - Conditional Random Fields -
Evaluation of Named Entity Recognition
Unit III SYNTAX ANALYSIS 12
Constituency Grammars - Context-Free Grammars for English - Treebanks - Lexicalized Grammars -
Constituency Parsing - Dependency Parsing
Unit IV SEMANTIC AND DISCOURSE ANALYSIS 12
Representing Meaning - Semantic Analysis - Lexical Semantics - Word Sense Disambiguation:
Supervised, Dictionary-Based, and Unsupervised Approaches - Compositional Semantics - Semantic
Role Labeling and Semantic Parsing - Discourse Analysis.
Unit V APPLICATIONS & CASE STUDIES 12
Question Answering - Case Study of Sentiment Classification - Dialogue Systems
Total: 60 Hours
Text Book:
1. Daniel Jurafsky and James H. Martin, Speech and Language Processing, 2nd ed., Prentice
Hall, 2008.
2. Roland R. Hausser, Foundations of Computational Linguistics: Human-Computer Communication
in Natural Language, Paperback, MIT Press, 2011.
References:
1. Charu C. Aggarwal, Machine Learning for Text, Springer, 2018.
2. Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language
Processing, MIT Press, 1999.
3. Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python, 1st ed.,
O'Reilly Media, 2009.
Supplementary Resources:
1. [Link]
2. [Link]
3. [Link]
4. [Link]processing-understanding-text-9f4abfd13e72
5. [Link]
6. NLTK (Natural Language Toolkit) - [Link]
7. [Link]
8. [Link]
9. [Link]
10. [Link]/~klein/cs294-5/[Link]
11. [Link]
12. [Link]
List of Experiments
Sl. No.  Experiment                                                         CO Mapping
CYCLE-1
1        Write a program to tokenize text                                   CO1
2        Write a program to count word frequency and to remove stop words   CO1
3        Write a program to tokenize non-English languages                  CO2
4        Write a program to get synonyms from WordNet                       CO2
5        Write a program to get antonyms from WordNet                       CO3
CYCLE-2
6        Write a program for stemming non-English words                     CO3
7        Write a program for lemmatizing words using WordNet                CO4
8        Case study-based program (IBM) or sentiment analysis               CO4
9        Write a program for POS tagging or word embeddings                 CO5
10       Write a program to differentiate stemming and lemmatizing words    CO5
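As a starting point for Experiments 1 and 2, the following is a minimal sketch that tokenizes text, removes stop words, and counts word frequency using only the Python standard library. The regular-expression tokenizer and the short STOP_WORDS set are illustrative assumptions for this example; in the lab, a toolkit such as NLTK (listed under Supplementary Resources) would typically supply the tokenizer and a fuller stop-word list.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption for this sketch, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on"}


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Keeps simple contractions such as "don't" as a single token.
    """
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count tokens in `text` after dropping any that appear in `stop_words`."""
    tokens = [t for t in tokenize(text) if t not in stop_words]
    return Counter(tokens)


if __name__ == "__main__":
    sample = "The cat and the dog. The cat!"
    print(tokenize(sample))
    print(word_frequencies(sample).most_common())
```

A regular expression is used here only to keep the sketch self-contained; it handles English-like text and would need to be replaced for the non-English tokenization in Experiment 3.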