How do Google Home, Siri and
Alexa understand me?
Natural Language Processing
Getting Started
Resources:
• [Link] um21/publication/secondary/Class10_Facilitator_
• [Link]
• [Link] um20/AI_Curriculum_Handbook.pdf
Agenda:
• NLP Concept
• How does NLP work?
Activity
Solve the puzzle
How many independent nations are there in the Asian continent?
There are 48 independent nations in Asia.
• How do we understand what others say or write?
• How does a computer understand what we say in our language?
• Let us experience it with the help of this AI Game:
Identify the mystery animal- it’s a voice experiment guessing game:
[Link]
[Link]
Mystery Animal
• The machine acts as an animal that it has picked at random, and the player gets 20 chances to guess that animal.
• The player can ask up to 20 yes/no questions, and the machine answers each with yes or no. The machine interprets the meaning of the questions with the help of NLP and answers accordingly.
20Q
• 20Q will read your mind by asking a few simple questions.
• The object you think of should be something that most people would know about, but not a proper noun or a specific person, place, or thing.
Activity 1
Ask Questions from students – Mystery Animal
• Were you able to guess the animal?
• If yes, in how many questions were you able to guess it? (students can make a table for tries and
number of questions)
• If no, how many times did you try playing this game?
• What according to you was the task of the machine?
• Were there any challenges that you faced while playing this game? If yes, list them down.
• What approach must one follow to win this game?
• If you play for a long time does the performance change?
• If you ask irrelevant questions (like "How is the weather?", i.e. adding noise), what happens to the performance?
• Your observation (any other)
Ask Questions from students – 20 Q
• Was the app able to guess the object?
• If yes, in how many questions was it able to guess?
• If no, how many times did you try playing this game?
• What according to you was the task of the machine?
• Were there any challenges that you faced while playing this game? If yes, list them down.
• If you play for a long time does the performance change?
• If you answer incorrectly, what will be the performance?
• Your observation (it shows responses / training contradictions detected)
Natural Language Processing
• Natural Language Processing, abbreviated as NLP, is a branch of Artificial Intelligence that
deals with the interaction between computers and humans using natural language.
Natural language refers to language that is spoken and written by people, and natural
language processing (NLP) attempts to extract information from the spoken and written
word using algorithms.
• The ultimate objective of NLP is to read, decipher, understand, and make sense of human
languages in a manner that is valuable. Example: a spam vs. ham (non-spam) email filter.
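As a toy illustration of the spam/ham idea, here is a sketch using a made-up keyword list; real spam filters learn such signals from data rather than using fixed keywords:

```python
# Toy spam/ham filter: flags a message as spam if it contains any
# word from a hypothetical keyword list. Real filters learn these
# signals from training data; this only illustrates the idea.
SPAM_WORDS = {"lottery", "winner", "free", "prize", "urgent"}

def classify(message):
    words = set(message.lower().split())
    return "spam" if words & SPAM_WORDS else "ham"

print(classify("You are the lucky winner of a free prize"))  # spam
print(classify("Meeting rescheduled to 3 pm tomorrow"))      # ham
```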
STEPS for any AI model
Problem Scoping
• To understand the problem to be solved and its business context
Data Acquisition
• To understand the action implied by a statement, we need to collect statement data so the machine can interpret the words/text people use and understand their meaning. Such data can be collected through various means:
1. Statements written or spoken by people
2. Databases available on the internet, etc.
Data Exploration
• Once the textual data has been collected, it needs to be processed and cleaned so that a simpler version can be sent to the machine. The text is therefore normalised through various steps and reduced to a minimal vocabulary, since the machine does not require grammatically correct statements but only their essence.
Modelling
• Once the text has been normalised, it is fed to an NLP-based AI model. Note that in NLP, the data must be pre-processed before it is fed to the machine.
Evaluation
• The trained model is then evaluated, and its accuracy is measured on the basis of the relevance of the answers the machine gives to the user's inputs. To understand the efficiency of the model, the answers suggested by the chatbot are compared with the actual answers; if they match accurately, the model is deployed.
• Mitsuku Bot [Link]
Mitsuku is an emotionally intelligent chatbot that converses with users in a very human way, with humour, empathy and even a little sass.
• CleverBot [Link]
At the annual event "The Loebner Prize", A.I. specialists from around the world pit their bots against a panel of judges, and the most human-like bot is the winner.
• Jabberwacky [Link]
Used for Cognitive Behaviour Therapy to understand the behaviour and mindset of people; therapists use it to treat patients.
• Haptik [Link]
Chatbots reduce customer-service calls, give responses quickly, and increase sales. Chatbots are used in banking, hospitality, education, food delivery, etc.
• Rose [Link] / [Link]
• Ochatbot [Link]
Activity 2
Discussion
• Which chatbot did you try? Name any one.
• What is the purpose of the chatbot?
• How was the interaction with the chatbot?
• Did the chat feel like talking to a human or a robot? Why do you think so?
• Do you feel that the chatbot has a certain personality? (sports loving/ news loving etc)
• Were the responses the same or different when you asked the same questions?
Conclusion
As you interact with more and more chatbots, you will realise that some of them are
scripted, or in other words traditional chatbots, while others are AI-powered and
have more knowledge. From this experience, we can understand that there are
2 types of chatbots around us: script-bots and smart-bots. Let us understand what
each of them means in detail:
Rule-based/Traditional Model vs AI Model
Example: Script-bot: Inklet, Story Speaker [Link]
Example: Smart-bot: Alexa, Cortana, Siri, Google Assistant [Link]
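A script-bot can be sketched in a few lines of Python; the phrases and replies below are hypothetical, chosen only to show how a fixed script differs from an NLP-driven smart-bot:

```python
# Minimal script-bot: replies come from a fixed script keyed on exact
# phrases, so anything outside the script falls through to a default
# reply. Smart-bots instead use NLP to interpret the user's intent.
SCRIPT = {
    "hi": "Hello! How can I help you?",
    "what are your timings": "We are open 9 am to 5 pm.",
    "bye": "Goodbye!",
}

def script_bot(user_input):
    # Look the message up in the script; unknown inputs get a canned reply.
    return SCRIPT.get(user_input.lower().strip(),
                      "Sorry, I did not understand that.")

print(script_bot("Hi"))                      # Hello! How can I help you?
print(script_bot("Do you open on Sundays?")) # Sorry, I did not understand that.
```

Even a slight rephrasing of a scripted question defeats the bot, which is exactly the limitation NLP-based smart-bots try to overcome.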
Activity 3
NLP is used in
• Sentiment Analysis – finding whether a text leans towards a positive or negative sentiment, e.g. "I love the new iPhone" and, a few lines later, "But sometimes it doesn't work well". Used for sentiment about products (Amazon), movies (Netflix), and food/restaurants (Yelp).
• Text Classification – categorizing text into various categories, e.g. spam email/SMS, technology, sports, fashion.
• Document Summarization – compressing a paragraph/document into a few words or sentences; paraphrasing.
• Parts-of-Speech Tagging – labelling each word with its part of speech; used, for example, in text-to-speech conversion.
• Text Translation – [Link]
• Chatbot conversation – reduces customer-service calls, gives responses quickly, increases sales.
• Virtual Assistants – Google Assistant, Cortana, Siri, Alexa.
Human Language Vs Computer Language
• Humans communicate through language which we process all the time. Even in the classroom, as the teacher delivers the
session, our brain is continuously processing everything and storing it in some place. Also, while this is happening, when
your friend whispers something, the focus of your brain automatically shifts from the teacher’s speech to your friend’s
conversation. So now, the brain is processing both the sounds but is prioritizing the one on which our interest lies ☺.
• The sound reaches the brain through a long channel. As a person speaks, the sound travels from their mouth to the listener's eardrum. The sound striking the eardrum is converted into nerve impulses, which are transported to the brain and processed. After processing the signal, the brain works out its meaning. If it is clear, the signal gets stored; otherwise, the listener asks the speaker for clarification. This is how humans process human languages.
• Computers understand the language of numbers. Everything sent to the machine has to be converted into numbers, and if a single mistake is made, the computer throws an error and does not process that part.
• Now, if we want the machine to understand our language, how should this happen? What are the possible difficulties a
machine would face in processing natural language? Let us take a look at some of them :
Challenges in understanding Natural Language by Machine
* Arrangement of words and meaning – "I like bananas" is not the same as "bananas like I".
* Multiple meanings of a word – "date" (the fruit) vs. "date" (an appointment): "I will have a date."
* Idioms and ambiguity – "It's raining cats and dogs."
* Anaphora resolution – "Ritika is my friend; she loves to read." Who is "she"?
* Grammar and morphology – even Google Translate sometimes struggles to convert text perfectly from one language to another.
* Perfect syntax with no meaning – "Chickens feed extravagantly while the moon drinks tea."
Making the task difficult
[Link]
• Text is messy and unstructured, while ML prefers structured, well-defined,
fixed-length inputs.
• Using NLP (the Bag-of-Words technique), we can convert variable-length texts
into fixed-length vectors.
• ML works with numerical data rather than textual data.
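A minimal sketch of this idea in Python, using two made-up sentences: each text, whatever its length, becomes a vector of counts over one shared vocabulary.

```python
# Bag-of-Words turns variable-length texts into fixed-length vectors:
# every document is represented by word counts over a shared vocabulary.
texts = ["I love my dog", "my dog loves my dog food"]

# Shared vocabulary: every distinct lowercase word, in sorted order.
vocab = sorted({w for t in texts for w in t.lower().split()})

# One count vector per text; all vectors have the same length as vocab.
vectors = [[t.lower().split().count(w) for w in vocab] for t in texts]

print(vocab)    # ['dog', 'food', 'i', 'love', 'loves', 'my']
print(vectors)  # [[1, 0, 1, 1, 0, 1], [2, 1, 0, 0, 1, 2]]
```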
How does NLP work?
• Data Processing – convert our language into numbers using text normalization, since the computer understands numbers.
• Text Normalization – collect the text from all the documents, i.e. the corpus.
• Sentence Segmentation – the corpus is divided into sentences.
• Tokenization – each sentence is further divided into tokens.
• Removing unnecessary tokens – stop words, special characters, prepositions.
• Converting text to a common case (usually lower case).
• Stemming – reducing the remaining words to root words by stripping affixes such as "-ing" and "-ed"; the stem may not be a meaningful word.
• Lemmatization – like stemming, but the resulting root word (lemma) always has a meaning.
• Bag of Words (BoW) – count the occurrence/frequency of each word and construct the vocabulary for the corpus. Steps to implement are as follows:
• Create the dictionary (vocabulary)
• Create a document vector for each document (Term Frequency)
• Create document vectors for all documents
• Compute TF-IDF (Term Frequency – Inverse Document Frequency)
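The normalization steps above can be sketched in Python; the stop-word list and the suffix-stripping "stemmer" below are simplified stand-ins for what real NLP libraries (such as NLTK) provide:

```python
# Sketch of the text-normalization steps: sentence segmentation,
# tokenization, lower-casing, stop-word removal, and naive stemming.
import re

STOP_WORDS = {"and", "are", "to", "a", "is", "the"}

def normalize(corpus):
    # Sentence segmentation: split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", corpus.strip())
    result = []
    for s in sentences:
        tokens = re.findall(r"[a-zA-Z]+", s)                   # tokenization
        tokens = [t.lower() for t in tokens]                   # common case
        tokens = [t for t in tokens if t not in STOP_WORDS]    # stop words
        tokens = [re.sub(r"(ing|ed)$", "", t) for t in tokens]  # crude stemming
        result.append(tokens)
    return result

print(normalize("Aman and Anil are stressed. Aman went to a therapist."))
# [['aman', 'anil', 'stress'], ['aman', 'went', 'therapist']]
```

Note the crude stemmer keeps "therapist" intact, whereas the worked example below truncates it to "therap"; real stemmers apply many more rules.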
Unplugged activity - step by step approach to implement
Step 1: Collecting data and pre-processing it.
Document 1: Aman and Anil are stressed.
Document 2: Aman went to a therapist.
Document 3: Anil went to download a health chatbot.
Here are three documents having one sentence each.
Corpus: Aman and Anil are stressed. Aman went to a therapist. Anil went to download a health chatbot.
Corpus divided into sentence- Sentence Segmentation
Sentence 1: Aman and Anil are stressed.
Sentence 2: Aman went to a therapist.
Sentence 3: Anil went to download a health chatbot.
Sentence divided into tokens - Tokenization
Sentence 1 with tokens: [Aman, and, Anil, are, stressed,.]
Sentence 2 with tokens: [Aman, went, to, a, therapist,.]
Sentence 3 with tokens: [Anil, went, to, download, a, health, chatbot,.]
Removing unnecessary tokens – stop words ("and", "are", "to", "a") and punctuation
• Sentence 1: [Aman, Anil, stressed]
• Sentence 2: [Aman, went, therapist]
• Sentence 3: [Anil, went, download, health, chatbot]
Converting text to common case- lower case
• Sentence 1: [aman, anil, stressed]
• Sentence 2: [aman, went, therapist]
• Sentence 3: [anil, went, download, health, chatbot]
Stemming-reducing remaining words/ verbs to root words
• Sentence 1: [aman, anil, stress]
• Sentence 2: [aman, went, therap]
• Sentence 3: [anil, went, download, health, chatbot]
Lemmatization – unlike stemming, the resulting root word (lemma) has a meaning
• Document 1: [aman, anil, stress]
• Document 2: [aman, went, therapy]
• Document 3: [anil, went, download, health, chatbot]
Bag of Words - frequency of each word and construct the vocabulary for the corpus
aman anil stress went therapy download health chatbot
Repeated words are written just once.
Create a document vector for each document
          aman  anil  stress  went  therapy  download  health  chatbot
Doc 1:      1     1      1      0       0        0        0        0
• Prepare for all documents (Term Frequency – TF of words)
                aman  anil  stress  went  therapy  download  health  chatbot
Sentence 1:       1     1      1      0       0        0        0        0
Sentence 2:       1     0      0      1       1        0        0        0
Sentence 3:       0     1      0      1       0        1        1        1
Doc frequency:    2     2      1      2       1        1        1        1
• Create Inverse Document Frequency (IDF)
IDF = total number of documents / document frequency
(numerator = total number of documents; denominator = document frequency)
          aman  anil  stress  went  therapy  download  health  chatbot
IDF:       3/2   3/2    3/1    3/2     3/1      3/1      3/1      3/1
TFIDF for any word (Term Frequency and Inverse Document Frequency):
TFIDF(W)= TF(W) *log(IDF(W))
            aman        anil        stress      went        therapy     download    health      chatbot
Sentence 1: 1×log(3/2)  1×log(3/2)  1×log(3/1)  0×log(3/2)  0×log(3/1)  0×log(3/1)  0×log(3/1)  0×log(3/1)
Sentence 2: 1×log(3/2)  0×log(3/2)  0×log(3/1)  1×log(3/2)  1×log(3/1)  0×log(3/1)  0×log(3/1)  0×log(3/1)
Sentence 3: 0×log(3/2)  1×log(3/2)  0×log(3/1)  1×log(3/2)  0×log(3/1)  1×log(3/1)  1×log(3/1)  1×log(3/1)
• The words have now been converted to numbers. Each value shows the importance of a word within a particular document (its TF-IDF weight).
            aman   anil   stress  went   therapy  download  health  chatbot
Sentence 1: 0.176  0.176  0.477   0      0        0         0       0
Sentence 2: 0.176  0      0       0.176  0.477    0         0       0
Sentence 3: 0      0.176  0       0.176  0        0.477     0.477   0.477
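The numbers in this table can be checked with a short Python script, using log base 10 as in the worked example:

```python
# Reproduce the TF-IDF table for the three lemmatized documents above.
import math

docs = [
    ["aman", "anil", "stress"],
    ["aman", "went", "therapy"],
    ["anil", "went", "download", "health", "chatbot"],
]
vocab = ["aman", "anil", "stress", "went", "therapy",
         "download", "health", "chatbot"]
N = len(docs)  # total number of documents

for doc in docs:
    row = []
    for w in vocab:
        tf = doc.count(w)                   # term frequency in this document
        df = sum(w in d for d in docs)      # document frequency
        row.append(round(tf * math.log10(N / df), 3))
    print(row)
# [0.176, 0.176, 0.477, 0.0, 0.0, 0.0, 0.0, 0.0]
# [0.176, 0.0, 0.0, 0.176, 0.477, 0.0, 0.0, 0.0]
# [0.0, 0.176, 0.0, 0.176, 0.0, 0.477, 0.477, 0.477]
```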
Example 2
Pre-process the given data:
• Document 1: Welcome to Great Learning, Now start learning.
• Document 2: Learning is a good practice.
2 Documents
• Make a Corpus:
Corpus
Welcome to Great Learning, Now start learning. Learning is
a good practice.
Sentence Segmentation
Sentence 1: Welcome to Great Learning, Now start learning.
Sentence 2: Learning is a good practice.
Sentence divided into tokens
Sentence 1 with tokens: [Welcome, to, Great, Learning,,, Now,
start, learning,. ]
Sentence 2 with tokens: [Learning, is, a, good, practice,.]
Removing unnecessary tokens – stop words ("to", "is", "a") and punctuation
• Sentence 1: [Welcome, Great, Learning, Now, start, learning]
(note: "Learning" and "learning" are still distinct tokens, since the case has not yet been normalised)
• Sentence 2: [Learning, good, practice]
Converting text to common case- lower case
• Sentence 1: [welcome, great, learning, now, start, learning]
• Sentence 2: [learning, good, practice]
Stemming-reducing remaining words/ verbs to root words
• Sentence 1: [welcome, great, learn, now, start, learn]
• Sentence 2: [learn, good, practice]
Lemmatization – unlike stemming, the resulting root word (lemma) has a meaning
Sentence 1: [welcome, great, learn, now, start, learn]
Sentence 2: [learn, good, practice]
Bag of Words - frequency of each word and construct the vocabulary for the corpus
welcome great learn now start good practice
Repeated words are written just once.
Create a document vector for each document
            welcome  great  learn  now  start  good  practice
Sentence 1:    1       1      2     1     1      0       0
Sentence 2:    0       0      1     0     0      1       1
Term Frequency
            welcome  great  learn  now  start  good  practice
Sentence 1:    1       1      2     1     1      0       0
Sentence 2:    0       0      1     0     0      1       1

Document Frequency (number of documents containing the word)
               1       1      2     1     1      1       1

Inverse Document Frequency = total number of documents / document frequency
              2/1     2/1    2/2   2/1   2/1    2/1     2/1

TFIDF of any word: TFIDF(W) = TF(W) × log(IDF(W))
            welcome     great       learn       now         start       good        practice
Sentence 1: 1×log(2/1)  1×log(2/1)  2×log(2/2)  1×log(2/1)  1×log(2/1)  0×log(2/1)  0×log(2/1)
Sentence 2: 0×log(2/1)  0×log(2/1)  1×log(2/2)  0×log(2/1)  0×log(2/1)  1×log(2/1)  1×log(2/1)
Thank you!
"The capacity to learn is a gift,
the ability to learn is a skill,
the willingness to learn is a choice!"
- Brian Herbert
Content taken is the property of individual organizations and are used here for reference purpose only.