
Natural Language Processing

Presented by M.Mohana
What is NLP?
• NLP stands for Natural Language Processing
• Branch of artificial intelligence (AI) that focuses on the interaction
between computers and human language
• Helps computers to understand human language and also allows
machines to communicate with us.
• Algorithms and models that enable computers to understand,
interpret, and generate human language in a meaningful way.
• For instance, Google’s keyboard suggests auto-corrections, and email clients predict the next word as you write.
• Translation systems use language modeling to work efficiently
with multiple languages.

03-04-2024 M.Mohana, Research Scholar Image Source: Google 2


How does Natural
Language Processing work?

• Converts unstructured data into a computer-readable form using NLP techniques.
• Complex algorithms break down text content to extract meaningful information from it.
• The collected data is then used to further teach machines the logic of natural language.
• Uses syntactic and semantic analysis to guide machines by identifying and recognizing data patterns.
Image Source: Google

4/3/2024 M.Mohana, Research Scholar 3


Why is natural language processing important?

• Facilitates Human-Computer Interaction: lets users communicate with devices, applications, and systems using spoken or written language.
• Text Understanding and Analysis: enables organizations to analyze customer feedback, social media content, news articles, and other textual information to understand trends, sentiments, and patterns.
• Automates Routine Tasks: tasks such as text summarization, sentiment analysis, document classification, and information extraction, improving efficiency and productivity.
• Enhances Search and Information Retrieval: search engines and information retrieval systems by
understanding user queries and returning relevant results
• Supports Multilingual Communication: machine translation and language understanding capabilities,
NLP facilitates communication across different languages
• Enables Sentiment Analysis and Opinion Mining: sentiment analysis analyzes and interprets the
sentiment or emotion expressed in text data
• Empowers Chatbots and Virtual Assistants: understand user queries, provide information, perform
tasks, and offer personalized recommendations, enhancing customer service and user experience.

03-04-2024 M.Mohana, Research Scholar 4


History of NLP

https://blog.dataiku.com/nlp-metamorphosis
03-04-2024 M.Mohana, Research Scholar 5
Components of NLP
Natural Language Understanding (NLU)
• NLU focuses on interpreting and extracting meaning from human language input.
• It involves techniques such as text parsing, entity recognition, sentiment analysis, and intent detection.
• NLU systems aim to comprehend the content of text or speech input to extract relevant information and understand the user's intentions or queries.
• Examples of NLU applications include chatbots that understand user queries, sentiment analysis tools that analyze emotions in text, and voice assistants that interpret spoken commands.

Natural Language Generation (NLG)
• NLG, on the other hand, deals with creating human-like text or speech output based on structured data or input from NLU systems.
• NLG systems generate coherent and contextually relevant text or speech by combining linguistic rules, templates, and sometimes machine learning models.
• NLG applications include text summarization, language translation, chatbot responses, and content generation for news articles or reports.

03-04-2024 M.Mohana, Research Scholar 6


NLP Applications

Translating the languages


Text processing in various languages
Automatic Text Summarization
Analyzing sentiments
Image Source: Google
Speech Recognition
Named Entity Recognition
Phrase Extraction
Tense Identification
Relationship Extraction and so on.

03-04-2024 M.Mohana, Research Scholar 7


NLP Pipeline
• Sentence Segmentation – breaks the paragraph into sentences.
• Word Tokenization – breaks each sentence into separate words or tokens.
• Stemming – normalizes words into their base or root form (celebrate → celebrates, celebrated, celebrating); the result sometimes has no meaning.
• Lemmatization – similar to stemming, but the root word (the lemma) has meaning; used to group different inflected forms of a word.
• Identifying Stop Words – stop words such as "is, and, the, a" may be filtered out before doing any statistical analysis.
• Dependency Parsing – finds how all the words in the sentence are related to each other.
• POS Tags – parts of speech; indicate how a word functions, both in meaning and grammatically, within the sentence.
• Named Entity Recognition – detects named entities such as person names, movie names, organization names, or locations.
• Chunking – collects individual pieces of information and groups them into bigger pieces (phrases).
03-04-2024 M.Mohana, Research Scholar 8
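To make the pipeline concrete, here is a minimal sketch of the first few steps (segmentation, tokenization, stop-word filtering, stemming, lemmatization) using NLTK; it assumes the punkt, stopwords, and wordnet resources can be downloaded, and the sample sentence is only illustrative.

```python
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# one-time resource downloads (assumed to succeed)
nltk.download("punkt"); nltk.download("stopwords"); nltk.download("wordnet")

text = "NLP is fun. We celebrated after celebrating the celebrated results."

sentences = sent_tokenize(text)                  # sentence segmentation
tokens = word_tokenize(sentences[1])             # word tokenization

stops = set(stopwords.words("english"))
content = [t for t in tokens if t.isalpha() and t.lower() not in stops]  # stop-word filtering

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
print([stemmer.stem(t) for t in content])                    # e.g. 'celebr' (may not be a real word)
print([lemmatizer.lemmatize(t, pos="v") for t in content])   # e.g. 'celebrate' (a real lemma)
```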
Phases of NLP

Lexical Analysis
• Scans the input text as a stream of characters and converts it into lexemes.
• Divides the whole text into paragraphs, sentences, and words.

Syntactic Analysis
• Used to check grammar and word arrangement and to show the relationships among the words.
• "I play" (rejected – incomplete meaning); "I play cricket" (accepted – full meaning).

Semantic Analysis
• Concerned with meaning representation.
• Focuses on the literal meaning of words, phrases, and sentences.

Discourse Integration
• The meaning of a sentence depends on the sentences that precede it and also influences the meaning of the sentences that follow it.

Pragmatic Analysis
• Discovers the intended effect by applying a set of rules that characterize cooperative dialogues.
• "Open the door" is interpreted as a request rather than an order.

03-04-2024 M.Mohana, Research Scholar 9


Things We need to know for learning NLP

Pre-requisites: basic ML algorithms, basic deep learning concepts, mathematics, and Python for NLP.

Topics to cover:
• Text processing techniques
• Word embedding
• Deep learning networks for NLP (CNN, LSTM, GRUs, encoder and decoder)
• Attention mechanism, transfer learning in NLP
• Transformers (BERT, GPT, ALBERT, and so on)
• Fine-tuning for NLP tasks
• Large Language Models (LLMs)

03-04-2024 M.Mohana, Research Scholar 10


Roadmap List for NLP
• Calculus, Linear Algebra, Stats and Probability
• Text Preprocessing
• Feature Extraction
• Part-of-Speech Tagging
• Word Embedding
• Text Similarity
• Information Extraction
• Named Entity Extraction
• Semantic Similarity
• Text Clustering
• Text Classification
• Text Summarization
• Sentiment Analysis
• Machine Translation
• Chatbot
• Text to Speech
• Speech to Text

03-04-2024 M.Mohana, Research Scholar 11


Image Source: Google

Open sources and Library available for NLP


4/3/2024 M.Mohana, Research Scholar 12
Important Reading to know about NLP approaches
Beginning Level
• RNN and LSTM – https://arxiv.org/pdf/1808.03314.pdf
• Word2Vec – https://arxiv.org/pdf/1301.3781
• Attention for Translation Task – https://arxiv.org/abs/1409.0473
• Attention Is All You Need – https://arxiv.org/abs/1706.03762, https://youtu.be/7vHquWmUriE?si=JkiXw-b6i371hyQX
• BERT – https://arxiv.org/abs/1810.04805
• GPT-1 – https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
• GPT-2 – https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
• Research Papers – https://www.kaggle.com/discussions/general/236973

Advanced Level
• RLHF – https://arxiv.org/abs/2203.02155
• Llama 2 – https://arxiv.org/abs/2307.09288
• PaLM 2 – https://blog.google/technology/ai/google-palm-2-ai-large-language-model/
• Mistral: Mixture of Experts – https://mistral.ai/news/mixtral-of-experts/
• GPT-4 – https://openai.com/research/gpt-4
• Gemini AI – https://blog.google/technology/ai/google-gemini-ai
03-04-2024 M.Mohana, Research Scholar 13
Projects we should try for a better understanding of NLP

Sentiment Analysis
Question Answering System
Named Entity Recognition
Fake News Detection
Topic Modeling
Text Similarity
Text summarization and machine translation
Next word Prediction
LLM applications using RAG
Fine-tune a model for a specific NLP task
Constructing your own LLM, inspired by models like Llama 2

03-04-2024 M.Mohana, Research Scholar 14


Let’s explore more
about it

4/3/2024 M.Mohana, Research Scholar 15


Text Pre-processing Techniques

• Tokenization – dividing text into smaller units such as words, phrases, or symbols (tokens).
• Lowercasing – converting all text to lowercase.
• Stopword Removal – eliminating common words that do not carry significant meaning.
• Punctuation Removal – removing punctuation marks.
• Numeric Token Removal – eliminating numerical tokens.
• Whitespace Removal – eliminating extra spaces, tabs, or newline characters.
• Stemming – reducing words to their base or root form.
• Lemmatization – converting words to their canonical form based on their part of speech.
• Spell Checking and Correction – detecting and correcting spelling errors in the input text.
• Text Normalization – standardizing text by converting abbreviations or variations to their full forms.
• Entity Recognition and Masking – identifying and masking named entities.
• Removing HTML Tags and Special Characters – eliminating HTML tags and special characters.

03-04-2024 M.Mohana, Research Scholar 16


Text Pre-processing Techniques Examples
• Tokenization – Input: "The quick brown fox jumps over the lazy dog." → Output tokens: ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog", "."]
• Lowercasing – Input: "The Quick Brown Fox" → Output: "the quick brown fox"
• Stopword Removal – Input: "The quick brown fox jumps over the lazy dog." → Output: "quick brown fox jumps lazy dog."
• Punctuation Removal – Input: "The quick, brown fox jumps over the lazy dog!" → Output: "The quick brown fox jumps over the lazy dog"
• Numeric Token Removal – Input: "There are 5 apples on the table." → Output: "There are apples on the table."
• Whitespace Removal – Input: "The \t quick brown fox" → Output: "The quick brown fox"
• Stemming – Input: "running" → Output: "run"
• Lemmatization – Input: "better" → Output: "good"
• Spell Checking and Correction – Input: "I havvve a pen." → Output: "I have a pen."
• Text Normalization – Input: "w/o" → Output: "without"
• Entity Recognition and Masking – Input: "John Smith works at Google." → Output: "[PERSON] works at [ORGANIZATION]."
• Removing HTML Tags and Special Characters – Input: "<p>This is <b>bold</b> text.</p>" → Output: "This is bold text."

03-04-2024 M.Mohana, Research Scholar 17
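A small, illustrative cleaning function that chains several of these steps (HTML removal, lowercasing, punctuation and number removal, stop-word removal, whitespace normalization) with plain regular expressions and NLTK stop words; the function name and rules are an assumed sketch, not a standard recipe.

```python
import re
from nltk.corpus import stopwords   # assumes the NLTK stopwords corpus has been downloaded

STOPS = set(stopwords.words("english"))

def clean_text(text: str) -> str:
    """Illustrative cleanup: HTML tags, lowercasing, punctuation/numbers, stop words, whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)        # remove HTML tags
    text = text.lower()                          # lowercasing
    text = re.sub(r"[^a-z\s]", " ", text)        # drop punctuation and numeric tokens
    tokens = [t for t in text.split() if t not in STOPS]  # stop-word removal
    return " ".join(tokens)                      # whitespace normalization

print(clean_text("<p>There are 5 apples on the Table!</p>"))  # -> "apples table"
```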


Feature Extraction Techniques
• CountVectorizer – converts text to a matrix of word counts. Use case: text classification, topic modeling.
• TF-IDF (Term Frequency–Inverse Document Frequency) – assigns weights to words based on importance. Use case: information retrieval, text classification.
• Word embeddings – vector representation of words based on semantics and syntax. Use case: text classification, information retrieval.
• Bag of words – represents text as a vector of word frequencies. Use case: text classification, sentiment analysis.
• Bag of n-grams – captures the frequency of sequences of n words. Use case: text classification, sentiment analysis.
• Hashing Vectorizer – maps words to a fixed-size feature space using a hashing function. Use case: large-scale text classification, online learning.
• Latent Dirichlet Allocation (LDA) – identifies topics in the corpus and assigns a probability distribution to each document. Use case: topic modeling, content analysis.
• Non-negative Matrix Factorization (NMF) – decomposes the document–term matrix into lower-dimensional parts. Use case: topic modeling, content analysis.
03-04-2024 M.Mohana, Research Scholar 18
https://medium.com/@eskandar.sahel/exploring-feature-extraction-techniques-for-natural-language-processing-46052ee6514
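As a quick illustration of the first two techniques in the table, the snippet below builds bag-of-words counts and TF-IDF weights with scikit-learn; the toy documents are made up.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the quick brown fox", "the lazy dog sleeps", "the quick dog jumps"]

bow = CountVectorizer()                 # bag-of-words counts
counts = bow.fit_transform(docs)
print(bow.get_feature_names_out())
print(counts.toarray())

tfidf = TfidfVectorizer()               # weights words by importance across the corpus
weights = tfidf.fit_transform(docs)
print(weights.shape)                    # (3 documents, vocabulary size)
```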
Feature Extraction Techniques
• Principal Component Analysis (PCA), t-SNE – reduces the dimensionality of the document–term matrix. Use case: text visualization, text compression.
• Part-of-speech (POS) tagging – assigns a part-of-speech tag to each word in the text. Use case: named entity recognition, text classification.
• N-grams – sequences of contiguous words or characters that capture local word dependencies, context, and word-order information. Use case: language modeling, machine translation, and text generation.
• Named Entity Recognition (NER) – identifies and classifies named entities (e.g., person names, organizations, locations) in text. Use case: information extraction, entity linking, and improving search engine results.
• Dependency Parsing – analyzes the grammatical structure of a sentence by identifying relationships between words, capturing syntactic dependencies. Use case: machine translation, question answering, and information retrieval.
• Syntax Tree-Based Features – include subtree patterns, syntactic paths, and tree kernels that capture syntactic and semantic structures and relationships in sentences. Use case: parsing and semantic analysis.
03-04-2024 M.Mohana, Research Scholar 19
https://medium.com/@eskandar.sahel/exploring-feature-extraction-techniques-for-natural-language-processing-46052ee6514
Word Embedding or Word Vector
• Represent words as dense vectors in a continuous vector space, where words with similar meanings are closer to each other in the space.
• Numeric representations of words in a lower-dimensional space.
• Try to capture semantic and syntactic information.
• Examples: Word2Vec, GloVe (Global Vectors for Word Representation), and fastText.
• A method of extracting features from text to feed into an ML model for working with text data.

Image Source: Google
Need for Word Embedding
• To reduce dimensionality
• To use a word to predict the words around it
• Inter-word semantics must be captured
https://www.geeksforgeeks.org/word-embeddings-in-nlp/

03-04-2024 M.Mohana, Research Scholar 20


Word Embedding or Word Vector (Cont.)
Approaches for Text Representation
• Traditional Approach
• Compiling a list of distinct terms and giving each one a unique integer value, or id. After that,
insert each word’s distinct id into the sentence.
• Every vocabulary word is handled as a feature in this instance.
• Large vocabulary will result in an extremely large feature size.
• One-Hot Encoding, Bag of Words (BoW), CountVectorizer, TF-IDF
• Neural Approach
• Word2Vec, Continuous Bag of Words (CBOW), Skip-Gram
• Pretrained Techniques
• Representation of words that are learned from large corpora and are made available for reuse in
various NLP tasks
• Capture semantic relationships between words, to understand similarities and relationships
between different words in a meaningful way
• GloVe (Global Vectors for Word Representation), FastText, BERT (Bidirectional Encoder
Representations from Transformers)
03-04-2024 M.Mohana, Research Scholar 21
Types of Word Vector
• Frequency-based Embedding: Count Vector, TF-IDF Vector, Co-Occurrence Vector
• Prediction-based Embedding: CBOW (Continuous Bag of Words), Skip-Gram model

Image Source: Google


03-04-2024 M.Mohana, Research Scholar 22
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
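A minimal Word2Vec sketch with gensim (4.x API assumed), showing how the sg flag switches between the prediction-based skip-gram and CBOW variants; the toy corpus and hyperparameters are illustrative only.

```python
from gensim.models import Word2Vec

# toy corpus: each sentence is a list of tokens
sentences = [["the", "king", "rules", "the", "kingdom"],
             ["the", "queen", "rules", "the", "kingdom"],
             ["dogs", "bark", "and", "cats", "meow"]]

# sg=1 selects skip-gram; sg=0 would select CBOW
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=100)

print(model.wv["king"][:5])                    # first 5 dimensions of the dense vector
print(model.wv.most_similar("king", topn=2))   # nearest neighbours in the embedding space
```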
Text Similarity
• Refers to the measure of how similar two or more pieces of text are in terms of their semantic or syntactic content.
• Finding similarities between documents is used in a variety of fields: book and article recommendations, plagiarism detection, legal documents, etc.
• When we say two texts are similar, they express the same notion and are semantically comparable or identical.

1. "Global Warming is here." / "Ocean temperature is rising." – as a human, this is similar.
2. "I’m reading a book." / "The book is about NLP."
3. "Text similarity in NLP is easy." / "I like data science."
4. "This place is great." / "This is great news." – in NLP, this is similar. How?

03-04-2024 M.Mohana, Research Scholar 23


Text Similarity (Cont.)
• Semantic similarity is about the meaning closeness, and
lexical similarity is about the closeness of the word set.
Let’s check the following two phrases as an example:
The dog bites the man
The man bites the dog
• According to the lexical similarity, those two phrases are
very close and almost identical because they have the same
word set.
• For semantic similarity, they are completely different
because they have different meanings despite the similarity
of the word set.
• Calculating text similarity depends on converting text to a
vector of features, and then the algorithm selects a proper
feature representation, like TF-IDF.

https://www.baeldung.com/cs/semantic-similarity-of-two-phrases#:~:text=2.-,Text%20Similarity,be%20lexical%20or%20in%20meaning. Image Source: Google


03-04-2024 M.Mohana, Research Scholar 24
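The lexical side of this distinction is easy to demonstrate: a TF-IDF plus cosine comparison of the two phrases above scores them as nearly identical even though their meanings differ. A minimal sketch with scikit-learn:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

phrases = ["The dog bites the man", "The man bites the dog"]
vectors = TfidfVectorizer().fit_transform(phrases)

# Lexical similarity is high (same word set), even though the meanings differ.
print(cosine_similarity(vectors[0], vectors[1])[0][0])   # close to 1.0
```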
Techniques commonly used to compute text similarity

• Cosine Similarity – measures the cosine of the angle between two vectors representing the texts; used with word embeddings or TF-IDF vectors to compute similarity.
• Jaccard Similarity – measures the similarity between two sets by dividing the size of their intersection by the size of their union.
• Edit Distance – calculates the minimum number of operations (insertions, deletions, substitutions) required to transform one text into another; useful for texts with similar structures but potentially different words.
• Word Embeddings – compute similarity between texts by averaging or combining word vectors to represent the entire text.
• BERT and Similar Models – pre-trained language models like BERT (Bidirectional Encoder Representations from Transformers) can be fine-tuned for text similarity tasks.
• Semantic Similarity Metrics – metrics that use semantic information to measure text similarity; they may leverage knowledge graphs, ontology-based approaches, or semantic vector spaces.
• Sequence Alignment Algorithms – techniques like Needleman-Wunsch or Smith-Waterman, commonly used in bioinformatics for sequence alignment, can be adapted to measure text similarity by aligning sequences of words or characters.

03-04-2024 M.Mohana, Research Scholar 25


Part-of-Speech Tagging
• Label each word in a sentence with its
corresponding part of speech, such as noun (NN),
verb (VB), adjective (JJ), adverb (RB), pronoun
(PRP), preposition (IN), conjunction (CC),
interjection (UH), Determiner (DT), etc.,
• POS tagging is essential for machine translation, sentiment analysis, information retrieval, and NER.
• Works well for clearing out ambiguity in terms with
numerous meanings and revealing a sentence’s
grammatical structure. For example, “lead” can be a
noun (the metal) or a verb (to guide).
• There is a hierarchy of tasks in NLP, at the bottom
are sentence and word segmentation.
Image Source: Google • POS tagging builds on top of that, and phrase
chunking builds on top of POS tags.
• The, a, and an are not always considered POS but
are often included in POS tagging libraries.
03-04-2024 M.Mohana, Research Scholar 26

Several approaches to POS tagging

Rule-Based Tagging
• This approach uses handcrafted rules based on linguistic knowledge to assign POS tags to words.
• For example, a rule might specify that words ending in "-ing" are typically gerunds (verb forms).

Probabilistic Tagging
• This approach uses statistical models to assign POS tags based on the probability of a word occurring with a particular tag.
• Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs) are commonly used probabilistic models for POS tagging.

Deep Learning Tagging
• With the rise of deep learning, neural network-based approaches have become popular for POS tagging.
• Models like Bidirectional LSTMs (Long Short-Term Memory networks) or Transformers can learn complex patterns and dependencies in text to predict POS tags.

Hybrid Approaches
• Some systems combine rule-based and probabilistic or deep-learning techniques to improve tagging accuracy.
03-04-2024 27
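A small sketch of statistical/neural tagging in practice using spaCy (assuming the en_core_web_sm model has been installed); the example sentence is chosen to show how context disambiguates the word "lead".

```python
import spacy

# assumes: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("The lead pipe is heavy, but she will lead the team.")
for token in doc:
    print(f"{token.text:>6}  {token.pos_:<5}  {token.tag_}")
# The surface form "lead" is typically tagged as a noun in the first clause
# and as a verb in the second, illustrating context-based disambiguation.
```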
Challenges in Part-of-Speech Tagging

• Word Sense Ambiguity – for example, the word "bank" can be a noun (e.g., "riverbank") or a verb (e.g., "to bank on something").
• Word Sense Disambiguation – finding the correct meaning of a word with multiple meanings in a particular context; for example, "bat" as a flying mammal versus "bat" as sports equipment requires contextual understanding.
• Out-of-Vocabulary (OOV) Words – words not present in the training data lead to difficulties in assigning appropriate tags.
• Domain-Specific Terminology – taggers trained on general text may struggle with domain-specific terminology and jargon.
• Syntax and Grammar Complexity – complex syntax or grammar structures, such as passive voice, nested clauses, or ellipses, can pose challenges for POS taggers.
• Lack of Context – POS tagging models may struggle with a lack of context, especially in short or incomplete sentences.
• Cross-Lingual Challenges – POS tagging across multiple languages introduces additional challenges due to differences in grammar, word order, and linguistic features.

03-04-2024 M.Mohana, Research Scholar 28


Information Extraction
• Process of automatically extracting structured information from unstructured or semi-structured text
data that can be easily analysed, searched, and visualized
• Goal of information extraction is to identify specific pieces of information, such as entities (e.g., names
of people, organizations, locations) and their relationships (e.g., who works for which company), from
textual data.
• Spark NLP – used for identifying specific entities from large volumes of text data, converting them into
a structured format for further analysis
• Involves identifying specific entities, relationships, and events of interest in text data, such as named
entities like people, organizations, dates, and locations
• Main applications like Search engines, chatbots, recommendation systems, and fraud detection, among
others
• ‘TextMatcher’, and ‘BigTextMatcher’ are annotators that used to match and extract text patterns from a
document
• ‘BigTextMatcher’- is designed for large corpora and ‘ TextMatcher’- works by defining a set of rules
that specify the patterns to match and how to match them

03-04-2024 M.Mohana, Research Scholar 29


Several techniques and approaches are used in information extraction
Named Entity Recognition (NER)
• Involves identifying and classifying entities mentioned in text into predefined categories such as person names, organization names,
locations, dates, and more, using machine learning models such as Conditional Random Fields (CRFs), SVMs, or deep learning models
like Bidirectional LSTMs or Transformers.
Relation Extraction
• For example, in the sentence "John works at XYZ Corp," the relation extraction task would identify that "John" is an employee of
"XYZ Corp." Relation extraction can be done using pattern-based approaches, rule-based systems, or machine learning models.
Event Extraction
• Event extraction focuses on identifying events mentioned in text and extracting relevant information such as event types, participants,
time, location, and other attributes.
Template-Based Extraction
• For example, a template for extracting product information might include patterns like "Product: [product name], Price: [price],
Category: [category].“
• Template-based extraction is useful for extracting structured data from semi-structured or unstructured text.
Machine Learning-Based Extraction
• Machine learning models, including supervised, semi-supervised, and unsupervised learning techniques, are widely used for
information extraction tasks.
• These models learn patterns and relationships from annotated training data and apply them to new text for extraction.

03-04-2024 M.Mohana, Research Scholar 30


Named Entity Extraction
• Involves identifying and classifying named entities in text into predefined categories such as names of persons, organizations, locations, dates, numerical expressions, and other entities of interest.
• Serves as a bridge between unstructured text and structured data, enabling machines to sift through vast amounts of textual information and extract nuggets of valuable data in categorical form.

Approaches fall into three broad categories:
• Rule-based approaches – a set of rules based on the grammar of a language.
• Machine learning approaches – a machine learning model trained on a labeled dataset using algorithms like conditional random fields and maximum entropy.
• Hybrid approaches – a rule-based system to quickly identify easy-to-recognize entities and a machine learning system to identify more complex entities.
Image Source: Google
Key steps and techniques involved in Named Entity Extraction
Tokenization and Part-of-Speech (POS) Tagging:
1. Input text is tokenized into words or tokens, and each token is assigned a part-of-speech tag (e.g., noun, verb, adjective) using
POS tagging techniques.
Named Entity Recognition:
1. NER algorithms then identify and label tokens that correspond to named entities in the text. Commonly used techniques for
NER include:
1. Rule-based approaches: Using handcrafted rules and patterns to match and classify named entities based on linguistic
features (e.g., capitalization, word position, context).
2. Machine learning models: Training supervised learning models such as Conditional Random Fields (CRFs), Support
Vector Machines (SVMs), or deep learning models like Bidirectional LSTMs (Bi-LSTMs) or Transformers (e.g., BERT,
RoBERTa) on annotated NER datasets to predict named entity labels for tokens.
Post-processing and Entity Classification:
1. After identifying named entities, post-processing steps may involve resolving entity boundaries, handling overlapping entities,
and classifying entities into specific categories (e.g., person, organization, location).
Evaluation and Validation:
1. NER systems are evaluated using metrics such as precision, recall, and F1-score, comparing the model's predictions against
manually annotated ground truth data.
Named Entity Linking (NEL):
1. In some cases, NER is followed by Named Entity Linking, where identified named entities are linked to corresponding entries
in knowledge bases or ontologies to enrich their semantic representation.
03-04-2024 M.Mohana, Research Scholar 32
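A brief sketch of the recognition and masking steps with spaCy's pre-trained pipeline (assumed to be installed); the sentence and the masking loop are illustrative, not part of the original slides.

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed
doc = nlp("John Smith joined Google in California on 5 March 2024.")

for ent in doc.ents:
    print(ent.text, ent.label_)      # e.g. PERSON, ORG, GPE, DATE

# simple masking: replace each entity span with its label
masked = doc.text
for ent in reversed(doc.ents):       # reverse order keeps character offsets valid
    masked = masked[:ent.start_char] + f"[{ent.label_}]" + masked[ent.end_char:]
print(masked)
```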
Semantic Similarity
• Refers to the measure of how similar two pieces of text are in terms of their meaning or semantics.
• Measures how close or how different two words or texts are in terms of their meaning and context.
• Used in NLP applications such as information retrieval, question answering, text summarization, and recommendation systems.

03-04-2024 M.Mohana, Research Scholar 33



What is Semantic Similarity?
• Semantic similarity refers to the degree of similarity in meaning between words or texts.
• Lexical similarity focuses on the structure and surface resemblance of words and phrases, whereas semantic similarity delves into the understanding and meaning of the content.

There are several approaches for measuring semantic similarity in natural language processing:
• Word embeddings – skip-gram, CBOW, GloVe, and fastText
• Word2Vec – Continuous Bag of Words (CBOW) and Skip-gram
• Doc2Vec – an extension of Word2Vec to documents
• SBERT – a Transformer-based model in which the encoder part captures the meaning of words in a sentence
• InferSent – uses a bi-directional LSTM to encode sentences and infer semantics
• USE (Universal Sentence Encoder) – a model trained by Google that generates fixed-size embeddings for sentences that can be used for any NLP task

https://www.geeksforgeeks.org/different-techniques-for-sentence-semantic-similarity-in-nlp/
03-04-2024 34
Types of Semantic Similarity
• To determine the semantic similarity between concepts
Knowledge-Based Similarity • Represents each concept by a node in an ontology graph, also called the
topological method because the graph is used as a representation for the
corpus concepts.

• Calculates the semantic similarity based on learning features’ vectors from the
corpus.
Statistical-Based Similarity
• count or TF-IDF in LSA, weights of Wikipedia concepts in ESA, synonyms in
PMI, and co-occurring words of a set of predefined words in HAL.

• Manhattan Distance, Euclidean Distance, Cosine Similarity, Jaccard Index,


String-Based Similarity
and Sorensen-Dice Index.

• Similarity measurement between two English phrases, with the assumption


Language Model-Based Similarity
that they are syntactically correct.

03-04-2024 M.Mohana, Research Scholar 35


Types of Semantic Similarity (Cont.)

Word-Level Semantic Similarity: Sentence-Level Semantic Similarity:

• Word Embeddings: • Vector Space Models:


Utilizes vector representations of words in a Extend word embeddings to sentences by
continuous vector space to measure similarity based aggregating or combining word vectors to represent
on the cosine similarity or other distance metrics. the entire sentence. Similarity between sentences is
• WordNet-based Measures: then computed using techniques like cosine
Exploits WordNet, a lexical database of English, to similarity.
compute similarity between words based on their
synsets (groups of synonymous words) and • Siamese Networks:
hypernym/hyponym relationships. Deep learning architectures designed to learn
• Distributional Similarity: sentence embeddings by comparing pairs of
Measures similarity between words based on their sentences and learning a similarity metric.
distributional properties in a corpus, capturing how
often words co-occur with each other. • BERT-based Models:
• Lexical Similarity Measures: Utilize pre-trained language models like BERT to
Include techniques like Jaccard similarity, cosine compute sentence embeddings and measure
similarity, or TF-IDF similarity, which compute similarity using techniques such as fine-tuning for
similarity based on word overlap or statistical semantic similarity tasks.
properties of words.
Types of Semantic Similarity (Cont.)

Document-Level Semantic Similarity Semantic Textual Similarity (STS)

Topic Modeling: Techniques like Latent Semantic STS Benchmarks: Datasets like the STS Benchmark
Analysis (LSA) or Latent Dirichlet Allocation (LDA) can provide pairs of text with human-annotated similarity
be used to model topics in documents and compute scores, allowing the evaluation of semantic similarity
similarity based on topic distributions. models.
Doc2Vec: An extension of Word2Vec that learns document Embedding-Based Approaches: Utilize pre-trained word
embeddings, enabling similarity computation at the or sentence embeddings to compute semantic similarity
document level. between texts, often fine-tuned on STS datasets for
Graph-Based Models: Represent documents as nodes in a improved performance.
graph and compute similarity based on graph-based
algorithms like Personalized PageRank or graph neural
networks.
03-04-2024 M.Mohana, Research Scholar 37
Common techniques and methods used for measuring semantic similarity

Word Embeddings
• Word2Vec, GloVe, and FastText represent words as dense vectors in a continuous vector space.
• Similarity is measured using cosine similarity, Euclidean distance, or other distance metrics between the corresponding word vectors.

Sentence Embeddings
• USE, BERT, and Sentence-BERT (SBERT) generate embeddings for entire sentences or phrases.
• Similarity is computed using the same distance metrics applied to sentence embeddings.

WordNet and ConceptNet
• WordNet is a lexical database that organizes words into semantic hierarchies and provides information about their relationships (e.g., synonyms, hypernyms, hyponyms).
• ConceptNet is a knowledge graph that captures commonsense knowledge and semantic relationships between concepts.

Distributional Semantics
• Models capture semantic relationships between words based on their distributional patterns in a large corpus of text.
• Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) extract semantic information from word co-occurrence statistics.

Deep Learning Models
• Siamese networks and Triplet networks learn similarity directly from pairs or triplets of texts.
• Transformer-based models like BERT, RoBERTa, and XLNet leverage pre-trained language representations to measure semantic similarity.

Metric Learning
• A similarity metric is learned directly from data by optimizing a loss function that encourages similar pairs to have low distances and dissimilar pairs to have high distances.
• Pearson correlation, Spearman correlation, or Mean Squared Error (MSE) can be used to assess the performance of similarity models.

4/3/2024 M.Mohana, Research Scholar 38
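As an illustration of the sentence-embedding route, the sketch below uses the sentence-transformers package with a public MiniLM checkpoint (both assumed to be available) to score the earlier example pairs.

```python
from sentence_transformers import SentenceTransformer, util

# assumes the sentence-transformers package and this public checkpoint are available
model = SentenceTransformer("all-MiniLM-L6-v2")

pairs = [("The dog bites the man", "The man bites the dog"),
         ("Global warming is here", "Ocean temperature is rising")]

for a, b in pairs:
    emb = model.encode([a, b], convert_to_tensor=True)
    score = util.cos_sim(emb[0], emb[1]).item()
    print(f"{score:.2f}  {a!r} vs {b!r}")
# Unlike pure lexical overlap, these scores reflect meaning rather than shared words.
```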


Text Clustering
• Process of grouping similar documents or text data into
clusters based on their semantic similarity or other
relevant features
• By clustering text, identify trends, discover hidden
patterns, extract valuable insights, and streamline large
volumes of unstructured text data
• Unsupervised learning technique used for various tasks
such as document categorization, topic modeling, and
information retrieval

• Algorithm selection: K-means, hierarchical clustering, DBSCAN, agglomerative clustering
• Feature extraction: TfidfVectorizer, TF-IDF, BoW, Word2Vec or GloVe, LDA
• Similarity measures: cosine similarity, Jaccard similarity, or Euclidean distance
• Pre-processing: basic text pre-processing
• Evaluation and visualization: t-SNE or PCA

Image Source: Google

03-04-2024 M.Mohana, Research Scholar 39
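A compact sketch of the clustering recipe above (TF-IDF features plus K-means) with scikit-learn; the four toy documents and the choice of two clusters are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["the stock market fell today",
        "investors sold shares amid losses",
        "the team won the cricket match",
        "a thrilling final over decided the game"]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)   # feature extraction
km = KMeans(n_clusters=2, n_init=10, random_state=42)           # algorithm selection
labels = km.fit_predict(X)
print(labels)   # finance and sport documents should fall into different clusters
```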


Text Classification
• Task of automatically categorizing text documents into predefined classes or categories based on their content.

There are three text classification approaches:
• Rule-based System: texts are separated into organized groups using a set of handcrafted linguistic rules. For example, words like Donald Trump and Boris Johnson would be categorized into politics; people like LeBron James and Ronaldo would be categorized into sports.
• Machine Learning System: learns to make classifications based on past observations from the data sets; user data is pre-labeled as train and test data (e.g., bag-of-words features).
• Hybrid System: combines the rule-based and machine-based approaches, using the rule-based system to create tags and machine learning to train the system and refine the rules.

Image Source: Google

03-04-2024 M.Mohana, Research Scholar 40


Application of Text Classification
• Sentiment Analysis – categorizing text sentiment as positive, negative, or neutral.
• Spam Detection – identifying and filtering out spam emails or messages by analyzing their content and characteristics, enhancing communication security.
• Topic Labeling – automatically assigning topics or categories to documents, making content organization and retrieval more efficient.
• News Categorization – categorizing news articles into sections like politics, technology, sports, etc., improving content organization for readers.
• Language Identification – detecting the language in which a text is written, which is useful for multilingual content processing and translation.
• Customer Feedback Analysis – extracting insights from customer reviews and feedback to understand customer satisfaction and areas of improvement.
• Medical Document Classification – categorizing medical records, research papers, and patient notes, assisting in efficient data retrieval for healthcare professionals.
• Legal Document Categorization – law firms use NLP to classify legal documents, making managing and retrieving information from large databases easier.
• Product Classification – e-commerce platforms use NLP for product categorization, ensuring items are correctly labeled and presented to customers.
• Social Media Monitoring – NLP tracks and classifies social media posts, tweets, and comments, allowing brands to monitor their online presence and engage with users.
• Content Recommendation – recommending relevant articles, blogs, or products to users based on their interests.
• Fraud Detection – classifying financial texts to detect fraudulent activities and identify potential risks.
• Resume Screening – categorizing job applications based on skills, experience, and qualifications.

https://www.analyticsvidhya.com/blog/2020/12/understanding-text-classification-in-nlp-with-movie-review-example-example/
4/3/2024 M.Mohana, Research Scholar 41
Steps Involved in Text Classification

• Data Preparation – collecting a labeled dataset.
• Pre-processing – tokenizing the text, removing stop words, performing stemming or lemmatization, and handling any noise.
• Feature Extraction – TF-IDF, BoW, Word2Vec or GloVe, LDA, BERT for content understanding.
• Model Selection – SVM, Naive Bayes, Logistic Regression, Decision Trees, Random Forests, and CNNs or RNNs.
• Training and Evaluation.
• Hyperparameter Tuning – learning rate, regularization strength, batch size, number of epochs, and model architecture.
• Handling Imbalanced Data – oversampling, undersampling, or class weights.
• Applications – spam detection, topic categorization, language identification, document classification, and content recommendation systems.
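A hedged end-to-end sketch of these steps with scikit-learn (TF-IDF features, a Naive Bayes model, a train/test split, and a classification report); the tiny spam/ham dataset is invented purely for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

texts = ["win a free prize now", "cheap pills online", "meeting at 10 am",
         "lunch with the project team", "claim your free reward", "agenda for tomorrow"]
labels = ["spam", "spam", "ham", "ham", "spam", "ham"]

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.33, random_state=0)

clf = Pipeline([("tfidf", TfidfVectorizer()),      # feature extraction
                ("nb", MultinomialNB())])          # model selection
clf.fit(X_train, y_train)                          # training
print(classification_report(y_test, clf.predict(X_test)))   # evaluation
```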
Sentiment Analysis
• Process of determining the sentiment or emotional tone
expressed in a piece of text
• analyzing the text to identify whether the sentiment is positive,
negative, or neutral
• Sentiment analysis, also known as opinion mining, is an
important business intelligence tool
• applications to understand public opinion, customer feedback,
social media trends, and more
Types of Sentiment Analysis
• Binary Sentiment Analysis: Positive and negative sentiments.
• Multi-class Sentiment Analysis: Positive, negative, neutral,
and sometimes additional categories like very positive or very
negative. Image Source: Google
• Aspect-Based Sentiment Analysis: Analyzes specific aspects
or aspects of a product, service, or topic to determine
sentiment.

03-04-2024 M.Mohana, Research Scholar 43


Why is Sentiment Analysis important?
"I'm amazed by the speed of the processor but disappointed that it heats up quickly."
• Marketers might dismiss the discouraging part of the review and be positively biased towards the processor's performance.
• However, accurate sentiment analysis tools sort and classify text to pick up emotions objectively.

Use cases: market research, brand monitoring, tracking campaign performance, improving customer service, and social media monitoring.

How does it work?
• Pre-processing (tokenization, lemmatization, stop-word removal)
• Keyword analysis (extract keywords and give them a sentiment score)

03-04-2024 M.Mohana, Research Scholar 44
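One simple, rule-based way to produce such a sentiment score is NLTK's VADER analyzer; a minimal sketch, assuming the vader_lexicon resource can be downloaded:

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")   # one-time resource download
sia = SentimentIntensityAnalyzer()

review = "I'm amazed by the speed of the processor but disappointed that it heats up quickly."
print(sia.polarity_scores(review))
# returns neg/neu/pos proportions plus a compound score that summarizes
# the mixed positive and negative wording in a single value
```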


Why Sentiment Analysis is Important in Business?
• Companies have large volumes of text data like emails, customer support chat transcripts, social media
comments, and reviews.
• Sentiment analysis tools can scan this text to automatically determine the author’s attitude towards a topic.
• Companies use the insights from sentiment analysis to improve customer service and increase brand
reputation.

Why it is important:
• Analyze at scale
• Build better products and services
• Provide objective insights
• Real-time results

Challenges:
• Sarcasm – "Yeah, great. It took three weeks for my order to arrive."
• Negation – "I wouldn't say the subscription was expensive."
• Multipolarity – "I'm happy with the sturdy build but not impressed with the color."
• Neutral statements and emoji forms

https://aws.amazon.com/what-is/sentiment-analysis/#:~:text=Sentiment%20analysis%20is%20an%20application,before%20providing%20the%20final%20result.
03-04-2024 M.Mohana, Research Scholar 45
Sentiment Analysis vs Semantic Analysis

Sentiment Analysis
• Focuses on determining the emotional tone expressed in a piece of text.
• Classifies the sentiment as positive, negative, or neutral; especially valuable in understanding customer opinions, reviews, and social media comments.
• Used to identify the prevailing sentiment and gauge public or individual reactions to products, services, or events.
• Approaches in sentiment analysis: rule-based (lexicon-based, VADER), machine learning, neural networks, and hybrid approaches.

Semantic Analysis
• Aims to comprehend the meaning and context of the text.
• Understands the relationships between words, phrases, and concepts in a given piece of content.
• Considers the underlying meaning, intent, and the way different elements in a sentence relate to each other.
• Crucial for tasks such as question answering, language translation, and content summarization, where a deeper understanding of context and semantics is required.

03-04-2024 M.Mohana, Research Scholar 46


Image Source: Google
Text summarization
• Process of creating a concise and coherent summary of a
longer text while retaining its key information and main
points
• Task in information retrieval and document analysis,
helping users quickly grasp the essential content of a
document without reading the entire text
Types of Text Summarization
Extractive Summarization:
• Involves selecting and combining important sentences or
phrases directly from the original text to create a summary.
• It doesn't generate new sentences but rather extracts
existing ones.
Abstractive Summarization:
• Generates a summary by understanding the text's meaning
and creating new sentences that convey the main ideas.
• Often involves natural language generation techniques.
03-04-2024 M.Mohana, Research Scholar 47
Text Summarization (Cont.)

Extractive Summarization Techniques:
• Graph-based methods: use graph algorithms (e.g., TextRank, PageRank) to identify important sentences based on their connections in the text graph.
• Machine learning models: supervised or unsupervised algorithms (e.g., Support Vector Machines, clustering) rank sentences based on features like sentence length, word frequency, and semantic similarity.

Abstractive Summarization Techniques:
• Deep learning models: recurrent neural networks and transformer models like BERT or GPT generate summaries by understanding the context, semantics, and relationships within the text.

Text summarization can also be categorized:
• Based on input type: single document, multiple documents
• Based on the purpose: generic, domain-specific, query-based
• Based on the output type: extractive, abstractive

03-04-2024 M.Mohana, Research Scholar 48
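A minimal extractive sketch in the spirit of the frequency-based methods above: score each sentence by the corpus frequency of its content words and keep the top ones. It assumes NLTK's punkt and stopwords resources are available, and the helper name is made up.

```python
from collections import Counter
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords   # assumes punkt and stopwords are downloaded

def extractive_summary(text: str, n_sentences: int = 2) -> str:
    """Score sentences by the frequency of their content words and keep the top n."""
    stops = set(stopwords.words("english"))
    words = [w.lower() for w in word_tokenize(text) if w.isalpha() and w.lower() not in stops]
    freq = Counter(words)
    sentences = sent_tokenize(text)
    scores = {s: sum(freq[w.lower()] for w in word_tokenize(s) if w.lower() in freq)
              for s in sentences}
    top = sorted(sentences, key=scores.get, reverse=True)[:n_sentences]
    return " ".join(top)
```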


Why automatic text summarization?
• Summaries reduce reading time.
• When researching documents, summaries make the selection process easier.
• Automatic summarization improves the effectiveness of indexing.
• Automatic summarization algorithms are less biased than human summarization.
• Personalized summaries are useful in question-answering systems as they provide personalized information.
• Using automatic or semi-automatic summarization systems enables commercial abstract services to increase the
number of text documents they can process.
• Summaries make information more accessible to a broader audience, including individuals with limited time,
attention span, or reading abilities.
• Can process text in multiple languages, enabling cross-language summarization and facilitating information
access across diverse linguistic contexts
• Summarization aids decision-making by presenting important information in a condensed and digestible format.
• Summaries allow readers to skim through the main ideas and concepts of a document without delving into every
detail.
• Summaries act as a quick reference point for finding relevant information within documents.

03-04-2024 M.Mohana, Research Scholar 49


Image Source: Google
Chatbot
• Artificial intelligence (AI) program designed to simulate conversations
with users in natural language
• NLP transforms text into structured data that the computer can
understand
• customer service, information retrieval, task automation, entertainment,
and more
• Standard bots don't use AI, so their interactions usually feel less natural
and human.
Types of Chatbots:
Rule-Based Chatbots:
• Follow predefined rules and patterns to respond to user inputs.
• They are relatively simple and have limited capabilities but can handle
specific tasks effectively.
Machine Learning-Based Chatbots:
• Use machine learning algorithms to learn from data and improve their
conversational abilities over time.
• Understand context, handle more complex conversations, and adapt to
user preferences.
03-04-2024 M.Mohana, Research Scholar 50
Chatbot (Cont.)

How does an NLP chatbot work?
• Text pre-processing
• Text analysis – user intent and context
• Action determination
• Text generation

Why NLP is a must for chatbots:
• Improved user interaction
• Better understanding of intent
• Personalized experiences

Applications: customer service and support, personalized recommendations, virtual assistance, sentiment analysis, real-time language translation, and personal coaching.

https://doubletick.io/blog/nlp-chatbots

03-04-2024 M.Mohana, Research Scholar 51


Components of a Chatbot

NLP Engine
• Processes user inputs to understand intents, extract entities, and generate responses.
• Includes tasks like tokenization, part-of-speech tagging, named entity recognition, and sentiment analysis.

Dialog Management
• Manages the flow of conversation, handles context, maintains state, and determines appropriate responses based on user inputs and system knowledge.

Backend Integration
• Connects with external systems, databases, APIs, or services to fetch information, perform actions, and fulfill user requests.

User Interface
• Provides the interface for users to interact with the chatbot, which can be text-based (e.g., messaging platforms, websites) or voice-based (e.g., virtual assistants).

03-04-2024 M.Mohana, Research Scholar 52


Image Source: Google
Development Approaches and Functionality
• Code-Based Development: Involves writing code using programming languages (e.g., Python, Java) and
NLP libraries (e.g., NLTK, spaCy) to build and train chatbots from scratch.
• Chatbot Platforms: Use pre-built chatbot development platforms (e.g., Dialogflow, Microsoft Bot
Framework, IBM Watson Assistant) that offer tools, APIs, and frameworks for designing, training, and
deploying chatbots with minimal coding.
• Custom Development: Combines code-based development with chatbot platforms to create highly
customized solutions tailored to specific use cases and requirements.
Functionality
• Information Retrieval: Provides information, answers questions, and offers recommendations based on
user queries.
• Task Automation: Performs automated tasks, such as scheduling appointments, making reservations,
placing orders, and handling routine customer inquiries.
• Conversational Engagement: Engages users in meaningful conversations, maintains user interest, and
provides personalized experiences.
• Feedback and Learning: Collects user feedback, learns from interactions, and improves conversational
abilities and performance over time.
03-04-2024 M.Mohana, Research Scholar 53
Difference between NLP, NLG, NLU, and NLI

NLP (Natural Language Processing)
• Enables computers to understand, interpret, and generate human language in a way that is meaningful.
• Techniques: text preprocessing, tokenization, part-of-speech tagging, named entity recognition, syntactic analysis, information retrieval, and sentiment analysis.

NLG (Natural Language Generation)
• Generation of human-like text or speech from structured data or non-linguistic input: converting data or information into natural language text, generating summaries, creating stories, composing emails.
• Techniques: template-based generation, rule-based generation, statistical approaches, and neural network-based generation.

NLU (Natural Language Understanding)
• Enables computers to understand and interpret human language input.
• Tasks: intent classification, entity recognition, sentiment analysis, context understanding, and discourse analysis.
• Used in chatbots, virtual assistants, and voice recognition systems.

NLI (Natural Language Interface)
• An interface or system that allows users to interact with computers or systems using natural language input.
• Encompasses both NLU and NLG capabilities to interpret user queries or commands, extract meaning, and generate appropriate responses.
• Examples: chatbots, voice assistants, search engines, and smart home devices.

03-04-2024 M.Mohana, Research Scholar 54


Machine Translation

• Automated process of translating text or speech from one language


to another using computational algorithms and techniques
• Machine translation software in the source language and let the
tool automatically transfer the text to the selected target language.
Automated Translation vs Machine Translation
• Automated translation refers to any triggers built into a traditional
computer-assisted translation tool (CAT tool) or cloud translation
management system (TMS) to execute manual or repetitive tasks
related to translation
• Machine translation is about converting text from one natural
language to another using software.
• In other words, there’s no human input involved as in traditional
translation.

Image Source: Google


03-04-2024 M.Mohana, Research Scholar 55
Process of Machine Translation
• The basic requirement in the complex cognitive process of machine translation is to understand the meaning of a text in the original (source) language and then reproduce it in the target language.

The primary steps in the machine translation process are:
1. Decode the meaning of the source text in its entirety.
2. Interpret and analyze all the features of the text available in the corpus.
3. Apply in-depth knowledge of the grammar, semantics, syntax, idioms, etc. of the source language.
4. Re-encode this meaning in the target language, which requires the same in-depth knowledge of the target language to replicate the meaning.
https://www.scaler.com/topics/nlp/machine-translation-in-nlp/

03-04-2024 M.Mohana, Research Scholar 56


Machine Translation (Cont.)
Types of Machine Translation:
Rule-based Translation: (1950-1980)
• Relies on linguistic rules and dictionaries to translate
text.
• Requires a substantial amount of linguistic knowledge
and often lacks flexibility.
Statistical Machine Translation (SMT): (1990-2014)
• SMT uses statistical models that learn from large
bilingual corpora to generate translations.
• Works well for common language pairs but may
struggle with less common languages or complex
sentence structures.
Neural Machine Translation (NMT): (2014 to present)
• NMT is the latest paradigm in machine translation,
using deep learning models such as sequence-to-
sequence models with attention mechanisms.
• NMT has shown significant improvements in
translation quality, especially for long and complex
sentences. Image Source: Google

03-04-2024 M.Mohana, Research Scholar 57


Machine Translation (Cont.)
Neural Machine Translation (NMT) Process:
Encoder-Decoder Architecture: Image Source: Google
• NMT models typically use an encoder-decoder architecture.
• The encoder processes the input text and converts it into a
fixed-dimensional vector representation (encoding).
• The decoder then generates the translated text based on this
encoding.
Attention Mechanism:
• Attention mechanisms in NMT allow the model to focus on
relevant parts of the input sentence during translation,
improving the accuracy of translations, especially for long
sentences.
Training Data:
• NMT models require large amounts of parallel data (source-
target language pairs) for training.
• The quality and diversity of the training data significantly
impact the model's performance.

03-04-2024 M.Mohana, Research Scholar 58
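For experimentation, a pre-trained encoder-decoder model can be driven through the Hugging Face transformers pipeline; this sketch assumes the library and the public t5-small checkpoint (which was trained with English-to-French translation among its tasks) are available.

```python
from transformers import pipeline

# assumes transformers is installed and the t5-small checkpoint can be downloaded
translator = pipeline("translation_en_to_fr", model="t5-small")

result = translator("Machine translation converts text from one language to another.")
print(result[0]["translation_text"])
```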


Challenges in Machine Translation

Ambiguity: Many words and phrases have multiple meanings or interpretations, leading to
translation ambiguity.

Idioms and Cultural Nuances: Idiomatic expressions, cultural references, and context-
specific language nuances can be challenging for machine translation systems.

Rare Languages: Limited availability of training data and resources for less common
languages can hinder accurate translations.

Domain-specific Translation: Translating specialized or domain-specific content (e.g.,


medical or legal texts) accurately requires domain knowledge and specialized training data.

03-04-2024 M.Mohana, Research Scholar 59


Text to Speech
• Text-to-speech (TTS) in NLP refers to the conversion of written text into
spoken speech, a Type of speech synthesis that transforms written text
into spoken words using computer algorithms
• Enables machines to communicate with humans in a natural-sounding
voice by processing text into synthesized speech
• TTS - typically uses a combination of linguistic rules and statistical
models to generate synthetic speech
• Speech synthesis refers to the process of using a computer to produce
artificial human speech.
• Synthesized speech can be created by concatenating pieces of recorded
speech that are stored in a database.
• TTS systems can then use NLP understanding to generate more accurate
synthetic speech reflecting the input text’s intended meaning.
• ElevenLabs, Amazon’s Polly, and Deepgram’s Aura represent some of the cutting-edge AI and ML developments in this space.
Image Source: Google

03-04-2024 M.Mohana, Research Scholar 60


Why do we need AI for TTS?

• Necessity of AI in realistic voices
• Machine learning's role
• The spectrogram's function
• Waveforms in speech synthesis
• Training AI models
• Overcoming speech challenges

https://deepgram.com/ai-glossary/text-to-speech-models

03-04-2024 M.Mohana, Research Scholar 61


Text to Speech (Cont.)
• A text-to-speech system (or “engine”) is composed of two parts: a front-
end and a back-end.
• The front end has two major tasks.
• First, it converts raw text containing symbols like numbers and
abbreviations into the equivalent of written-out words. This process is often
called text normalization, pre-processing, or tokenization.
• The front end then assigns phonetic transcriptions to each word and divides
and marks the text into prosodic units, like phrases, clauses, and sentences.
• The back-end — often referred to as the synthesizer — then converts the
symbolic linguistic representation into sound.

There are different ways to perform speech synthesis:


• Concatenative TTS, (Domain-Specific Synthesis, Unit Selection Synthesis,
Diphone Synthesis)
• Formant Synthesis,
• Parametric TTS, and
• Hybrid approaches.
03-04-2024 M.Mohana, Research Scholar 62
Text to Speech basic process

Text Input: Process begins with a piece of written text as input, text can be in various languages and
formats, such as plain text, web pages, or documents.

Text Analysis : Text undergoes linguistic analysis to understand its structure, including sentence
segmentation, part-of-speech tagging, and syntactic parsing.
Analysis helps in generating more natural-sounding speech

Speech Synthesis: Synthesized speech generation involves converting the analyzed text into spoken
words.
The step includes prosody (intonation, rhythm, and stress) modeling to mimic human-like speech patterns.

Audio Output : Final output is an audio file or real-time speech output that can be played through
speakers, headphones, or integrated into applications.

03-04-2024 M.Mohana, Research Scholar 63


Text to Speech Techniques
Techniques and Models:
Concatenative TTS:
• Uses a database of recorded speech segments (phonemes,
diphones, or larger units) to piece together and synthesize
the desired speech.
• offers good quality but may lack flexibility and naturalness.
Parametric TTS:
Image Source: Google
• Parametric TTS generates speech using mathematical
models that simulate human vocal tract physiology.
• These models can be trained on linguistic and acoustic data
to produce more natural and expressive speech.
Neural TTS:
• Neural TTS, powered by deep learning techniques such as
sequence-to-sequence models and WaveNet, has gained
popularity for its ability to generate highly natural and
expressive speech, including variations in tone, emotion,
and speaking style.
03-04-2024 M.Mohana, Research Scholar 64
Commonly used tools and
libraries for text-to-speech in NLP

• Google Text-to-Speech (TTS): Provides high-quality


and natural-sounding speech synthesis with various
voices and languages.
• Amazon Polly: Offers lifelike speech synthesis with
customizable voice styles and expressive capabilities.
• Microsoft Azure Text-to-Speech: Provides
customizable speech synthesis with neural network-
based models for natural-sounding output.
• Open-source TTS frameworks: Mozilla TTS,
Tacotron, and WaveNet that allow customization and
development of TTS systems.
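As a small, hedged illustration of such services, the snippet below uses the open-source gTTS package, which wraps Google Translate's TTS endpoint (this is an assumption for the example and is not the full Google Cloud Text-to-Speech API); the output file name is arbitrary.

from gtts import gTTS

tts = gTTS(text="Hello! This sentence was synthesized from text.", lang="en")
tts.save("hello.mp3")   # write the synthesized speech to an MP3 file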

03-04-2024 M.Mohana, Research Scholar 65


Challenges in TTS
Naturalness:
• Achieving natural-sounding speech with proper
intonation, rhythm, and emphasis is a key challenge.
Neural TTS models have significantly improved
naturalness compared to earlier techniques.
Multilingual Support:
• TTS systems need to support multiple languages and
accents, requiring robust linguistic resources and models
for each language.
Emotional Speech:
• Some TTS systems can generate speech with emotional
variations (e.g., happy, sad, angry), enhancing user
engagement and interaction.
Real-Time Processing:
• Real-time TTS, where speech is generated dynamically
as text is inputted, requires efficient algorithms and
computational resources for seamless performance.

03-04-2024 M.Mohana, Research Scholar 66


Image Source: Google Speech to Text
• Speech-to-text (STT) in NLP refers to the process of converting
spoken language into written text.
• Use cases include automatic caption generation for videos, dictation to generate reports,
transcription of audio recordings, voice assistants, and
accessibility tools for individuals with disabilities.

To perform this task, modern systems use transformer-based deep
learning models; two notable examples are:
• Wav2Vec 2.0, proposed by Facebook AI researchers in the paper
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech
Representations
• HuBERT, also proposed by Facebook AI, in HuBERT: Self-Supervised
Speech Representation Learning by Masked Prediction of Hidden Units
• The main differences between Wav2Vec 2.0 and HuBERT lie in how they
process the audio input and in the loss function used to measure the
quality of the outputs and backpropagate the errors during
training.
https://www.johnsnowlabs.com/converting-speech-to-text-with-spark-nlp-and-python/
03-04-2024 M.Mohana, Research Scholar 67
How does speech-to-text work?
When someone speaks, the sounds coming out of their mouth produce a series of vibrations.
Speech-to-text technology works by picking up on these vibrations and translating them into digital form
through an analog-to-digital converter.

The analog-to-digital converter takes sounds from an audio file, measures the waves in detail, and filters them to
distinguish the relevant sounds.

The sounds are then segmented into hundredths or thousandths of seconds and are then matched to phonemes.
A phoneme is a unit of sound that distinguishes one word from another in any given language. For example, there are
approximately 40 phonemes in the English language.

The phonemes are then run through a network via a mathematical model that compares them to well-known
sentences, words, and phrases.

The result is then presented as text or as a computer command, based on the most likely interpretation of the audio.

03-04-2024 M.Mohana, Research Scholar 68


https://aws.amazon.com/what-is/speech-to-text/
STT systems typically involve several stages
Audio Preprocessing
• Input audio is preprocessed to remove noise, normalize volume levels, and potentially separate
multiple speakers if needed.
Feature Extraction
• Preprocessed audio is converted into a format suitable for analysis.
• Mel-frequency cepstral coefficients (MFCCs) or spectrograms to represent the audio's frequency
content over time.
Acoustic Model
• Relationship between input audio features and phonemes (basic units of sound in a language).
• Techniques like Hidden Markov Models (HMMs), Gaussian Mixture Models (GMMs), or deep
learning models such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks
(RNNs).
Language Model
• Acoustic model is combined with a language model that predicts the most likely sequence of words
given the audio input, corrects errors, and improves overall transcription accuracy.
Decoding
• The combined output from the acoustic and language models is decoded to generate the final text
transcription.
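As a hedged sketch of the Feature Extraction stage only, the librosa library can compute MFCCs from an audio file; "speech.wav" is a placeholder path and the 16 kHz sample rate and 13 coefficients are conventional, not required, choices.

import librosa

audio, sr = librosa.load("speech.wav", sr=16000)           # load and resample to 16 kHz
mfccs = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)    # 13 MFCC coefficients per frame
print(mfccs.shape)                                         # (13, number_of_frames)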

03-04-2024 M.Mohana, Research Scholar 69


Commonly used tools and libraries for speech-to-text in NLP include

• Google Cloud Speech-to-Text: Provides accurate transcription through deep learning models.
• IBM Watson Speech to Text: Offers real-time transcription with customization options.
• Mozilla DeepSpeech: An open-source speech-to-text engine based on deep learning.
• CMU Sphinx: A toolkit for speech recognition using HMMs and other algorithms.
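For illustration, the open-source SpeechRecognition package offers a thin wrapper over several of these services; the sketch below sends a placeholder "speech.wav" file to the free Google Web Speech API (an assumption for this example, requiring an internet connection).

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("speech.wav") as source:
    audio = recognizer.record(source)           # read the entire audio file
text = recognizer.recognize_google(audio)       # transcribe via the Google Web Speech API
print(text)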

03-04-2024 M.Mohana, Research Scholar 70


Let’s explore some important
concepts in deep learning for NLP

4/3/2024 71
Attention Mechanism
• Technique used in neural network models to focus on
relevant parts of the input sequence when making
predictions or generating output sequences
• Developed to improve the performance of the encoder-decoder (seq2seq) RNN model.
• A solution to the limitation of the encoder-decoder model, which encodes the input sequence into one fixed-length vector from which the output is decoded at each time step.
• Allows the model to "pay attention" to certain parts of the data and to give them more weight when making predictions.
• Preserves the context of every word in a sentence by assigning an attention weight relative to all other words.
• Even if the sentence is long, the model can preserve the contextual importance of each word.
Image Source: Google
(Note: before learning about the attention mechanism, it helps to understand how RNN, LSTM, and GRU models work.)
03-04-2024 M.Mohana, Research Scholar 72
https://medium.com/analytics-vidhya/https-medium-com-understanding-attention-mechanism-natural-language-processing-9744ab6aed6a
How attention mechanisms work in NLP
Contextual Relevance
• In NLP tasks, especially those dealing with sequences like sentences or documents, different parts of the input sequence may contribute
differently to the output. For example, in machine translation, certain words in the input sentence may have more influence on the translation
of specific words in the output sentence.
Attention Weights
• Attention mechanisms assign weights to each element in the input sequence, indicating how relevant that element is to the current step in
processing. These weights are often computed using neural network layers, such as softmax layers, based on the similarity between the
current state of the model and each element in the input sequence.
Attention Scores
• To calculate the attention weights, attention mechanisms typically compute attention scores. These scores measure the similarity or relevance
between the current state of the model (e.g., the decoder state in machine translation) and each element in the input sequence. Common
methods for computing attention scores include dot product attention, additive attention, and multiplicative attention.
Soft Attention
• After computing the attention scores, a softmax function is often applied to convert the scores into attention weights. Soft attention allows
the model to consider the entire input sequence but with varying degrees of importance for each element.
Context Vector
• Once the attention weights are determined, a context vector is computed as a weighted sum of the input elements, where the weights are
given by the attention weights. This context vector represents the focused information from the input sequence that is relevant to the current
step in the model's processing.
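Putting these pieces together, a minimal NumPy sketch of dot-product attention, softmax attention weights, and the resulting context vectors might look like this (the shapes and random values are illustrative only):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # attention scores
    scores = scores - scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    context = weights @ V                                    # weighted sum = context vectors
    return context, weights

Q = K = V = np.random.rand(4, 8)            # toy sequence: 4 tokens, dimension 8
context, weights = scaled_dot_product_attention(Q, K, V)
print(weights.shape, context.shape)         # (4, 4) (4, 8)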

03-04-2024 M.Mohana, Research Scholar 73


Type of Attention Mechanism (Cont.)
Types of attention include: Self-Attention, Multi-head, Soft, Hard, Structured, Scaled dot-product, Dot-product, Local, and Global.
• Self-attention, also known as intra-attention, is a specific type of attention mechanism where the input sequence is compared to itself.
• Multi-head attention splits the attention mechanism into multiple heads, each computing attention independently.
• Soft attention computes attention weights for all elements in the input sequence, typically using a softmax function to normalize these weights.
• Hard attention selects a single element from the input sequence at each step of processing. This selection process is often based on learned probabilities or heuristics.
• Structured attention allows the attention weights to be learned using a structured prediction model, such as a conditional random field.
• Scaled dot-product attention is a variant of dot-product attention that scales the dot product by the square root of the key dimension.
• Dot-product attention computes the attention weights as the dot product of the query and key vectors.
• Local attention restricts the attention mechanism to a specific region or window of the input sequence.
• Global attention considers the entire input sequence when computing attention weights.

03-04-2024 M.Mohana, Research Scholar 74


https://www.scaler.com/topics/deep-learning/attention-mechanism-deep-learning/
Difference Between Global vs Local Attention
Global Attention:
• The model considers the entire input sequence when computing attention weights.
• Requires computing attention weights for all elements in the input sequence, leading to higher computational complexity, especially for long sequences.
• Allows the model to align input and output sequences more flexibly, as it can attend to any part of the input sequence based on its relevance to the current processing step.
• May struggle with long sequences due to the need to compute attention weights for all elements, leading to memory and computational challenges.

Local Attention:
• Restricts the attention mechanism to a specific region or window of the input sequence.
• Reduces the computational burden by focusing on a smaller subset of input elements. This can be advantageous for handling long sequences more efficiently, as the attention mechanism only needs to consider a limited number of elements at each step.
• Enforces a more constrained alignment between input and output sequences, as it only considers a specific region or window of the input sequence.
• Can handle long sequences more efficiently by focusing on relevant parts of the input sequence, mitigating some of the issues associated with processing lengthy data.

03-04-2024 M.Mohana, Research Scholar 75


Transformers
• Transformer is a revolutionary deep learning model introduced
in the paper "Attention is All You Need" by Vaswani et al. in
2017.
• Profound impact on NLP tasks, especially in areas like machine
translation, text generation, and language understanding.
• The model first encodes the input sentence into a sequence of vectors.
This encoding is done using a self-attention mechanism, which
allows the model to learn the relationships between the words in
the sentence.
• Once the input sentence has been encoded, the model decodes it
into a sequence of output tokens. This decoding is also done
using a self-attention mechanism.
Image Source: Google
• The attention mechanism is what allows transformer models to
learn long-range dependencies between words in a sentence.
• The attention mechanism works by focusing on the most
relevant words in the input sentence when decoding the output
tokens.

03-04-2024 M.Mohana, Research Scholar 76


Transformers (Cont.)
Self-Attention Mechanism: allows the model to weigh the importance of different
words in a sentence based on their context within the sentence itself.
This enables the model to capture dependencies and long-range relationships between
words more effectively than traditional recurrent or convolutional architectures.
Encoder-Decoder Architecture: Encoder processes the input sequence (e.g., source
language sentence) using self-attention layers and feed-forward layers, producing a
representation that captures the semantic and contextual information of the input.
The decoder then uses self-attention layers and encoder-decoder attention layers to
generate the output sequence (e.g., target language translation).
Positional Encoding: Provides the model with information about the position of each
word in the sequence, allowing it to differentiate between words based on their
position.
Multi-Head Attention: Allows the model to focus on different parts of the input or
output sequence simultaneously, enhancing its ability to capture diverse relationships
and patterns.
Feed-forward Networks: Networks process the outputs of the attention layers, adding
non-linearity and further modeling capabilities to the architecture.
Layer Normalization and Residual Connections: To stabilize training and improve
gradient flow, the Transformer uses layer normalization and residual connections after
each sub-layer in the encoder and decoder.
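As a small sketch of the positional-encoding component described above, the sinusoidal scheme from "Attention is All You Need" can be written directly in NumPy; the sequence length and model dimension below are arbitrary example values.

import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encodings as in 'Attention is All You Need'."""
    pos = np.arange(max_len)[:, None]                 # token positions
    i = np.arange(d_model)[None, :]                   # embedding dimensions
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])             # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])             # odd dimensions use cosine
    return pe

print(positional_encoding(50, 512).shape)             # (50, 512)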

03-04-2024 M.Mohana, Research Scholar Image Source: Google 77


Transformer-based models and architectures:
• BERT (Bidirectional Encoder Representations from Transformers)
• GPT (Generative Pre-trained Transformer)
• RoBERTa (A Robustly Optimized BERT Pretraining Approach)
• XLNet
• DistilBERT
• T5 (Text-to-Text Transfer Transformer)
• BART (a denoising autoencoder for pretraining sequence-to-sequence models)
• ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately)
• BERT-based Multilingual Models
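As a hedged illustration, many of these pretrained models are available through the Hugging Face transformers library's pipeline API; the default sentiment-analysis checkpoint is chosen by the library and may change between versions.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")   # downloads a default pretrained checkpoint
print(classifier("Transformer models make NLP tasks much easier."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]  (exact output depends on the model)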

03-04-2024 M.Mohana, Research Scholar 78


Large Language Model (LLM)
• Type of AI model that has been trained on a vast amount of text
data to understand and generate human-like text
• Large language models use transformer models and are trained
using massive datasets — hence, large.
• This enables them to recognize, translate, predict, or generate text
or other content.
• GPT-3.5 is one example, supporting NLP understanding, text generation, code
generation, and more.
• Works by processing input text, analyzing patterns, and
generating responses based on learned information from the
training data.
• Many LLMs are trained on data gathered from the Internet, thousands or
millions of gigabytes' worth of text, although programmers may use a more
curated data set.
• LLMs are then further trained via tuning for particular tasks such
Image Source: Google
as interpreting questions and generating responses or translating
text from one language to another.

03-04-2024 M.Mohana, Research Scholar 79


How a Large Language Model (LLM) works ?
• Pre-training: LLMs undergo pre-training on large corpora of text data. The model learns to predict the next word in a sequence of text given the preceding context.
• Transformer Architecture: LLMs are often built using the Transformer architecture or its variants.
• Tokenization: Text data is tokenized into smaller units, such as words or subwords, before being fed into the LLM. Tokenization breaks the text into meaningful components that the model can process.
• Contextual Representations: LLMs learn to generate contextual representations of words and phrases based on their surrounding context. The model captures these contextual variations.
• Attention Mechanisms: LLMs leverage attention mechanisms to weigh the importance of different words or tokens in a sequence.
• Fine-tuning: After pre-training, LLMs can be fine-tuned on specific tasks or domains by further training on task-specific data.
• Generation: LLMs can generate human-like text by sampling from their learned language distributions, enabling text completion, summarization, question answering, and even creative writing.
• Transfer Learning: LLMs leverage transfer learning principles, where knowledge learned during pre-training is transferred and adapted to new tasks with relatively little additional training data.
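To make tokenization and generation concrete, here is a small sketch with the Hugging Face transformers library; GPT-2 is used only as a lightweight example model, and the sampled continuation will differ on every run.

from transformers import AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(tokenizer.tokenize("Large language models predict the next token."))  # subword tokens

generator = pipeline("text-generation", model="gpt2")
print(generator("Natural language processing is", max_new_tokens=20))       # sampled continuation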

03-04-2024 M.Mohana, Research Scholar 80


Large language models use cases
• Information retrieval: Think of Bing or Google. Whenever a user uses their search feature, the user
relies on a large language model to produce information in response to a query.
• Sentiment analysis
• Text generation: Large language models are behind generative AI, like ChatGPT, and can generate
text based on inputs.
• Code generation: Like text generation, code generation is an application of generative AI.
• Chatbots and conversational AI: Large language models enable customer service chatbots or
conversational AI to engage with customers, interpret the meaning of their queries or responses, and
offer responses in turn.
• Tech: Large language models are used anywhere from enabling search engines to respond to queries,
to assisting developers with writing code.
• Healthcare and Science: Large language models can understand proteins, molecules, DNA, and
RNA.
• Customer Service: LLMs are used across industries for customer service purposes such as chatbots or
conversational AI.
• Legal: From searching through massive textual datasets to generating legalese, large language models
can assist lawyers, paralegals, and legal staff.
• Banking: LLMs can support credit card companies in detecting fraud.

03-04-2024 M.Mohana, Research Scholar 81


Limitations and challenges of large language models:
Hallucinations
• When an LLM produces a false output, or one that does not match the user's
intent.
Security
• Large language models present important security risks when not managed
or surveilled properly.
• Can leak people's private information, participate in phishing scams, and
produce spam.
Bias
• Data used to train language models will affect the outputs a given model
produces.
Consent
• Large language models are trained on trillions of datasets, some of which
might not have been obtained consensually.
• When scraping data from the internet, large language models have been
known to ignore copyright licenses, plagiarize written content, and repurpose
proprietary content without getting permission from the original owners or
artists.
Scaling
• Can be difficult and time- and resource-consuming to scale and maintain
large language models.
Deployment
• Deploying large language models requires deep learning, a transformer
model, distributed software and hardware, and overall technical expertise.
03-04-2024 M.Mohana, Research Scholar 82
https://www.elastic.co/what-is/large-language-models
Let’s explore
visualization techniques
to understand text data

4/3/2024 M.Mohana, Research Scholar 83


Visualization in NLP
Process of representing and presenting linguistic data,
text corpora, language models, or NLP algorithms in a
visual format.

Goal of visualization in NLP is to make complex


linguistic patterns, semantic relationships, and textual
information more accessible and interpretable to users

When choosing a visualization technique, consider the


nature of the text data, the insights to convey, and user
Image Source: Google preferences to understand textual information visually.

Interactive visualizations are useful for exploring and


interacting with large text datasets or complex textual
structures.
https://www.numpyninja.com/post/nlp-text-data-visualization

03-04-2024 M.Mohana, Research Scholar 84


Visualization in NLP (Cont.)
• Word clouds display words from a text corpus, with font size indicating word
frequency.
• Bar charts display word frequencies or other textual data categories.
• Histograms show the distribution of word lengths, sentence lengths, or other text-
related metrics.
• Scatter plots can represent relationships between words or text features.
• Tree maps show hierarchical data structures, such as folder structures or topic
hierarchies in text.
• Heatmaps represent text-related metrics using color gradients.
• Network graphs depict relationships between entities in text, such as co-occurrence
networks or social networks based on mentions.
• Visualizing topics and their word distributions using bar charts, word clouds, or
interactive topic models.
• Time series plots show changes in text-related metrics over time, such as word
frequencies in social media posts or news articles.
• Mapping text data geographically, such as sentiment analysis results across regions or
locations mentioned in text.
• Adding annotations, labels, or tooltips to visualizations to provide context and
insights about specific text elements.
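As a hedged sketch of the first technique, the open-source wordcloud package can render a word cloud from raw text with a few lines; the sample text and canvas size below are arbitrary.

from wordcloud import WordCloud
import matplotlib.pyplot as plt

text = "language processing text data words frequency nlp corpus tokens model"
cloud = WordCloud(width=600, height=400, background_color="white").generate(text)

plt.imshow(cloud, interpolation="bilinear")   # font size reflects word frequency
plt.axis("off")
plt.show()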

03-04-2024 85
Explainable Artificial Intelligence (XAI) techniques in NLP
Attention Mechanisms: attention maps show which words or phrases contribute most to the model's output, providing
insights into the model's decision-making process.

Feature Importance: identifies the most important features or words in a text that influence the model's predictions (e.g., feature
permutation importance or SHAP (SHapley Additive exPlanations)).

LIME (Local Interpretable Model-agnostic Explanations): used to explain why a specific text input led to a particular
prediction from a machine learning model.

ELI5 (Explain Like I'm 5): explain the predictions of models such as text classifiers by highlighting important words or
phrases in the input text that contribute to the predicted class.

Integrated Gradients: assigns importance scores to each word or feature in the input text based on how they contribute to
the model's output

Saliency Maps: show which words or tokens have the highest impact on the model's decision, aiding in interpretability.

Rule-based Explanations: creates human-readable rules that describe how the model makes predictions based on specific
linguistic patterns or features in the text.
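For illustration, the snippet below is a minimal sketch with the lime package; predict_proba is a stand-in dummy classifier (not a real model) that simply returns class probabilities for a list of strings, which is the interface LIME expects.

import numpy as np
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    # dummy classifier: probability of "positive" rises if the word "good" appears
    pos = np.array([[0.3 + 0.4 * ("good" in t)] for t in texts])
    return np.hstack([1 - pos, pos])

explainer = LimeTextExplainer(class_names=["negative", "positive"])
explanation = explainer.explain_instance("The movie was good", predict_proba, num_features=4)
print(explanation.as_list())   # per-word contributions to the predicted class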

03-04-2024 M.Mohana, Research Scholar 86


Thank You..!

4/3/2024 M.Mohana, Research Scholar 87
