
MODULE - I Unit 1: Introduction to Natural Language Processing (NLP) and Language Modelling

Natural Language Processing (NLP) is a field of Artificial Intelligence (AI) that deals
with the interaction between computers and human languages. NLP is used to analyze,
understand, and generate natural language text and speech. The goal of NLP is to enable
computers to understand and interpret human language in a way that is similar to how
humans process language.
NLP has a wide range of applications, including sentiment analysis, machine
translation, text summarization, chatbots, and more. Some common tasks in NLP
include:
1. Text Classification: Classifying text into different categories based on their content,
such as spam filtering, sentiment analysis, and topic modeling.
2. Named Entity Recognition (NER): Identifying and categorizing named entities in text,
such as people, organizations, and locations.
3. Part-of-Speech (POS) Tagging: Assigning a part of speech to each word in a sentence,
such as noun, verb, adjective, or adverb.
4. Sentiment Analysis: Analyzing the sentiment of a piece of text, such as positive,
negative, or neutral.
5. Machine Translation: Translating text from one language to another.
NLP involves the use of several techniques, such as machine learning, deep learning, and
rule-based systems. Some popular tools and libraries used in NLP include NLTK (Natural
Language Toolkit), spaCy, and Gensim.
Overall, NLP is a rapidly growing field with many practical applications, and it has the
potential to revolutionize the way we interact with computers and machines using natural
language.
NLP techniques are used in a wide range of applications, including:
 Speech recognition and transcription: NLP techniques are used to convert speech to
text, which is useful for tasks such as dictation and voice-controlled assistants.
 Language translation: NLP techniques are used to translate text from one language to
another, which is useful for tasks such as global communication and e-commerce.
 Text summarization: NLP techniques are used to summarize long text documents into
shorter versions, which is useful for tasks such as news summarization and document
indexing.
 Sentiment analysis: NLP techniques are used to determine the sentiment or emotion
expressed in text, which is useful for tasks such as customer feedback analysis and
social media monitoring.
 Question answering: NLP techniques are used to answer questions asked in natural
language, which is useful for tasks such as chatbots and virtual assistants.
NLP is a rapidly growing field and is being used in many industries such as
healthcare, education, e-commerce, and customer service. NLP is also used to improve
the performance of natural language-based systems like chatbots, virtual assistants,
recommendation systems, and more. With advances in NLP, it has become possible for
computers to understand and process human language for applications such as speech
recognition, language translation, question answering, and more.

Applications of Natural Language Processing

1. Chatbots
Chatbots are a form of artificial intelligence that are programmed to interact with humans in
such a way that they sound like humans themselves. Depending on the complexity of the
chatbots, they can either just respond to specific keywords or they can even hold full
conversations that make it tough to distinguish them from humans. Chatbots are created
using Natural Language Processing and Machine Learning, which means that they
understand the complexities of the English language, extract the actual meaning of a
sentence, and learn from their conversations with humans, becoming better over time.
Chatbots work in two simple steps: first, they identify the meaning of the question
asked and collect from the user any data required to answer it; then they answer the
question appropriately.
2. Autocomplete in Search Engines
Have you noticed that search engines tend to guess what you are typing and automatically
complete your sentences? For example, On typing “game” in Google, you may get further
suggestions for “game of thrones”, “game of life” or if you are interested in maths then
“game theory”. All these suggestions are provided using autocomplete that uses Natural
Language Processing to guess what you want to ask. Search engines use their enormous
data sets to analyze what their customers are probably typing when they enter particular
words and suggest the most common possibilities. They use Natural Language Processing
to make sense of these words and how they are interconnected to form different sentences.
3. Voice Assistants
These days voice assistants are all the rage! Whether it's Siri, Alexa, or Google Assistant,
almost everyone uses one of these to make calls, place reminders, schedule meetings, set
alarms, surf the internet, etc. These voice assistants have made life much easier. But how
do they work? They use a complex combination of speech recognition, natural language
understanding, and natural language processing to understand what humans are saying and
then act on it. The long term goal of voice assistants is to become a bridge between humans
and the internet and provide all manner of services based on just voice interaction.
However, they are still a little far from that goal seeing as Siri still can’t understand what
you are saying sometimes!
4. Language Translator
Want to translate a text from English to Hindi but don’t know Hindi? Well, Google
Translate is the tool for you! While it’s not exactly 100% accurate, it is still a great tool to
convert text from one language to another. Google Translate and other translation tools
use sequence-to-sequence modeling, a technique in Natural Language Processing that
allows an algorithm to convert a sequence of words from one language into another,
which is translation. Earlier, language translators used Statistical Machine Translation
(SMT), which meant they analyzed millions of documents that were already translated
from one language to another (English to Hindi in this case) and then looked for the
common patterns and basic vocabulary of the language. However, this method is not as
accurate as sequence-to-sequence modeling.
5. Sentiment Analysis
Almost all the world is on social media these days! And companies can use sentiment
analysis to understand how a particular type of user feels about a particular topic, product,
etc. They can use natural language processing, computational linguistics, text analysis, etc.
to understand the general sentiment of the users for their products and services and find out
if the sentiment is good, bad, or neutral. Companies can use sentiment analysis in a lot of
ways such as to find out the emotions of their target audience, to understand product
reviews, to gauge their brand sentiment, etc. And not just private companies; even
governments use sentiment analysis to gauge popular opinion and to catch any threats
to the security of the nation.
6. Grammar Checkers
Grammar and spelling are very important when writing professional reports for your
superiors or even assignments for your lecturers. After all, major errors may get you
fired or failed! That's why grammar and spell checkers are a very important tool for any
professional writer. They can not only correct grammar and check spellings but also
suggest better synonyms and improve the overall readability of your content. And guess
what, they utilize natural language processing to provide the best possible piece of writing!
The NLP algorithm is trained on millions of sentences to understand the correct format.
That is why it can suggest the correct verb tense, a better synonym, or a clearer sentence
structure than what you have written. Some of the most popular grammar checkers that use
NLP include Grammarly, WhiteSmoke, ProWritingAid, etc.
7. Email Classification and Filtering
Emails are still the most important method for professional communication. However, all of
us still get thousands of promotional Emails that we don’t want to read. Thankfully, our
emails are automatically divided into 3 sections namely, Primary, Social, and Promotions
which means we never have to open the Promotional section! But how does this work?
Email services use natural language processing to identify the contents of each Email with
text classification so that it can be put in the correct section. This method is not perfect,
since some promotional newsletters still end up in Primary, but it's better than nothing. In
more advanced cases, some companies also use specialty anti-virus software with natural
language processing to scan the Emails and see if there are any patterns and phrases that
may indicate a phishing attempt on the employees.

Phases of Natural Language Processing


 Lexical Analysis
The first phase of NLP, where text or sound waves are segmented into words and other
units. This phase converts a sequence of characters into a sequence of tokens.
 Syntactic Analysis
Also referred to as parsing, this phase analyzes natural language against the rules of a
formal grammar. It assigns a syntactic structure to the text and allows the extraction of
phrases that convey more meaning than just the individual words.
 Semantic Analysis
This phase determines the meaning of a statement by uncovering the definitions of words,
phrases, and sentences. It also maps syntactic structures onto objects in the task domain.
 Discourse Integration
This phase considers the meaning of any sentence in relation to the previous and following
sentences.
 Pragmatic Analysis
This phase focuses on understanding a text's intended meaning by considering the
contextual factors surrounding it. It involves deriving those aspects of language which
require real world knowledge.

Difficulty of NLP including ambiguity;

Natural Language Processing (NLP) presents several challenges, and ambiguity is one of the
most significant ones. Here's a breakdown of the difficulty of NLP, including ambiguity:

1. Ambiguity: Natural language is inherently ambiguous. Words, phrases, and sentences can
have multiple meanings depending on context, tone, and cultural nuances. Resolving
ambiguity is crucial for accurate NLP tasks such as machine translation, sentiment analysis,
and question answering.
2. Semantic Understanding: NLP systems must comprehend the meaning of words, phrases,
and sentences to perform tasks accurately. However, capturing the nuances of human
language, including sarcasm, irony, and metaphor, poses a significant challenge.
3. Context Dependency: The meaning of a word or phrase often depends on the surrounding
context. For example, the word "bank" can refer to a financial institution or the side of a
river. Understanding the intended meaning requires analyzing the context in which the word
appears.
4. Data Sparsity: NLP models require vast amounts of data to learn effectively. However,
language data is often sparse, meaning that certain words or phrases may occur infrequently
in the training data, making it challenging for models to learn their meaning accurately.
5. Syntactic Complexity: Natural language exhibits complex syntactic structures, including
grammar rules, syntax, and word order. Parsing and understanding these structures accurately
is essential for tasks such as syntactic analysis and grammar correction.
6. Domain Adaptation: NLP systems trained on one domain may struggle to perform well in a
different domain due to differences in language use, terminology, and style. Adapting models
to new domains requires additional training data and fine-tuning techniques.
7. Multilinguality: Dealing with multiple languages adds another layer of complexity to NLP
tasks. Translation, sentiment analysis, and other NLP tasks must account for linguistic
differences between languages, including grammar, syntax, and idiomatic expressions.
8. Ethical and Bias Concerns: NLP models can inherit biases present in the training data,
leading to biased or unfair outcomes. Addressing these biases and ensuring fairness and
ethical considerations in NLP systems is an ongoing challenge.

Overall, NLP is a complex and challenging field that requires addressing various issues,
including ambiguity, to develop accurate and robust language understanding systems.
Ongoing research and advancements in machine learning techniques continue to improve the
capabilities of NLP systems, but many challenges remain to be addressed.

Spelling error and Noisy Channel Model;

Spelling Error: In NLP, spelling errors refer to mistakes made during the writing process,
such as typographical errors, misspellings, or phonetic mistakes. These errors can
significantly affect the performance of NLP tasks, as they can lead to incorrect interpretations
or classifications of text.

Noisy Channel Model: The Noisy Channel Model is a probabilistic framework used in tasks
like spell checking and correction. It is based on the idea that an observed misspelled word
(the "observation") was produced from an intended word (the "hidden" or "latent" variable)
that has been corrupted by noise in the channel (e.g., typing errors, auto-correction
mistakes).

Here's how the Noisy Channel Model works:

1. Error Model: This component estimates the probability of different types of errors
occurring. For example, it might model the likelihood of a specific letter being mistyped as
another letter, or the likelihood of adjacent keys being pressed accidentally.
2. Language Model: This component estimates the probability of different words or sequences
of words occurring in the language. It helps in determining the likelihood of a particular word
being the intended word based on the context of the surrounding words.
3. Decoding: Given a misspelled word, the Noisy Channel Model computes the probability of
all possible corrections by combining the probabilities from the error model and the language
model. The correction with the highest probability is then selected as the most likely intended
word.
4. Example: Let's say we have the misspelled word "hte". The Noisy Channel Model would
consider possible corrections like "the", "hat", "hot", etc. It calculates the probability of each
correction based on the likelihood of different types of errors and the probability of each
corrected word occurring in the language.

By combining information from both the error model and the language model, the Noisy
Channel Model can effectively correct spelling errors and improve the accuracy of NLP tasks
that rely on correctly spelled text.
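The decision rule in step 3 can be written as: chosen correction = argmax over candidates w of P(observed | w) * P(w). Below is a minimal, self-contained Python sketch of this idea; the unigram probabilities and the edit-distance-based error model are toy illustrative assumptions, not values from any real system.

```python
# A minimal sketch of noisy-channel spelling correction. The unigram "language
# model" and the edit-distance-based "error model" below are toy assumptions;
# real systems estimate both from corpora and observed error data.
LANGUAGE_MODEL = {"the": 0.05, "hat": 0.002, "hot": 0.003}  # hypothetical P(w)

def edit_distance(a, b):
    """Levenshtein distance computed with dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,                              # deletion
                           dp[i][j - 1] + 1,                              # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))     # substitution
    return dp[len(a)][len(b)]

def error_probability(observed, candidate):
    """Crude error model: fewer edits -> higher P(observed | candidate)."""
    return 0.1 ** edit_distance(observed, candidate)

def correct(observed):
    """Noisy channel decision: argmax over w of P(observed | w) * P(w)."""
    return max(LANGUAGE_MODEL,
               key=lambda w: error_probability(observed, w) * LANGUAGE_MODEL[w])

print(correct("hte"))  # -> "the" under these toy probabilities
```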
Concepts of Parts-of-Speech and Formal Grammar of English.

In the field of Natural Language Processing (NLP), understanding the concepts of parts of
speech and formal grammar of English is crucial for various tasks such as text processing,
parsing, and language generation. Here's how these concepts relate to NLP:

1. Parts of Speech Tagging (POS Tagging): Parts of speech tagging is the process of
automatically assigning a grammatical category (such as noun, verb, adjective, etc.) to each
word in a sentence. POS tagging is a fundamental task in NLP, as it provides valuable
information about the syntactic structure of text, which can be used for tasks like information
extraction, text classification, and sentiment analysis.
2. Named Entity Recognition (NER): Named Entity Recognition is a task in NLP where the
goal is to identify and classify named entities (such as person names, organization names,
locations, etc.) within a text. Understanding parts of speech helps in identifying named
entities by recognizing patterns such as proper nouns (which often represent entities) and
their surrounding words.
3. Dependency Parsing: Dependency parsing is the process of analyzing the grammatical
structure of a sentence to determine the relationships between words. It involves identifying
the syntactic dependencies between words, such as subject-verb relationships, object
relationships, and modifiers. Parts of speech information is crucial for dependency parsing as
it provides the necessary context for understanding the grammatical relationships between
words.
4. Constituency Parsing: Constituency parsing involves breaking down a sentence into its
syntactic constituents or phrases. It aims to identify the hierarchical structure of a sentence,
including phrases such as noun phrases, verb phrases, and prepositional phrases. Formal
grammar rules, such as phrase structure rules, are often used in constituency parsing to
generate parse trees that represent the syntactic structure of sentences.
5. Language Generation: In language generation tasks, such as text summarization, machine
translation, and dialogue systems, understanding formal grammar rules is essential for
generating grammatically correct and coherent text. By adhering to the rules of English
grammar, NLP models can produce text that is fluent and natural-sounding.

Overall, the concepts of parts of speech and formal grammar of English play a fundamental
role in various NLP tasks, enabling machines to understand, analyze, and generate human
language effectively.

N-gram and Neural Language Models; Language Modelling with N-gram

N-gram and Neural Language Models are two approaches used in language modeling, a
fundamental task in natural language processing (NLP). Both methods aim to predict the next
word in a sequence of words given the previous context. Let's discuss each of them:

1. N-gram Language Models:


 In N-gram language models, the probability of a word given its context (previous N-1
words) is estimated using the conditional probability distribution based on observed
frequencies in a training corpus.
 The "N" in N-gram represents the number of preceding words used for prediction. For
example, in a bigram model (2-gram), the probability of a word depends only on the
preceding word.
 N-gram models are relatively simple and computationally efficient, making them
suitable for tasks where large datasets are available. However, they suffer from the
sparsity problem when dealing with long contexts or rare combinations of words.
2. Neural Language Models (NLMs):
 Neural Language Models utilize neural networks, typically recurrent neural networks
(RNNs), long short-term memory networks (LSTMs), or transformer models, to learn
the probability distribution of words in a sequence.
 NLMs can capture complex patterns and dependencies in language data by learning
distributed representations of words and their contexts in continuous vector spaces.
 Unlike N-gram models, which rely on counting occurrences, NLMs learn from raw
text data through optimization techniques like backpropagation and gradient descent.
 Transformer-based models like GPT (Generative Pre-trained Transformer) have
achieved state-of-the-art performance in various NLP tasks, including language
modeling, by capturing long-range dependencies effectively.

While N-gram models are simpler and have been traditionally used for language modeling,
Neural Language Models have gained popularity due to their ability to handle more complex
language structures and capture contextual nuances more effectively. However, NLMs are
computationally more demanding and require large amounts of data for training. Depending
on the specific task and available resources, either approach can be chosen for language
modeling tasks.

Simple N-gram models are straightforward probabilistic models used for language modeling,
where the probability of a word given its context (preceding N-1 words) is estimated based
on observed frequencies in a training corpus. Here's a step-by-step explanation of how a
simple N-gram model works:

1. Tokenization:
 The first step involves tokenizing the text data into words or tokens. This involves
splitting the text into individual words or subwords, depending on the granularity
desired for modeling.
2. Constructing N-grams:
 Next, the text data is used to construct N-grams, where an N-gram is a sequence of N
consecutive words. For example, in a bigram model (2-gram), the text "The quick
brown fox jumps" would produce the following bigrams:
 ["The", "quick"], ["quick", "brown"], ["brown", "fox"], ["fox", "jumps"]
 Higher N-gram models capture longer dependencies but may suffer from data sparsity
issues due to the increased number of possible N-grams.
3. Counting Frequencies:
 After constructing N-grams from the training corpus, the frequency of occurrence of
each N-gram is counted. This involves counting how many times each N-gram
appears in the corpus.
4. Estimating Probabilities:
 Once the frequencies of N-grams are obtained, probabilities are estimated using these
frequencies. The probability of a word given its context (previous N-1 words) is
calculated as the ratio of the count of the N-gram to the count of the preceding (N-1)-
gram.
 For example, in a bigram model, the probability of a word given its preceding word is
calculated as:
P(w_n | w_{n-1}) = Count(w_{n-1}, w_n) / Count(w_{n-1})
 These probabilities represent the likelihood of observing a particular word given its
context according to the training data.
5. Prediction:
 During inference or testing, the N-gram model predicts the next word in a sequence
based on the context of the preceding N-1 words.
 The word with the highest probability given the context is chosen as the predicted
next word.
 If using a higher-order N-gram model (e.g., trigrams or higher), the prediction is
based on the preceding N-1 words.
6. Smoothing (Optional):
 To address the issue of data sparsity, smoothing techniques such as Laplace (add-one)
smoothing or Good-Turing smoothing may be applied to adjust the probability
estimates, especially for unseen N-grams.

Simple N-gram models are easy to implement and computationally efficient, making them
suitable for various language modeling tasks. However, they may not capture long-range
dependencies and contextual nuances as effectively as more advanced models like neural
language models.
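As a concrete illustration of steps 2-5 above, here is a minimal bigram model sketch in Python; the toy corpus is an illustrative assumption.

```python
# A minimal sketch of a bigram language model built from raw counts (MLE),
# following the steps above. The toy corpus is an illustrative assumption.
from collections import Counter

corpus = "the quick brown fox jumps over the lazy dog".split()

# Steps 2-3: construct bigrams and count frequencies
bigram_counts = Counter(zip(corpus, corpus[1:]))
unigram_counts = Counter(corpus)

# Step 4: estimate P(w_n | w_{n-1}) = Count(w_{n-1}, w_n) / Count(w_{n-1})
def bigram_prob(prev, word):
    return bigram_counts[(prev, word)] / unigram_counts[prev]

# Step 5: predict the most likely next word given a context word
def predict_next(prev):
    candidates = {w: c for (p, w), c in bigram_counts.items() if p == prev}
    return max(candidates, key=candidates.get) if candidates else None

print(bigram_prob("the", "quick"))   # 0.5 -- "the" is followed by "quick" once out of two
print(predict_next("the"))           # "quick" or "lazy" (tied counts; max picks one)
```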

Smoothing (basic techniques)

In natural language processing (NLP), smoothing techniques are used to address the problem
of data sparsity in language models, where some n-grams observed in the test data may not
occur in the training data. Here are some basic smoothing techniques commonly used:

1. Laplace (Add-One) Smoothing:


 Add one count to every possible n-gram in the training data.
 Adjust the probability estimates to include unseen n-grams by adding a pseudocount.
 This ensures that no probability estimate is zero, but it can overly smooth the
probabilities.
2. Lidstone Smoothing:
 Generalization of Laplace smoothing where a fractional count (0 < λ < 1) is added to
each count.
 λ represents a smoothing parameter, allowing for varying degrees of smoothing.
3. Additive (or Add-k) Smoothing:
 Similar to Laplace smoothing, but instead of adding one, a constant k is added to each
count.
 Helps to reduce the impact of zero counts more gently than Laplace smoothing.
4. Good-Turing Smoothing:
 Estimate the probability of unseen events by extrapolating from seen events.
 It involves re-estimating the probabilities of low-frequency events based on the
frequency of higher-frequency events.
 This method often requires more computational resources but can provide more
accurate smoothing estimates.
5. Kneser-Ney Smoothing:
 Particularly used in language modeling with N-grams.
 Combines absolute discounting (subtracting a constant from the count of each n-
gram) with backoff.
 It redistributes the probability mass from frequent to less frequent n-grams, improving
generalization.

These techniques help ensure that the language model assigns non-zero probabilities to all
possible events, even those not observed in the training data. They are essential for improving
the generalization and performance of language models, especially in cases where the
vocabulary or context is diverse and data sparsity is a concern.
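As a concrete illustration, here is a minimal sketch of add-k smoothing (Laplace smoothing when k = 1) for bigram probabilities; the toy corpus and counts are illustrative assumptions.

```python
# A minimal sketch of add-k smoothing (Laplace when k = 1) for bigram
# probabilities. The toy corpus and the resulting counts are illustrative.
from collections import Counter

corpus = "the cat sat on the mat".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))
unigram_counts = Counter(corpus)
vocab_size = len(unigram_counts)

def smoothed_bigram_prob(prev, word, k=1.0):
    """P(word | prev) = (C(prev, word) + k) / (C(prev) + k * |V|)."""
    return (bigram_counts[(prev, word)] + k) / (unigram_counts[prev] + k * vocab_size)

print(smoothed_bigram_prob("the", "cat"))   # seen bigram: boosted count
print(smoothed_bigram_prob("the", "sat"))   # unseen bigram: small but non-zero
```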

Evaluating language models;

Evaluating language models involves assessing their performance in generating or predicting
text based on a given context. Here are some common evaluation techniques used for
language models:
1. Perplexity: A measure of how well a language model predicts a sample of text, with lower
values indicating better performance (a code sketch follows this list).
2. Cross-Entropy: Average number of bits needed to represent the outcomes of an event from a
probability distribution, with lower values indicating better performance.
3. N-gram Evaluation: Assess accuracy of predicted next words given preceding context, often
using metrics like precision, recall, and F1-score.
4. Human Evaluation: Subjective assessment by human judges based on criteria such as
fluency, coherence, relevance, and grammaticality.
5. Task-Specific Evaluation: Evaluation on downstream NLP tasks like machine translation,
text summarization, and sentiment analysis to measure real-world applicability.
6. Transfer Learning Evaluation: Assessment of performance on specific tasks after applying
transfer learning techniques like fine-tuning or domain adaptation.
7. Diversity and Novelty: Evaluation metrics focusing on variety, uniqueness, and originality
of generated text to assess creativity and richness.
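As referenced under Perplexity above, here is a minimal sketch of computing perplexity from per-word probabilities; the probability values are illustrative assumptions rather than the output of a real model.

```python
# A minimal sketch of computing perplexity from a model's per-word probabilities.
# The probability values below are illustrative assumptions; a real evaluation
# would take them from the language model being tested.
import math

def perplexity(word_probs):
    """Perplexity = exp( -(1/N) * sum(log p_i) ); lower is better."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

# Probabilities assigned by a hypothetical model to each word of a test sentence
probs = [0.2, 0.1, 0.05, 0.3]
print(perplexity(probs))  # ~ 7.6
```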

Neural Network basics


Neural networks are widely used in natural language processing (NLP) for tasks such as text
classification, sentiment analysis, machine translation, named entity recognition, and more.
Here's how neural networks are applied in NLP:
1. Word Embeddings: Dense vector representations of words capturing semantic relationships.
2. Recurrent Neural Networks (RNNs): Designed for sequential data, maintaining internal
state for context.
3. Convolutional Neural Networks (CNNs): Originally for images, adapted for local pattern
recognition in text.
4. Transformer Models: Leverage self-attention mechanisms to capture global dependencies in
text.
5. Sequence-to-Sequence Models: Encode input sequences into fixed-length vectors and
decode into output sequences.
6. Transfer Learning: Pre-trained models fine-tuned on specific tasks, leveraging knowledge
from large-scale pre-training.

Training

Training in natural language processing (NLP) refers to the process of teaching machine
learning models to understand and generate human language. Here's how training works in
NLP (a code sketch follows the list below):
 Data Collection: Gathering relevant text data from various sources for training an NLP
model.
 Data Preprocessing: Cleaning and transforming raw text data into a format suitable for
training, including tasks like tokenization, lowercasing, and removing punctuation.
 Feature Engineering: Extracting informative features from the preprocessed text data,
such as word embeddings or TF-IDF vectors.
 Model Selection: Choosing an appropriate model architecture for the NLP task at hand,
considering factors like complexity and computational resources.
 Model Training: Optimizing the model parameters to minimize a loss function using an
algorithm such as gradient descent.
 Hyperparameter Tuning: Adjusting non-learnable model parameters (hyperparameters)
to optimize performance, often through techniques like grid search or random search.
 Evaluation: Assessing the performance of the trained model using metrics like accuracy,
precision, recall, or task-specific metrics such as BLEU score for machine translation.
 Fine-Tuning and Iteration: Refining the model iteratively by adjusting hyperparameters,
architecture, or incorporating additional training data to improve performance.
 Deployment: Integrating the trained model into a production environment for real-world
use, often as an API or part of a larger software application.
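As a concrete illustration of this workflow (feature extraction, training, and evaluation), here is a minimal sketch using scikit-learn; the tiny labelled dataset, the TF-IDF features, and the logistic regression classifier are illustrative choices, not the only possible pipeline.

```python
# A minimal sketch of the training workflow above with scikit-learn:
# feature extraction (TF-IDF), model training, and evaluation.
# The tiny labelled dataset is an illustrative assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

texts = ["great product, loved it", "terrible service, very slow",
         "excellent quality", "awful experience",
         "really good value", "bad and disappointing"]
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=0)

# Feature engineering + model selection wrapped in one pipeline
model = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])
model.fit(X_train, y_train)                                         # model training
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))   # evaluation
```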
Neural Language Model
In natural language processing (NLP), a Neural Language Model (NLM) is a type of
statistical model that uses neural networks to learn the probability distribution of sequences
of words in natural language. Here's a concise overview:

1. Definition:
 A Neural Language Model (NLM) predicts the likelihood of a word given its context,
typically the preceding words in a sentence or sequence.
2. Architecture:
 NLMs are often implemented using recurrent neural networks (RNNs), long short-
term memory networks (LSTMs), or transformer architectures.
 RNN-based models process sequences one token at a time, maintaining an internal
state to capture context.
 LSTMs address the vanishing gradient problem, allowing them to capture long-range
dependencies in text.
 Transformer models use self-attention mechanisms to capture global dependencies
between words in a sequence.
3. Training:
 NLMs are trained on large text corpora using techniques like backpropagation and
stochastic gradient descent.
 During training, the model learns to predict the next word in a sequence given the
context of the preceding words.
 Training objectives typically involve maximizing the likelihood of observing the
actual next word in the training data given the model's predictions.
4. Applications:
 NLMs are used in various NLP tasks such as machine translation, text generation,
sentiment analysis, and speech recognition.
 They excel in tasks that require understanding and generating human language, thanks
to their ability to capture complex linguistic patterns.
5. Advantages:
 NLMs can capture long-range dependencies and contextual nuances effectively,
surpassing the limitations of traditional n-gram models.
 They can learn from raw text data without manual feature engineering, making them
adaptable to different languages and tasks.
6. State-of-the-Art Models:
 Recent advancements include models like OpenAI's GPT (Generative Pre-trained
Transformer) and Google's BERT (Bidirectional Encoder Representations from
Transformers).
 These models are pre-trained on vast text corpora and fine-tuned on specific tasks,
achieving state-of-the-art performance in various NLP benchmarks.

In summary, Neural Language Models have revolutionized NLP by enabling more accurate
and context-aware language understanding and generation systems.
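As a concrete illustration, here is a minimal sketch of a word-level neural language model in PyTorch using an LSTM; the toy corpus, model sizes, and training settings are illustrative assumptions, not a production recipe.

```python
# A minimal sketch of a word-level neural language model in PyTorch.
# The toy corpus, vocabulary, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
word2idx = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([word2idx[w] for w in corpus])

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        emb = self.embed(x)          # (batch, seq, embed_dim)
        hidden, _ = self.lstm(emb)   # (batch, seq, hidden_dim)
        return self.out(hidden)      # logits over the vocabulary

model = LSTMLanguageModel(len(vocab))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Train to predict the next word: inputs are ids[:-1], targets are ids[1:].
inputs, targets = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for epoch in range(200):
    optimizer.zero_grad()
    logits = model(inputs)                                   # (1, seq, vocab)
    loss = criterion(logits.reshape(-1, len(vocab)), targets.reshape(-1))
    loss.backward()
    optimizer.step()

# Predict the word most likely to follow the last context word.
with torch.no_grad():
    next_id = model(inputs)[0, -1].argmax().item()
print("Predicted next word:", vocab[next_id])
```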
Application of Neural Language Models in NLP System Development; Python Libraries for
NLP
Certainly! Let's delve into more specific applications of Neural Language Models (NLMs) in
NLP system development and corresponding Python libraries:

1. Text Generation:
 Application: Generating human-like text for chatbots, content creation, and
storytelling.
 Python Libraries: OpenAI's GPT (Generative Pre-trained Transformer), which can be
accessed through the transformers library by Hugging Face.
2. Machine Translation:
 Application: Translating text from one language to another.
 Python Libraries: Google's TensorFlow and Facebook's PyTorch for training custom
translation models; Hugging Face's transformers for using pre-trained translation
models like MarianMT.
3. Text Summarization:
 Application: Automatically generating concise summaries of long documents or
articles.
 Python Libraries: Gensim for text summarization using algorithms like TextRank or
Latent Semantic Analysis (LSA); spaCy for extracting key sentences.
4. Named Entity Recognition (NER):
 Application: Identifying and classifying entities like names, organizations, and
locations in text.
 Python Libraries: spaCy for built-in NER capabilities; TensorFlow and PyTorch for
custom NER models using BiLSTM-CRF architectures.
5. Sentiment Analysis:
 Application: Analyzing the sentiment expressed in text data (positive, negative,
neutral).
 Python Libraries: TextBlob for basic sentiment analysis; transformers for using pre-
trained sentiment analysis models like BERT or RoBERTa.
6. Text Classification:
 Application: Categorizing text into predefined classes or labels.
 Python Libraries: scikit-learn for classical machine learning models like Naive Bayes
or SVM; transformers for using pre-trained models like BERT or XLNet for text
classification tasks.
7. Dialogue Systems:
 Application: Building conversational agents that can understand and generate human-
like responses.
 Python Libraries: OpenAI's GPT for generating dialogue responses; TensorFlow and
PyTorch for training custom dialogue models using sequence-to-sequence
architectures.
8. Question Answering:
 Application: Answering questions based on given context or knowledge bases.
 Python Libraries: Hugging Face's transformers for using pre-trained models like
BERT or RoBERTa fine-tuned on QA datasets like SQuAD or Natural Questions.

These applications demonstrate the versatility and effectiveness of Neural Language Models
in various NLP tasks. Python libraries such as TensorFlow, PyTorch, spaCy, Gensim,
TextBlob, scikit-learn, and Hugging Face's transformers provide powerful tools and pre-
trained models for building sophisticated NLP systems.
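As a small illustration of using pre-trained models, here is a sketch based on Hugging Face's transformers pipelines; the default sentiment model and the gpt2 model shown are illustrative choices, and the models are downloaded on first use.

```python
# A minimal sketch using Hugging Face's transformers pipelines. Model choices
# are illustrative; the pipelines download pre-trained weights on first use.
from transformers import pipeline

# Sentiment analysis with the pipeline's default pre-trained model
sentiment = pipeline("sentiment-analysis")
print(sentiment("Neural language models have transformed NLP."))

# Text generation with GPT-2
generator = pipeline("text-generation", model="gpt2")
print(generator("Natural language processing is", max_length=20, num_return_sequences=1))
```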

Using Python libraries/packages such as Natural Language Toolkit (NLTK)

Certainly! Let's explore how we can utilize the Natural Language Toolkit (NLTK) Python
library for some NLP tasks:

1. **Tokenization**:
- NLTK provides functions for tokenizing text into words or sentences.
- Example:
```python
from nltk.tokenize import word_tokenize, sent_tokenize

# May require tokenizer models: nltk.download('punkt')
text = "NLTK is a powerful library for natural language processing."

words = word_tokenize(text)
sentences = sent_tokenize(text)

print("Words:", words)
print("Sentences:", sentences)
```

2. **Part-of-Speech Tagging**:
- NLTK allows tagging words in a sentence with their respective part-of-speech (POS)
labels.
- Example:
```python
from nltk import pos_tag
from nltk.tokenize import word_tokenize

# May require: nltk.download('averaged_perceptron_tagger')
words = word_tokenize("NLTK is a powerful library for natural language processing.")
tagged_words = pos_tag(words)

print("POS Tags:", tagged_words)
```

3. **Named Entity Recognition (NER)**:


- NLTK provides a basic NER functionality to identify named entities in text.
- Example:
```python
from nltk import ne_chunk, pos_tag
from nltk.tokenize import word_tokenize

# May require: nltk.download('maxent_ne_chunker') and nltk.download('words')
text = "Bill Gates is the founder of Microsoft Corporation."

words = word_tokenize(text)
tagged_words = pos_tag(words)
named_entities = ne_chunk(tagged_words)

print("Named Entities:", named_entities)
```
4. **Stemming and Lemmatization**:
- NLTK offers stemming and lemmatization algorithms to reduce words to their base forms.
- Example:
```python
from nltk.stem import PorterStemmer, WordNetLemmatizer

# May require: nltk.download('wordnet')
words = ["running", "plays", "better"]

porter_stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

stemmed_words = [porter_stemmer.stem(word) for word in words]
lemmatized_words = [lemmatizer.lemmatize(word) for word in words]

print("Stemmed Words:", stemmed_words)
print("Lemmatized Words:", lemmatized_words)
```

5. **WordNet**:
- NLTK includes WordNet, a lexical database of English words with their semantic
relationships.
- Example:
```python
from nltk.corpus import wordnet

# May require: nltk.download('wordnet')
synsets = wordnet.synsets("car")
hypernyms = synsets[0].hypernyms()

print("Synsets:", synsets)
print("Hypernyms:", hypernyms)
```

6. **Text Classification**:
- NLTK provides tools for text classification using machine learning algorithms like Naive
Bayes.
- Example:
```python
import nltk
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews  # may require: nltk.download('movie_reviews')

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
featuresets = [(dict([(word, True) for word in document]), category)
               for document, category in documents]

train_set = featuresets[:1900]
test_set = featuresets[1900:]

classifier = NaiveBayesClassifier.train(train_set)
accuracy = nltk.classify.accuracy(classifier, test_set)

print("Accuracy:", accuracy)
```

These examples illustrate how NLTK can be used for various NLP tasks such as tokenization,
POS tagging, NER, stemming, lemmatization, WordNet operations, and text classification.
NLTK serves as a versatile and comprehensive toolkit for NLP tasks in Python.

spaCy, Gensim

Certainly! Let's explore how spaCy and Gensim, two popular Python libraries for NLP, can
be used:

1. **spaCy**:
- spaCy is a modern and efficient library for NLP tasks, designed to be fast, accurate, and
easy to use.

- **Key Features**:
- Tokenization: Splitting text into words or phrases.
- Part-of-Speech (POS) Tagging: Assigning grammatical categories to words.
- Named Entity Recognition (NER): Identifying named entities like people, organizations,
or locations.
- Dependency Parsing: Analyzing the syntactic structure of sentences.
- Lemmatization: Reducing words to their base or dictionary forms.
- Sentence Segmentation: Splitting text into individual sentences.
- Entity Linking: Linking named entities to a knowledge base.
- Word Vectors: Generating word embeddings for words in a document.
- Text Classification: Classifying text into predefined categories or labels.

- **Example Usage**:
```python
import spacy

# Load spaCy's English language model
# May require: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

# Process text
text = "spaCy is a powerful library for NLP."
doc = nlp(text)

# Extract entities
entities = [(entity.text, entity.label_) for entity in doc.ents]

# Tokenization and POS tagging
tokens_and_tags = [(token.text, token.pos_) for token in doc]

print("Entities:", entities)
print("Tokens and POS tags:", tokens_and_tags)
```

2. **Gensim**:
- Gensim is a library for topic modeling, document similarity analysis, and word
embeddings.

- **Key Features**:
- Topic Modeling: Discovering topics in a collection of documents using algorithms like
Latent Dirichlet Allocation (LDA).
- Document Similarity: Calculating similarity between documents based on their content.
- Word Embeddings: Learning dense vector representations of words in a large text corpus
using algorithms like Word2Vec.
- Text Summarization: Automatically generating summaries of text documents.
- Semantic Search: Finding documents or sentences related to a query based on their
semantic similarity.

- **Example Usage**:
```python
from gensim.models import Word2Vec
# Note: gensim.summarization was removed in Gensim 4.x; this import needs Gensim 3.x
from gensim.summarization import summarize

# Example Word2Vec training
sentences = [["spaCy", "is", "a", "powerful", "library", "for", "NLP"],
             ["Gensim", "is", "used", "for", "topic", "modeling", "and", "word",
              "embeddings"]]
model = Word2Vec(sentences, min_count=1)

# Example text summarization (summarize() works best on longer texts)
text = ("Gensim is a popular library for topic modeling and word embeddings. "
        "It offers tools for text summarization.")
summary = summarize(text)

print("Word2Vec Similarity:", model.wv.similarity("spaCy", "Gensim"))
print("Text Summary:", summary)
```

spaCy and Gensim are powerful tools in the NLP toolkit, offering a wide range of
functionalities for tasks such as tokenization, entity recognition, word embeddings, topic
modeling, and more.

Unit 2: Morphology & Parsing in NLP

Morphology and parsing are fundamental components of natural language processing (NLP)
that deal with the structure and analysis of words and sentences. Here's an overview of each:
1. Morphology:
 Definition: Morphology is the study of the internal structure of words and the rules
governing their formation.
 Tasks:
 Tokenization: Breaking text into individual words or tokens.
 Stemming: Reducing words to their root or base form by removing affixes.
 Lemmatization: Similar to stemming, but producing valid words (lemmas) by
considering the word's context and part of speech.
 Morphological Analysis: Identifying and analyzing morphemes, the smallest
units of meaning in a language, such as prefixes, suffixes, and roots.
 Example:
 Tokenization: "The quick brown fox" -> ["The", "quick", "brown", "fox"]
 Stemming: "running" -> "run", "cats" -> "cat"
 Lemmatization: "running" -> "run", "cats" -> "cat"
2. Parsing:
 Definition: Parsing is the process of analyzing the grammatical structure of sentences
to determine their syntactic relationships.
 Tasks:
 Part-of-Speech (POS) Tagging: Assigning grammatical categories (nouns,
verbs, adjectives, etc.) to words in a sentence.
 Dependency Parsing: Analyzing the syntactic dependencies between words in
a sentence, typically represented as a parse tree.
 Constituency Parsing: Identifying the constituents (phrases) within a sentence
and their hierarchical structure.
 Example:
 POS Tagging: "The cat is sleeping" -> [("The", "DT"), ("cat", "NN"), ("is",
"VBZ"), ("sleeping", "VBG")]
 Dependency Parsing: "The cat is sleeping" -> [(2, 1, 'det'), (2, 3, 'nsubj'), (3, 0,
'root'), (3, 4, 'ccomp')]
 Constituency Parsing: "(S (NP (DT The) (NN cat)) (VP (VBZ is) (VP (VBG
sleeping))))"

Understanding morphology and parsing is crucial for various NLP tasks such as information
extraction, sentiment analysis, machine translation, and question answering. These concepts
enable NLP systems to analyze and understand the structure of text, leading to more accurate
and meaningful language processing.

Computational morphology & Parts-of-speech Tagging: basic concepts;


Computational Morphology:
1. Definition:
 Computational morphology involves the study and development of algorithms and
techniques for analyzing the internal structure of words in natural language.
2. Basic Concepts:
 Tokenization: Process of breaking text into individual words or tokens.
 Stemming: Technique for reducing words to their base or root form by removing
affixes.
 Lemmatization: Process of determining the lemma or base form of a word
considering its context and part of speech.
 Morphological Analysis: Identifying and analyzing morphemes, which are the
smallest meaningful units of language (e.g., prefixes, suffixes, roots).
Parts-of-Speech Tagging (POS Tagging):
1. Definition:
 Parts-of-speech tagging, also known as POS tagging or grammatical tagging, is the
process of assigning grammatical categories (tags) to words in a sentence based on
their syntactic roles.
2. Basic Concepts:
 POS Tags: Each word in a sentence is assigned a tag representing its grammatical
category, such as noun (N), verb (V), adjective (ADJ), etc.
 Tagset: Collection of tags used for POS tagging, often based on linguistic theories or
standards like the Penn Treebank tagset.
 Tagging Algorithms: Various algorithms are used for POS tagging, including rule-
based approaches, statistical models like Hidden Markov Models (HMMs), and neural
network-based models.
 Applications: POS tagging is fundamental to many NLP tasks, such as syntactic
parsing, information extraction, and machine translation.

These basic concepts provide a foundation for understanding computational morphology and
parts-of-speech tagging in natural language processing. They are essential for processing and
analyzing text data effectively.

Tagset:
A tagset is a collection of tags or labels used for annotating words in natural language text
data. In the context of parts-of-speech tagging (POS tagging), a tagset consists of a
predefined set of grammatical categories or tags that are assigned to words in a sentence
based on their syntactic roles. Here are some key points about tagsets:

1. Purpose:
 Tagsets are used to represent the grammatical properties of words in text data,
facilitating linguistic analysis and computational processing.
2. Granularity:
 Tagsets vary in granularity, ranging from coarse-grained to fine-grained. Coarse-
grained tagsets may have fewer categories, while fine-grained tagsets provide more
detailed distinctions.
3. Standardization:
 Tagsets are often standardized based on linguistic theories or corpora annotations to
ensure consistency across different NLP tasks and systems.
 Common standards include the Penn Treebank tagset, Universal POS tagset, and
Brown Corpus tagset.
4. Examples:
 Penn Treebank Tagset: Includes tags such as NN (noun, singular), VB (verb, base
form), JJ (adjective), etc.
 Universal POS Tagset: Provides a simplified set of tags that are language-
independent, such as NOUN, VERB, ADJ, etc.
 Brown Corpus Tagset: Used in the Brown Corpus, a historical corpus of English text,
with tags like NN (noun), VB (verb, base form), JJ (adjective), etc.
5. Usage:
 Tagsets are applied during the annotation or tagging process in NLP tasks like POS
tagging, named entity recognition (NER), and syntactic parsing.
 They serve as the standard vocabulary for representing linguistic features in
computational models and algorithms.
6. Customization:
 Depending on the specific requirements of a task or domain, tagsets can be
customized or extended to include additional categories or tags that are relevant to the
application.

Understanding the tagset used in a particular NLP task is essential for interpreting the
annotations and analyzing the syntactic structure of text data effectively. It provides the
foundation for various downstream tasks and applications in natural language processing.

Lemmatization
1. Role in Morphology:
 Lemmatization standardizes words by reducing them to their base or dictionary forms.
 It aids in analyzing the internal structure of words and extracting meaningful
morphemes.
2. Role in Parsing:
 Lemmatization normalizes words before syntactic analysis in parsing.
 Lemmatized words are used as input to parsing algorithms for constructing parse
trees.
3. Integration:
 Lemmatization is part of the preprocessing phase in NLP pipelines, alongside tasks
like tokenization and stemming.
 It simplifies parsing tasks by resolving ambiguity and improving accuracy.
4. Example:
 Original Sentence: "The cats are running around the houses."
 Lemmatized Sentence: "The cat be run around the house."
 Lemmatized forms are used for further parsing tasks to analyze the sentence's
syntactic structure.

Early approaches: Rule-based and TBL


1. Rule-Based Approach:
 Definition: Involves using manually crafted linguistic rules to perform tasks such as
lemmatization, POS tagging, and parsing.
 Operation: Linguists or domain experts develop rules based on linguistic principles
to handle various language phenomena.
 Advantages: Offers transparency and interpretability as rules are explicit and human-
readable.
 Limitations: Limited scalability and coverage, requiring extensive manual effort to
develop and maintain rules for different languages and domains.
2. Transformation-Based Learning (TBL):
 Definition: TBL is a machine learning approach for tasks like lemmatization and POS
tagging, where transformation rules are learned from annotated data.
 Operation: The system iteratively learns transformation rules by comparing predicted
outputs with annotated data and updating rules to minimize errors.
 Advantages: Automates the rule generation process and adapts to new data,
improving performance over time.
 Limitations: Relies heavily on the quality and representativeness of annotated data,
and may suffer from overfitting if not properly regularized.

In summary, the rule-based approach relies on manually crafted linguistic rules, while TBL
automates rule generation through machine learning techniques. Each approach has its
strengths and limitations, influencing their applicability in various NLP tasks and contexts.
POS Tagging using HMM (Hidden Markov Model):

1. Definition:
 Hidden Markov Model (HMM) is a probabilistic model commonly used for sequence
labeling tasks such as POS tagging.
 In POS tagging, HMM models the sequence of words in a sentence as a sequence of
hidden states (POS tags) emitting observable symbols (words).
2. Operation:
 HMM consists of two main components: transition probabilities between hidden
states (POS tags) and emission probabilities of observing words given the hidden
states.
 Given a sequence of words (observations), HMM computes the most likely sequence
of hidden states (POS tags) using the Viterbi algorithm.
3. Training:
 HMM for POS tagging is trained on annotated corpora, where each word is tagged
with its correct POS tag.
 Transition and emission probabilities are estimated from the training data using
maximum likelihood estimation or other statistical methods.
4. Advantages:
 HMM can capture sequential dependencies between POS tags, such as noun phrases
or verb phrases.
 It is computationally efficient and can handle variable-length sequences.
5. Limitations:
 HMM assumes that each observation (word) depends only on the corresponding
hidden state (POS tag), which may not always hold true in natural language.
 It does not capture long-range dependencies or context beyond neighboring words.
6. Example:
 Given the sentence "The cat is sleeping", HMM computes the most likely sequence of
POS tags (e.g., DT NN VBZ VBG) for each word.

Overall, POS tagging using HMM is a widely used approach in NLP for assigning
grammatical categories to words in a sentence. While it has limitations, it remains effective
for many applications and serves as a foundational technique in sequence labeling tasks.
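As a concrete illustration of the Viterbi decoding described above, here is a self-contained Python sketch; the tiny transition and emission tables are illustrative assumptions, not probabilities estimated from a real corpus.

```python
# A self-contained sketch of Viterbi decoding for HMM POS tagging. The toy
# start/transition/emission probabilities are illustrative assumptions.
import math

tags = ["DT", "NN", "VBZ", "VBG"]
start_p = {"DT": 0.6, "NN": 0.2, "VBZ": 0.1, "VBG": 0.1}
trans_p = {
    "DT": {"NN": 0.9, "VBZ": 0.05, "VBG": 0.03, "DT": 0.02},
    "NN": {"VBZ": 0.6, "NN": 0.2, "VBG": 0.1, "DT": 0.1},
    "VBZ": {"VBG": 0.7, "DT": 0.2, "NN": 0.08, "VBZ": 0.02},
    "VBG": {"DT": 0.4, "NN": 0.3, "VBZ": 0.2, "VBG": 0.1},
}
emit_p = {
    "DT": {"the": 0.7},
    "NN": {"cat": 0.5, "sleeping": 0.1},
    "VBZ": {"is": 0.8},
    "VBG": {"sleeping": 0.6},
}

def viterbi(words):
    """Return the most likely tag sequence for `words` under the toy HMM."""
    # Each chart cell stores (log score of best path, best path so far).
    V = [{t: (math.log(start_p[t] + 1e-12) + math.log(emit_p[t].get(words[0], 0) + 1e-12), [t])
          for t in tags}]
    for w in words[1:]:
        layer = {}
        for t in tags:
            best_prev = max(tags, key=lambda p: V[-1][p][0] + math.log(trans_p[p][t] + 1e-12))
            score = (V[-1][best_prev][0] + math.log(trans_p[best_prev][t] + 1e-12)
                     + math.log(emit_p[t].get(w, 0) + 1e-12))
            layer[t] = (score, V[-1][best_prev][1] + [t])
        V.append(layer)
    return max(V[-1].values(), key=lambda x: x[0])[1]

print(viterbi(["the", "cat", "is", "sleeping"]))  # expected: ['DT', 'NN', 'VBZ', 'VBG']
```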

Introduction to POS Tagging using Neural Models:

1. Neural Models:
 Neural models leverage neural network architectures to learn distributed
representations of words and capture complex relationships in text data.
 These models have gained popularity in various NLP tasks due to their ability to
capture contextual information and generalize well across different domains.
2. POS Tagging with Neural Models:
 In POS tagging, neural models learn to predict the POS tags of words based on their
surrounding context in a sentence.
 Instead of relying on handcrafted features or linguistic rules, neural models learn
representations of words and their contexts directly from annotated data.
3. Operation:
 Neural POS tagging models typically use recurrent neural networks (RNNs), long
short-term memory networks (LSTMs), or transformer architectures like BERT.
 The model takes a sequence of word embeddings or contextualized word
representations as input and produces a sequence of predicted POS tags as output.
4. Training:
 Neural POS tagging models are trained on large annotated corpora using supervised
learning techniques.
 During training, the model learns to minimize a loss function that measures the
discrepancy between predicted POS tags and the ground truth annotations.
5. Advantages:
 Neural models can capture long-range dependencies and contextual information,
leading to more accurate POS tagging results.
 They can generalize well to unseen words or contexts and perform effectively across
different languages and domains.
6. Limitations:
 Neural models require large amounts of annotated data for training, which may be
resource-intensive to collect and annotate.
 They can be computationally expensive to train and may require substantial
computational resources.
7. Example:
 A neural POS tagging model takes a sentence as input, processes it through a neural
network architecture, and outputs a sequence of POS tags for each word in the
sentence.

In summary, POS tagging using neural models represents a state-of-the-art approach in NLP,
offering significant improvements in accuracy and generalization compared to traditional
rule-based or statistical methods. These models have become an essential component of many
NLP pipelines and applications due to their ability to learn complex patterns and relationships
from data.
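As a concrete illustration, here is a minimal sketch of a BiLSTM POS tagger in PyTorch that predicts one tag per token; the two-sentence training set, tag inventory, and hyperparameters are illustrative assumptions.

```python
# A minimal sketch of neural POS tagging with a BiLSTM in PyTorch: each token
# gets a tag prediction from its contextual representation. The tiny training
# "corpus" and tag inventory are illustrative assumptions.
import torch
import torch.nn as nn

train_data = [(["the", "cat", "is", "sleeping"], ["DT", "NN", "VBZ", "VBG"]),
              (["the", "dog", "is", "running"], ["DT", "NN", "VBZ", "VBG"])]
words = {w for sent, _ in train_data for w in sent}
word2idx = {w: i for i, w in enumerate(sorted(words))}
tag2idx = {"DT": 0, "NN": 1, "VBZ": 2, "VBG": 3}
idx2tag = {i: t for t, i in tag2idx.items()}

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, tagset_size)  # 2x for both directions

    def forward(self, x):
        hidden, _ = self.lstm(self.embed(x))
        return self.out(hidden)  # one tag distribution per token

model = BiLSTMTagger(len(word2idx), len(tag2idx))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

for epoch in range(100):
    for sent, tag_seq in train_data:
        x = torch.tensor([[word2idx[w] for w in sent]])
        y = torch.tensor([tag2idx[t] for t in tag_seq])
        optimizer.zero_grad()
        loss = criterion(model(x).squeeze(0), y)
        loss.backward()
        optimizer.step()

# Tag a held-out combination of known words to check the fitted toy model.
with torch.no_grad():
    x = torch.tensor([[word2idx[w] for w in ["the", "cat", "is", "running"]]])
    print([idx2tag[i] for i in model(x).squeeze(0).argmax(dim=1).tolist()])
```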

Parsing Basic Concepts: Top-Down and Bottom-Up Parsing

1. Definition:
 Parsing is the process of analyzing the grammatical structure of a sentence according
to a formal grammar.
 Top-down and bottom-up parsing are two common strategies used to construct parse
trees representing the structure of a sentence.
2. Top-Down Parsing:
 Operation: Top-down parsing starts with the highest-level rule of the grammar (e.g.,
the start symbol) and applies production rules to expand it into more specific
substructures.
 Procedure: It recursively applies grammar rules, starting from the top (e.g., the
sentence) and moving down to the leaves (e.g., words).
 Advantages: Provides a systematic and deterministic approach, ensuring that the
parsing process follows the grammar rules strictly.
 Limitations: May suffer from inefficiency or backtracking if the grammar is
ambiguous or if incorrect choices are made during parsing.
3. Bottom-Up Parsing:
 Operation: Bottom-up parsing starts with the individual words of the sentence and
applies production rules to build higher-level structures, eventually constructing the
entire sentence.
 Procedure: It applies grammar rules in reverse, starting from the words and working
up to the top-level structure.
 Advantages: More flexible and robust, as it can handle a wider range of grammars
and ambiguities.
 Limitations: May result in multiple possible parse trees for a given sentence,
requiring additional disambiguation techniques.
4. Comparison:
 Top-Down: Begins with the start symbol and expands it into more specific
substructures.
 Bottom-Up: Begins with individual words and constructs higher-level structures
through rule application.
5. Example:
 Consider the grammar rule S → NP VP, where S represents a sentence, NP represents
a noun phrase, and VP represents a verb phrase.
 Top-Down Parsing: Starts with S and expands it into NP VP.
 Bottom-Up Parsing: Starts with individual words and combines them into NP VP,
eventually forming S.

In summary, top-down and bottom-up parsing are complementary strategies for analyzing the
grammatical structure of sentences. Top-down parsing starts from the top-level structure and
expands downwards, while bottom-up parsing starts from the words and builds upwards.
Each approach has its strengths and weaknesses, making them suitable for different types of
grammars and parsing tasks.
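As a concrete illustration of the top-down strategy, here is a minimal recursive-descent recognizer in Python; the toy grammar and lexicon are illustrative assumptions.

```python
# A minimal sketch of top-down (recursive-descent) recognition for the toy
# grammar S -> NP VP, NP -> DT NN, VP -> VBZ. Grammar and lexicon are toy.
grammar = {
    "S": [["NP", "VP"]],
    "NP": [["DT", "NN"]],
    "VP": [["VBZ"]],
    "DT": [["the"]],
    "NN": [["cat"]],
    "VBZ": [["sleeps"]],
}

def parse(symbol, words, pos):
    """Try to expand `symbol` starting at index `pos`; return reachable end positions."""
    # A terminal matches a single word directly.
    if symbol not in grammar:
        return [pos + 1] if pos < len(words) and words[pos] == symbol else []
    positions = []
    for production in grammar[symbol]:       # try each rule for this non-terminal
        frontier = [pos]
        for child in production:             # expand children left to right
            frontier = [p2 for p in frontier for p2 in parse(child, words, p)]
        positions.extend(frontier)
    return positions

words = ["the", "cat", "sleeps"]
print(len(words) in parse("S", words, 0))    # True: the whole sentence derives from S
```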

Treebank:

 Definition: Corpus of text data annotated with syntactic structures, typically represented as
parse trees.
 Components: Contains text data annotated with parse trees representing the syntactic
structure.
 Representation: Standardized representation based on formal grammar formalisms like
context-free grammars (CFGs).
 Uses: Used for training and evaluating syntactic parsers, linguistic research, and developing
new parsing algorithms.
 Example Treebanks: Penn Treebank (PTB), Universal Dependencies (UD), etc.
 Challenges: Ensuring annotation consistency, coverage of linguistic phenomena, and domain
adequacy.

Syntactic Parsing: CKY Parsing

 Definition: CKY (Cocke-Kasami-Younger) parsing is a dynamic programming algorithm
used to parse sentences according to a context-free grammar (CFG), typically one converted
to Chomsky Normal Form.
 Operation: CKY parsing constructs a chart that represents all possible subtrees of a sentence
and efficiently computes the most likely parse tree using dynamic programming techniques.
 Chart Representation: The chart is a 2D matrix where each cell represents a span of the
sentence and contains potential constituents or partial parse trees for that span.
 Algorithm Steps:
1. Initialization: Fill the chart's diagonal cells with lexical items (words) and their
corresponding non-terminal symbols (POS tags).
2. Filling: Iterate over the chart diagonally, combining smaller constituents to form
larger ones based on CFG rules.
3. Completion: Continue filling the chart until the top-right cell is reached, representing
the entire sentence.
 Efficiency: CKY parsing has a time complexity of O(n^3 * |G|), where n is the length of the
sentence and |G| is the size of the grammar. However, due to its dynamic programming
approach, it avoids redundant computations and is more efficient than naive parsing
algorithms.
 Applications: CKY parsing is commonly used in natural language processing tasks such as
syntactic parsing, machine translation, and grammar checking.
 Example: Given a sentence "The cat sleeps", CKY parsing constructs a chart with potential
constituents like NP (noun phrase) and VP (verb phrase), eventually deriving the most likely
parse tree for the sentence; a code sketch of a CKY recognizer follows below.
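Below is a compact Python sketch of CKY recognition for a toy grammar in Chomsky Normal Form; the grammar and lexicon are illustrative assumptions.

```python
# A compact sketch of the CKY recognition algorithm for a toy grammar in
# Chomsky Normal Form. The grammar and lexicon below are illustrative.
from itertools import product

# Binary rules: (left child, right child) -> parent
rules = {("NP", "VP"): "S", ("DT", "NN"): "NP"}
# Lexical rules: word -> set of preterminals
lexicon = {"the": {"DT"}, "cat": {"NN"}, "sleeps": {"VP"}}

def cky_recognize(words):
    n = len(words)
    # chart[i][j] holds the set of non-terminals spanning words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                      # initialization: diagonal cells
        chart[i][i + 1] = set(lexicon.get(w, set()))
    for span in range(2, n + 1):                       # filling: progressively longer spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                  # split point
                for left, right in product(chart[i][k], chart[k][j]):
                    if (left, right) in rules:
                        chart[i][j].add(rules[(left, right)])
    return "S" in chart[0][n]                          # completion: does S span the sentence?

print(cky_recognize(["the", "cat", "sleeps"]))  # True
```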

Statistical Parsing Basics: Probabilistic Context-Free Grammar (PCFG)

 Definition: Probabilistic Context-Free Grammar (PCFG) is an extension of context-free
grammar (CFG) where each production rule is associated with a probability.
 Components:
 Non-terminal Symbols: Represent syntactic categories (e.g., NP for noun phrase).
 Terminal Symbols: Correspond to words in the language.
 Production Rules: Rules defining how non-terminal symbols can be expanded into
sequences of terminal and/or non-terminal symbols, each associated with a
probability.
 Probabilistic Nature: PCFG assigns probabilities to production rules, reflecting the
likelihood of generating certain linguistic structures.
 Training: PCFGs are trained on annotated treebanks, where each parse tree is used to
estimate the probabilities of production rules.
 Rule Probabilities:
 Rule probabilities are typically estimated using maximum likelihood estimation
(MLE) or other statistical methods.
 They represent the relative frequency of observing a particular rule in the training
data.
 Parsing with PCFGs:
 PCFGs can be used to parse sentences probabilistically, generating parse trees based
on the likelihood of different syntactic structures.
 Probabilistic CKY parsing is a common algorithm used with PCFGs to find the most
likely parse tree for a given sentence.
 Applications:
 PCFGs are used in various natural language processing tasks, including syntactic
parsing, machine translation, and grammar checking.
 They provide a probabilistic framework for modeling syntactic ambiguity and
uncertainty in natural language.
 Example:
 Given a PCFG trained on a treebank, the rule S → NP VP might have a probability of
0.3, indicating that 30% of sentences in the training data have a noun phrase followed
by a verb phrase as their top-level structure.

In summary, PCFGs extend traditional CFGs by associating probabilities with production
rules, enabling the modeling of uncertainty in syntactic structures and providing a foundation
for statistical parsing algorithms.

Probabilistic CKY Parsing of PCFGs:

 Definition: Probabilistic CKY Parsing is an extension of the CKY parsing algorithm
specifically designed to handle Probabilistic Context-Free Grammars (PCFGs). It efficiently
computes the most likely parse tree for a given sentence based on the probabilities associated
with production rules in the PCFG.
 Algorithm:
 Probabilistic CKY Parsing is based on dynamic programming principles, similar to
the original CKY parsing algorithm.
 At each step of the parsing process, the algorithm computes the probability of each
constituent in the parse chart based on the probabilities of the production rules in the
PCFG.
 It selects the most probable constituents for each span of the sentence, taking into
account the probabilities of combining smaller constituents to form larger ones.
 Dynamic Programming:
 Probabilistic CKY Parsing efficiently computes the probabilities of constituents using
dynamic programming techniques.
 It avoids redundant computations by memoizing intermediate results and reusing them
in subsequent computations.
 Probabilistic Scores:
 Each constituent in the parse chart is associated with a probabilistic score,
representing the likelihood of generating that constituent according to the PCFG.
 The probabilistic scores are computed recursively based on the probabilities of
production rules and the scores of constituent combinations.
 Backpointers:
 Along with the probabilistic scores, backpointers are maintained to reconstruct the
most likely parse tree once parsing is complete.
 Backpointers are used to trace the path of the most likely constituents back to the root
of the parse tree.
 Efficiency:
 Probabilistic CKY Parsing has a time complexity of O(n^3 * |G|), where n is the
length of the sentence and |G| is the size of the grammar.
 Despite its cubic time complexity, the algorithm is efficient in practice due to its
dynamic programming approach and avoids exponential growth in computation.
 Applications:
 Probabilistic CKY Parsing is used in various natural language processing tasks where
probabilistic parsing is required, such as syntactic parsing, machine translation, and
grammar checking.
 It provides a principled approach to parse sentences probabilistically based on the
likelihood of different syntactic structures according to the PCFG (see the NLTK sketch
below).
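As referenced above, here is a minimal sketch of probabilistic parsing with NLTK's PCFG grammar class and ViterbiParser (a probabilistic CKY-style parser); the toy grammar and its rule probabilities are illustrative assumptions, not values estimated from a treebank.

```python
# A minimal sketch of probabilistic parsing with NLTK's PCFG and ViterbiParser.
# The toy grammar and its probabilities are illustrative assumptions.
from nltk import PCFG
from nltk.parse import ViterbiParser

grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> DT NN [1.0]
    VP -> VBZ [0.4] | VBZ VBG [0.6]
    DT -> 'the' [1.0]
    NN -> 'cat' [1.0]
    VBZ -> 'is' [1.0]
    VBG -> 'sleeping' [1.0]
""")

parser = ViterbiParser(grammar)
for tree in parser.parse(["the", "cat", "is", "sleeping"]):
    print(tree)          # most probable parse tree
    print(tree.prob())   # its probability under the PCFG
```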
