
KRISHNA INSTITUTE OF

TECHNOLOGY, KANPUR

QUESTION ANSWER BANK

NATURAL LANGUAGE PROCESSING (KOE-088)


[Link] (CSE IV YEAR)

SUBMITTED BY
Mr. Anuj Khanna (Asst. Professor)


UNIT WISE COURSE CONTENT AS PER AKTU SYLLABUS
NATURAL LANGUAGE PROCESSING

Unit-1 (20 questions): Introduction to Natural Language Understanding: The study of Language, Applications of NLP, Evaluating Language Understanding Systems, Different levels of Language Analysis, Representations and Understanding, Organization of Natural Language Understanding Systems, Linguistic Background: An outline of English syntax.

Unit-2 (20 questions): Introduction to semantics and knowledge representation, some applications like machine translation, database interface.

Unit-3 (12 questions): Grammars and Parsing: Grammars and Sentence Structure, Top-Down and Bottom-Up Parsers, Transition Network Grammars, Top-Down Chart Parsing. Feature Systems and Augmented Grammars: Basic Feature System for English, Morphological Analysis and the Lexicon, Parsing with Features, Augmented Transition Networks.

Unit-4 (10 questions): Grammars for Natural Language: Auxiliary Verbs and Verb Phrases, Movement Phenomenon in Language, Handling Questions in Context-Free Grammars, Human Preferences in Parsing, Encoding Uncertainty, Deterministic Parser.

Unit-5 (19 questions): Ambiguity Resolution: Statistical Methods, Probabilistic Language Processing, Estimating Probabilities, Part-of-Speech Tagging, Obtaining Lexical Probabilities, Probabilistic Context-Free Grammars, Best-First Parsing. Semantics and Logical Form, Word Senses and Ambiguity, Encoding Ambiguity in Logical Form.



KRISHNA INSTITUTE OF TECHNOLOGY NATURAL LANGUAGE PROCESSING ( UNIT 1)

UNIT -1

• Introduction to Natural Language Understanding
• The study of Language, Applications of NLP
• Evaluating Language Understanding Systems
• Different levels of Language Analysis
• Representations and Understanding
• Organization of Natural Language Understanding Systems
• Linguistic Background: An outline of English syntax


SHORT ANSWER TYPE QUESTIONS

Ques 1. What is language modeling?

Ans. Language modeling is central to many important natural language processing tasks. The notion of a language model is inherently probabilistic: a language model is a function that puts a probability measure over strings drawn from some vocabulary, i.e., a probability distribution over words or word sequences. In practice, a language model gives the probability of a certain word sequence being "valid", where validity refers to how likely the sequence is in natural usage, not to grammatical correctness.
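This idea can be sketched in a few lines of Python. The toy corpus and the add-one smoothing below are illustrative choices, not from the text: a bigram model assigns each sequence a probability equal to the product of its conditional word probabilities.

```python
from collections import Counter

# Toy corpus; a real language model is trained on a large corpus.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
vocab = len(unigrams)

def bigram_prob(w1, w2):
    """P(w2 | w1) with add-one (Laplace) smoothing."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab)

def sequence_prob(words):
    """Probability of a word sequence under the bigram model."""
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= bigram_prob(w1, w2)
    return p

print(sequence_prob("the cat sat".split()))  # higher: its bigrams were seen
print(sequence_prob("sat the cat".split()))  # lower: contains unseen bigrams
```

Sequences whose bigrams occur in the corpus score higher than unseen orderings, which is exactly the probabilistic notion of "validity" described above.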

Ques 2. What do you mean by the term 'linguistics'?

Ans. Linguistics is the scientific study of language. Linguists (experts in linguistics) work on specific languages, but their primary goal is to understand the nature of language in general by asking questions such as:

• What distinguishes human language from other animal communication systems?
• What features are common to all human languages?
• How are the modes of linguistic communication (speech, writing, sign language) related to each other?
• How is language related to other types of human behavior?

Ques 3. Define the terms 'Lexicon' and 'Morpheme'.

Ans: A lexicon is a dictionary or the vocabulary of a language, a people, or a subject. The lexicon is the central knowledge base of linguistic meanings; any expansion or extension of linguistic meaning rides on the construction of larger structures out of the elements of the lexicon.
The lexicon of a natural language contains all of its lexical items, that is, words. In a certain sense, the lexicon of any natural language is its stock of unique and irregular pieces of information. Initially the term "lexicon" was used to characterize a list of the morphemes of a specific language, as distinct from a word list. A morpheme is the smallest meaningful unit of a language.
As the ideas of transformational generative grammar developed, some researchers began to treat the lexicon as a component of the generative language model, playing an auxiliary role with respect to the grammar. The word was defined as a meaningful unit that can be identified in a syntactic chain, and the lexicon was seen as a list of indivisible finite elements regulated by morpho-lexical rules.

Ques 4. What are 'Stemming' and 'Lemmatization'?

Ans: Stemming is used to normalize words into their base or root form. E.g., 'celebrates', 'celebrated', and 'celebrating' all reduce to the single root word 'celebrate'.


Lemmatization is quite similar to stemming. It usually refers to doing things properly with the use of a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only and to return the base or dictionary form of a word, known as the lemma.
If confronted with the token 'saw', stemming might return just 's', whereas lemmatization would attempt to return either 'see' or 'saw' depending on whether the token was used as a verb or a noun. The two may also differ in that stemming most commonly collapses derivationally related words, whereas lemmatization commonly collapses only the different inflectional forms of a lemma. Linguistic processing for stemming or lemmatization is often done by an additional plug-in component to the indexing process, and a number of such components exist, both commercial and open-source.
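The contrast can be sketched in plain Python. The suffix list and the lemma table below are illustrative stand-ins for a real stemmer (such as Porter's) and a real morphological dictionary:

```python
# A crude suffix-stripping stemmer versus a tiny dictionary-based
# lemmatizer. The suffix list is illustrative, not Porter's rules.
SUFFIXES = ["ing", "ed", "es", "s"]
LEMMAS = {("saw", "VERB"): "see",           # irregular verb form
          ("saw", "NOUN"): "saw",           # the tool: already a lemma
          ("celebrates", "VERB"): "celebrate"}

def stem(word):
    """Strip the first matching suffix; no vocabulary is consulted."""
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 1:
            return word[: -len(suf)]
    return word

def lemmatize(word, pos):
    """Look the word up by part of speech; fall back to the word itself."""
    return LEMMAS.get((word, pos), word)

print(stem("celebrating"))        # celebrat — a stem, not a real word
print(lemmatize("saw", "VERB"))   # see
print(lemmatize("saw", "NOUN"))   # saw
```

Note how the stemmer produces a truncated string without consulting any dictionary, while the lemmatizer needs both a vocabulary and the word's part of speech.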

Ques 5. What is NER (Named Entity Recognition)?

Ans: Named-entity recognition (NER), also known as (named) entity identification, entity chunking, and entity extraction, is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.
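A toy illustration of the idea, using a hand-made gazetteer (a fixed entity list — an assumption made for this sketch; real NER systems use statistical or neural sequence models):

```python
import re

# Minimal gazetteer-based entity tagger.
GAZETTEER = {
    "Michael Jordan": "PERSON",
    "United States": "LOCATION",
    "IBM": "ORGANIZATION",
}

def tag_entities(text):
    """Return (entity, label) pairs found in the text."""
    found = []
    for name, label in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, label))
    return found

print(tag_entities("Michael Jordan lives in the United States"))
```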

Ques 6. Mention some application areas of NLP.

Ans:
• Speech recognition
• Machine translation
• Text summarization
• Auto-correction and detection
• Email filtering
• Sentiment analysis
• Social media analysis

LONG ANSWER TYPE QUESTIONS

Ques 7. What are the major disciplines used in studying language? Briefly explain each of them.

Ans: The major disciplines used in the study of a language are as follows:

(i) Linguistics: How do words form phrases and sentences? What constrains the possible meanings of a sentence? Its tools are intuitions about well-formedness and meaning, and mathematical models of structure (for example, formal language theory and model-theoretic semantics).

(ii) Psycholinguistics: How do people identify the structure of sentences? How are word meanings identified? When does understanding take place?


(iii) Philosophy: What is meaning, and how do words and sentences acquire it? How do words identify objects in the world? Its tools are natural language argumentation using intuitions about counterexamples, and mathematical models (for example, logic and model theory).

(iv) Computational Linguistics: How is the structure of sentences identified? How can knowledge and reasoning be modeled? How can language be used to accomplish specific tasks? Its tools are algorithms and data structures, formal models of representation and reasoning, and AI techniques (search and representation methods).

Ques 8. Briefly explain the history of natural language processing.


Ans. The field of natural language processing has been around for nearly 70 years. Perhaps most famously, Alan Turing laid the foundation for the field by developing the Turing test in 1950. The Turing test is a test of a machine's ability to demonstrate intelligence that is indistinguishable from that of a human. For the machine to pass the Turing test, it must generate human-like responses such that a human evaluator would not be able to tell whether the responses were generated by a human or a machine (i.e., the machine's responses are of human quality).
• Like the broader field of artificial intelligence, NLP has had many booms and busts, lurching from hype cycles to AI winters. In 1954, Georgetown University and IBM successfully built a system that could automatically translate more than 60 Russian sentences into English. At the time, researchers at Georgetown University thought machine translation would be a solved problem within three to five years.
• The success in the US also spurred the Soviet Union to launch similar efforts. The Georgetown-IBM success, coupled with the Cold War mentality, led to increased funding for NLP in these early years.
• However, by 1966, progress had stalled, and the Automatic Language Processing Advisory Committee (ALPAC), a US government committee set up to evaluate progress in computational linguistics, issued a critical report. The report led to a reduction in funding for machine translation research.
• Despite these setbacks, the field of NLP reemerged in the 1970s. By the 1980s, computational power had increased significantly and costs had come down sufficiently, opening up the field to many more researchers around the world.
• In the late 1980s, NLP rose in prominence again with the release of the first statistical machine translation systems, led by researchers at IBM's Thomas J. Watson Research Center. Prior to the rise of statistical machine translation, machine translation relied on human-handcrafted rules for language. These systems were called rule-based machine translation. The rules helped correct and control the mistakes that machine translation systems typically made, but crafting such rules was a laborious and painstaking process.


• By the mid-1980s, IBM had applied a statistical approach to speech recognition and launched a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary.
• DARPA, Bell Labs, and Carnegie Mellon University also had similar successes by the late 1980s. Speech recognition software systems by then had larger vocabularies than the average human and could handle continuous speech recognition, a milestone in the history of speech recognition.
• Today's NLP heavyweights, such as Google, hired their first speech recognition employees in 2007. The US government also got involved then; the National Security Agency began tagging large volumes of recorded conversations for specific keywords, facilitating the search process for NSA analysts.
• By the early 2010s, NLP researchers, both in academia and industry, began experimenting with deep neural networks for NLP tasks. Early deep learning-led successes came from a method called long short-term memory (LSTM).
• In 2015, Google used such a method to revamp Google Voice.
• NLP made waves from 2014 onward with the release of Amazon Alexa, a revamped Apple Siri, Google Assistant, and Microsoft Cortana.
• Google also launched a much-improved version of Google Translate in 2016, and chatbots and voice bots are now much more commonplace.
• That being said, it wasn't until 2018 that NLP had its very own ImageNet moment, with the release of large pre-trained language models trained using the Transformer architecture; the most notable of these was Google's BERT, launched in November 2018.
• In 2019, generative models such as OpenAI's GPT-2 made a splash, generating new content on the fly based on previous content, a previously insurmountable feat.
• In 2020, OpenAI released an even larger and more impressive version, GPT-3, building on its previous successes.
• Heading into 2021 and beyond, NLP is no longer an experimental subfield of AI; along with computer vision, it has become a mainstream area of commercial application.

Ques 9. Category-wise, explain in detail the various applications of an NLU system.

Ans. The applications can be divided into two major classes:
(i) Text-based applications
(ii) Dialogue-based applications

Text-based applications: These involve the processing of written text, such as books, newspapers, reports, manuals, e-mail messages, and so on. These are all reading-based tasks. Text-based natural language research is ongoing in applications such as:
• Finding appropriate documents on certain topics from a database of texts (for example, finding relevant books in a library)

• Extracting information from messages or articles on certain topics (for example, building a database of all stock transactions described in the news on a given day)
• Translating documents from one language to another (for example, producing automobile repair manuals in many different languages)
• Summarizing texts for certain purposes (for example, producing a 3-page summary of a 1000-page government report)

Some machine translation systems have been built that are based on pattern matching, that is, a sequence of words in
one language is associated with a sequence of words in another language. Translation is accomplished by finding the best
set of patterns that match the input and producing the associated output in the other language. This technique can produce
reasonable results in some cases but sometimes produces completely wrong translations because of its inability to use an
understanding of content to disambiguate word senses and sentence meanings appropriately.

One very attractive domain for text-based research is story understanding. In this task the system processes a story and
then must answer questions about it. This is similar to the type of reading comprehension tests used in schools and
provides a very rich method for evaluating the depth of understanding the system is able to achieve.

Dialogue-based applications: These involve human-machine communication. Most naturally this involves spoken language, but it also includes interaction using keyboards. Typical potential applications include:
• Question-answering systems, where natural language is used to query a database (for example, a query system to a personnel database)
• Automated customer service over the telephone (for example, to perform banking transactions or order items from a catalogue)
• Tutoring systems, where the machine interacts with a student (for example, an automated mathematics tutoring system)
• Spoken language control of a machine (for example, voice control of a VCR or computer)
• General cooperative problem-solving systems (for example, a system that helps a person plan and schedule freight shipments)

Text-to-speech and speech-to-text: Software is now able to convert text to high-fidelity audio very easily. For example, Google Cloud Text-to-Speech can convert text into human-like speech in more than 180 voices across over 30 languages. Likewise, Google Cloud Speech-to-Text can convert audio to text for over 120 languages, delivering a truly global offering.

Chatbots: If you have spent some time perusing websites recently, you may have noticed that more and more sites now have a chatbot that automatically chimes in to engage the human user. The chatbot usually greets the human in a friendly, non-threatening manner and then asks the user questions to gauge the purpose


and intent of the visit to the site. The chatbot then tries to respond automatically to any questions the user has without human intervention. Such chatbots are now automating digital customer engagement.

Voice bots : Ten years ago, automated voice agents were clunky. Unless humans responded in a fairly
constrained manner (e.g., with yes or no type responses), the voice agents on the phone could not process the
information. Now, AI voice bots like those provided by VOIQ are able to help augment and automate calls for
sales, marketing, and customer success teams.

Sentiment analysis (opinion mining): Opinion mining, or sentiment analysis, is a text analysis technique that uses computational linguistics and natural language processing to automatically identify and extract sentiment or opinion from text (positive, negative, neutral, etc.). With the explosion of social media content, there is an ever-growing need to automate customer sentiment analysis, dissecting tweets, posts, and comments for sentiment such as positive versus negative versus neutral, or angry versus sad versus happy. Such software is also known as emotion AI. It allows you to get inside your customers' heads and find out what they like and dislike, and why, so you can create products and services that meet their needs.
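The underlying idea can be sketched as a lexicon-based scorer. The word lists below are illustrative; production systems use trained classifiers, but the counting intuition is the same:

```python
# Minimal lexicon-based polarity scorer.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}

def sentiment(text):
    """Classify text by counting positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great product"))    # positive
print(sentiment("terrible service, I hate it"))  # negative
```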

Information extraction: One major challenge in NLP is creating structured data from unstructured and/or
semi-structured documents. For example, named entity recognition software is able to extract people,
organizations, locations, dates, and currencies from long-form texts such as mainstream news. Information
extraction also involves relationship extraction, identifying the relations between entities.

Ques 10. Define the term NLP. What are the various levels of analysis in an NLP system?
Ans. Natural language processing (NLP) is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
The term 'natural language processing' is normally used to describe the function of software or hardware components in a computer system that analyze or synthesize spoken or written language.

The Different Levels of Language Analysis:

1. Lexical Analysis: The first phase of NLP is lexical analysis. This phase scans the input text as a stream of characters and converts it into meaningful lexemes (tokens). It divides the whole text into paragraphs, sentences, and words.
2. Morphological Analysis: This concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language (for example, the meaning of the word "friendly" is derivable from the meaning of the noun "friend" and the suffix "-ly", which transforms a noun into an adjective).


3. Syntactic Analysis (Parsing): Syntactic analysis checks grammar and word arrangement and shows the relationships among the words. Example: "Agra goes to the Poonam."

In the real world, "Agra goes to the Poonam" does not make any sense, so this sentence is rejected by the syntactic analyzer.

4. Semantic Analysis: Semantic analysis is concerned with meaning representation. It mainly focuses on the literal meaning of words, phrases, and sentences.

5. Discourse Integration: Discourse integration depends upon the sentences that precede a given sentence and also invokes the meaning of the sentences that follow it.

6. Pragmatic Analysis: Pragmatic analysis is the last phase of NLP. It helps you to discover the intended effect by applying a set of rules that characterize cooperative dialogues.

For example: "Open the door" is interpreted as a request instead of an order.
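The first phase above, lexical analysis, can be sketched with two regular expressions (an illustrative tokenizer, not a production one): one splits the text into sentences, the other splits each sentence into word and punctuation tokens.

```python
import re

# Lexical analysis: split raw text into sentences, then into tokens.
def tokenize(text):
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Within each sentence, words and punctuation become separate tokens.
    return [re.findall(r"\w+|[^\w\s]", s) for s in sentences]

text = "Open the door. The cat sat on the mat!"
for sent in tokenize(text):
    print(sent)
```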

Ques 11. Differentiate between NLP and NLU. Explain the levels of knowledge representation in NLP.
Ans. Natural language processing (NLP) is actually made up of natural language understanding (NLU) and natural language generation (NLG). Natural language understanding is how the machine takes in the query or request from the user and uses sentiment analysis, part-of-speech tagging, topic classification, and other machine learning techniques to understand the intent of what the user has said. This also includes turning the unstructured data, the plain-language query, into structured data that can be used to query the data set.
Natural language generation is how the machine takes the results of the query and puts them together into easily understandable human language. Applications for these technologies include product descriptions, automated insights, and other business intelligence applications in the category of natural language search.


Levels of knowledge representation in NLP are as described below :


Phonetic And Phonological Knowledge : Phonetics is the study of language at the level of sounds while
phonology is the study of combination of sounds into organized units of speech, the formation of syllables and
larger units. Phonetic and phonological knowledge are essential for speech based systems as they deal with how
words are related to the sounds that realize them.

Morphological Knowledge: Morphology concerns word formation. It is the study of the patterns of formation of words by the combination of sounds into minimal distinctive units of meaning called morphemes. Morphological knowledge concerns how words are constructed from morphemes.

Syntactic Knowledge: Syntax is the level at which we study how words combine to form phrases, phrases
combine to form clauses and clauses join to make sentences. Syntactic analysis concerns sentence formation. It
deals with how words can be put together to form correct sentences. It also determines what structural role each
word plays in the sentence and what phrases are subparts of what other phrases.

Semantic Knowledge : It concerns meanings of the words and sentences. This is the study of context
independent meaning that is the meaning a sentence has, no matter in which context it is used. Defining the
meaning of a sentence is very difficult due to the ambiguities involved.

Pragmatic Knowledge: Pragmatics is the extension of the meaning or semantics. Pragmatics deals with the
contextual aspects of meaning in particular situations. It concerns how sentences are used in different situations
and how use affects the interpretation of the sentence.

Discourse Knowledge: Discourse concerns connected sentences. It is a study of chunks of language which are
bigger than a single sentence. Discourse language concerns inter-sentential links that is how the immediately
preceding sentences affect the interpretation of the next sentence. Discourse knowledge is important for
interpreting pronouns and temporal aspects of the information conveyed.

World Knowledge: World knowledge is the everyday knowledge that all speakers share about the world. It includes general knowledge about the structure of the world and what each language user must know about the other users' beliefs and goals. It is essential for making language understanding much better.

Ques 12. What is information extraction? Explain various sub tasks of information extraction.

Ans: The explosion of information, and the need for more sophisticated and efficient information-handling tools, gave rise to Information Extraction (IE) and Information Retrieval (IR) technology. An Information Extraction system takes natural language text as input and produces structured information, specified by certain criteria, that is relevant to a particular application.
Various sub-tasks of IE, such as Named Entity Recognition, Co-reference Resolution, Named Entity Linking, Relation Extraction, and Knowledge Base Reasoning, form the building blocks of various high-end Natural Language Processing (NLP)


tasks such as Machine Translation, Question-Answering Systems, Natural Language Understanding, Text Summarization, and Digital Assistants like Siri, Cortana, and Google Now.

(i) Parts-of-Speech (POS) tagging: In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc.

(ii) Parsing: This produces a syntactic analysis in the form of a tree that shows the phrases comprising the sentence and the hierarchy in which these phrases are associated. Constituency parsers have been used for pronoun resolution, labeling phrases with semantic roles, and assignment of functional category tags.

(iii) Named Entity Recognition (NER): The task is to find Persons (PER), Organizations (ORG), Locations (LOC), and Geo-Political Entities (GPE). For instance, in the statement "Michael Jordan lives in the United States", an NER system extracts "Michael Jordan", which refers to the name of a person, and "United States", which refers to the name of a country.
(iv) Named Entity Linking (NEL): Named Entity Linking (NEL), also known as Named Entity Disambiguation (NED) or Named Entity Normalization (NEN), is the task of identifying the entity that corresponds to a particular occurrence of a

noun in a text document.


(v) Co-reference Resolution (CR): Coreference resolution is the task of determining which noun phrases (including pronouns, proper names, and common names) refer to the same entities in a document. For instance, in the sentences "I have seen the annual report. It shows that we have gained 15% profit in this financial year", "I" refers to a person, "It" refers to the annual report, and "we" refers to the company in which that person works.
(vi) Temporal Information Extraction (Event Extraction): This is the task of identifying events (i.e., information which can be ordered temporally) in free text and deriving detailed and structured information about them, ideally identifying who did what to whom, where, when, and why.

(vii) Relation Extraction (RE): Relation extraction is the task of detecting and classifying pre-defined relationships between entities identified in the text.

(viii) Knowledge Base Reasoning and Completion: There are various applications based on link prediction, such as recommendation systems, knowledge base completion, and finding links between users in social networks. In recommendation systems, the goal is to predict ratings for movies that a user has not yet rated and recommend them to the user, for a better user experience.
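Sub-task (vii), relation extraction, can be illustrated with a single hand-written pattern; the pattern and the entity shapes it assumes (capitalized two-word person names, capitalized location names) are simplifications for this sketch:

```python
import re

# Pattern-based relation extraction: one hand-written pattern detects one
# pre-defined relation between two already-recognized entity shapes.
PATTERN = re.compile(
    r"(?P<per>[A-Z][a-z]+ [A-Z][a-z]+) lives in (?P<loc>[A-Z][a-zA-Z ]+)"
)

def extract_lives_in(text):
    """Return (PERSON, 'lives_in', LOCATION) triples matched by the pattern."""
    return [(m.group("per"), "lives_in", m.group("loc").strip())
            for m in PATTERN.finditer(text)]

print(extract_lives_in("Michael Jordan lives in United States"))
```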

Ques 13. Explain in detail the three main approaches to developing an NLP system.

Ans. The three dominant approaches to developing NLP systems today are rule-based, traditional machine learning (statistics-based), and neural network-based:
(i) Rule-based NLP (ii) Traditional (or classical) machine learning (iii) Neural networks

Rule based NLP


Traditional NLP software relies heavily on human-crafted rules of languages; domain experts, typically
linguists, curate these rules using things like regular expressions and pattern matching. Rule-based NLP
performs well in narrowly scoped-out use cases but typically does not generalize well. More and more rules are
necessary to generalize such a system, and this makes rule-based NLP a labor-intensive and brittle solution
compared to the other NLP approaches.
Here are examples of rules in a rule-based system: words ending in -ing are verbs, words ending in -er or -
est are adjectives, words ending in ’s are possessives, etc. Think of how many rules we would need to create by
hand to make a system that could analyze and process a large volume of natural language data. Not only would
the creation of rules be a mind-bogglingly difficult and tedious process, but we would also have to deal with the
many errors that would occur from using such rules. We would have to create rules for rules to address all the
corner cases for each and every rule.
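The suffix rules quoted above can be written down directly, which also shows why such systems are brittle: the very first counterexample breaks them.

```python
# The suffix rules from the paragraph, as a tiny rule-based tagger.
# Each rule is brittle: "morning" ends in -ing but is a noun.
def rule_tag(word):
    if word.endswith("'s"):
        return "POSSESSIVE"
    if word.endswith("ing"):
        return "VERB"
    if word.endswith(("er", "est")):
        return "ADJECTIVE"
    return "UNKNOWN"

for w in ["running", "fastest", "John's", "morning"]:
    print(w, rule_tag(w))   # "morning" is mis-tagged VERB by the -ing rule
```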


Traditional (or classical) machine learning based NLP


Traditional machine learning relies less on rules and more on data. It uses a statistical approach, drawing
probability distributions of words based on a large annotated corpus. Humans still play a meaningful role;
domain experts need to perform feature engineering to improve the machine learning model’s performance.
Features include capitalization, singular versus plural, surrounding words, etc. After creating these features,
you would have to train a traditional ML model to perform NLP tasks; e.g., text classification. Since
traditional ML uses a statistical approach to determine when to apply certain features or rules to process language, traditional ML-based NLP is easier to build and maintain than a rule-based system. It also generalizes better than rule-based NLP.

Neural networks based NLP


Neural networks address the shortcomings of traditional machine learning. Instead of requiring humans to
perform feature engineering, neural networks will “learn” the important features via representation learning. To
perform well, these neural networks just need large amounts of data. The amount of data required for these
neural nets to perform well is substantial, but, in today’s internet age, data is not too hard to acquire. You can
think of neural networks as very powerful function approximators or “rule” creators; these rules and features
are several degrees more nuanced and complex than the rules created by humans, allowing for more automated
learning and more generalization of the system in processing natural language data.
Of these three, the neural network–based branch of NLP, fueled by the rise of very deep neural networks
(i.e., deep learning), is the most powerful and the one that has led to many of the mainstream commercial
applications of NLP in recent years.

Ques 14. Draw the architecture of NLP system. Explain various components of Natural Language
Generation System.
Ans. The architecture of an NLP system consists of the following modules:
(a). Text Planning: This selects the relevant content from the knowledge base (KB); it is also known as content planning.
(b). Sentence Planning: This includes selecting the required words and forming them into meaningful phrases.
(c). Surface Realization: In the context of Natural Language Generation, surface realization is the task of
generating the linear form of a text following a given grammar. Surface realization models usually consist of
a cascade of complex sub-modules, either rule-based or neural network-based, each responsible for a specific
sub-task.
(d). Discourse Planning: It is used for discourse integration, i.e., maintaining a sense of the context. The meaning of
any single sentence depends upon the sentences that precede it, and it may also shape the meaning of the sentence that follows.
For example, the word “that” in the sentence “He wanted that” depends upon the prior discourse context.


Components of NL System

(a).Text Organization: Text organization refers to how a text is organized to help readers follow and
understand the information presented. There are a number of standard forms that help text organization when
writing.
(b). Text Realization: In linguistics, realization is the process by which some kind of surface representation is
derived from its underlying representation; that is, the way in which some abstract object of linguistic analysis
comes to be produced in actual language. Phonemes are often said to be realized by speech sounds. The
different sounds that can realize a particular phoneme are called its allophones.


(c). Content Selection: Content selection is a central component in many natural language generation tasks,
where, given a generation goal, the system must determine which information should be expressed in the output
text. In summarization, content selection is usually accomplished through sentence (and, occasionally, phrase)
extraction.
(d). Linguistic Resource: Linguistic resources are essential for creating grammars, in the framework of
symbolic approaches or to carry out the training of modules based on machine learning. In Latin, the word
corpus means body, but when used as a source of data in linguistics, it can be interpreted as a collection of
texts.

Ques 15. What is the organization of NLP system? Explain.


Ans. The organization of a general NLP system is described below:
(i) Interpretation processes: This maps from one representation to the other. For instance, the process
that maps a sentence to its syntactic structure and logical form is called the parser. It uses knowledge
about words and word meanings (the lexicon) and a set of rules defining the legal structures (the
grammar) in order to assign a syntactic structure and a logical form to an input sentence.
(ii) An alternative organization could perform syntactic processing first and then perform semantic
interpretation on the resulting structures. Combining the two, however, has considerable
advantages because it leads to a reduction in the number of possible interpretations, since every
proposed interpretation must simultaneously be syntactically and semantically well formed.
For example, consider the following two sentences:
 Visiting relatives can be tiring.
 Visiting museums can be tiring.
These two sentences have identical syntactic structure, so both are syntactically ambiguous. In the first sentence, the
subject might be relatives who are visiting you or the event of you visiting relatives. Both of these alternatives
are semantically valid, and you would need to determine the appropriate sense by using the contextual
mechanism. However, the second sentence has only one possible semantic interpretation, since museums are not
objects that can visit other people; rather, they must be visited.

(iii)Contextual processing: The process that transforms the syntactic structure and logical form into a final
meaning representation is called contextual processing. This process includes issues such as
identifying the objects referred to by noun phrases such as definite descriptions (for example, "the
man") and pronouns, the analysis of the temporal aspects of the new information conveyed by the
sentence, the identification of the speaker’s intention (for example, whether "Can you lift that rock"
is a yes/no question or a request).


(iv) Inferential processing is required to interpret the sentence appropriately within the application domain. It uses
knowledge of the discourse context (determined by the sentences that preceded the current one) and
knowledge of the application to produce a final representation. The system would then perform whatever
reasoning tasks are appropriate for the application.

(v) Generation process: The meaning that must be expressed is passed to the generation component of the system. It
uses knowledge of the discourse context, plus information on the grammar and lexicon, to plan the form of an
utterance, which then is mapped into words by a realization process.

Ques 16. Explain the steps to build the NLP pipeline.


Ans. Following are the steps to build an NLP pipeline -

Step 1: Sentence Segmentation: Sentence segmentation is the first step in building the NLP pipeline. It breaks a paragraph
into separate sentences.

Example: Consider the following paragraph - Independence Day is one of the important festivals for every Indian
citizen. It is celebrated on the 15th of August each year ever since India got independence from the British rule. The
day celebrates independence in the true sense.

Sentence segmentation produces the following result:


1. "Independence Day is one of the important festivals for every Indian citizen."
2. "It is celebrated on the 15th of August each year ever since India got independence from the British rule."
3. "This day celebrates independence in the true sense."
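A naive sentence segmenter can be sketched in Python (a simplified illustration; production segmenters such as those in NLTK or spaCy also handle abbreviations and other edge cases):

```python
import re

def split_sentences(paragraph):
    """Naive sentence segmenter: split on ., ! or ? followed by whitespace.
    Real segmenters must also handle abbreviations ("Dr."), decimals, etc."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

text = ("Independence Day is one of the important festivals for every Indian citizen. "
        "It is celebrated on the 15th of August each year ever since India got "
        "independence from the British rule. The day celebrates independence in the true sense.")
print(split_sentences(text))  # three sentences
```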

Step 2: Word Tokenization: A word tokenizer is used to break a sentence into separate words or tokens.

Example: For a sample sentence, the word tokenizer generates the following result:

"JavaTpoint", "offers", "Corporate", "Training", "Summer", "Training", "Online", "Training", "and", "Winter",
"Training", "."
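Word tokenization can be approximated with a single regular expression (a sketch; real tokenizers also handle contractions, hyphenated words, and so on):

```python
import re

def tokenize(sentence):
    # Each word becomes one token; each punctuation mark becomes its own token.
    return re.findall(r"\w+|[^\w\s]", sentence)

print(tokenize("It is celebrated on the 15th of August."))
```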

Step 3: Stemming: Stemming is used to normalize words into their base or root form. For example, celebrates,
celebrated, and celebrating all originate from the single root word "celebrate." The big problem with
stemming is that it sometimes produces a root word that has no meaning.

For example, intelligence, intelligent, and intelligently all reduce to the single root "intelligen." In English,
the word "intelligen" does not have any meaning.
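The effect can be sketched with a crude suffix-stripping stemmer (a toy illustration, not the Porter algorithm used by real toolkits):

```python
def simple_stem(word):
    """Strip one common inflectional suffix; the result need not be a real word."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# All three inflections collapse to the same (meaningless) stem "celebrat".
print([simple_stem(w) for w in ("celebrates", "celebrated", "celebrating")])
```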

Step 4: Lemmatization: Lemmatization is quite similar to stemming. It is used to group the different inflected forms of
a word under a single form, called the lemma. The main difference between stemming and lemmatization is that lemmatization
produces a root word that has a meaning.

For example, in lemmatization the words intelligence, intelligent, and intelligently share the root word intelligent, which
has a meaning.
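In code, lemmatization is typically a vocabulary lookup rather than suffix stripping. A toy dictionary-based sketch (the entries follow the document's example and are illustrative only; real lemmatizers consult full vocabularies such as WordNet):

```python
# Toy lemma table following the example in the text above.
LEMMAS = {
    "intelligence": "intelligent",
    "intelligently": "intelligent",
    "intelligent": "intelligent",
}

def lemmatize(word):
    # Unknown words are returned unchanged.
    return LEMMAS.get(word.lower(), word)

print({w: lemmatize(w) for w in ("intelligence", "intelligent", "intelligently")})
```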

Step 5: Identifying Stop Words : In English, there are a lot of words that appear very frequently like "is", "and", "the",
and "a". NLP pipelines will flag these words as stop words. Stop words might be filtered out before doing any statistical
analysis. Example: He is a good boy.
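Stop-word filtering reduces to a set-membership test (a sketch with a tiny hand-picked stop list; real pipelines ship much larger lists):

```python
STOP_WORDS = {"is", "and", "the", "a", "an"}  # tiny illustrative stop list

def remove_stop_words(tokens):
    # Keep only tokens that are not in the stop list (case-insensitive).
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words("He is a good boy".split()))
```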

Step 6: Dependency Parsing: Dependency parsing is used to find how all the words in a sentence are related to
each other.

Step 7: POS Tagging: POS stands for parts of speech, which include noun, verb, adverb, and adjective. A POS tag indicates
how a word functions, both in meaning and grammatically, within the sentence. A word can have one or more parts of
speech depending on the context in which it is used.

Example: "Google" something on the Internet. Here "Google" is used as a verb, although it is a proper noun.

Step 8: Named Entity Recognition (NER) : Named Entity Recognition (NER) is the process of detecting the named
entity such as person name, movie name, organization name, or location.

Example: Steve Jobs introduced iPhone at the Macworld Conference in San Francisco, California.
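A very crude heuristic flags capitalized, non-sentence-initial tokens as candidate named entities (a toy sketch; real NER systems use trained statistical or neural models):

```python
def naive_ner_candidates(tokens):
    """Flag capitalized tokens (excluding the sentence-initial one) as
    candidate named entities. Note it misses lowercase-initial names
    such as "iPhone", showing why trained models are needed."""
    return [t for i, t in enumerate(tokens) if i > 0 and t[:1].isupper()]

sentence = "Steve Jobs introduced iPhone at the Macworld Conference in San Francisco"
print(naive_ner_candidates(sentence.split()))
```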

Step 9: Chunking: Chunking is used to collect individual pieces of information and group them into larger units
within sentences.


Ques 17. What are the various techniques to evaluate an NLP system?
Ans: Various techniques to evaluate an NLP system are described below:
(i) Run and Test : One obvious way to evaluate a system is to run the program and see how well it
performs the task it was designed to do. If the program is meant to answer questions about a
database of facts, you might ask it questions to see how good it is at producing the correct answers.
(ii) Black Box Evaluation: If the system is designed to participate in simple conversations on a certain
topic, you might try conversing with it. This is called black box evaluation because it evaluates
system performance without looking inside to see how it works. While ultimately this method of
evaluation may be the best test of a system’s capabilities, it is problematic in the early stages of
research because early evaluation results can be misleading.
 Sometimes the techniques that produce the best results in the short term will not lead to the best
results in the long term. For instance, if the overall performance of all known systems in a given
application is uniformly low, few conclusions can be drawn.
 The fact that one system was correct 50 percent of the time while another was correct only 40
percent of the time says nothing about the long-term viability of either approach. Only when the
success rates become high, making a practical application feasible, can much significance be given
to overall system performance measures.

(iii)Glass Box Evaluation: An alternative method of evaluation is to identify various subcomponents of a


system and then evaluate each one with appropriate tests. This is called glass box evaluation because you look
inside at the structure of the system. The problem with glass box evaluation is that it requires some consensus
on what the various components of a natural language system should be.

Ques 18. What do you mean by understanding and representation of a natural language system?
Ans : A crucial component of understanding involves computing a representation of the meaning of sentences and texts.
Without defining the notion of representation, however, this theory has little importance. For instance, why not simply use
the sentence itself as a representation of its meaning?
 One reason is that most words have multiple meanings, which we will call senses. The word "cook", for
example, has a sense as a verb and a sense as a noun; "dish" has multiple senses as a noun as well as a
sense as a verb; and "still" has senses as a noun, verb, adjective, and adverb.
 This ambiguity would inhibit the system from making the appropriate inferences needed to model
understanding. The disambiguation problem appears much easier than it actually is because people do
not generally notice the ambiguity, even though they do not seem to consciously consider each of the
possible senses.
 To represent meaning, we must have a more precise language. The tools to do this come from
mathematics and logic and involve the use of formally specified representation languages.


 Formal languages are specified from very simple building blocks. The most fundamental is the notion of
an atomic symbol which is distinguishable from any other atomic symbol simply based on how it is
written. Useful representation languages have the following two properties:
 The representation must be precise and unambiguous. You should be able to express every distinct
reading of a sentence as a distinct formula in the representation.
 The representation should capture the intuitive structure of the natural language sentences that it represents. For
example, sentences that appear to be structurally similar should have similar structural representations, and the
meanings of two sentences that are paraphrases of each other should be closely related to each other.

Syntax: Representing Sentence Structure The syntactic structure of a sentence indicates the way that
words in the sentence are related to each other. This structure indicates how the words are grouped
together into phrases, what words modify what other words, and what words are of central importance in
the sentence. In addition, this structure may identify the types of relationships that exist between phrases
and can store other information about the particular sentence structure that may be needed for later
processing. For example, consider the following sentences:
1. John sold the book to Mary.
2. The book was sold to Mary by John.
Most syntactic representations of language are based on the notion of context-free grammars, which
represent sentence structure in terms of what phrases are subparts of other phrases.

Logical Form: The intended meaning of a sentence depends on the situation in which the sentence is
produced. The Division is between context-independent meaning and context-dependent meaning. The
representation of the context-independent meaning of a sentence is called its logical form.
The fact that "catch" may refer to a baseball move or the results of a fishing expedition is knowledge about
English and is independent of the situation in which the word is used. On the other hand, the fact that a
particular noun phrase "the catch" refers to what Jack caught when fishing yesterday is contextually
dependent.

Final Meaning Representation: The final representation needed is a general knowledge representation
(KR), which the system uses to represent and reason about its application domain. This is the language in
which all the specific knowledge based on the application is represented. The goal of contextual
interpretation is to take a representation of the structure of a sentence and its logical form, and to map this
into some expression in the KR that allows the system to perform the appropriate task in the domain. In a
question-answering application, a question might map to a database query, in a story-understanding
application, a sentence might map into a set of expressions that represent the situation that the sentence


describes. First-order predicate calculus (FOPC) is the final representation language because it is
relatively well known, well studied, and is precisely defined.

Ques 19. Write short notes on the following:


(a).Morphemes (b) Polysemy and Homonomy (c) Phrases in language.
Ans . (a) Morphemes: Words are potentially complex units, composed of even more basic units, called
morphemes. A morpheme is the smallest part of a word that has grammatical function or meaning. For
example, sawed, sawn, sawing, and saws can all be analyzed into the morphemes {saw} + {-ed}, {-n}, {-ing},
and {-s}, respectively. On elementary basis two types of morphemes exist:
(i) Lexical Morphemes: These can not be divided into other words . there meaning completely exist in that
words.
(ii) Grammatical Morphemes: When suffix like {-ed , -ing , -ful , -ly –est } or prefix like {-pre , -sub , -un}
Are added in words then it is known as grammatical morphemes.
Affixes are classified according to whether they are attached before or after the form to which they are added.
Prefixes are attached before and suffixes after. E.g: {re-} of resaw is a prefix.
A root morpheme is the basic form to which other morphemes are attached. It provides the basic meaning of
the word. The morpheme {saw} is the root of sawers.
Derivational morphemes are added to forms to create separate words: {-er} is a derivational suffix whose
addition turns a verb into a noun, usually meaning the person or thing that performs the action denoted by the
verb. For example, {paint}+{-er} creates painter, one of whose meanings is “someone who paints.”
Inflectional morphemes do not create separate words. They merely modify the word in which they occur in
order to indicate grammatical properties such as plurality.

(b). A word is polysemous if it can be used to express different meanings. The difference between the
meanings can be obvious or subtle. E.g: school , university , college.
Two or more words are homonyms if they either sound the same (homophones), have the same spelling
(homographs), or both, but do not have related meanings. E.g : (right & write) , (piece & peace).

(c) Phrases in language: Traditionally “phrase” is defined as “a group of words that does not contain a verb
and its subject and is used as a single part of speech.” This definition has three characteristics:
(1) It specifies that only a group of words can constitute a phrase, implying that a single word cannot.
(2) It distinguishes phrases from clauses.
(3) It requires that the groups of words believed to be a phrase constitute a single grammatical unit.
 A single word may be a phrase when it is the head of that phrase. The head of a phrase is the phrase’s
central element; any other words (or phrases) in the phrase orient to it, either by modifying it or
complementing it.


 The head determines the phrase’s grammatical category: if the head is a noun, the phrase is a noun
phrase; if the head is a verb, the phrase is a verb phrase, and so on.
 The head can also determine the internal grammar of the phrase: if the head is a noun, then it may be
modified by an article; if the head is a transitive verb, it must be complemented by a direct object.

 Heads also determine such things as the number of their phrases: if the head of an NP is singular, then
the NP is singular; if the head is plural, then the NP is plural.

*****************End of Unit 1*****************

KRISHNA INSTITUTE OF TECHNOLOGY NATURAL LANGUAGE PROCESSING ( UNIT 2)

UNIT -2

 Introduction to Semantics and Knowledge Representation

 Some applications like Machine Translation.

 NLP Database Interface.


SHORT ANSWER TYPE QUESTIONS

Ques 1. What do you mean by semantics in NLU?

Ans. The terms ‘semantics’ and ‘semantic interpretation’ usually refer to methods of representing the meanings of
natural language expressions, and of computing such meaning representations. A system for semantic analysis
determines the meaning of words in text. Semantics gives a deeper understanding of the text in sources such as
a blog post, comments in a forum, documents, group chat applications, chatbots, etc.

Ques 2. Mention few elements of language that help in semantic analysis.


Ans. Elements that help in understanding the semantics of any language are:
 Hyponymy: The relationship between a generic term (hypernym) and its specific instances (hyponyms).
 Homonymy: Two or more lexical terms with the same spelling and different meanings.
 Polysemy: Two or more terms that have the same spelling and similar meanings.
 Synonymy: Two or more lexical terms with different spellings and similar meanings.
 Antonymy: A pair of lexical terms with contrasting meanings.
 Meronomy: A relationship between a lexical term and a larger entity.

Ques 3. Mention some of the issues in knowledge representation of NLP systems.


Ans. Semantic representation in NLP systems still confronts many specific difficulties, such as the following:
 The representation of tense and aspect, of adjectival and adverbial modification.
 Nominalization, generic sentences, propositional attitudes, counterfactual conditionals, comparatives,
and generalized quantifiers.
 Many aspects of the disambiguation process and word-sense disambiguation.
 The inference of implicit causal connections, plans, goals, reasons, and so on remains a refractory
problem.

Ques 4. What are the semantic nets?


Ans : The semantic network based knowledge representation mechanism is useful where an object or concept is
associated with many attributes and where relationships between objects are important. Semantic nets have also been used
in natural language research to represent complex sentences expressed in English. A semantic network is a simple
representation scheme which uses a graph of labeled nodes and labeled directed arcs to encode knowledge.
 Nodes: objects, concepts, events.
 Arcs: relationships between nodes. Arcs define binary relations which hold between the objects denoted by the
nodes.
Non-binary relation: We can represent a generic "give" event as a relation involving three things:
an agent, a beneficiary, and an object.


Consider the following examples (the corresponding network diagrams are omitted here):

1. “Sima is a girl”
2. “Ram is taller than Hari”

Ques 5. What are the conceptual tenses, state weights proposed by Schank?
Ans. (i) Conceptual Tenses:
past : p
future : f
negation : /
start of a transition : ts
end of a transition : tf
present : nil
conditional : c
continuous : k
interrogative : ?
timeless : ∞


(ii) Conceptual States:

Ques 6 : Differentiate between semantics sand pragmatics.


Ans.
1. Semantics looks at the literal meaning of words and the meanings that are created by the relationships
between linguistic expressions. Pragmatics recognizes how important context can be when interpreting the
meaning of discourse and also considers things such as irony, metaphors, idioms, and implied meanings.
2. Semantics looks at the literal meanings of words; pragmatics looks at the intended meaning of words.
3. Semantics is limited to the relationship between words; pragmatics covers the relationships between words,
interlocutors (people engaged in the conversation), and contexts.

Ques 7. Difference between semantic nets and partitioned nets.


Ans :
1. A semantic network is a representation of knowledge in the form of graphs with the help of interconnected
nodes. Partitioned nets are also graph representations of real-world knowledge, but they additionally
emphasize quantifiers (both existential and universal).
2. A semantic network takes a long time to answer a question; a partitioned net takes less time.
3. In a semantic net the graph is drawn as a single whole entity with IS-A and KIND-OF relations; in a
partitioned net the graph is partitioned based on the agent who creates the event and the entity who benefits
from that event.

LONG ANSWER TYPE QUESTIONS

Ques 8. How knowledge is represented using semantic nets? Explain with example. Mention few
advantages and disadvantages also.

Ans : Semantic networks structure the knowledge of a specific part of any information. It uses real-world
meanings that are easy to understand. For example, it uses "is a" and "is a part" inheritance hierarchies. Besides
this, they can be used to represent events and natural language sentences. The semantic network based
knowledge representation mechanism is useful where an object or concept is associated with many attributes
and where relationships between objects are important. Semantic nets have also been used in natural language
research to represent complex sentences expressed in English.


 It represents knowledge in the form of graphs with the help of interconnected nodes. It's a widely
popular idea in artificial intelligence and natural language processing because it supports reasoning.
 A semantic network is an alternative way to represent knowledge in a logical way.
 Arcs show the relationships between objects.

A semantic net showing relationship and attributes of Base Ball Player


To store knowledge semantic net has following components:
 Lexical Component: Nodes represent physical entities, links, and labels. The links show the
relationship between the objects. Labels, on the other hand, denote particular objects and their
relationships.
 Structural Component: The links and nodes form a diagram according to the direction.
 Semantic Components: In the semantic component, definitions are associated with the links and node
labels, while facts depend on the approved areas.
 Procedural part: It has constructors and destructors. Constructors allow the creation of new links and
nodes while destructors permit the removal of links and nodes.
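The node/arc structure described above can be sketched as a set of labelled (node, relation, node) triples (a minimal illustration; the class and relation names are hypothetical):

```python
class SemanticNet:
    """A semantic net as labelled directed arcs (triples) between nodes."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, relation, obj):
        # Constructor side: create a new link (and implicitly its nodes).
        self.triples.add((subject, relation, obj))

    def remove(self, subject, relation, obj):
        # Destructor side: remove a link.
        self.triples.discard((subject, relation, obj))

    def objects(self, subject, relation):
        # Follow all arcs with the given label out of a node.
        return {o for s, r, o in self.triples if s == subject and r == relation}

net = SemanticNet()
net.add("Sima", "is_a", "Girl")
net.add("Girl", "is_a", "Person")
print(net.objects("Sima", "is_a"))
```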

ADVANTAGES OF SEMANTIC NET


 The semantic network is a natural way to represent knowledge in the world
 It communicates information in a transparent form.
 It’s easy to decode.

DISADVANTAGES OF SEMANTIC NET


 The semantic network takes a long time to answer the question. For instance, we need to study the
whole network to get a simple answer and the worst part is it’s also possible that we end up with no
answer.
 It stores information like a human brain, but in reality, it’s not possible to create that sort of vast
network.
 These representations are confusing as there are quantifiers like “for all”, “for some”, “none”, etc. seem
missing.


Ques 9. Explain partitioned nets and their properties. How is information deduced through
partitioned nets?

Ans. Partitioned Semantic Network: Some complex sentences cannot be represented by simple semantic nets;
for these we follow the technique of partitioned semantic networks. Partitioned semantic
nets allow for:
1. Propositions to be made without commitment to truth.
2. Expressions to be quantified (Universal or Existential Quantification)
In partitioned semantic network, the network is broken into spaces which consist of groups of nodes and arcs
and regard each space as a node.
Examples: (i) All Seema are eating Apples

(ii) Every Dog has bitten a shopkeeper.

In the above examples, GS stands for a general statement about the real world which is true for all instances of a
class that hold some property. GS has the form of another semantic net which shows the relationship between
objects with respect to some event that has occurred.
The node g is an instance of GS which universally quantifies the event.


Information in partitioned nets can be deduced by the following techniques:


(a). Recognizing Text Entailment: Given some pairs of texts as input, the task of recognizing text entailment is to
identify whether the semantic meaning of one text is entailed by, or can be inferred from, another text. The output is
binary (entailment or non-entailment).

(b).Paraphrasing Task: Given some pairs of texts as input, the task of paraphrasing is to recognize whether each pair of
texts captures a paraphrase/semantic equivalence relationship. A paraphrase is the restatement of the meaning of a text
using different words. The output is binary (paraphrase or non-paraphrase).

(c)Fact-checking : Given some claims as input, the task of fact-checking is to check the factuality of these claims.
Factuality indicates the degree of being actual in terms of right or wrong. The predefined output can be binary (e.g., true
or false) or multi-class (e.g., true, false, half-true, mostly-true, etc.)

(d) Relation Extraction : Given some texts and entities from the texts as input, the task of relation extraction is to
identify the semantic relations between two or more entities. The semantic relations are predefined, considering direction.
The direction of relation means who modifies what

Ques 10. How humans can help a machine translation system to produce better quality. What are various
methods of machine translation?

Ans. We can adopt following practices to produce better quality in machine translation systems.

 Use short sentences


 Make sure your sentence structure is well-written
 Aim for simple sentence structure
 Use adverbs concisely
 Avoid industry jargon
 Stay away from slang
 Avoid compound words
 Don’t use ambiguous words
Methods or approaches of machine translation applications are as described below:

(a). RULE BASED MACHINE TRANSLATION (RBMT): RBMT, also called Knowledge Based Machine
Translation, retrieves rules from bilingual dictionaries and grammars based on linguistic information about the source and
target languages. RBMT generates target sentences on the basis of the syntactic, morphological, and semantic regularities of
each language. It converts source language structures to target language structures, and it is extensible and maintainable.
There are three types of RBMT systems:
(i) Direct translation systems


(ii) Transfer based systems


(iii) Interlingua based systems.

Direct method (Dictionary Based Machine Translation): Source language text is translated without passing through
an intermediary representation. The words are translated word by word, as with a dictionary, usually without much
correlation of meaning between them. Dictionary lookups may be done with or without morphological analysis.
Anusaarka is an example of a system that uses the direct approach; it was developed at the International Institute of
Information Technology, Hyderabad.
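Word-by-word direct translation can be sketched as a dictionary lookup (the English-to-Hindi entries below are purely illustrative, and real systems use far larger lexicons plus morphological analysis):

```python
# Illustrative bilingual dictionary (hypothetical entries).
EN_HI = {"water": "pani", "give": "do", "me": "mujhe"}

def direct_translate(sentence):
    # Translate each word independently; unknown words pass through unchanged.
    # No reordering or agreement is applied: the core weakness of the method.
    return " ".join(EN_HI.get(w.lower(), w) for w in sentence.split())

print(direct_translate("give me water"))  # word order stays English-like
```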

Transfer Rules Based Machine Translation Systems: Morphological and syntactic analyses are the
fundamental approaches in transfer based systems. Here the source language text is converted into a less
language-specific representation, and a target representation at the same level of abstraction is generated with
the help of grammar rules and bilingual dictionaries.


In the transfer approach to translation divergence, there is a transfer rule for transforming a source language (SL)
sentence into the target language (TL) by performing lexical and structural manipulations. Mantra is a transfer
based tool and a project funded by the Government of India.

Interlingual RBMT Systems (Interlingua): This model is intended to provide linguistic homogeneity across the world. In
this method, the source language is translated into an intermediary representation which does not depend on any particular
language. The target language is then derived from this auxiliary representation.

Challenge with Interlingual Rules Based Machine Translation


 It is hard to handle exceptions to the rules in an interlingua.
 The number of rules grows drastically in general-purpose translation systems.

Corpus based machine translation : Corpus Based Machine Translation is one of the main methods of machine
translation because it achieves a high level of accuracy. Once a corpus based system has been developed, large volumes
of translations can be produced, and these are used in various computer-aided translation applications. The different
types of Corpus Based Machine Translation models are as follows.

(a).Statistical Machine Translation (SMT) : Statistical models are applied in this method to create translated output
with the assistance of bilingual corpora. The concept of Statistical Machine Translation comes from information theory.
The important feature of this method is that no customization work is required from linguists, because the tool learns
translation patterns through statistical analysis of bilingual corpora.
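The information-theoretic basis mentioned above is usually written as the noisy-channel model. The equation below is the standard textbook formulation (it is not stated explicitly in this bank): for a source sentence f, the best translation e is

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(f \mid e)\, P(e)
```

Here P(f | e) is the translation model and P(e) is the language model, both estimated from bilingual and monolingual corpora respectively.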

(b).Example based Machine Translation System: This method, also called memory based translation, takes a set
of sentences from the source language and generates the corresponding translations in the target language with
point-to-point mapping. Examples are used to translate similar types of sentences: when a previously translated
sentence recurs, the same translation is likely to be correct again. The main advantages of this model are that it
works well with a small data set and can be trained to generate output quickly. The example based method is mainly
used to translate between two very different languages, such as Japanese and English. Its main drawback is that
deep linguistic analysis cannot be applied. PanEBMT is an example of an EBMT tool.

(c).Hybrid Machine Translation: HMT takes advantage of both RBMT and Statistical Machine Translation. It uses
RBMT as a baseline and refines the rules through statistical models. Rules are used to pre-process data in an attempt to
better guide the statistical engine. Hybrid models differ in various ways:

 Rules Post-Processed by Statistics : A rule based tool is used for translation first; a statistical model is then
applied to adjust the translated output of the rule based tool.
 Statistics Guided by Rules : In this method, rules are applied to pre-process the input, giving better guidance
to the statistical tool. Rules are also used to post-process the statistical output, producing normalized output. This

method has more flexibility, power and control at the translation time.

Challenges with Hybrid Machine Translation:


 Speech agreement mistakes.
 Extra punctuation.
 Wrong capitalization.

Ques 11. What are optimization techniques in semantic nets, explain?

Ans. The frame based technique is used to optimize semantic networks by incorporating reusability into the language
system via inheritance. Frame based representation is a development of semantic nets that allows us to express the idea
of inheritance. A frame system consists of a set of frames (or nodes), which are connected together by relations. Each
frame describes either an instance or a class. Each frame has one or more slots, which are assigned slot values; this is
the way the frame system is built up. Rather than simply having links between frames, each relationship is expressed
by a value being placed in a slot. Example:

Diagrammatic representation of frames is as mentioned below:

When we say, “Tommy is a dog”, we really mean “Tommy is an instance of the class dog” or “Tommy is a
member of the class dogs”. Why are frames useful? The main advantage of using frame-based systems for
expert systems is that all information about a particular object is stored in one place (this is where the optimization takes place).
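The slot-and-inheritance idea above can be sketched in a few lines. The frames below (Mammal, Dog, Tommy) and their slots are invented for illustration: slot lookup falls back to the parent frame, so shared properties are stored once in the class frame, which is the optimization frames provide.

```python
# Minimal sketch of a frame system with inheritance. Slots are looked up
# locally first, then up the is-a chain to parent frames.
class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name, self.parent, self.slots = name, parent, slots

    def get(self, slot):
        # Check this frame's own slots, then walk up the hierarchy.
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        return None

mammal = Frame("Mammal", legs=4, blood="warm")
dog = Frame("Dog", parent=mammal, sound="bark")
tommy = Frame("Tommy", parent=dog, legs=3)   # an instance with an exception

print(tommy.get("sound"))  # inherited from Dog: bark
print(tommy.get("blood"))  # inherited from Mammal: warm
print(tommy.get("legs"))   # overridden locally: 3
```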


Ques 12 . Explain the goal of conceptual dependency. What are its components and primitive actions?

Ans : Conceptual Dependency was originally developed to represent knowledge acquired from natural language
input. It was proposed by Schank. The goals of this theory are:

 To provide automated reasoning/ inference from sentences.


 To be independent of the words used in the original input i.e. different words and structures represent
the same concept.
 Language-independent meaning representation that means for any two or more sentences that are
identical in meaning there should be only one representation of that meaning.

Components of Conceptual Dependency Graph.

 A structure into which nodes representing information can be placed.


 A specific set of primitives.
 Information of Tenses and Moods.
 Sentences are represented as a series of diagrams depicting actions using both abstract and real physical
situations.

 The agent and the objects are represented.


 The actions are built up from a set of primitive acts which can be modified by tense.
 Focuses on concepts instead of syntax.
 Focuses on understanding instead of structure.

(A) Five Primitives for Physical Actions

INGEST: to take something inside an animate object.


EXPEL: to take something from inside an animate object and force it out.
GRASP: to physically grasp an object
MOVE: to move a body part
PROPEL: to apply a force to

(B) Other Primitive Actions

 State Changes (physical and abstract transfers)

PTRANS: to change the location of a physical object.


ATRANS: to change an abstract relationship of a physical object.
 Mental acts

MTRANS: to transfer information mentally.


MBUILD: to create or combine thoughts.
(C) Instruments for other ACTs

SPEAK: to produce a sound.


ATTEND: to direct a sense organ or focus an organ towards a stimulus.
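As a rough sketch, a conceptual dependency for a sentence such as "John gave Mary a book" can be encoded as a structure built on the ATRANS primitive (transfer of an abstract relationship such as possession). The slot names used here (actor, object, from, to, tense) are illustrative, not part of any standard notation or library.

```python
# Hedged sketch: a conceptual dependency as a slot-filler structure.
# "John gave Mary a book" = ATRANS of the book from John to Mary, past tense.
def cd(act, actor, obj, source, recipient, tense="past"):
    return {"act": act, "actor": actor, "object": obj,
            "from": source, "to": recipient, "tense": tense}

gave = cd("ATRANS", actor="John", obj="book", source="John", recipient="Mary")
print(gave["act"], gave["actor"], gave["to"])  # ATRANS John Mary
```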


Ques 13. What are the various conceptual categories, conceptual roles and conceptual syntax rules applied
in conceptual dependency?

Ans . Conceptual Categories


PP (picture producer) : physical object. Actors must be an animate PP, or a natural force.
ACT : One of eleven primitive actions.
LOC : Location.
T : Time.
AA (action aider) : modifications of features of an ACT. e.g., speed factor in PROPEL.
PA : Attributes of an object, of the form STATE(VALUE). e.g., COLOR (red).

Conceptual roles
Conceptualization: The basic unit of the conceptual level of understanding.
Actor: The performer of an ACT.
ACT: An action done to an object.
Object: A thing that is acted upon.
Recipient: The receiver of an object as the result of an ACT.
Direction: The location that an ACT is directed toward.
State: The state that an object is in.

Conceptual Syntax Rules (Pictorial representation)


Ques 14. Draw the conceptual graphs for following sentences:


(i) John gave Mary a book.
(ii) John sold his car to Bill
(iii) John annoyed Mary
(iv) John grew the plants with fertilizer.
(v) John threw a ball to Mary.

Ans. (i)John gave Mary a book.


(ii) John sold his car to Bill

(iii)John annoyed Mary

(iv)John grew the plants with fertilizer.


(v) John threw a ball to Mary.

Ques 15. What is a corpus? Explain its types and importance.

Ans . A corpus is a large collection of authentic text (i.e., samples of language produced in genuine
communicative situations), and corpus linguistics is any form of linguistic inquiry based on data derived from
such a corpus.
A corpus can be defined as a systematic collection of naturally occurring texts (of both written and spoken
language). “Systematic” means that the structure and contents of the corpus follows certain extra-linguistic
principles (“sampling principles”, i.e. principles on the basis of which the texts included were chosen).
For example, a corpus is often restricted to certain text types, to one or several varieties of English, and to a
certain time span. If several subcategories (e.g. several text types, varieties etc.) are represented in a corpus,
these are often represented by the same amount of text. “Systematic” also means that information on the exact
composition of the corpus is available to the researcher (including the number of words in each category and
in the whole corpus, how the texts included in the corpus were sampled etc).
The four major points of criticism leveled at the use of corpus data in linguistic research are the
following:
1. Corpora are usage data and thus of no use in studying linguistic knowledge.
2. Corpora, and the data derived from them, are necessarily incomplete.
3. Corpora contain only linguistic forms (represented as graphemic strings), but no information about the
semantics, pragmatics, etc. of these forms.
4. Corpora do not contain negative evidence, i.e., they can only tell us what is possible in a given language, but
not what is not possible.


The four main characteristics of the modern corpus are:


 Sampling and representativeness
 Finite size
 Machine-readable form
 A standard reference

Corpus Representativeness
Representativeness is a defining feature of corpus design. The following definitions from two great researchers
Leech and Biber, will help us understand corpus representativeness −

 According to Leech (1991), “A corpus is thought to be representative of the language variety it is


supposed to represent if the findings based on its contents can be generalized to the said language
variety”.

 According to Biber (1993), “Representativeness refers to the extent to which a sample includes the full
range of variability in a population”.

Corpus Balance : This is defined as the range of genres included in a corpus. A balanced corpus covers a wide
range of text categories, which are supposed to be representative of the language. We do not have any reliable
scientific measure for balance, so the best estimation and intuition are applied in this concern.

Sampling: Another important element of corpus design is sampling. Corpus representativeness and balance is
very closely associated with sampling. That is why we can say that sampling is inescapable in corpus building.

 According to Biber(1993), “Some of the first considerations in constructing a corpus concern the
overall design: for example, the kinds of texts included, the number of texts, the selection of particular
texts, the selection of text samples from within texts, and the length of text samples. Each of these
involves a sampling decision, either conscious or not.”

 Sampling unit − It refers to the unit which requires a sample. For example, for written text, a sampling
unit may be a newspaper, journal or a book.

 Sampling frame − The list of all sampling units is called a sampling frame.

 Population − It may be referred to as the assembly of all sampling units. It is defined in terms of language
production, language reception or language as a product.

Corpus Size : How large should the corpus be? There is no specific answer to this question. The size
of the corpus depends upon the purpose for which it is intended, as well as on some practical considerations
such as the following:


 Kind of query anticipated from the user.


 The methodology used by the users to study the data.
 Availability of the source of data.

Tree Bank Corpus : It may be defined as a linguistically parsed text corpus that annotates syntactic or semantic
sentence structure. Geoffrey Leech coined the term ‘treebank’, reflecting that the most common way of
representing the grammatical analysis is by means of a tree structure. Generally, treebanks are created on
top of a corpus that has already been annotated with part-of-speech tags.

Types of Tree Bank Corpus

 Semantic Treebanks : These treebanks use a formal representation of a sentence’s semantic structure.
They vary in the depth of their semantic representation.

 Syntactic Tree banks : Opposite to the semantic Tree banks, inputs to the Syntactic Treebank systems
are expressions of the formal language obtained from the conversion of parsed Treebank data. The
outputs of such systems are predicate logic based meaning representation.

Prop Bank Corpus: Prop Bank, more specifically called “Proposition Bank”, is a corpus
annotated with verbal propositions and their arguments. The corpus is a verb-oriented resource; the
annotations here are more closely related to the syntactic level.

VerbNet(VN): VerbNet (VN) is a hierarchical, domain-independent lexical resource, the largest of its kind for
English, that incorporates both semantic and syntactic information about its contents.

WordNet: WordNet, created at Princeton, is a lexical database for the English language. It is part of the NLTK
corpus collection. In WordNet, nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms
called synsets. All the synsets are linked with the help of conceptual-semantic and lexical relations.

Ques 16. How is intelligent database created for NLP system? Explain with example.

Ans : The importance of an NLIDB system is that it makes it easy for users with no programming experience to
deal with a database. The user can just use natural language to interact with the database, which is very simple and
easy. Also, the user does not need special training to use such systems (perhaps some training to learn the
interface). The user is not forced to learn any database language.


In general, the existing methods to interact with a database using NLP can be divided into three categories:
(1) Pattern Matching Models or Template Based Approach.
(2) Syntactic Models.
(3) Syntactic and Semantic Models.
In the first model, the entered query is processed by matching it against a predefined set of rules or patterns. The
next step translates it into a logical form according to the pattern it belongs to. From these rules, the
database query is directly formulated.
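The template-based approach can be sketched with a single regular-expression pattern. The employees(name, dept) table, the template and the SQL it produces are all invented for illustration; a real system would hold many such templates.

```python
import re

# Minimal sketch of pattern-matching (template-based) NL-to-SQL.
# Each template pairs a regex over the question with an SQL skeleton.
TEMPLATES = [
    (re.compile(r"show all employees in (\w+)", re.I),
     "SELECT name FROM employees WHERE dept = '{0}'"),
]

def nl_to_sql(question):
    for pattern, sql in TEMPLATES:
        m = pattern.search(question)
        if m:
            return sql.format(*m.groups())
    return None  # the question matches no known template

print(nl_to_sql("Show all employees in Sales"))
```

The obvious limitation, as the text notes, is that any question not covered by a predefined pattern simply fails.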

 The Syntactic model in general presents linguistic information based on tokenizers, morph analyzers and
part-of-speech (POS) tagging. There are eight parts of speech in English grammar: verbs, nouns, pronouns,
adjectives, adverbs, prepositions, conjunctions and interjections.

 The POS tags are not enough by themselves to convert the natural language query into SQL, so we need to add more
information to understand the query. For this, the Stanford Named Entity Recognizer (NER) is used to assign the
keywords already extracted from the query to the pre-defined categories they belong to.

In the second model, a constituent syntactic tree produced by a syntactic parser is used, where the leaves are
mapped to a database query based on predefined syntactic grammars. It starts with syntactic analysis performed
by the Stanford POS tagger. Then the keyword extractor uses the information from the POS tagger to extract the
keywords that are used by the NER. The named entity recognizer identifies the related domain concepts, such as
person or department. For complex queries, dependency-based semantic parsing is performed. Node mapping then
maps each keyword node to the corresponding SQL statement component, and the resulting SQL statement is
executed against the relational database.


Ques 17 . Explain information retrieval and information extraction in NLP.

Ans . Information Extraction: Extraction means “pulling out” and retrieval means “getting back”.
Information retrieval is about returning information that is relevant to a specific query or field of interest
of the user. Information extraction is the process of taking data and extracting structured information
from it so that it can be used for various purposes, one of which may be in a search engine.

Information Retrieval: Information Retrieval refers to the human-computer interaction that happens when
we use a machine to search some piece of information for information objects (content) that match our search
query. It is all about retrieving information that is stored in a database or computer and related to the user’s
needs. A user’s query is matched against a set of documents to find the relevant documents.
There are various methods and techniques used in information retrieval. In an information retrieval system,
we reduce information overload using an automated IR system.

 Precision: the number of documents retrieved that are relevant to the user’s information need, divided by
the total number of documents retrieved.
 Recall: the number of documents retrieved that are relevant to the user’s information need, divided by
the total number of relevant documents in the whole document set.
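The two definitions above reduce to simple set arithmetic. The document ids below are made up for illustration:

```python
# Precision and recall over sets of document ids.
def precision(retrieved, relevant):
    # relevant retrieved / total retrieved
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    # relevant retrieved / total relevant in the collection
    return len(retrieved & relevant) / len(relevant)

retrieved = {1, 2, 3, 4}            # documents the system returned
relevant = {2, 4, 5, 6, 7, 8}       # all relevant documents in the set
print(precision(retrieved, relevant))  # 2/4 = 0.5
print(recall(retrieved, relevant))     # 2/6
```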
Various techniques used in information retrieval are:

 Vector space retrieval


 Boolean space retrieval
 Term-document matrix
 Blocked sort-based indexing
 Tf-idf indexing
 Various clustering methods
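Of the techniques listed, tf-idf indexing is the easiest to illustrate. The toy three-document corpus below is invented; tf is the term count in the document and idf is log(N / document frequency), one common variant among several.

```python
import math

# Minimal tf-idf sketch over a toy corpus of tokenized documents.
docs = [["cat", "sat", "mat"], ["cat", "cat", "dog"], ["dog", "ran"]]

def tf_idf(term, doc, docs):
    tf = doc.count(term)                          # term frequency in doc
    df = sum(1 for d in docs if term in d)        # document frequency
    return tf * math.log(len(docs) / df)

# "cat" occurs twice in docs[1] and appears in 2 of the 3 documents.
print(tf_idf("cat", docs[1], docs))
```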

Ques 18. (i) What are conversational agents? Give few examples of them. Describe architecture of a
general conversational software.
(ii) Mention some issues which may be observed in dialogue based agents.

Ans . Conversational agents, chatbots, dialogue systems and virtual assistants are some of the terms used by
scientific literature to describe software-based systems which are capable of processing natural language data
to simulate a smart conversational process with humans. These conversational mechanisms are built and driven
by a wide variety of techniques of different complexity, from traditional, pre-coded algorithms to emerging
adaptive machine learning algorithms. Usually deployed as service-oriented systems, they are designed to
assist users to achieve a specific goal based on their personal needs.
The use of natural language interfaces in the field of human-computer interaction is undergoing intense study
through dedicated scientific and industrial research. The latest contributions in the field, including deep learning


approaches like recurrent neural networks, the potential of context-aware strategies and user-centred design
approaches, have brought back the attention of the community to software-based dialogue systems, generally
known as conversational agents or chat-bots.
Five distinct traditions in dialogue systems research involving communities that have largely worked
independently of one another are as mentioned below:
 Text-based and Spoken Dialogue Systems.
 Voice User Interfaces.
 Chat bots.
 Embodied Conversational Agents.
 Social Robots and Situated Agents.

Examples of Dialogue Based Agent Applications


 Dialogue systems that appeared in the 1960s and 1970s were text-based. BASEBALL, SHRDLU and GUS are
some well-known examples. BASEBALL was a question-answering system that could answer questions about
baseball games. The system was able to handle questions with a limited syntactic structure and simply rejected
questions that it was not able to answer. SHRDLU was linguistically more advanced, incorporating a large
grammar of English, semantic knowledge about objects in its domain (a blocks world), and a pragmatic
component that processed nonlinguistic information about the domain. GUS was a system for booking flights
that was able to handle linguistic phenomena such as indirect speech acts and anaphoric reference. For
example, the utterance I want to go to San Diego on May 28 was interpreted as a request to make a
flight reservation, and the utterance the next flight was interpreted with reference to a previously
mentioned flight.
 Around the late 1980s and early 1990s, with the emergence of more powerful and more accurate speech
recognition engines, Spoken Dialogue Systems (SDSs) began to appear, such as: ATIS (Air Travel
Information Service) in the U.S.
 The DARPA Communicator systems were an exception, as they investigated multi-domain dialogues
involving flight information, hotels, and car rentals. These systems often suffered from speech
recognition errors, so a major focus was on avoiding miscommunication, for example by employing
various strategies for error detection and correction.
 Around 2000 the emphasis in spoken dialogue systems research moved from handcrafted systems using
techniques from symbolic and logic-based AI to statistical, data-driven systems using machine learning.


Fig: Present day Dialogue System

The following figure represents the architecture/components of a conversational agent:

Fig: Architecture of a general conversational agent


A straightforward and well-known approach to dialogue system architecture is to build it as a chain of processes,
a pipeline, where the system takes a user utterance as input and generates a system utterance as output. In this
chain, the speech recognizer (ASR) takes a user’s spoken utterance.


(1) The ASR transforms the spoken utterance into a textual hypothesis of the utterance.
(2) The natural language understanding (NLU) component parses the hypothesis and generates a semantic
representation of the utterance.
(3) This representation is then handled by the dialogue manager (DM), which looks at the discourse and
dialogue context to, for example, resolve anaphora and interpret elliptical utterances, and generates a response
on a semantic level.
(4) The natural language generation (NLG) component then generates a surface representation of the utterance,
often in some textual form.
(5) This surface form is passed to a text-to-speech synthesizer (TTS), which generates the audio
output to the user.
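The five-stage pipeline above can be sketched as a chain of stub functions. Everything here is a toy stand-in: the ASR and TTS stages are stubbed as identity functions on text, and the flight-booking intent and replies are invented for illustration.

```python
# Toy sketch of the ASR -> NLU -> DM -> NLG -> TTS pipeline.
def asr(audio):                      # (1) speech -> text hypothesis
    return audio                     # pretend audio is already transcribed

def nlu(text):                       # (2) text -> semantic representation
    if "flight" in text.lower():
        return {"intent": "book_flight"}
    return {"intent": "unknown"}

def dialogue_manager(semantics):     # (3) semantics -> response semantics
    if semantics["intent"] == "book_flight":
        return {"act": "ask", "slot": "destination"}
    return {"act": "clarify"}

def nlg(response):                   # (4) response semantics -> surface text
    if response["act"] == "ask":
        return "Where would you like to fly to?"
    return "Sorry, could you rephrase that?"

def tts(text):                       # (5) text -> audio (stubbed as text)
    return text

def pipeline(user_utterance):
    return tts(nlg(dialogue_manager(nlu(asr(user_utterance)))))

print(pipeline("I want to book a flight"))
```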

(ii) Issues related to dialogue based agents


 Issues such as prosodic analysis, discourse modeling, and deep and surface language generation.
 Instead of passing all data along the pipe, it is possible to have a shared information storage or a
blackboard that all components write and read to, and from which they may subscribe to events.
 The components may send other messages, and not just according to this pipeline. For example, the
ASR may send messages about whether the user is speaking or not directly to the TTS.
 The components might operate asynchronously and incrementally.
 Asynchronicity means that, for example, the ASR may recognize what the user is saying while the DM
is planning the next thing to say.
 Incrementality means that, for example, the ASR recognizes the utterance word by word as the words are
spoken, and that the NLU component simultaneously parses these words.

Ques 19. Write short notes on the following:


(i) Advantages of natural language to data base interface.
(ii) Sub components of NLIDB

Ans . (i) The following are some of the advantages of a natural language to database interface:
 Provide high-level intelligent tools that provide new insights into the contents of the database by
extracting knowledge from data.
 Make information available to larger numbers of people because more people can now utilize the system
due to its ease of use.
 Improve the decision making process involved in using information after it has been retrieved by using
higher level information models.
 Interrelate information from different sources using different media so that the information is more easily
absorbed and utilized by the user.


No Artificial Language: One advantage of NLIDBs is that the user is not required to learn an artificial
communication language. Formal query languages like SQL are difficult to learn and master, at least for
non-computer-specialists.
 Simple, easy to use: Consider a database with a query language or a form designed to
express the query. While an NLIDB system only requires a single input, a form-based
interface may contain multiple inputs (fields, scroll boxes, combo boxes, radio buttons,
etc.) depending on the capability of the form.
 Most NLIDB systems provide some tolerance of minor grammatical errors, whereas in a
conventional computer system the lexicon must usually match exactly what is defined, the
syntax must correctly follow certain rules, and any error causes the input to be rejected
automatically by the system.

(ii) The problem of natural language access to a database is divided into two sub-components:


 Linguistic component
 Database component
Linguistic Component: It is responsible for translating natural language input into a formal query and generating
a natural language response based on the results from the database search.
Database Component: It performs traditional database management functions. A lexicon is a table that is used
to map the words of the natural input onto the formal objects (relation names, attribute names, etc.) of the
database. Both the parser and the semantic interpreter make use of the lexicon. A natural language generator
takes the formal response as its input and inspects the parse tree in order to generate an adequate natural
language response. Natural language database systems make use of syntactic knowledge and knowledge about
the actual database in order to properly relate natural language input to the structure and contents of that database.

Ques 20. Explain briefly the approaches for the development of NLIDB systems. Also describe the
architecture of an NLIDB system.
Ans. The main approaches to developing a natural language intelligent database system are the following:
(i) Symbolic Approach (Rule Based Approach): Natural Language Processing appears to be a
strongly symbolic activity. Words are symbols that stand for objects and concepts in the real
world, and they are put together into sentences that obey well specified grammar rules.
Knowledge about language is explicitly encoded in rules or other forms of representation.
Language is analyzed at various levels to obtain information, and certain rules are applied
to this information to achieve linguistic functionality. As human language capabilities
include rule-based reasoning, they are supported well by symbolic processing, in which
rules are formed for every level of linguistic analysis.


(ii) Empirical Approach (Corpus Based Approach): Empirical approaches are based on
statistical analysis, as well as other data driven analyses, of raw data in the form of
text corpora. A corpus is a collection of machine readable text. Corpora are primarily used as
a source of information about language, and a number of techniques have emerged to enable
the analysis of corpus data. Syntactic analysis can be achieved on the basis of statistical
probabilities estimated from a training corpus. Lexical ambiguities can be resolved by
considering the likelihood of one or another interpretation on the basis of context.
(iii) Connectionist Approach (Using Neural Networks): Since human language capabilities are
based on neural networks in the brain, Artificial Neural Networks (also called connectionist
networks) provide an essential starting point for modeling language processing.

Architecture of NLIDB

Most current NLIDBs first transform the natural language question into an intermediate logical query,
expressed in some internal meaning representation language. The intermediate logical query expresses the
meaning of the user’s question in terms of high level world concepts, which are independent of the database
structure. The logical query is then translated to an expression in the database’s query language and evaluated
against the database. The idea is to map a sentence into a logical query language first, and then further
translate this logical query into a general database query language, such as SQL. In the process
there can be more than one intermediate meaning representation language.

In the intermediate representation language approach, the system can be divided into two parts. One part goes
from the sentence up to the generation of a logical query. The other part goes from the logical query up to the
generation of a database query. In the first part, the use of logic query languages makes it possible to add
reasoning capabilities to the system by embedding the reasoning part inside a logic statement. In addition,

because the logic query language is independent of the database, it can be ported to different database query
languages as well as to other domains, such as expert systems and operating systems.

A semantic grammar system is very similar to the syntax based system, meaning that the query result is
obtained by mapping the parse tree of a sentence to a database query. The basic idea of a semantic grammar
system is to simplify the parse tree as much as possible, by removing unnecessary nodes or combining some
nodes together.

****************End of Unit -2*******************

KRISHNA INSTITUTE OF TECHNOLOGY NATURAL LANGUAGE PROCESSING ( UNIT 3)

UNIT -3

 Grammars and Parsing: Grammars and sentence Structure


 Top-Down and Bottom-Up Parsers
 Transition Network Grammars
 Augmented Transition Networks.
 Top- Down Chart Parsing.
 Feature Systems and Augmented Grammars: Basic Feature system for
English
 Morphological Analysis and the Lexicon
 Parsing with Features


SHORT ANSWER TYPE QUESTIONS

Ques 1. What are context free grammars?


Ans: A context-free grammar is a set of recursive rules used to generate patterns of strings. A
context-free grammar can describe all regular languages and more, but they cannot describe all possible
languages. Context-free grammars are studied in fields of theoretical computer science, compiler design,
and linguistics. A context free grammar G can be defined by a four tuple G = (V, T, P, S), where
V is a finite set of non-terminal symbols,
T is a finite set of terminal symbols,
P is a set of production rules, and
S is the start symbol.
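The four-tuple definition can be made concrete with a toy grammar and a tiny generator. The grammar below is invented for illustration: P is a dict of productions, V is its key set, and everything not in V is treated as a terminal.

```python
import random

# Toy CFG G = (V, T, P, S). Each production's right-hand side is a list
# of symbols; a non-terminal may have several alternative productions.
P = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "N": [["dog"], ["cat"]],
    "VP": [["barks"], ["sleeps"]],
}
S = "S"   # start symbol; V = P.keys(), T = all other symbols

def generate(symbol):
    if symbol not in P:               # terminal: emit it as-is
        return [symbol]
    rule = random.choice(P[symbol])   # pick one production
    out = []
    for sym in rule:
        out.extend(generate(sym))
    return out

print(" ".join(generate(S)))  # e.g. "the dog barks"
```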

Ques 2. What are semantic grammars?


Ans . Semantic grammars combine syntactic, semantic, and pragmatic knowledge into a single set
of rules in the form of a grammar. This is usually just a context-free grammar in which the choice of
nonterminals and production rules is governed by semantic as well as syntactic criteria. Semantic
categories are added for terminal symbols.

CONTEXT FREE GRAMMAR SEMANTIC GRAMMAR


<S> →<NP><VP> <S> →<NP_A><VP_Res>
<NP> →(<DET>)<NOUN> | <Name> <NP_A> →(<DET>) <Noun_A> | <P_Name>

Ques 3. What are Verb-Form Features ?


Ans. Following verb form features are defined in feature systems:
 base - base form (for example, go, be, say, decide)
 pres - simple present tense (for example, go, goes, am, is, say, says, decide)
 past - simple past tense (for example, went, was, said, decided)
 fin - finite (that is, a tensed form, equivalent to {pres past})
 ing - present participle (for example, going, being, saying, deciding)
 pastprt - past participle (for example, gone, been, said, decided)
 inf - a special feature value that is used for infinitive forms with the word to


Ques 4. What is Chomsky normal form?

Ans . A CFG is said to be in Chomsky Normal Form (in short, CNF) if the following are true.

(i) Every rule is of the form –


A → BC, or A → a, where A, B, C are variables and a is a terminal.

(ii) The start variable is not present in the right hand side of any rule.
(iii) The rule S → ε may be present (depending on whether the language contains ε or not).

LONG ANSWER TYPE QUESTIONS

Ques 5. Explain Unification Grammars and Transformational grammars with examples.


Ans. Unification grammar combines syntactic properties of categorial grammar with semantic
properties of discourse representations. It is strongly lexicalist: less information resides in the
grammar rules, and most of it is available in the lexicon. It is strictly declarative.
Unification grammars describe constituents: if two word sequences are mutually interchangeable in
every context while preserving grammaticality, then both are constituents of the same grammatical category.
Transformational Grammars:
A grammar is called transformational when it transforms one sentence into another while keeping the meaning
intact; the active-passive transformation of sentences is a case in point. If the grammar is to consist of a
finite set of rules operating upon a finite vocabulary, and thereby be capable of generating an infinite set of
sentences, it follows that at least some of the rules must be applicable more than once in the generation
of the same sentence. Such rules, and the structures they generate, are called recursive.
According to Chomsky the syntactic description of sentences has two aspects:
 Surface Structure and
 Deep Structure.
Surface structure is the aspect of description that determines the phonetic form of sentences, while deep
structure determines semantic interpretation. The rules that express the relation of deep and surface structures
in sentences are called grammatical transformations, hence the term Transformational Generative
Grammar.
Ques 6. Describe the relevance of phrase structures. What are phrase marker? Explain with example.
Ans: Adequate grammatical rules are sufficient to generate the grammatical strings of a language and equally to
preclude us from generating obviously ungrammatical ones. These rules also help us to decide in the marginal
cases where our intuition fails us, and to apprehend the grammatical structure of sentences. So the grammar


which Chomsky proposes to address here is the type of grammar that satisfies the two requirements stated
above.
This type of grammar is said to be phrase-structure grammar, claimed to be "the prevailing conception of
syntactic description among modern linguists today". Phrase-structure grammar is at times called
"Immediate Constituent" grammar. Phrase structure is a mental process for learning a language at the
beginning; it is the kind of grammar we are taught in kindergarten, where we come to know the various
components of a sentence, viz. Noun, Verb, Adverb, etc. One of the important shortcomings of this
conception, however, is that it cannot by itself relate structurally different sentences that share a meaning
(for example, active-passive pairs), which is what motivates transformational grammar.
One of the important aspects of phrase-structure grammar is that any set of sentences that can be generated
by a finite state grammar can equally be generated by a phrase-structure grammar. Chomsky himself has
attempted to construct a grammar on the basis of a carefully axiomatized and consistently detailed level of
phrase structure.
Phrase structure grammar is more powerful than finite state grammar, as it does everything that finite state
grammars do and more. Let us consider the sentence "The boy hits the girl". We can represent the structure
of this sentence by the following labelled bracketing (equivalent to the tree diagram):

[S [NP [ART The] [N boy]] [VP [V hits] [NP [ART the] [N girl]]]]
Here the above structural description of the sentence under consideration is called Phrase Marker (PM)
of the sentence.
 PM informs us how many different ways the sentence can be distributively analysed.
 It shows that the sentence can be composed of a noun phrase and verb phrase and the verb
phrase is again composed of a verb and another noun phrase.
 Also that a noun can be composed of an article plus a noun.
 It makes clear to which categories certain English words belong, and to which higher-level
categories those categories in turn belong, and so on.
 It thus makes sense to say that the PM reflects a hierarchy through which the different categories
and sub-categories of a sentence can be exhibited; it provides us with categorical information.

Ques 7. Write an algorithm to convert context free grammar into Chomsky normal form. Give
example also.
Ans : A CFG is said to be in Chomsky Normal Form (in short, CNF) if the following are true.

(i) Every rule is of the form –


A → BC, or A → a, where A, B, C are variables and a is a terminal.

(ii) The start variable is not present in the right hand side of any rule.
(iii) The rule S → ε may be present (depending on whether the language contains ε or not).

Steps for converting CFG into CNF

Step 1: Eliminate the start symbol from the RHS. If the start symbol S appears on the right-hand side of any
production, create a new production as:
S1 → S, where S1 is the new start symbol.

Step 2: Remove the null, unit and useless productions from the grammar (the standard simplification of a
CFG).

Step 3: Eliminate terminals from the RHS of the production if they exist with other non-terminals or terminals.
For example, production S → aA can be decomposed as:
S → RA , R→a

Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can be decomposed as:
S → RS , R → AS

Example : Convert the given CFG to CNF. Consider the given grammar G1:
S → a | aA | B
A → aBB | ε
B → Aa | b

Step 1: We will create a new production S1 → S, as the start symbol S appears on the RHS. The grammar will
be:
S1 → S
S → a | aA | B
A → aBB | ε
B → Aa | b

Step 2: As grammar G1 contains A → ε null production, its removal from the grammar yields:
S1 → S
S → a | aA | B
A → aBB
B → Aa | b | a


Now remove the unit production S → B by substituting B's alternatives into S:

S1 → S
S → a | aA | Aa | b
A → aBB
B → Aa | b | a

Also remove the unit production S1 → S; its removal from the grammar yields:

S1 → a | aA | Aa | b
S → a | aA | Aa | b
A → aBB
B → Aa | b | a

Step 3: In the production rules S1 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the terminal a exists on
the RHS together with non-terminals. So we replace that terminal a with a new variable X:

S1 → a | XA | AX | b
S → a | XA | AX | b
A → XBB
B → AX | b | a
X → a

Step 4: In the production rule A → XBB, the RHS has more than two symbols; decomposing it with a new
variable R yields:

S1 → a | XA | AX | b
S → a | XA | AX | b
A → RB
B → AX | b | a
X → a
R → XB

Hence, for the given grammar, this is the required CNF.
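A grammar in CNF can be checked mechanically with the CYK algorithm, which decides string membership for any CNF grammar. The sketch below encodes the grammar derived in the example, using S1 as the start symbol; the dictionary encoding itself is ours:

```python
# CNF grammar from the worked example (S1 is the start symbol);
# each body is a tuple: two variables, or a single terminal.
GRAMMAR = {
    "S1": [("X", "A"), ("A", "X"), ("a",), ("b",)],
    "S":  [("X", "A"), ("A", "X"), ("a",), ("b",)],
    "A":  [("R", "B")],
    "B":  [("A", "X"), ("a",), ("b",)],
    "X":  [("a",)],
    "R":  [("X", "B")],
}

def cyk(word, grammar, start="S1"):
    """CYK membership test: table[span-1][i] holds the variables that
    derive the substring of `word` of length `span` starting at i."""
    n = len(word)
    if n == 0:
        return False
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):                 # length-1 substrings
        for head, bodies in grammar.items():
            if (ch,) in bodies:
                table[0][i].add(head)
    for span in range(2, n + 1):                  # longer substrings
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[split - 1][i]
                right = table[span - split - 1][i + split]
                for head, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[span - 1][i].add(head)
    return start in table[n - 1][0]

print(cyk("aabb", GRAMMAR))   # True: S1 -> XA, X -> a, A -> RB -> (XB)B -> abb
print(cyk("ba", GRAMMAR))     # False
```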


Ques 8. (i)What are the issues in parsing? Differentiate between top down parsing and bottom up
parsing.
(ii) Explain case grammar and its use in NLP.
Ans . (i) Few issues in parsing are as mentioned below:
 Ambiguity in parsing: A sentence is structurally ambiguous if the grammar assigns it more than
a possible parse
 Grammar expressivity
 Coverage and Involved Knowledge Sources
 Parsing strategy (Top down or bottom up)
 Parsing direction
 Production application order
 Ambiguity management

Top-Down Parsing:


 Guided by goals: starts with a goal (or set of goals) to be built and tries to solve one of the pending goals.
 If more than one production can be applied, pending goals can be reordered, and several search
criteria (including heuristics) can be applied.
 The process ends when all the goals have been reached.
 Left recursion in the grammar causes non-termination.

Bottom-Up Parsing:
 Data driven: starts from the sequence of words to be parsed (the facts).
 Proceeds bottom up (from leaf nodes to the root node).
 Several search criteria (including heuristics) can be applied.
 The process ends when the list of facts contains the initial symbol of the grammar.
 Inefficient when there is high lexical ambiguity, and work is repeated.


Example:

(ii) Case Grammar : In 1968, Charles Fillmore published his theory of Case Grammar, which
highlighted the fact that syntactic structure can be predicted by semantic participants. It focuses on the
link between the number of subjects, objects and so on of a verb and the grammatical context it requires.
The underlying structure of the syntactic and semantic relationships between nouns and verbs related
with it is deep case. This kind of case doesn’t have to be shown through the change of the morphology
of nouns and pronouns. The case is determined based on the underlying structure of the syntactic.
and semantic relationships between nouns and verbs.
Another definition: in a case grammar, a sentence is defined as being composed of a proposition P and a
modality M (mood, tense, aspect, negation, etc.).

Example of case grammar:


 The door opened.
"The door" in the first sentence is the subject. But to our knowledge, the door can't open itself; it is
opened by people. So according to case grammar, "the door" is the affected entity of the action.
We may call it the objective case.
 The key opened the door.
The same goes to “the key” in the second sentence. “The key” in the second sentence is the subject.
However, the key can’t open the door by itself. It is the people who use the key to open the door. So the
key is an instrument with which people can carry out an action. Here, we call the key instrumental case
on the basis of case grammar.
List of cases and their definitions:
 (A) Agentive: instigator of the action, animate


 (E) Experiencer: affected by the action, animate
 (I) Instrumental: force or object causing the action or state
 (O) Objective: semantically the most neutral case
 (S) Source: the origin or starting point

Importance of case grammar: It has been widely used; people have applied case grammar to English language
teaching, English-Chinese translation, machine translation systems, and so on, including English vocabulary
teaching.

Ques 9. What are Transition networks and its types ? Explain augmented transition networks with its
importance.
Ans. A transition network is a finite state automaton that is used to represent a part of a grammar. A
transition network parser uses a number of these transition networks to represent its entire grammar. Each
network represents one non-terminal symbol in the grammar.

Types of Transition Networks:

(a). Augmented Transition Networks (ATNs): ATN was developed by William Woods in 1970. The ATN
method of parsing sentences integrates many concepts from Chomsky’s (1957) formal grammar theory with a
matching process resembling a dynamic semantic network.

(b). Recursive Transition Networks (RTNs): An RTN permits arc labels to refer to other networks, which may
in turn refer back to the referring network, rather than permitting only word categories as labels.

Augmented Transition Networks (ATN) became popular in the 1970s for parsing text. An ATN is a
generalized transition network with three major enhancements:

1. Support for recursive transitions, including jumping to other ATNs
2. Performing arbitrary actions when edges are traversed
3. Remembering state through the use of registers

For example, a Sentence ATN may have 3 nodes, while the NP and VP networks each have 2. Control starts at
the start node of Sentence, and as part of traversing the next two nodes, the NP and VP ATNs are respectively
visited (in essence, like subroutine calls); if the process is successful, a data structure representing the parsed
information is synthesized.


Properties of ATNs:
 Arcs, aside from being labeled by word classes or syntactic constructs, can have arbitrary tests
associated with them that must be satisfied before the arc is taken.
 Certain actions may be "attached" to an arc, to be executed whenever it is taken (usually to
modify the data structure returned).
 Registers can contain items, flags, or structures; e.g., a register named SUBJ could contain the NP
that is the candidate for the subject of the sentence. A PUSH arc saves the registers on a stack;
a POP arc restores the registers from the stack.
 Tests look for agreement (usually by examining registers) and can prevent trying a match on an arc
if it would lead to failure; e.g., on a CAT NP arc following a verb, a test could examine the VERB
register to see if the verb can take an NP in that position.
 Actions are taken if an arc match is found, e.g., to set or modify registers.


 HOLD - puts a constituent on a holding list from which it is later removed and placed in its
correct structural pattern
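The subroutine-call behaviour of transition networks (a PUSH into the NP or VP network, a POP back on success) can be sketched as follows. The networks and lexicon here are illustrative, and the tests and registers of a full ATN are omitted:

```python
# Each network is a list of arcs (state, label, next_state); a label is a
# word category, another network's name (an implicit PUSH), or "POP".
NETWORKS = {
    "S":  [(0, "NP", 1), (1, "VP", 2), (2, "POP", None)],
    "NP": [(0, "ART", 1), (1, "N", 2), (2, "POP", None)],
    "VP": [(0, "V", 1), (1, "NP", 2), (2, "POP", None)],
}
LEXICON = {"the": "ART", "dog": "N", "boy": "N", "saw": "V"}

def traverse(net, state, words, pos):
    """Return every input position reachable after traversing `net`."""
    results = []
    for s, label, nxt in NETWORKS[net]:
        if s != state:
            continue
        if label == "POP":                                 # leave the network
            results.append(pos)
        elif label in NETWORKS:                            # PUSH to subnetwork
            for p in traverse(label, 0, words, pos):
                results.extend(traverse(net, nxt, words, p))
        elif pos < len(words) and LEXICON.get(words[pos]) == label:
            results.extend(traverse(net, nxt, words, pos + 1))
    return results

def accepts(words):
    return any(p == len(words) for p in traverse("S", 0, words, 0))

print(accepts("the dog saw the boy".split()))   # True
print(accepts("dog the saw".split()))           # False
```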

Ques 10. Explain chart parsing algorithm.


Ans. Chart Parsing Technique: A chart parser avoids the repeated work of simple top-down or bottom-up
parsing by recording, in a table called the chart, every constituent found so far over each span of the input,
together with every partially matched rule (an active arc). The key steps are: (1) take an entry from an
agenda of newly found constituents; (2) combine it with any active arc that is looking for a constituent of
that category at that position (the fundamental rule of chart parsing), extending the arc; (3) when a rule is
completely matched, add the new constituent to the agenda and the chart; (4) in top-down chart parsing,
rules are introduced as active arcs from the goal category downward. Parsing succeeds when the chart
contains a constituent of the start category spanning the whole input. Because each constituent is built only
once and then reused, a chart parser never repeats work, which makes it far more efficient than plain
backtracking parsers.
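A minimal top-down chart parser in this spirit is Earley's algorithm: chart[i] holds dotted rules (active arcs), and predictor, scanner, and completer steps fill the chart. The grammar and lexicon below are illustrative:

```python
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["ART", "N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "ART", "a": "ART", "boy": "N", "dog": "N", "saw": "V"}

def earley_recognize(words):
    """Top-down chart recognition; states are (head, body, dot, origin)."""
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("GAMMA", ("S",), 0, 0))            # dummy start state
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            head, body, dot, origin = agenda.pop()
            if dot < len(body):
                nxt = body[dot]
                if nxt in GRAMMAR:                   # predictor: add active arcs
                    for prod in GRAMMAR[nxt]:
                        s = (nxt, tuple(prod), 0, i)
                        if s not in chart[i]:
                            chart[i].add(s)
                            agenda.append(s)
                elif i < n and LEXICON.get(words[i]) == nxt:   # scanner
                    chart[i + 1].add((head, body, dot + 1, origin))
            else:                                    # completer: fundamental rule
                for h2, b2, d2, o2 in list(chart[origin]):
                    if d2 < len(b2) and b2[d2] == head:
                        s = (h2, b2, d2 + 1, o2)
                        if s not in chart[i]:
                            chart[i].add(s)
                            agenda.append(s)
    return ("GAMMA", ("S",), 1, 0) in chart[n]

print(earley_recognize("the boy saw a dog".split()))   # True
print(earley_recognize("boy saw".split()))             # False
```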


Ques 11. Write short notes on the following :


(i) Feature structure and feature values.
(ii) Formalizing feature structure.
(iii) Augmented grammars.
(iv) Morphological analysis and lexicons.

Ans. (i) Feature structure and feature values.


A constituent is defined as a feature structure - a mapping from features to values that defines the relevant
properties of the constituent. A feature structure for a constituent ART1 that represents a particular use of the
word a might be written as follows:

(CAT ART
 ROOT a
 NUMBER s)

This says it is a constituent in the category ART that has as its root the word a and is singular. Usually an
abbreviation is used that gives the CAT value more prominence and provides an intuitive tie back to simple
context-free grammars. In this abbreviated form, constituent ART1 would be written as:
ART1: (ART ROOT a NUMBER s)
Feature structures can be used to represent larger constituents as well. To do this, feature structures themselves
can occur as values. Special features based on the integers - 1, 2, 3, and so on - stand for the first sub-
constituent, second sub-constituent, and so on. With this, the representation of the NP constituent for the phrase
"a fish" could be:

(NP NUMBER s
 1 (ART ROOT a NUMBER s)
 2 (N ROOT fish NUMBER s))
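The central operation on feature structures is unification: two structures combine if their feature values are compatible, and fail on a clash, which is exactly how a number-agreement rule rules out "a men". A sketch over Python dicts (the representation is ours):

```python
def unify(f1, f2):
    """Unify two feature structures (dicts); return None on a clash."""
    result = dict(f1)
    for feat, val in f2.items():
        if feat not in result:
            result[feat] = val                       # new feature: just add it
        elif isinstance(result[feat], dict) and isinstance(val, dict):
            sub = unify(result[feat], val)           # recurse into sub-structures
            if sub is None:
                return None
            result[feat] = sub
        elif result[feat] != val:
            return None                              # clash, e.g. NUMBER s vs p
    return result

# NP -> ART N succeeds only if the NUMBER features unify:
print(unify({"CAT": "ART", "ROOT": "a", "NUMBER": "s"}, {"NUMBER": "s"}))
print(unify({"NUMBER": "s"}, {"NUMBER": "p"}))   # None: rules out "a men"
```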


(ii) Formalizing feature structure: This means analyzing feature structure in predicate logic form.

(iii)Augmented grammars: An augmented grammar is any grammar whose productions are augmented
with conditions expressed using features. Features may be associated with any nonterminal symbol in a
derivation. A feature associated with a nonterminal symbol is shown following that nonterminal separated from
it by a dot(.).
In natural languages there are often agreement restrictions between words and phrases. For example, the NP "a
men" is not correct English because the article a indicates a single object while the noun "men" indicates a
plural object; the nounphrase does not satisfy the number agreement restriction of English. There are many
other forms of agreement, including subject-verb agreement, gender agree-ment for pronouns, restrictions
between the head of a phrase and the form of its complement, and so on. To handle such phenomena
conveniently, the grammati-cal formalism is extended to allow constituents to have features. For example, we
might define a feature NUMBER that may take a value of either s (for sin-gular) or p (for plural), and we then
might write an augmented CFG rule.
NP  ART N only when NUMBER1 agrees with NUMBER2
This rule says that a legal noun phrase consists of an article followed by a noun, but only when the number
feature of the first word agrees with the number feature of the second. This one rule is equivalent to two CFG
rules that would use different terminal symbols for encoding singular and plural forms of all noun phrases, such
as
NP-SING  ART-SING N-SING
NP-PLURAL  ART-PLURAL N-PLURAL
Using features, the size of the augmented grammar remains the same as the original one yet accounts for
agreement constraint


(iv)Morphological analysis and lexicons: The lexicon must contain information about all the different words
that can be used, including all the relevant feature value restrictions. When a word is ambiguous, it may be
described by multiple entries in the lexicon, one for each different use. Because words tend to follow regular
morphological patterns, however, many forms of words need not be explicitly included in the lexicon. Most
English verbs, for example, use the same set of suffixes to indicate different forms: -s is added for third person
singular present tense, -ed for past tense, -ing for the present participle, and so on. Without any morphological
analysis, the lexicon would have to contain every one of these forms. For the verb want this would require six
entries, for want (both in base and present form), wants, wanting, and wanted (both in past and past participle
form).
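The regular suffix patterns just described can be captured in a few rules, so the lexicon stores only the base entry for want rather than six inflected forms. A rough sketch (the rule set is simplified, our own naming, and does not cover irregular verbs such as go):

```python
def verb_forms(base):
    """Generate the regularly inflected forms of a verb from its base,
    so the lexicon need only store the base entry."""
    if base.endswith("e"):
        ing_stem = base[:-1]          # decide -> deciding
        past = base + "d"             # decide -> decided
    else:
        ing_stem = base
        past = base + "ed"            # want -> wanted
    return {
        "base": base,
        "pres3s": base + "es" if base.endswith(("s", "sh", "ch", "x")) else base + "s",
        "past": past,
        "pastprt": past,              # regular verbs: past == past participle
        "ing": ing_stem + "ing",
    }

print(verb_forms("want"))
# {'base': 'want', 'pres3s': 'wants', 'past': 'wanted',
#  'pastprt': 'wanted', 'ing': 'wanting'}
```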

Ques 12. Explain briefly some Basic Feature Systems for English.
Ans. Person and Number Features
 First Person (1): The noun phrase refers to the speaker, or a group of people including the speaker
(for example, I, we, us).
 Second Person (2): The noun phrase refers to the listener, or a group including the listener but not
including the speaker (for example, you, all of you).
 Third Person (3): The noun phrase refers to one or more objects, not including the speaker or
hearer.
Since number and person features always co-occur, it is convenient to combine the two into a single feature,
AGR, that has six possible values: first person singular (1s), second person singular (2s), third person singular
(3s), and first, second and third person plural (1p, 2p, and 3p, respectively). For example, an instance of the
word is can agree only with a third person singular subject, so its AGR feature would be 3s.

Verb-Form Features and Verb Subcategorization


Another very important feature system in English involves the form of the verb. This feature is used in many situations,
such as the analysis of auxiliaries, and generally in the subcategorization restrictions of many head words. The
feature VFORM takes the following values:
 base - base form (for example, go, be, say, decide)
 pres - simple present tense (for example, go, goes, am, is, say, says, decide)
 past - simple past tense (for example, went, was, said, decided)
 fin - finite (that is, a tensed form, equivalent to {pres past})


 ing - present participle (for example, going, being, saying, deciding)


 pastprt - past participle (for example, gone, been, said, decided)
 inf - a special feature value that is used for infinitive forms with the word to
For instance, the rule for verbs with a SUBCAT value of _np_vp:inf would be:

(VP) → (V SUBCAT _np_vp:inf) (NP) (VP VFORM inf)
Many verbs have complement structures that require a prepositional phrase with a particular preposition, or one
that plays a particular role. For example, the verb give allows a complement consisting of an NP followed by a
PP using the preposition to, as in "Jack gave the money to the bank". Other verbs, such as "put", require a
prepositional phrase that describes a location, using prepositions such as "in", "inside", "on", and "by". To
express this within the feature system, we introduce a feature PFORM on prepositional phrases.
**********End of Unit 3**********


UNIT - 4

 Grammars for Natural Language: Auxiliary Verbs and Verb Phrases.


 Movement Phenomenon in Language
 Handling questions in Context-Free Grammars
 Human preferences in Parsing
 Encoding uncertainty
 Deterministic Parser.


SHORT ANSWER TYPE QUESTIONS

Ques [Link] are auxiliary verbs ?


Ans. An auxiliary verb (or a helping verb as it's also called) is used with a main verb to help express the main
verb's tense, mood, or voice.
The main auxiliary verbs are to be, to have, and to do. They appear in the following forms:
To Be: am, is, are, was, were, being, been, will be
To Have: has, have, had, having, will have
To Do: does, do, did, will do

Ques 2. What is verb phrase?


Ans. All English verbs function inside verb phrases (VPs). A simple VP consists of a lexical verb acting
as the main verb of the VP and anywhere from zero to four auxiliary verbs, which are used to mark
modality, aspect, and voice. (A compound VP consists of the conjunction of two or more simple VPs.)
VPs can be finite or non-finite. A finite verb phrase marks tense and agreement where appropriate, and
has a subject which must be in the subject case if it is a pronoun. A non-finite verb phrase never marks
tense or agreement, and has a subject which can never be in the subject case if it is a pronoun.

Ques 3. Define verb complement with example.


Ans. A verb complement is usually an object that comes after a verb and completes its meaning. Without the
verb complement, the sentence no longer conveys a complete meaning and looks incomplete.
E.g.: I want.
When you read this sentence, you feel something needs to come after the verb 'want': it is important to say
what you want. Without its object (complement), the sentence looks incomplete.
 I want to learn from you.
 I want some food.
 I want skilled employees.
 I want a lot of money.


LONG ANSWER TYPE QUESTIONS


Ques 4. Define modal auxiliaries with examples in detail. What are common errors while using modal
verbs and how they can be resolved?
Ans. We all need to express our moods and emotions, both in writing and in everyday life. We do this by
using modal auxiliaries.
Modal Auxiliaries: Modal auxiliaries are a type of helping verb that are used only with a main verb to help
express its mood. E.g.: can, could, shall, should, may, might, will, would, must.

 can - Expresses an ability or possibility. E.g.: I can lift this forty-pound box. (ability);
We can embrace green sources of energy. (possibility)
 could - Expresses an ability in the past, a present possibility, or a past or future permission.
E.g.: I could beat you at chess when we were kids. (past ability); We could bake a pie! (present
possibility); Could we pick some flowers from the garden? (future permission)
 may - Expresses uncertain future action or permission; asks a yes-no question.
E.g.: I may attend the concert. (uncertain future action); You may begin the exam. (permission);
May I attend the concert? (yes-no question)
 might - Expresses uncertain future action. E.g.: I might attend the concert. (uncertain future
action, same as may)
 shall - Expresses intended future action. E.g.: I shall go to the opera. (intended future action)
 should - Expresses obligation; asks if an obligation exists. E.g.: I should mail my RSVP.
(obligation, same as ought to); Should I call my mother? (asking if an obligation exists)
 will - Expresses intended future action; asks a favor; asks for information. E.g.: I will get an A in
this class. (intended future action); Will you buy me some chocolate? (favor); Will you be finished
soon? (information)
 would - States a preference; requests a choice politely; explains an action; introduces habitual past
actions. E.g.: I would like the steak, please. (preference); Would you like to have breakfast in bed?
(request a choice politely); I would go with you if I didn't have to babysit tonight. (explain an
action); He would write to me every week when we were dating. (habitual past action)


 must - Expresses obligation. E.g.: We must be on time for class.
 ought to - Expresses obligation. E.g.: I ought to mail my RSVP. (obligation, same as should)

Four common errors when using modal auxiliaries and their resolution:

1. Using an infinitive instead of a base verb after a modal


Incorrect: I can to move this heavy table.
Correct: I can move this heavy table.
2. Using a gerund instead of an infinitive or a base verb after a modal
Incorrect: I could moving to the United States.
Correct: I could move to the United States.
3. Using two modals in a row
Incorrect: I should must renew my passport.
Correct: I must renew my passport.
Correct: I should renew my passport.
4. Leaving out a modal
Incorrect: I renew my passport.
Correct: I must renew my passport.

Ques 5. Explain feature structure of VFORM with example.


Ans. It is very cumbersome to write grammatical rules that include all the necessary features, but there are
certain regularities in the use of features that can be exploited to simplify the process of writing rules. For
instance, many feature values are unique to a feature (for example, the value inf can only appear in the
VFORM feature, and _np_vp:inf can only appear in the SUBCAT feature). Because of this, we can omit
the feature name without introducing any ambiguity.
 Unique feature values are listed using square brackets: thus (VP VFORM inf) is abbreviated as VP[inf].
 Since binary features do not have unique values, a special convention is introduced for them: for a
binary feature B, the constituent C[+B] indicates the constituent (C B +).


(VP VFORM ?v AGR ?a) → (V VFORM ?v AGR ?a SUBCAT _np_vp:inf) (NP) (VP VFORM inf)

If the head features can be declared separately from the rules, the system can automatically add these features to the
rules as needed. With VFORM and AGR declared as head features, the previous VP rule can be abbreviated as:
VP → (V SUBCAT _np_vp:inf) NP (VP VFORM inf)
Combining all the abbreviation conventions, the rule could be further simplified to
VP → V[_np_vp:inf] NP VP[inf]
A simple grammar using these conventions is shown as below:

Ques 6. How is parse tree with feature values generated? Draw the parse tree with feature values for
below given sentences:
(i) He wants to be happy
(ii) The man cries
Ans. Parse trees with feature values are the same as annotated parse trees in compiler design. Nodes of the
parse tree are annotated with the following feature values inside square brackets:

 Information about tenses


 Information about persons (singular, plural)
 Information about subject-verb agreement
 Information about the type of verb, e.g. infinitive, auxiliary, etc.


(i) He wants to be happy (shown as a labelled bracketing):
(S (NP AGR 3s (PRO he)) (VP VFORM pres AGR 3s (V wants) (VP VFORM inf (TO to) (VP VFORM base (V be) (ADJP happy)))))

(ii) The man cries:
(S (NP AGR 3s (ART the) (N man)) (VP VFORM pres AGR 3s (V cries)))

Ques 7. Explain movement phenomenon in natural language and its types with examples.
Ans. Movement phenomenon in natural language: Many sentence structures appear to be simple
variants of other sentence structures. In some cases, simple words or phrases appear to be locally
reordered; sentences are identical except that a phrase apparently is moved from its expected position in a
basic sentence.


Fig: Active and Passive forms of sentences observe reordering / movement of words
To see the structure of yes/no questions and how they relate to their assertional counterparts, consider the
following examples:

He will run in the marathon next year. - Will he run in the marathon next year?
Jack is giving Sue a back rub. - Is Jack giving Sue a back rub?

Yes/no questions appear identical in structure to their assertional counterparts, except that the subject NP
and the first auxiliary have swapped positions. If there is no auxiliary in the assertional sentence, an auxiliary
with root "do", in the appropriate tense, is used:

John went to the store. - Did John go to the store?
Henry goes to school every day. - Does Henry go to school every day?

This rearranging of the subject and the auxiliary is called subject-aux inversion. Informally, you can
think of deriving yes/no questions from assertions by moving the constituents in the manner just described.
This is an example of local (or bounded) movement.

 Local movement: The movement is considered local because the re-arranging of the constituents is
specified precisely within the scope of a limited number of rules.
 Unbounded movement: This occurs in wh-questions. In cases of unbounded movement,
constituents may be moved arbitrarily far from their original position.
 wh-movement - move a wh-term to the front of the sentence to form a wh-question


 Topicalization - move a constituent to the beginning of the sentence for emphasis, as in :


I never liked this picture.
This picture, I never liked.

 Adverb Preposing - move an adverb to the beginning of the sentence, as in:

I will see you tomorrow.


Tomorrow, I will see you.

 Extraposition - move certain NP complements to the sentence final position, as in:

A book discussing evolution was written.


A book was written discussing evolution.

Ques 8. What are Wh-Questions in natural language? Explain its behavior in movement
phenomenon.
Ans. Phrases in natural language sentences of interrogative form, starting with words like which, what,
where, why, when, whose, and whom, are termed wh-questions.
Each question has the same form as the original assertion, except that the part being questioned is removed
and replaced by a wh-phrase at the beginning of the sentence. In addition, except when the part being
queried is the subject NP, the subject and the auxiliary are apparently inverted, as in yes/no questions. This
similarity with yes/no questions even holds for sentences without auxiliaries. In both cases, a "do"
auxiliary is inserted:
I found a bookcase.
Did I find a bookcase?
What did I find?
Thus you may be able to reuse much of a grammar for yes/no questions for wh-questions. A serious
problem remains, however, concerning how to handle the fact that a constituent is missing from someplace
later in the sentence. For example, consider the italicized VP in the sentence

What will the fat man angrily put in the corner?


While this is an acceptable sentence, "angrily put in the corner" does not appear to be an acceptable VP
because you cannot allow sentences such as *"I angrily put in the corner". Only in situations like wh-
questions can such a VP be allowed, and then it is allowed only if the wh-constituent is of the right form to
make a legal VP if it were inserted in the sentence.


Effect on grammar size: If you constructed a special grammar for VPs in wh-questions, you would need a
separate grammar for each form of VP and each form of missing constituent. This would create a
significant expansion in the size of the grammar. The place where a sub-constituent is missing is called the
gap, and the constituent that is moved is called the filler. The techniques that follow all involve ways of
allowing gaps in constituents when there is an appropriate filler available.
Many linguistic theories have been developed that are based on the intuition that a constituent can be moved from
one location to another.
A context-free grammar generated a base sentence; then a set of transformations converted the
resulting syntactic tree into a different tree by moving constituents. Augmented transition networks
offered a new formalism that captured much of the behavior in a more computationally effective manner.
A new structure called the hold list was introduced that allowed a constituent to be saved and used later in
the parse by a new arc called the virtual (VIR) arc. This was the predominant computational mechanism
for quite some time.
The term movement arose in transformational grammar (TG). TG posited two distinct levels of structural
representation: surface structure, which corresponds to the actual sentence structure, and deep structure. A
CFG generates the deep structure, and a set of transformations map the deep structure to the surface structure.
For example, the deep structure of "Will the cat scratch John?" would be:


With this transformation the surface form will be:

Ques 9: Write short notes on the following:


(i) Handling questions in CFG
(ii) Encoding Uncertainty in natural language
(iii) Determinism in parsing.

Ans. (i) Handling questions in CFG: The goal is to extend a context-free grammar minimally so that it can
handle questions. In other words, you want to reuse as much of the original grammar as possible. For yes/no
questions, this is easily done.
S[+inv] → (AUX AGR ?a SUBCAT ?v) (NP AGR ?a) (VP VFORM ?v)
This enforces subject-verb agreement between the AUX and the subject NP, and ensures that the VP has the
right VFORM to follow the AUX. This one rule is all that is needed to handle yes/no questions.
A special feature GAP is introduced to handle wh-questions. This feature is passed from mother to sub-
constituent until the appropriate place for the gap is found in the sentence. At that place, the appropriate
constituent can be constructed using no input. This can be done by introducing additional rules with empty
right-hand sides. For instance, you might have a rule such as:
(NP GAP ((CAT NP) (AGR ?a)) AGR ?a) → ε
The above rule builds an NP from no input if the NP sought has a GAP feature whose value is an NP. Furthermore,
the AGR feature of this empty NP is set to the AGR feature of the value of the GAP.
 There are two general ways in which the GAP feature propagates, depending on whether the head
constituent is a lexical or non-lexical category. If it is a non-lexical category, the GAP feature is passed
from the mother to the head and not to any other sub-constituents. For example, a typical S rule with the
GAP feature would be: (S GAP ?g) → (NP GAP -) (VP GAP ?g)


The GAP can be in the VP, the head sub-constituent, but not in the NP subject. For rules with lexical heads, the
gap may move to any one of the non-lexical sub-constituents. For example, the rule for verbs with a _np_vp:inf
complement,
VP → V[_np_vp:inf] NP PP
would result in two rules involving gaps:
 (VP GAP ?g) → V[_np_vp:inf] (NP GAP ?g) (PP GAP -)
 (VP GAP ?g) → V[_np_vp:inf] (NP GAP -) (PP GAP ?g)
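The gap propagation scheme just described can be sketched in code. The following Python sketch (the rule encoding and the function name `gap_variants` are assumptions for illustration, not from the source) generates, for a rule with a lexical head, one gap-bearing variant per non-lexical sub-constituent, reproducing the two VP rules above.

```python
# Sketch of gap propagation for rules with lexical heads: given VP -> V NP PP,
# the GAP feature may be passed to any one non-lexical sub-constituent,
# while every other non-lexical sub-constituent gets GAP = "-".

def gap_variants(lhs, rhs, lexical=("V",)):
    """Generate one gap-threaded rule per non-lexical sub-constituent."""
    variants = []
    for i, cat in enumerate(rhs):
        if cat in lexical:
            continue  # lexical categories cannot contain the gap
        new_rhs = [c if c in lexical else f"({c} GAP -)" for c in rhs]
        new_rhs[i] = f"({cat} GAP ?g)"
        variants.append((f"({lhs} GAP ?g)", new_rhs))
    return variants

for lhs, rhs in gap_variants("VP", ["V", "NP", "PP"]):
    print(lhs, "->", " ".join(rhs))
```

Running this prints one rule per possible gap location, mirroring the hand-written pair of gap rules.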

(ii)Encoding Uncertainty in natural language: Uncertainty – in its most general sense – can be interpreted as
lack of information: the receiver of the information (i.e. the hearer or the reader) cannot be certain about some
pieces of information. From the viewpoint of computer science, uncertainty emerges due to partial
observability, non-determinism or both.
 Linguistic theories usually associate the notion of modality with uncertainty: epistemic modality
encodes how much certainty or evidence a speaker has for the proposition expressed by his
utterance
 It also refers to a possible state of the world in which the given proposition holds.
There are many situations where uncertainties arise when applying machine learning models. First, we are
uncertain about whether the structure choice and model parameters can best describe the data distribution. This
is referred to as model uncertainty, also known as epistemic uncertainty. E.g : BNN(Bayesian neural
network)
Data uncertainty: Another situation where uncertainty arises is when collected data is noisy. This is often the
case when we rely on observations and measurements to obtain the data. Even when the observations and
measurements are precise, noises might exist within the data generation process. Such uncertainties are referred
to as data uncertainties.
Model uncertainty can be quantified using BNNs, which capture uncertainty about model parameters. Data
uncertainty describes noise within the data distribution. When such noise is homogeneous across the input
space, it can be modeled as a single parameter. When the noise is input-dependent, i.e. the observation noise
varies with the input x, heteroscedastic models are used.
Techniques to encode uncertainty:
Method 1: Law of total variance. Given an input variable x and its corresponding output variable y, the
variance in y can be decomposed as:
Var(y) = Var(E[y|x]) + E[Var(y|x)] = Um + Ud


where Um and Ud are model and data uncertainties respectively: model uncertainty describes the variance of
the conditional mean E[y|x], and data uncertainty describes the variance inherent in the conditional distribution Var(y|x).

(iii) Determinism in parsing: A deterministic parser permits only one choice for each word category, so each
arc has a different test condition.
In natural language processing, deterministic parsing refers to parsing algorithms that do not backtrack.
Backtracking is a general algorithm for finding solutions to some computational problems, notably constraint
satisfaction problems, that incrementally builds candidates to the solutions, and abandons a candidate
("backtracks") as soon as it determines that the candidate cannot possibly be completed to a valid solution.

Ques 10. Write about “Human preference in parsing” with example.


Ans. Human preference in parsing is based on the minimal attachment strategy.
 Preference for the syntactic analysis that creates the least number of nodes in the parse tree.
e.g.: The man kept the dog in the house


George said that Henry left in his car

Parsing (generally called ‘sentence processing’ when we are referring to human parsing) is the process of
building up syntactic interpretations for a sentence from an input sequence of written or spoken words.
Ambiguity is extremely common in parsing problems, and previous research on human parsing has focused on
showing that many factors play a role in choosing among the possible interpretations of an ambiguous sentence.
 Many factors are known to influence human parse preferences. One such factor is the different
lexical/morphological frequencies of the simple past and participial forms of the ambiguous verb form
Types of human preferences in parsing:
(a) Frequency-based preference: Trueswell (1996) found that verbs like searched, with a preference for the
simple past form, caused readers to prefer the main clause interpretation, while verbs like selected had a
preference for a participle reading and supported the reduced relative interpretation.

(b) The transitivity preference of the verb also plays a role in human syntactic disambiguation. Some
verbs are preferably transitive, while others are preferably intransitive. The reduced relative interpretation,
since it involves a passive structure, requires that the verb be transitive.

***************End of Unit 4*****************


UNIT - 5

 Ambiguity Resolution: Statistical Methods


 Probabilistic Language Processing
 Estimating Probabilities
 Part-of Speech tagging
 Obtaining Lexical Probabilities
 Probabilistic Context-Free Grammars
 Best First Parsing. Semantics and Logical Form
 Word senses and Ambiguity
 Encoding Ambiguity in Logical Form.


SHORT ANSWER TYPE QUESTIONS

Ques 1. Define parts of speech tagging with example.


Ans : Part-of-speech (POS) tagging is a popular Natural Language Processing process which refers
to categorizing words in a text (corpus) in correspondence with a particular part of speech, depending on
the definition of the word and its context. It is a process of converting a sentence into forms such as a list of
words or a list of tuples (where each tuple has the form (word, tag)). The tag in this case is a part-of-speech tag, and
signifies whether the word is a noun, adjective, verb, and so on.
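As a minimal illustration of the (word, tag) tuple form described above, the toy tagger below looks each word up in a small hand-made lexicon. The lexicon and the default tag "NN" are invented for this sketch, not a real tagset or tagger.

```python
# Toy dictionary-based tagger producing the (word, tag) tuple form.
# LEXICON and the default tag "NN" are invented for this illustration.

LEXICON = {"the": "DT", "dog": "NN", "barks": "VBZ", "loudly": "RB"}

def simple_tag(sentence):
    """Tag each word with its lexicon entry, defaulting to 'NN' for unknown words."""
    return [(w, LEXICON.get(w.lower(), "NN")) for w in sentence.split()]

print(simple_tag("The dog barks loudly"))
# [('The', 'DT'), ('dog', 'NN'), ('barks', 'VBZ'), ('loudly', 'RB')]
```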

Ques 2. Define pragmatic ambiguity with an example.


Ans. Pragmatic ambiguity arises when the statement is not specific, and the context does not provide
the information needed to clarify the statement.
Sentence | Direct meaning (semantic meaning) | Other meanings (pragmatic meanings)
Do you know what time it is? | Asking for the current time | Expressing anger to someone who missed the due time
Will you crack open the door? | To break the door | Open the door just a little
I am getting hot | One's temperature is rising | An indirect request, e.g. to open the door
The chicken is ready to eat | The chicken is ready to eat its breakfast, for example | The cooked chicken is ready to be served

Ques 3. What is scope neutral logical form?


Ans. In scope-neutral logical form, the quantifiers are stored in the predicate-argument structure with no
scoping preference indicated; hence the representation does not commit to a specific meaning for the sentence.
It is simply a compact way of representing the possible readings of a sentence.
E.g.: Someone loves everyone
Possible logical forms: (i) ∃x ∀y love(x, y)
(ii) ∀y ∃x love(x, y)
Scope-neutral form: (love [∃x x] [∀y y])

Ques 4. What is computational semantics?


Ans. Computational semantics is a relatively new discipline that combines insights from formal semantics,
computational linguistics, and automated reasoning. The aim of computational semantics is to find techniques
for automatically constructing semantic representations for expressions of human language, representations that
can be used to perform inference. The major tasks of computational semantics are :
 Word sense disambiguation


 Word similarity
 Semantic role labeling

Ques 5. What is word2Vec model?


Ans. This model is used to capture semantic information. It was developed by Tomas Mikolov at Google in
2013. It is preferably used to solve advanced NLP problems. It can iterate over a large corpus of text to learn
associations and dependencies among words.
 We can find similarities among words using cosine similarity.
 If the cosine similarity is 1 (angle 0°), the word vectors overlap.
 If the angle between the vectors is 90° (cosine 0), the words are independent.
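The cosine-similarity comparison described above can be sketched as follows; the 3-dimensional vectors are toy values, not trained word2vec embeddings.

```python
import math

# Cosine similarity between two word vectors, as used to compare embeddings.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

king, queen = [0.9, 0.8, 0.1], [0.85, 0.82, 0.15]
print(cosine(king, queen))           # close to 1: similar words
print(cosine([1, 0, 0], [0, 1, 0]))  # 0.0: orthogonal (90 degrees), independent
```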

Ques 6. What is term frequency and inverse document frequency in word sense disambiguation?
Ans. Term frequency uses two parameters to count the frequency of words in documents: a term t and a document d.
Since the ordering of terms is not significant, we can use a vector to describe the text
in the bag-of-terms model. For each specific term in the document, there is an entry whose value is the term
frequency.
TF(t, d) = (number of times term t appears in document d) / (total number of terms in the document)

Inverse document frequency: Mainly, it tests how relevant the word is. The key aim of a search is to
locate the appropriate records that fit the demand. Since tf considers all terms equally significant, it is
not possible to use term frequencies alone to measure the weight of a term in the document. First,
find the document frequency of a term t by counting the number of documents containing the term:
df(t) = N(t)
where
df(t) = document frequency of a term t
N(t) = number of documents containing the term t
Term frequency counts the instances of a term in a single document only, whereas document frequency counts the
separate documents in which the term appears, and so depends on the entire corpus.
The IDF of a term is the number of documents in the corpus divided by the document frequency of the term:
idf(t) = N / df(t) = N / N(t)
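A minimal sketch of the TF and IDF formulas above, computed on a toy corpus (the documents are invented for illustration; note this uses the unsmoothed, logarithm-free idf = N/N(t) form given above).

```python
# TF and IDF computed exactly as in the formulas above, on a toy corpus.

docs = [
    "the cat sat on the mat".split(),
    "the dog barked".split(),
    "cats and dogs".split(),
]

def tf(term, doc):
    """TF(t, d) = count of t in d / total terms in d."""
    return doc.count(term) / len(doc)

def idf(term, corpus):
    """idf(t) = N / df(t), where df(t) = N(t) documents containing t."""
    n_t = sum(1 for d in corpus if term in d)
    return len(corpus) / n_t

print(tf("the", docs[0]))   # 2 occurrences out of 6 terms
print(idf("the", docs))     # 3 documents / 2 containing "the" = 1.5
```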


LONG ANSWER TYPE QUESTIONS


Ques 7. Explain various techniques of ambiguity resolution in natural languages.
Ans. Ambiguity refers to the ability of having more than one meaning or being understood in more than
one way. Natural languages are ambiguous, so computers are not able to understand language the way people
do. Ambiguity can occur at various levels of NLP: lexical, syntactic, semantic, pragmatic,
etc.
(a).Lexical Ambiguity: Ambiguity of a single word. A word can be ambiguous with respect to its syntactic
class. Eg: The word silver can be used as a noun, an adjective, or a verb.
 She bagged two silver medals.
 She made a silver speech.
 His worries had silvered his hair.
(b) Lexical Semantic Ambiguity: The type of lexical ambiguity, which occurs when a single word is
associated with multiple senses. Eg: bank, pen, fast, bat, cricket etc.
Example: The tank was full of water.
I saw a military tank.
The occurrence of tank in both sentences corresponds to the syntactic category noun, but their meanings are
different.
(c) Syntactic Ambiguity: It is a type of ambiguity where the doubt is about the syntactic structure of the
sentence. That is, there is a possibility that a sentence could be parsed in many syntactical forms (a sentence
may be interpreted in more than one way). The doubt is about which one among different syntactical forms is
correct.
Structural ambiguity is of two kinds: Scope Ambiguity and Attachment Ambiguity.
 A scope ambiguity occurs when two quantifiers or similar expressions can take scope over each other
in different ways in the meaning of a sentence.
Consider the example: Old men and women were taken to safe locations. The scope of the adjective (i.e.,
the amount of text it qualifies) is ambiguous. That is, is the structure (old (men and women)) or
((old men) and women)? The scope of quantifiers is often not clear and creates ambiguity.

 Attachment ambiguity arises from uncertainty of attaching a phrase or clause to a part of sentence. It
usually happens when a sentence has more than two prepositional phrases.
Eg: In the sentence “the boy saw the girl with the telescope”, the uncertainty is about relating the
prepositional phrase “with the telescope” to “the boy” or to “the girl”. This could end up with the
following meaning based on the attachment.

[Link](CSE IV YEAR) MR. ANUJ KHANNA( Asstt. Professor)


Page 79
KRISHNA INSTITUTE OF TECHNOLOGY NATURAL LANGUAGE PROCESSING ( UNIT 5)

(d) Semantic Ambiguity: This occurs when the meaning of the words themselves can be misinterpreted. Even
after the syntax and the meanings of the individual words have been resolved, there are two ways of reading the
sentence. E.g.: Seema loves her mother and Sriya does too. The interpretations can be that Sriya loves Seema’s
mother or that Sriya loves her own mother. Semantic ambiguities arise from the fact that generally a computer is not
in a position to distinguish which logical form is intended. E.g.: We saw his duck. "Duck" can refer to the
person’s bird or to a motion he made.

(e)Discourse: Discourse level processing needs a shared world or shared knowledge and the interpretation is
carried out using this context. Anaphoric ambiguity comes under discourse level.

(f)Anaphoric Ambiguity: Anaphoras are the entities that have been previously introduced into the discourse.
E.g: The horse ran up the hill. It was very steep. It soon got tired.
The anaphoric reference of ‘it’ in the two situations cause ambiguity. Steep applies to surface hence ‘it’ can be
hill. Tired applies to animate object hence ‘it’ can be horse.

(g)Pragmatic Ambiguity: Pragmatic ambiguity refers to a situation where the context of a phrase gives it
multiple interpretation. One of the hardest tasks in NLP. The problem involves processing user intention,
sentiment, belief world, modals etc.- all of which are highly complex tasks.
E.g: Tourist (checking out of the hotel): Waiter, go upstairs to my room and see if my sandals are there; do not
be late; I have to catch the train in 15 minutes.
Waiter (running upstairs and coming back panting): Yes sir, they are there. Clearly, the waiter is falling short of
the expectation of the tourist, since he does not understand the pragmatics of the situation.

Ques 8. Discuss few methods of ambiguity resolution with examples.


Ans. Methods for statistical NLP mainly come from machine learning, which is a scientific discipline
concerned with learning from data. That is, to extract information, discover patterns, predict missing
information based on observed information, or more generally construct probabilistic models of the data.
A statistical language model is a probability distribution over all possible word sequences.
n-gram model: The goal of a statistical language model is to estimate the probability of a sentence. This is
achieved by decomposing the sentence probability into a product of conditional probabilities using the chain rule:
p(s) = p(w1, w2, …, wn) = p(w1) p(w2|w1) p(w3|w1w2) … p(wn|w1w2…wn−1) = Π p(wi | hi)
where hi is the history of word wi. In order to calculate the sentence probability, we need to calculate the probability
of each word given the sequence of words preceding it. An n-gram model simplifies the task by approximating the probability
of a word given all the previous words by the conditional probability given the previous n−1 words only.
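The chain-rule decomposition with a bigram (n = 2) approximation can be sketched as follows; the conditional probabilities are illustrative values, not estimates from a real corpus.

```python
# Bigram (n = 2) approximation of the chain rule: p(s) ~ prod p(w_i | w_{i-1}).
# The probabilities below are illustrative values, not corpus estimates.

bigram_p = {
    ("<s>", "the"): 0.5,
    ("the", "dog"): 0.2,
    ("dog", "barks"): 0.3,
}

def sentence_prob(words):
    """Multiply conditional bigram probabilities, starting from a <s> marker."""
    p, prev = 1.0, "<s>"
    for w in words:
        p *= bigram_p.get((prev, w), 0.0)  # unseen bigrams get probability 0
        prev = w
    return p

print(sentence_prob(["the", "dog", "barks"]))  # 0.5 * 0.2 * 0.3 = 0.03
```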


Parts of speech tagging : Statistical methods for reducing ambiguity in sentences have been developed by
using large corpora (e.g., the Brown Corpus or the COBUILD Corpus) about word usage. The words and
sentences in these corpora are pre-tagged with the parts of speech, grammatical structure, and frequencies of
usage for words and sentences taken from a broad sample of written language. Part of speech tagging is the
process of selecting the most likely part of speech from among the alternatives for each word in a sentence .

Probabilistic grammars: Markov models can be extended to grammar rules to help govern choices among
alternatives when the sentence is syntactically ambiguous (that is, there are two or more different parse trees for
the sentence). This technique is called probabilistic grammars. A probabilistic grammar has a probability
associated with each rule, based on its frequency of use in the parsed version of a corpus.

Ques 9. What is probabilistic language processing? Why it is needed in NLP ?

Ans. Probabilistic methods are providing new explanatory approaches to fundamental cognitive science
questions of how humans structure, process and acquire language. Language comprehension and production
involve probabilistic inference in such models; and acquisition involves choosing the best model, given innate
constraints and linguistic and other input. Probabilistic models can account for the learning and processing of
language, while maintaining the sophistication of symbolic models.
Formally, probabilistic parsing involves estimating Prm(t|s): the likelihood of different trees, t,
given a sentence, s, and given a probabilistic model Prm of the language. By applying Bayes' rule:

Prm(t|s) = Prm(s|t) · Prm(t) / Prm(s)

The probabilistic model can take as many forms as there are linguistic theories (and linguistic structures, t, may
equally be trees, attribute-value matrices, dependency diagrams, etc.)

Need of probabilistic models in NLP: Due to following reasons probabilistic models are required:
 Language use is situated in a world context
 People write or say the little that is needed to be understood in a certain discourse situation
 Language is highly ambiguous
 Tasks like interpretation and translation involve (probabilistically) reasoning about meaning, using world
knowledge not in the source text
 Categorical grammars aren’t predictive: their notions of grammaticality and ambiguity do not accord with
human perceptions.
 Need to account for variation of languages across speech communities and across time
 People are creative: they bend language ‘rules’ as needed to achieve their novel communication needs


Probabilities give opportunity to unify reasoning, planning, and learning, with communication.

Ques 10. Explain various probability based language models with examples.
Ans. A language model is defined as follows: Define V to be the set of all words in the language. For example,
when building a language model for English we might have V = {the, dog, laughs, saw, barks, cat, . . .} V can
be quite large: it might contain several thousand , or tens of thousands, of words. V is a finite set. A sentence in
the language is a sequence of words x1 x2 . . . xn.
where the integer n is such that n ≥ 1, we have xi ∈ V for i ∈ {1 . . .(n − 1)}, and we assume that xn is a special
symbol, STOP (we assume that STOP is not a member of V)

Markov Models for Fixed-length Sequences: Consider a sequence of random variables X1, X2, . . . , Xn. Each random
variable can take any value in a finite set V. For now we will assume that the length of the sequence, n, is some
fixed number. Our goal is as follows: we would like to model the probability of any sequence x1 . . . xn, where n
≥ 1 and xi ∈ V for i = 1 . . . n, that is, to model the joint probability: P(X1 = x1, X2 = x2, . . . , Xn = xn)
First order Markov assumption: It creates Bigram language models. Identity of the i’th word in the sequence
depends only on the identity of the previous word, xi−1. More formally, we have assumed that the value of Xi
is conditionally independent of X1 . . . Xi−2, given the value for Xi−1.
P(Xi = xi |X1 = x1 . . . Xi−1 = xi−1) = P(Xi = xi | Xi−1 = xi−1)
Second-order Markov process: this forms the basis of trigram language models. Here we make a slightly
weaker assumption, namely that each word depends on the previous two words in the sequence:
P(X1 = x1, X2 = x2, . . . , Xn = xn) = ∏(i=1..n) P(Xi = xi | Xi−2 = xi−2, Xi−1 = xi−1)

Detailed definition of Trigram Language Model: A trigram language model consists of a finite set V, and a parameter
q(w|u, v) for each trigram u, v, w such that w ∈ V ∪ {STOP}, and u, v ∈ V ∪ {*}. The value for q(w | u, v) can be
interpreted as the probability of seeing the word w immediately after the bigram (u, v). For any sentence x1 . . . xn where
xi ∈ V for i = 1 . . . (n − 1), and xn = STOP, the probability of the sentence under the trigram language model is
p(x1 . . . xn) = ∏(i=1..n) q(xi | xi−2, xi−1), where we define x0 = x−1 = *.

For example, for the sentence the dog barks STOP


we would have : p(the dog barks STOP) = q(the|*, *)×q(dog|*, the)×q(barks | the, dog) × q(STOP | dog, barks)


Ques 11. Discuss about maximum likelihood estimation technique in terms of probabilistic language
processing.
Ans. We use MLE (maximum likelihood estimation) to estimate the probability of w as a sentence of English, that is,
the probability that some sentence S has words w:
PMLE(S = w) = C(w) / N
where C(w) is the count of w in a large dataset, and N is the total number of sentences in the dataset
If we want P(I spent three years before the mast)
we still need P(mast | I spent three years before the).
Trigram model interprets as : P(mast| I spent three years before the) ≈ P(mast | before the)
Also Trigram model assumes these are all equal:
 P(mast | I spent three years before the)
 P(mast | I went home before the)
 P(mast | I saw the sail before the)
 P(mast | I revised all week before the)
If we use relative frequencies (MLE), we ask: out of all the cases where we saw "before, the" as the first two
words of a trigram, how many had "mast" as the third word?
We estimate : PMLE(mast | before, the) = C(before, the, mast) / C(before, the)
where C(x) is the count of x in our training data. For any Trigram we have
PMLE(wi |wi−2, wi−1) = C(wi −2, wi−1, wi ) / C(wi−2, wi−1)
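The MLE trigram estimate above, C(wi−2, wi−1, wi) / C(wi−2, wi−1), can be computed from counts as in this sketch (the two-sentence corpus is a toy example):

```python
from collections import Counter

# MLE trigram estimate q(w | u, v) = C(u, v, w) / C(u, v) from a toy corpus.

corpus = [
    ["the", "dog", "barks", "STOP"],
    ["the", "dog", "sleeps", "STOP"],
]

tri, bi = Counter(), Counter()
for sent in corpus:
    padded = ["*", "*"] + sent          # pad with the * start symbols
    for i in range(2, len(padded)):
        tri[tuple(padded[i - 2:i + 1])] += 1
        bi[tuple(padded[i - 2:i])] += 1

def q(w, u, v):
    return tri[(u, v, w)] / bi[(u, v)]

print(q("barks", "the", "dog"))   # C(the, dog, barks) / C(the, dog) = 1/2
```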

Ques 12. Discuss some method/metrics used to evaluate language models.


Ans. Method /metrics used to evaluate language models are described as following:
Perplexity : Perplexity measures how many different equally most probable words can follow any given
word.
Assume that we have some test data sentences x(1), x(2), . . . , x(m). Each test sentence x(i) for i
∈ {1 . . . m} is a sequence of words x(i)1, . . . , x(i)ni, where ni is the length of the i'th sentence.
Every sentence ends in the STOP symbol. It is critical that the test sentences are "held out",
in the sense that they are not part of the corpus used to estimate the language model; that is,
they are examples of new, unseen sentences.
For any test sentence x(i), we can measure its probability p(x(i)) under the language model. A natural
measure of the quality of the language model is the probability it assigns to the entire set of test
sentences, that is:
∏(i=1..m) p(x(i))


 The higher this quantity is, the better the language model is at modeling unseen sentences.
 Define M to be the total number of words in the test corpus. More precisely, under the definition that ni
is the length of the i'th test sentence, M = Σ(i=1..m) ni.
 The average log probability of the model is defined as:
l = (1/M) Σ(i=1..m) log2 p(x(i))
 This is just the log probability of the entire test corpus, divided by the total number of words in the test
corpus. Here log2(z) is used for any z > 0.

The perplexity is then defined as: 2^(−l)
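A minimal sketch of the perplexity computation just defined, assuming we already have each test sentence's probability and length (the numbers are toy values). It also illustrates the standard sanity check that a uniform model over a vocabulary of size N has perplexity N.

```python
import math

# Perplexity 2^(-l), with l = (1/M) * sum_i log2 p(x_i), as defined above.

def perplexity(sent_probs, sent_lengths):
    M = sum(sent_lengths)  # total number of words in the test corpus
    l = sum(math.log2(p) for p in sent_probs) / M
    return 2 ** (-l)

# Sanity check: a uniform model assigning each word probability 1/N gives a
# sentence of length k probability (1/N)**k, and perplexity exactly N.
N = 10
print(perplexity([(1 / N) ** 4, (1 / N) ** 6], [4, 6]))  # ~10.0
```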

Ques 13. What is POS tagging? Discuss various approaches of POS tagging.
Ans. Tagging is a kind of classification that may be defined as the automatic assignment of descriptors to
tokens. Here the descriptor is called a tag, which may represent a part of speech, semantic information,
and so on.
Part-of-Speech (POS) tagging may be defined as the process of assigning one of the parts of speech to
a given word. Parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions and their sub-
categories.
Approaches of POS tagging :
 Rule Base POS Tagging
 Stochastic POS Tagging
 Transformation Based Tagging.

(a)Rule-based POS Tagging
One of the oldest techniques of tagging is rule-based POS tagging. Rule-based taggers use dictionary or
lexicon for getting possible tags for tagging each word. If the word has more than one possible tag, then
rule-based taggers use hand-written rules to identify the correct tag. Disambiguation can also be performed
in rule-based tagging by analyzing the linguistic features of a word along with its preceding and
following words. For example, if the preceding word of a word is an article, then the word must be a noun.


Information in rule-based POS tagging is coded in the form of rules. These rules may be either −

 Context-pattern rules
 Regular expression compiled into finite-state automata, intersected with lexically ambiguous sentence
representation.
Rule-based POS tagging has a two-stage architecture:

 First stage : In the first stage, it uses a dictionary to assign each word a list of potential parts-of-speech.
 Second stage: In the second stage, it uses large lists of hand-written disambiguation rules to sort down
the list to a single part-of-speech for each word.

(b) Stochastic POS Tagging: A model that includes frequency or probability (statistics) can be called
stochastic. Any number of different approaches to the problem of part-of-speech tagging can be referred to as
stochastic tagging. The simplest stochastic taggers apply the following approaches for POS tagging:

(i) Word Frequency Approach: In this approach, the stochastic taggers disambiguate the words based on the probability
that a word occurs with a particular tag. We can also say that the tag encountered most frequently with the word in the
training set is the one assigned to an ambiguous instance of that word. The main issue with this approach is that it may
yield inadmissible sequences of tags.
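The word frequency approach can be sketched as a most-frequent-tag baseline (the training pairs below are invented for illustration):

```python
from collections import Counter, defaultdict

# Word-frequency stochastic tagging: assign each word the tag it occurred
# with most often in training. The training pairs are invented toy data.

train = [("the", "DT"), ("dog", "NN"), ("runs", "VBZ"),
         ("the", "DT"), ("dog", "NN"), ("dog", "VB")]

counts = defaultdict(Counter)
for word, tag in train:
    counts[word][tag] += 1

def most_frequent_tag(word):
    """Pick the tag seen most often with this word in the training set."""
    return counts[word].most_common(1)[0][0]

print(most_frequent_tag("dog"))   # NN: seen 2 of 3 times as NN
```

Because each word is tagged independently of its neighbors, this baseline can indeed produce the inadmissible tag sequences mentioned above.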

(ii)Tag Sequence Probabilities: It is another approach of stochastic tagging, where the tagger calculates the probability
of a given sequence of tags occurring. It is also called n-gram approach. It is called so because the best tag for a given
word is determined by the probability at which it occurs with the n previous tags.

(c) Transformation-based Tagging: Transformation-based tagging is also called Brill tagging. It is an instance
of transformation-based learning (TBL), which is a rule-based algorithm for automatic tagging of POS to the
given text. TBL allows us to have linguistic knowledge in a readable form, and transforms one state to another state
by using transformation rules. If we see a similarity between rule-based and transformation taggers, then like the
rule-based approach, it is also based on rules that specify what tags need to be assigned to what words. On the other
hand, if we see a similarity between stochastic and transformation taggers, then like the stochastic approach, it is a
machine learning technique in which rules are automatically induced from data.

Advantages of Transformation-based Learning (TBL)

 We learn a small set of simple rules and these rules are enough for tagging.
 Development as well as debugging is very easy in TBL because the learned rules are easy to understand.
 Complexity in tagging is reduced because in TBL there is an interlacing of machine-learned and human-generated rules.
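Applying a learned transformation rule of the kind a Brill tagger uses can be sketched as below. The rule, tag set, and initial tagging are illustrative assumptions:

```python
# Each rule rewrites one tag to another in a triggering context:
# (from_tag, to_tag, required previous tag).
RULES = [("NOUN", "VERB", "TO")]   # e.g. "to fish": NOUN -> VERB after "to"

def apply_rules(tags):
    """Apply each transformation rule left to right over an initial tagging."""
    out = list(tags)
    for frm, to, prev_required in RULES:
        for i in range(1, len(out)):
            if out[i] == frm and out[i - 1] == prev_required:
                out[i] = to
    return out

initial = ["PRON", "VERB", "TO", "NOUN"]   # "I like to fish" after initial tagging
print(apply_rules(initial))                # ['PRON', 'VERB', 'TO', 'VERB']
```

The rules stay human-readable: each one states plainly which tag changes, to what, and in which context, which is the readability advantage listed above.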


Ques 14. Explain morpho-lexical ambiguities.


Ans. Given a word w with K analyses A1, A2, ..., Ak, the morpho-lexical probability of Ai is an estimate of the
conditional probability P(Ai | w) from a given corpus.

A word is morphologically ambiguous if k ≥ 2. Given a language L and a word w, we can find all possible
morphological analyses of w.
A1, A2, ..., Ak are the K analyses of w. Ar ∈ {A1, A2, ..., Ak} is the right analysis, while the other k−1 are wrong.
Reduction of the ambiguity level of an ambiguous word w with K morphological analyses A1, A2, ..., Ak occurs
when it is possible to select from A1, A2, ..., Ak a proper subset of l analyses, 1 ≤ l ≤ k, such that the right
analysis of w is one of these l analyses. If l = 1, we say the word w is fully disambiguated.

Ques 15. What is a probabilistic context-free grammar (PCFG)? Why is it required in NLP?
Ans. The key idea in probabilistic context-free grammars is to extend our definition to give a probability
distribution over possible derivations. That is, we will find a way to define a distribution over parse trees, p(t),
such that for any t ∈ TG, p(t) ≥ 0 and Σt∈TG p(t) = 1, where TG is the set of all possible left-most derivations (parse trees) under the
grammar G. When the grammar G is clear from context, we will often write this simply as T.

Each parse tree t is a complex structure, and the set TG will most likely be infinite. However, we will see that
there is a very simple extension to context-free grammars that allows us to define a function p(t).
A crucial idea is that once we have a function p(t), we have a ranking over possible parses for any sentence in
order of probability. Given a sentence s, we can return the highest-probability tree as the output from our
parser; this is the most likely parse tree for s under the model. If our distribution p(t) is a good model for the
probability of different parse trees in our language, we will have an effective way of dealing with ambiguity.
Issues handled in PCFG:
 How to define the function p(t)?
 How do we learn the parameters of our model of p(t) from training examples?
 For a given sentence s, how do we find the most likely tree?


Example: To calculate the probability of any parse tree t, we simply multiply together the q values for the
context-free rules that it contains.
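This multiplication of rule probabilities can be sketched as below. The grammar fragment and its q values are toy assumptions; lexical rules are flattened for brevity:

```python
q = {                              # q(X -> alpha) for a toy grammar fragment
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("the", "dog")): 0.5,
    ("VP", ("barks",)): 0.4,
}

def tree_prob(tree):
    """tree = (label, children); children are subtrees or terminal strings."""
    label, children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = q[(label, child_labels)]         # probability of the rule at this node
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)            # multiply in every subtree's rules
    return p

t = ("S", [("NP", ["the", "dog"]), ("VP", ["barks"])])
print(tree_prob(t))                      # 1.0 * 0.5 * 0.4 = 0.2
```

Each node contributes exactly one q value, so p(t) is the product over all rules used in the derivation.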

Reasons to use a PCFG:


 PCFG provides a partial solution for grammar ambiguity.
 PCFG gives some idea of the plausibility of a sentence.
 PCFG incorporates robustness into NLP parsing (admit everything, with low probability).
 PCFG encodes certain biases, e.g. that smaller trees are normally more probable.


Ques 16. Explain the CKY algorithm. Why is it used?


Ans. The Cocke–Younger–Kasami algorithm (alternatively called CYK or CKY) is a parsing algorithm for context-
free grammars. The algorithm is named after some of its rediscoverers: John Cocke, Daniel Younger, Tadao
Kasami, and Jacob T. Schwartz. It employs bottom-up parsing and dynamic programming. The standard
version of CYK operates only on context-free grammars given in Chomsky normal form (CNF).
The input to the algorithm is a PCFG G = (N, Σ, S, R, q) in Chomsky normal form and a sentence x1 . . . xn;
the output is the highest-probability parse tree, because by definition π(1, n, S) is the score for the highest probability parse tree spanning words x1 . . . xn, with S as
its root. The key observation in the CKY algorithm is that we can use a recursive definition of the π values, which
allows a simple bottom-up dynamic programming algorithm. The algorithm is "bottom-up" in the sense that it
first fills in π(i, j, X) values for the cases where j = i, then the cases where j = i + 1, and so on.
The base case in the recursive definition is as follows: for all i = 1 . . . n, for all X ∈ N,
π(i, i, X) = q(X → xi) if X → xi ∈ R, and 0 otherwise. The recursive case, for j > i, is
π(i, j, X) = max q(X → Y Z) · π(i, s, Y) · π(s + 1, j, Z), taken over all rules X → Y Z ∈ R and all split points s ∈ {i, . . . , j − 1}.

Example: Consider parsing the sentence x1 . . . x8 = the dog saw the man with the telescope, and consider
the calculation of π(3, 8, VP). This will be the highest score for any tree with root VP, spanning words x3 . . .
x8 = saw the man with the telescope. Two choices are involved: first, a choice between the rules VP → Vt NP
and VP → VP PP; second, a choice of split point s ∈ {3, 4, . . . , 7}.
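The π recursion can be sketched as a compact probabilistic CKY implementation. The grammar and probabilities are toy assumptions, and indices are 0-based here rather than 1-based as in the text:

```python
# Binary rules q(X -> Y Z) and unary lexical rules q(X -> word)
binary = {("S", "NP", "VP"): 1.0, ("NP", "DET", "N"): 1.0}
lexical = {("VP", "barks"): 1.0, ("DET", "the"): 1.0, ("N", "dog"): 1.0}

def cky(words):
    """Return the best probability of an S-rooted parse of `words`."""
    n = len(words)
    pi = {}                                        # (i, j, X) -> best prob
    for i, w in enumerate(words):                  # base case: spans of length 1
        for (X, word), p in lexical.items():
            if word == w:
                pi[(i, i, X)] = p
    for length in range(1, n):                     # widen spans bottom-up
        for i in range(n - length):
            j = i + length
            for (X, Y, Z), p in binary.items():
                for s in range(i, j):              # split point
                    score = p * pi.get((i, s, Y), 0.0) * pi.get((s + 1, j, Z), 0.0)
                    if score > pi.get((i, j, X), 0.0):
                        pi[(i, j, X)] = score
    return pi.get((0, n - 1, "S"), 0.0)

print(cky(["the", "dog", "barks"]))                # 1.0
```

Short spans are filled first, so by the time a span (i, j) is scored, every smaller span it splits into is already in the chart.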

Ques 17. What are dependency graphs? How are they useful in NLP? Explain.
Ans. A dependency tree of an n-word sentence has n−1 dependency links. Every word in the sentence
must have its head, except the word which is the head of the sentence. Crossing links are not allowed.
The basic units of dependency graphs are:
 Non-constituent objects
 Complete-link
 Complete-sequence
Dependency graphs are supported by dependency grammars. Such a grammar describes a language with a set of
head-dependent relations between any two words, e.g. modifier-modified, predicate-argument. A functional
role is assigned to each dependency link and specifies the syntactic-semantic relation between head and dependent.

Fig: Dependency tree with link representation


A dependency tree can be given as a hierarchical representation and as a link representation, respectively. In both, the word
"sold" is the head of the sentence. Here, we define the non-constituent objects, complete-link and complete-
sequence, which are used in the PDG re-estimation and BFP (best-first parsing) algorithms. A set of dependency
links constructed for a word sequence wij is defined as a complete-link if the set satisfies the following conditions:
• The set has exclusively (wi → wj) or (wi ← wj).
• There is neither link-crossing nor link-cycle.
• The set is composed of j − i dependency links.
• Every inner word of wij must have its head, and thus a link from the head.
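The tree conditions above (one head per word except the sentence head, no cycles, no crossing links) can be checked with a small sketch. The head-array representation is an illustrative assumption:

```python
def is_valid_dependency_tree(heads):
    """heads[i] = index of word i's head, or None for the sentence head."""
    n = len(heads)
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:                         # exactly one sentence head
        return False
    for i in range(n):                          # no cycles: walk up to the root
        seen, cur = set(), i
        while heads[cur] is not None:
            if cur in seen:
                return False
            seen.add(cur)
            cur = heads[cur]
    links = [(min(i, h), max(i, h)) for i, h in enumerate(heads) if h is not None]
    for a, b in links:                          # no crossing links
        for c, d in links:
            if a < c < b < d:
                return False
    return True

print(is_valid_dependency_tree([2, 2, None]))    # True: word 2 heads the sentence
print(is_valid_dependency_tree([2, 3, None, 2])) # False: links (0,2) and (1,3) cross
```

The crossing test is the projectivity condition: two links cross exactly when one starts strictly inside the other's span and ends strictly outside it.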


A complete-sequence is defined as a sequence of zero or more adjacent complete-links of the same direction. The basic
complete-sequence is a null sequence of complete-links, which is defined on one word, the smallest word
sequence. The direction of a complete-sequence is determined by the direction of its component complete-links. If
the complete-sequence is composed of leftward complete-links, the complete-sequence is leftward, and vice
versa.

The following notations are used to represent the four kinds of objects for a word sequence wij and for an m
from i to j−1.

The probability of each object is defined as follows:

The PDG best-first parsing algorithm constructs the best dependency tree in a bottom-up manner, with a
dynamic programming method using a CYK-style chart. It is based on the complete-link and complete-sequence
non-constituent concepts. The parsing algorithm constructs the complete-links and complete-sequences for
substrings, and incrementally merges complete-links into larger complete-sequences and complete-sequences
into larger complete-links until the Lr(BOS, EOS) with maximum probability is constructed.
BOS: beginning of sentence, EOS: end of sentence


Ques 18. Define the term word sense disambiguation. Discuss some approaches of it.
Ans. Word sense disambiguation (WSD), in natural language processing (NLP), may be defined as the ability to
determine which meaning of a word is activated by the use of that word in a particular context. Lexical ambiguity,
syntactic or semantic, is one of the very first problems that any NLP system faces. Part-of-speech (POS) taggers
with a high level of accuracy can solve a word's syntactic ambiguity. On the other hand, the problem of resolving
semantic ambiguity is called WSD. Resolving semantic ambiguity is harder than resolving syntactic ambiguity.

For example, consider the two examples of the distinct sense that exist for the word “bass” −

 I can hear bass sound.


 He likes to eat grilled bass.

The occurrence of the word bass clearly denotes a distinct meaning in each case. In the first sentence it refers to a
low-frequency sound, and in the second it refers to a fish. Hence, if the word is disambiguated by WSD, the correct
meaning can be assigned to each of the above sentences.

Evaluation of WSD: The evaluation of WSD requires the following two inputs −

A Dictionary: The very first input for evaluation of WSD is a dictionary, which is used to specify the senses to
be disambiguated.
Test Corpus: Another input required by WSD is a sense-annotated test corpus that has the target or correct
senses. Test corpora can be of two types:
 Lexical sample − This kind of corpus is used in systems where it is required to disambiguate a
small sample of words.
 All-words − This kind of corpus is used in systems where it is expected to disambiguate all the
words in a piece of running text.

Approaches to Word Sense Disambiguation (WSD) : Approaches and methods to WSD are classified
according to the source of knowledge used in word disambiguation.

(i) Dictionary-based or Knowledge-based Methods
(ii) Supervised Methods
(iii) Semi-supervised Methods
(iv) Unsupervised Methods

Dictionary-based approach: These methods primarily rely on dictionaries, thesauri and lexical knowledge
bases. They do not use corpus evidence for disambiguation, which further means they identify the correct sense for
one word at a time. Here the current context is the set of words in the surrounding sentence or paragraph.
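A minimal dictionary-based sketch is the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the surrounding context. The two "bass" glosses below are toy assumptions, not real dictionary entries:

```python
SENSES = {
    "bass": {
        "sound": "the lowest part of the musical range a low frequency tone",
        "fish":  "an edible freshwater or sea fish often grilled or fried",
    }
}

def simplified_lesk(word, context_sentence):
    """Return the sense whose gloss overlaps most with the context words."""
    context = set(context_sentence.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bass", "He likes to eat grilled bass"))  # fish
```

For "He likes to eat grilled bass", the word "grilled" appears in the fish gloss but not the sound gloss, so the fish sense wins — exactly the kind of context evidence a dictionary-based method uses.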


Supervised Methods: For disambiguation, machine learning methods make use of sense-annotated corpora for
training. These methods assume that the context can provide enough evidence on its own to disambiguate the
sense; knowledge and reasoning are deemed unnecessary. The context is represented as a set of "features" of
the words, which includes information about the surrounding words as well. Support vector machines and
memory-based learning are the most successful supervised learning approaches to WSD. These methods rely on a
substantial amount of manually sense-tagged corpora, which are very expensive to create.

Semi-supervised Methods: Due to the lack of training corpora, many word sense disambiguation
algorithms use semi-supervised learning methods, because semi-supervised methods can use both labeled
and unlabeled data. These methods require a very small amount of annotated text and a large amount of plain
unannotated text.

Ques 19. Write short notes on the following:


(i) Semantics and logical form of sentences.
(ii) Encoding Ambiguity in logical form.
Ans. Semantics and logical form of sentences: To avoid immediately committing to a single meaning of an
ambiguous sentence, logical form (LF) partially specifies the meaning of a sentence based on syntactic and
sentence-level information, without considering the effects of pragmatics and context. This partial specification
of meaning allows us to process additional sentences before further limiting the meaning of an ambiguous
sentence. As relevant information becomes available (from a context processing module), the intermediate
representation of the ambiguous sentence can be incrementally updated. This process can continue until all of
the ambiguities are resolved and an unambiguous internal representation of the sentence is generated (although
this is not a requirement of the approach).
Three constraints for using LF in a computational framework:
Compactness Constraint: The compactness constraint postpones decisions about ambiguity without large
storage requirements. LF should compactly represent the underdetermined meaning of a sentence.

Modularity Constraint: The modularity constraint requires LF to be initially computable from syntax and
local (sentence-level) semantics only.

Formal Consistency Constraint: The formal consistency constraint requires that any update to the meaning of
LF should be a refinement of the original meaning. Initially, LF provides a composite representation for a
sentence.


(ii)Encoding ambiguity in logical form:


 Quantifier Scope Ambiguity in Logical Form: Quantifier scope ambiguity has been handled by
some researchers by using an intermediate scope-neutral LF. In a scope-neutral LF, the quantifiers are
stored in the predicate-argument structure with no scoping preference indicated; hence the
representation does not commit to a specific meaning for the sentence. It is simply a compact way of
expressing the set of all possible readings.
 E.g.: Someone loves everyone
Possible logical forms: (i) ∃x ∀y love(x, y)
(ii) ∀y ∃x love(x, y)
Scope-neutral form: (love [∃x x] [∀y y])
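The idea of expanding a scope-neutral form into its readings can be sketched as below: quantifiers are stored unordered, and each ordering yields one fully scoped logical form. The string representation is an illustrative assumption:

```python
from itertools import permutations

def readings(quantifiers, body):
    """quantifiers: list of (symbol, variable) pairs; body: predicate string.
    Each permutation of the quantifier list gives one scoped reading."""
    out = []
    for order in permutations(quantifiers):
        prefix = " ".join(f"{q}{v}" for q, v in order)
        out.append(f"{prefix}. {body}")
    return out

neutral = [("∃", "x"), ("∀", "y")]          # scope-neutral: no order chosen
for r in readings(neutral, "love(x, y)"):
    print(r)
# ∃x ∀y. love(x, y)
# ∀y ∃x. love(x, y)
```

The compactness of the scope-neutral form is visible here: one unordered list stands for all n! scoped readings.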

The meanings of pronouns, singular definite NPs, and singular indefinite NPs often cannot be determined without
additional contextual information. Postponing decisions about the precise meanings of sentences containing
these types of constituents can be extremely useful.
Example: Every man showed a boy a picture of his mother. The precise meaning of the sentence cannot be
specified until information is available to select the pronoun's antecedent and to determine the quantifier
scoping.
Pronouns are a source of ambiguity in verb phrase ellipsis (VPE). To signal a VPE, a full verb phrase (VP) is
replaced with an auxiliary, as in the second sentence below. A sentence with VPE is called an elided sentence. The
index on Fred and his indicates that they are co-referential.


Trigger Sentence: Fred loves his wife.

Elided Sentence: George does too.
Possible Meanings: George loves Fred's wife (strict meaning), or George loves George's wife (sloppy meaning).
The elided sentence has little meaning independent of the first sentence, which is called the trigger sentence.
Hence, before determining the meaning of the elided sentence, the meaning of the trigger sentence must be
completely determined.

******End of Unit 5******
