0% found this document useful (0 votes)

822 views

Unit 4 NLP

Natural Language Processing (NLP) involves computer analysis and representation of human language input. The field aims to perform useful tasks with human languages and improve understanding of language. NLP involves understanding language through morphological, syntactic, semantic and discourse analysis, and generating language. It is an interdisciplinary field that draws from linguistics, computer science, engineering, psychology and philosophy.

Uploaded by

Kumar Sumit

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

822 views

Unit 4 NLP

Uploaded by

Kumar Sumit

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 51

AI: NLP

1
What is Natural Language Processing
(NLP)
• The process of computer analysis of input
provided in a human language (natural language),
and conversion of this input into a useful form of
representation.
• The field of NLP is primarily concerned with
getting computers to perform useful and
interesting tasks with human languages.
• The field of NLP is secondarily concerned with
helping us come to a better understanding of
human language.
2
Forms of Natural Language
• The input/output of a NLP system can be:
– written text
– speech
• We will mostly concerned with written text (not
speech).
• To process written text, we need:
– lexical, syntactic, semantic knowledge about the language
– discourse information, real world knowledge
• To process spoken language, we need everything
required to process written text, plus the
challenges of speech recognition and speech
synthesis.

3
Components of NLP
• Natural Language Understanding
– Mapping the given input in the natural language into a useful representation.
– Different level of analysis required:
morphological analysis,
syntactic analysis,
semantic analysis,
discourse analysis, …
• Natural Language Generation
– Producing output in the natural language from some internal representation.
– Different level of synthesis required:
deep planning (what to say),
syntactic generation
• NL Understanding is much harder than NL Generation. But, still
both of them are hard.

4
Why NL Understanding is hard?
• Natural language is extremely rich in form and structure,
and very ambiguous.
– How to represent meaning,
– Which structures map to which meaning structures.
• One input can mean many different things. Ambiguity can
be at different levels.
– Lexical (word level) ambiguity -- different meanings of words
– Syntactic ambiguity -- different ways to parse the sentence
– Interpreting partial information -- how to interpret pronouns
– Contextual information -- context of the sentence may affect
the meaning of that sentence.
• Many input can mean the same thing.
• Interaction among components of the input is not clear.

5
Knowledge of Language
• Phonology – concerns how words are related to the sounds
that realize them.
• Morphology – concerns how words are constructed from
more basic meaning units called morphemes. A
morpheme is the primitive unit of meaning in a language.
• Syntax – concerns how can be put together to form correct
sentences and determines what structural role each word
plays in the sentence and what phrases are subparts of
other phrases.
• Semantics – concerns what words mean and how these
meaning combine in sentences to form sentence meaning.
The study of context-independent meaning.

6
Knowledge of Language (cont.)
• Pragmatics – concerns how sentences are used in
different situations and how use affects the
interpretation of the sentence.

• Discourse – concerns how the immediately preceding

sentences affect the interpretation of the next
sentence. For example, interpreting pronouns and
interpreting the temporal aspects of the information.

• World Knowledge – includes general knowledge about

the world. What each language user must know about
the other’s beliefs and goals.

7
What is Natural Language Processing
(NLP)
• The process of computer analysis of input
provided in a human language (natural language),
and conversion of this input into a useful
form of representation.
• The field of NLP is primarily concerned with
getting computers to perform useful and
interesting tasks with human languages.
• The field of NLP is secondarily concerned with
helping us come to a better understanding
of human language.
BİL711 Natural Language Processing 8
Forms of Natural Language
• The input/output of a NLP system can be:
– written text
– speech
• We will mostly concerned with written text (not
speech).
• To process written text, we need:
– lexical, syntactic, semantic knowledge about the language
– discourse information, real world knowledge
• To process spoken language, we need everything
required to process written text, plus the
challenges of speech recognition and speech
synthesis.

BİL711 Natural Language Processing 9

Components of NLP
• Natural Language Understanding
– Mapping the given input in the natural language into a useful representation.
– Different level of analysis required:
morphological analysis,
syntactic analysis,
semantic analysis,
discourse analysis, …
• Natural Language Generation
– Producing output in the natural language from some internal representation.
– Different level of synthesis required:
deep planning (what to say),
syntactic generation
• NL Understanding is much harder than NL Generation. But, still
both of them are hard.

BİL711 Natural Language Processing 10

Why NL Understanding is hard?
• Natural language is extremely rich in form and structure,
and very ambiguous.
– How to represent meaning,
– Which structures map to which meaning structures.
• One input can mean many different things. Ambiguity can
be at different levels.
– Lexical (word level) ambiguity -- different meanings of words
– Syntactic ambiguity -- different ways to parse the sentence
– Interpreting partial information -- how to interpret pronouns
– Contextual information -- context of the sentence may affect
the meaning of that sentence.
• Many input can mean the same thing.
• Interaction among components of the input is not clear.

BİL711 Natural Language Processing 11

Knowledge of Language
• Phonology – concerns how words are related to the sounds
that realize them.
• Morphology – concerns how words are constructed from
more basic meaning units called morphemes. A
morpheme is the primitive unit of meaning in a language.
• Syntax – concerns how can be put together to form correct
sentences and determines what structural role each word
plays in the sentence and what phrases are subparts of
other phrases.
• Semantics – concerns what words mean and how these
meaning combine in sentences to form sentence meaning.
The study of context-independent meaning.

BİL711 Natural Language Processing 12

Knowledge of Language (cont.)
• Pragmatics – concerns how sentences are used in
different situations and how use affects the
interpretation of the sentence.

• Discourse – concerns how the immediately preceding

sentences affect the interpretation of the next
sentence. For example, interpreting pronouns and
interpreting the temporal aspects of the information.

• World Knowledge – includes general knowledge about

the world. What each language user must know about
the other’s beliefs and goals.

BİL711 Natural Language Processing 13

Language and Intelligence
Turing Test

Computer Human

Human Judge

• Human Judge asks tele-typed questions to Computer and

Human.
• Computer’s job is to act like a human.
• Human’s job is to convince Judge that he is not machine.
• Computer is judged “intelligent” if it can fool the judge
• Judgment of intelligence is linked to appropriate answers to
questions from the system.

BİL711 Natural Language Processing 14

NLP - an inter-disciplinary Field
• NLP borrows techniques and insights from several disciplines.
• Linguistics: How do words form phrases and sentences? What
constraints the possible meaning for a sentence?
• Computational Linguistics: How is the structure of sentences are
identified? How can knowledge and reasoning be modeled?
• Computer Science: Algorithms for automatons, parsers.
• Engineering: Stochastic techniques for ambiguity resolution.
• Psychology: What linguistic constructions are easy or difficult for
people to learn to use?
• Philosophy: What is the meaning, and how do words and sentences
acquire it?

BİL711 Natural Language Processing 15

Some NLP Applications
• Machine Translation – Translation between two natural
languages.
– See the Babel Fish translations system on Alta Vista.
• Information Retrieval – Web search (uni-lingual or
multi-lingual).
• Query Answering/Dialogue – Natural language
interface with a database system, or a dialogue system.
• Report Generation – Generation of reports such as
weather reports.
• Some Small Applications –
– Grammar Checking, Spell Checking, Spell Corrector

BİL711 Natural Language Processing 16

Brief History of NLP
• 1940s –1950s: Foundations
– Development of formal language theory (Chomsky, Backus, Naur,
Kleene)
– Probabilities and information theory (Shannon)
• 1957 – 1970s:
– Use of formal grammars as basis for natural language processing
(Chomsky, Kaplan)
– Use of logic and logic based programming (Minsky, Winograd,
Colmerauer, Kay)
• 1970s – 1983:
– Probabilistic methods for early speech recognition (Jelinek, Mercer)
– Discourse modeling (Grosz, Sidner, Hobbs)
• 1983 – 1993:
– Finite state models (morphology) (Kaplan, Kay)
• 1993 – present:
– Strong integration of different techniques, different areas.

BİL711 Natural Language Processing 17

Natural Language Understanding

Words

Morphological Analysis
Morphologically analyzed words (another step: POS tagging)

Syntactic Analysis
Syntactic Structure

Semantic Analysis
Context-independent meaning representation

Discourse Processing
Final meaning representation

BİL711 Natural Language Processing 18

Natural Language Generation
Meaning representation

Utterance Planning
Meaning representations for sentences

Sentence Planning and Lexical Choice

Syntactic structures of sentences with lexical choices

Sentence Generation
Morphologically analyzed words

Morphological Generation
Words

BİL711 Natural Language Processing 19

Morphological Analysis
• Analyzing words into their linguistic components (morphemes).
• Morphemes are the smallest meaningful units of language.
cars car+PLU
giving give+PROG
geliyordum gel+PROG+PAST+1SG - I was coming
• Ambiguity: More than one alternatives
flies flyVERB+PROG
flyNOUN+PLU

adamı adam+ACC - the man

(accusative)
adam+P1SG - my man
ada+P1SG+ACC - my island
(accusative)

BİL711 Natural Language Processing 20

Morphological Analysis (cont.)
• Relatively simple for English. But for some languages
such as Turkish, it is more difficult.
uygarlaştıramadıklarımızdanmışsınızcasına
uygar-laş-tır-ama-dık-lar-ımız-dan-mış-sınız-casına
uygar +BEC +CAUS +NEGABLE +PPART +PL +P1PL +ABL +PAST +2PL +AsIf
“(behaving) as if you are among those whom we could not civilize/cause to become civilized”
+BEC is “become” in English
+CAUS is the causative voice marker on a verb
+PPART marks a past participle form
+P1PL is 1st person plural possessive marker
+2PL is 2nd person plural
+ABL is the ablative (from/among) case marker
+AsIf is a derivational marker that forms an adverb from a finite verb form
+NEGABLE is “not able” in English

• Inflectional and Derivational Morphology.

• Common tools: Finite-state transducers
BİL711 Natural Language Processing 21
Part-of-Speech (POS) Tagging
• Each word has a part-of-speech tag to describe its category.
• Part-of-speech tag of a word is one of major word groups
(or its subgroups).
– open classes -- noun, verb, adjective, adverb
– closed classes -- prepositions, determiners, conjuctions,
pronouns, particples
• POS Taggers try to find POS tags for the words.
• duck is a verb or noun? (morphological analyzer cannot
make decision).
• A POS tagger may make that decision by looking the
surrounding words.
– Duck! (verb)
– Duck is delicious for dinner. (noun)

BİL711 Natural Language Processing 22

Lexical Processing
• The purpose of lexical processing is to determine meanings of
individual words.
• Basic methods is to lookup in a database of meanings -- lexicon
• We should also identify non-words such as punctuation marks.
• Word-level ambiguity -- words may have several meanings, and the
correct one cannot be chosen based solely on the word itself.
– bank in English
– yüz in Turkish
• Solution -- resolve the ambiguity on the spot by POS tagging
(if possible) or pass-on the ambiguity to the other levels.

BİL711 Natural Language Processing 23

Syntactic Processing
• Parsing -- converting a flat input sentence into a hierarchical
structure that corresponds to the units of meaning in the sentence.
• There are different parsing formalisms and algorithms.
• Most formalisms have two main components:
– grammar -- a declarative representation describing the syntactic
structure of sentences in the language.
– parser -- an algorithm that analyzes the input and outputs its
structural representation (its parse) consistent with the grammar
specification.
• CFGs are in the center of many of the parsing mechanisms. But they
are complemented by some additional features that make the
formalism more suitable to handle natural languages.

BİL711 Natural Language Processing 24

Semantic Analysis
• Assigning meanings to the structures created by
syntactic analysis.
• Mapping words and structures to particular domain
objects in way consistent with our knowledge of the
world.
• Semantic can play an import role in selecting among
competing syntactic analyses and discarding illogical
analyses.
– I robbed the bank -- bank is a river bank or a
financial institution
• We have to decide the formalisms which will be used in
the meaning representation.

BİL711 Natural Language Processing 25

Knowledge Representation for NLP
• Which knowledge representation will be used depends
on the application -- Machine Translation, Database
Query System.
• Requires the choice of representational framework, as
well as the specific meaning vocabulary (what are
concepts and relationship between these concepts --
ontology)
• Must be computationally effective.
• Common representational formalisms:
– first order predicate logic
– conceptual dependency graphs
– semantic networks
– Frame-based representations

BİL711 Natural Language Processing 26

How to get there
NLP applications are all similar in that they
require some level of understanding.

Understand the query, understand the

document, understand the data being
communicated…
Understanding Sentences: Overview
Parsing and Grammar
How is a sentence composed?

Lexicons
How is a word composed?

Ambiguity
Parsing Requirements
Requires a defined Grammar
Requires a big dictionary (10K words)
Requires that sentences follow the grammar
defined
Requires ability to deal with words not in
dictionary
Parsing (from Section 22.4)
Goal:
Understand a single sentence by syntax analysis
Methods
– Bottom-up
– Top-down
More efficient (and complicated) algorithm
given in 23.2
A Parsing Example
S  NP VP
NP  Article N | Proper
Rules: VP  Verb NP
N  home | boy | store
Proper  Betty | John
Verb  go|give|see
Article  the | an | a

The Sentence: The boy went home.

n-grams

• Limit hi to n-1 preceding words

Most used cases
n

– Uni-gram: P ( s )   P( wi )
i 1
n
– Bi-gram: P( s)   P( wi | wi 1 )
i 1
n
– Tri-gram: P( s)   P( wi | wi 2 wi 1 )
i 1
A simple example
(corpus = 10 000 words, 10 000 bi-grams)
wi P(wi) wi-1 wi-1wi P(wi|wi-1)
I (10) 10/10 000 # (1000) (# I) (8) 8/1000
= 0.001 = 0.008
that (10) (that I) (2) 0.2
talk (8) 0.0008 I (10) (I talk) (2) 0.2
we (10) (we talk) (1) 0.1
…
talks (8) 0.0008 he (5) (he talks) (2) 0.4
she (5) (she talks) (2) 0.4
…
she (5) 0.0005 says (4) (she says) (2) 0.5
laughs (2) (she laughs) (1) 0.5
listens (2) (she listens) (2) 1.0
Uni-gram: P(I, talk) = P(I) * P(talk) = 0.001*0.0008
P(I, talks) = P(I) * P(talks) = 0.001*0.0008
Bi-gram: P(I, talk) = P(I | #) * P(talk | I) = 0.008*0.2
P(I, talks) = P(I | #) * P(talks | I) = 0.008*0
Smoothing

• Goal: assign a low probability to words or

n-grams not observed in the training corpus

P
MLE

smoothed

word
Smoothing methods
n-gram: 
• Change the freq. of occurrences
– Laplace smoothing (add-one):
|  | 1
Padd _ one ( | C ) 
 (|  i | 1)
 i V
– Good-Turing
nr 1
change the freq. r to r*  (r  1)
nr
nr = no. of n-grams of freq. r
Smoothing (cont’d)

• Combine a model with a lower-order model

– Backoff (Katz)

 PGT (wi | wi 1 ) if | wi 1wi | 0

PKatz ( wi | wi 1 )  
 (wi 1 ) PKatz ( wi ) otherwise
– Interpolation (Jelinek-Mercer)

PJM ( wi | wi 1 )  wi1 PML ( wi | wi 1 )  (1  wi1 ) PJM ( wi )

Information Retrieval
Now the main focus of Natural Language
Processing

There are four types:

1. Query answering
2. Text categorization
3. Text summary
4. Data extraction
Information Retrieval: The task
Choose from some set of documents ones that
are related to my query

Ex. Internet search

Information Retrieval
Methods
Boolean: “(Natural AND Language) OR
(Computational AND Linguistics)”
• too confusing for most users

Vector: Assign different weights to each term in

query. Rank documents by distance from
query and report ones that are close.
Information Retrieval
Mostly implemented using simple statistical
models on the words only
More advanced NLP techniques have not
yielded significantly better results

Information in a text is mostly in its words

41
Text Categorization
Once upon a time… this was done by humans
Computers are much better at it (and more consistent)
Best success for NLP so far (90+ % accuracy)
Much faster and more consistent than humans.
Automated systems now perform most of the work.
NLP works better for TC than IR because categories are
fixed.
Text Summarization
Main task: understand main meaning and
describe in a shorter way
Common Systems: Microsoft
How:
– Sentence/paragraph extraction (find the most
important sentences/paragraphs and string them
together for a summary)
– Statistical methods are more common
The PageRank Algo
• PageRank3 was one of the two original ideas
that set Google’s search apart from other Web
search engines when it was introduced in
1997.
• “The other innovation was the use of anchor
text—the underlined text in a hyperlink—to
index a page, even though the anchor text was
ona different page than the one being
indexed.)

44
45
• The PageRank algorithm is designed to weight links
from high-quality sites more heavily. What is a high-
quality site? One that is linked to by other high-quality
sites. The deﬁnition is recursive, but we will see that
the recursion bottoms out properly.
• The PageRank for a page p is deﬁned as:

• where PR(p) is the PageRank of page p, N is the total

number of pages in the corpus, ini are the pages that
link in to p, and C(ini) is the count of the total number
of out-links on page ini.
• The constant d is a damping factor. It can be
understood through the random surfer model: imagine
a Web surfer who starts at some random page and
begins exploring.
46
The HITS Algo: Question Answering
System

• The Hyperlink-Induced Topic Search algorithm,

also known as “Hubs and Authorities” or HITS,
is another inﬂuential link-analysis algorithm.
• Both PageRank and HITS played important
roles in developing our understanding of Web
information retrieval.

47
HITS differs from PageRank in several ways:
• First, it is a query-dependent measure: it rates
pages with respect to a query. That means that it
must be computed anew for each query—a
computational burden that most search engines
have elected not to take on.
• Given a query, HITS ﬁrst ﬁnds a set of pages that
are relevant to the query. It does that by
intersecting hit lists of query words, and then
adding pages in the link neighborhood of these
pages—pages that link to or are linked from one
of the pages in the original relevant set.

48
Question Answering
• Information retrieval is the task of ﬁnding documents
that are relevant to a query, where the query may be a
question, or just a topic area or concept.
• Question answering is a somewhat different task, in
which the query really is a question, and the answer is
not a ranked list of documents but rather a short
response—a sentence, or even just a phrase.
• There have been question-answering NLP (natural
language processing) systems since the 1960s, but only
since 2001 have such systems used Web information
retrieval to radically increase their breadth of coverage.

49
Information Extraction
• In formation extraction is the process of acquiring
knowledge by skimming a text and looking for
occurrences of a particular class of object and for
relationships among objects.
• A typical task is to extract instances of addresses
from Web pages, with database ﬁelds for street,
city, state, and zip code; or instances of storms
from weather reports, with ﬁelds for
temperature, wind speed, and precipitation.

50
• 1. Tokenization
• 2. Complex-word handling
• 3. Basic-group handling
• 4. Complex-phrase handling
• 5. Structure merging

Hourglass Workout Program by Luisagiuliet 2
76% (21)
Hourglass Workout Program by Luisagiuliet 2
51 pages
12 Week Program: Summer Body Starts Now
87% (46)
12 Week Program: Summer Body Starts Now
70 pages
Read People Like A Book by Patrick King-Edited
58% (81)
Read People Like A Book by Patrick King-Edited
12 pages
Livingood, Blake - Livingood Daily Your 21-Day Guide To Experience Real Health
77% (13)
Livingood, Blake - Livingood Daily Your 21-Day Guide To Experience Real Health
260 pages
Cheat Code To The Universe
94% (79)
Cheat Code To The Universe
34 pages
Facial Gains Guide (001 081)
91% (45)
Facial Gains Guide (001 081)
81 pages
Curse of Strahd
95% (467)
Curse of Strahd
258 pages
The Psychiatric Interview - Daniel Carlat
91% (34)
The Psychiatric Interview - Daniel Carlat
473 pages
The Borax Conspiracy
91% (57)
The Borax Conspiracy
14 pages
The Secret Language of Attraction
86% (107)
The Secret Language of Attraction
278 pages
How To Develop and Write A Grant Proposal
83% (542)
How To Develop and Write A Grant Proposal
17 pages
Penis Enlargement Secret
60% (124)
Penis Enlargement Secret
12 pages
Workbook For The Body Keeps The Score
89% (53)
Workbook For The Body Keeps The Score
111 pages
Donald Trump & Jeffrey Epstein Rape Lawsuit and Affidavits
83% (1016)
Donald Trump & Jeffrey Epstein Rape Lawsuit and Affidavits
13 pages
KamaSutra Positions
78% (69)
KamaSutra Positions
55 pages
7 Hermetic Principles
93% (30)
7 Hermetic Principles
3 pages
27 Feedback Mechanisms Pogil Key
77% (13)
27 Feedback Mechanisms Pogil Key
6 pages
Frank Hammond - List of Demons
92% (92)
Frank Hammond - List of Demons
3 pages
Phone Codes
79% (28)
Phone Codes
5 pages
36 Questions That Lead To Love
91% (35)
36 Questions That Lead To Love
3 pages
How 2 Setup Trust
97% (307)
How 2 Setup Trust
3 pages
The 36 Questions That Lead To Love - The New York Times
94% (34)
The 36 Questions That Lead To Love - The New York Times
3 pages
100 Questions To Ask Your Partner
78% (36)
100 Questions To Ask Your Partner
2 pages
Satanic Calendar
25% (56)
Satanic Calendar
4 pages
The 36 Questions That Lead To Love - The New York Times
95% (21)
The 36 Questions That Lead To Love - The New York Times
3 pages
14 Easiest & Hardest Muscles To Build (Ranked With Solutions)
100% (8)
14 Easiest & Hardest Muscles To Build (Ranked With Solutions)
27 pages
Jeffrey Epstein39s Little Black Book Unredacted PDF
75% (12)
Jeffrey Epstein39s Little Black Book Unredacted PDF
95 pages
Basic 9 English 2ND Term E-Notes
No ratings yet
Basic 9 English 2ND Term E-Notes
58 pages
1001 Songs
69% (72)
1001 Songs
1,798 pages
The 4 Hour Workweek, Expanded and Updated by Timothy Ferriss - Excerpt
23% (954)
The 4 Hour Workweek, Expanded and Updated by Timothy Ferriss - Excerpt
38 pages
Zodiac Sign & Their Most Common Addictions
63% (30)
Zodiac Sign & Their Most Common Addictions
9 pages
Stative Verbs
No ratings yet
Stative Verbs
10 pages
NLP Question Paper Solution
No ratings yet
NLP Question Paper Solution
27 pages
NLP QB
100% (2)
NLP QB
14 pages
Pocket Book
No ratings yet
Pocket Book
41 pages
NLP Notes (Ch1-5) PDF
100% (1)
NLP Notes (Ch1-5) PDF
41 pages
Introduction To NLP
No ratings yet
Introduction To NLP
30 pages
Natural Language Processing
100% (2)
Natural Language Processing
48 pages
NLP UNIT 2 (Ques Ans Bank)
No ratings yet
NLP UNIT 2 (Ques Ans Bank)
26 pages
NLP Notes For Students
No ratings yet
NLP Notes For Students
18 pages
Natural Language Processing Module 1 Notes PDF
100% (2)
Natural Language Processing Module 1 Notes PDF
15 pages
Unit - 5 Natural Language Processing
No ratings yet
Unit - 5 Natural Language Processing
66 pages
NLP Notes
No ratings yet
NLP Notes
18 pages
SEM-2-NLP Questions
No ratings yet
SEM-2-NLP Questions
3 pages
CSE4022 Natural-Language-Processing ETH 1 AC41
No ratings yet
CSE4022 Natural-Language-Processing ETH 1 AC41
6 pages
Natural Language Processing
100% (1)
Natural Language Processing
21 pages
Unit 5 - Notes
No ratings yet
Unit 5 - Notes
11 pages
NLP UNIT-II
No ratings yet
NLP UNIT-II
71 pages
NLP Unit 1 Notes
100% (1)
NLP Unit 1 Notes
19 pages
NLP ORAL - Sample Question Bank: Modul e No. Sr. No - Description
No ratings yet
NLP ORAL - Sample Question Bank: Modul e No. Sr. No - Description
9 pages
NLP Lab Manual Updated
No ratings yet
NLP Lab Manual Updated
34 pages
5.2 Natural Language Processing
No ratings yet
5.2 Natural Language Processing
43 pages
NLP Unit-3-Semantics-And-Pragmatics
No ratings yet
NLP Unit-3-Semantics-And-Pragmatics
20 pages
NLP Unit 1
100% (1)
NLP Unit 1
34 pages
Solution To NLP Viva Questions
No ratings yet
Solution To NLP Viva Questions
21 pages
Unit 1 2 3 4 5 NLP Notes Merged
100% (1)
Unit 1 2 3 4 5 NLP Notes Merged
105 pages
Unit 4 NLP Notes
No ratings yet
Unit 4 NLP Notes
35 pages
NLP Presentation
No ratings yet
NLP Presentation
19 pages
Ccs369-Unit 3
No ratings yet
Ccs369-Unit 3
28 pages
Unit I
No ratings yet
Unit I
30 pages
Natural Language Processing
No ratings yet
Natural Language Processing
37 pages
Natural Language Processing Parsing Techniques:: Unit IV
100% (1)
Natural Language Processing Parsing Techniques:: Unit IV
24 pages
Irt 2 Marks With Answer
No ratings yet
Irt 2 Marks With Answer
15 pages
Text and Speech Analysis Notes CCS369-UNIT 1
No ratings yet
Text and Speech Analysis Notes CCS369-UNIT 1
27 pages
Natural Language Processing Notes
No ratings yet
Natural Language Processing Notes
26 pages
Natural Language Processing: Dr. Tulasi Prasad Sariki SCOPE, VIT Chennai
No ratings yet
Natural Language Processing: Dr. Tulasi Prasad Sariki SCOPE, VIT Chennai
29 pages
NLP Notes Unit-3.Doc
No ratings yet
NLP Notes Unit-3.Doc
19 pages
CCS369 - TSS-Unit 3
No ratings yet
CCS369 - TSS-Unit 3
55 pages
Unit-8: Natural Language: Processing
No ratings yet
Unit-8: Natural Language: Processing
16 pages
Shivangi Tyagi (NLP Assignments)
No ratings yet
Shivangi Tyagi (NLP Assignments)
60 pages
NLP Unit1
No ratings yet
NLP Unit1
51 pages
Unit 5
No ratings yet
Unit 5
20 pages
NLP UNIT 1 (Ques Ans Bank)
No ratings yet
NLP UNIT 1 (Ques Ans Bank)
20 pages
Natural Language Processing
No ratings yet
Natural Language Processing
24 pages
INT344
50% (2)
INT344
2 pages
IR UNIT I - Notes
No ratings yet
IR UNIT I - Notes
23 pages
NLP Important and Super Important Questions-18CS743
No ratings yet
NLP Important and Super Important Questions-18CS743
2 pages
Ccs369-Unit 4
No ratings yet
Ccs369-Unit 4
13 pages
NLP Unit II Notes
71% (7)
NLP Unit II Notes
18 pages
Unit V Intelligence and Applications: Morphological Analysis/Lexical Analysis
No ratings yet
Unit V Intelligence and Applications: Morphological Analysis/Lexical Analysis
30 pages
NLP Unit I Notes-1
75% (4)
NLP Unit I Notes-1
22 pages
FEATURES AND AUGMENTED GRAMMARS Overall
No ratings yet
FEATURES AND AUGMENTED GRAMMARS Overall
3 pages
NLP Iat QB
No ratings yet
NLP Iat QB
10 pages
Natural Language Processing
No ratings yet
Natural Language Processing
47 pages
NLP Unit 5
No ratings yet
NLP Unit 5
10 pages
Solutions To NLP I Mid Set A
100% (1)
Solutions To NLP I Mid Set A
8 pages
PAT Trees and PAT Arrays
No ratings yet
PAT Trees and PAT Arrays
12 pages
NLP Final
No ratings yet
NLP Final
26 pages
Information Retrieval - Question Bank
No ratings yet
Information Retrieval - Question Bank
3 pages
NATURAL LANGUAGE PROCESSING (18CS2T50) - Mid Term Exam - 2021-2022
No ratings yet
NATURAL LANGUAGE PROCESSING (18CS2T50) - Mid Term Exam - 2021-2022
2 pages
CS8080 Information Retrieval Techniques Reg 2017 Question Bank
No ratings yet
CS8080 Information Retrieval Techniques Reg 2017 Question Bank
6 pages
10 Natural Language Processing
No ratings yet
10 Natural Language Processing
27 pages
Machine Learning with Python: Design and Develop Machine Learning and Deep Learning Technique using real world code examples
From Everand
Machine Learning with Python: Design and Develop Machine Learning and Deep Learning Technique using real world code examples
Abhishek Vijayvargia
No ratings yet
Chapter - 16 Making Comparisons
No ratings yet
Chapter - 16 Making Comparisons
60 pages
Present and Past Simple Passive - 'Be' + Past Participle - Page 2 of 3 - Test-English
No ratings yet
Present and Past Simple Passive - 'Be' + Past Participle - Page 2 of 3 - Test-English
4 pages
Bahasa Inggris Written Expression Eror Sentence
No ratings yet
Bahasa Inggris Written Expression Eror Sentence
27 pages
Academic Style Â Exercise With Tips Kopyası
No ratings yet
Academic Style Â Exercise With Tips Kopyası
4 pages
Conditional Sentences Grade 11 Exercise (May 2020) Mark-Up
No ratings yet
Conditional Sentences Grade 11 Exercise (May 2020) Mark-Up
2 pages
Personal Pronauns: Sentence Structures
No ratings yet
Personal Pronauns: Sentence Structures
3 pages
Post Test Grammar 3
No ratings yet
Post Test Grammar 3
22 pages
Vocabulary and Picture Prompts For Language Teaching Book 2 PDF
100% (2)
Vocabulary and Picture Prompts For Language Teaching Book 2 PDF
72 pages
LP Format 2014 - Revised
No ratings yet
LP Format 2014 - Revised
31 pages
Test of English
100% (1)
Test of English
2 pages
Concrete and Abstract: Super Teacher Worksheets
No ratings yet
Concrete and Abstract: Super Teacher Worksheets
5 pages
Mechanics Word Factory
No ratings yet
Mechanics Word Factory
2 pages
Prepositions of Location and Their Usage in English Language
No ratings yet
Prepositions of Location and Their Usage in English Language
7 pages
EXERCICES 4EME 2
No ratings yet
EXERCICES 4EME 2
2 pages
Idiomatic Expression
100% (1)
Idiomatic Expression
9 pages
Grammar Marwa 22
No ratings yet
Grammar Marwa 22
5 pages
Past simple tense
No ratings yet
Past simple tense
2 pages
SEE 5, Structure of English
No ratings yet
SEE 5, Structure of English
32 pages
F
No ratings yet
F
14 pages
Adjectives Notes
No ratings yet
Adjectives Notes
2 pages
Grammar For The Secondary 1 5 Addtional Exercises
No ratings yet
Grammar For The Secondary 1 5 Addtional Exercises
44 pages
So, What Are Modal Particles, Anyway?: Language. in German, These Could Be Words That Soften The Harshness of A
No ratings yet
So, What Are Modal Particles, Anyway?: Language. in German, These Could Be Words That Soften The Harshness of A
7 pages
Um zu and damit
No ratings yet
Um zu and damit
1 page
Degrees of Comparison
No ratings yet
Degrees of Comparison
37 pages
G8 Au-T2-E-2187-Year-56-Grammar-Revision-Guide-And-Quiz-Main-And-Subordinate-Clauses-Powerpoint - Ver - 4
No ratings yet
G8 Au-T2-E-2187-Year-56-Grammar-Revision-Guide-And-Quiz-Main-And-Subordinate-Clauses-Powerpoint - Ver - 4
14 pages
Class Vii Pa-2 QS Bank English
No ratings yet
Class Vii Pa-2 QS Bank English
9 pages
Subj Verb Agreement (g5)
No ratings yet
Subj Verb Agreement (g5)
23 pages

Unit 4 NLP

Uploaded by

Unit 4 NLP

Uploaded by

AI: NLP

• Discourse – concerns how the immediately preceding

• World Knowledge – includes general knowledge about

BİL711 Natural Language Processing 9

BİL711 Natural Language Processing 10

BİL711 Natural Language Processing 11

BİL711 Natural Language Processing 12

• Discourse – concerns how the immediately preceding

• World Knowledge – includes general knowledge about

BİL711 Natural Language Processing 13

• Human Judge asks tele-typed questions to Computer and

BİL711 Natural Language Processing 14

BİL711 Natural Language Processing 15

BİL711 Natural Language Processing 16

BİL711 Natural Language Processing 17

BİL711 Natural Language Processing 18

Sentence Planning and Lexical Choice

BİL711 Natural Language Processing 19

adamı adam+ACC - the man

BİL711 Natural Language Processing 20

• Inflectional and Derivational Morphology.

BİL711 Natural Language Processing 22

BİL711 Natural Language Processing 23

BİL711 Natural Language Processing 24

BİL711 Natural Language Processing 25

BİL711 Natural Language Processing 26

Understand the query, understand the

The Sentence: The boy went home.

• Limit hi to n-1 preceding words

• Goal: assign a low probability to words or

• Combine a model with a lower-order model

 PGT (wi | wi 1 ) if | wi 1wi | 0

PJM ( wi | wi 1 )  wi1 PML ( wi | wi 1 )  (1  wi1 ) PJM ( wi )

There are four types:

Ex. Internet search

Vector: Assign different weights to each term in

Information in a text is mostly in its words

• where PR(p) is the PageRank of page p, N is the total

• The Hyperlink-Induced Topic Search algorithm,

You might also like