
Understanding Natural Language Processing

Natural Language Processing (NLP) is the study of how computers can be used to understand and generate human language. NLP involves developing systems that can analyze, understand, and generate text or speech to communicate with humans naturally in everyday language. Some key NLP tasks include part-of-speech tagging, syntactic parsing, word sense disambiguation, and semantic role labeling to help computers interpret the meaning of text. NLP research also examines the different levels of human language from morphology to pragmatics to build systems that can comprehend language in context.

Uploaded by

sujeet.jha.311

Natural Language Processing

Aman Shakya
What is NLP?
• NLP is the branch of computer science focused
on developing systems that allow computers
to communicate with people using everyday
language
• NLP is concerned with the design and
implementation of effective natural language
input and output components for
computational systems (Dale et al., 2000)
What is NLP?
• Basic problems:
– Analysis: conversion of NL input (text or speech) to
internal meaning representation
– Generation: conversion of internal meaning
representation to NL output (text or speech)
• Auxiliary problems:
– Learning: automatic construction of NLP systems from
language data
– Evaluation: assessment of NLP systems, e.g., in relation
to language data
Syntax, Semantics, Pragmatics
• Syntax concerns the proper ordering of words and its effect on
meaning.
– The dog bit the boy.
– The boy bit the dog.
– Bit boy dog the the.
– Colorless green ideas sleep furiously.
• Semantics concerns the (literal) meaning of words, phrases, and
sentences.
– “plant” as a photosynthetic organism
– “plant” as a manufacturing facility
– “plant” as the act of sowing
• Pragmatics concerns the overall communicative and social
context and its effect on interpretation.
– The ham sandwich wants another beer. (co-reference, anaphora)
– John thinks vanilla. (ellipsis)
Levels of Language
• Morphology: the structure of words
• Syntax: the structure of phrases and
sentences
• Semantics: the meaning of words and
sentences
• Pragmatics: the use of language in context
• Discourse: larger linguistic units such as texts
or dialogues
Ambiguity
I saw the man on the hill with a telescope.

Natural language is highly ambiguous and must be disambiguated.
Natural Languages vs. Computer Languages

• Ambiguity is the primary difference between natural and
computer languages.
• Having a unique linguistic expression for every possible
conceptualization that could be conveyed would make language
overly complex and linguistic expressions unnecessarily long.
• Allowing resolvable ambiguity permits shorter linguistic
expressions.
• Natural language relies on people’s ability to use their knowledge
and inference abilities to properly resolve ambiguities.
• Formal programming languages are designed to be unambiguous,
i.e. they can be defined by a grammar that produces a unique
parse for each sentence in the language.
Thought and Understanding of Language

• The Turing Test (Alan Turing, 1950)
– An empirical test of thinking machines
– Human-like use of language == intelligence?
• Simple programs without understanding, such as
ELIZA (Weizenbaum, 1966), seem to pass this test
– [Link]
• The Loebner Prize competition
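To make "simple programs without understanding" concrete, here is a minimal ELIZA-style sketch. The rules and reply templates are hypothetical illustrations, not Weizenbaum's original script. The program only matches surface patterns and echoes fragments of the input back.

```python
import re

# Hypothetical pattern/response rules (not the original ELIZA script).
RULES = [
    (re.compile(r"\bi am (.+)", re.I), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.I), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.I), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance):
    """Return a canned reply built from the first matching pattern."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


print(respond("I am tired."))        # Why do you say you are tired?
print(respond("My mother hates me")) # Tell me more about your mother.
print(respond("Hello"))              # Please go on.
```

Nothing here represents meaning: the replies can look conversational only because the input text is reused.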


NLP Tasks
• Syntactic tasks
– Word segmentation, morphological analysis, POS
tagging, phrase chunking, syntactic parsing
• Semantic tasks
– Word sense disambiguation, semantic role
labeling, semantic parsing, etc.
• Pragmatics/discourse tasks
– Anaphora/co-reference resolution, ellipsis
resolution
Word Segmentation
• Breaking a string of characters (graphemes) into a
sequence of words.
• In some written languages (e.g. Chinese) words are
not separated by spaces.
• Even in English, characters other than white-space
can be used to separate words [e.g. , ; . - : ( ) ]
• Examples from English URLs:
– [Link] → jump the shark .com
– [Link]/pluckerswingbar
→ myspace .com pluckers wing bar
→ myspace .com plucker swing bar
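The URL examples can be reproduced with a small dictionary-based segmenter. This is a minimal sketch, assuming a toy vocabulary (`VOCAB` is hypothetical); it enumerates every way to split a string into known words, which shows why "pluckerswingbar" is ambiguous.

```python
def segmentations(text, vocabulary):
    """Return every way to split `text` into words from `vocabulary`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in vocabulary:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], vocabulary):
                results.append([prefix] + rest)
    return results


# Hypothetical toy vocabulary, chosen to cover the slide's examples.
VOCAB = {"pluckers", "plucker", "wing", "swing", "bar", "jump", "the", "shark"}

for words in segmentations("pluckerswingbar", VOCAB):
    print(" ".join(words))
# plucker swing bar
# pluckers wing bar
```

Choosing between the two outputs needs extra information, such as word frequencies or context, which this sketch does not model.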
Morphological Analysis
• Morphology is the field of linguistics that studies the internal
structure of words.
• A morpheme is the smallest linguistic unit that has semantic
meaning
– e.g. “carry”, “pre”, “ed”, “ly”, “s”
• Morphological analysis is the task of segmenting a word into its
morphemes:
– carried → carry + ed (past tense)
– independently → in + (depend + ent) + ly
– Googlers → (Google + er) + s (plural)
– unlockable → un + (lock + able) ?
→ (un + lock) + able ?
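The segmentations above can be approximated by affix stripping. The sketch below is illustrative only: the affix lists, the stem lexicon, and the two spelling rules are assumptions made for these examples. It returns flat morpheme sequences, so it does not decide between the two bracketings of "unlockable".

```python
# Hypothetical affix lists and stem lexicon for the slide's examples.
PREFIXES = ["un", "in", "re"]
SUFFIXES = ["able", "ent", "ly", "ed", "er", "s"]
STEMS = {"lock", "depend", "carry", "google", "consider"}


def restore_stem(candidate):
    """Undo two simple spelling changes and keep forms found in STEMS."""
    options = [candidate]
    if candidate.endswith("i"):
        options.append(candidate[:-1] + "y")  # carri -> carry
    options.append(candidate + "e")           # googl -> google
    return [form for form in options if form in STEMS]


def strip_suffixes(word):
    """Yield (stem, suffixes) pairs, with suffixes in left-to-right order."""
    for stem in restore_stem(word):
        yield stem, []
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            for stem, inner in strip_suffixes(word[:-len(suffix)]):
                yield stem, inner + [suffix]


def analyze(word):
    """Return all flat analyses: optional prefix + stem + suffixes."""
    word = word.lower()
    analyses = []
    for prefix in [""] + PREFIXES:
        if word.startswith(prefix):
            for stem, suffixes in strip_suffixes(word[len(prefix):]):
                analyses.append(([prefix] if prefix else []) + [stem] + suffixes)
    return analyses


for w in ["carried", "independently", "Googlers", "unlockable"]:
    print(w, "->", [" + ".join(a) for a in analyze(w)])
```

Practical systems usually encode such rules as finite-state transducers rather than hand-written string operations.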
Morphemes
• Smallest meaning-bearing units constituting a word
• Example: reconsideration = re (prefix) + consider (stem) + ation (suffix)
• Stems: tree, go, fat
• Affixes:
– Prefixes: post- (postpone)
– Suffixes: -ed (tossed)
Tools Available for Morphological Processing

• AT&T FSM Library and Lextools
– [Link]
• OpenFST (Google and NYU)
– [Link]
• Carmel Toolkit
– [Link]
• FSA Toolkit
– [Link]
[Link]
Part Of Speech (POS) Tagging
• Annotate each word in a sentence with a part-
of-speech.
– I ate the spaghetti with meatballs.
  I/Pro ate/V the/Det spaghetti/N with/Prep meatballs/N
– John saw the saw and decided to take it to the table.
  John/PN saw/V the/Det saw/N and/Con decided/V to/Part take/V it/Pro to/Prep the/Det table/N
• Useful for subsequent syntactic parsing and
word sense disambiguation.
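The two tagged sentences can be reproduced with a lexicon lookup plus two context rules. This is a minimal sketch, not a real tagger: the lexicon and both rules are hypothetical and written only to cover these examples. It shows that ambiguous words such as "saw" and "to" need context to tag.

```python
# Hypothetical lexicon: each word lists its possible tags, default first.
LEXICON = {
    "john": ["PN"], "i": ["Pro"], "it": ["Pro"],
    "ate": ["V"], "saw": ["V", "N"], "decided": ["V"], "take": ["V"],
    "the": ["Det"], "and": ["Con"],
    "to": ["Prep", "Part"], "with": ["Prep"],
    "spaghetti": ["N"], "meatballs": ["N"], "table": ["N"],
}


def tag(sentence):
    """Tag each word, using the neighbours to resolve ambiguous entries."""
    words = sentence.rstrip(".").split()
    tags = []
    for i, word in enumerate(words):
        options = LEXICON.get(word.lower(), ["N"])  # unknown words: guess noun
        choice = options[0]
        previous = tags[-1] if tags else None
        following = LEXICON.get(words[i + 1].lower(), []) if i + 1 < len(words) else []
        if previous == "Det" and "N" in options:
            choice = "N"      # rule 1: after a determiner, prefer a noun
        elif "Part" in options and following[:1] == ["V"]:
            choice = "Part"   # rule 2: "to" before a verb is a particle
        tags.append(choice)
    return list(zip(words, tags))


print(tag("John saw the saw and decided to take it to the table."))
```

Statistical taggers learn such preferences from annotated text instead of relying on hand-written rules.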
Syntactic Parsing
• Produce the correct syntactic parse tree for a
sentence.
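Finding the correct tree is hard because a sentence can have several grammatical trees. As a rough illustration, the sketch below counts the parse trees of the earlier telescope sentence with a CKY-style chart. The grammar and lexicon are assumptions, deliberately tiny and written for this one sentence.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # prepositional phrase attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # prepositional phrase attaches to a noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "man": ["N"], "hill": ["N"], "telescope": ["N"],
    "on": ["P"], "with": ["P"],
}


def count_parses(sentence, root="S"):
    """Count the distinct parse trees of `sentence` under the toy grammar."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[(i, j)][symbol] = number of trees deriving words[i:j] from symbol
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, []):
            chart[(i, i + 1)][symbol] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]


print(count_parses("I saw the man on the hill with a telescope."))  # 5
```

The five trees correspond to the different attachment sites of "on the hill" and "with a telescope". A parser has to pick one of them, which is the disambiguation problem in syntactic form.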
Word Sense Disambiguation (WSD)
• Words in natural language usually have a fair
number of different possible meanings.
– Ellen has a strong interest in computational
linguistics.
– Ellen pays a large amount of interest on her credit
card.
• For many tasks (question answering,
translation), the proper sense of each
ambiguous word in a sentence must be
determined.
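One classic way to choose a sense is to compare the sentence with a dictionary gloss for each sense and pick the gloss that shares the most words, a simplified form of the Lesk algorithm. In this sketch the sense labels, glosses, and stopword list are hand-written assumptions, not taken from a real dictionary.

```python
# Hypothetical stopword list and glosses, hand-written for illustration.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "her", "has", "is", "or"}

SENSES = {
    "interest": {
        "curiosity": "a feeling of wanting to learn more about a subject "
                     "or field of study such as linguistics",
        "finance": "money paid regularly as a charge on a loan or credit card debt",
    }
}


def content_words(text):
    """Lower-case, strip basic punctuation, and drop stopwords."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    return {token for token in tokens if token not in STOPWORDS}


def disambiguate(word, sentence):
    """Pick the sense whose gloss overlaps most with the sentence context."""
    context = content_words(sentence) - {word}
    scores = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    return max(scores, key=scores.get)


print(disambiguate("interest", "Ellen has a strong interest in computational linguistics."))
# curiosity
print(disambiguate("interest", "Ellen pays a large amount of interest on her credit card."))
# finance
```

The overlap counts here are tiny (one and two shared words), so the result depends heavily on how the glosses are worded; that sensitivity is a known weakness of gloss-overlap methods.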
Semantic Role Labeling (SRL)
• For each clause, determine the semantic role
played by each noun phrase that is an
argument to the verb.
– Roles: agent, patient, source, destination, instrument
– [John: agent] drove [Mary: patient] from [Austin: source] to
[Dallas: destination] in [his Toyota Prius: instrument].
– [The hammer: instrument] broke [the window: patient].
• Also referred to as “case role analysis,”
“thematic analysis,” and “shallow semantic
parsing”