U4 NLP Notes
-Predicate-argument structure describes who did what to whom in a sentence. The task of identifying it
automatically, by labeling the role each part of the sentence plays, is known as semantic role labeling (SRL).
-The "predicate" is usually a verb (but can also be a noun, adjective, or preposition), and the
"arguments" are the entities that participate in the action or state described by the predicate.
Resources:
-These resources help computers understand the meaning of sentences by identifying the action and
who is involved.
-This is important for things like translating languages, answering questions, and even helping virtual
assistants understand commands better.
(i) FrameNet
(ii) PropBank
FRAMENET:
FrameNet looks at how words are used in different situations (frames) and identifies the roles that
other words play in these situations.
It is based on the theory of frame semantics, which suggests that the meaning of a word can be
understood in terms of the typical situations it describes.
Key Elements:
Frames: A frame is a type of situation or scenario. Each frame involves certain participants, which are
called frame elements.
Frame Elements: These are the roles played by the different participants in a frame.
Lexical Units (LUs): A lexical unit is a pairing of a word with one of its meanings. Each lexical unit
belongs to the frame that this meaning evokes, so a word with several meanings has several lexical units.
Working:
Example:
Frame: COMMERCE_BUY
Sentence: "John bought a car from Mary for $20,000."
Frame Elements:
Buyer: John
Goods: a car
Seller: Mary
Money: $20,000
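The example above can be written down as a small data structure. This is only a sketch of how such an annotation might be stored: the class name FrameAnnotation, its fields, and the "target" field (the word that evokes the frame) are assumptions made for illustration, not part of the FrameNet data format.

```python
from dataclasses import dataclass, field


@dataclass
class FrameAnnotation:
    """One sentence annotated against a single frame (illustrative structure only)."""
    frame: str
    sentence: str
    target: str  # the word in the sentence that evokes the frame
    elements: dict = field(default_factory=dict)  # frame element name -> text span

    def spans_in_sentence(self):
        """Return True if every frame-element span actually occurs in the sentence."""
        return all(span in self.sentence for span in self.elements.values())


annotation = FrameAnnotation(
    frame="COMMERCE_BUY",
    sentence="John bought a car from Mary for $20,000.",
    target="bought",
    elements={
        "Buyer": "John",
        "Goods": "a car",
        "Seller": "Mary",
        "Money": "$20,000",
    },
)

for role, span in annotation.elements.items():
    print(f"{role}: {span}")
```

Keeping the frame elements in a dictionary keyed by role name makes it easy to ask questions such as "who is the Seller?" without looking at word order.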
PROPBANK:
-PropBank is a corpus of texts where each verb is annotated with its arguments, giving us a clear idea
of who is doing what to whom in a sentence.
-This helps in understanding the roles of different entities in relation to the verb.
Key Elements:
Predicate: Usually a verb, it represents an action or state.
Arguments: The participants involved in the action or state described by the predicate. Arguments
are categorized as core (essential to the meaning of the predicate) or adjunctive (providing additional
information).
Working:
1. Annotations: PropBank annotates verbs in the Wall Street Journal section of the Penn Treebank.
Each verb is tagged with its core arguments (like the subject, object) and adjunctive arguments (like
time, location).
2. Framesets: Each verb has a frameset that lists possible argument structures (roles) it can take,
along with descriptions of these roles.
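Core Arguments:
These are numbered ARG0 to ARG5. ARG0 is typically the agent (the doer) and ARG1 the patient or theme
(the thing affected). The meaning of the higher-numbered arguments is specific to each verb and is
defined in its frameset.
Adjunctive Arguments:
These provide additional information about the action and are labeled as ARGM-XYZ, where XYZ
indicates the type of information, for example:
ARGM-TMP: time
ARGM-LOC: location
ARGM-MNR: manner
ARGM-CAU: cause
ARGM-NEG: negation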
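The core/adjunctive distinction can be illustrated with a short sketch. It assumes the usual PropBank convention that core arguments carry numbered labels (ARG0, ARG1, ...) and adjunctive arguments carry ARGM- labels. The annotation below is hand-written for illustration and is not read from the real corpus, and the helper split_arguments is a hypothetical name.

```python
# Hand-written annotation in PropBank style (an assumption for illustration,
# not taken from the actual corpus). "REL" marks the predicate itself.
labeled_spans = [
    ("ARG0", "John"),
    ("REL", "bought"),
    ("ARG1", "a car"),
    ("ARG2", "from Mary"),
    ("ARG3", "for $20,000"),
    ("ARGM-TMP", "yesterday"),
]


def split_arguments(spans):
    """Separate core arguments (ARG0, ARG1, ...) from adjunctive ones (ARGM-...)."""
    core, adjuncts = [], []
    for label, text in spans:
        if label.startswith("ARGM-"):
            adjuncts.append((label, text))
        elif label.startswith("ARG"):
            core.append((label, text))
        # anything else (such as the predicate marker "REL") is not an argument
    return core, adjuncts


core, adjuncts = split_arguments(labeled_spans)
print("core:", core)
print("adjunctive:", adjuncts)
```

The ARGM- check has to come first, because an ARGM- label also starts with "ARG".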
SOFTWARE:
1. ASSERT
Automatic Statistical SEmantic Role Tagger, a semantic role labeler trained on PropBank data.
2. C-ASSERT
An extension of ASSERT for the Chinese language.
3. SwiRL
Another semantic role labeler trained on PropBank data.
----------------------------------------------------------------------------------------------------
MEANING REPRESENTATION:
RESOURCES:
1. ATIS:
The ATIS project was one of the first major efforts to develop systems that convert natural language into
a form usable by applications for decision-making. Specifically, it focused on transforming user queries
about flight information into SQL queries to extract answers from a flight database.
Corpus:
The ATIS training corpus included over 7,300 spoken utterances from 137 subjects, with 2,900 of them
categorized and annotated, and around 600 treebanked for detailed syntactic analysis. This resource
helped promote experimentation in transforming natural language into machine-readable formats.
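The idea of turning a flight question into an SQL query can be sketched as follows. This is a toy illustration only: the single regular-expression rule, the function name utterance_to_sql, the table layout, and the flight rows are all invented for this example and do not reflect the actual ATIS database or the methods used by ATIS systems.

```python
import re
import sqlite3


def utterance_to_sql(utterance):
    """Map one narrow question pattern to a parameterized SQL query (toy rule)."""
    match = re.search(r"flights from (\w+) to (\w+)", utterance.lower())
    if match is None:
        return None  # the toy rule does not cover this utterance
    origin, destination = match.groups()
    sql = (
        "SELECT flight_no FROM flights "
        "WHERE origin = ? AND destination = ? ORDER BY flight_no"
    )
    return sql, (origin, destination)


# Invented in-memory table standing in for a flight database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (flight_no TEXT, origin TEXT, destination TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        ("XX100", "boston", "denver"),
        ("XX200", "boston", "atlanta"),
        ("XX300", "denver", "boston"),
    ],
)

sql, params = utterance_to_sql("Show me flights from Boston to Denver")
rows = conn.execute(sql, params).fetchall()
print(rows)
```

A hand-written rule like this covers only one phrasing, which is why annotated corpora such as ATIS are useful for building systems that handle many phrasings.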
2. COMMUNICATOR
The Communicator program was the next step after the ATIS project. While ATIS focused on
user-initiated dialogs where users asked questions and machines provided answers, Communicator
introduced a mixed-initiative dialog system. This means both the user and the machine could actively
participate in the conversation.
3. GeoQuery
GeoQuery is a natural language interface (NLI) designed to interact with a geographic database called
Geobase. Geobase contains about 800 Prolog facts, which store geographic information such as
populations, neighboring states, major rivers, and major cities in a relational database.
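A question such as "Which states border Texas?" can be answered by matching a pattern against stored facts. The sketch below imitates this in Python rather than Prolog. The predicate names ("capital", "borders"), the tuple layout, and the query helper are assumptions for illustration and are not Geobase's actual schema.

```python
# Facts written as (predicate, arg1, arg2) tuples, loosely mimicking Prolog facts.
FACTS = {
    ("capital", "texas", "austin"),
    ("borders", "texas", "oklahoma"),
    ("borders", "texas", "new mexico"),
    ("borders", "texas", "louisiana"),
    ("borders", "texas", "arkansas"),
}


def query(predicate, first=None, second=None):
    """Return matching facts, sorted; None plays the role of an unbound variable."""
    return sorted(
        fact
        for fact in FACTS
        if fact[0] == predicate
        and (first is None or fact[1] == first)
        and (second is None or fact[2] == second)
    )


# "Which states border Texas?" becomes the pattern borders(texas, X).
neighbors = [fact[2] for fact in query("borders", first="texas")]
print(neighbors)

# "What is the capital of Texas?" becomes the pattern capital(texas, X).
capital = query("capital", first="texas")[0][2]
print(capital)
```

The natural language interface's job is the step not shown here: turning the English question into the formal pattern that is then matched against the facts.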
4. RoboCup: CLang
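RoboCup is an international competition, organized by the artificial intelligence community, in which
teams of robots play soccer. The goal is to advance AI and robotics research through this challenging
and fun domain. CLang is the formal language used in the RoboCup simulation league to express a
coach's advice to the players. Natural language advice paired with its CLang expression serves as a
meaning representation resource.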
SOFTWARE:
• WASP
• KRISPER
• CHILL