Semantic Parsing Notes

The document outlines key paradigms in semantic parsing, including system architectures (knowledge-based, unsupervised, supervised, and semi-supervised), scope (domain-dependent and independent), and coverage (shallow and deep). It discusses word sense ambiguities and disambiguation methods, along with resources such as corpora and lexical databases. Additionally, it details algorithms used in rule-based, supervised, unsupervised, and semi-supervised systems for semantic parsing.

Unit 3: Semantic Parsing - Detailed Notes (From System Paradigms Onward)

System Paradigms in Semantic Parsing

Semantic parsing systems can be grouped under three key paradigms:

1. System Architectures

- Knowledge-Based Systems:

- Rely on human-crafted rules.

- Good for domains like medicine or law.

- Example: Rule-based hospital chatbot.

- Unsupervised Systems:

- No labeled data required.

- Use clustering or patterns.

- Example: Clustering occurrences of the word "java" by their contexts (coffee, island, programming language); see the sketch after this list.

- Supervised Systems:

- Trained using labeled datasets.

- Use ML models like SVM or MaxEnt.

- Example: QA systems trained on SQuAD.

- Semi-Supervised Systems:

- Combine small labeled datasets with large unlabeled sets.

- Example: The Yarowsky algorithm, which bootstraps from a small set of seed examples.
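
As referenced in the unsupervised example above, here is a minimal sketch of inducing senses of "java" by clustering its contexts. The toy corpus, TF-IDF features, and cluster count are illustrative assumptions, not a prescribed method:

```python
# Toy sketch: cluster contexts of the ambiguous word "java" into induced senses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

contexts = [
    "I drank a hot cup of java this morning",     # coffee
    "strong java with milk and sugar",            # coffee
    "java is an island in indonesia",             # island
    "the island of java has active volcanoes",    # island
    "compile the java source file with javac",    # programming
    "java supports object oriented programming",  # programming
]

X = TfidfVectorizer(stop_words="english").fit_transform(contexts)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels)  # contexts sharing a label fall into the same induced "sense"
```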


2. Scope

- Domain-Dependent:

- Specific to a field.

- Example: Airline booking assistant.

- Domain-Independent:

- Works across domains.

- Example: Alexa, Google Assistant.

3. Coverage

- Shallow Coverage:

- Produces intermediate outputs (e.g., POS tags).

- Example: POS tagging of "Book a flight".

- Deep Coverage:

- Produces logical representations.

- Example: Mapping "Who is the president of India?" to a logical form such as λx. president_of(x, India) (notation illustrative).

Word Sense

A word sense is one of the meanings a word can take; the same word often carries different senses in different contexts.

- Types of Word Sense Ambiguities:

- Homonymy: Same spelling but unrelated meanings (e.g., "bat" - the animal vs. the sports implement).

- Polysemy: Related meanings of the same word (e.g., "bank" as a financial institution vs. the building that houses it).

- Categorial Ambiguity: The same word form can take multiple parts of speech (e.g., "book" as noun or verb).

- Word Sense Disambiguation (WSD):

- Process of determining the right meaning of a word.

- Methods:

- Rule-based (e.g., Lesk Algorithm).

- Supervised (ML-based classifiers).

- Unsupervised (clustering, Information Content (IC), Conceptual Density).

- Semi-Supervised (Yarowsky Algorithm).

Resources

- Corpora: Structured collections of text used for training (singular: corpus).

- Dictionaries and thesauri: Longman Dictionary of Contemporary English (LDOCE), Roget's Thesaurus.

- WordNet: Lexical database with synonym sets and glosses.

Rule-Based Systems

- Lesk Algorithm:

- Scores each candidate sense by counting word overlaps between its dictionary definition (gloss) and the surrounding context.

- Example: Resolving "bank" using context words like "cash" or "river".
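
A minimal sketch of the simplified Lesk idea, assuming NLTK's WordNet as the dictionary (NLTK also ships a ready-made `nltk.wsd.lesk`):

```python
from nltk.corpus import wordnet as wn

def simplified_lesk(word, context_words):
    """Pick the WordNet sense whose gloss (plus examples) overlaps most with the context."""
    context = {w.lower() for w in context_words}
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(word):
        signature = set(sense.definition().lower().split())
        for example in sense.examples():
            signature |= set(example.lower().split())
        overlap = len(signature & context)  # count shared words
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "I deposited cash at the bank".split()))
```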

- Roget's Thesaurus Algorithm:

- Assigns senses by matching context words to thesaurus categories, scoring senses with word-category probabilities.

- SSI (Structural Semantic Interconnections):

- Graph-based representation of senses.

- Uses WordNet to construct semantic graphs and iteratively disambiguate.
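
A toy flavor of the graph idea (a loose approximation for intuition, not the actual SSI algorithm): score each candidate sense of a target word by how directly it connects, via WordNet relations, to senses of the context words:

```python
from nltk.corpus import wordnet as wn

def neighbours(synset):
    # Immediate WordNet neighbours used as graph edges here (a simplification;
    # real SSI uses a much richer set of relations and an iterative procedure).
    return set(synset.hypernyms()) | set(synset.hyponyms()) | set(synset.similar_tos())

def ssi_like(target, context_words):
    best_sense, best_score = None, -1
    for cand in wn.synsets(target):
        cand_neigh = neighbours(cand)
        score = sum(
            1
            for w in context_words
            for s in wn.synsets(w)
            if s in cand_neigh or cand in neighbours(s)
        )
        if score > best_score:
            best_sense, best_score = cand, score
    return best_sense
```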

Supervised Systems

- Use annotated data to train classifiers.

- Popular classifiers: Support Vector Machines (SVM), Maximum Entropy (MaxEnt); see the sketch after the feature list.

- Features:

- Lexical context

- POS tags

- Bag of Words

- Collocations

- Syntactic structure

- Topic and voice

- Subject/Object presence

- Prepositional phrase adjuncts
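
As referenced above, a minimal sketch of a supervised WSD classifier using bag-of-words features and a linear SVM via scikit-learn; the tiny sense-labeled set for "bank" is a made-up illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical sense-labeled contexts for the target word "bank".
train_texts = [
    "deposit cash at the bank branch",
    "the bank approved my loan application",
    "fishing from the bank of the river",
    "the muddy river bank after the flood",
]
train_labels = ["finance", "finance", "river", "river"]

model = make_pipeline(CountVectorizer(), LinearSVC())  # bag-of-words -> SVM
model.fit(train_texts, train_labels)
print(model.predict(["she opened an account at the bank"]))
```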

Unsupervised Systems

- No labeled data.

- Techniques:

- Clustering senses

- Semantic similarity

- Information Content (IC)

- Conceptual Density using WordNet hierarchy
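
A brief sketch of WordNet-based similarity in NLTK, including an Information Content (IC) measure estimated from the Brown corpus. Conceptual Density itself has no ready-made NLTK call; Resnik similarity is shown here as a related IC-based measure:

```python
from nltk.corpus import wordnet as wn, wordnet_ic

dog = wn.synset("dog.n.01")
cat = wn.synset("cat.n.01")

# Path similarity: based on the shortest path between senses in the hierarchy.
print(dog.path_similarity(cat))

# Resnik similarity: Information Content of the lowest common subsumer,
# with IC values estimated from the Brown corpus.
brown_ic = wordnet_ic.ic("ic-brown.dat")
print(dog.res_similarity(cat, brown_ic))
```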

Semi-Supervised Systems

- Combine a small labeled dataset with a large unlabeled set, bootstrapping labels onto the unlabeled data.

- Yarowsky Algorithm:

- Key Principles:

- One Sense per Collocation

- One Sense per Discourse

- Bootstrapping

- Steps:

1. Initialize with seed examples

2. Extract features

3. Train classifier

4. Label unlabeled data, adding confident predictions to the training set

5. Repeat until the labels stabilize
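
A minimal sketch of this bootstrapping loop, assuming a Naive Bayes classifier and a fixed confidence threshold (both illustrative choices; Yarowsky's original formulation used decision lists):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def yarowsky_style_bootstrap(seed_texts, seed_labels, unlabeled, rounds=5, threshold=0.9):
    # seed_labels must contain at least two senses for the classifier to train.
    texts, labels = list(seed_texts), list(seed_labels)   # step 1: seed examples
    pool = list(unlabeled)
    clf, vec = None, None
    for _ in range(rounds):
        vec = CountVectorizer()                           # step 2: extract features
        X = vec.fit_transform(texts)
        clf = MultinomialNB().fit(X, labels)              # step 3: train classifier
        if not pool:
            break
        probs = clf.predict_proba(vec.transform(pool))    # step 4: label data
        remaining = []
        for text, p in zip(pool, probs):
            if p.max() >= threshold:                      # confident -> absorb into training set
                texts.append(text)
                labels.append(clf.classes_[p.argmax()])
            else:
                remaining.append(text)
        pool = remaining                                  # step 5: repeat
    return clf, vec
```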

Additional Concepts

- Synset: Set of synonyms from WordNet.

- Example: "happy" happy, glad, joyful.

- Stop Words: Common but semantically weak words (e.g., the, and, is).
