
16/01/2025, 16:05 OneNote

Syntactic Analysis
Wednesday, 15 January 2025 11:52 AM

UNIT III: Syntactic Analysis: Context-Free Grammars, Grammar rules for English, Treebanks, Normal Forms for grammar – Dependency Grammar – Syntactic Parsing, Ambiguity, Dynamic Programming parsing – Shallow parsing – Probabilistic CFG, Probabilistic CYK, Probabilistic Lexicalized CFGs – Feature structures, Unification of feature structures.
Syntactic Analysis

Syntactic analysis determines the grammatical structure of a sentence, which is a prerequisite for proper understanding of sentence meaning.

1. Context-Free Grammars (CFGs)


A Context-Free Grammar (CFG) is a formal system used to describe the syntax of languages, including
programming and natural languages. A CFG consists of:
• Non-terminal symbols: Represent abstract concepts or categories (e.g., Sentence (S), Noun Phrase
(NP), Verb Phrase (VP)).
• Terminal symbols: Represent actual words or symbols in the language (e.g., "dog," "runs").
• Production rules: Define how non-terminals can be replaced with terminals and/or other non-terminals (e.g., S → NP VP).
• Start symbol: The top-level symbol from which sentences in the language are derived (usually S).
CFGs are powerful for modeling the hierarchical structure of languages, enabling parsing and
understanding of sentences.

Formally, a CFG is a 4-tuple G = (V, T, P, S):

V - the collection of variables or non-terminal symbols.
T - the set of terminal symbols.
P - the set of production rules, whose right-hand sides may consist of both terminals and non-terminals.
S - the starting symbol.

2. Grammar Rules for English


English grammar can be captured using CFGs by defining production rules such as:

• S → NP VP (a sentence consists of a noun phrase and a verb phrase).
• NP → Det N (a noun phrase can include a determiner and a noun).
• VP → V NP (a verb phrase may contain a verb followed by a noun phrase).
For example:
• Sentence: "The cat eats fish."
○ S → NP VP
○ NP → Det N
○ VP → V NP
These rules allow for the structural analysis of English sentences.
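Because this toy grammar has no recursion, a short sketch can enumerate every sentence it derives. The dictionary encoding below is illustrative; note that the lexicon here forces a determiner before each noun, so objects surface as "the fish" rather than bare "fish".

```python
# A toy encoding of the CFG rules above: non-terminals map to their
# productions; anything not in the dictionary is a terminal word.
grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["cat"], ["fish"]],
    "V":   [["eats"]],
}

def expand(symbols):
    """Return every terminal string derivable from a sequence of symbols."""
    if not symbols:
        return [[]]
    head, rest = symbols[0], symbols[1:]
    if head not in grammar:                     # terminal word
        return [[head] + tail for tail in expand(rest)]
    results = []
    for production in grammar[head]:            # try each rule for the head
        results += expand(production + rest)
    return results

sentences = [" ".join(words) for words in expand(["S"])]
print(sentences)   # 4 sentences, e.g. 'the cat eats the fish'
```

Each choice point (here, only which noun to use) multiplies the number of derivable sentences, which is why ambiguity grows quickly in realistic grammars.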

3. Treebanks
A Treebank is a database of sentences annotated with syntactic or semantic structures, often in the
form of parse trees. Treebanks are valuable for:
• Training and evaluating parsing algorithms.
• Capturing linguistic phenomena in a language.
Example (bracketed parse tree for "The cat eats fish"):
(S (NP (Det The) (N cat)) (VP (V eats) (NP (N fish))))
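Treebank entries are commonly stored as bracketed strings such as (S (NP ...) (VP ...)). A minimal sketch of reading one into nested Python lists follows; it assumes well-formed Penn-Treebank-style bracketing and is not a robust reader.

```python
# Read one bracketed tree string into nested lists: [label, child, child, ...]
def read_tree(s):
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()

    def parse(i):
        assert tokens[i] == "("
        label = tokens[i + 1]
        children, i = [], i + 2
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = parse(i)            # nested constituent
            else:
                child, i = tokens[i], i + 1    # leaf word
            children.append(child)
        return [label] + children, i + 1       # skip the closing ")"

    tree, _ = parse(0)
    return tree

tree = read_tree("(S (NP (Det The) (N cat)) (VP (V eats) (NP (N fish))))")
print(tree[0])   # S
```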

4. Normal Forms for Grammar


Normal forms are standardized representations of grammar rules to simplify parsing algorithms. Key
normal forms include:
• Chomsky Normal Form (CNF): Each production rule is of the form A → BC or A → a, where A, B, C are non-terminals and a is a terminal.
• Greibach Normal Form (GNF): Rules are of the form A → aα, where a is a terminal and α is a sequence of non-terminals.
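One step of converting a grammar to CNF is binarizing rules whose right-hand side has more than two symbols. A minimal sketch follows, assuming rules are (lhs, rhs-list) pairs; handling of unit and terminal productions is omitted, and the fresh non-terminal names are arbitrary.

```python
# Binarize long rules: VP -> V NP PP becomes VP_0 -> V NP and VP -> VP_0 PP.
def binarize(rules):
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new = f"{lhs}_{len(out)}"      # invent a fresh non-terminal
            out.append((new, rhs[:2]))     # peel off the first two symbols
            rhs = [new] + rhs[2:]
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ["V", "NP", "PP"])]))
```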

5. Dependency Grammar
A Dependency Grammar focuses on the relationships between words in a sentence, emphasizing how
words depend on each other. It is represented as a dependency tree, where:
• Nodes represent words.
• Edges represent dependency relations (e.g., subject, object).
Example:
• Sentence: "The cat eats fish."
○ "eats" (root)
○ "cat" (subject, depends on "eats")
○ "fish" (object, depends on "eats")
Dependency grammars are commonly used in syntactic parsing tasks.
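A dependency tree is often stored as a set of labeled arcs from head to dependent. A minimal sketch for this sentence follows; the relation labels (nsubj, obj, det) follow common convention but are illustrative here, as is the extra determiner arc.

```python
# The dependency tree as head-dependent arcs; index 0 is a conventional
# artificial ROOT token.
sentence = ["ROOT", "The", "cat", "eats", "fish"]
arcs = [              # (head index, dependent index, relation)
    (0, 3, "root"),   # ROOT -> eats
    (3, 2, "nsubj"),  # eats -> cat  (subject)
    (3, 4, "obj"),    # eats -> fish (object)
    (2, 1, "det"),    # cat  -> The  (determiner)
]
for head, dep, rel in arcs:
    print(f"{sentence[head]} -{rel}-> {sentence[dep]}")
```

In a well-formed dependency tree every word except ROOT has exactly one head, which the arc list above satisfies.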


6. Syntactic Parsing
Syntactic parsing involves analyzing a sentence to produce its syntactic structure, typically as a tree:
• Constituency Parsing: Focuses on breaking sentences into sub-phrases (constituents) using CFGs.
• Dependency Parsing: Focuses on finding the dependencies between words.

7. Ambiguity
Ambiguity arises when a sentence can have multiple interpretations:
• Lexical Ambiguity: A word has multiple meanings (e.g., "bank" as a financial institution or
riverbank).
• Structural Ambiguity: A sentence has multiple valid parse trees (e.g., "I saw the man with a
telescope").
Resolving ambiguity is critical for accurate syntactic parsing.

8. Dynamic Programming Parsing


Dynamic programming is used in parsing algorithms to avoid redundant computations:
• CYK Algorithm: A bottom-up, dynamic-programming parser for CFGs in Chomsky Normal Form.
• Earley Parser: Handles all CFGs, combining top-down prediction with bottom-up completion.
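The table-filling idea can be made concrete with a minimal CYK recognizer on a toy CNF grammar. The grammar below is illustrative; the rule NP → fish stands in for a unit production (NP → N) that CNF conversion would have removed.

```python
# CYK recognizer: table[i][j] holds the non-terminals deriving words[i..j].
unary  = {("Det", "the"), ("N", "cat"), ("N", "fish"),
          ("NP", "fish"), ("V", "eats")}
binary = {("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")}

def cyk(words):
    n = len(words)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):                  # fill lexical cells
        table[i][i] = {a for a, word in unary if word == w}
    for span in range(2, n + 1):                   # ever larger spans
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                  # every split point
                for a, b, c in binary:
                    if b in table[i][k] and c in table[k + 1][j]:
                        table[i][j].add(a)
    return "S" in table[0][n - 1]

print(cyk("the cat eats fish".split()))   # True
```

Each cell is computed once from smaller cells, which is exactly the redundancy-avoiding dynamic programming the section describes.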

9. Shallow Parsing
Shallow Parsing (or chunking) identifies phrases in a sentence without generating a full parse tree:
• Goal: Extract noun phrases, verb phrases, etc.
• Example: For "The cat sleeps," identify:
○ NP: "The cat"
○ VP: "sleeps"
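A minimal sketch of one way to chunk noun phrases from POS-tagged input with a greedy left-to-right scan; the tag set (DET, NOUN, VERB) and the NP pattern (optional determiner plus one or more nouns) are assumptions, not a standard.

```python
# Greedy NP chunker over (word, tag) pairs: an NP chunk is an optional
# determiner followed by one or more nouns.
def chunk_np(tagged):
    chunks, i = [], 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "DET":
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "NOUN":
            k += 1
        if k > j:                      # found at least one noun: emit chunk
            chunks.append(" ".join(w for w, _ in tagged[i:k]))
            i = k
        else:
            i += 1
    return chunks

print(chunk_np([("The", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]))
```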


10. Probabilistic CFGs (PCFGs)


PCFGs extend CFGs by associating probabilities with production rules. They help disambiguate multiple
parses by choosing the most probable one:
• Example: P(S → NP VP) = 0.9
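Under a PCFG, the probability of a derivation is the product of the probabilities of the rules it uses, which is what lets the parser rank competing parses. A minimal sketch with illustrative numbers:

```python
# Rule probabilities (illustrative); a derivation's probability is the
# product of the probabilities of every rule application in it.
probs = {
    ("S",  ("NP", "VP")): 0.9,
    ("NP", ("Det", "N")): 0.6,
    ("VP", ("V", "NP")):  0.5,
}
derivation = [
    ("S",  ("NP", "VP")),
    ("NP", ("Det", "N")),   # subject NP
    ("VP", ("V", "NP")),
    ("NP", ("Det", "N")),   # object NP
]
p = 1.0
for rule in derivation:
    p *= probs[rule]
print(round(p, 4))   # 0.9 * 0.6 * 0.5 * 0.6 = 0.162
```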

11. Probabilistic CYK Parsing


This is an extension of the CYK (Cocke-Younger-Kasami) algorithm for PCFGs:
• Lexical cells are filled first, e.g. "The" → {Det}, "boy" → {Noun}, "eats" → {Verb}.
• Uses probabilities to find the most likely parse for a sentence.
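A minimal sketch of probabilistic CYK: each chart cell keeps, for every non-terminal, the probability of its best subtree over that span. The toy PCFG is illustrative, with NP → fish standing in for a unit production removed during CNF conversion.

```python
# Probabilistic CYK: best[i][j][A] = probability of the best A-subtree
# spanning words[i..j].
unary  = {("Det", "the"): 1.0, ("N", "cat"): 0.5, ("N", "fish"): 0.5,
          ("NP", "fish"): 0.2, ("V", "eats"): 1.0}
binary = {("S", "NP", "VP"): 0.9, ("NP", "Det", "N"): 0.8,
          ("VP", "V", "NP"): 0.7}

def pcyk(words):
    n = len(words)
    best = [[{} for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):                        # lexical cells
        for (a, word), p in unary.items():
            if word == w:
                best[i][i][a] = max(best[i][i].get(a, 0.0), p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                        # split point
                for (a, b, c), p in binary.items():
                    if b in best[i][k] and c in best[k + 1][j]:
                        cand = p * best[i][k][b] * best[k + 1][j][c]
                        if cand > best[i][j].get(a, 0.0):
                            best[i][j][a] = cand         # keep the max
    return best[0][n - 1].get("S", 0.0)

print(round(pcyk("the cat eats fish".split()), 4))   # 0.0504
```

Keeping only the maximum per non-terminal (rather than a set, as in plain CYK) is what turns recognition into finding the most likely parse.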

12. Probabilistic Lexicalized CFGs


Lexicalized CFGs associate specific words with grammar rules:
• Adds context by including head words (e.g., associating "eats" with the verb phrase).
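Lexicalization can be sketched by percolating a head word up from a designated head child at each node. The head-child table below is an assumption chosen for these particular tree shapes, not a general head-finding rule set.

```python
# Which child carries the head, per constituent label (0-based child index):
# S takes its head from the VP, NP from the N, VP from the V.
head_child = {"S": 1, "NP": 1, "VP": 0}

def head_word(tree):
    """tree = [label, child, ...] with [pos, word] preterminal leaves."""
    if isinstance(tree[1], str):                   # preterminal like ["V", "eats"]
        return tree[1]
    return head_word(tree[1 + head_child[tree[0]]])

t = ["S", ["NP", ["Det", "The"], ["N", "cat"]],
          ["VP", ["V", "eats"], ["NP", ["N", "fish"]]]]
print(head_word(t))   # eats
```

Annotating each node with its head word (e.g. VP(eats)) is what gives lexicalized rules their extra context.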

13. Feature Structures


Feature structures represent syntactic, semantic, or morphological properties of linguistic units as
attribute-value pairs:
• Example: For "cats":
○ Number: plural
○ Part-of-Speech: noun

14. Unification of Feature Structures


Unification is the process of merging feature structures to check compatibility:
• Combines two sets of attributes if they are consistent.
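A minimal sketch of unification for flat feature structures represented as dicts; real feature structures can be nested and can share values, which this sketch does not handle.

```python
# Unify two flat feature structures: succeed with the merged attributes,
# or fail (return None) if any shared attribute has conflicting values.
def unify(a, b):
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None                   # incompatible values: failure
        result[key] = value
    return result

print(unify({"number": "plural", "pos": "noun"}, {"pos": "noun"}))
print(unify({"number": "plural"}, {"number": "singular"}))   # None
```

The first call succeeds because the structures agree on every shared attribute; the second fails on the number clash, which is how unification enforces agreement.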

