Style, Semantics, and Other Things
Krishnapriya Vishnubhotla (KP)
Intro: Style in NLP
● Uniqueness of writing style
● Due to:
○ Lexical choices (big words vs small words)
○ Sentence structure (short and simple vs. complex with clauses)
● Stylometry (toy feature sketch after this list):
○ Surface features (word lengths, sentence lengths)
○ Lexical features (LIWC, number of hapax legomena)
○ Syntactic features (function word frequencies, PoS tag frequencies, parse tree features, character trigrams)
● Authorship attribution, plagiarism detection, digital forensics
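As a toy illustration, a minimal extractor for a few of the surface/lexical features above might look like this (a sketch only; real stylometry pipelines use proper tokenizers, the LIWC lexicon, and syntactic parsers):

```python
from collections import Counter

def stylometric_features(text):
    # Toy extractor for a few surface/lexical stylometry features.
    words = text.split()
    counts = Counter(w.lower().strip(".,;!?") for w in words)
    n_sents = max(sum(text.count(p) for p in ".!?"), 1)
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "avg_sent_len": len(words) / n_sents,
        # Hapax legomena: words occurring exactly once.
        "hapax_ratio": sum(1 for c in counts.values() if c == 1) / len(counts),
        # Character trigram counts, a common authorship-attribution feature.
        "char_trigrams": Counter(text[i:i + 3] for i in range(len(text) - 2)),
    }
```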
Form and Meaning
● Text generation process:
○ A meaning, or content, plus
○ A form, or style
● Multiple surface realisations are possible for the same meaning
● Natural language corpora:
○ Standard English Wikipedia vs. Simple English Wikipedia
○ Literary translations
● Closely related to: paraphrases
Paraphrases
● Paraphrase identification, generation
● Datasets: Quora Question Pairs, Microsoft Research Paraphrase Corpus, ParaNMT
● Semantic Textual Similarity tasks
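For concreteness, the most trivial baseline for these tasks is lexical overlap, e.g. tf-idf cosine similarity (a sketch with a made-up sentence pair, not a competitive system):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up question pair in the spirit of Quora Question Pairs.
a = "How do I get started with NLP?"
b = "What is the best way to start learning NLP?"

vec = TfidfVectorizer().fit([a, b])
sim = cosine_similarity(vec.transform([a]), vec.transform([b]))[0, 0]
print(f"tf-idf cosine similarity: {sim:.2f}")
```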
NLP: Style Transfer
● Lots of work on style transfer in NLP
● “Style” → factor of variation
○ Sentiment
○ Attributes
○ Topics
● Usually guided by the dataset used.
● Problematic:
○ What should be preserved?
○ Adds to already problematic evaluation metrics
Complications
● There are no true synonyms, only “near-synonyms”
● Changing active to passive → change of focus
● Pragmatics: viewpoint, framing, denotation, connotation, implication.
● Can draw some fuzzy boundaries between clusters of near-synonyms at a word-level
○ What about for phrases/sentences/documents?
● Style, by the literary definition: what is “lost in translation”
Meaning Representations
● Formal representation of meaning/semantics
● Lots of CL research on logical forms, compositionality
● Two relatively recent projects I came across
○ Abstract Meaning Representation (AMR)
○ Minimal Recursion Semantics (MRS)
Abstract Meaning Representation
● Rooted, directed, (edge+leaf)-labelled graph
● Uses PropBank frames
● Example: “The dog is eating a bone.”
[Figure: AMR graph for this sentence; nodes are variables/concepts, edge labels are relations]
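In PENMAN notation (the standard textual serialisation of AMR graphs), this graph comes out roughly as:

```
(e / eat-01
   :ARG0 (d / dog)
   :ARG1 (b / bone))
```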
● “The dog ate the bone that he found.”
● Has ways to handle:
○ Coreference
○ Negation
○ Numbers/quantity
○ Names
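Coreference, for instance, falls out of variable re-use: the second example sentence above becomes, roughly:

```
(e / eat-01
   :ARG0 (d / dog)
   :ARG1 (b / bone
            :ARG1-of (f / find-01
                        :ARG0 d)))
```

The re-used variable d encodes that “he” and “the dog” are the same entity.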
Generalisation capabilities
- The man described the mission as a disaster.
- The man’s description of the mission: disaster.
- As the man described it, the mission was a disaster.
- The man described the mission as disastrous.
● All four sentences receive the same AMR (below).
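The shared AMR, roughly (this is essentially the canonical example from the AMR literature):

```
(d / describe-01
   :ARG0 (m / man)
   :ARG1 (m2 / mission)
   :ARG2 (d2 / disaster))
```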
● Abstracts away morphological and syntactic variations.
● But does not handle synonyms
○ “afraid” and “terrified” are treated as different concepts.
● Useful?
○ Not yet.
○ Purpose: dataset to help develop algorithms that can generate AMRs.
Minimal Recursion Semantics
● Another formalism, from the head-driven phrase structure grammar (HPSG) tradition
● More fine-grained
● Can distinguish tense and number.
● Practical utility:
○ Has a command-line parser you can use
○ Can generate simple paraphrases
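As a concrete (hedged) sketch: the pyDelphin package wraps the ACE command-line parser mentioned above. This assumes ACE is installed, and erg.dat is a placeholder path for a compiled ERG grammar image you must obtain separately:

```python
from delphin import ace
from delphin.codecs import simplemrs

# "erg.dat" is a placeholder for a compiled ERG grammar image.
with ace.ACEParser("erg.dat") as parser:
    response = parser.interact("The dog is eating a bone.")
    mrs = response.result(0).mrs()
    print(simplemrs.encode(mrs, indent=True))
```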
Practical Utility
● These parsers fail on many real-world sentences:
○ The LIT paper reports successful parses for only 19.7% of SNLI sentences
● Using AMR to detect paraphrases (sketch below):
○ ~85% accuracy on the Microsoft Research Paraphrase Corpus
● For now, a separate research problem, not an off-the-shelf tool.
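Still, as a loose illustration of the AMR-based paraphrase idea (not the exact method behind the ~85% figure): threshold Smatch F1 between two AMR graphs. This assumes the smatch PyPI package and its get_amr_match function; the AMRs and threshold are made up:

```python
import smatch

# Toy AMRs for a made-up sentence pair.
amr_a = "(e / eat-01 :ARG0 (d / dog) :ARG1 (b / bone))"
amr_b = "(v / devour-01 :ARG0 (g / dog) :ARG1 (n / bone))"

# Returns (matched triples, triples in test AMR, triples in gold AMR).
match, n_test, n_gold = smatch.get_amr_match(amr_a, amr_b)
p, r = match / n_test, match / n_gold
f1 = 2 * p * r / (p + r) if p + r else 0.0
print("paraphrase" if f1 > 0.8 else "not paraphrase", round(f1, 2))
```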
Back to Representation Learning
● Let us assume we have some proxy information for:
○ Form
○ Meaning
[Figure: a text t is encoded into a form vector and a meaning vector, which capture stylistic similarity and semantic similarity respectively]
Neural Models
● Modified autoencoders, trained on paraphrases:
○ Encode the input into two vectors
○ Use both vectors to reconstruct the input
○ Restrict what each vector captures using motivational/adversarial discriminators (sketch below)
[Figure: autoencoder with separate semantic and syntactic latent vectors z, each fed to a classifier]
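A minimal sketch of such a model, assuming fixed-size sentence encodings as inputs (all layer shapes and names here are invented; real systems use sequence encoders and decoders):

```python
import torch
import torch.nn as nn

class TwoVectorAE(nn.Module):
    """Toy disentangling autoencoder: one latent for meaning, one for form."""

    def __init__(self, d_in=300, d_sem=64, d_syn=64, n_styles=2):
        super().__init__()
        self.enc_sem = nn.Linear(d_in, d_sem)      # semantic ("meaning") vector
        self.enc_syn = nn.Linear(d_in, d_syn)      # syntactic ("form") vector
        self.dec = nn.Linear(d_sem + d_syn, d_in)  # reconstruct from both
        # Motivational head: style SHOULD be predictable from the form vector.
        self.motiv = nn.Linear(d_syn, n_styles)
        # Adversarial head: style should NOT be predictable from the meaning
        # vector; the encoder is trained to fool this classifier.
        self.adv = nn.Linear(d_sem, n_styles)

    def forward(self, x):
        z_sem, z_syn = self.enc_sem(x), self.enc_syn(x)
        recon = self.dec(torch.cat([z_sem, z_syn], dim=-1))
        return recon, self.motiv(z_syn), self.adv(z_sem)
```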
What kinds of supervision?
● Style class labels
● Paraphrases
● Heuristic info:
○ BoW for content
● Syntax: syntax tree features
○ Tree edit distance
Datasets
● Paraphrase datasets
● Parallel style transfer datasets:
○ Formality
○ Diachronic language change
● Data-to-text datasets
○ ~Synthetic
Synthetic Dataset: PersonageNLG
● Personality model might be questionable
● BUT gives us two neat dimensions of variation.
All the losses, later…
Evaluation:
● Style transfer (swap variables + generate)
● Retrieval
● Prediction (kNN)
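The kNN probe is cheap to run; a sketch with random placeholder arrays standing in for learned form vectors and style labels:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Placeholders: in practice these come from the trained encoder.
form_vecs = np.random.randn(200, 64)
style_labels = np.random.randint(0, 2, size=200)

knn = KNeighborsClassifier(n_neighbors=5, metric="cosine")
print("style accuracy:", cross_val_score(knn, form_vecs, style_labels, cv=5).mean())
```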
More supervision == better representations
● Kinda boring
● Just train a separate supervised model for each end-goal?
● Style transfer:
○ Generation problems
○ Evaluation problems
● Real-world text: not so cleanly separable.
:(
What would be interesting?
● Unsupervised disentanglement?
○ β-VAE in vision (loss sketch below)
○ At least for the synthetic dataset
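For reference, the β-VAE objective simply upweights the KL term of the standard VAE loss; a minimal PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # Reconstruction term.
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL(q(z|x) || N(0, I)); beta > 1 pressures the posterior toward the
    # factorised prior, which is what encourages disentanglement.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```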
● Evaluating the representations:
○ Probe for linguistic knowledge/features
○ Robust to “noise”? → domain adaptation/zero-shot prediction
● Using pre-trained models?
● (TBD) Should the latent spaces be entirely unrelated?
○ Where do style and semantics intersect?
○ What is a “latent space of sentences” anyway?