Designing Multimodal Datasets for NLP Challenges

Pustejovsky, James; Holderness, Eben; Tu, Jingxuan; Glenn, Parker; Rim, Kyeongmin; Lynch, Kelley; Brutti, Richard

arXiv:2105.05999 (cs)

[Submitted on 12 May 2021]

Abstract: In this paper, we argue that the design and development of multimodal datasets for natural language processing (NLP) challenges should be enhanced in two significant respects: to more broadly represent commonsense semantic inferences; and to better reflect the dynamics of actions and events, through a substantive alignment of textual and visual information. We identify challenges and tasks that are reflective of linguistic and cognitive competencies that humans have when speaking and reasoning, rather than merely the performance of systems on isolated tasks. We introduce the distinction between challenge-based tasks and competence-based performance, and describe a diagnostic dataset, Recipe-to-Video Questions (R2VQ), designed for testing competence-based comprehension over a multimodal recipe collection (this http URL). The corpus contains detailed annotation supporting such inferencing tasks and facilitating a rich set of question families that we use to evaluate NLP systems.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2105.05999 [cs.CL]
	(or arXiv:2105.05999v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2105.05999

Submission history

From: Jingxuan Tu
[v1] Wed, 12 May 2021 23:02:46 UTC (15,360 KB)
