
Lec-1 Introduction

This document provides an overview of the Introduction to NLP (ELL 881) course. The course is an introduction to natural language processing and will cover classical NLP techniques, deep learning for NLP, and advanced NLP topics. It will include lectures, assignments, quizzes, and a group mini-project. The goal is for students to gain foundational knowledge of NLP and experience applying techniques through assignments and a project.

Uploaded by

Gia By

Introduction to NLP (ELL 881)

Special Topics in Computers 2


Image: https://www.blumeglobal.com/learning/natural-language-processing/
Neuro-linguistic programming

Non-Linear Programming

Natural Language Processing

NLP (Wiki)
1. Natural Language Processing
2. Natural-linear Programming
3. Neuro-linguistic Programming
4. Natural-language Programming
5. National Library of Poland
6. National Library of the Philippines
7. No light perception
8. National Labour Party
9. National Liberal Party
10. National Liberation Party
11. Natural Law Party
12. New Labour Party

https://en.wikipedia.org/wiki/NLP
• Course Instructor: Tanmoy Chakraborty (tanmoychak.com)
(NLP, Social Media, Graph Neural Networks)
[email protected]
• Guest Lecture: TBD
• Course page: https://sites.google.com/view/ell881-iitd/home
• Piazza: https://piazza.com/iitd.ac.in/spring2023/ell881
• TAs:
• Kshitij Alwadhi ([email protected])
• Gurusha Juneja ([email protected])
• Group Email: TBD
Useful resources/tools/libraries

• Natural Language Toolkit (NLTK)


• Stanford CoreNLP
• CMU ARK for Noisy Text
• Scikit-learn
• spaCy
• Stanza
• Shallow Parser - for Indian Languages
• Universal Parser - Multi-lingual
• HuggingFace
Reading and Reference materials
• Books
• Speech and Language Processing, Dan Jurafsky and James H. Martin
https://web.stanford.edu/~jurafsky/slp3/

• Foundations of Statistical Natural Language Processing, Chris Manning and Hinrich Schütze
• Natural Language Processing, Jacob Eisenstein
https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
• A Primer on Neural Network Models for Natural Language Processing, Yoav Goldberg
http://u.cs.biu.ac.il/~yogo/nnlp.pdf

• Journals
• Computational Linguistics, Natural Language Engineering, TACL, KBS, ACM TALLIP, ....

• Conferences
• ACL, EMNLP, NAACL, COLING, AAAI, IJCNLP, ICML, NIPS, WWW, KDD, SIGIR, ….
Research papers repository
https://aclanthology.org/
https://arxiv.org/list/cs.CL/recent

Prerequisite

• Excitement about language!


• Willingness to learn

Mandatory
• Data Structures & Algorithms
• Machine Learning
• Python programming

Desirable
• Deep learning

• Strongly recommended to learn ML. This class will not cover fundamentals of ML.
• Instructor/TAs may cover DL-related prerequisites
Course Directives
• Class Time: Mon & Thu, 2 pm – 3:30 pm
• Office Hour: Mon 5-6 pm
• Room: LH-519

HashLearn
• Meet your instructor at least once per 15 days to resolve your doubts.
• Mon 5-5:30 pm (appointment based, email me at least 1 hr before coming)

Marks distribution (tentative):
• Minor 1: 10%
• Minor 2: 10%
• Major: 20%
• Quiz (3): 15%
• Assignment (2): 20%
• Mini-project: 20% (group-wise)
• Paper reading (1): 5% (group-wise?)

• Audit: Discouraged!
• B- (threshold to pass the course)
• Grading Scheme: Relative?
• 75% attendance mandatory (Timble)
• If you want to deregister, please do it ASAP
• Please allow others to register
• Registration limit (80) may not be increased
Mini Project (20%)
• A few problem statements and datasets will be floated (in Jan 2023)*
• A leaderboard will be maintained per problem statement
• Each group should consist of 1-3 students?
• Best Project Award
• You need to
  • develop models
  • evaluate your models
  • prepare presentation
  • write tech report

Students are encouraged to publish their projects in good conferences/journals

Deliverables:
1. Final project report (8%), 8 pages ACL format. Need to arxiv
2. Repo of dataset and source code (2%)
3. Final project presentation (5%)
4. Performance on leaderboard (5%)

* You are welcome to propose a new idea if you find it fascinating to be qualified for a mini project. Instructor opines!
List of Projects
• TBD
Content (Tentative)
• Introduction
• Classical NLP (1980-2010)
  • Regular Expressions, Text Normalization, and Edit Distance
  • Morphology & Finite-state Transducers
  • N-grams, smoothing and entropy
  • HMM, Viterbi and A* decoding
  • Word classes and POS tagging
  • Semantics & distributional semantics
• Deep Learning for NLP (2011 - 2017)
  • Intro to deep learning
  • Word vectors and word window classification (Word2Vec, GloVe, etc.)
  • RNNs and language models (vanishing gradients, fancy RNNs)
  • Sequence-to-sequence models and applications
  • Attention mechanisms & self-attention
  • Transformers
• Adv. NLP (2018 – till date)
  • More about Transformers (BERT, RoBERTa, ELMo, transfer learning)
  • Prompt-based learning
  • In-context learning
  • Multilingual and multimodal models
  • Fairness and ethics in NLP
  • Miscellaneous
Timeline
Phases: Classical NLP, NLP with DL, Adv. NLP
Milestones: Day 1, Minor 1, Minor 2, Major
Quizzes: Quiz 1, Quiz 2, Quiz 3
Assignments: Assignment 1, Assignment 2, Assignment 3
Mini projects: problem statements, then mini project evaluation

Two Assignments: Max(Assignment 1, Assignment 2) + Assignment 3


Acknowledgment
These slides were adapted from the book
SPEECH and LANGUAGE PROCESSING:
An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Advanced NLP, Graham Neubig http://www.phontron.com/class/anlp2022/
Advanced NLP, Mohit Iyyer https://people.cs.umass.edu/~miyyer/cs685/
NLP with Deep Learning, Chris Manning, http://web.stanford.edu/class/cs224n/
Understanding Large Language Models, Danqi Chen https://www.cs.princeton.edu/courses/archive/fall22/cos597G/

and some modifications from presentations found on the web by several scholars, including the following
Credits and Acknowledgment
Husni Al-Muhtaseb
Heshaam Feili Khurshid Ahmad Martha Palmer
James Martin Björn Gambäck Julia Hirschberg
Staffan Larsson
Jim Martin Christian Korthals Elaine Rich
Thomas G. Dietterich Robert Wilensky Christof Monz
Dan Jurafsky
Devika Subramanian Feiyu Xu Bonnie J. Dorr
Sandiway Fong Duminda Wijesekera Nizar Habash
Lee McCluskey Jakub Piskorski
Song young in David J. Kriegman Massimo Poesio
Paula Matuszek Rohini Srihari David Goss-Grubbs
Kathleen McKeown
Mark Sanderson Thomas K Harris
Mary-Angela Papalaskari Michael J. Ciaraldi John Hutchins
Dick Crouch David Finkel Andrew Elks Alexandros Potamianos
Min-Yen Kan Marc Davis Mike Rosner
Tracy Kin
Andreas Geyer-Schulz Latifa Al-Sulaiti
L. Venkata Subramaniam Franz J. Kurfess Ray Larson Giorgio Satta
Martin Volk Tim Finin Jimmy Lin Jerry R. Hobbs
Bruce R. Maxim Nadjet Bouayad Marti Hearst Christopher Manning
Kathy McCoy Hinrich Schütze
Jan Hajič Andrew McCallum Alexander Gelbukh
Hans Uszkoreit
Srinath Srinivasa Nick Kushmerick Gina-Anne Levow
Azadeh Maghsoodi Guitao Gao
Simeon Ntafos Md Shad Akhtar Mark Craven
Qing Ma
Paolo Pirjanian Mohit Iyyer Chia-Hui Chang Zeynep Altan
Ricardo Vilalta Graham Neubig Diana Maynard Edureka
Tom Lenaerts Chris Manning James Allan And many others…
Introduction
Is this a grammatically correct English sentence?

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo


Natural Language Processing
• What is a Natural Language?
Any language that has evolved naturally in
humans through use and repetition without
conscious planning or premeditation.

• What is Natural Language Processing?
A field of computer science, artificial intelligence and
computational linguistics concerned with the interactions
between computers and human (natural) languages.
https://www.javatpoint.com/nlp
Natural Language Processing

• Setup
• Two rooms, two humans, and a computer.
• Room 1: One human C
• Room 2: One computer (A) and one human (B)

• A response generated from room 2 (either by A or B)


• C has to figure out the source of the response
• If C is successful → “A” failed the Turing test
• Else → “A” passed the Turing test

In 1950, Alan Turing published "Computing Machinery and Intelligence", which proposed what is now called the Turing test.
Natural Language Processing

In 1957, Noam Chomsky’s Syntactic Structures revolutionized Linguistics with 'universal grammar', a rule-based system of syntactic structures.
Natural Language Processing

Aravind Krishna Joshi (August 5, 1929 – December 31, 2017) was a Professor of Computer and Cognitive Science at the University of Pennsylvania.

Joshi defined the tree-adjoining grammar formalism, which is often used in computational linguistics and natural language processing.
Natural Language Processing

https://en.wikipedia.org/wiki/History_of_natural_language_processing
Why is NLP challenging?

Ambiguity
The real reason why NLP is hard
“Rohit Sharma was on fire last night. He totally destroyed the other teams”
Ambiguity

● Is ambiguity present in language only?


● No, ambiguity is prevalent in every dimension!

Duck or Rabbit?

shadakhtar:nlp:iiitd:2022:intro
Ambiguity in language

● I saw a girl with a telescope. ⇒ Who has the telescope?
○ I saw a girl with a bicycle. ⇒ No ambiguity!
○ I saw a bus with a telescope. ⇒ No ambiguity!
● Mary had a little lamb.
● Mujhe aapko mithai khilani padegi. ⇒ Who’ll gift whom?
○ I have to gift you some sweets. OR
○ You have to gift me some sweets.
● Public demand changes
○ (a) Public demand changes, but does anybody listen to them? OR
○ (b) Public demand changes, and we companies have to adapt to such changes.
● Baby changing room
● I ate rice with a spoon.
● I ate rice with curd.
● I ate rice with Rahul.

Similar surface structures but different interpretations!

Ambiguity and Punctuation!

A woman without her man is nothing

Ambiguity makes NLP hard
Surface form has multiple interpretations

• Syntactic Ambiguity
• Violinist Linked to JAL Crash Blossoms => main verb?

Etymology: the study of the origin of words and the way in which their meanings have changed throughout history.
Is it a valid sentence?
What about this?

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

The word buffalo has three senses:

1. Noun: Animal (plural is also buffalo)
2. Proper Noun: American city (Buffalo, New York)
3. Verb: To bully someone

Buffalo buffalo, whom other Buffalo buffalo buffalo, buffalo Buffalo buffalo

The sentence uses a restrictive clause, so there are no commas, nor is there the word "which," as in, "Buffalo buffalo, which Buffalo buffalo buffalo, buffalo Buffalo buffalo." This clause is also a reduced relative clause, so the word "that", which could appear between the second and third words of the sentence, is omitted.

Dmitri Borgmann's Beyond Language: Adventures in Word and Thought. 1967.
Why else is natural language
understanding difficult?
• non-standard English: Great job @justinbieber! Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥
• segmentation issues: the New York-New Haven Railroad
• Idioms/Multiword: dark horse, get cold feet, lose face, throw in the towel, Khana-wana (Echo)
• neologisms: unfriend, Retweet, bromance
• world knowledge: Mary and Sue are sisters. Mary and Sue are mothers.
• tricky entity names: Where is A Bug’s Life playing …, Let It Be was recorded …, … a mutation on the for gene …
NLP layers

● Understanding the semantics is a non-trivial task.
● It requires performing a series of incremental tasks.
● NLP happens in layers

NLP trinity

DL

Word and Token

● Word:
○ Smallest sequence of phonemes of a spoken language that can be uttered in isolation
● Word Segmentation/Tokenization:
○ Breaking a string of characters into a sequence of words.
○ Smallest sequence of graphemes that are delimited with some predefined characters (space,
comma, full-stop, etc.);

Ram, Shyam, and Mohan are playing. ⇒ [Ram] [,] [Shyam] [,] [and] [Mohan] [are] [playing] [.]

21,53,010 COVID cases in India. ⇒ [21] [,] [53] [,] [010] [COVID] [cases] [in] [India] [.]

[21,53,010] [COVID] [cases] [in] [India] [.] ✅

Check this out…https://www.abc.com ⇒ [Check] [this] [out] [.] [.] [.] [https] [:] [/] [/] [www] [.] [abc] [.] [com]

[Check] [this] [out] [...] [https://www.abc.com] ✅


#GreatDayEver ⇒ [#] [Great] [Day] [Ever]
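
The tokenization examples above can be sketched with a single regular expression. This is only a minimal illustration, not the tokenizer used in the course: the pattern, the `tokenize` helper, and the CamelCase split for hashtags are assumptions chosen to reproduce the ✅ outputs on this slide.

```python
import re

# Alternatives are tried left to right, so the more specific patterns come first.
TOKEN_PATTERN = re.compile(
    r"""
    https?://\S+              # a URL stays one token
  | \#\w+                     # a hashtag (split further below)
  | \d+(?:,\d+)*(?:\.\d+)?    # numbers with digit-group commas, e.g. 21,53,010
  | \.\.\.|…                  # an ellipsis stays one token
  | \w+                       # ordinary words
  | [^\w\s]                   # any other single punctuation mark
    """,
    re.VERBOSE,
)

# Heuristic split of a CamelCase hashtag body into words (ASCII letters only).
CAMEL_PATTERN = re.compile(r"[A-Z][a-z]+|[A-Z]+(?![a-z])|[a-z]+|\d+")


def tokenize(text):
    """Split text into tokens, keeping URLs and grouped numbers intact."""
    tokens = []
    for tok in TOKEN_PATTERN.findall(text):
        if tok.startswith("#") and len(tok) > 1:
            tokens.append("#")
            tokens.extend(CAMEL_PATTERN.findall(tok[1:]))
        else:
            tokens.append(tok)
    return tokens


print(tokenize("21,53,010 COVID cases in India."))
print(tokenize("#GreatDayEver"))
```

One known limitation of this sketch: a URL followed directly by sentence punctuation absorbs that punctuation, because `\S+` runs to the next whitespace.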
Morphology

● Field of linguistics that studies the internal structure of words


○ How they are formed
○ Their relationship to other words in the same language.
● It defines word formation rule from the root word.
● Morpheme is the smallest linguistic unit that has semantic meaning
○ E.g.:
■ “Pre”, “ed”, “ing”, “s”, “es”, etc.
○ Dogs ⇒ dog + s (plural)
○ Going ⇒ go + ing (present participle)
○ Independently ⇒ independent + ly (Adverb)
⇒ in + dependent + ly (Negation)
⇒ in + depend + ent + ly (relying)
⇒ in + de + pend + ent + ly
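
The decompositions above can be imitated with a toy affix stripper. The affix lists, the minimum stem length, and the `segment` helper are illustrative assumptions; a real morphological analyzer (for example one built on finite-state transducers) needs a lexicon and spelling rules, and this sketch will over-segment many words.

```python
# Hypothetical, tiny affix inventories taken from the examples on this slide.
PREFIXES = ["pre", "in", "de"]
SUFFIXES = ["ing", "ent", "ly", "ed", "es", "s"]
MIN_STEM = 2  # never strip an affix if fewer than this many letters would remain


def segment(word):
    """Greedily peel known prefixes and suffixes off a word."""
    stem = word.lower()
    prefixes, suffixes = [], []

    stripped = True
    while stripped:
        stripped = False
        for p in PREFIXES:
            if stem.startswith(p) and len(stem) - len(p) >= MIN_STEM:
                prefixes.append(p)
                stem = stem[len(p):]
                stripped = True
                break

    stripped = True
    while stripped:
        stripped = False
        for s in SUFFIXES:
            if stem.endswith(s) and len(stem) - len(s) >= MIN_STEM:
                suffixes.insert(0, s)  # outermost suffix is stripped first
                stem = stem[:-len(s)]
                stripped = True
                break

    return prefixes + [stem] + suffixes


for w in ["dogs", "going", "independently"]:
    print(w, "=>", " + ".join(segment(w)))
```

The greedy strategy always yields the deepest split (in + de + pend + ent + ly); choosing among the shallower analyses listed above requires knowing which intermediate forms are real words.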

Pend: (verb) to remain undecided or unsettled.


Morphology

● English, Chinese, etc. are commonly referred to as morphologically poor languages.
● Indian languages, Turkish, Hungarian, etc. are termed morphologically rich languages.

Parts-of-Speech (POS) Tags
● Grammatical class of the word.
○ PRP: Personal Pronoun
○ VBD: Verb, Past
○ DT: Determiner
○ NN: Noun, Singular, Mass
○ TO: to
○ IN: Preposition

He ate an apple .
PRP VBD DT NN .

● PoS disambiguation
○ A word can belong to different grammatical classes.

He went to the park in a car .

PRP VBD TO DT NN IN DT NN .

They went to park the car in the shed .

PRP VBD TO VB DT NN IN DT NN .
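
A minimal sketch of PoS disambiguation for the two sentences above, assuming a hand-written lexicon and two context rules (after TO prefer a verb, after DT prefer a noun). The `LEXICON` table and `tag` function are hypothetical; practical taggers learn such preferences from annotated data.

```python
# Hypothetical lexicon: each word lists its possible tags, most frequent first.
LEXICON = {
    "he": ["PRP"], "they": ["PRP"],
    "went": ["VBD"],
    "to": ["TO"], "in": ["IN"],
    "the": ["DT"], "a": ["DT"],
    "park": ["NN", "VB"],   # the ambiguous word
    "car": ["NN"], "shed": ["NN"],
    ".": ["."],
}


def tag(tokens):
    """Assign one tag per token, using the previous tag to resolve ambiguity."""
    tags = []
    for tok in tokens:
        candidates = LEXICON.get(tok.lower(), ["NN"])  # unknown words default to NN
        choice = candidates[0]
        if len(candidates) > 1:
            prev = tags[-1] if tags else None
            if prev == "TO" and "VB" in candidates:
                choice = "VB"      # "to park" reads as an infinitive
            elif prev == "DT" and "NN" in candidates:
                choice = "NN"      # "the park" reads as a noun
        tags.append(choice)
    return tags


print(tag("He went to the park in a car .".split()))
print(tag("They went to park the car in the shed .".split()))
```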

Chunking

● Identification of non-recursive phrases (noun, verb, etc.)

○ He went to the Indian city Mumbai. ⇒


[NP He] [VP went] [PP to] [NP the Indian city Mumbai]

○ Mumbai green lights women icons on traffic signals earns global praise. ⇒
[NP Mumbai green lights women icons] [PP on] [NP traffic signals] [VP earns] [NP global praise]

Syntax Processing
● Validate the grammatical structure of the sentence.
● Let vocabulary = [the, mango, he, eats, ...]
○ He eats a mango. ⇒ ✅
○ He mango eats a. ⇒ ❌

● The sequence of words must follow the grammatical structure of the language to form a valid sentence.
○ Construct a parse tree.

Parse Tree:
(S (NP (PRP He)) (VP (VBZ eats) (NP (DT a) (NN mango))) .)
Syntax Processing
● Every language has a grammar G = <V, T, P, S>.

Productions (P) or rules:

S → NP VP .
NP → PRP | NN | DT NP
VP → VBZ NP
PRP → He
VBZ → eats
DT → a
NN → mango

Parse Tree:
(S (NP (PRP He)) (VP (VBZ eats) (NP (DT a) (NN mango))) .)
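
The productions above are enough to write a small recognizer. The sketch below is a plain top-down (recursive-descent) recognizer written for this toy grammar only; the `GRAMMAR` dictionary encoding and the function names are assumptions, and the approach would loop forever on a left-recursive grammar.

```python
# The productions from the slide; any symbol without an entry is a terminal.
GRAMMAR = {
    "S": [["NP", "VP", "."]],
    "NP": [["PRP"], ["NN"], ["DT", "NP"]],
    "VP": [["VBZ", "NP"]],
    "PRP": [["He"]],
    "VBZ": [["eats"]],
    "DT": [["a"]],
    "NN": [["mango"]],
}


def ends(symbol, tokens, start):
    """Yield every position at which `symbol` can end when it begins at `start`."""
    if symbol not in GRAMMAR:  # terminal: must match the next token exactly
        if start < len(tokens) and tokens[start] == symbol:
            yield start + 1
        return
    for production in GRAMMAR[symbol]:
        positions = [start]
        for part in production:
            positions = [e for p in positions for e in ends(part, tokens, p)]
        yield from positions


def is_grammatical(sentence):
    """True if the whole token sequence can be derived from S."""
    tokens = sentence.split()
    return len(tokens) in ends("S", tokens, 0)


print(is_grammatical("He eats a mango ."))
print(is_grammatical("He mango eats a ."))
```

Note that the grammar also accepts sentences such as "mango eats He ." because it encodes structure only, not agreement or meaning.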
Syntactic Ambiguity
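I saw a girl with a telescope .

Parse 1 (the PP attaches to the noun phrase):
(S (NP (PRP I)) (VP (VBZ saw) (NP (DT a) (NN girl) (PP (IN with) (NP (DT a) (NN telescope))))) .)

Parse 2 (the PP attaches to the verb phrase):
(S (NP (PRP I)) (VP (VBZ saw) (NP (DT a) (NN girl)) (PP (IN with) (NP (DT a) (NN telescope)))) .)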
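
Syntactic ambiguity shows up concretely as more than one parse tree for the same token sequence. The sketch below enumerates all trees under a small grammar that is assumed for illustration (it allows a PP to attach to either the NP or the VP); the grammar encoding and helper names are not from the slides.

```python
# Assumed toy grammar; any symbol without an entry is a terminal.
GRAMMAR = {
    "S": [["NP", "VP", "."]],
    "NP": [["PRP"], ["DT", "NN"], ["DT", "NN", "PP"]],
    "VP": [["VBZ", "NP"], ["VBZ", "NP", "PP"]],
    "PP": [["IN", "NP"]],
    "PRP": [["I"]],
    "VBZ": [["saw"]],
    "DT": [["a"]],
    "NN": [["girl"], ["telescope"]],
    "IN": [["with"]],
}


def parse(symbol, tokens, start):
    """Yield (bracketed_tree, end_position) for every way to derive `symbol`."""
    if symbol not in GRAMMAR:  # terminal
        if start < len(tokens) and tokens[start] == symbol:
            yield symbol, start + 1
        return
    for production in GRAMMAR[symbol]:
        partials = [([], start)]
        for part in production:
            partials = [
                (children + [sub], end)
                for children, pos in partials
                for sub, end in parse(part, tokens, pos)
            ]
        for children, end in partials:
            yield "(" + symbol + " " + " ".join(children) + ")", end


def all_parses(sentence):
    """Return every complete parse of the sentence as a bracketed string."""
    tokens = sentence.split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


for tree in all_parses("I saw a girl with a telescope ."):
    print(tree)
```

The sentence yields two trees, one per attachment site, while "I saw a girl ." yields exactly one.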
Semantic Role Labelling (SRL)

● Identify the semantic role of each argument (noun phrase) w.r.t. the predicate (main
verb) of the sentence

[John]Agent drove [Mary]Patient [from Delhi]source [to Pune]destination [in his car]instrument

[Ram]Agent hit [Shyam]Patient [with a hockey stick]instrument [yesterday]time

Textual Entailment

● Determine whether one natural language sentence entails (implies) another under an
ordinary interpretation

(Ram hit Shyam with a hockey stick yesterday. → Shyam got hurt) ⇒ Positive TE
(Ram hit Shyam with a hockey stick yesterday. → Shyam did not get hurt) ⇒ Negative TE
(Ram hit Shyam with a hockey stick yesterday. → Shyam got hospitalized) ⇒ non TE

Pragmatics

● Pragmatics considers [Thomas, 1995]:


○ the negotiation of meaning between speaker and listener.
○ the context of the utterance.
○ the intention of the user.

○ Context/World knowledge: An employee coming late to the office.


■ Utterance: Do you know what time it is?
■ Literal meaning: Are you aware of the current time? (Response: Yes, it is 12:30 PM)
■ Pragmatic meaning: Why are you coming so late? (Response: Reason for being late.)

○ Intention:
■ Utterance: Can you pass the water bottle?
■ Literal meaning: Are you able to pass the water bottle? (Response: Yes, I can.)
■ Pragmatic meaning: Pass me the water bottle. (Response: Hand over the water bottle.)

Discourse

● Processing of sequence of sentences.

Mother said to John: Go to school. It is open today. Are you planning to bunk? Father
will be very angry.

○ Discourse processing helps answer these questions.
■ What is open?
■ Bunk what?
■ Why will the father be angry?

Coreference Resolution

● Two referring expressions used to refer to the same entity are said to corefer.
● Determine which phrases in a document corefer.

John showed Bob his Toyota yesterday. It’s similar to the one I bought five years ago.

That was really nice, but he likes this one even better.

Information Extraction

● Extraction of relevant pieces of information

● Named Entity Recognition (NER):


○ Identify names (Proper nouns)
■ [India]Location born [Sundar Pichai]Person is the CEO of [Google]Organization and its parent company [Alphabet]Organization

● Relation extraction:
○ Relation among entities
■ CEO(Sundar Pichai, Google), CEO(Sundar Pichai, Alphabet), Born-at(Sundar Pichai,
India), ParentOrg(Alphabet, Google)

Word Sense Disambiguation (WSD)

● What does a word mean?

○ The fisherman went to the bank. ⇒ Financial bank or river bank?

○ The fisherman went to the bank to withdraw money.


○ The fisherman went to the bank to fish.
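
The three sentences above show that the surrounding words decide the sense. A minimal sketch of that idea follows, using a simplified overlap heuristic in the spirit of the Lesk algorithm; this technique is not named in the slides, and the cue-word sets and `disambiguate` function are hypothetical.

```python
# Hypothetical cue words for each sense of "bank".
SENSES = {
    "financial bank": {"money", "withdraw", "deposit", "loan", "account"},
    "river bank": {"river", "fish", "water", "boat", "shore"},
}


def disambiguate(sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    words = set(sentence.lower().replace(".", "").split())
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    # No cue at all, or a tie between senses, leaves the word ambiguous.
    if best == 0 or len(winners) > 1:
        return "ambiguous"
    return winners[0]


print(disambiguate("The fisherman went to the bank."))
print(disambiguate("The fisherman went to the bank to withdraw money."))
print(disambiguate("The fisherman went to the bank to fish."))
```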

Sentiment Analysis

● Extract polarity orientation of the subjectivity

○ Really superb pillow. Love to sleep on it.. very comfortable... ⇒ Positive

○ It's a mass Chinese product. Too expensive. Thin and useless ⇒ Negative

○ My neighbours are home and it’s good to wake up at 3am in the morning. ⇒ Negative?

○ Campus has deadly snakes. ⇒ Negative

○ Shane Warne is a deadly spinner. ⇒ Positive?

○ The food was cheap. ⇒ Positive?

○ Not to mention the cheap service I got at the restaurant. ⇒ Negative

○ Movie was 4 hrs long. ⇒ Neutral?
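
To see why several of the examples above carry a question mark, consider a naive lexicon-based scorer. The `POLARITY` word list and `polarity` function below are assumptions for illustration; the point of the sketch is where it goes wrong, not how sentiment analysis should be done.

```python
import re

# Hypothetical word-level polarity lexicon.
POLARITY = {
    "superb": 1, "love": 1, "comfortable": 1, "good": 1,
    "expensive": -1, "useless": -1, "deadly": -1,
}


def polarity(text):
    """Sum word polarities and map the total to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(POLARITY.get(w, 0) for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


examples = [
    "Really superb pillow. Love to sleep on it.. very comfortable...",
    "Campus has deadly snakes.",
    "Shane Warne is a deadly spinner.",
    "My neighbours are home and it's good to wake up at 3am in the morning.",
]
for text in examples:
    print(polarity(text), "<=", text)
```

The scorer labels "Shane Warne is a deadly spinner." as Negative and the sarcastic 3am sentence as Positive, both opposite to the labels suggested above: word-level polarity ignores context, domain, and sarcasm.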

Machine Translation

● Given a sentence in the source language L1, convert it to the target language L2 such that the semantics are preserved (judged by adequacy and fluency).

Source: Google Translate

Summarization

● Given a document, summarize the semantics (extract relevant information) in shorter length text.

● Document
○ Sen. Barack Obama sealed the Democratic presidential nomination last night after a grueling
and history-making campaign against Sen. Hillary Rodham Clinton that will make him the first
African American to head a major-party ticket.

● Summary
○ Barack Obama is the Democratic presidential candidate.

Question Answering

● Answer natural language questions based on information presented in the repository.

● Factoid Questions
○ Question: Who is the author of the book Wings of Fire?
○ Answer: A. P. J. Abdul Kalam

● List Questions
○ Question: What are the islands in India?
○ Answer: Andaman Island, Nicobar Island, Labyrinth Island, Barren Island

● Descriptive Questions
○ Question: What is the greenhouse effect?
○ Answer: The analogy used to describe the ability of gases in the atmosphere to absorb heat from the earth’s surface.

Dialog System and Chatbot

● Conversation between two or more parties.

Hate Speech

• Any post that targets a specific individual/group of people based on their ethnicity, religious beliefs,
geographical belonging, race, etc., with malicious intentions of disseminating hate or emboldening
violence.

• #BuildThatWall #BuildTheDamnWall I’m sorry my Lord #Jesus but people are just deaf down
here

• Women ... Can’t live with them...Can’t shoot them

• Related terms

• Insult, Abuse, Offensive, Provocative

Fake News

• A piece of information or an alleged claim that is verifiable to be false.


• Posts created intentionally to spread malicious and false narratives
◦ Leverages the chaos/misinformation to quickly gain political, financial, or regional advantages

Language Technology

Mostly solved
• Spam detection: Let’s go to Agra! ✓ / Buy V1AGRA … ✗
• Part-of-speech (POS) tagging: Colorless green ideas sleep furiously. ⇒ ADJ ADJ NOUN VERB ADV
• Named entity recognition (NER): [Einstein]PERSON met with [UN]ORG officials in [Princeton]LOC

Still really hard
• Question answering (QA): Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?
• Paraphrase: XYZ acquired ABC yesterday / ABC has been taken over by XYZ
• Summarization: The Dow Jones is up, The S&P500 jumped, Housing prices rose ⇒ Economy is good
• Dialog: Where is Citizen Kane playing in SF? / Castro Theatre at 7:30. Do you want a ticket?

Why Study NLP?
• To get a job in industry
• e.g., many current job listings are CL jobs
• Google Inc.
• Amazon Inc.
• Facebook Inc.
• Flipkart Inc., etc.

• To get a job in academia


• As a computational linguist
• computational literacy and an understanding of computational methods will become critical
in the next decade.
