
Eti 3111

The document contains questions about various natural language processing concepts and techniques including keyword normalization, n-grams, document term matrix, text classification models, topic modeling, word embeddings and more.

1) Which of the following techniques can be used for the purpose of keyword
normalization, the process of converting a keyword into its base form?

1. Lemmatization
2. Levenshtein
3. Stemming
4. Soundex

A) 1 and 2
B) 2 and 4
C) 1 and 3
D) 1, 2 and 3
E) 2, 3 and 4
F) 1, 2, 3 and 4
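
To make the idea of reducing a keyword to a base form concrete, here is a minimal sketch contrasting rule-based suffix stripping (stemming) with a dictionary lookup (lemmatization). Both `toy_stem` and `toy_lemmatize`, the suffix list, and the `LEMMAS` table are hypothetical illustrations, not the behavior of any real stemmer or lemmatizer.

```python
def toy_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


# Hypothetical lemma table; real lemmatizers use a full vocabulary and POS tags.
LEMMAS = {"studies": "study", "better": "good", "ran": "run"}


def toy_lemmatize(word):
    """Return the dictionary base form if known, else the word unchanged."""
    return LEMMAS.get(word, word)


for w in ["playing", "played", "plays", "studies"]:
    print(w, toy_stem(w), toy_lemmatize(w))
```

The stem of "studies" comes out as "studi", which is not a dictionary word, while the lookup returns "study". That difference is the usual trade-off between the two approaches.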
2) N-grams are defined as the combination of N keywords together. How many
bigrams can be generated from the given sentence:
A) 7
B) 8
C) 9
D) 10
E) 11
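
As a rough illustration of how n-grams are counted, the sketch below slides a window of size n over a token list. The sentence used is a hypothetical example, not the one the question refers to, and whitespace tokenization is an assumption.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sentence, chosen only to show the counting rule.
sentence = "natural language processing is fun to learn"
tokens = sentence.split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
print(len(tokens), len(bigrams), len(trigrams))
```

A sequence of T tokens yields T - n + 1 n-grams, so any cleaning step that removes tokens (such as stopword removal) lowers the count.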

3) How many trigram phrases can be generated from the following sentence,
after performing the following text cleaning steps:

• Stopword Removal
• Replacing punctuations by a single space

A) 3
B) 4
C) 5
D) 6
E) 7
4) Which of the following regular expressions can be used to identify date(s)
present in the text object:
A) \d{4}-\d{2}-\d{2}
B) (19|20)\d{2}-(0[1-9]|1[0-2])-[0-2][1-9]
C) (19|20)\d{2}-(0[1-9]|1[0-2])-([0-2][1-9]|3[0-1])
D) None of the above
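
The three patterns can be compared by running them against a few strings. The sketch below uses Python's `re.fullmatch`; the sample strings are hypothetical and assume a YYYY-MM-DD layout, which may not be the format used in the question's text object.

```python
import re

# Patterns copied from options A, B, and C of the question.
PATTERNS = {
    "A": r"\d{4}-\d{2}-\d{2}",
    "B": r"(19|20)\d{2}-(0[1-9]|1[0-2])-[0-2][1-9]",
    "C": r"(19|20)\d{2}-(0[1-9]|1[0-2])-([0-2][1-9]|3[0-1])",
}


def matches(pattern, text):
    """True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Hypothetical test strings.
for sample in ["2016-08-31", "2016-13-45", "2016-02-10"]:
    print(sample, {name: matches(p, sample) for name, p in PATTERNS.items()})
```

Pattern A accepts impossible dates such as month 13, and the day part `[0-2][1-9]` in B and C rejects days ending in zero such as the 10th, so none of the three is a precise date validator on its own.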

5) Which of the following models can perform tweet classification with regard to
the context mentioned above?
A) Naive Bayes
B) SVM
C) None of the above
6) You have created a document-term matrix of the data, treating every tweet as
one document. Which of the following is correct with regard to the document-term
matrix?

1. Removal of stopwords from the data will affect the dimensionality of data
2. Normalization of words in the data will reduce the dimensionality of data
3. Converting all the words to lowercase will not affect the dimensionality of the data

A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1, 2 and 3
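
The number of columns in a document-term matrix equals the number of distinct terms, so each preprocessing step can be checked by counting the vocabulary. The sketch below does that in plain Python; the tweets, the stopword list, and the `vocabulary` helper are hypothetical.

```python
def vocabulary(docs, lowercase=False, stopwords=frozenset()):
    """Collect the distinct terms that would become columns of a document-term matrix."""
    vocab = set()
    for doc in docs:
        for token in doc.split():
            if lowercase:
                token = token.lower()
            # Stopwords are compared case-insensitively.
            if token.lower() not in stopwords:
                vocab.add(token)
    return vocab


# Hypothetical tweets and stopword list.
tweets = ["The match was great", "the Match is on", "Great match today"]
stop = {"the", "was", "is", "on"}

raw = vocabulary(tweets)
lowered = vocabulary(tweets, lowercase=True)
filtered = vocabulary(tweets, lowercase=True, stopwords=stop)
print(len(raw), len(lowered), len(filtered))
```

On this toy corpus the vocabulary shrinks from 10 terms to 7 after lowercasing and to 3 after stopword removal, which shows that both steps change the dimensionality.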

7) Which of the following features can be used for accuracy improvement of a
classification model?

A) Frequency count of terms
B) Vector Notation of sentence
C) Part of Speech Tag
D) Dependency Grammar
E) All of these

8) What percentage of the total statements are correct with regard to Topic
Modeling?

1. It is a supervised learning technique
2. LDA (Linear Discriminant Analysis) can be used to perform topic modeling
3. Selection of number of topics in a model does not depend on the size of data
4. Number of topic terms are directly proportional to size of the data

A) 0
B) 25
C) 50
D) 75
E) 100

9) In the Latent Dirichlet Allocation model for text classification purposes, what
do the alpha and beta hyperparameters represent?

A) Alpha: number of topics within documents, beta: number of terms within topics
B) Alpha: density of terms generated within topics, beta: density of topics
generated within terms
C) Alpha: number of topics within documents, beta: number of terms within topics
D) Alpha: density of topics generated within documents, beta: density of
terms generated within topics

Answer: D
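
The "density" reading of alpha can be illustrated by sampling from a symmetric Dirichlet distribution, which is the prior LDA places on per-document topic mixtures. This is a minimal sketch using only the standard library; the helper names, the choice of 5 topics, and the alpha values 0.1 and 10.0 are assumptions for illustration, and no LDA model is fitted here.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k components."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_share(alpha, k=5, trials=2000, seed=0):
    """Average weight of the dominant component across many samples."""
    rng = random.Random(seed)
    return sum(max(sample_dirichlet(alpha, k, rng)) for _ in range(trials)) / trials


sparse = mean_largest_share(0.1)   # low alpha: a document leans on few topics
dense = mean_largest_share(10.0)   # high alpha: weight spread across topics
print(round(sparse, 2), round(dense, 2))
```

With a small alpha most of the probability mass lands on one topic per sample, while a large alpha gives nearly even mixtures. Beta plays the same role for the distribution of terms within each topic.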
10) Solve the equation according to the sentence “I am planning to visit New Delhi
to attend Analytics Vidhya Delhi Hackathon”.

A = (# of words with Noun as the part of speech tag)


B = (# of words with Verb as the part of speech tag)
C = (# of words with frequency count greater than one)

What are the correct values of A, B, and C?

A) 5, 5, 2
B) 5, 5, 0
C) 7, 5, 1
D) 7, 4, 2
E) 6, 4, 3
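
The value of C can be checked mechanically by counting word frequencies in the sentence. The sketch below assumes whitespace tokenization and case-sensitive matching; A and B would need a part-of-speech tagger and are not computed here.

```python
from collections import Counter

sentence = "I am planning to visit New Delhi to attend Analytics Vidhya Delhi Hackathon"

# Count each whitespace-separated token exactly as written.
counts = Counter(sentence.split())

# Words whose frequency count is greater than one.
repeated = sorted(word for word, n in counts.items() if n > 1)
print(repeated, len(repeated))
```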

12) Which of the following pairs of documents contain the same number of terms,
where that number is not equal to the least number of terms in any document in
the entire corpus?

A) d1 and d4
B) d6 and d7
C) d2 and d4
D) d5 and d6

14) What is the term frequency of a term which is used a maximum number
of times in that document?

A) t6 – 2/5
B) t3 – 3/6
C) t4 – 2/6
D) t1 – 2/6
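
Term frequency is commonly computed as the count of a term divided by the total number of terms in the document. The sketch below applies that definition to a hypothetical document; the corpus the question refers to is not reproduced here, and `term_frequencies` is an illustrative helper.

```python
from collections import Counter
from fractions import Fraction


def term_frequencies(tokens):
    """Map each term to count(term) / total number of terms in the document."""
    total = len(tokens)
    return {term: Fraction(n, total) for term, n in Counter(tokens).items()}


# Hypothetical document of six terms.
doc = ["apple", "banana", "apple", "cherry", "apple", "date"]
tf = term_frequencies(doc)

# The term used the maximum number of times has the highest term frequency.
top_term = max(tf, key=tf.get)
print(top_term, tf[top_term])
```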

15) Which of the following techniques is not a part of flexible text matching?

A) Soundex
B) Metaphone
C) Edit Distance
D) Keyword Hashing
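
Edit distance, one of the flexible matching techniques listed above, can be sketched with the standard dynamic-programming recurrence for Levenshtein distance. The function name `edit_distance` and the example words are illustrative choices.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

A small distance between a query and a dictionary term signals a likely misspelling, which is why this measure also appears in spelling suggestion features.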

16) True or False: The Word2Vec model is a machine learning model used to
create vector notations of text objects. Word2Vec contains multiple deep
neural networks.

A) TRUE
B) FALSE

17) Which of the following statement(s) is (are) true for the Word2Vec model?

A) The architecture of word2vec consists of only two layers – continuous bag of
words and skip-gram model
B) Continuous bag of words (CBOW) is a Recurrent Neural Network model
C) Both CBOW and Skip-gram are shallow neural network models
D) All of the above
19) What is the right order of components for a text classification model?

1. Text cleaning
2. Text annotation
3. Gradient descent
4. Model tuning
5. Text to predictors

A) 12345
B) 13425
C) 12534
D) 13452

20) Polysemy is defined as the coexistence of multiple meanings for a word or
phrase in a text object. Which of the following models is likely the best choice to
correct this problem?

A) Random Forest Classifier
B) Convolutional Neural Networks
C) Gradient Boosting
D) All of these

21) Which of the following models can be used for the purpose of document
similarity?

A) Training a word 2 vector model on the corpus that learns context present in the
document
B) Training a bag of words model that learns occurrence of words in the document
C) Creating a document-term matrix and using cosine similarity for each document
D) All of the above
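
Option C can be sketched directly: build a term-count vector per document and compare the vectors with cosine similarity. The documents below are hypothetical, and raw counts with whitespace tokenization are assumed rather than TF-IDF weights.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the term-count vectors of two documents."""
    a, b = Counter(doc_a.split()), Counter(doc_b.split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents.
d1 = "the cat sat on the mat"
d2 = "the cat lay on the rug"
d3 = "stock prices fell sharply"
print(round(cosine_similarity(d1, d2), 3), cosine_similarity(d1, d3))
```

Documents sharing many terms score close to 1, and documents with no terms in common score 0.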

22) What are the possible features of a text corpus

1. Count of word in a document


2. Boolean feature – presence of word in a document
3. Vector notation of word
4. Part of Speech Tag
5. Basic Dependency Grammar
6. Entire document as a feature

A) 1
B) 12
C) 123
D) 1234
E) 12345
F) 123456
23) While creating a machine learning model on text data, you created a document
term matrix of the input data of 100K documents. Which of the following remedies
can be used to reduce the dimensions of data –

1. Latent Dirichlet Allocation


2. Latent Semantic Indexing
3. Keyword Normalization

A) only 1
B) 2, 3
C) 1, 3
D) 1, 2, 3

24) Google Search’s feature – “Did you mean”, is a mixture of different
techniques. Which of the following techniques are likely to be ingredients?

1. Collaborative Filtering model to detect similar user behaviors (queries)
2. Model that checks for Levenshtein distance among the dictionary terms
3. Translation of sentences into multiple languages

A) 1
B) 2
C) 1, 2
D) 1, 2, 3

25) While working with text data obtained from news sentences, which are
structured in nature, which of the grammar-based text parsing techniques can be
used for noun phrase detection, verb phrase detection, subject detection and
object detection.

A) Part of speech tagging


B) Dependency Parsing and Constituency Parsing
C) Skip Gram and N-Gram extraction
D) Continuous Bag of Words

26) Social Media platforms are the most intuitive form of text data. You are given a
corpus of complete social media data of tweets. How can you create a model that
suggests the hashtags?

A) Perform Topic Models to obtain most significant words of the corpus


B) Train a Bag of Ngrams model to capture top n-grams – words and their combinations
C) Train a word2vector model to learn repeating contexts in the sentences
D) All of these

27) While working with context extraction from a text data, you encountered two
different sentences: The tank is full of soldiers. The tank is full of nitrogen. Which
of the following measures can be used to remove the problem of word sense
disambiguation in the sentences?

A) Compare the dictionary definition of an ambiguous word with the terms
contained in its neighborhood
B) Co-reference resolution, in which one resolves the meaning of the ambiguous word
using the proper noun present in the previous sentence
C) Use dependency parsing of sentence to understand the meanings
28) Collaborative Filtering and Content Based Models are the two popular
recommendation engines. What role does NLP play in building such algorithms?

A) Feature Extraction from text


B) Measuring Feature Similarity
C) Engineering Features for vector space learning model
D) All of these

29) Retrieval based models and Generative models are the two popular techniques
used for building chatbots. Which of the following is an example of retrieval model
and generative model respectively.

A) Dictionary based learning and Word 2 vector model


B) Rule-based learning and Sequence to Sequence model
C) Word 2 vector and Sentence to Vector model
D) Recurrent neural network and convolutional neural network

30) What is the major difference between CRF (Conditional Random Field) and
HMM (Hidden Markov Model)?

A) CRF is Generative whereas HMM is Discriminative model


B) CRF is Discriminative whereas HMM is Generative model
C) Both CRF and HMM are Generative model
D) Both CRF and HMM are Discriminative model
1. What is the main challenge/s of NLP?
a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) All of the mentioned

Ans:- a) Handling Ambiguity of Sentences

2. Modern NLP algorithms are based on machine learning, especially statistical machine
learning.
a) True
b) False

Ans:- a) True

3. Which of the following includes major tasks of NLP?


a) Automatic Summarization
b) Discourse Analysis
c) Machine Translation
d) All of the mentioned

Ans:- d) All of the mentioned

4. Natural Language generation is the main task of Natural language processing.


a) True
b) False

Ans:- a) True

OCR (Optical Character Recognition) uses NLP.


a) True
b) False

Ans - a) True

5. Parsing determines Parse Trees (Grammatical Analysis) for a given sentence.


a) True
b) False

Ans- a) True

6. Given a sound clip of a person or people speaking, determine the textual representation of the
speech.
a) Text-to-speech
b) Speech-to-text
c) All of the mentioned
d) None of the mentioned

Ans:- b) Speech-to-text

7. Which of the below are NLP use cases?


a. Detecting objects from an image
b. Facial Recognition
c. Speech Biometric
d. Text Summarization
Ans: d) Text Summarization
1. What are the undesirable properties of knowledge?

• A. Voluminous

• B. Difficult to characterize

• C. Variability

• D. All of the above

2. Morphological Segmentation

• A. Does Discourse Analysis

• B. Separate words into individual morphemes and identify the class of the
morphemes

• C. Is an extension of propositional logic

• D. None

3. How should knowledge be represented to be used for an AI technique?

• A. When two individual situations are represented, knowledge should provide
generalization such that only common properties of both situations are represented
rather than representing both situations individually

• B. Knowledge should be represented such that it should be understood by the people
who have provided it

• C. Knowledge should be represented in a way that it can be easily modified

• D. All of these

4. How many types of entities are there in knowledge representation?

• A. Facts

• B. Symbols

• C. Both A and B

• D. None

5. What are the properties of a good knowledge representation system?

• A. Representation Adequacy

• B. Inferential Adequacy

• C. Inferential Efficiency

• D. All of these
6. Natural Language Processing (NLP) is field of

• A. Computer Science

• B. Artificial Intelligence

• C. Linguistics

• D. All of the mentioned

7. What are difficulties in NLU?

• A. Lexical ambiguity

• B. Syntax Level ambiguity

• C. Both A and B

• D. None

8. Which domains of study does Artificial Intelligence include?

• A. Computer Science

• B. Cognitive Science

• C. Engineering

• D. All the above

9. What contributes to Artificial Intelligence?

• A. Computer Science

• B. Biology

• C. Psychology

• D. All of the above

10. What are roles in AI career?

• A. Software analysts and developers

• B. Computer scientists and computer engineers

• C. Algorithm specialists

• D. All of the above

1. An Artificial Intelligence system developed by Terry A. Winograd to
permit an interactive dialogue about a domain he called blocks-world.
• SIMD
• STUDENT
• SHRDLU
• BACON

2. What is Artificial intelligence?

• Programming with your own intelligence


• Putting your intelligence into Computer
• Making a Machine intelligent
• Playing a Game

View Answer
Artificial intelligence is Making a Machine intelligent

3. DARPA, the agency that has funded a great deal of American
Artificial Intelligence research, is part of the Department of:

• Education
• Defense
• Energy
• Justice

View Answer
DARPA, the agency that has funded a great deal of American Artificial Intelligence
research, is part of the Department of Defense.

4. Who is the “father” of artificial intelligence?

• John McCarthy
• Fisher Ada
• Allen Newell
• Alan Turing

View Answer
The “father” of artificial intelligence is John McCarthy.

5. KEE is a product of:

• IntelliCorp
• Teknowledge
• Texas Instruments
• Tech knowledge

View Answer
KEE is a product of IntelliCorp.

6. Default reasoning is another type of -

• Analogical reasoning
• Bitonic reasoning
• Non-monotonic reasoning
• Monotonic reasoning

View Answer
Default reasoning is another type of Non-monotonic reasoning.

7. Weak AI is

• a set of computer programs that produce output that would be considered to reflect
intelligence if it were generated by humans.
• the study of mental faculties through the use of mental models implemented on a computer.
• the embodiment of human intellectual capabilities within a computer.
• All of the above

View Answer
Weak AI is the study of mental faculties through the use of mental models implemented
on a computer.

8. If a robot can alter its own trajectory in response to external
conditions, it is considered to be:

• mobile
• open loop
• intelligent
• non-servo

View Answer
If a robot can alter its own trajectory in response to external conditions, it is considered to
be intelligent.
1) What is the field of Natural Language Processing (NLP)?
a) Computer Science
b) Artificial Intelligence
c) Linguistics
d) All of the mentioned
Answer: d
2) NLP is concerned with the interactions between computers and human
(natural) languages.
a) True
b) False
Answer: a
3) What is the main challenge/s of NLP?
a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) All of the mentioned
Answer: a
4) NLP stands for Natural Language Processing.
a) True
b) False
Answer: a
5) Choose from the following areas where NLP can be useful.
a) Automatic Text Summarization
b) Automatic Question-Answering Systems
c) Information Retrieval
d) All of the above
Answer: d
6) Natural language processing is divided into the two sub fields of:
a) Symbolic and numeric
b) Time and motion
c) Algorithmic and heuristic
d) Understanding and generation
Answer: d
7) A natural language generation program must decide:
a) what to say
b) when to say something
c) why it is being used
d) both a and b
Answer: d
8) People overcome natural language problems by:
a) grouping attributes into frames
b) understanding ideas in context
c) identifying with familiar situations
d) both (b) and (c)
Answer: d
MCQS on Natural Language Processing
1. What is the field of Natural Language Processing (NLP)?
a) Computer Science
b) Artificial Intelligence
c) Linguistics
d) All of the mentioned
Answer: d

2. NLP is concerned with the interactions between computers and human (natural)
languages.
a) True
b) False
Answer: a

3. What is the main challenge/s of NLP?


a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) all of the mentioned
Answer: a

4. Modern NLP algorithms are based on machine learning, especially statistical
machine learning.
a) True
b) False
Answer: a

5. Choose from the following areas where NLP can be useful.


a) Automatic Text Summarization
b) Automatic Question-Answering Systems
c) Information Retrieval
d) All of the mentioned
Answer: d

6. Which of the following includes major tasks of NLP?


a) Automatic Summarization
b) Discourse Analysis
c) Machine Translation
d) All of the mentioned

Answer: d
7. What is Coreference Resolution?
a) Anaphora Resolution
b) Given a sentence or larger chunk of text, determine which words (“mentions”) refer to the
same objects (“entities”)
c) All of the mentioned
d) None of the mentioned
Answer: b

8. What is Machine Translation?


a) Converts one human language to another
b) converts human language to machine language
c) Converts any human language to English
d) Converts Machine language to human language
Answer: a

9. The more general task of coreference resolution also includes identifying so-called
“bridging relationships” involving referring expressions.
a) True
b) False
Answer: a

10. What is Morphological Segmentation?


a) Does Discourse Analysis
b) Separate words into individual morphemes and identify the class of the morphemes
c) Is an extension of propositional logic
d) None of the mentioned
Answer: b

11. Given a stream of text, Named Entity Recognition determines which pronoun maps
to which noun.
a) False
b) True
Answer: a

12. Natural Language generation is the main task of Natural language processing.
a) True
b) False
Answer: a

13. OCR (Optical Character Recognition) uses NLP.


a) True
b) False
Answer: a
14. Parts-of-Speech tagging determines ___________
a) part-of-speech for each word dynamically as per meaning of the sentence
b) part-of-speech for each word dynamically as per sentence structure
c) all part-of-speech for a specific word given as input
d) all of the mentioned
Answer: d

15. Parsing determines Parse Trees (Grammatical Analysis) for a given sentence.
a) True
b) False
Answer: a

16. IR (Information Retrieval) and IE (Information Extraction) are the same thing.
a) True
b) False
Answer: b

17. Many words have more than one meaning; we have to select the meaning which
makes the most sense in context. This can be resolved by ____________
a) Fuzzy Logic
b) Word Sense Disambiguation
c) Shallow Semantic Analysis
d) All of the mentioned
Answer: b

18. Given a sound clip of a person or people speaking, determine the textual
representation of the speech.
a) Text-to-speech
b) Speech-to-text
c) All of the mentioned
d) none of the mentioned
Answer: b

19. Speech Segmentation is a subtask of Speech Recognition.


a) True
b) False
Answer: a
20. In linguistic morphology _____________ is the process for reducing inflected words
to their root form.
a) Rooting
b) Stemming
c) Text-Proofing
d) both Rooting & Stemming
Answer: b

21. Collaborative Filtering and Content Based Models are the two popular
recommendation engines. What role does NLP play in building such algorithms?

a) Feature Extraction from text


b) Measuring Feature Similarity
c) Engineering Features for vector space learning model
d) All of these

Answer: d

22. With respect to this context-free dependency graph, how many sub-trees exist in
the sentence?

a) 3
b) 4
c) 5
d) 6

Answer: d
MCQ on Natural Language Processing
1. What is the field of Natural Language Processing (NLP)?
a) Computer Science
b) Artificial Intelligence
c) Linguistics
d) All of the mentioned
Answer: d

2. NLP is concerned with the interactions between computers and human (natural) languages.
a) True
b) False
Answer: a

3. What is the main challenge/s of NLP?


a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) All of the mentioned
Answer: a

4. Modern NLP algorithms are based on machine learning, especially statistical machine
learning.
a) True
b) False
Answer: a

5. Choose from the following areas where NLP can be useful.


a) Automatic Text Summarization
b) Automatic Question-Answering Systems
c) Information Retrieval
d) All of the mentioned
Answer: d

6. Which of the following includes major tasks of NLP?


a) Automatic Summarization
b) Discourse Analysis
c) Machine Translation
d) All of the mentioned
Answer: d
7. What is Co-reference Resolution?
a) Anaphora Resolution
b) Given a sentence or larger chunk of text, determine which words (“mentions”) refer to the
same objects (“entities”)
c) All of the mentioned
d) None of the mentioned
Answer: b

8. Which of the below are NLP use cases?


a. Detecting objects from an image
b. Facial Recognition
c. Speech Biometric
d. Text Summarization
Answer: d

9. DEC advertises that it helped to create “the world’s first expert system routinely used in an
industrial environment,” called XCON or __________
a) PDP-11
b) R1
c) VAX
d) MAGNOM
Answer: b

10. Prior to the invention of time-sharing, the prevalent method of computer access was
____________
a) batch processing
b) telecommunication
c) remote access
d) all of the mentioned
Answer: a

11. Seymour Papert of the MIT AI lab created a programming environment for children
called ___________
a) BASIC
b) LOGO
c) MYCIN
d) FORTRAN
Answer: b
12. The Strategic Computing Program is a project of which of the following?
a) Defense Advanced Research Projects Agency
b) National Science Foundation
c) Jet Propulsion Laboratory
d) All of the mentioned
Answer: a

13. The original LISP machines produced by both LMI and Symbolics were based on
research performed at?
a) CMU
b) MIT
c) Stanford University
d) RAMD
Answer: b

14. In LISP, the addition 3 + 2 is entered as _______________


a) 3 + 2
b) 3 add 2
c) 3 + 2 =
d) (+ 3 2)
Answer: d

15. What is Weak AI?


a) the embodiment of human intellectual capabilities within a computer
b) a set of computer programs that produce output that would be considered to reflect
intelligence if it were generated by humans
c) the study of mental faculties using mental models implemented on a computer
d) all of the mentioned
Answer: c

16. In LISP, which function returns t if its argument is a CONS cell and nil otherwise?
a) (cons )
b) (consp )
c) (eq )
d) (cous =)
Answer: b
17. In a rule-based system, what is the form of procedural domain knowledge?
a) production rules
b) rule interpreters
c) meta-rules
d) control rules
Answer: a

18. If a robot can alter its own trajectory in response to external conditions, it is considered to
be ____________
a) intelligent
b) mobile
c) open loop
d) non-servo
Answer: a

19. In LISP, which function assigns the symbol x to y?


a) (setq y x)
b) (set y = ‘x’)
c) (setq y = ‘x’)
d) (setq y ‘x’)
Answer: d

20. One of the leading American robotics centres is the Robotics Institute located at?
a) CMU
b) MIT
c) RAND
d) SRI
Answer: a
1. What is the field of Natural Language Processing (NLP)?
a) Computer Science
b) Artificial Intelligence
c) Linguistics
d) All of the mentioned
2. NLP is concerned with the interactions between computers and human (natural) languages.
a) True
b) False
3. What is the main challenge/s of NLP?
a) Handling Ambiguity of Sentences
b) Handling Tokenization
c) Handling POS-Tagging
d) All of the mentioned
4. Modern NLP algorithms are based on machine learning, especially statistical machine
learning.
a) True
b) False
5. Which of the following includes major tasks of NLP?
a) Automatic Summarization
b) Discourse Analysis
c) Machine Translation
d) All of the mentioned
6. What is Machine Translation?
a) Converts one human language to another
b) Converts human language to machine language
c) Converts any human language to English
d) Converts Machine language to human language
7. What is Morphological Segmentation?
a) Does Discourse Analysis
b) Separate words into individual morphemes and identify the class of the morphemes
c) Is an extension of propositional logic
d) None of the mentioned
8. Natural Language generation is the main task of Natural language processing.
a) True
b) False
9. OCR (Optical Character Recognition) uses NLP.
a) True
b) False
10. Parts-of-Speech tagging determines ___________
a) part-of-speech for each word dynamically as per meaning of the sentence
b) part-of-speech for each word dynamically as per sentence structure
c) all part-of-speech for a specific word given as input
d) all of the mentioned
11. Parsing determines Parse Trees (Grammatical Analysis) for a given sentence.
a) True
b) False
MCQs on Natural Language Processing
1. What is the field of Natural Language Processing (NLP)?

i. Computer Science
ii. Artificial Intelligence
iii. Linguistics
iv. All of the mentioned

2. NLP is concerned with the interactions between computers and human (natural)
languages.
i. True
ii. False

3. What is the main challenge/s of NLP?


i. Handling Ambiguity of Sentences
ii. Handling Tokenization
iii. Handling POS-Tagging
iv. All of the mentioned

4. Modern NLP algorithms are based on machine learning, especially statistical
machine learning.
i. True
ii. False

5. Choose from the following areas where NLP can be useful.


i. Automatic Text Summarization
ii. Automatic Question-Answering Systems
iii. Information Retrieval
iv. All of the mentioned

6. Which of the following includes major tasks of NLP?


i. Automatic Summarization
ii. Discourse Analysis
iii. Machine Translation
iv. All of the mentioned

7. What is Coreference Resolution?


i. Anaphora Resolution
ii. Given a sentence or larger chunk of text, determine which words
(“mentions”) refer to the same objects (“entities”)
iii. All of the mentioned
iv. None of the mentioned
8. What is Machine Translation?
i. Converts one human language to another
ii. Converts human language to machine language
iii. Converts any human language to English
iv. Converts Machine language to human language

9. The more general task of coreference resolution also includes identifying so-called
“bridging relationships” involving referring expressions.
i. True
ii. False

10. What is Morphological Segmentation?


i. Does Discourse Analysis
ii. Separate words into individual morphemes and identify the class of the
morphemes
iii. Is an extension of propositional logic
iv. None of the mentioned
MCQs on IOT
1. MQTT stands for _____________

i. MQ Telemetry Things
ii. MQ Transport Telemetry
iii. MQ Transport Things
iv. MQ Telemetry Transport

2. MQTT is better than HTTP for sending and receiving data.

i. True
ii. False

3. Which protocol is lightweight?

i. MQTT
ii. HTTP
iii. CoAP
iv. SPI

4. PubNub publishes and subscribes _________ in order to send and receive messages.

i. Network
ii. Account
iii. Portal
iv. Keys

5. Which one out of these is not a data link layer technology:

i. Bluetooth
ii. UART
iii. WiFi
iv. HTTP

6. Which layer is called a port layer in OSI model:


i. Session
ii. Application
iii. Presentation
iv. Transport
