NLP and Evaluation - MCQ
4. Which NLP technique is used to remove suffixes from words to get their root form?
A. Lemmatization
B. Tokenization
C. Stemming
D. Parsing
Answer: C. Stemming
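To illustrate suffix removal, here is a deliberately naive sketch in Python. The suffix list and the function name naive_stem are hypothetical, chosen only for this example; real stemmers such as the Porter stemmer apply many more rules.

```python
# Toy suffix-stripping stemmer (an illustrative assumption, much cruder
# than a real stemming algorithm such as Porter's).
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["jumping", "jumped", "jumps", "studies"]
stems = [naive_stem(w) for w in words]
print(stems)  # ['jump', 'jump', 'jump', 'studi']
```

Note that "studi" is not a dictionary word. Stemming only chops suffixes, whereas lemmatization would map "studies" to the dictionary form "study".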
7. Which of the following techniques is used to reduce words to their base or root form?
A. Tokenization
B. Lemmatization
C. Stopword removal
D. Sentence segmentation
Answer: B. Lemmatization
10. Which of the following can be used to remove unnecessary words such as "the," "is," or
"and" from text?
A. Stopword removal
B. Tokenization
C. Lemmatization
D. Stemming
Answer: A. Stopword removal
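A minimal sketch of stopword removal in Python, assuming whitespace tokenization and a small hand-picked stopword set (both are simplifications for illustration; NLP libraries ship far longer lists):

```python
# Small hand-picked stopword set for illustration only (an assumption).
STOPWORDS = {"the", "is", "and", "a", "on"}


def remove_stopwords(text: str) -> list:
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOPWORDS]


filtered = remove_stopwords("The cat is on the mat and the dog is outside")
print(filtered)  # ['cat', 'mat', 'dog', 'outside']
```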
11. If you want to replace all occurrences of numbers in text with a specific token like <NUM>,
which process would you use?
A. Lemmatization
B. Noise removal
C. Numerical substitution
D. Regular expression substitution
Answer: D. Regular expression substitution
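Replacing numbers with a placeholder token can be sketched with Python's standard re module. The pattern below is an assumption that covers only integers and simple decimals; signed numbers, thousands separators, and similar formats would need a broader pattern.

```python
import re

# Matches integers and simple decimals such as 3, 42, or 19.99.
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def mask_numbers(text: str, token: str = "<NUM>") -> str:
    """Replace every matched number in the text with the placeholder token."""
    return NUMBER_PATTERN.sub(token, text)


masked = mask_numbers("Order 66 shipped 3 items for 19.99 dollars")
print(masked)  # Order <NUM> shipped <NUM> items for <NUM> dollars
```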
12. In what situation would removing punctuation during text normalization NOT be
recommended?
A. When punctuation helps in sentiment analysis
B. When performing topic modeling
C. When normalizing text for keyword search
D. When preparing text for machine translation
Answer: A. When punctuation helps in sentiment analysis
15. What does "IDF" (Inverse Document Frequency) penalize in the calculation of TF-IDF?
A. Words that are too frequent across the corpus
B. Words that are unique to a document
C. Words that appear in the first paragraph
D. Words with multiple meanings
Answer: A. Words that are too frequent across the corpus
17. In a scenario where a word appears in almost every document of a corpus, what will its IDF
score be?
A. High
B. Low
C. Zero
D. Undefined
Answer: B. Low
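The effect can be checked numerically. The sketch below assumes the unsmoothed definition IDF = log(N / df), where N is the number of documents and df is the number of documents containing the term; libraries often use smoothed variants that give different values. The tiny corpus is made up for illustration.

```python
import math


def idf(term: str, documents: list) -> float:
    """Unsmoothed IDF: log(N / df). Assumes the term occurs in at least one document."""
    n_docs = len(documents)
    doc_freq = sum(1 for doc in documents if term in doc)
    return math.log(n_docs / doc_freq)


# Toy corpus of tokenized documents (illustrative data only).
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "bird", "sang"],
    ["a", "cat", "purred"],
]

idf_the = idf("the", corpus)    # "the" appears in 3 of 4 documents
idf_bird = idf("bird", corpus)  # "bird" appears in 1 of 4 documents
print(round(idf_the, 3), round(idf_bird, 3))  # 0.288 1.386
```

Under this unsmoothed formula, a word that appears in every document gets log(1) = 0. A word that appears in almost every document gets a small positive value, which is why "Low" is the expected answer.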
23. Which of the following is an example of a sentence that is syntactically correct but
semantically incorrect?
A. "The cat sleeps on the mat."
B. "The quick brown fox jumps over the lazy dog."
C. "The green idea sleeps furiously."
D. "Runs dog the park in."
Answer: C. "The green idea sleeps furiously."
27. Which of the following best describes the relationship between syntax and semantics?
A. Syntax focuses on sentence structure, while semantics focuses on meaning.
B. Syntax focuses on the meaning of sentences, while semantics focuses on grammar.
C. Syntax and semantics are unrelated in NLP.
D. Syntax focuses on tokens, while semantics focuses on punctuation.
Answer: A. Syntax focuses on sentence structure, while semantics focuses on meaning.
29. In NLP, a sentence that is syntactically correct but semantically meaningless is known
as:
A. A grammatically incorrect sentence
B. A semantically ambiguous sentence
C. A syntactically valid but nonsensical sentence
D. A semantically rich sentence
Answer: C. A syntactically valid but nonsensical sentence
EVALUATION MCQs
31. Which of the following metrics is commonly used to evaluate classification models?
A. Mean Absolute Error (MAE)
B. Confusion Matrix
C. Root Mean Square Error (RMSE)
D. BLEU Score
Answer: B. Confusion Matrix
33. Which of these evaluation metrics is specifically used for regression models?
A. Precision
B. F1-Score
C. Mean Squared Error (MSE)
D. ROC-AUC
Answer: C. Mean Squared Error (MSE)
37. When an AI model performs well on the training data but poorly on evaluation data, it is
likely:
A. Underfitting
B. Overfitting
C. Well-generalized
D. Regularized
Answer: B. Overfitting
38. Which metric is most appropriate to evaluate the balance between precision and recall in
a binary classification problem?
A. Accuracy
B. F1-Score
C. Mean Absolute Error
D. Log Loss
Answer: B. F1-Score
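F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A short Python sketch with made-up precision and recall values shows how it penalizes an imbalance between the two:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(0.8, 0.8)  # precision and recall agree
lopsided = f1_score(1.0, 0.1)  # perfect precision, very poor recall
print(round(balanced, 3), round(lopsided, 3))  # 0.8 0.182
```

The arithmetic mean of 1.0 and 0.1 would be 0.55, but the harmonic mean drops to about 0.18, so a model cannot score well on F1 by excelling at only one of the two.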
41. Which value in a confusion matrix represents the number of correctly classified positive
samples?
A. True Positive (TP)
B. False Positive (FP)
C. True Negative (TN)
D. False Negative (FN)
Answer: A. True Positive (TP)
43. What does the sum of all the values in a confusion matrix represent?
A. Total number of correctly classified samples
B. Total number of misclassified samples
C. Total number of samples in the dataset
D. Total number of features in the dataset
Answer: C. Total number of samples in the dataset
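Every sample falls into exactly one of the four cells of a binary confusion matrix, so the cells add up to the dataset size. A small sketch, assuming labels encoded as 1 (positive) and 0 (negative) and using made-up predictions:

```python
def confusion_counts(y_true: list, y_pred: list) -> dict:
    """Count TP, FP, TN, and FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["TP"] += 1
        elif actual == 0 and predicted == 1:
            counts["FP"] += 1
        elif actual == 0 and predicted == 0:
            counts["TN"] += 1
        else:  # actual == 1 and predicted == 0
            counts["FN"] += 1
    return counts


# Illustrative labels only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
total = sum(counts.values())
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(total)   # 8
```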
47. If a model predicts all samples as positive, which metric will be high regardless of
performance?
A. Precision
B. Recall
C. Accuracy
D. Specificity
Answer: B. Recall
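The reason is that predicting everything as positive leaves no false negatives, so recall = TP / (TP + FN) becomes 1.0 whenever the data contains at least one positive sample. The sketch below uses a made-up, imbalanced label list to show that precision and accuracy do not benefit in the same way:

```python
# Illustrative labels: 2 positives and 8 negatives.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y_pred = [1] * len(y_true)  # the model predicts positive for every sample

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

recall = tp / (tp + fn)            # no positives are missed
precision = tp / (tp + fp)         # most predicted positives are wrong
accuracy = (tp + tn) / len(y_true)

print(recall, precision, accuracy)  # 1.0 0.2 0.2
```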