DMBI-Viva Sample Questions
Module II: Preprocessing
1. What are the different types of attributes? Explain with examples.
2. Problems on basic statistical descriptions of data, such as finding the mean, median,
mode, midrange, variance, and standard deviation for given data.
3. What are the q-q plot and boxplot for given data?
4. What is a five number summary of data?
5. How can we compute dissimilarity between two binary attributes?
6. What is Euclidean distance, Manhattan distance, Minkowski distance?
7. What is cosine similarity? Explain in brief the major tasks in data preprocessing.
8. What are the different ways to handle missing data?
9. What are the different ways to handle noisy data?
10. Problems on correlation analysis for categorical (chi-square test) and numerical data.
11. What are the different data transformation strategies?
12. State the different data reduction strategies.
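For practice with questions 2, 4, 6, and 7 above, the following is a minimal sketch using only the Python standard library. The data values and the helper names `minkowski` and `cosine_similarity` are made up for illustration. Variance and standard deviation are computed in their population form (dividing by n), and quartile conventions vary between textbooks, so the five-number summary may differ slightly from a hand calculation.

```python
import math
import statistics

# Illustrative data only; these values are not taken from any syllabus problem.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)       # a list, since data can have several modes
midrange = (min(data) + max(data)) / 2
variance = statistics.pvariance(data)    # population variance (divide by n)
std_dev = statistics.pstdev(data)        # population standard deviation

# Five-number summary: minimum, Q1, median, Q3, maximum.
# "inclusive" is one of several quartile conventions (an assumption here).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, q2, q3, max(data))


def minkowski(x, y, h):
    """Minkowski distance of order h: h=1 is Manhattan, h=2 is Euclidean."""
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1 / h)


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


p, q = (1, 2), (4, 6)
manhattan = minkowski(p, q, 1)
euclidean = minkowski(p, q, 2)
```

Setting `h` to 1 or 2 shows that Manhattan and Euclidean distance are both special cases of Minkowski distance.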
Module III: Classification
1. Write the decision tree algorithms: ID3, C4.5, and CART.
2. Explain the attribute selection measures (Information Gain, Gain Ratio, Gini Index).
3. What are overfitting and tree pruning?
4. State Bayes' Theorem.
5. Explain the Naïve Bayesian classification algorithm with an example. Why is it called naïve?
6. State the advantages and disadvantages of the Naïve Bayes algorithm.
7. What are the different metrics for evaluating classifier performance (Accuracy, Precision,
Recall, F1 score, Specificity, Sensitivity)?
8. What is the class imbalance problem? Explain with an example.
9. What are the different methods for evaluating the accuracy of a classifier (Holdout method,
Random Subsampling, Cross-Validation, Bootstrap method)?
10. Explain the ensemble methods for improving the accuracy of a classifier (bagging,
boosting, and random forest).
11. What are simple linear regression and multiple linear regression? Give examples.
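For practice with question 7 above, the following is a minimal sketch of how the listed metrics follow from the four counts of a binary confusion matrix. The function name `classifier_metrics` and the counts passed to it are hypothetical, chosen only to make the arithmetic easy to check by hand. The sketch assumes none of the denominators is zero.

```python
def classifier_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # recall is the same as sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration only.
metrics = classifier_metrics(tp=40, fp=20, fn=10, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a heavily imbalanced class distribution, accuracy alone can look high while recall or specificity is poor, which is why the metrics are usually reported together.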
Module IV: Clustering
1. What is clustering? State the applications of clustering algorithms.
2. What are the requirements of a clustering algorithm?
3. What are the different approaches to clustering?
4. Explain the partitioning approach to clustering (k-means and k-medoids methods).
5. Explain the hierarchical approach to clustering (agglomerative and divisive clustering).
6. What are single-linkage and complete-linkage agglomerative clustering?
7. What is a dendrogram?
8. What is DBSCAN? Name the two parameters used in the DBSCAN algorithm.
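For practice with question 6 above, the following is a minimal sketch of the two inter-cluster distances, assuming Euclidean distance between points. The function names and the two small clusters are hypothetical, chosen for illustration.

```python
import math
from itertools import product


def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(math.dist(p, q) for p, q in product(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(math.dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical 2-D clusters for illustration only.
cluster_a = [(0, 0), (0, 1)]
cluster_b = [(3, 0), (4, 0)]

d_single = single_linkage(cluster_a, cluster_b)      # closest pair: (0, 0) and (3, 0)
d_complete = complete_linkage(cluster_a, cluster_b)  # farthest pair: (0, 1) and (4, 0)
```

At each step, agglomerative clustering merges the two clusters with the smallest inter-cluster distance, so the choice between these two functions changes which clusters merge first and therefore the shape of the dendrogram.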