DMBI-Viva Sample Questions

Uploaded by

mr.mohitpatil003
Copyright: © All Rights Reserved

Sample Viva Questions

Module I: Data Warehousing and Data Mining


1. What is a DWH? Explain DWH characteristics.
2. What are the advantages and applications of a DWH?
3. Why is the ER model not suitable for a DWH? What are the steps in dimensional modeling?
4. Define dimension, fact, fact table and dimension table with examples.
5. Difference between star and snowflake schema.
6. Design star and snowflake schemas for a given system.
7. Difference between OLTP and OLAP.
8. What are the different OLAP operations? Explain with examples.
9. Problems on writing a sequence of OLAP operations for a given query.
10. Explain the steps of KDD.
11. State any two decision-making activities for which organizations use data in a DWH.
12. What is a concept hierarchy? What are partial-order and total-order concept hierarchies? Explain with an example.
13. What is data mining? State applications of data mining.
14. What are the different types of patterns that can be mined?

Module II: Preprocessing
1. What are the different types of attributes? Explain with examples.
2. Problems on basic statistical descriptions of data, such as finding the mean, median, mode, midrange, variance and standard deviation for given data.
3. What are the q-q plot and the boxplot for given data?
4. What is a five-number summary of data?
5. How can we compute dissimilarity between two binary attributes?
6. What is Euclidean distance, Manhattan distance, Minkowski distance?
7. What is cosine similarity? Explain in brief the major tasks in data preprocessing.
8. What are the different ways to handle missing data?
9. What are the different ways to handle noisy data?
10. Problems on correlation analysis for categorical (chi-square test) and numerical data.
11. What are the different data transformation strategies?
12. State different data reduction strategies.
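
For questions 6 and 7 of this module, the distance and similarity measures can be sketched in a few lines of Python. This is only an illustrative sketch using the standard textbook definitions; the function names and the sample points are assumptions made for the example, not part of the question sheet.

```python
import math

def minkowski(x, y, h):
    """Minkowski distance of order h (h >= 1) between equal-length vectors."""
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1.0 / h)

def manhattan(x, y):
    """Manhattan (city block) distance: Minkowski distance with h = 1."""
    return sum(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """Euclidean (straight line) distance: Minkowski distance with h = 2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Hypothetical sample points chosen so the arithmetic is easy to check by hand.
p, q = (1, 2), (4, 6)
print(manhattan(p, q))   # |1-4| + |2-6| = 7
print(euclidean(p, q))   # sqrt(9 + 16) = 5.0
```

Manhattan and Euclidean distance are the special cases h = 1 and h = 2 of the Minkowski distance, which is why one general function covers both.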

Module III: Classification
1. Write the decision tree algorithms: ID3, C4.5 and CART.
2. Explain the attribute selection measures (Information Gain, Gain Ratio, Gini Index).
3. What are overfitting and tree pruning?
4. State Bayes' theorem.
5. Explain the Naïve Bayesian classification algorithm with an example. Why is it called naïve?
6. State the advantages and disadvantages of the Naïve Bayes algorithm.
7. What are the different metrics for evaluating classifier performance (Accuracy, Precision, Recall, F1 score, Specificity, Sensitivity)?
8. What is the class imbalance problem? Explain with an example.
9. What are the different methods for evaluating the accuracy of a classifier (holdout method, random subsampling, cross-validation, bootstrap method)?
10. Explain the ensemble methods for improving the accuracy of a classifier (bagging, boosting and random forest).
11. What are simple linear regression and multiple linear regression? Give examples.
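
For questions 7 and 8 of this module, the metrics can be sketched from the four cells of a binary confusion matrix. This is a minimal sketch under the usual textbook definitions; the function name `classifier_metrics` and the counts used below are hypothetical, chosen only to illustrate how accuracy can look good on imbalanced data.

```python
def classifier_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are the true positive, false positive, false negative
    and true negative counts. Assumes no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical imbalanced data set: 10 positives among 1000 tuples.
metrics = classifier_metrics(tp=5, fp=5, fn=5, tn=985)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 0.99 although only half of the positive tuples are found (recall 0.5), which is the kind of situation the class imbalance question refers to.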

Module IV: Clustering
1. What is clustering? State the applications of clustering algorithms.
2. What are the requirements of any clustering algorithm?
3. What are the different approaches to clustering?
4. Explain the partitioning approach to clustering (k-means and k-medoids methods).
5. Explain the hierarchical approach to clustering (agglomerative and divisive clustering).
6. What are single-linkage and complete-linkage agglomerative clustering?
7. What is a dendrogram?
8. What is DBSCAN? State the two parameters used in the DBSCAN algorithm.

Module V: Frequent Pattern Mining


1. What do you mean by frequent itemset, frequent subsequence and frequent substructure? State one example for each.
2. What is Market Basket Analysis? What are the applications of market basket analysis?
3. Define the terms support, support count, confidence, frequent itemset, closed frequent itemset and maximal frequent itemset with an example.
4. Explain the Apriori algorithm for frequent itemset mining.
5. Explain the join and prune steps of the Apriori algorithm with an example.
6. State the advantages and disadvantages of the Apriori algorithm.
7. What are the different methods to improve the efficiency of the Apriori algorithm?
8. State applications of the Apriori algorithm.
9. Explain the frequent pattern (FP-growth) algorithm. State its advantages over the Apriori algorithm.
10. What are single-dimensional and multidimensional association rules?
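
For question 3 of this module, support count, support and confidence can be sketched on a toy transaction database. This is only a sketch under the standard definitions; the transactions, item names and function names below are invented for illustration.

```python
# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent => consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)

print(support_count({"bread", "milk"}, transactions))  # 2
print(support({"bread", "milk"}, transactions))        # 0.5
print(confidence({"bread"}, {"milk"}, transactions))   # 2 of 3 bread transactions
```

An itemset is frequent when its support meets a chosen minimum support threshold; with a threshold of 0.5, {bread, milk} would be frequent in this toy data.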

Module VI: Business Intelligence


1. What is BI? Give examples. What are its steps?
2. What are the advantages and disadvantages of a BI system?
3. What are the components of BI architecture?
4. What is a decision support system (DSS)? State its types with examples.
5. What are the characteristics of DSS?
6. What are the advantages and disadvantages of DSS?
