# 2170715 DMBI IMP QUESTIONS

Introduction to Data Warehousing

Que:-1 Explain the metadata repository.

Que:-2 Differentiate between OLTP and OLAP systems.

Que:-3 Short Note: Distributive and Holistic measures.

Que:-4 Clearly state the differences between “Data Warehouses” and “Operational Database Systems”.

Que:-5 Explain the ‘Star’ and ‘Snowflake’ schemas of a data warehouse.

Que:-6 With the help of a neat diagram explain the 3-tier architecture of a data warehouse.

Que:-7 What is a cuboid? Explain the various OLAP operations on a data cube with an example.

Que:-8 Explain the Star, Snowflake, and Fact Constellation schemas for a multidimensional database.

Introduction to Data Mining

Que:-1 Explain the following terms:

1. Concept Hierarchy.

2. Histogram.

Que:-2 Explain Mean, Median, Mode, Variance & Standard Deviation in brief.

Que:-3 Explain different types of data on which mining can be performed.

Que:-4 What is data mining? Explain data mining as one step of the knowledge discovery process.

Que:-5 Explain methods for data normalization.

Que:-6 List and describe major issues in data mining.

Data Preprocessing and Data Mining Primitives

Que:-1 What is noise? Describe the possible reasons for noisy data. Explain the different techniques to remove the noise from data.

Que:-2 List and describe the methods for handling the missing values in data cleaning.

Que:-3 Explain data transformation in data mining.

Que:-4 Explain the pre-processing required to handle missing data and noisy data during the process of data mining.

Concept Description and Association Rule Mining

Que:-1 Write and explain the Apriori algorithm for discovering frequent itemsets for mining Boolean association rules.

Que:-2 What is Market Basket Analysis? Explain Association Rules with Confidence & Support.

Que:-3 Explain the steps of the “Apriori Algorithm” for mining frequent itemsets.

Que:-4 Describe the list of techniques for improving the efficiency of Apriori-based mining.

Que:-5 Write an algorithm for finding frequent itemsets using candidate generation.

Classification and Clustering

Que:-1 Short note: Information gain, Gain ratio, Gini index.

Que:-2 Write the typical requirements of clustering in data mining.

Que:-3 Explain k-means and k-medoids algorithms of clustering.

Que:-4 Explain rule-based classification and case-based reasoning in detail.

Que:-5 What is “Information Gain”? Explain the steps required to generate a Decision Tree from a training data set.

Que:-6 Explain, with an example, how continuous numerical data values can be discretized.

Que:-7 Write the steps of the k-means clustering algorithm. Also state its limitations.

Que:-8 Explain how the accuracy of a classifier can be measured.

Que:-9 Write a short note on hierarchical clustering.

Que:-10 Explain “Linear Regression” using a suitable example.

Que:-12 Explain how the topology of a neural network is designed.

Que:-13 Discuss applications of “Fuzzy Logic”.

Que:-14 Explain the Classification by Decision Tree Induction Algorithm.

Que:-15 Explain linear and non-linear regression methods of prediction.

Que:-16 Explain the k-means and k-medoids algorithms of clustering.

Que:-17 Explain Rule-based Classification in brief.

Advanced Topics of Data Mining and its Applications

Que:-1 Explain the different types of Web mining with examples.

Que:-2 Explain how a search engine automatically identifies authoritative web pages on a user’s search topic.

Que:-3 Explain the information retrieval methods used in text mining.

Que:-4 Explain the methodologies for stream data processing and stream data systems.

Que:-5 What are the challenges for effective resource and knowledge discovery in mining the World Wide Web?