2170715 DMBI IMP QUESTIONS
1. Introduction to Data Warehousing
Que:-1 Explain the metadata repository.
Que:-2 Differentiate between OLTP and OLAP systems.
Que:-3 Short Note: Distributive and Holistic measures.
Que:-4 Clearly state the differences between “Data Warehouses” and “Operational Database Systems”.
Que:-5 Explain the ‘Star’ and ‘Snowflake’ schemas of data warehouse.
Que:-6 With the help of a neat diagram explain the 3-tier architecture of a data warehouse.
Que:-7 What is a cuboid? Explain the various OLAP operations on a data cube with examples.
Que:-8 Explain Star, Snowflake, Fact Constellation Schema for Multidimensional Database.
2. Introduction to Data Mining
Que:-1 Explain the following terms:
1. Concept Hierarchy.
2. Histogram.
Que:-2 Explain Mean, Median, Mode, Variance & Standard Deviation in brief.
Que:-3 Explain different types of data on which mining can be performed.
Que:-4 What is data mining? Explain data mining as one step of the knowledge discovery process.
Que:-5 Explain methods for data normalization.
Que:-6 List and describe major issues in data mining.
3. Data Preprocessing and Data Mining Primitives
Que:-1 What is noise? Describe the possible reasons for noisy data. Explain the different techniques to remove the noise from data.
Que:-2 List and describe the methods for handling the missing values in data cleaning.
Que:-3 Explain data transformation in data mining.
Que:-4 Explain the pre-processing required to handle missing data and noisy data during the process of data mining.
4. Concept Description and Association Rule Mining
Que:-1 Write and explain the Apriori algorithm for discovering frequent itemsets for mining Boolean association rules.
Que:-2 What is Market Basket Analysis? Explain Association Rules with Confidence & Support.
Que:-3 Explain the steps of the “Apriori Algorithm” for mining frequent itemsets.
Que:-4 List and describe the techniques for improving the efficiency of Apriori-based mining.
Que:-5 Write an algorithm for finding frequent itemsets using candidate generation.
5. Classification and Clustering
Que:-1 Short note: Information gain, Gain ratio, Gini index.
Que:-2 Write the typical requirements of clustering in data mining.
Que:-3 Explain k-means and k-medoids algorithms of clustering.
Que:-4 Explain rule-based classification and case-based reasoning in detail.
Que:-5 What is “Information Gain”? Explain the steps required to generate a Decision Tree from a training data set.
Que:-6 Explain, with an example, how continuous numerical data values can be discretized.
Que:-7 Write the steps of the k-means clustering algorithm. Also state its limitations.
Que:-8 Explain how the accuracy of a classifier can be measured.
Que:-9 Write a short note on hierarchical clustering.
Que:-10 Explain “Linear Regression” using a suitable example.
Que:-11 Explain how the accuracy of a classifier can be measured.
Que:-12 Explain how the topology of a neural network is designed.
Que:-13 Discuss applications of “Fuzzy Logic”.
Que:-14 Explain the algorithm for classification by decision tree induction.
Que:-15 Explain linear and non-linear regression methods of prediction.
Que:-16 Explain the k-means and k-medoids algorithms of clustering.
Que:-17 Explain Rule-based Classification in brief.
6. Advanced Topics of Data Mining and its Applications
Que:-1 Explain the different types of Web mining with examples.
Que:-2 Explain how a search engine automatically identifies authoritative web pages on a user’s search topic.
Que:-3 Explain the information retrieval methods used in text mining.
Que:-4 Explain the methodologies for stream data processing and stream data systems.
Que:-5 What are the challenges for effective resource and knowledge discovery in mining the World Wide Web?