IT DMBI

 

1

What is Normalization? Explain various methods for data normalization.

2

What is Data Integration?

3

Why do we pre-process the data?

4

What are the steps involved in data pre-processing?

5

Discuss the issues to be considered during data integration.

6

Describe the different methods for data cleaning.

7

Explain how to handle noisy data.

8

Explain data smoothing and the binning methods used for it.

9

What is meant by dimensionality reduction? Discuss any 2 methods.

10

What do you mean by numerosity reduction? Explain various methods to achieve it.

11

Briefly explain methods of concept hierarchy generation for categorical data.

12

Explain sampling methods for data reduction.

13

What is Binning? List and explain binning strategies.

14

What is Data transformation? Briefly explain various steps of data transformation.

15

Explain Data cleaning as a two-step process.

16

Explain the purpose of correlation analysis. Explain how to find the correlation between two numeric attributes and between two categorical attributes.

17

In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem.

18

List out strategies for data reduction.

19

Explain data cube aggregation of data reduction.

20

Explain Attribute subset selection technique of data reduction.

21

What is Discretization? Why is it used? Explain types of discretization.

22

What is concept Hierarchy?

23

Explain methods for constructing a concept hierarchy for numeric attributes based on data discretization.

24

Explain methods for constructing a concept hierarchy for categorical attributes based on data discretization.

25

Define the following terms:

Mean, Median, Mode, Range, Five Number Summary, Inter Quartile Range, Variance, Standard Deviation, Outlier, kth Percentile

26

Explain different graphical display methods of basic descriptive data summaries: Box Plot, Histogram, Scatter Plot, Quantile plot, Quantile-quantile plot, Loess Curve

Unit 1

27

Differentiate between data, information & knowledge.

28

Define Business Intelligence.

29

Define Data Warehouse.

30

Explain common functions of Business Intelligence technologies.

31

What is the relation between Data warehouse and BI?

32

Explain components and elements of data warehouse.

33

Explain components and elements of business intelligence.

34

Explain life cycle of data.

35

Explain Data warehouse metadata.

36

Explain various trends in data warehousing.

 

Unit 2

37

Why have a separate data warehouse from operational databases?

38

Explain data mart.

39

Differentiate data warehouse and data mart.

40

Differentiate between Operational Systems and Decision Support Systems (Informational Systems).

41

What is Virtual Warehouse?

42

List the types of OLAP server.

43

Which one is faster, Multidimensional OLAP or Relational OLAP?

44

How many dimensions are selected in Slice operation?

45

How many dimensions are selected in dice operation?

46

How many fact tables are there in a star schema?

47

Explain types of data warehouse. (information processing, analytical processing, data mining)

48

Explain two approaches for integrating heterogeneous databases. (query-driven, update-driven)

49

Explain three-tier data warehouse architecture.

50

Explain Data Warehouse Models. (Virtual data warehouse, data mart, enterprise warehouse)

51

What is the difference between dependent data warehouse and independent data warehouse?

52

Briefly state the difference between a data warehouse and a data mart.

53

What is the benefit of data warehouse?

54

Explain the storage models of OLAP.

55

Differentiate between Data Mining and Data warehousing.

56

Differentiate between Data warehousing and Business Intelligence.

57

What is Data purging?

58

What is Data scrubbing?

59

What are CUBES?

60

Differentiate between OLTP and OLAP.

61

What is the very basic difference between data warehouse and operational databases?

62

How does a Data Cube help?

63

Define dimension.

64

What does the Metadata Repository contain?

65

Define metadata.

66

What do you mean by Data Extraction?

67

List the schemas that a data warehouse system can implement.

68

List the functions of data warehouse tools and utilities.

69

List the processes that are involved in Data Warehousing.

70

What is Data Warehousing?

71

What are different types of cuboids?

72

What are the forms of multidimensional model?

73

If there are n dimensions, how many cuboids are there?

74

List the typical OLAP operations.

75

Differentiate between star schema and snowflake schema.

76

What is a fact table?

77

What is a dimension table?

78

What is an ETL process?

79

What is aggregation?

80

Explain methods for indexing OLAP data.

81

Define Apex cuboid, Base cuboid.

82

Explain starnet query model.

83

Explain pros and cons of top-down and bottom-up approaches for data warehouse development.

84

How many cuboids will there be in an n-dimensional cube?

85

Explain data cube materialization.

86

Explain Online analytical mining.

 

Unit 3

87

What are issues in data mining?

88

What are the different problems that “Data mining” can solve?

89

What is Discrete and Continuous data in Data mining world?

90

How do data mining and data warehousing work together?

91

What is data characterization?

92

What is data discrimination?

93

What are two types of data mining tasks? (Descriptive task, Predictive task)

94

What are outliers?

95

What do you mean by evolution analysis?

96

What do you mean by Time Series analysis?

97

What is Association Mining?

98

What are the components of data mining?

99

What are data mining techniques/functionalities?

100

Define KDD.

101

What is the use of Knowledge Base?

102

Give the architecture of data mining system.

103

Discuss the issues in data mining in detail.

104

Describe the steps involved in KDD process.

105

Discuss data mining task primitives.

106

Explain various data repositories on which data mining techniques are applied.

107

Explain architecture of data mining systems along with components in architecture of data mining system.

108

Describe multi-dimensional view of data mining classification.

109

Explain types of integration of data mining system with DBMS or data warehouse system.

 

Concept Description and Association Rule Mining

110

What are frequent patterns?

111

What is concept Hierarchy?

112

Explain the Apriori algorithm. Also explain how the association rules are generated from frequent item sets.

113

What do you mean by closed frequent item set? What is its application? Which are various searching methods for it?

114

Discuss why analytical data characterization is needed and how it can be performed. Compare the results of two induction methods:

1) With relevance Analysis

2) Without relevance Analysis

115

Explain different approaches to mining multilevel association rules.

116

Explain Market Basket Analysis.

117

Explain measures for finding rule interestingness. (support, confidence)

118

Explain various ways of classifying frequent pattern mining.

119

Explain methods for improving the efficiency of Apriori algorithm.

 

Classification and Prediction

120

What is regression?

121

Define classification.

122

How do you choose best split while constructing a decision tree?

123

Explain the algorithm for constructing a decision tree from training samples.

124

Write Bayes theorem.

125

Compare clustering and classification.

126

Differentiate supervised and unsupervised learning.

127

Explain machine learning.

128

What is prediction? Discuss the use of regression techniques for prediction.

129

Compare association and classification. Briefly explain associative classification with a suitable example.

130

Compare various attribute selection measures for decision trees with a suitable example.

131

Define: supervised learning, training set, testing set, accuracy of classifier, sensitivity, specificity, regression.

132

Explain various methods of evaluating accuracy of classifier.

133

Why is naïve Bayesian classification called “naïve”? Briefly outline the major idea of naïve Bayesian classification.

134

Write a short note on Backpropagation.

135

Explain issues regarding classification and prediction. (Preparing data for classification & prediction, Comparing classification and prediction methods)

136

Explain the criteria according to which classification and prediction methods can be compared.

137

Why are decision tree classifiers so popular?

 

Data Mining for Business Intelligence Applications

138

Explain data mining application for balanced scorecard.

139

Explain data mining application for fraud detection.

140

Explain data mining application for Click stream mining.

141

Explain data mining application for Market Segmentation.

142

Explain data mining application for retail industry.

143

Explain data mining application for telecommunication industry.

144

Explain data mining application for banking and finance.

145

Explain data mining application for CRM.

146

Explain data analytics life cycle.

147

Explain the state of the practice in analytics and the role of data scientists.

 

148

What is spatial data mining?

149

What is multimedia data mining?

150

What are different types of multimedia data?

151

What is text mining?

152

What do you mean by web content mining?

153

Define web structure mining and web usage mining.

154

Explain clustering. Explain various methods for clustering.

155

Define big data.

156

Explain distributed file system.

157

Explain big data applications.

158

Explain Hadoop Architecture.

159

Explain algorithm for map reduce. Solve Matrix-Vector Multiplication by Map Reduce.

160

Explain Hadoop storage – HDFS.

 

 
