CS614 Current FinalTerm Paper 25 August 2016

Q. What are the possible implications if the developing organization never freezes the requirements throughout DWH development, i.e. it always behaves like an accommodating person? 5
Write down any two drawbacks of storing “Date” in text format rather than in a proper date format such as “dd-MMM-yy”. 5
There are different data mining techniques, e.g. “clustering”, “description” etc. Each of the following statements corresponds to a data mining technique. For each statement, name the technique it corresponds to. 5
a) Assigning customers to predefined customer segments (e.g. good vs. bad)
b) Assigning credit applicants to predefined classes (e.g. low, medium, or high risk)
c) Guessing how much customers will spend during the next 6 months
d) Building a model and assigning a value from 0 to 1 to each member of the set. Then classifying the members into categories based on a threshold value.
e) Guessing how many students will score more than 65% in the midterm.
There are two justifications for performing a task in parallel: either it manipulates a significant amount of data (i.e. size), or it can be solved by a divide and conquer (D&C) strategy. For each task in the given list, provide the justification for performing it in parallel. 5
a) Large table scans and joins
b) Creation of large indexes
c) Partitioned index scans
d) Bulk inserts, updates, and deletes
e) Aggregations and copying

What tasks are performed through the import/export data wizard to load data? Write any three. 3
In the context of the Four Cell Quadrant Technique, which business process (from the diagram below) will have the highest priority? Justify with a reason. 3

Consider the following two statements. Specify which activity of a data quality analysis project each statement corresponds to. 3

a) Identify functional user data quality requirements and establish data quality metrics.
b) Measure conformance to current business rules and develop exception reports.
Identify the given statement as correct or incorrect and justify your answer in either case.
“Bayesian Modeling is an example of Unsupervised Learning”. 3

Problems with extracted data can relate to non-primary keys. List any four problems associated with non-primary keys. 5
What is a reverse proxy? 2

Why is the analytics track called the “fun part” of designing a data warehouse? 2

List two main types of unsupervised learning. 2

