
Question


Question 1: Which of the following are issues in data integration? (Which would actually cause conflicts) (Choose all that apply):
A. Two different databases may have different column names for the same actual information
B. Two databases on related subjects that you want to integrate may have a different number of columns or rows
C. An attribute named "weight" may be in different databases
D. There may be discrepancies between entries in two different databases for the same actual real-life entity

Question 2: Match the type of normalization to its property:
Types: Decimal Scaling, Min-Max Normalization, Z-score Normalization
1. The new values tell how many standard deviations the sample is from the mean of the original data
2. The result is guaranteed to be between -1 and 1, but original zeros stay zero
3. The values are linearly scaled from one interval into another, the middle value

Question 3: Which of the following are true about Forward Selection? (Select all that apply):
A. Forward Selection is a feature selection method, keeping a subset of the original variables to make a reduced-complexity model
The best results from forward selection will be the same as for PCA because it chooses the set of variables to keep based on variance

Question 4: Which of the following are ways to deal with missing data values? (Choose all that apply):
A. Use a special value like "unknown" to capture that there is meaning to the fact that the value is missing
B. All you can do is use only the data mining algorithms that can handle data with missing values
C. Replace with the average value of the attribute among data points with the same class
D. Predict missing values with a model based on the data that you have (Ex: Classification or regression)

Question 5: Text data can be stored in a matrix with the "bag-of-words" model. This means:
A. Each document is assigned a column to keep track of when it is needed
B. Each row represents a unit of text (Ex: Document) and each column represents a word
C. The words are all put in one set and the set information is held per unit of text (Ex: Document)
D. A graph is constructed to represent how one unit of text (Ex: Document) contains words

Question 6: Which of the following are true about Forward Selection? (Select all that apply):
A. Forward Selection is a feature selection method, keeping a subset of the original variables to make a reduced-complexity model
B. Forward Selection is a greedy algorithm that runs a classification algorithm over and over as part of evaluating subsets of features
C. The best results from forward selection will be the same as for PCA because it chooses the set of variables to keep based on variance
D. Using forward selection can result in a model that generalizes better (Ex: Is less subject to overfitting)

Question 7: Which of these are true of using clustering for smoothing?
A. Clustering is used for replacing missing values, not smoothing
B. We replace data points by an average or representatives of points in their cluster
C. Each cluster must have the same number of data points
D. The best smoothing for a point uses centers of other clusters
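
For Question 2, it can help to see the three normalizations side by side. The following is a minimal Python sketch using the usual textbook definitions; the function names and the sample `data` list are illustrative assumptions and do not come from the question itself. It assumes the input values are not all equal, since otherwise the range and the standard deviation would be zero.

```python
import math

def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from [min, max] to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    span = hi - lo  # assumed non-zero (values are not all equal)
    return [new_min + (v - lo) / span * (new_max - new_min) for v in values]

def z_score(values):
    """Express each value as a number of standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation; a sample version would divide by n - 1.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by the smallest power of 10 that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

# Hypothetical sample data, chosen only for illustration.
data = [-500.0, 0.0, 250.0, 1000.0]

print("min-max:        ", min_max(data))
print("z-score:        ", z_score(data))
print("decimal scaling:", decimal_scaling(data))
```

Running it on a small list that contains a zero and a negative value makes the properties in the question easy to check by eye: which method keeps zeros at zero, which maps onto a chosen interval, and which reports distance from the mean.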

