Question
QUESTION 5
Which of the following statements about variable selection and clustering is true?
A marketer should always use all of the available numeric variables in a dataset in order to build a clustering model -- but the categorical variables should not be included.
A marketer should build a clustering model using only one variable at a time. The marketer can then come back and build additional models with other variables (but still maintaining a policy of just using one variable per model).
The choice of which variables to use in a clustering analysis is up to the marketer. The decisions that a marketer makes about which variables to include will impact the resulting model.
A marketer should always use all of the available variables in a dataset in order to build a clustering model. A clustering model built with an incomplete subset of variables is often considered unreliable.
10 points
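To make the variable-selection idea concrete, here is a minimal sketch in plain Python. The four customer records, the column meanings (age, income, spend, visits), the helper name `kmeans_labels`, and the farthest-first seeding are all assumptions made for illustration, and the variables are left unscaled to keep the example short. It clusters the same records twice, each time with a different subset of variables.

```python
def kmeans_labels(rows, k, iters=20):
    """Tiny k-means (Lloyd's algorithm) with farthest-first seeding.

    Hypothetical helper for illustration; returns one cluster label per row.
    """
    def d2(p, q):
        # Squared Euclidean distance between two records.
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Seeding: start from the first row, then repeatedly add the row
    # that is farthest from every centroid chosen so far.
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(max(rows, key=lambda r: min(d2(r, c) for c in centroids)))

    labels = []
    for _ in range(iters):
        # Assignment step: each row joins its nearest centroid.
        labels = [min(range(k), key=lambda c: d2(r, centroids[c])) for r in rows]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Made-up records: (age, income in thousands, monthly spend, monthly visits).
customers = [
    (22, 30, 15, 1),
    (25, 32, 90, 9),
    (61, 80, 12, 2),
    (64, 85, 95, 10),
]

def select(rows, columns):
    # Keep only the chosen variables from each record.
    return [tuple(row[i] for i in columns) for row in rows]

by_demographics = kmeans_labels(select(customers, (0, 1)), k=2)  # age, income
by_behavior = kmeans_labels(select(customers, (2, 3)), k=2)      # spend, visits

print(by_demographics)  # [0, 0, 1, 1]
print(by_behavior)      # [0, 1, 0, 1]
```

In this sketch the same four customers are grouped by age and income in one model and by spending behavior in the other, so the two models disagree about who belongs together.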
QUESTION 6
Which of the following statements about k-means clustering and cluster sizes is true?
k-means clustering will always evenly place records into clusters (for instance, if someone is building a clustering model with 200 records and 4 clusters, each cluster will have 50 records in it).
The k-means process is not designed to build clusters of even or equal sizes. The k-means algorithm sometimes creates clusters that contain completely different numbers of records.
A modeler who wishes to create a k-means clustering analysis with an even number of records can ensure this by specifying a k-value that is exactly 10 percent of the dataset (so with a dataset of 400 records, the modeler would have to specify that there should be exactly 40 clusters).
k-means clustering is designed to create clusters of equal sizes, but it sometimes varies by a small margin of error (so a person could build a clustering model with 200 records and could end up with clusters of the following sizes: 55, 45, 40, 60).
10 points
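The relationship between k-means and cluster sizes can be checked with a small experiment. The sketch below is plain Python under stated assumptions: the 200 synthetic records, the four group centers, the helper names `kmeans_labels` and `blob`, and the farthest-first seeding are all invented for illustration. It builds four well-separated groups of deliberately unequal size and counts how many records land in each cluster.

```python
import random
from collections import Counter


def kmeans_labels(rows, k, iters=20):
    """Tiny k-means (Lloyd's algorithm) with farthest-first seeding.

    Hypothetical helper for illustration; returns one cluster label per row.
    """
    def d2(p, q):
        # Squared Euclidean distance between two records.
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Seeding: start from the first row, then repeatedly add the row
    # that is farthest from every centroid chosen so far.
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(max(rows, key=lambda r: min(d2(r, c) for c in centroids)))

    labels = []
    for _ in range(iters):
        # Assignment step: each row joins its nearest centroid.
        labels = [min(range(k), key=lambda c: d2(r, centroids[c])) for r in rows]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

def blob(cx, cy, n):
    # n synthetic records scattered within +/-1 of the center (cx, cy).
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)) for _ in range(n)]

# 200 records in four groups of unequal size: 120, 50, 20, and 10.
points = blob(0, 0, 120) + blob(10, 10, 50) + blob(20, 0, 20) + blob(0, 20, 10)

labels = kmeans_labels(points, k=4)
sizes = sorted(Counter(labels).values())
print(sizes)  # [10, 20, 50, 120]
```

With 200 records and 4 clusters, an even split would put 50 records in each cluster. Here the cluster sizes follow the structure of the data instead, because the assignment step only looks at distance to the nearest centroid and has no notion of a target cluster size.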
QUESTION 7
When preparing a data visualization, should an analyst stop to consider the likely audience for the visualization?
Yes. It is important to consider the audience. Some people may prefer to see simple graphs that are easy to read and interpret, whereas others might prefer more complex graphs that depict more variable relationships at one time.
If the graph is being used for exploratory purposes, then yes, the audience should be considered. Whenever the graph is being produced for another person to read/review, then it should be built without any consideration for who will see it or use it.
No, the audience should not be considered. One of the most important characteristics of any data visualization is that it should be able to stand on its own, without requiring additional explanation or context. It should be built for any audience -- not a specific one.
Regardless of the audience, a graph should always be built in the simplest, least-complex format possible. That will enable the easiest, most widespread possible interpretation -- and this is the purpose of data visualization.