Question
1. In addition to being quantitative and technical, what are key characteristics of Data Scientists as described in this module? (Select all that apply)
a. Curious
b. Skeptical
c. Independent
d. Introverted
e. None of the above

2. In which phase of the analytic lifecycle would you expect to spend most of the project time?
a. Discovery
b. Data preparation
c. Communicate Results
d. Operationalize
e. None of the above

3. What is the benefit of running a pilot project during the final phase (Operationalize) of an analytics project?
a. Limit risk
b. Learn about performance constraints
c. Learn what is needed to retrain the model over time
d. All of the above
e. None of the above

4. What are the outputs generated by a k-Means clustering analysis?
a. The centroids of the discovered clusters and the assignment of each input datum to a cluster
b. The rules that associate each input datum to a class and the diameter of the discovered clusters
c. Within Sum of Squares for each discovered cluster and the overall cluster dispersion
d. Class association for each datum and class probabilities
e. None of the above

5. Which one of the following statements is false about Pig?
a. Supports random reads and queries
b. Supports complex nested data structures
c. Schema is optional, can be specified at run time
d. Only supports sequential reads and queries
e. None of the above

6. After which phase of the analytic lifecycle should you be able to share a draft of an analytic plan?
a. Discovery
b. Data preparation
c. Communicate Results
d. Operationalize
e. None of the above

7. In Phase 2 (Data Preparation), data visualization is commonly used for what?
a. Data exploration
b. Data modeling
c. Variable selection
d. Data conditioning
e. None of the above

8. How is the Apriori property defined?
a. Any subset of a frequent itemset is also frequent
b. Any subset of a frequent itemset will not be frequent
c. Frequent itemsets will not have the support of its superset
d. The difference in the probability of X and Y appearing together in the data set is equal to what would be expected if X and Y were statistically independent
e. None of the above

9. What is the utility of TF-IDF?
a. It provides a measure that will weight the presence of unusual terms in the query as higher indications of document relevance than the presence of more common terms
b. It provides a measure that will weight the presence of unusual terms in the query as lower indications of document relevance than the presence of more common terms
c. It provides a measure that will weight the presence of common terms in the query as higher indications of document relevance than the presence of more unusual terms
d. It provides a measure of the number of times a term occurs in the corpus
e. None of the above

10. Which daemons are required to run the Hadoop Cluster? (Select all that apply)
a. NameNode
b. DataNode
c. JobTracker
d. TaskTracker
e. None of the above
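For question 9, it may help to see how TF-IDF weights behave on a toy corpus. The following is only a minimal sketch, assuming one common variant of the formula (term frequency times log(N / document frequency)) and plain whitespace tokenization. The function name `tf_idf` and the three sample documents are hypothetical and do not come from the course material.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df).

    Assumption: this is one common TF-IDF variant; libraries often add
    smoothing or normalization that is not shown here.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus: "the" occurs in every document, "centroid" in one.
docs = [
    "the cluster stores the data",
    "the model scores the data",
    "the centroid summarizes the cluster",
]
weights = tf_idf(docs)
```

In this sketch a term that appears in every document (here "the") receives a weight of zero, while a term that appears in only one document (here "centroid") receives the largest weight in that document.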