Question
Study Guide: Midterm Exam
DATA 3300

The following is a list of topics that will be covered on the midterm exam. Please see the Canvas page Module : Midterm Exam for details about the exam and study materials.

Data Analytics & Data Science

What data science is and how it's used
Common sources of data:
  a. Social networks
  b. Traditional business systems
  c. Internet of Things
Different types of data analytics and their applications:
  a. Diagnostic
  b. Descriptive
  c. Predictive
  d. Prescriptive
The sequence of steps in the CRISP-DM process and the importance of each:
  a. Business Understanding
  b. Data Understanding
  c. Data Preparation
  d. Modeling
  e. Evaluation
  f. Deployment

Data Quality & Preparation

What ETL is and why it is important in data analytics
Five data quality characteristics:
  a. Accuracy
  b. Uniqueness
  c. Completeness
  d. Consistency
  e. Time-Appropriateness
Common forms of dirty data and the threats they pose to data analysis:
  a. Errors (typos, misspellings)
  b. Inconsistent data
  c. Absence of data
  d. Contradicting data
  e. Reused primary keys
Common steps in data cleansing and how long data cleansing takes as part of the overall data mining process:
  a. Parsing
  b. Correcting
  c. Standardizing
  d. Matching
  e. Consolidating

Data Understanding

Familiarize yourself with the differences between qualitative and quantitative variable types:
  a. Qualitative:
     i. Nominal
     ii. Ordinal
  b. Quantitative:
     i. Ratio
     ii. Interval
Understand how each of the metrics below relates to data exploration (data distribution, central tendency, and data dispersion):
  a. Understand each of the following descriptive statistics and their purpose:
     i. Mean
     ii. Median
     iii. Mode
     iv. Variance
     v. Standard deviation
     vi. Interquartile range
     vii. Outliers
  b. Understand and identify:
     i. Skewness
     ii. Kurtosis
Basic principles of visualization and how to interpret visualizations, including the most common chart types and their purpose, and how to create visualizations that clearly communicate the data, not just reinforce prior beliefs:
  a. Histograms
  b. Line charts
  c. Box plots
  d. Pie charts
  e. Stacked column chart

Modeling Foundations

What data mining is, its appropriate applications, and common data mining tasks
What is meant by the terms:
  a. Data instance/Record/Case/Observation
  b. Attributes/Variables
     i. Target attribute/Dependent variable
The difference between supervised and unsupervised data mining
The difference between classification and regression types of supervised data mining:
  a. Classification: when the DV is categorical
  b. Regression: when the DV is numerical

Association Rules Analysis

What association analysis is, the type of data it requires, and the types of business questions it can answer
What an association rule looks like:
  a. Itemsets and their role in association rules
  b. Antecedents and consequents
Understand how to calculate and interpret:
  a. Support
  b. Confidence
  c. Lift
Know the tradeoffs between adjusting the minimum support and confidence thresholds and the resulting association rules generated

Clustering Analysis

What clustering analysis is, the type of data it requires, and the types of business questions it can answer
What k-means clustering is:
  a. Understand the steps involved
     i. Why we sometimes normalize data in cluster analysis (z-score normalization)
     ii. Impact of potentially overweighting variables which are measuring the same thing in different ways
  b. Know what k stands for and how it is determined
     i. Post-hoc evaluation
     ii. Elbow rule
     iii. Tractability
  c. Understand the basics of how the algorithm works
     i. How are clusters determined?
     ii. How is a centroid value determined?
     iii. What is the relationship between a cluster and a centroid? How do you interpret results with centroid values?
Ways to calculate similarity/dissimilarity between cases and how to interpret the distance:
  a. Euclidean distance (know and interpret the formula)
  b. Intraclass similarity
  c. Interclass similarity
Understand how to interpret a cluster analysis through a centroid table and plot.

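For the clustering topics above (z-score normalization, Euclidean distance, and how a centroid relates to its cluster), a small worked sketch may help. It is not taken from the course materials: the customer data, the function names, and the choice of the population standard deviation for scaling are all assumptions made for illustration, and it shows only the building blocks of k-means, not a full implementation.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Rescale each column to mean 0 and standard deviation 1.

    Assumes no column is constant (a zero standard deviation would
    cause a division by zero). Uses the population standard deviation,
    which is an illustrative choice.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]


def euclidean(p, q):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point (the assignment step)."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


def centroid(points):
    """A centroid is the per-attribute mean of the cases in a cluster."""
    return tuple(mean(col) for col in zip(*points))


# Hypothetical cases: (age in years, annual income in dollars).
customers = [(25, 30000), (30, 32000), (45, 90000), (50, 95000)]

# On raw values, income dominates the distance because of its larger scale.
raw_distance = euclidean(customers[0], customers[1])

# After z-score normalization, both attributes contribute on the same scale.
scaled = zscore_columns(customers)
scaled_distance = euclidean(scaled[0], scaled[1])

# One assignment step, using two of the scaled cases as starting centroids.
start = [scaled[0], scaled[3]]
labels = [nearest_centroid(p, start) for p in scaled]
```

Reading a centroid table works the same way as `centroid` here: each value is the average of one attribute across the cases assigned to that cluster, so a centroid describes the "typical" member of its cluster.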
Statistical Correlation

What correlation analysis is, the type of data it requires, and the types of business questions it can answer
Know how to identify and interpret:
  a. Correlation coefficient
  b. Correlation analysis results
  c. Convergent validity and when to use it
  d. Coefficient of determination (know how to calculate it)
What scatter plots look like for strong versus no relationship between two variables
Assumptions and limitations of correlation analysis:
  a. Homoscedasticity
  b. Normal distribution of data
  c. Impact of outliers

Good Luck!
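As a study aid for the correlation items above, here is a minimal sketch of how a Pearson correlation coefficient and the coefficient of determination can be computed by hand. It assumes the guide means the Pearson coefficient and that, for two variables, the coefficient of determination is the square of that coefficient; the function names and the study-hours data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither variable is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


def coefficient_of_determination(xs, ys):
    """Squared Pearson r: the share of variance in one variable
    that is linearly associated with the other."""
    return pearson_r(xs, ys) ** 2


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 79]

r = pearson_r(hours, scores)
r_squared = coefficient_of_determination(hours, scores)
```

A value of `r` near +1 or -1 corresponds to a scatter plot whose points fall close to a straight line, while a value near 0 corresponds to a cloud with no linear pattern. Because every point enters the sums of squared deviations, a single outlier can shift `r` noticeably.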