Question
Comprehensive Foundations of Data Science Project Assignment
Objective: Utilize a selected dataset to conduct a thorough Foundations of Data Science project, covering data exploration, cleaning, visualization, basic inferential statistics, and predictive analytics concepts.
Total Marks:
Project Tasks:
Dataset Selection and Business Problem Definition Mark
Select a dataset and define a clear business problem or question that your analysis
aims to address.
Data Overview Mark
Conduct a preliminary examination of the dataset. Describe its size, the nature of its variables (quantitative vs. qualitative), and the potential value it brings to solving the business problem.
Data Types Identification Mark
Classify the dataset's variables into their respective data types (numerical, categorical, ordinal, etc.).
Missing Values Assessment Mark
Identify any missing values in the dataset. Provide a summary of the missingness pattern observed.
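As an illustration of this task, here is a minimal sketch that counts missing values per column. It uses only the Python standard library; the records, column names, and the `missing_summary` helper are hypothetical, and on a real dataset a library such as pandas would typically do the same job.

```python
# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "segment": "A"},
    {"age": None, "income": 61000, "segment": "B"},
    {"age": 45, "income": None, "segment": None},
    {"age": 29, "income": None, "segment": "A"},
]

def missing_summary(records):
    """Return {column: (missing_count, missing_fraction)} for a list of dicts."""
    n = len(records)
    summary = {}
    for col in records[0].keys():
        missing = sum(1 for r in records if r.get(col) is None)
        summary[col] = (missing, missing / n)
    return summary

summary = missing_summary(rows)
for col, (count, frac) in summary.items():
    print(f"{col}: {count} missing ({frac:.0%})")
```

A per-column table like this is a starting point for describing the missingness pattern, for example whether gaps concentrate in one variable or co-occur in the same rows.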
Data Cleaning Strategy Mark
Propose a strategy for handling missing data (deletion, imputation, etc.) and justify your approach.
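One possible imputation strategy, sketched under assumptions: replace missing numeric values with the median of the observed values, which is less sensitive to extreme values than the mean. The `incomes` data and the `impute_median` helper are hypothetical; this is one option among several, not a recommended default.

```python
from statistics import median

# Hypothetical income column; None marks a missing value.
incomes = [52000, 61000, None, 48000, None, 250000]

def impute_median(values):
    """Fill None entries with the median of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

imputed = impute_median(incomes)
print(imputed)
```

Deletion (dropping rows with gaps) is simpler but shrinks the sample, so the justification usually depends on how much data is missing and whether the gaps look random.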
Outliers Detection Mark
Employ a method to detect outliers in the dataset. Briefly describe how you identified them.
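One common detection method is the interquartile range (IQR) rule, which flags values more than 1.5 × IQR below the first quartile or above the third. The sketch below assumes that rule; the `values` data and the `iqr_fences` helper are hypothetical, and other methods (z-scores, visual inspection of box plots) are equally valid choices.

```python
from statistics import quantiles

# Hypothetical measurements with one suspiciously large value.
values = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

low, high = iqr_fences(values)
outliers = [v for v in values if v < low or v > high]
print(f"fences: ({low}, {high}); outliers: {outliers}")
```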
Handling Outliers Mark
Decide on a strategy for dealing with the detected outliers. Explain your choice and its implications for the analysis.
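As one example of a handling strategy, outliers can be capped at the IQR fences (a form of winsorizing) instead of being deleted, which keeps the sample size intact. The data and helper names below are hypothetical, and capping is only one option; removal or leaving values untouched may suit other analyses better.

```python
from statistics import quantiles

# Hypothetical measurements with one extreme value.
values = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

low, high = iqr_fences(values)
# Clamp each value into [low, high]; in-range values are unchanged.
capped = [min(max(v, low), high) for v in values]
print(capped)
```

The implication to discuss is that capping changes the tails of the distribution, so means and standard deviations computed afterwards describe the adjusted data, not the raw data.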
Data Transformation Mark
Perform necessary data transformations (normalization, scaling, etc.) to prepare the data for analysis. Justify why these transformations are required.
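Two widely used transformations are min-max normalization (rescaling to the range 0 to 1) and z-score standardization (mean 0, standard deviation 1). The following sketch shows both; the `ages` data and function names are hypothetical, and both functions assume the column is not constant, since a constant column would cause a division by zero.

```python
from statistics import mean, pstdev

# Hypothetical numeric column.
ages = [20, 30, 40, 50, 60]

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

normalized = min_max_scale(ages)
standardized = z_score_scale(ages)
print(normalized)
print(standardized)
```

Scaling matters mainly for methods that compare magnitudes across variables, such as distance-based models, where an unscaled variable with a large range would dominate.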
Categorical Data Encoding Mark
Encode categorical variables as needed for analysis. Explain your encoding choices.
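Two common encoding choices are one-hot encoding for nominal categories and integer mapping for ordinal ones. The sketch below illustrates both; the category labels, the `SIZE_ORDER` mapping, and the helper names are hypothetical.

```python
def one_hot(values):
    """One-hot encode nominal labels: one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

def ordinal_encode(values, order):
    """Map ordered labels to integers that preserve their ranking."""
    return [order[v] for v in values]

# Hypothetical nominal variable (no natural order).
regions = ["north", "south", "north", "east"]
encoded_regions = one_hot(regions)

# Hypothetical ordinal variable with an assumed ranking.
SIZE_ORDER = {"low": 0, "medium": 1, "high": 2}
sizes = ["medium", "low", "high"]
encoded_sizes = ordinal_encode(sizes, SIZE_ORDER)

print(encoded_regions)
print(encoded_sizes)
```

Integer codes on a nominal variable would imply an order that does not exist, which is the usual reason to prefer one-hot encoding there.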
Exploratory Data Analysis (EDA) Introduction Mark
Begin EDA by summarizing key statistics (mean, median, mode, standard deviation) for at least three variables.
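The four statistics named here can be computed with Python's built-in `statistics` module, as in this minimal sketch. The `values` data and the `describe` helper are hypothetical; `multimode` is used so that ties for the most frequent value are all reported, and `stdev` is the sample standard deviation.

```python
from statistics import mean, median, multimode, stdev

# Hypothetical numeric variable.
values = [3, 5, 5, 6, 8, 9]

def describe(data):
    """Return mean, median, mode(s), and sample standard deviation."""
    return {
        "mean": mean(data),
        "median": median(data),
        "modes": multimode(data),
        "stdev": stdev(data),
    }

stats = describe(values)
for name, value in stats.items():
    print(f"{name}: {value}")
```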
Visual EDA: Distributions Mark
Create visualizations (histograms, box plots) to explore the distribution of key variables.
Visual EDA: Relationships Mark
Generate scatter plots or correlation heatmaps to investigate relationships between variables.
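The number behind each cell of a correlation heatmap is usually the Pearson correlation coefficient. As a sketch of what that calculation involves, the function below computes it directly from its definition; the `ad_spend` and `sales` data and the `pearson_r` name are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical paired observations.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 6]
r = pearson_r(ad_spend, sales)
print(f"r = {r:.3f}")
```

Values near +1 or -1 indicate a strong linear relationship and values near 0 a weak one; the coefficient does not capture non-linear patterns, which is why scatter plots remain useful alongside it.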
Identify and Interpret Key Relationships Mark
Based on the visualizations, identify two key relationships in the data. Discuss their potential impact on the business problem.
Descriptive Analytics Summary Mark
Provide a concise summary of the descriptive analytics findings, highlighting insights relevant to the business problem.
Formulate Statistical Questions Mark
Based on your EDA, formulate two statistical questions that could further inform the business problem.
Hypothesis Testing Plan Mark
For one statistical question, outline a plan for a hypothesis test. Specify the null and alternative hypotheses, significance level, and the test statistic to be used.
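To make such a plan concrete, suppose (as an assumption for illustration) the question is whether two groups have the same mean, with H0: the means are equal, H1: the means differ, and a significance level of 0.05. One suitable test statistic is Welch's t, which does not assume equal variances. The sketch below computes the statistic and its approximate degrees of freedom only; the group data and the `welch_t` helper are hypothetical, and the p-value or critical value would come from a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical samples from two customer groups.
group_a = [10, 12, 11, 13, 12]
group_b = [14, 15, 13, 16, 15]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

H0 is rejected when the absolute value of t exceeds the two-sided critical value for the chosen significance level at the computed degrees of freedom.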
Inferential Statistics Conceptual Application Mark
Conceptually apply inferential statistics to estimate a parameter or make predictions about the population based on your sample. No calculations required; explain your thought process.
Introduction to Predictive Modeling Mark
Identify a potential outcome variable for predictive modeling. Justify its selection based on your business problem and EDA findings.
Feature Selection Rationale Mark
Discuss how you would select features for the predictive model. Consider correlations, importance, and relevance to the outcome variable.
Feature selection process Mark
Select the features using at least different model-driven techniques.
Predictive Model Choice Mark
Choose a predictive modeling technique suitable for your data and business problem (e.g., linear regression, logistic regression). Explain why this model is appropriate.
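For a continuous outcome, simple linear regression is one of the techniques named above. The following sketch fits a one-feature line by ordinary least squares using the closed-form formulas; the `fit_line` helper and the data are hypothetical, and a real project would normally use a modeling library and more than one feature.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 3x + 2 with no noise.
xs = [1, 2, 3, 4, 5]
ys = [5, 8, 11, 14, 17]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The slope reads as the expected change in the outcome per one-unit change in the feature, which is the kind of statement that makes linear regression easy to explain to a business stakeholder. A binary outcome would call for logistic regression instead.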
Data Visualization for Insights Mark
Create a dashboard layout or a set of visualizations that could provide actionable insights to a business stakeholder. Describe how these insights address the business problem.
Presentation of Findings Mark
Summarize the key findings from your project, focusing on how they address the business problem. Include any recommendations or potential actions.
Interpretation of Predictive Model Output Mark
Once you've conceptually chosen a predictive model, describe how you would interpret the model's output to derive actionable business insights.
Conclusion Mark
Conclude your project by summarizing the key insights and their implications for the business problem. Reflect on the limitations of your analysis and potential improvements.