In this assignment, you are given a dataset to perform an exploratory analysis to better understand the
shape, structure and quality of the data, investigate and resolve data issues, and develop preliminary
insights & analysis. Your final submission will take the form of a report consisting of obtained results and
captioned visualizations that convey key insights gained during your analysis.
Note: Do not include the questions or the dataset in your submission, to avoid similarity with other
submissions.
Business Problem
The data is related to a pharmaceutical study. During the study, the subjects are surveyed about
their sleep quality and other demographics, in order to assess what sort of sleep disorder they might be
prone to.
Dataset:
Feature                   Definition
Age                       Patient's age
Gender                    Patient's gender
Occupation                Patient's occupation
Sleep duration            Sleep duration in hours
Quality of sleep          A ranked quality of sleep for each individual
Physical activity level   A ranked physical activity level
Stress level              Stress level on a scale
BMI Category              Body mass index
Blood pressure            Blood pressure (high/low)
Heart Rate                Heart rate in BPM
Daily steps               Daily step count
y                         Whether the patient might be prone to a sleep disorder and, if so, which one
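The blood pressure feature packs two measurements into one field, so it usually has to be split before it can be analyzed numerically. The sketch below assumes the readings are stored as text in the form "high/low" (for example "126/83"); that format and the function name are assumptions, not something the dataset description confirms.

```python
def split_blood_pressure(reading):
    """Split a 'high/low' reading such as '126/83' into two integers.

    Returns (None, None) when the value is missing or malformed, so that
    bad readings surface as missing values instead of raising errors.
    """
    if not reading:
        return (None, None)
    parts = reading.strip().split("/")
    if len(parts) != 2:
        return (None, None)
    try:
        return (int(parts[0]), int(parts[1]))
    except ValueError:
        return (None, None)


high, low = split_blood_pressure("126/83")
```

Counting how many readings come back as (None, None) is itself a useful entry for the data quality report.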
Required Analysis
You are supposed to perform an extensive exploratory analysis of the dataset, including the following
exploration phases:
A data quality report.
Identification of any issues in the data, a data quality plan, and mitigation of the identified issues.
Several meaningful data insights in the form of tables and graphs that help to understand the
dataset. There is no limit on the number of insights you can provide, but a minimum number is
expected. Note: Creativity is important in this section.
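A data quality report typically lists, for each feature, how many values are present, missing, and distinct, plus the range of numeric features. The following is a minimal sketch of that idea using only the Python standard library; the column names and sample rows are made up for illustration and are not taken from the actual dataset.

```python
from statistics import mean


def quality_report(rows):
    """Summarize each column: count, missing, distinct values, numeric range."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing
        present = [v for v in values if v is not None and v != ""]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        entry = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Only report a range when every present value is numeric
        if numeric and len(numeric) == len(present):
            entry.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        report[col] = entry
    return report


# Hypothetical sample rows
sample = [
    {"age": 31, "gender": "Male", "sleep_duration": 6.5},
    {"age": 45, "gender": "Female", "sleep_duration": None},
    {"age": 45, "gender": "", "sleep_duration": 7.2},
]
report = quality_report(sample)
```

Columns with a high missing count or an implausible minimum or maximum are natural candidates for the data quality plan.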
Deliverables
Project Proposal
The main purpose of the proposal is for us to check on whether the scope of the project is in the range of what
we're expecting, whether your plans are crisp enough, and in cases where you plan to use a different dataset
than one from the list above, whether it looks suitable and promising. On average we expect proposals to be
about half a page long, though we know the lengths will vary. Please create a document containing the
following two parts.
Dataset
o Describe the data. As part of this, please include the total size of the dataset (e.g., number of rows)
and a small sample of the data.
o Include a link to the source of the data, and discuss any difficulties you anticipate getting the data
ready for analysis.
Goals
o Formulate a specific set of questions you want to answer, points you want to make, or issues you
wish to explore through the data. Be as concrete as possible.
What To Turn In
Your proposal should be in a PDF document named projectproposal.pdf. Include clearly at the top of the
document the names and SUIDs for the student submitting the proposal, then include the two parts of the
proposal specified above. Upload the pdf document along with the complete project.
Complete Project
Use data mining techniques and tools to manipulate, analyze, and possibly visualize the data in order to achieve
your objectives. Here are a few tips and techniques:
How to implement data mining in Excel
How to treat missing values
How to import data from a website to Excel
It is likely you will end up developing a data processing pipeline, where in each step you transform or otherwise
manipulate some or all of your data to get it into a form that's suitable for the next step. In the final step your
data should be in the best form to answer your questions or otherwise achieve your objectives.
In many cases the early steps in a pipeline are more about preparing the data (correcting mistakes, filling in
missing values, creating consistent representations, mapping corresponding values), while the later steps are
more focused on summarization and analysis. If you use one of the recommended datasets, your preparation
steps may be minimal.
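The pipeline described above can be sketched as a chain of small functions, each taking the output of the previous step. This is only an illustration under stated assumptions: the column names, the gender aliases, the 0 to 24 hour validity range, and median imputation are all choices made for the example, not requirements of the assignment.

```python
from statistics import median


def standardize_gender(row):
    """Consistent representation: map spelling variants to one label."""
    aliases = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}
    raw = (row.get("gender") or "").strip().lower()
    return {**row, "gender": aliases.get(raw)}


def drop_impossible_sleep(row):
    """Correct mistakes: treat durations outside 0-24 hours as missing."""
    hours = row.get("sleep_duration")
    if hours is not None and not (0 < hours <= 24):
        return {**row, "sleep_duration": None}
    return row


def fill_missing_sleep(rows):
    """Fill missing values: impute sleep duration with the column median."""
    known = [r["sleep_duration"] for r in rows if r.get("sleep_duration") is not None]
    fallback = median(known) if known else None
    return [
        {**r, "sleep_duration": fallback} if r.get("sleep_duration") is None else r
        for r in rows
    ]


def run_pipeline(rows):
    """Apply the preparation steps in order; each step feeds the next."""
    rows = [standardize_gender(r) for r in rows]
    rows = [drop_impossible_sleep(r) for r in rows]
    return fill_missing_sleep(rows)


# Hypothetical raw rows
raw = [
    {"gender": " male", "sleep_duration": 6.0},
    {"gender": "F", "sleep_duration": 80.0},
    {"gender": "Female", "sleep_duration": 8.0},
]
clean = run_pipeline(raw)
```

Keeping each step as a separate function makes it easy to describe the steps one by one in the data processing section of the writeup.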
If you need to develop features using flagging, aggregation, ratio, or mapping techniques, mention
which features you derived and what type each feature is.
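As a rough illustration of the four techniques named above, the sketch below derives one feature of each kind. The column names, the 7-hour threshold, and the stress band cut-offs are hypothetical values chosen for the example; suitable thresholds depend on the actual data.

```python
from collections import defaultdict
from statistics import mean


def add_features(row):
    """Derive a flag, a ratio, and a mapped feature for one record."""
    out = dict(row)
    # Flagging (binary feature): hypothetical 7-hour threshold
    out["short_sleeper"] = row["sleep_duration"] < 7
    # Ratio (continuous feature): daily steps per unit of activity level
    activity = row["physical_activity"]
    out["steps_per_activity"] = row["daily_steps"] / activity if activity else None
    # Mapping (ordinal feature): hypothetical stress bands
    stress = row["stress_level"]
    out["stress_band"] = "low" if stress <= 3 else "medium" if stress <= 6 else "high"
    return out


def mean_sleep_by_occupation(rows):
    """Aggregation: average sleep duration within each occupation."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["occupation"]].append(r["sleep_duration"])
    return {occupation: mean(hours) for occupation, hours in groups.items()}


# Hypothetical sample rows
rows = [
    {"occupation": "Nurse", "sleep_duration": 6.0, "daily_steps": 6000,
     "physical_activity": 60, "stress_level": 8},
    {"occupation": "Nurse", "sleep_duration": 7.0, "daily_steps": 7000,
     "physical_activity": 70, "stress_level": 5},
    {"occupation": "Engineer", "sleep_duration": 8.0, "daily_steps": 8000,
     "physical_activity": 40, "stress_level": 3},
]
featured = [add_features(r) for r in rows]
by_occupation = mean_sleep_by_occupation(rows)
```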
What To Turn In
You will be turning in a single PDF writeup to Gradescope.
The writeup should include both parts of the project proposal (Dataset and Goals), discuss in reasonable detail how you went
about your analysis, and finally and most importantly discuss the conclusions drawn from your data-driven
study. On average we expect the writeups to be about pages long, though we know the lengths will vary.
Data visualizations can be pasted into the writeup. At the end of your writeup, include a section titled
Description of Files Used that lists all the artifacts that you used to generate the analysis and visualizations,
with a clear description of what each one contains. For example:
datavisualization.tbx: This Tableau file performs the main data analyses, using queries
OR
datacleaning.xlsx: This spreadsheet performs additional data manipulations and contains the final
visualizations
Here is a guideline for the sections in the main writeup:
Include clearly at the top of the document the names and Student IDs for the
student or student pair submitting the project.
Dataset: as in project proposal
Goals: as in project proposal
Data processing: Description of steps that were taken from raw data to final results
Visualizations: you need to share your visualizations either in Power BI or Tableau.
Please share your Tableau dashboards on Tableau Public: you need to first create a
profile here, and then follow this link to learn how to publish the dashboard.
For Power BI: either share them on a public workspace or, if you can't, share them
in a OneDrive folder and share the link.
Conclusions: resolution of the questions, issues, or points from the Goals part, based on your study
Description of Files Used
Upload the pdf document under the Assignment link.
Data Analysis and Visualization Tools:
Feel free to use any of the following tools for data analysis and visualization:
Tableau
Power BI
Grading Rubric
Key Points                                                      Grade Allocation
Format (font type, size, table, formulas), overall
content, including references if required (APA Style)
Results, analysis and assumptions
Novelty and creativity in solution
NB: Failure to comply with the above will result in low grades.