
Question



Could you help me figure out the following parts?

Discussion of what is observed in the EDA:

I noticed that there is much missing data which means I will ...

There were some variables that were skewed which means ...

Linear model:

My research question can be answered using a linear model because...

I will be using Y as my response and use the predictors X because ...

Some anticipated issues might be... / A linear model is likely appropriate because...

Some things I plan to keep an eye on when working on my data analysis are...

My own idea:

## Discussion of what is observed in the EDA:

I noticed that there is a lot of missing data, which means I will remove those observations, because missing data present various problems. First, the absence of data reduces statistical power, which refers to the probability that the test will reject the null hypothesis when it is false. Second, the lost data can cause bias in the estimation of parameters. Third, it can reduce the representativeness of the samples. Fourth, it may complicate the analysis of the study. Each of these distortions may threaten the validity of the trials and can lead to invalid conclusions (Hyun Kang, 2013, "The prevention and handling of the missing data"). After removing the rows with missing data, the dataset was reduced from 764 to 394 observations.

There were some variables that were skewed, which means we can compare the mean and median in those skewed plots.
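The two EDA steps described above (dropping incomplete rows, then comparing mean and median to judge skew) can be sketched as follows. This is only an illustration: the records are made-up toy values, not the real dataset, and the helper names `drop_incomplete` and `skew_direction` are hypothetical. Comparing mean and median is a rule of thumb rather than a formal test of skewness.

```python
import math
import statistics

# Hypothetical toy records (NOT the real data); None marks a missing value.
rows = [
    {"plas": 148.0, "mass": 33.6, "age": 50.0},
    {"plas": 85.0, "mass": None, "age": 31.0},
    {"plas": 183.0, "mass": 23.3, "age": 32.0},
    {"plas": None, "mass": 28.1, "age": 21.0},
    {"plas": 137.0, "mass": 43.1, "age": 33.0},
    {"plas": 116.0, "mass": 25.6, "age": 30.0},
    {"plas": 78.0, "mass": 31.0, "age": 26.0},
]


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_incomplete(records):
    """Keep only records in which every field is present (listwise deletion)."""
    return [r for r in records if not any(is_missing(v) for v in r.values())]


def skew_direction(values):
    """Rule of thumb: mean above median suggests right skew, below suggests left."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "symmetric"


complete = drop_incomplete(rows)
print(f"rows before: {len(rows)}, rows after: {len(complete)}")

ages = [r["age"] for r in complete]
print("mean age:", statistics.mean(ages))
print("median age:", statistics.median(ages))
print("suggested skew:", skew_direction(ages))
```

With a data frame library such as pandas, the same listwise deletion is usually done with `DataFrame.dropna()`. Because it discards every row that has any missing field, it is worth reporting the row count before and after, as done in the discussion above.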

## Linear model:

My research question can be answered using a linear model because the scatter plot shows a linear relationship, and the data are continuous and appear normally distributed. I will be using 'pedi' as my response Y and 'plas', 'pres', 'mass', 'age', 'skin', and 'inus' as the predictors X because, through my EDA, I found that these predictors are strongly related to my response. A linear model is likely appropriate because the observations are independent and my response appears to be normally distributed. Also, my numerical predictors are continuous. One thing I plan to keep an eye on during my data analysis is whether the variance is constant.
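A minimal sketch of the plan above, assuming NumPy is available: fit a linear model by ordinary least squares, then make a rough check of whether the residual variance looks constant. The data here are simulated, and the coefficients in `true_beta` and the intercept are invented for illustration; only the predictor names and the sample size of 394 come from the question. The half-split comparison of residual spread is a crude stand-in for a residuals-versus-fitted plot, not a formal test.

```python
import numpy as np

# Simulated stand-in data (NOT the real dataset); names follow the question.
rng = np.random.default_rng(0)
n = 394
predictors = ["plas", "pres", "mass", "age", "skin", "inus"]
X = rng.normal(size=(n, len(predictors)))

# Invented "true" coefficients, used only to generate the toy response.
true_beta = np.array([0.5, -0.2, 0.3, 0.1, 0.0, 0.4])
true_intercept = 1.0
y = true_intercept + X @ true_beta + rng.normal(scale=0.1, size=n)

# Ordinary least squares: prepend a column of ones for the intercept.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

fitted = design @ coef
residuals = y - fitted

# Rough constant-variance check: compare the residual spread for the
# lower half of the fitted values with the spread for the upper half.
order = np.argsort(fitted)
low = residuals[order[: n // 2]]
high = residuals[order[n // 2:]]
spread_ratio = high.std(ddof=1) / low.std(ddof=1)

print("intercept:", round(float(coef[0]), 3))
for name, value in zip(predictors, coef[1:]):
    print(f"{name}: {value:.3f}")
print("residual spread ratio (high / low fitted):", round(float(spread_ratio), 3))
```

A ratio far from 1 would suggest the variance changes with the fitted values. In that case a transformation of the response or a different model may be worth considering.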

Here is my PDF:

https://drive.google.com/drive/folders/1i8ZhAOQnjokndeWbMqineSpDjb02oYN9?usp=sharing
