
Question


Please answer with an R script. Thank you.

# Q1) Load the file "LaptopSalesJanuary2008.csv" and assign it to a local object
# named LSJ. Then, report the first six rows of the dataset and show all the data
# in a new tab. Finally, produce the dimensions of the data frame in addition to
# the class and the mean of every column. For this question, you can just complete
# the following lines.

LSJ <- read.csv("LaptopSalesJanuary2008.csv", header = TRUE)
head(LSJ)
View(LSJ)
dim(LSJ)
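The two remaining parts of Q1 (class and mean of every column) could be completed along the following lines. This is a minimal sketch, not a verified solution: the small data frame below is a hypothetical stand-in for the real file, and the column name `Integrated.Wireless.` is an assumption (only `Retail.Price` and `CustomerStoreDistance` are named in the question). `View()` is left out because it only works in an interactive session.

```r
# Hypothetical toy stand-in for LSJ; with the real file, keep the read.csv() call.
LSJ <- data.frame(
  Retail.Price = c(455, 520, NA, 610),
  Integrated.Wireless. = c("Yes", "No", "Yes", "No"),  # assumed column name
  CustomerStoreDistance = c(3.2, NA, 1.5, 7.8),
  stringsAsFactors = FALSE
)

# Class of every column
col_classes <- sapply(LSJ, class)
print(col_classes)

# Mean of every column: NAs are ignored for numeric columns,
# and non-numeric columns report NA instead of raising a warning.
col_means <- sapply(LSJ, function(x) {
  if (is.numeric(x)) mean(x, na.rm = TRUE) else NA_real_
})
print(col_means)
```

Whether to pass `na.rm = TRUE` is a judgment call: without it, any column containing an NA reports a mean of NA.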

# Q2) How many NAs are in the dataset? How many rows include at least one NA? Replace
# the NAs in the "CustomerStoreDistance" column by the median of the rest of the values
# in the same column. Check the number of complete rows again to make sure you have
# successfully replaced the NAs in that column. (The number should have decreased to 8.)
LSJ[complete.cases(LSJ), ]
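One possible way to approach Q2 is sketched below on a hypothetical toy data frame, so the counts will not match the "8" quoted for the real data. The variable names `total_nas`, `rows_with_na`, and `dist_median` are illustrative only.

```r
# Hypothetical toy stand-in for LSJ (the real data come from the CSV file).
LSJ <- data.frame(
  Retail.Price = c(455, 520, NA, 610, 480),
  CustomerStoreDistance = c(3.2, NA, 1.5, NA, 7.8)
)

# Total number of NA cells in the data frame
total_nas <- sum(is.na(LSJ))

# Number of rows that contain at least one NA
rows_with_na <- sum(!complete.cases(LSJ))

# Replace NAs in CustomerStoreDistance with the median of the observed values
dist_median <- median(LSJ$CustomerStoreDistance, na.rm = TRUE)
LSJ$CustomerStoreDistance[is.na(LSJ$CustomerStoreDistance)] <- dist_median

# Re-check: only rows with NAs in other columns should remain incomplete
rows_with_na_after <- sum(!complete.cases(LSJ))
print(c(total_nas, rows_with_na, rows_with_na_after))
```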

# Q3) Keep the subset of the data frame ranging from columns 5 through 10 (inclusive),
# and assign this subset to a local object named LSJ1. How many NAs exist in this
# new data frame?
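A short sketch of the column subsetting asked for in Q3, using a hypothetical 12-column toy frame (columns `V1` to `V12`) in place of the real data:

```r
# Hypothetical toy stand-in for LSJ with 12 columns named V1..V12.
LSJ <- as.data.frame(matrix(1:36, nrow = 3, ncol = 12))
LSJ[2, 6] <- NA    # an NA inside columns 5-10
LSJ[3, 11] <- NA   # an NA outside columns 5-10

# Keep columns 5 through 10 (inclusive)
LSJ1 <- LSJ[, 5:10]

# Count the NAs that remain in the subset
na_count <- sum(is.na(LSJ1))
print(na_count)
```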

# Q4) List the unique values in each of the columns of LSJ1. From the lecture slides,
# recall the usage of the "table" function to obtain the frequency of each unique
# value in a series. Apply the table function to every column of LSJ1.
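For Q4, `lapply` can apply `unique` and `table` to each column in turn. The sketch below assumes a hypothetical three-column version of LSJ1; the column names `RAM..GB.` and `Integrated.Wireless.` are assumptions, not taken from the question.

```r
# Hypothetical toy stand-in for LSJ1.
LSJ1 <- data.frame(
  Retail.Price = c(455, 520, 455, 610),
  RAM..GB. = c(1, 2, 1, 2),                           # assumed column name
  Integrated.Wireless. = c("Yes", "No", "Yes", "Yes"),  # assumed column name
  stringsAsFactors = FALSE
)

# Unique values of every column (a list with one element per column)
unique_values <- lapply(LSJ1, unique)
print(unique_values)

# Frequency of each unique value, column by column
frequencies <- lapply(LSJ1, table)
print(frequencies)
```

By default `table` ignores NAs; passing `useNA = "ifany"` would count them as a separate category.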

# Q5) Take a random sample of 100 observations from LSJ1. Show only the "head" of
# the resulting sample.
#
# Also notice that items with a price greater than $600 constitute a small portion
# of the dataset. From LSJ1, oversample such items in the following way. Sample 100
# observations such that the probability of obtaining an item with a price greater
# than $600 is 95% (and the probability of obtaining an item with a price less than
# $600 is 5%). Show only the "head" of the resulting sample.
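The two sampling steps in Q5 might be sketched as follows. The data are simulated and hypothetical, and the sketch makes two assumptions worth checking against the course material: the oversample is drawn with replacement (so each draw has a 95% chance of a high-price item even if fewer than 95 such rows exist), and `Retail.Price` contains no NAs (otherwise the weights would contain NAs and `sample` would fail).

```r
set.seed(123)  # for reproducibility

# Hypothetical simulated stand-in for LSJ1.
n <- 1000
LSJ1 <- data.frame(
  Retail.Price = round(runif(n, 300, 650)),
  RAM..GB. = sample(c(1, 2, 4), n, replace = TRUE)  # assumed column name
)

# Simple random sample of 100 rows, without replacement
simple_sample <- LSJ1[sample(nrow(LSJ1), 100), ]
print(head(simple_sample))

# Oversampling: split a total probability of 0.95 evenly over the rows
# priced above $600 and 0.05 evenly over the remaining rows.
is_high <- LSJ1$Retail.Price > 600
weights <- ifelse(is_high, 0.95 / sum(is_high), 0.05 / sum(!is_high))

# Assumption: sampling with replacement, so the per-draw probabilities hold exactly.
over_idx <- sample(nrow(LSJ1), 100, replace = TRUE, prob = weights)
over_sample <- LSJ1[over_idx, ]
print(head(over_sample))
```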

# Q6) Notice that there is only one categorical variable of character type in LSJ1.
# Think about the number of categories this variable takes and decide how many dummy
# variables you will need to replace it. Use either the "dummies" package or the model.matrix
# function to generate just as many dummy variables as you "require" (don't go beyond),
# replace the current categorical variable with the dummy variable(s), and assign
# the output to a local object named "LSJ2". Finally, show the head of LSJ2.
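A sketch of the `model.matrix` route for Q6 follows. It assumes the categorical column is called `Integrated.Wireless.` and takes two values ("Yes"/"No"), which would call for a single dummy variable; both points are assumptions to verify against the real data. It also assumes the column has no NAs, since `model.matrix` drops incomplete rows by default and the `cbind` would then fail.

```r
# Hypothetical toy stand-in for LSJ1.
LSJ1 <- data.frame(
  Retail.Price = c(455, 520, 610, 480),
  Integrated.Wireless. = c("Yes", "No", "Yes", "No"),  # assumed column name
  stringsAsFactors = FALSE
)

# With an intercept, model.matrix creates k - 1 dummies for a k-level variable.
# Dropping the intercept column leaves exactly the dummies that are required.
wireless_dummies <- model.matrix(~ Integrated.Wireless., data = LSJ1)[, -1, drop = FALSE]

# Replace the character column with the dummy column(s)
keep <- names(LSJ1) != "Integrated.Wireless."
LSJ2 <- cbind(LSJ1[, keep, drop = FALSE], wireless_dummies)
print(head(LSJ2))
```

Here "No" becomes the reference level (it sorts first alphabetically), so the dummy equals 1 for "Yes" rows.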

# Q7) Partition LSJ2 into training set (40%), validation set (30%), and test set
# (30%). Don't miss "set.seed".
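One common way to build the 40/30/30 partition in Q7 is to shuffle the row indices once and cut the shuffled vector into three pieces. The sketch below uses simulated, hypothetical data; the seed value 42 is arbitrary.

```r
set.seed(42)  # arbitrary seed, fixed so the partition is reproducible

# Hypothetical simulated stand-in for LSJ2.
LSJ2 <- data.frame(
  Retail.Price = runif(100, 300, 650),
  RAM..GB. = sample(c(1, 2, 4), 100, replace = TRUE)  # assumed column name
)

n <- nrow(LSJ2)
shuffled <- sample(n)          # random permutation of the row indices
n_train <- floor(0.4 * n)
n_valid <- floor(0.3 * n)

train_idx <- shuffled[1:n_train]
valid_idx <- shuffled[(n_train + 1):(n_train + n_valid)]
test_idx  <- shuffled[(n_train + n_valid + 1):n]  # remaining rows (about 30%)

train <- LSJ2[train_idx, ]
valid <- LSJ2[valid_idx, ]
test  <- LSJ2[test_idx, ]
print(c(nrow(train), nrow(valid), nrow(test)))
```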

# Q8) Fit a linear regression model to the training set (partitioned from LSJ2)
# with Retail.Price as the target and the rest of the variables as predictors.

# Q9) Apply the regression model to the validation set.

# Q10) Compute the evaluation metrics for both the training and prediction sets.
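Q8 through Q10 could be approached roughly as below. This is a sketch on simulated, hypothetical data rather than the real LSJ2: the predictor names and the `metrics` helper are illustrative, and the choice of ME, RMSE, and MAE as "evaluation metrics" is an assumption, since the question does not say which metrics are expected. The "prediction set" in Q10 is read here as the validation set.

```r
set.seed(42)

# Hypothetical simulated stand-in for LSJ2 with a roughly linear price.
n <- 200
LSJ2 <- data.frame(
  RAM..GB. = sample(c(1, 2, 4), n, replace = TRUE),  # assumed column name
  Integrated.Wireless.Yes = rbinom(n, 1, 0.5)        # assumed dummy column
)
LSJ2$Retail.Price <- 400 + 30 * LSJ2$RAM..GB. +
  20 * LSJ2$Integrated.Wireless.Yes + rnorm(n, sd = 10)

# 40% training, 30% validation (the remaining 30% would be the test set)
idx <- sample(n)
train <- LSJ2[idx[1:80], ]
valid <- LSJ2[idx[81:140], ]

# Q8: Retail.Price as target, all other columns as predictors
fit <- lm(Retail.Price ~ ., data = train)
print(summary(fit))

# Q9: apply the fitted model to the validation set
valid_pred <- predict(fit, newdata = valid)

# Q10: illustrative helper computing mean error, RMSE, and MAE
metrics <- function(actual, predicted) {
  err <- actual - predicted
  c(ME = mean(err), RMSE = sqrt(mean(err^2)), MAE = mean(abs(err)))
}
train_metrics <- metrics(train$Retail.Price, fitted(fit))
valid_metrics <- metrics(valid$Retail.Price, valid_pred)
print(rbind(training = train_metrics, validation = valid_metrics))
```

Comparing the two rows gives a rough check for overfitting: a validation RMSE far above the training RMSE suggests the model does not generalize well.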

