Question
Please answer with an R script. Thank you.
# Q1) Load the file "LaptopSalesJanuary2008.csv" and assign it to a local object
# named LSJ. Then, report the first six rows of the dataset and show all the data
# in a new tab. Finally, produce the dimensions of the data frame in addition to
# the class and the mean of every column. For this question, you can just complete
# the following lines.
LSJ <- read.csv("LaptopSalesJanuary2008.csv", header = TRUE)
head(LSJ)
View(LSJ)
dim(LSJ)
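The lines above stop before the class and the mean of every column. A minimal sketch of that remaining part follows. It runs on a small toy data frame so that it works without the CSV file; the toy values and the `Integrated.Wireless` column name are assumptions for illustration, not taken from the real dataset.

```r
# Toy stand-in for LSJ (hypothetical values; the real LSJ comes from read.csv)
LSJ <- data.frame(
  Retail.Price = c(455, 545, 515, 395, 585, 555),
  Integrated.Wireless = c("Yes", "No", "Yes", "Yes", "No", "Yes"),
  CustomerStoreDistance = c(4000.5, NA, 3100.2, 2750.0, NA, 5200.8),
  stringsAsFactors = FALSE
)

# Class of every column
col_classes <- sapply(LSJ, class)
print(col_classes)

# Mean of every column; non-numeric columns yield NA instead of a warning,
# and missing values are skipped with na.rm = TRUE
col_means <- sapply(LSJ, function(x) {
  if (is.numeric(x)) mean(x, na.rm = TRUE) else NA_real_
})
print(col_means)
```

Calling `mean` directly on a character column returns `NA` with a warning, which is why the sketch checks `is.numeric` first.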
#
# Q2) How many NAs are in the dataset? How many rows include at least one NA? Replace
# the NAs in the "CustomerStoreDistance" column by the median of the rest of the values
# in the same column. Check the number of complete rows again to make sure you have
# successfully replaced the NAs in that column. (The number should have decreased to 8.)
LSJ[complete.cases(LSJ), ]
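One possible way to count the NAs and impute the median is sketched here. The toy data frame is an assumption so the block runs on its own, which means the counts it produces will not match the real file (where the question expects 8 after imputation).

```r
# Toy stand-in for LSJ (hypothetical values)
LSJ <- data.frame(
  Retail.Price = c(455, 545, NA, 395, 585, 555),
  CustomerStoreDistance = c(4000, NA, 3100, 2700, NA, 5200)
)

# Total number of NA cells in the data frame
total_nas <- sum(is.na(LSJ))

# Number of rows that contain at least one NA
rows_with_na <- sum(!complete.cases(LSJ))

# Replace NAs in CustomerStoreDistance by the median of the observed values
med <- median(LSJ$CustomerStoreDistance, na.rm = TRUE)
LSJ$CustomerStoreDistance[is.na(LSJ$CustomerStoreDistance)] <- med

# Re-check: only rows with NAs in other columns should remain incomplete
rows_with_na_after <- sum(!complete.cases(LSJ))
print(c(total_nas, rows_with_na, rows_with_na_after))
```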
#
# Q3) Keep the subset of the data frame ranging from columns 5 through 10 (inclusive),
# and assign this subset to a local object named LSJ1. How many NAs exist in this
# new data frame?
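Column subsetting by position can be sketched as follows. The 12-column toy data frame with auto-generated names `V1` to `V12` is an assumption standing in for the real LSJ.

```r
# Toy stand-in for LSJ: 3 rows, 12 columns named V1..V12 (hypothetical)
LSJ <- as.data.frame(matrix(as.numeric(1:36), nrow = 3))
LSJ[2, 6] <- NA    # an NA inside columns 5-10
LSJ[3, 11] <- NA   # an NA outside columns 5-10

# Keep columns 5 through 10 (inclusive)
LSJ1 <- LSJ[, 5:10]

# Count the NAs remaining in the subset
na_count <- sum(is.na(LSJ1))
print(na_count)
```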
#
# Q4) List the unique values in each of the columns of LSJ1. From the lecture slides,
# recall the usage of the "table" function to obtain the frequency of each unique
# value in a series. Apply the table function to every column of LSJ1.
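Applying a function to every column is a natural fit for `lapply`. The sketch below assumes a two-column toy LSJ1 with made-up values; the real LSJ1 has six columns.

```r
# Toy stand-in for LSJ1 (hypothetical column names and values)
LSJ1 <- data.frame(
  Retail.Price = c(455, 545, 455, 395),
  Integrated.Wireless = c("Yes", "No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Unique values of each column, returned as a named list
unique_vals <- lapply(LSJ1, unique)
print(unique_vals)

# Frequency table of each column; NAs are dropped unless useNA is set
freq <- lapply(LSJ1, table)
print(freq)
```

To include missing values in the counts, `table(x, useNA = "ifany")` can be used instead.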
#
# Q5) Take a random sample of 100 observations from LSJ1. Show only the "head" of
# the resulting sample.
#
# Also notice that items with a price greater than $600 constitute a small portion
# of the dataset. From LSJ1, oversample such items in the following way. Sample 100
# observations such that the probability of obtaining an item with a price greater
# than $600 is 95% (and the probability of obtaining an item with a price less than
# $600 is 5%). Show only the "head" of the resulting sample.
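Both sampling steps can be sketched with `sample`. The sketch makes two assumptions beyond the question: the oversample is drawn with replacement, so that each draw has exactly a 95% chance of a price above $600, and items priced at exactly $600 are grouped with the cheaper items. The toy data is hypothetical.

```r
set.seed(1)

# Toy stand-in for LSJ1: 180 cheaper items and 20 expensive ones (hypothetical)
LSJ1 <- data.frame(
  Retail.Price = c(rep(450, 180), rep(650, 20)),
  RAM = rep(c(1, 2), 100)
)

# Simple random sample of 100 rows without replacement
simple_sample <- LSJ1[sample(nrow(LSJ1), 100), , drop = FALSE]
print(head(simple_sample))

# Oversampling: spread 0.95 of the probability mass over the expensive rows
# and 0.05 over the remaining rows
expensive <- LSJ1$Retail.Price > 600
w <- ifelse(expensive, 0.95 / sum(expensive), 0.05 / sum(!expensive))

# Assumption: sample with replacement so the 95% / 5% split holds on every draw
idx <- sample(nrow(LSJ1), 100, replace = TRUE, prob = w)
over_sample <- LSJ1[idx, , drop = FALSE]
print(head(over_sample))
```

Without `replace = TRUE`, the per-draw probabilities drift as expensive rows are used up, so the 95% target would only hold for the first draw.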
#
# Q6) Notice that there is only one categorical variable of character type in LSJ1.
# Think about the number of categories this variable takes and decide how many dummy
# variables you will need to replace it. Use either the "dummies" package or the model.matrix
# function to generate just as many dummy variables as you "require" (don't go beyond),
# replace the current categorical variable with the dummy variable(s), and assign
# the output to a local object named "LSJ2". Finally, show the head of LSJ2.
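A variable with k categories needs k - 1 dummy variables. The sketch below takes the `model.matrix` route and assumes, for illustration only, that the character column is called `Integrated.Wireless` and has two levels; the actual column should be checked in LSJ1.

```r
# Toy stand-in for LSJ1 (hypothetical column names and values)
LSJ1 <- data.frame(
  Retail.Price = c(455, 545, 515, 395),
  Integrated.Wireless = c("Yes", "No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# model.matrix with an intercept encodes k levels as k - 1 dummy columns;
# dropping the intercept column leaves only the dummies
dummies <- model.matrix(~ Integrated.Wireless, data = LSJ1)[, -1, drop = FALSE]

# Replace the character column by its dummy column(s)
keep <- names(LSJ1) != "Integrated.Wireless"
LSJ2 <- cbind(LSJ1[, keep, drop = FALSE], dummies)
print(head(LSJ2))
```

With two levels ("No" and "Yes"), a single column `Integrated.WirelessYes` results, with "No" as the reference level.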
#
# Q7) Partition LSJ2 into training set (40%), validation set (30%), and test set
# (30%). Don't miss "set.seed".
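A three-way partition can be built from row indices, as in this sketch. The seed value, the toy LSJ2, and the names `train_df`, `valid_df`, and `test_df` are assumptions chosen for illustration.

```r
set.seed(123)  # any fixed seed makes the partition reproducible

# Toy stand-in for LSJ2 with 100 rows (hypothetical values)
LSJ2 <- data.frame(Retail.Price = seq(300, 795, by = 5), RAM = rep(c(1, 2), 50))

n <- nrow(LSJ2)

# 40% of the rows for training
train_idx <- sample(n, round(0.4 * n))

# 30% of the rows for validation, drawn from what is left
rest <- setdiff(seq_len(n), train_idx)
valid_idx <- sample(rest, round(0.3 * n))

# Everything that remains (about 30%) is the test set
test_idx <- setdiff(rest, valid_idx)

train_df <- LSJ2[train_idx, ]
valid_df <- LSJ2[valid_idx, ]
test_df <- LSJ2[test_idx, ]
print(c(nrow(train_df), nrow(valid_df), nrow(test_df)))
```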
# Q8) Fit a linear regression model to the training set (partitioned from LSJ2)
# with Retail.Price as the target and the rest of the variables as predictors.
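In a formula, `Retail.Price ~ .` means "all remaining columns as predictors". The sketch below fits such a model on simulated data; the predictor names `RAM` and `Wireless` and the object name `train_df` are hypothetical stand-ins for the real training set.

```r
set.seed(42)

# Simulated training set (hypothetical predictors and coefficients)
n <- 40
train_df <- data.frame(
  RAM = sample(c(1, 2, 4), n, replace = TRUE),
  Wireless = rbinom(n, 1, 0.5)
)
train_df$Retail.Price <- 300 + 60 * train_df$RAM + 25 * train_df$Wireless +
  rnorm(n, sd = 10)

# Retail.Price as the target, every other column as a predictor
fit <- lm(Retail.Price ~ ., data = train_df)
print(summary(fit))
```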
# Q9) Apply the regression model to the validation set.
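Applying a fitted model to new rows is done with `predict` and its `newdata` argument. This sketch refits a model on simulated data so that it runs on its own; the column names and the 36/24 split are assumptions for illustration.

```r
set.seed(42)

# Simulated data split into training and validation rows (hypothetical)
n <- 60
toy <- data.frame(
  RAM = sample(c(1, 2, 4), n, replace = TRUE),
  Wireless = rbinom(n, 1, 0.5)
)
toy$Retail.Price <- 300 + 60 * toy$RAM + 25 * toy$Wireless + rnorm(n, sd = 10)
train_df <- toy[1:36, ]
valid_df <- toy[37:60, ]

fit <- lm(Retail.Price ~ ., data = train_df)

# Predict Retail.Price for the validation rows
valid_pred <- predict(fit, newdata = valid_df)

# Compare actual and predicted values side by side
comparison <- data.frame(
  actual = valid_df$Retail.Price,
  predicted = valid_pred,
  residual = valid_df$Retail.Price - valid_pred
)
print(head(comparison))
```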
# Q10) Compute the evaluation metrics for both the training and prediction sets.
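The question does not say which metrics are expected, so the sketch below assumes four common ones (ME, RMSE, MAE, MAPE) and reads "prediction set" as the validation set. The helper `eval_metrics` is a hypothetical name written in base R, and the data is simulated.

```r
# Hypothetical helper: common error metrics for numeric predictions
eval_metrics <- function(actual, predicted) {
  err <- actual - predicted
  c(
    ME = mean(err),                       # mean error (bias)
    RMSE = sqrt(mean(err^2)),             # root mean squared error
    MAE = mean(abs(err)),                 # mean absolute error
    MAPE = 100 * mean(abs(err / actual))  # mean absolute percentage error
  )
}

set.seed(42)

# Simulated data split into training and validation rows (hypothetical)
n <- 60
toy <- data.frame(RAM = sample(c(1, 2, 4), n, replace = TRUE))
toy$Retail.Price <- 300 + 60 * toy$RAM + rnorm(n, sd = 10)
train_df <- toy[1:36, ]
valid_df <- toy[37:60, ]

fit <- lm(Retail.Price ~ ., data = train_df)

train_metrics <- eval_metrics(train_df$Retail.Price, fitted(fit))
valid_metrics <- eval_metrics(valid_df$Retail.Price,
                              predict(fit, newdata = valid_df))
print(rbind(training = train_metrics, validation = valid_metrics))
```

A validation RMSE much larger than the training RMSE would suggest overfitting; similar values suggest the model generalizes about as well as it fits.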