Question

Data Source: https://www.kaggle.com/c/bike-sharing-demand/data

You are asked to perform the following tasks by writing a script in R. Submit both the R code and a Word document.

    1. Load the dataset day.csv into memory.
    2. Perform the following data preparations using control structures:

        a. Convert the numerical season (1, 2, 3, 4) to characters (springer, summer, fall and winter).

        b. Convert the numerical weathersit (1, 2, 3, 4) to characters (Good, Mist, Bad, Severe).
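
One way the recoding in task 2 could look with explicit control structures is sketched below. It is only an illustration: the small `day` data frame is a hypothetical stand-in for day.csv, the helper vectors `season_chr` and `weather_chr` are names chosen here, and the sketch assumes the two columns contain no missing values. The labels are spelled as the task gives them.

```r
# Hypothetical stand-in for day.csv: only the two columns being recoded.
# With the real file this would be something like: day <- read.csv("day.csv")
day <- data.frame(
  season     = c(1, 2, 3, 4, 1),
  weathersit = c(1, 2, 3, 1, 2)
)

season_chr  <- character(nrow(day))
weather_chr <- character(nrow(day))

for (i in seq_len(nrow(day))) {
  # if / else if chain for season (assumes no NA values)
  s <- day$season[i]
  if (s == 1) {
    season_chr[i] <- "springer"
  } else if (s == 2) {
    season_chr[i] <- "summer"
  } else if (s == 3) {
    season_chr[i] <- "fall"
  } else if (s == 4) {
    season_chr[i] <- "winter"
  } else {
    season_chr[i] <- NA_character_
  }

  # switch() for weathersit; the unnamed last argument is the default
  weather_chr[i] <- switch(
    as.character(day$weathersit[i]),
    "1" = "Good", "2" = "Mist", "3" = "Bad", "4" = "Severe",
    NA_character_
  )
}

day$season     <- season_chr
day$weathersit <- weather_chr
print(day)
```

The loop uses both an `if`/`else if` chain and `switch()` only to show two control structures side by side; either one alone would cover both columns.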

    3. Consider the following predictors: season, holiday, workingday, weathersit, atemp, hum, windspeed, casual and [...]. List all categorical variables from this list and convert them to factors.
    4. Calculate the minimum, maximum, mean, median, standard deviation and three quartiles (25th, 50th and 75th percentiles) of cnt.
    5. Calculate the minimum, maximum, mean, median, standard deviation and three quartiles (25th, 50th and 75th percentiles) of registered.
    6. Calculate the correlation coefficient of the two variables registered and cnt. Do they have a strong relationship?
    7. Calculate the frequency table of season. What is the mode of the season variable?
    8. Calculate the cross table of season and weathersit, then produce proportions by rows and by columns respectively.
    9. Plot the histogram and density of cnt and add a vertical line denoting the mean using ggplot2.
    10. Draw a scatter plot of cnt (y-axis) against registered (x-axis) and add the trend line using ggplot2.
    11. Plot season and weathersit on the same barplot using ggplot2.
    12. Draw a boxplot of cnt (y-axis) against weathersit (x-axis) using ggplot2 and save the graph in a file, cntweather.jpg. Are there any differences in cnt with respect to weathersit?
    13. Build the following multiple linear regression models:

        A. Perform multiple linear regression with cnt as the response and the predictors season, weathersit, atemp, and registered. Write down the math formula with numerical coefficients for the predictors atemp and registered, and skip the coefficients for season and weathersit.

        B. Perform multiple linear regression with cnt as the response and the predictors season, workingday, weathersit, atemp, and registered. Write down the math formula with numerical coefficients for the predictors atemp and registered, and skip the coefficients for season, workingday and weathersit.

        C. Perform multiple linear regression with cnt as the response and the predictors season, holiday, workingday, weathersit, atemp, hum, windspeed, and registered. Write down the math formula with numerical coefficients for the predictors atemp, hum, windspeed, and registered, and skip the coefficients for season, holiday, workingday and weathersit.
        D. Which model do you recommend to the management based on adjusted R squared? Justify your answer.
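
A minimal sketch of how the three linear models might be fitted and compared on adjusted R squared follows. The data frame is simulated (an assumption made so the example runs without day.csv), so the printed numbers say nothing about the real bike-sharing data; the names `model_a`, `model_b`, `model_c`, `adj_r2` and `best` are illustrative only.

```r
# Simulated stand-in for the prepared day.csv data (NOT the real data).
set.seed(1)
n <- 200
day <- data.frame(
  season     = factor(sample(c("springer", "summer", "fall", "winter"), n, replace = TRUE)),
  holiday    = factor(sample(0:1, n, replace = TRUE, prob = c(0.9, 0.1))),
  workingday = factor(sample(0:1, n, replace = TRUE)),
  weathersit = factor(sample(c("Good", "Mist", "Bad"), n, replace = TRUE)),
  atemp      = runif(n, 0.1, 0.8),
  hum        = runif(n, 0.2, 1),
  windspeed  = runif(n, 0, 0.5),
  registered = round(runif(n, 500, 6000))
)
# Made-up response so that lm() has something to fit.
day$cnt <- round(day$registered + 1500 * day$atemp - 400 * day$hum + rnorm(n, sd = 300))

# The three models from the task, smallest to largest.
model_a <- lm(cnt ~ season + weathersit + atemp + registered, data = day)
model_b <- lm(cnt ~ season + workingday + weathersit + atemp + registered, data = day)
model_c <- lm(cnt ~ season + holiday + workingday + weathersit + atemp + hum +
                windspeed + registered, data = day)

# Adjusted R squared penalises extra predictors, so it can be compared across models.
adj_r2 <- sapply(list(A = model_a, B = model_b, C = model_c),
                 function(m) summary(m)$adj.r.squared)
print(adj_r2)
best <- names(which.max(adj_r2))

# Coefficients needed for the written formula of the largest model.
print(coef(model_c)[c("atemp", "hum", "windspeed", "registered")])
```

With the real data, the values returned by `coef()` are what would go into the written formula, for example cnt = b0 + b1 * atemp + b2 * hum + b3 * windspeed + b4 * registered plus the skipped factor terms.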

Summarize Question 13-C using R Markdown to generate a reproducible report. Include the following scripts in your R Markdown file:

  1. Load the data as specified in Question 1.
  2. Convert the two variables as specified in Question 2.
  3. Convert the categorical variables to factors as specified in Question 3
  4. Build the linear model as specified in Question 13-C. Use R markdown to report the math formula with numerical coefficients for predictors atemp, hum, windspeed, and registered. Skip the coefficients for season, holiday, workingday and weathersit.
  5. Build the following logistic models:
    1. Forecast holiday using cnt, season, and registered.
    2. Forecast holiday using cnt, season, weathersit, and registered.
    3. Forecast holiday using cnt, season, weathersit, workingday, and registered.
    4. Which model do you recommend to the management based on McFadden's pseudo R squared? Justify your answer.
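
The logistic models could be compared along the following lines. This is a sketch under stated assumptions: the data are simulated rather than read from day.csv, and `mcfadden_r2` is a hypothetical helper written here that computes McFadden's pseudo R squared as 1 minus the ratio of the fitted model's log-likelihood to that of an intercept-only model.

```r
# Simulated stand-in for the prepared data (NOT the real day.csv).
set.seed(2)
n <- 300
day <- data.frame(
  season     = factor(sample(c("springer", "summer", "fall", "winter"), n, replace = TRUE)),
  weathersit = factor(sample(c("Good", "Mist", "Bad"), n, replace = TRUE)),
  workingday = factor(sample(0:1, n, replace = TRUE)),
  registered = round(runif(n, 500, 6000))
)
day$cnt <- round(day$registered + runif(n, 0, 3000))
# Made-up holiday indicator that depends weakly on cnt.
day$holiday <- factor(rbinom(n, 1, plogis(-1 - 0.0003 * (day$cnt - 4000))))

# Hypothetical helper: McFadden's pseudo R squared = 1 - logLik(model) / logLik(null).
mcfadden_r2 <- function(model, null_model) {
  1 - as.numeric(logLik(model)) / as.numeric(logLik(null_model))
}

null_model <- glm(holiday ~ 1, family = binomial, data = day)
logit_1 <- glm(holiday ~ cnt + season + registered, family = binomial, data = day)
logit_2 <- glm(holiday ~ cnt + season + weathersit + registered,
               family = binomial, data = day)
logit_3 <- glm(holiday ~ cnt + season + weathersit + workingday + registered,
               family = binomial, data = day)

pseudo_r2 <- sapply(list(m1 = logit_1, m2 = logit_2, m3 = logit_3),
                    mcfadden_r2, null_model = null_model)
print(pseudo_r2)
```

Because the three models are nested, the unadjusted McFadden value cannot decrease as predictors are added, which is worth keeping in mind when justifying a recommendation.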
