Question
This assignment relates to the Regression module. The questions that follow are based on a dataset of New York City taxi rides stored in taxi.csv. Read in taxi.csv and assign it to an object, taxi. If the data is in your working directory, you can use the following code to read in the data:
taxi = read.csv('taxi.csv')
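Before modeling, it can help to confirm that the file loaded with the expected columns. The sketch below is one way to do that; read_taxi is a hypothetical helper, not part of the assignment, and the column names it checks are taken from the Variables list and Question 1.

```r
# Hypothetical helper (not part of the assignment): read the file and
# fail early if a column used in Question 1 is missing.
read_taxi = function(path = 'taxi.csv') {
  taxi = read.csv(path)
  needed = c('tip_amount', 'passenger_count', 'fare_amount',
             'pickup_hour', 'period_of_month', 'pickup_day')
  missing_cols = setdiff(needed, names(taxi))
  if (length(missing_cols) > 0) {
    stop('Missing columns: ', paste(missing_cols, collapse = ', '))
  }
  taxi
}

# Only runs when the file is actually in the working directory.
if (file.exists('taxi.csv')) {
  taxi = read_taxi('taxi.csv')
  str(taxi)                  # column names and types
  summary(taxi$tip_amount)   # distribution of the outcome variable
}
```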
About NYC Taxi Data
This data contains a subset of NYC taxi trips for April 2022. The goal of this assignment is to examine the factors that influence the size of the tip (tip_amount) a taxi driver receives.
Variables
trip_id: Unique identifier for each trip
trip_duration: Duration of trip in minutes
trip_distance: Distance of trip in miles
passenger_count: Number of passengers
fare_amount: Fare calculated by the meter. This does not include tolls, surcharges or tips.
tolls_amount: Amount of all tolls paid in trip
tip: whether the taxi driver received a Tip or No Tip
tip_amount: tip paid
period_of_day: Time of day for pickup: morning, afternoon, evening, night
pickup_date: Date of month for pickup
period_of_month: Period of month when the trip occurred: beginning, middle, end
pickup_day: Day of week for trip: Mon, Tue, Wed, Thu, Fri, Sat, Sun
pickup_hour: Hour of day for pickup
pickup_min: Minute of day for pickup
pickup_sec: Second of day for pickup
pickup_time: Pick up date and time
dropoff_time: Drop off date and time
Details
You will have a maximum of three attempts for this assignment. Only those attempts registered before the due date will count towards your score.
When entering your answers, follow these instructions unless otherwise stated. Failing to do so may cause a correct answer to be marked incorrect:
Do not round answers from R. Enter them as is.
Do not use commas to separate numbers in an answer. E.g., write 100000 NOT 100,000
Do not include units. E.g., 34.56 NOT $34.56
Wherever relevant, include the 0 before the decimal. E.g., state the answer as 0.34 NOT .34
Drop trailing 0s after the decimal. E.g., write 0.3 NOT 0.30
Academic Integrity
The responses on this assignment must be the product of your individual work. Copying and presenting the work of another as your own, or collaborating with others on this assignment, is an academic infraction punishable with a failing grade in this assignment or this course.
Question 1 (2 points)
Generally speaking, including a larger number of meaningful predictors will improve the quality of predictions. It is reasonable to expect the following predictors to influence tip paid: number of passengers (passenger_count), fare amount (fare_amount), hour of the day of the ride (pickup_hour), whether the trip occurred in the beginning, middle or end of the month (period_of_month), and day of the week for the trip (pickup_day). Use these variables in a multiple regression to predict tip_amount. Call this model5.
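A minimal sketch of how such a model could be fit with lm() is shown here. The helper names fit_model5 and significant_terms are hypothetical, and the 0.05 cutoff is an assumption, since the assignment does not state a significance level. Categorical predictors appear in the coefficient table as one row per dummy variable (for example, a name beginning with pickup_day).

```r
# Hypothetical helper: fit the five-predictor regression described above.
fit_model5 = function(taxi) {
  lm(tip_amount ~ passenger_count + fare_amount + pickup_hour +
       period_of_month + pickup_day, data = taxi)
}

# Hypothetical helper: names of coefficients whose p-value is below alpha.
# alpha = 0.05 is an assumed cutoff, not one given in the assignment.
significant_terms = function(model, alpha = 0.05) {
  p = coef(summary(model))[, 'Pr(>|t|)']
  names(p)[p < alpha]
}

# Only runs when the file is actually in the working directory.
if (file.exists('taxi.csv')) {
  taxi = read.csv('taxi.csv')
  model5 = fit_model5(taxi)
  print(summary(model5))             # full coefficient table
  print(significant_terms(model5))   # coefficients below the assumed cutoff
}
```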
Which of the following variables are significant predictors of tip_amount? Please note, a categorical predictor variable is statistically significant if even one of the dummy variables representing it is statistically significant. Select one or more correct answers.
Question 1 options:
period_of_month
passenger_count
pickup_day
pickup_hour
fare_amount
Question 2 (2 points)
In model5, which is the strongest predictor of tip_amount?
Question 2 options:
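One common way to compare predictor strength is to put the coefficients on a common scale. The sketch below computes standardized coefficients as b * sd(x) / sd(y) from a fitted lm object. This is only one possible definition of "strongest" and may not be the one the course intends; standardized_betas is a hypothetical helper, and it assumes the model has an intercept as its first coefficient.

```r
# Hypothetical helper: standardized coefficients for a fitted lm model.
# Assumes an intercept is the first coefficient and no coefficient is NA.
standardized_betas = function(model) {
  X = model.matrix(model)[, -1, drop = FALSE]   # predictors without intercept
  y = model.response(model.frame(model))        # outcome used in the fit
  coef(model)[-1] * apply(X, 2, sd) / sd(y)
}

# Usage, if model5 has already been fit in this session:
if (exists('model5')) {
  betas = standardized_betas(model5)
  print(sort(abs(betas), decreasing = TRUE))    # largest magnitude first
}
```

Each dummy variable of a categorical predictor gets its own standardized coefficient, so the comparison is between individual coefficients rather than whole categorical variables.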