Question
library(tidyverse)
library(broom)
library(Lahman)
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance/G)
Question 1b
Use number of wins to predict average attendance; do not normalize for number of games.
For every game won in a season, how much does average attendance increase?
Suppose a team won zero games in a season.
Predict the average attendance.
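One way to approach Question 1b is a simple linear regression of average attendance on wins. The sketch below assumes the Lahman package is installed; the name fit_wins is illustrative and not part of the original question. The W coefficient is the change in average attendance per additional win, and the intercept is the model's prediction for a team with zero wins.

```r
library(tidyverse)
library(Lahman)

# Rebuild the dataset defined in the question
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Regress average attendance on raw season wins (not normalized by games)
fit_wins <- lm(avg_attendance ~ W, data = Teams_small)

# "(Intercept)" = predicted attendance at zero wins; "W" = increase per win
coef(fit_wins)
```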
Question 1c
Use year to predict average attendance.
How much does average attendance increase each year?
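Question 1c can be sketched the same way, with yearID as the only predictor. The name fit_year is an assumption made for illustration.

```r
library(tidyverse)
library(Lahman)

# Rebuild the dataset defined in the question
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Regress average attendance on season year
fit_year <- lm(avg_attendance ~ yearID, data = Teams_small)

# The "yearID" coefficient is the estimated yearly change in average attendance
coef(fit_year)
```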
Question 3
Stratify Teams_small by wins: divide the number of wins by 10 and then round to the nearest integer. Keep only strata 5 through 10, which have 20 or more data points.
Use the stratified dataset to answer this three-part question.
Question 3a
How many observations are in win stratum 8?
(Note that due to division and rounding, these teams have 75-85 wins.)
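The stratification step might look like the sketch below. The names dat and win_strata are assumptions, not names given in the question. R's round() rounds halves to the even digit, so both 7.5 and 8.5 become 8, which is consistent with the 75-85 win range noted above.

```r
library(tidyverse)
library(Lahman)

# Rebuild the dataset defined in the question
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Assign each team-season to a win stratum and keep strata 5 through 10
dat <- Teams_small %>%
  mutate(win_strata = round(W / 10)) %>%
  filter(win_strata >= 5, win_strata <= 10)

# Number of observations per stratum; the row for win_strata == 8 answers 3a
strata_counts <- dat %>% count(win_strata)
strata_counts
```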
Question 3b
3b.1 Calculate the slope of the regression line predicting average attendance given runs per game for each of the win strata.
Which win stratum has the largest regression line slope?
3b.2 Calculate the slope of the regression line predicting average attendance given HR per game for each of the win strata.
Which win stratum has the largest regression line slope?
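A possible sketch for Question 3b fits one regression per stratum and collects the slopes with broom::tidy(). The column names R_per_game, HR_per_game, and win_strata, and the result names slopes_R and slopes_HR, are assumptions chosen for this example.

```r
library(tidyverse)
library(broom)
library(Lahman)

# Rebuild the stratified dataset and add per-game rates
dat <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G,
         win_strata = round(W / 10)) %>%
  filter(win_strata >= 5, win_strata <= 10)

# 3b.1: slope of avg_attendance on runs per game within each stratum
slopes_R <- dat %>%
  group_by(win_strata) %>%
  group_modify(~ tidy(lm(avg_attendance ~ R_per_game, data = .x))) %>%
  ungroup() %>%
  filter(term == "R_per_game") %>%
  select(win_strata, estimate) %>%
  arrange(desc(estimate))

# 3b.2: slope of avg_attendance on home runs per game within each stratum
slopes_HR <- dat %>%
  group_by(win_strata) %>%
  group_modify(~ tidy(lm(avg_attendance ~ HR_per_game, data = .x))) %>%
  ungroup() %>%
  filter(term == "HR_per_game") %>%
  select(win_strata, estimate) %>%
  arrange(desc(estimate))

# The first row of each table is the stratum with the largest slope
slopes_R
slopes_HR
```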
Question 4
Fit a multivariate regression determining the effects of runs per game, home runs per game, wins, and year on average attendance. Use the original Teams_small wins column, not the win strata from question 3.
4.1 What is the estimate of the effect of runs per game on average attendance?
4.2 What is the estimate of the effect of home runs per game on average attendance?
4.3 What is the estimate of the effect of number of wins in a season on average attendance?
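The model for Question 4 could be fitted as in the following sketch. The names fit, R_per_game, and HR_per_game are assumptions introduced here; the estimate column of the tidy() output holds the three requested effects.

```r
library(tidyverse)
library(broom)
library(Lahman)

# Rebuild the dataset and add the per-game predictors
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# Regress average attendance on runs/game, HR/game, wins, and year together
fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# One row per term; "estimate" is the effect holding the other predictors fixed
tidy(fit)
```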
Question 5
Use the multivariate regression model from Question 4. Suppose a team averaged 5 runs per game, 1.2 home runs per game, and won 80 games in a season.
5.1 What would this team's average attendance be in 2002?
5.2 What would this team's average attendance be in 1960?
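For Question 5, predict() can evaluate the Question 4 model on a hypothetical team. This sketch refits the model so that it runs on its own; fit, new_teams, and the per-game column names are illustrative assumptions. Note that 1960 and 2002 both fall outside the 1961-2001 fitting window, so these are extrapolations.

```r
library(tidyverse)
library(Lahman)

# Refit the Question 4 model so this example is self-contained
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)
fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# The hypothetical team from the question, once for each year
new_teams <- data.frame(R_per_game = 5,
                        HR_per_game = 1.2,
                        W = 80,
                        yearID = c(2002, 1960))

# First value answers 5.1 (2002), second answers 5.2 (1960)
preds <- predict(fit, newdata = new_teams)
preds
```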
Question 6
Use your model from Question 4 to predict average attendance for teams in 2002 in the original Teams data frame.
What is the correlation between the predicted attendance and actual attendance?
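Question 6 might be approached by building the same predictors for the 2002 rows of Teams and correlating predictions with observed values. The names teams_2002, predicted, and r are assumptions made for this sketch.

```r
library(tidyverse)
library(Lahman)

# Refit the Question 4 model so this example is self-contained
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)
fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# Build the same columns for the 2002 season from the original Teams table
teams_2002 <- Teams %>%
  filter(yearID == 2002) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# Correlation between predicted and actual average attendance in 2002
predicted <- predict(fit, newdata = teams_2002)
r <- cor(predicted, teams_2002$avg_attendance)
r
```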