Question

Game attendance in baseball varies partly as a function of how well a team is playing.

Load the Lahman library. The Teams data frame contains an attendance column. This is the total attendance for the season. To calculate average attendance, divide by the number of games played, as follows:

library(tidyverse)
library(broom)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance/G)

1) Use runs (R) per game to predict average attendance.

For every 1 run scored per game, average attendance increases by how much?

2) Use home runs (HR) per game to predict average attendance.

For every 1 home run hit per game, average attendance increases by how much?
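The two single-predictor fits above could be sketched as follows. This is only an illustration, assuming the Lahman and tidyverse packages are installed; the column names R_per_game and HR_per_game and the object names fit_runs and fit_hr are hypothetical and do not come from the question.

```r
library(tidyverse)
library(Lahman)

# Same subset as in the question, plus two per-game rates.
# R_per_game and HR_per_game are hypothetical helper columns.
Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# One simple linear regression per predictor.
fit_runs <- lm(avg_attendance ~ R_per_game, data = Teams_small)
fit_hr   <- lm(avg_attendance ~ HR_per_game, data = Teams_small)

# The slope is the change in average attendance per one-unit
# increase in the predictor.
coef(fit_runs)["R_per_game"]
coef(fit_hr)["HR_per_game"]
```

Each slope is read directly from the fitted coefficients, so no manual calculation is needed.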

3) Use number of wins to predict average attendance; do not normalize for number of games.

For every game won in a season, how much does average attendance increase?

4) Suppose a team won zero games in a season.

Predict the average attendance.
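One possible way to approach the wins model and the zero-win prediction is sketched below, assuming the Lahman and tidyverse packages are available. The name fit_wins is hypothetical. With W as the only predictor, the prediction at zero wins is simply the model intercept.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Raw season wins (W) as the predictor, not normalized by games.
fit_wins <- lm(avg_attendance ~ W, data = Teams_small)

# Slope: change in average attendance per additional win.
coef(fit_wins)["W"]

# Prediction for a team with zero wins; this equals the intercept.
zero_win_pred <- predict(fit_wins, newdata = data.frame(W = 0))
zero_win_pred
```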

5) Use year to predict average attendance.

How much does average attendance increase each year?
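The year trend can be estimated the same way. This is a minimal sketch, assuming the Lahman and tidyverse packages are installed; fit_year is a hypothetical name.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# yearID is the season year column in the Teams data frame.
fit_year <- lm(avg_attendance ~ yearID, data = Teams_small)

# Slope: estimated change in average attendance per year.
coef(fit_year)["yearID"]
```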

6) Game wins, runs per game and home runs per game are positively correlated with attendance. We saw in the course material that runs per game and home runs per game are correlated with each other. Are wins and runs per game or wins and home runs per game correlated?

What is the correlation coefficient for wins and runs per game?

What is the correlation coefficient for wins and home runs per game?
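The two correlation coefficients could be computed roughly like this, assuming the Lahman and tidyverse packages are installed. The names cor_W_R and cor_W_HR are hypothetical.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Pearson correlation of season wins with each per-game rate.
cors <- Teams_small %>%
  summarize(cor_W_R  = cor(W, R / G),
            cor_W_HR = cor(W, HR / G))
cors
```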

Stratify Teams_small by wins: divide number of wins by 10 and then round to the nearest integer. Keep only strata 5 through 10, which have 20 or more data points.

Use the stratified dataset to answer this three-part question.

1) How many observations are in win stratum 8?

2) Calculate the slope of the regression line predicting average attendance given runs per game for each of the win strata.

Which win stratum has the largest regression line slope?

3) Calculate the slope of the regression line predicting average attendance given HR per game for each of the win strata.

Which win stratum has the largest regression line slope?
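The stratified analysis might look something like the sketch below, assuming the Lahman and tidyverse packages are installed. The names dat, win_strata, R_per_game, HR_per_game, strata_counts and strata_slopes are hypothetical. The per-stratum slope is computed from the simple-regression identity slope = cor(x, y) * sd(y) / sd(x), which gives the same value as fitting lm within each stratum.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G)

# Build the win strata and keep strata 5 through 10.
dat <- Teams_small %>%
  mutate(win_strata = round(W / 10),
         R_per_game = R / G,
         HR_per_game = HR / G) %>%
  filter(win_strata >= 5 & win_strata <= 10)

# Number of observations in each stratum.
strata_counts <- dat %>% count(win_strata)
strata_counts

# Regression slope of average attendance on each rate, per stratum.
strata_slopes <- dat %>%
  group_by(win_strata) %>%
  summarize(
    slope_R  = cor(R_per_game, avg_attendance) *
      sd(avg_attendance) / sd(R_per_game),
    slope_HR = cor(HR_per_game, avg_attendance) *
      sd(avg_attendance) / sd(HR_per_game)
  )
strata_slopes
```

Sorting strata_slopes by slope_R or slope_HR in descending order shows which stratum has the largest slope for each predictor.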

4) Fit a multivariate regression determining the effects of runs per game, home runs per game, wins, and year on average attendance. Use the original Teams_small wins column, not the win strata from question 3.

What is the estimate of the effect of runs per game on average attendance?

What is the estimate of the effect of home runs per game on average attendance?

What is the estimate of the effect of number of wins in a season on average attendance?
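A multivariate fit along these lines is one way to obtain all three estimates at once. It assumes the Lahman, tidyverse and broom packages are installed; R_per_game, HR_per_game, fit and estimates are hypothetical names.

```r
library(tidyverse)
library(broom)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# All four predictors enter one linear model; W is the raw wins column.
fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# tidy() returns one row per term with its estimate and standard error.
estimates <- tidy(fit)
estimates
```

The estimate column in the rows for R_per_game, HR_per_game and W holds the effect of each predictor with the others held fixed.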

5) Use the multivariate regression model from Question 4. Suppose a team averaged 5 runs per game, 1.2 home runs per game, and won 80 games in a season.

What would this team's average attendance be in 2002 and in 1960?
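Predictions for the hypothetical team can be obtained with predict(), as in the sketch below. It assumes the Lahman and tidyverse packages are installed; fit, new_team and preds are hypothetical names, and the model is refit here so the example runs by itself. Note that 1960 and 2002 both lie outside the 1961 to 2001 range used for fitting, so both predictions are extrapolations.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# Same team profile in two different years.
new_team <- data.frame(R_per_game = 5,
                       HR_per_game = 1.2,
                       W = 80,
                       yearID = c(2002, 1960))

preds <- predict(fit, newdata = new_team)
preds
```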

6) Use your model from Question 4 to predict average attendance for teams in 2002 in the original Teams data frame.

What is the correlation between the predicted attendance and actual attendance?
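The out-of-sample check could be sketched as follows, assuming the Lahman and tidyverse packages are installed. The names fit, teams_2002, preds and pred_cor are hypothetical, and the model is refit here so the example stands alone.

```r
library(tidyverse)
library(Lahman)

Teams_small <- Teams %>%
  filter(yearID %in% 1961:2001) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

fit <- lm(avg_attendance ~ R_per_game + HR_per_game + W + yearID,
          data = Teams_small)

# Build the same predictors for the 2002 season from the full Teams table.
teams_2002 <- Teams %>%
  filter(yearID == 2002) %>%
  mutate(avg_attendance = attendance / G,
         R_per_game = R / G,
         HR_per_game = HR / G)

# Predict, then compare predictions with the observed values.
preds <- predict(fit, newdata = teams_2002)
pred_cor <- cor(preds, teams_2002$avg_attendance)
pred_cor
```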
