Question

The Prostate Dataset

The prostate dataset comes from a study on 97 men with prostate cancer who were due to receive radical prostatectomy.

The data contain the following variables:

  • lcavol: log(cancer volume in cm³)
  • lweight: log(prostate weight in gm)
  • age: age in years
  • lbph: log(benign prostatic hyperplasia amount)
  • svi: seminal vesicle invasion
  • lcp: log(capsular penetration)
  • gleason: Gleason score
  • pgg45: percentage Gleason scores 4 or 5
  • lpsa: log(prostate specific antigen in ng/mL)

Question 1

Validate that the prostate data frame contains 97 observations. Hint: First install the faraway package (if you haven't already) as instructed in Lesson 1, Slide 49. The following R statement will load the prostate data frame:

data("prostate", package = "faraway")

Use the nrow() function to see how many observations (rows) the data frame has. For example, the following statement prints the number of observations in the cars data frame: nrow(cars)
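
A minimal sketch of this check, assuming the faraway package is already installed; the variable name nObs is only illustrative:

```r
# Assumption: faraway is installed, e.g. via install.packages("faraway")
data("prostate", package = "faraway")

# Count the rows (observations) in the prostate data frame
nObs <- nrow(prostate)
print(nObs)
```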

Question 2

Calculate descriptive statistics of each of the variables. Hint: Use the summary() function. For example: summary(cars).
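
Applied to the prostate data, the hint might look like the sketch below. It assumes the faraway package is installed; the name prostateStats is hypothetical.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")

# Minimum, quartiles, median, mean and maximum for every column
prostateStats <- summary(prostate)
print(prostateStats)
```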

Question 3

Create a new data frame that includes the following variables: lcavol, lweight, age and lpsa. Use this new data frame for all questions below.

Hint: In the following example, we select two variables (agegp and alcgp) from the esoph data frame and name the new data frame esophSubDf:

esophSubDf <- esoph[c("agegp", "alcgp")]

Question 4

Calculate descriptive statistics of each of the variables using the new data frame.
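
One possible way to build the subset and summarise it, following the esophSubDf pattern from the hint. The name prostateSubDf is an assumption chosen for illustration, and the faraway package is assumed to be installed.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")

# Keep only the four variables of interest (hypothetical name prostateSubDf)
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]

# Descriptive statistics for the reduced data frame
print(summary(prostateSubDf))
```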

Question 5

Create a scatter plot matrix for all the variables using the new data frame.

Hint: Use the pairs() function (see Lesson 2, Slide 50).
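
A short sketch of the scatter plot matrix, assuming faraway is installed and using the illustrative name prostateSubDf for the four-variable subset; the plot title is also just an example:

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]

# One panel per pair of variables
pairs(prostateSubDf, main = "Prostate data: scatter plot matrix")
```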

Question 6

Create a (Pearson) correlation matrix for all the variables. Hint: Use the cor() function (see Lesson 2, Slide 48).

Question 7

Show the same matrix again, but round the correlations (use two decimal places).

Hint: Use the round() function. The following example calculates the correlation matrix for the cars data frame and rounds the numbers: round(cor(cars),2)
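
The correlation matrix and its rounded version could be produced as sketched here. cor() uses the Pearson method by default; it is spelled out only for clarity. The names prostateSubDf, corMatrix and corRounded are hypothetical, and faraway is assumed to be installed.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]

# Pearson correlation matrix for all four variables
corMatrix <- cor(prostateSubDf, method = "pearson")
print(corMatrix)

# Same matrix rounded to two decimal places
corRounded <- round(corMatrix, 2)
print(corRounded)
```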

Question 8

Create a regression model: The predictor variable (X) should be lpsa. The outcome variable (Y) should be lcavol. Show the summary of the model.

Hint: Use the lm() and summary() functions (see Lesson 2, Slide 51).
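
A minimal sketch of the simple regression, assuming faraway is installed. In R's formula notation the outcome goes on the left of ~ and the predictor on the right. The names prostateSubDf and fitLpsa are illustrative.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]

# Outcome (Y) = lcavol, predictor (X) = lpsa
fitLpsa <- lm(lcavol ~ lpsa, data = prostateSubDf)

# Coefficients, standard errors, R-squared, etc.
print(summary(fitLpsa))
```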

Question 9

Visualize the two variables and the model you just created by doing the following:

Create a scatter plot. Put lcavol on the y-axis and lpsa on the x-axis. Include the regression line and label the axes.

Hint: See Lesson 2, Slide 52.
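
One way to draw the plot with base R graphics is sketched below; the course slide may use a different approach. The axis labels, colour and object names are assumptions, and faraway is assumed to be installed.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]
fitLpsa <- lm(lcavol ~ lpsa, data = prostateSubDf)

# Scatter plot with lpsa on the x-axis and lcavol on the y-axis
plot(prostateSubDf$lpsa, prostateSubDf$lcavol,
     xlab = "lpsa: log(prostate specific antigen)",
     ylab = "lcavol: log(cancer volume)",
     main = "lcavol vs. lpsa")

# Overlay the fitted regression line
abline(fitLpsa, col = "red")
```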

Question 10

Update the regression model by adding a second predictor: age. Show the regression model summary.

Hint: See Lesson 2, Slide 53.
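
A sketch of the two-predictor model, assuming faraway is installed. Additional predictors are joined with + on the right-hand side of the formula. The names prostateSubDf and fitLpsaAge are illustrative.

```r
# Assumption: faraway is installed
data("prostate", package = "faraway")
prostateSubDf <- prostate[c("lcavol", "lweight", "age", "lpsa")]

# Outcome lcavol, predictors lpsa and age
fitLpsaAge <- lm(lcavol ~ lpsa + age, data = prostateSubDf)
print(summary(fitLpsaAge))
```

If a single-predictor model object already exists, update(model, . ~ . + age) is an alternative way to add the term.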
