Answered step by step
Verified Expert Solution
Question
1 Approved Answer
only answer 2b 1. Explain whether each scenario is a classification or regression problem, and indicate whether we are most interested in inference or prediction.
only answer 2b
1. Explain whether each scenario is a classification or regression problem, and indicate whether we are most interested in inference or prediction. Finally, provide the sample size and the number of variables for each scenario. (a) We are trying to figure out the factors potentially leading to cancer. For this, we collect data on 200 patients that either had or didn't have cancer. In particular, we record their body measurements (weight, height etc), heart rate, blood sugar level, family history of disease - measuring twelve variables. On top of that, we conduct a survey on exercise, eating and drinking habits, adding another three variables. (b) We dig into UH student database and obtain data on 1500 students. We are interested in figuring out the factors affecting the final year GPA of a student depending on the data from the entrance exams, high school and their first year at UH. We extract students' SAT scores, high school GPA, first year GPA at UH, major, age and other five variables. (c) We would like to predict the outcome of a college football game (not the exact score, but simply who wins, team A or team B?) depending on various factors. Assuming that it is mid-season, we obtain the data on all 300 games that have already been played and study such variables as: yards gained/allowed, touch- downs scored/allowed, turnovers committed/forced, whether the game is at home or away, whether it rains or not (all-in-all eight variables). (b) We know that general model formula is Y = f(x) +, (1) and we try to estimate true f with f. For each of examples (a), (b), (c) from Problem 1, proceed to answer the following: i. Can our estimate f be treated as a black box? ii. Why/Why not? 1. Explain whether each scenario is a classification or regression problem, and indicate whether we are most interested in inference or prediction. Finally, provide the sample size and the number of variables for each scenario. (a) We are trying to figure out the factors potentially leading to cancer. For this, we collect data on 200 patients that either had or didn't have cancer. In particular, we record their body measurements (weight, height etc), heart rate, blood sugar level, family history of disease - measuring twelve variables. On top of that, we conduct a survey on exercise, eating and drinking habits, adding another three variables. (b) We dig into UH student database and obtain data on 1500 students. We are interested in figuring out the factors affecting the final year GPA of a student depending on the data from the entrance exams, high school and their first year at UH. We extract students' SAT scores, high school GPA, first year GPA at UH, major, age and other five variables. (c) We would like to predict the outcome of a college football game (not the exact score, but simply who wins, team A or team B?) depending on various factors. Assuming that it is mid-season, we obtain the data on all 300 games that have already been played and study such variables as: yards gained/allowed, touch- downs scored/allowed, turnovers committed/forced, whether the game is at home or away, whether it rains or not (all-in-all eight variables). (b) We know that general model formula is Y = f(x) +, (1) and we try to estimate true f with f. For each of examples (a), (b), (c) from Problem 1, proceed to answer the following: i. Can our estimate f be treated as a black box? ii. Why/Why not
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started