Question

1 Approved Answer

Posted on Jun 12, 2024

Question 1

Often, we do statistical studies to find relationships between two or more variables that can help us to better predict future outcomes and perhaps make changes that will improve our lives. In the next activity, we will be focusing on studies and relationships involving two quantitative variables. In each dataset, the two variables will be linked because both observations will be measured from the same individual or unit. These types of linked data are called bivariate data and are often presented in scatterplots.

Bivariate data are defined as pairs of data values, where each pair consists of two different measurements that come from the same individual or unit.

Example

A teacher wonders if "number of absences per semester" is related to "academic performance" for students in her classes. She might look back on her class records from previous semesters and generate a dataset by observing both the final overall average grade and total number of missed classes for each student in a random sample of students. This is an example of a bivariate dataset.

When working with a bivariate dataset, there are two variables to consider:

- The explanatory variable ($x$) is the variable that is thought to explain or predict the response variable of a study.
- The response variable ($y$) measures the outcome of interest in the study. This variable is thought to depend in some way on the explanatory variable. It is often referred to as the "variable of interest" for the researcher. (And in previous math classes, this variable may have been referred to as the dependent variable.)

In this example, the outcome the teacher is most interested in is how well her students will do in her class, so the response variable is Overall Average Grade. The other variable, Number of Absences, is the explanatory variable.

Identifying explanatory and response variables can sometimes be difficult.
When trying to identify explanatory and response variables, make sure to carefully read the scenario and keep the following relationship in mind:

Explanatory  -- is used to predict (or calculate) (or determine) -->  Response

It is good practice to identify both variables and then ask, "Which one is the main outcome or focus of the study?" This variable will be the response variable, and the other variable will be your explanatory variable. It is not up to the researcher(s) to decide the main focus or outcome of a pre-existing study. Instead, researchers need to carefully read the context of the study to identify which variable is being used to explain (the explanatory variable) an outcome or response (the response variable).

True or False: The response variable can be thought of as the predicted variable or outcome.

- False
- True

Question 2

A researcher wonders if a new cancer treatment leads to a higher five-year survival rate for people diagnosed with a certain type of lung cancer. She creates an experiment where the experimental group gets the new treatment and the control group gets the traditional treatment. After five years, she gathers data on the people in each group to see which cancer patients survived and which did not.

Part A: Identify the explanatory variable. Select the best answer.
- Survival status of the patient after five years (survived or did not survive)
- Treatment status of the patient (control group or experimental group)
- Cancer status (diagnosed with cancer or no cancer)
- The years of study (1, 2, 3, 4, or 5)

Part B: Identify the response variable. Select the best answer.

- Survival status of the patient after five years (survived or did not survive)
- Cancer status (diagnosed with cancer or no cancer)
- The years of study (1, 2, 3, 4, or 5)
- Treatment status of the patient (control group or new treatment group)

Question 3

A method we will use to make predictions about missing observations or future observations in bivariate data is called Least Squares Regression (LSR) analysis. The language might seem intimidating at first, but the ideas are quite straightforward, especially with examples to illustrate each new term. For example, LSR analysis can also be described as linear modeling, where we determine the equation of a line of best fit to make predictions based on an existing dataset.

The line of best fit is simply the best line that describes the data points. For real data with natural deviations, the line cannot go through all of the points. In fact, very often, the line does not go through any of the data points. Since no line will be perfect, the best we can do is minimize its error. In this class, we will do this by minimizing the sum total of the squared vertical errors from all data points to the line. This is why the line of best fit is also called the Least Squares Regression Line (LSRL).

[Figure: scatterplot with the line of best fit; a vertical segment from one data point to the line is labeled as that point's residual.]

The vertical error associated with each data point is called the residual of that observation.
This error, illustrated by the length of the vertical line, represents how far off a prediction calculated from the line is compared to the actual, observed value; the longer the line, the greater the error associated with that particular observation.

Note: For data points that are above the line of best fit, the residuals are positive, and for data points that are below the line, the residuals are negative.

The equation for the line of best fit is very similar to one you may have seen in a previous math class:

$\hat{y} = a + bx$

where $\hat{y}$ is the general predicted value of the response variable (pronounced "y-hat"), $a$ is the estimated value of the y-intercept, and $b$ is the estimated slope.

While the actual process of finding the line of best fit might seem complicated, the concept of the line of best fit is very straightforward. We can use technology to take care of long and tedious calculations.

Which of the following questions could be explored using LSR analysis involving bivariate data? Select all that apply. Both variables must be quantitative.

- Could the number of cigarettes a person smokes per day be used to predict a person's lifespan?
- Is there an association between the type of pet people own and their level of general happiness?
- Does the amount of sleep we get per day have an impact on our weight?
- Does our race, ethnicity, and/or gender impact the likelihood that we will be treated fairly when seeking a loan, medical treatment, or pursuing an educational degree?

Question 4

Can we use LSR analysis to better understand the data generated from the experiment in Question 2? Select the best answer.

- Yes, if it is a well-designed experiment.
- No, because at least one of the variables is categorical.
- Yes, because LSR analysis can be used to understand and make better predictions for all datasets.
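The residual and least-squares ideas from Question 3 can be sketched in a few lines of Python. This is only an illustration: the four data points and the two candidate lines are made up for the example and do not come from the worksheet, and the function names are hypothetical. The sketch computes each residual as observed minus predicted, then compares the sum of squared residuals for two lines.

```python
# Hypothetical bivariate data (NOT from the worksheet): (x, y) pairs.
points = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.0), (4.0, 8.5)]


def predict(a, b, x):
    """Predicted response: y-hat = a + b*x."""
    return a + b * x


def residuals(a, b, data):
    """Residual = observed y minus predicted y-hat, one per data point.

    Positive for points above the line, negative for points below it.
    """
    return [y - predict(a, b, x) for x, y in data]


def sum_squared_residuals(a, b, data):
    """The quantity that the least squares regression line minimizes."""
    return sum(e ** 2 for e in residuals(a, b, data))


# Two candidate lines, each written as (intercept a, slope b).
line_1 = (0.0, 2.0)  # y-hat = 0 + 2x
line_2 = (1.0, 1.5)  # y-hat = 1 + 1.5x

for a, b in (line_1, line_2):
    print(f"y-hat = {a} + {b}x")
    print("  residuals:", residuals(a, b, points))
    print("  sum of squared residuals:", sum_squared_residuals(a, b, points))
```

For this toy dataset the first line gives the smaller sum of squared residuals, so by the least-squares criterion it is the better of the two candidates.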
A scientist gathered data on the striped ground cricket to see if ground temperature (measured in degrees Fahrenheit) can be predicted by the number of chirps the cricket makes per second (measured in number of wing vibrations per second).[1] After collecting the data, he could create a scatterplot to understand if there is a positive linear trend.

Part A: Can LSR analysis be used to examine these data? Select the best answer.

- Yes, because these are bivariate data and both variables are quantitative (and it does not matter that this was an observational study).
- No, because this is an observational study and not an experiment.

Part B: Identify the explanatory variable. Select the best answer.

- Number of crickets
- Time of day
- Number of chirps per second
- Ground temperature

Part C: Identify the response variable. Select the best answer.

- Number of chirps per second
- Ground temperature
- Time of day
- Number of crickets

The following is a table of the data the scientist collected[1] to help him answer his question.

Chirps per second    Temperature (degrees Fahrenheit)
20                   88.6
16                   71.6
19.8                 93.3
18.4                 84.3
17.1                 80.6
15.5                 75.2
14.7                 69.7
17.1                 82
15.4                 69.4
16.2                 83.3
15                   79.6
17.2                 82.6
16                   80.6
17                   83.5
14.4                 76.3

Go to the Linear Regression tool at https://dcmathpathways.shinyapps.io/LinearRegression/ and plot the data using the following steps:

- under "Enter Data," select "Enter Own";
- name the $x$ (explanatory) and $y$ (response) variables appropriately;
- copy and paste the data from the table (make sure the explanatory variable is in the first column and the response variable is in the second column);
- under "Plot Options," select "Regression Line"; and
- select "Submit Data."

Part D: Does the scatterplot look fairly linear?

- No
- Yes

Part E: What is the equation of the line of best fit?

Hint: Make sure to use proper notation. Don't forget your hat!

$\hat{y} =$ ____

Part F: What is the value of the correlation coefficient?

$r =$ ____

[1] Pierce, G. W. (1948). The Song of Insects. Harvard University Press.
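As a cross-check on the web tool, the same fit can be sketched in plain Python. This is a minimal sketch and not the worksheet's own method: it assumes the standard closed-form least-squares formulas (slope $b = S_{xy}/S_{xx}$, intercept $a = \bar{y} - b\bar{x}$, correlation $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$), and the function name `least_squares_line` is hypothetical. The data are the 15 pairs from the table, with chirps per second as the explanatory variable.

```python
import math

# Data from the table: chirps per second (x) and temperature in deg F (y).
chirps = [20, 16, 19.8, 18.4, 17.1, 15.5, 14.7, 17.1,
          15.4, 16.2, 15, 17.2, 16, 17, 14.4]
temps = [88.6, 71.6, 93.3, 84.3, 80.6, 75.2, 69.7, 82,
         69.4, 83.3, 79.6, 82.6, 80.6, 83.5, 76.3]


def least_squares_line(xs, ys):
    """Return (a, b, r) for the line y-hat = a + b*x.

    Uses the standard closed-form least-squares formulas; this is an
    assumed approach, since the worksheet relies on a web tool instead.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                    # estimated slope
    a = y_bar - b * x_bar            # estimated y-intercept
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r


a, b, r = least_squares_line(chirps, temps)
print(f"y-hat = {a:.3f} + {b:.3f}x")
print(f"r = {r:.3f}")
```

The printed values should agree with the tool's output up to rounding. Python 3.10 and later also provide `statistics.linear_regression` and `statistics.correlation`, which could replace the hand-written formulas.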