Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Lesson 9 Lab Assignment Your Lab assignment is to do all of the calculations that I did using the data found in the excel file
Lesson 9 Lab Assignment Your Lab assignment is to do all of the calculations that I did using the data found in the excel file (Lesson 9 Data). You will notice that the file contains that data that I analyzed (Group 1) and a new set of scores (Group A2, Group B2, and Group C2; found in cells A28 through C47). Your assignment is to replicate my analyses for the Group 1 data, and then conduct the same analyses for the Group 2 data. You will notice that something interesting happens with these two sets of data. You will be asked to explain these differences by filling out some questions found in rows 54 through 61. When you have finished with the analyses and answered the questions you should save the excel file as "LastName_Lab9Assignment" and then place it in the Lesson 9 Lab Assignment Dropbox. One-way Analysis of Variance This week's lesson commentary discussed an example of how we think about the comparison of two (or more) groups by examining the variability between the groups and the variability within each group. Now we need to take those concepts and look at how they are applied using data. For this week's Lab Assignment I will be asking you to make the calculations that I am making, first with the same example data set, and then with a different data set that you will need to work on alone. The data for the example and the Lab Assignment can be found here (Lesson 9 Data). It might be best to read through everything once, and then open the Excel file, but you can play along if you prefer. The example data set that I will be referring to is labeled as Group A1, Group B1, Group C1, and the data can be found in cells A2 through C21. Lesson 9 Lab Step 1 First, a couple of things you should notice. There are three treatment groups with 20 participants each (NTotal = 60), and there are a range of scores for participants in each of the groups. For our purposes the data represent scores on a DV that could range from 0 to 25. I have taken the liberty of arranging the data in size order, and you should be able to notice that all three groups have a similar Range of scores. The first thing that we should do is take a look at the descriptive statistics for the entire sample, and for each of the three groups. In column E I have typed Mean, StDev (standard deviation), Var (variance), and Range. At the top of column F I typed A1 (for GroupA1), then B1, and C1. Lesson 9 Lab Step 2 Now I want to compute the mean {=AVERAGE(A2:A21)} the standard deviation {=STDEV(A2:A21)}, the variability {=VAR(A2:A21)} and the Range {=A21-A2} (note that because the data are ordered from smallest to biggest I can get the range this way). You should get something that looks like this. Lesson 9 Lab Step 3 We see that the means seem to be different, but the standard deviations, variabilities and ranges seem to be pretty similar. So how can we figure out if the scores from the groups really are different? Sum of Squares One of the most important computations for statistics is called the sum of squares, which is short for the sum of squared deviations. The mean tells us where the mathematical middle of a distribution lays, but the sum of squares helps us to understand how the data deviate from that average (think standard deviation). The sum of squares (SS) is defined as: SS= (X X ) 2 SS=(XX) 2 or in plain English: each score minus the mean of the scores, squared and then added together. Total SS If we first think about our data in terms of all being from one sample (in other words, forget about treatment grouping for now) then we can calculate a mean, and the sum of squares for the entire group of 60 scores. The mean for the entire group is easy to calculate, it is equal to 7.5, but to calculate the SS we would need to subtract the mean from each score, square that value and then add them up. Luckily there is a shortcut. SS= X 2 ( X) 2 N SS= X 2 ( X) 2 N No? Am I losing you? Hang in there. If we go back to our data we need to calculate three things. To do this we go to cells A23, A24, and A25 and type Sum X, Sum X SQ, and Total SS. We want to do this so that we can keep all of our numbers labeled. Next we want to make the calculations that we need in cells B23, B24, and B25. The first is the Sum of all of the data from all three groups. In cell B23 we type, =SUM(A2:C21) this adds up all of the scores In cell B24 we type, =SUMSQ(A2:C21) This is going to take each of our scores, square it, and then add them all up. Finally in cell B25 we can put in the formula for SS. We type, =B24-((B23^2)/60) (watch out for all of the parentheses!) B24 is the sum of the squared X's, B23 is the sum of the X's and I added a carrot (that ^ thingy to tell Excel to raise the value to the 2nd power) and finally the 60 is the number of scores. You should get this: The "Total SS" represents a measure of how much the entire set of data deviates from the mean. What we need to figure out now is how much of that deviation is due to the differences between the three groups, and how much is due to the differences within each group. The key here is that the Total SS will be equal to the Within Groups SS plus the Between Groups SS (SSTotal = SSWithin + SSBetween). So we now know the Total SS, we just need to figure out one of the others, and then we can know all three. We will calculate the Within Groups SS. To do this we will do the same things we did for the Total SS, except that we will do it for each group independently. Within SS On row 22 in columns E, F, and G type, Group A1, Group B1, and Group C1. In column D type Sum, Sum SQ, and SS as shown to the left. Now we need to calculate the sum of all of the scores for the people in Group A1. This is, =SUM(A2:A21) Next we calculate the Sum of squared scores: =SUMSQ(A2:A21), and finally we type in our formula for the SS for Group A1: =E24-((E23^2)/20) Now here is a neat trick. If you highlight cells E23, E24, and E25 then click on the little black square and drag (without letting go of the clicker) over to cell G25 excel will copy the formulas that you created with the data in column A but it will use the data from column B and C. You should get this: Image Alternative This has given us the SS for each of the three groups. The within group SS is just these three added up (846 + 809 + 838 = 2493). Because SSTotal = SSWithin + SSBetween we can do the math and see that 2743 = 2493 + SSBetween tells us that SSBetween = 250. I will put this calculation in cell J25 =SUM(E25:G25), with the appropriate label in cell I25. Between SS Because we know the Total SS and the Within SS we can figure out the Between SS, but for the sake of being complete, here is the formula for Between SS. SS= The i N N i ( X i X total ) 2 Ni is the number of scores for a group, the X i Xi is the mean for a particular group, and the SS= X Ni(XiX total total ) 2 X total is the mean for all scores. The idea is that we want to add up how much each group differs from the average for the whole data set, while weighting that deviation by multiplying it by the number of scores within each group. This only becomes an issue when we have groups with different numbers of scores. For our data the Between SS would be: SS=20 (57.5) 2 +20 (7.57.5) 2 +20 (107.5) 2 SS=206.25+200+206.25 SS=125+0+125 SS=250 SS=20( 5 7.5 ) 2 +20( 7.5 7.5 ) 2 +20( 10 7.5 ) 2 SS=206.25+200+206.25 SS=125+0+125 SS=250 You could enter this equation into excel but it's simple enough for a calculator, so we will move on after we enter the Between SS into cell M25 We are most of the way to getting our ANOVA calculated. So hang in there a Degrees of Freedom little longer. You should recall from the textbook that an F-ratio is actual a measure of the between groups variance divided by the within groups variance. The formula for variance (symbolized by s 2 s 2 ) is... s 2 = SS df s 2 =SSdf We have already figured out how to calculate the SS part, the next thing we need to worry about is the df part. Degrees of freedom (df) are based on the idea that I only need to know some of the information if I know the total. Let me put it to you this way: If I know that I have five numbers that need to add up to equal 100, then four of those numbers can be whatever I want, as long as the fifth number can be used to get me back to 100. So my numbers could be (20+20+20+20+x=100) ( 20 + 20 + 20 + 20 +x= 100 ) In this case x must equal 20 to make the equation work. The numbers could also be (13+45+23012+6+x=100) ( 13 +45 + 23012 + 6 +x= 100 ) . In this case x must be equal to -22886 for the equation to work. In either case I have the freedom to let four of the numbers be anything as long as the fifth number can be used to make the equation work. So when we talk about degrees of freedom in statistics we are really saying that we need to figure out how many scores there are to work with, and then the degrees of freedom will be that Degrees of Freedom Within number minus one. We have three different SS to think about. The degrees of freedom within is based on the number of scores minus one for each group, and then added together. In other words, there are 20 scores for Group A1, so the df for this group is (201=19) ( 20 1 = 19 ) , and the same is true for Group B1, and Group C1. So 19+19+19=57 19 + 19 + 19 = 57 . Another way this is often presented is: d f within =Nk with N being the total number of scores, and k bein g the number of groups. df within =Nk with N being the total number of scores, and k being the number of groups. In this case we have 60 scores and 3 groups, so (603=57). Degrees of Freedom Between The Between SS looked at the averages for each group, so the degrees of freedom are actually based on the number of groups minus one. Usually written as: d f between =k1 with k being the number of groups. df between =k 1 with k being the number of groups. In this case we have three groups, so (31=2) ( 3 1 = 2 ) The F-ratio . If the F-ratio is equal to the between groups variance divided by the within groups variance, and if variance is calculated by taking the SS and dividing it by df then we are almost there. We just need to call the variances Mean Squares (MS). F= M S between M S within with M S between = SS between d f between and M S within = S S within d f within F=MS between MS within with MS between =SS between df between and MS within =SS within df within I know, you are about to lose it. There is a reason we use this language. It is to point out that the variability we are talking about is actually only an estimate based on our sample. Statisticians are particular about these things. Think of it this way; our SS gives us a measure of how much a group of scores varies, and by dividing by the degrees of freedom we are creating an average (mean) SS score. Anyway the good news is that we already have all of this stuff figured out, so we just need to plug in some numbers to finally get our F-value. If we go to cells E8, E9, and E11 we want to type in some labels (MS bet, MS with, and F-value). I chose these cells arbitrarily, but just do the same thing so you can follow along. I want to calculate the Mean Squares Between (MS bet) so in cell F8 I type =M25/2 because I put the SS between in cell M25, and we know that the df between is equal to 2. To calculate the Mean Squares Within (MS with) I go to cell F9 and type =J25/57 because I put the SS within in cell J25, and we know that the df within is equal to 57. Lastly, to calculate the F-value I go to cell F11 and type =F8/F9 After all of that work I finally get an F-value the equals 2.858. What does that mean? Well, it depends. In the back of your textbook (Appendix E) you will find a set of statistical tables. We are interested in the table called Critical Values of F. This table tells you the minimum value of F for your test to be significant when alpha is 0.05. If an F-value is less than the critical F, it will not be significant, and we can't reject the null hypothesis. If the value is equal to, or greater than the critical F value, then the test is significant and you can reject the null hypothesis. You will notice that this table runs on for several pages. This is because the value of F is really dependent upon the degrees of freedom for your analyses. In our case we had 2 df for the between variance part, (the numerator) and 57 df for the within variance part (the denominator) of our ratio. On the table you will see across the top numbers that represent the numerator df and along the side numbers that represent the denominator df. After a bit of searching you will notice that not all values are represented in the table. In particular, you will be able to find a column for 2 df in the numerator, but you can only find 55 df (3.17) or 60 df (3.15) for the denominator. That is OK. We know that the critical value of F will be between these numbers, and in this case our F-value of 2.858 is not as big as it needs to be. This means that we cannot reject the null hypothesis (at the alpha = .05 level) that the scores for the three groups are different from one another. To put it in conceptual terms, the between groups variance is not big enough, and/or the within groups variance is too big for us to feel confident that such scores could not happen by chance. That is right, we went through all of that work, and the results are that we did not find a significant difference. GroupA1 GroupB1 GroupC1 0 0 1 0 0 2 0 1 2 0 2 3 0 2 3 0 3 5 1 3 7 1 3 7 1 4 8 1 5 8 2 6 9 2 7 10 2 9 11 5 9 13 7 10 15 13 13 17 14 16 17 15 18 19 15 19 21 21 20 22 GroupA2 GroupB2 GroupC2 0 0 1 0 7 2 0 1 10 5 6 11 6 2 6 5 8 7 4 3 7 1 3 7 1 4 8 1 5 8 2 7 9 2 7 10 5 9 11 5 9 13 7 10 15 13 13 17 11 15 17 9 13 14 10 15 13 13 13 14 Did you find a significant F-value for the Group 2 data? Answer: What is the same about the data for Group 1 and 2? Answer: What is different about the data for Group 1 and 2?
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started