ID Salary 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 65 1 27 36 1 61 4 46 6 73 3 39 7 23 2 74 6 22 8 23 4 62 40 7 23 24 51 1 69 3 36 24 2 35 5 74 1 52 9 22 54 2 24 5 22 5 43 1 75 7 73 7 48 23 4 27 3 62 5 27 2 23 6 22 7 23 7 60 35 1 23 7 42 4 Compa Midpoint 1 142 0 870 1 166 1 078 0 971 1 094 0 992 1 007 1 114 0 990 1 019 1 088 1 017 0 999 1 043 1 277 1 216 1 162 1 053 1 144 1 106 1 103 0 955 1 130 1 065 0 979 1 078 1 130 1 100 1 000 1 016 0 880 1 096 0 876 1 027 0 985 1 032 1 052 1 132 1 030 1 060 57 31 31 57 48 67 40 23 67 23 23 57 40 23 23 40 57 31 23 31 67 48 23 48 23 23 40 67 67 48 23 31 57 31 23 23 23 57 31 23 40 Age 34 52 30 42 36 36 32 32 49 30 41 52 30 32 32 44 27 31 32 44 43 48 36 30 41 22 35 44 52 45 29 25 35 26 23 27 22 45 27 24 25 Performance Service Gender Rating 85 80 75 100 90 70 100 90 100 80 100 95 100 90 80 90 55 80 85 70 95 65 65 75 70 95 80 95 95 90 60 95 90 80 90 75 95 95 90 90 80 8 7 5 16 16 12 8 9 10 7 19 22 2 12 8 4 3 11 1 16 13 6 6 9 4 2 7 9 5 18 4 4 9 2 4 3 2 11 6 2 5 0 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 1 1 0 1 0 1 1 1 0 1 0 1 0 0 1 0 0 0 1 1 1 0 1 0 0 Raise Degree 5 7 3 9 3 6 5 5 5 7 4 5 5 7 5 8 4 4 7 4 8 4 5 4 7 6 4 9 5 7 3 5 6 4 6 4 8 6 3 3 8 3 3 3 8 4 6 2 3 9 4 4 5 4 4 3 3 9 5 6 5 5 4 9 5 3 4 3 6 2 4 5 5 5 6 3 4 3 0 0 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 0 0 42 43 44 45 46 47 48 49 50 22 5 75 8 62 1 51 4 57 1 62 7 69 60 63 8 0 978 1 131 1 090 1 071 1 001 1 100 1 210 1 053 1 120 23 67 57 48 57 57 57 57 57 32 42 45 36 39 37 34 41 38 100 95 90 95 75 95 90 95 80 8 20 16 8 20 5 11 21 12 1 1 0 1 0 0 1 0 0 5 7 5 5 5 2 5 2 3 9 5 5 5 3 6 6 4 6 1 0 1 1 1 1 1 0 0 Gender 1 Gr M M F M M M F F M F F M F F F M F F M F M F F F M F M F M M F M M M F F F M F M M E B B E D F C A F A A E C A A C E B A B F D A D A A C F F D A B E B A A A E B A C The ongoing question that the weekly assignments will focus on is Are males and females p Note to simplfy the 
analysis, we will assume that jobs within each grade comprise equal work.

The column labels in the table mean:
ID: Employee sample number
Salary: Salary in thousands
Age: Age in years
Performance Rating: Appraisal rating (employee evaluation score)
Service: Years of service (rounded)
Gender: 0 = male, 1 = female
Midpoint: salary grade midpoint
Raise: percent of last raise
Grade: job pay grade
Degree: 0 = BS/BA, 1 = MS
Gender1: Male or Female
Compa: salary divided by midpoint

Gender1 values for IDs 42 to 50: F F M F M M F M M
Grade values for IDs 42 to 50: A F E D E E E E E

Are males and females paid the same for equal work (under the Equal Pay Act)?

This assignment covers the material presented in weeks 1 and 2.

Six Questions

Before starting this assignment, make sure the assignment data from the Employee Salary Data Set file is copied over to this Assignment file.
You can do this either by a copy and paste of all the columns or by opening the data file, right clicking on the Data tab, selecting Move or Copy, and copying the entire sheet to this file (Weekly Assignment Sheet or whatever you are calling your master assignment file).
It is highly recommended that you copy the data columns (with labels) and paste them to the right so that whatever you do will not disrupt the original data values and relationships.

To ensure full credit for each question, you need to show how you got your results. For example, Question 1 asks for several data values. If you obtain them using descriptive statistics, then the cells should have an =XX formula in them, where XX is the column and row number showing the value in the descriptive statistics table. If you choose to generate each value using fx functions, then each function should be located in the cell and the location of the data values should be shown. So, Cell D31 as an example should contain something like =T6 or =average(T2:T26). Having only a numerical value will not earn full credit.
The reason for this is to allow instructors to provide feedback on Excel tools if the answers are not correct; we need to see how the results were obtained.

In starting the analysis on a research question, we focus on overall descriptive statistics and seeing if differences exist. Probing into reasons and mitigating factors is a follow up activity.

1. The first step in analyzing data sets is to find some summary descriptive statistics for key variables. Since the assignment problems will focus mostly on the compa ratios, we need to find the mean,
standard deviations, and range for our groups: Males, Females, and Overall.
Sorting the compa ratios into male and females will require you to copy and paste the Compa ratio and Gender1 columns, and then sort on Gender1.
The values for age, performance rating, and service are provided for you for future use, and if desired to test your approach to the compa ratio answers (see if you can replicate the values).
You can use either the Data Analysis Descriptive Statistics tool or the Fx average and stdev functions.
The range can be found using the difference between the max and min functions with Fx functions or from Descriptive Statistics.
Suggestion: Copy and paste the compa ratio data to the right (Column T) and gender data in column U.
If you use Descriptive Statistics, place the output table in row 1 of a column to the right.
If you did not use Descriptive Statistics, make sure your cells show the location of the data (Example: =average(T2:T51)).

             Overall                      Female                       Male
             Mean   Std Deviation  Range  Mean   Std Deviation  Range  Mean   Std Deviation  Range
Comparatio
Age          35.7   8.2513         30     32.5   6.9            26.0   38.9   8.4            28.0
Perf Rat     85.9   11.4147        45     84.2   13.6           45.0   87.6   8.7            30.0
Service      9.0    5.7177         21     7.9    4.9            18.0   10.0   6.4            21.0

Note: remember the data is a sample from the larger company population.

A key issue in comparing data sets is to see if they are distributed (shaped) the same. At this point we can do this by looking at the probabilities that males and females are distributed in the same way for grade levels.

2. Empirical Probability: What is the probability for a
a. Randomly selected person being in grade E or above?
b. Randomly selected person being a male in grade E or above?
c. Randomly selected male being in grade E or above?
d. Why are the results different?

3. Normal Curve based probability: For each group (overall, females, males), what are the values for each question below?
A. Probability. Make sure your answer cells show the Excel function and cell location of the data used.
The probability of being in the top 1/3 of the compa ratio distribution.
Note, we can find the cutoff value for the top 1/3 using the fx Large function: =large(range, value).
Value is the number that identifies the x largest value. For the top 1/3
value would be the value that starts the top 1/3 of the range.
For the overall group, this would be the 50/3 or 17th (rounded); for the gender groups, it would be the 25/3 or 8th (rounded) value.
i. How many salaries are in the top 1/3 (rounded to nearest whole number) for each group?
ii. What Compa ratio value starts the top 1/3 of the range for each group?
iii. What is the z score for this value?
iv. What is the normal curve probability of exceeding this score?
B. How do you interpret the relationship between the data sets? What does this suggest about our equal pay for equal work question?

4. A. Based on our sample data set, can the male and female compa ratios in the population be equal to each other?
First, we need to determine if these two groups have equal variances, in order to decide which t test to use.
What is the data input range used for this question?
Step 1: Ho: Ha:
Step 2: Decision Rule:
Step 3: Statistical test: Why?
Step 4: Conduct the test (place cell B77 in the output location box).
Step 5: Conclusion and Interpretation
What is the p value?
Is the P value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What is your decision: REJ or NOT reject the null?
What does this result say about our question of variance equality?

B. Are male and female average compa ratios equal?
(Regardless of the outcome of the above F test, assume equal variances for this test.)
What is the data input range used for this question?
Step 1: Ho: Ha:
Step 2: Decision Rule:
Step 3: Statistical test: Why?
Step 4: Conduct the test (place cell B109 in the output location box).
Step 5: Conclusion and Interpretation
What is the p value?
Is the P value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What is your decision: REJ or NOT reject the null?
What does your decision on rejecting the null hypothesis mean?
If the null hypothesis was rejected, calculate the effect size value.
If the effect size was calculated, what does the result mean in terms of why the null hypothesis was rejected?
What does the result of this test tell us about our question on salary equality?

5. Is the Female
average compa ratio equal to or less than the midpoint value of 1.00?
This question is the same as: Does the company pay its females on average at or below the grade midpoint (which is considered the market rate)?
Suggestion: Use the data column T to the right for your null hypothesis value.
What is the data input range used for this question?
Step 1: Ho: Ha:
Step 2: Decision Rule:
Step 3: Statistical test: Why?
Step 4: Conduct the test (place cell B162 in the output location box).
Step 5: Conclusion and Interpretation
What is the p value?
Is the P value < 0.05 (for a one tail test) or 0.025 (for a two tail test)?
What, besides the p value, needs to be considered with a one tail test?
Decision: Reject or do not reject Ho?
What does your decision on rejecting the null hypothesis mean?
If the null hypothesis was rejected, calculate the effect size value.
If the effect size was calculated, what does the result mean in terms of why the null hypothesis was rejected?
What does the result of this test tell us about our question on salary equality?

6. Considering both the salary information in the lectures and your compa ratio information, what conclusions can you reach about equal pay for equal work?
Why? What statistical results support this conclusion?

y Data Set file is copied over to this Assignment file ht clicking on the Data tab, selecting Move or Copy, and copying the entire sheet to this file right so that whatever you do will not disrupt the original data values and relationships mple, Question 1 asks for several data values If you obtain them using descriptive statistics, mber showing the value in the descriptive statistics table If you choose to generate each he data values should be shown Having only a numerical value will not earn full credit are not correct we need to see how the results were obtained seeing if differences exist Probing into reasons and mitigating factors is a follow up activity for key variables Since the assignment problems will and range for our groups Males, Females, and Overall e Compa ratio and Gender1 columns, and then
sort on Gender1 e use, and if desired to test your approach to the compa ratio answers e and stdev functions ns with Fx functions or from Descriptive Statistics der data in column U of a column to the right ow the location of the data (Example average(T2 T51) Note remember the data is a sample from the larger company population oint we can do this rade levels Probability re the values for each question below ge(range, value) d be the value that starts the top 1 3 of the range, ups, it would be the 25 3 8th (rounded) value Overall Female Male est about our equal pay for equal work question tion be equal to each other ecide which t test to use

All of the functions below are in the fx statistical list.
Use the ROUND function (found in Math or All list).
Use the LARGE function.
Use Excel's STANDARDIZE function.
Use 1 - NORM.S.DIST function.

or below the grade midpoint (which is mation, what conclusions can you reach about equal pay for equal work

Week 3 ANOVA

Three Questions

Remember to show how you got your results in the appropriate cells. For questions using functions, show the input range when asked.

1. One interesting question is: are the average compa ratios equal across salary ranges of 10K each?
While compa ratios remove the impact of grade on salaries, are they different for different pay levels, that is, are people at different levels paid differently relative to the midpoint? (Put data values at right.)
What is the data input range used for this question?
Step 1: Ho: Ha:
Step 2: Decision Rule:
Step 3: Statistical test: Why?
Step 4: Conduct the test (place cell B16 in the output location box).
Step 5: Conclusions and Interpretation
What is the p value?
Is the P value < 0.05?
What is your decision: REJ or NOT reject the null?
If the null hypothesis was rejected, what is the effect size value (eta squared)?
If calculated, what does the effect size value tell us about why the null hypothesis was rejected?
What does that decision mean in terms of our equal pay question?

2. If the null hypothesis in question 1 was
rejected, which pairs of means differ?

Groups Compared   Diff   T   Term   Low   to   High   Difference Significant?   Why?
G1 G2
G1 G3
G1 G4
G1 G5
G1 G6
G2 G3
G2 G4
G2 G5
G2 G6
G3 G4
G3 G5
G3 G6
G4 G5
G4 G6
G5 G6

3. Since compa is already a measure of pay for equal work, do these results impact your conclusion on equal pay for equal work? Why or why not?

ng functions, show the input range when asked anges of 10K each for different pay levels, (Put data values at right )

Group name           G1      G2      G3      G4      G5      G6
Salary Intervals     22-29   30-39   40-49   50-59   60-69   70-79
Compa ratio values

Regression and Correlation

Five Questions

Remember to show how you got your results in the appropriate cells. For questions using functions, show the input range when asked.

1. Create a correlation table using Compa ratio and the other interval level variables, except for Salary.
Suggestion: place data in columns T-Y.
What range was placed in the Correlation input range box?
Place C9 in output box.
b. What are the statistically significant correlations related to Compa ratio?
c. Are there any surprises: correlations you thought would be significant and are not, or non significant correlations you thought would be?
d. Why does or does not this information help answer our equal pay question?

2. Perform a regression analysis using compa as the dependent variable and the variables used in Q1 along with the dummy variables. Show the result, and interpret your findings by answering the following questions.
Suggestion: Place the dummy variables values to the right of column Y.
What range was placed in the Regression input range box?
Note: be sure to include the appropriate hypothesis statements.
Regression hypotheses: Ho: Ha:
Coefficient hypotheses (one to stand for all the separate variables): Ho: Ha:
Place B36 in output box.
Interpretation
For the Regression as a whole:
What is the value of the F statistic?
What is the p value associated with this value?
Is the p value < 0.05?
What is your decision: REJ or NOT reject the null?
What does this decision mean?
For each of the coefficients:
What is the coefficient's p value for each of the variables?
Is
the p value < 0.05?
Do you reject or not reject each null hypothesis?
Midpoint   Age   Perf Rat
What are the coefficients for the significant variables?
Using the intercept coefficient and only the significant variables, what is the equation? Compa ratio =
Is gender a significant factor in compa ratio?
Regardless of statistical significance, who gets paid more with all other things being equal?
How do we know?

3. What does regression analysis show us about analyzing complex measures?

4. Between the lecture results and your results, what else would you like to know before answering our question on equal pay? Why?

5. Between the lecture results and your results, what is your answer to the question of equal pay for equal work for males and females? Why?

g functions, show the input range when asked vel variables, except for Salary T Significant r nt and are not, or non significant correlations you thought would be and the variables used in Q1 along with findings by answering the following questions Service Gender Degree he question Compa Midpoint ratio Age Performa Service nce Rating Raise Degree Gender

Lecture 4 (Sampling basics and Hypothesis test)

This week we turn from descriptive statistics to inferential statistics and making decisions about our populations based on the samples we have. For example, our class case research question is really asking whether, in the entire company population of employees, males and females receive the same pay for doing equal work. However, we are not analyzing the entire population; instead we have a sample of 25 males and 25 females to work with.

This brings us to the idea of sampling: taking a small group (a sample) from a larger population. To paraphrase, not all samples are created equal. For example, if you wanted to study religious feelings in the United States, would you only sample those leaving a fundamentalist church on a Wednesday? While this is a legitimate element of US religions, it does not represent the entire range of religious views; it is
representative of only a portion of the US population, and not the entire population.

The key to ensuring that sample descriptive statistics can be used as inferential statistics (sample results that can be used to infer the characteristics, AKA parameters, of a population) is to have a random sample of the entire population. A random sample is one where, at the start, everyone in the population has the same chance of being selected. There are numerous ways to design a random sampling process, but these are more of a research class concern than a statistics class issue. For now, we just need to make certain that the samples we use are randomly selected rather than selected with the intent of ensuring desired outcomes are achieved.

One issue with using samples, often unfamiliar to students new to statistics, is that the sample statistic values (outcomes) will rarely be exactly the same as the population parameters we are trying to estimate. We will have, for each sample, some sampling error: the difference between the actual population value and the sample result. Researchers feel that this sampling error is generally small enough to use the data to make decisions about the population (Lind, Marchel, Wathen, 2008). While we cannot tell for any given sample exactly what this difference is, we can estimate the maximum amount of the error. Later, we will look at doing this; for now, we just need to know that this error is incorporated into the statistical test outcomes that we will be studying.

Once we have our random sample (and we will assume that our class equal pay case sample was selected randomly), we can start with our analysis. After developing the descriptive statistics, we start to ask questions about them. In examining a data set, we need not only to identify whether important differences exist but also to identify reasons differences might exist. For our equal pay question, it would be legal to pay males and females different salaries if, for example, one gender performed the duties better, or had more
required education, or had more seniority, etc. Equal pay for equal work, as we are beginning to see, is more complex than a simple single question about salary equality. As we go through the class, we will be able to answer increasingly complex questions. For this week, we will stay with questions involving ways to sort our salary results, looking for where differences might exist.

Some of these questions for this week with our equal pay case could include:
Could the means for both males and females be the same, and the observed difference be due to sampling error only?
Could the variances for the males and females be the same (AKA statistically equal)?
Could salaries per grade be statistically equal?
Could salaries per degree (undergraduate and graduate) be the same?
Etc.

Hypothesis Testing

As we might expect, research and statistics have a set procedure for how to go about answering these questions. The hypothesis testing procedure is designed to ensure that data is analyzed in a consistent and recognized fashion so everyone can accept the outcome.

Statistical tests focus on differences: is this difference large enough to be significant, that is, not simply a sampling error? If so, we say the difference is statistically significant; if not, the difference is not considered statistically significant. This phrasing is important: it is easy to measure a difference from some point, but much harder to determine that things are truly different. It is that pesky sampling error that interferes with assessing differences directly.

Before starting the hypothesis test, we need to have a clear research question. The questions above are good examples, as each clearly asks if some comparison is statistically equal or not. Once we have a clear question and a randomly drawn sample, we can start the hypothesis testing procedure. The procedure itself has five steps:
Step 1: State the null and alternate hypothesis
Step 2: Form the decision rule
Step 3: Select the appropriate statistical test
Step 4: Perform
the analysis
Step 5: Make the decision, and translate the outcome into an answer to the initial research question

Step 1: The null hypothesis is the testable claim about the relationship between the variables. It always makes the claim that no difference exists in the populations. For the question of male and female salary equality, it would be Ho: Male mean salary = Female mean salary. If this claim is found not to be correct, then we would accept the alternate hypothesis claim, Ha: Male salary mean ≠ (not equal) Female salary mean. (Note, some alternate ways of phrasing these exist, and we will cover them shortly. For now, let's just go with this format.)

Step 2: This step involves selecting the decision rule for rejecting the null hypothesis claim. This will be constant for our class: we will reject the null hypothesis when the p value is equal to or less than 0.05 (this probability is called alpha). Other common values are 0.1 and 0.01; the more severe the consequences of being wrong if we reject the null, the smaller the value of alpha we select. Recall that we defined the p value last week as the probability of exceeding a value; the value in this case would be the statistical outcome from our test.

Step 3: Selecting the appropriate statistical test is the next step. We start with a question about mean equality, so we will be using the T test, the most appropriate test to determine if two population means are equal based upon sample results.

Step 4: Performing the analysis comes next. Fortunately for us, we can do all the arithmetic involved with Excel. We will go over how to select and run the appropriate T test below.

Step 5: Interpreting the test results, making a decision on rejecting or not rejecting the null hypothesis, and using this outcome to answer the research question is the final step. Excel output tables provide all the information we need to make our decision in this step.

Step 1: Setting up the hypothesis statements

In setting up a hypothesis test for looking at the male and female
means, there are actually three questions we could ask, each with its associated hypothesis statements in step 1:
1. Are male and female mean salaries equal?
a. Ho: Male mean salary = Female mean salary
b. Ha: Male mean salary ≠ Female mean salary
2. Is the male mean salary equal to or greater than the female mean salary?
a. Ho: Male mean salary ≥ Female mean salary
b. Ha: Male mean salary < Female mean salary
3. Is the male mean salary equal to or less than the female mean salary?
a. Ho: Male mean salary ≤ Female mean salary
b. Ha: Male mean salary > Female mean salary

While they appear similar, each answers a different question. We cannot, for example, take the first question, determine the means are not equal, and then say that the male mean is greater than the female mean because the sample results show this. Our statistical test did not test for this condition. If we are interested in a directional difference, we need to use a directional set of hypothesis statements as shown in statements 2 and 3 above.

Rules

There are several rules or guidelines in developing the hypothesis statements for any statistical test:
1. The variables must be listed in the same order in both claims.
2. The null hypothesis must always contain the equal (=) sign.
3. The null can contain an equal (=), equal to or less than (≤), or equal to or greater than (≥) claim.
4. The null and alternate hypothesis statements must, between them, account for all possible actual comparison outcomes. So, if the null has the equal (=) claim, the alternate must contain the not equal (≠) statement. If the null has the equal to or less than (≤) claim, the alternate must contain the greater than (>) claim. Finally, if the null has the equal to or greater than (≥) claim, the alternate must contain the less than (<) claim.

[...] Female salary mean) and the opposite null (Male salary mean [...] .025 and/or our F (0.94 rounded) is greater than our F Critical, we fail to reject the null hypothesis of no differences in variance. The correct t test would be the two sample T test assuming equal variances.

Other T tests

We mentioned that Excel has three versions of the t test. The equal and unequal variance versions are
set up in the same way and produce very similar output tables. The only difference is that the equal variance version provides an estimate of the common variation, called pooled variance, while this row is missing in the unequal variance version.

A third form of the t test is the T Test: Paired Two Sample for Means. A key requirement for the other versions of the t test is that the data are independent; that means the data are collected on different groups. In the paired t test, we generally collect two measures on each subject. An example of paired data would be a pre and post test given to students in a statistics class. Another example, using our class case study, would be comparing the salary and midpoint for each employee: both are measured in dollars and taken from each person. An example of non-paired data would be the grades of males and females at the end of a statistics class. The paired t test is set up in the same way as the other two versions. It provides the correlation coefficient (a measure of how closely one variable changes when another does, to be covered later in the class) as part of its output.

An Excel Trick

You may have noticed that all of the Excel t tests are for two samples, yet at times we might want to perform a one sample test; for example, quality control might want to test a sample against a quality standard to see if things have changed or not. Excel does not expressly allow this. BUT, we can do a one sample test using Excel. The reason is a bit technical, but boils down to the fact that the two sample unequal variance formula will reduce to the one sample formula when one of the variables has a variance equal to 0. So, using the unequal variance t test, we enter the variable we are interested in, such as salary, as variable one and the hypothesized value we are testing against, such as 45 for our case, as variable two, ensuring that we have the same number of values in each column. Here is an example of this outcome.

Research question: Is the female population
salary mean = 45?
Step 1: Ho: Female salary mean = 45
Ha: Female salary mean ≠ 45
Step 2: Reject the null hypothesis if the p value is less than Alpha = 0.05.
Step 3: Selected test is the two sample unequal variance t test.
Step 4: Conduct the test.
Step 5: Conclusions and Interpretation. Since the two tail p value is greater than (>) .025 and/or the absolute value of the t statistic is less than the critical two tail t value, we fail to reject the null hypothesis. Our research question answer is that, based upon this sample, the overall female salary average could equal 45.

Miscellaneous Issues on Hypothesis Testing

Errors

Statistical tests are based on probabilities, so there is a possibility that we could make the wrong decision in either rejecting or failing to reject the null hypothesis. Rejecting the null hypothesis when it is true is called a Type I error. Accepting (failing to reject) the null when it is false is called a Type II error. Both errors are minimized somewhat by increasing the sample size we work with.

A Type I error is generally considered the more severe of the two (imagine saying a new medicine works when it does not), and is managed by the selection of our alpha value: the smaller the alpha, the harder it is to reject the null hypothesis (or, put another way, the more evidence is needed to convince us to reject the null). Managing the Type II error probability is slightly more complicated and is dealt with in more advanced statistics classes. Choosing an alpha of .05 for most test situations has been found to provide a good balance between these two errors.

Reason for Rejection

While we are not spending time on the formulas behind our statistical outcomes, there is one general issue with virtually all statistical tests: a larger sample size makes it easier to reject the null hypothesis. What is a non statistically significant outcome based upon a sample size of 25 could very easily be found significant with a sample size of, for example, 25,000. This is one reason to be cautious of very large
sample studies: far from meaning the results are better, it could mean the rejection of the null was due to the sample size and not the variables that were being tested.

The effect size measure helps us investigate the cause of rejecting the null. The name is somewhat misleading to those just learning about it: it does NOT mean the size of the difference being tested. The significance of that difference is tested with our statistical test. What it does measure is the effect the variables had on the rejection (that is, whether the outcome is practically significant and one we should use in making decisions) versus the impact of the sample size on the rejection (meaning the result is not particularly meaningful in the real world).

For the two sample t test, either equal or unequal variance, the effect size is measured by Cohen's D. Unfortunately, Excel does not yet provide this calculation automatically; however, it is fairly easy to generate:
Cohen's D = (absolute value of the difference between the means) / (the standard deviation of both samples combined)
Note: the total standard deviation is not given in the t test outputs, and is not the same as the square root of the pooled variance estimate. To get this value, use the fx function stdev.s on the entire data set, both samples at the same time.

Interpreting the effect size outcome is fairly simple. Effect sizes are generally between 0 and 1. A large effect (a value around .8 or larger) means the variables and their interactions caused the rejection of the null, and the result has a lot of practical significance for decision making. A small effect (a value around .2 or less) means the sample size was more responsible for the rejection decision than the variable outcomes. Medium effects (values around .5) are harder to interpret and would suggest additional study (Tanner & Youssef-Morgan, 2013).

References
Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business Finance (13th Ed.). Boston: McGraw-Hill Irwin.
Tanner, D. E., & Youssef-
Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgepoint Education.

Week 3 Lecture 7

We have so far seen how we can summarize data sets using descriptive statistics, showing several characteristics including mean and standard deviation. We also found that if our data comes from a random sample of a larger population, these descriptive statistics become inferential statistics, and can be used to make inferences about the population. These inferences can then be used in statistical tests to see if things have changed or not (equal to known standards or other data sets or not). We have looked at one and two sample mean tests (with the t test) and two sample comparisons of variance equality (with the F test). This week we will look at the Analysis of Variance (ANOVA) test for mean equality between three or more groups.

ANOVA

The first question often asked is why not just do multiple t tests comparing three or more different group means? One answer involves efficiency. Conducting multiple t tests can become somewhat tedious. Comparing just three groups (A, B, and C) requires us to compare A and B, B and C, and A and C (3 tests). With 4 groups (A, B, C, and D) we have A and B, A and C, A and D, B and C, B and D, C and D (6 tests). So a single test can save us a lot of time and is much more efficient.

A second and much more important reason is that we lose confidence in our results when multiple tests are performed on the same data. With an alpha of 0.05, we are 95% certain we are right with each test, but being certain we are right for all the tests involves multiplying the results together, so for three tests we would be .95 × .95 × .95, or about 86%, certain; with six tests, our confidence drops to .95^6, or about 74%, a long way from our desired 95% confidence. So, a single test maintains our desired level of confidence in the outcome (Lind, Marchel, Wathen, 2008).

Logic

A second question asked comes from the name itself: how can analyzing variance tell us anything about mean differences? The
answer lies in how ANOVA works. The key assumptions for an ANOVA analysis are that each of the groups is normally distributed AND that the groups have equal variances. These mean that the distributions are shaped the same, and this allows for an easy comparison. Take a look at the following two sets of normal curves.

[Exhibit A: normal curves for three sample groups]
[Exhibit B: normal curves for three sample groups]

The means of the three sample groups in Exhibit A could clearly come from three populations that have the same mean, with the differences seen being merely sampling errors. However, we cannot say the same thing about the sample groups in Exhibit B.

ANOVA takes the variation of all of the data in the groups being tested (three in this case) and compares it with the average variation for each of the groups using the F test (discussed last week). For the Exhibit A groups, the overall variation will be only slightly larger than the average of the three group variations (which are assumed to be equal). Since the resulting F value will not be statistically significant, we can say that the groups are closely distributed and the means are statistically equal. In Exhibit B, however, the variation of the entire group would be around three times the average variation of the groups. Just by looking at the average variance for the individual groups and comparing it to the variance for the entire group, we can make a judgement on how close the distributions are, and with that a judgement on mean equality. As with the t test, ANOVA will let us know exactly how much difference in the population locations is enough to say means differ or not; we cannot just eyeball it.

Hypothesis

Stating the null and alternate hypothesis for an ANOVA test is simple, as they are always the same:
Ho: All means equal
Ha: At least one mean differs (Tanner & Youssef-Morgan, 2013)

You might recall from last week that we said the alternate always states the opposite from the null statement. If so, why isn't
our alternate "all means differ," which seems like the opposite? The reason is that the ANOVA test will reject the null hypothesis if even one mean from the groups being examined is significantly different. So, the opposite of "all means are equal" is actually "at least one mean differs."

Data Set up

Setting up the data for an ANOVA analysis is just a bit more complicated than for a t-test. While with the t-test we just highlighted the column or portion of a column of data (sometimes after sorting it by a variable such as gender), for an ANOVA test we need to create a table. For example, if we wanted to look at average salaries per grade (shown in the Week 3 Lecture 8 example), we would need a table with one column of salaries for each grade, headed by the grade letter.

Doing this is fairly simple. Copy the grade and salary columns (separately) and paste them onto a new Excel sheet (probably in Week 3, to the right of the questions). Then highlight both columns from labels to last value and select Data, Sort. Select sorting on the grade variable and click OK. Both columns are now in grade order, and you can highlight and cut the salaries for each grade and paste them into a new table you create with the grade letter as the heading. When finished, you will have the input table used in setting up an Excel ANOVA test.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance (13th Ed.). Boston: McGraw-Hill Irwin.
Tanner, D. E., & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week 3 Lecture 8: Excel ANOVA Example

In our ongoing investigation of whether or not males and females are paid equally for equal work, we have come up with contradictory results so far: average salaries are clearly different, but average compa-ratios are not. We need to examine factors that might impact these differences to see if we can explain what is going on. For each possible factor influencing individual salaries, we need to be able to, paraphrasing what they say in TV cop shows, rule it
out as a suspect in causing differences or keep it in as a cause of differences between the gender pay practices. One key issue in our question that has not clearly been examined yet is the impact of grades on salaries. Clearly, grade differences have the potential to complicate the issue, as the work done differs by grade. One question to ask here is: are average salaries equal across grade levels? This becomes our research question.

Example

For the research question of whether average salaries are equal across the grades, we have the following hypothesis test:
Step 1: Ho: All salary means are equal. Ha: At least one mean differs.
Step 2: Reject the null if the p-value < alpha = 0.05.
Step 3: Statistical Test: Single-factor ANOVA. (Note: salary variance in some of the grades may violate the equal variance requirement. We will ignore this for the purposes of this example.)
Step 4: Perform Test. The input box for Excel's Single-factor ANOVA is: [screenshot of the input box]. The input range for this example would be D1:F16; we would click on Labels in the first row and select any output range desired. (This would be given in the assignment for consistency's sake.) Completing the input screen and clicking OK gives us an output table.

Reading the ANOVA output tables

The first thing we see is the test name in cell K1, Anova: Single Factor. This is just a check to ensure we have the right test. Next we see a summary table. Under the Groups column we should see the data labels (in this case our grades). If not, and we see something such as a number, an input error has been made: the labels were not included but the Labels box was checked. If this happens, just redo the data set up and overwrite the output. For each variable, we see the count, sum, average, and variance. If we had some question about having equal variance, we could perform an F test on the variables with the extreme values. (Again, for purposes of this example, we are going to ignore the requirement for equal variances.)

The next table is the ANOVA output. While, technically, for our
hypothesis test we only need to look at the p-value result, the other columns provide some useful information. Note: this is somewhat technical, and is presented only as an explanation of the table. The Source of Variation column gives us our two variation measures: Between Groups refers to the variation among the group means, while Within Groups refers to the average variation inside the groups. The SS column (Sum of Squares) is an estimate of the variation (slightly different than our variance formula). This value is divided by the df (degrees of freedom) value for each row. This df is conceptually the same as that discussed with the t-test, and the total df is N - 1, where N is the number of data points. Looking at this value (49 in this example) confirms we entered the right number of data points, 50. MS stands for Mean Square and is the SS divided by the df. The F value is determined by dividing the MS for the Between Groups row by the MS for the Within Groups row. The p-value and the critical F statistic complete the table.

Step 5: Conclusion and Interpretation. The F is much larger than the F critical, and the p-value is much less than 0.05. (Note: 1.04E-35 means move the decimal point 35 places to the left, giving 0.0000000000000000000000000000000000104. If the exponent following the E had been positive, we would have moved the decimal to the right; for example, 1.04E4 = 10400.) So, according to our decision rule, since the p-value is < (less than) 0.05, we reject the null hypothesis and conclude that at least one mean differs. This suggests that grade level has an impact on salary, and that measuring pay in salary terms could be creating some issues in answering our questions.

Determining Differences

When we reject the null hypothesis, a logical follow-up question is often: which differences are meaningful? There are several approaches to answering this question; all involve a pair-by-pair comparison, and most require access to statistical tables not available within Excel. One approach that we can use in our Excel
worksheet involves developing confidence intervals around the difference in group means. (Note: Confidence intervals allow us to develop a range that contains the value we are looking for with a known level of confidence, such as 95%. We will discuss this again in Week 5.) All of the required information for these intervals is available from the ANOVA output. The basic approach is to:
1. Find the difference between each pair of means.
2. To this value, add and subtract a measure of the variation in the data. (Due to sampling error, we know our sample means are not exactly equal to the population parameters, so we need to take this error into account; the real difference might be a bit larger or smaller than the samples show.)
3. Examine the ranges to see if 0 is included (alternately, do the endpoints have different signs, a - and a +?). If so, the real population difference could be 0, and the means do not significantly differ.

The formula for the interval that we will build in Excel is:
(mean1 - mean2) +/- t * sqrt(MSW * (1/n1 + 1/n2)) (Lind, Marchel, & Wathen, 2008).

Here is an example of how we work out the formula, and what each term means. The value of the mean for each variable is found in the Summary table, as is the count (n) for each variable. The MSW is the MS for Within Groups found in the ANOVA table, and we find t with Excel's two-tailed inverse t function, T.INV.2T (TINV in older versions).

So, let's walk through constructing an interval for grades A and B, and then we can look at what it might look like in an Excel spreadsheet. From our example output above, we have:
Mean A = 23.5 (rounded)
Mean B = 31.7 (rounded)
n for A = 15
n for B = 7
MSW = 8.64 (rounded)
t has a df equal to that of MSW (44 in this case), and the probability is our 0.05 for a 95% interval. T.INV.2T(0.05, 44) equals 2.015 (rounded).

So, for grades A and B, our mean difference is 31.7 - 23.5 = 8.2. The +/- term is t * sqrt(MSW * (1/n1 + 1/n2)). Plugging in our values gives us 2.015 * sqrt(8.64 * (1/15 + 1/7)) = 2.71. So, our interval is 8.2 +/- 2.71, or 5.49 to 10.91 (rounded). Since 0 is not in this range, we can
say that the mean salaries for grades A and B differ significantly. Setting this up in Excel (using cell references as the examples on the left show) gives us the following: [Excel table of pairwise intervals]. So, all of the grade average salary differences are significantly different from each other. Grade is definitely a factor in an employee's salary, and introduces a source of variation that is not an equal work measure. We have not yet found an answer to our question, as we have not yet figured out how to get a measure of equal work to base our comparisons on. More to follow next week.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance (13th Ed.). Boston: McGraw-Hill Irwin.

Week 3 Lecture 9: Effect Size

When we reject the null hypothesis with an ANOVA test, two questions arise. The first, which pairs of means differ significantly, we have dealt with already. The second question, similar to what we asked after rejecting the null hypothesis with the t-test, is what caused the rejection: the sample size or the variable interactions? This question is again answered using an effect size measure. Recall that the effect size measure shows how likely it is that the variable interaction caused the null hypothesis rejection. Large values lead us to say the variables caused the outcome, while small values lead us to say the outcome has little to no practical significance, as the sample size was the most likely cause of the rejection of the null.

With the single-factor ANOVA, the effect size measure is eta squared, which equals SS(between) / SS(total) (Tanner & Youssef-Morgan, 2013). For our salary example in Lecture 8, eta squared equals 17686.02 (SS(between)) / 18066 (SS(total)) = 0.979 (rounded).

Eta squared effect size measures have different interpretation values than Cohen's d (from the t-test). According to Nandy (2012), a small eta squared effect size has a value of 0.01, a medium of 0.06, and a large of 0.14 or more. This means we have a large effect size, and the variables of
salary and grade interaction are the most likely cause of our rejecting the null hypothesis, rather than the sample size. Side note: Eta squared can also be interpreted as the percent of differences between group scores that can be explained by the independent variable (Tanner & Youssef-Morgan, 2013, p. 123). This is consistent with our saying the variable interactions caused the outcome.

Different Forms of ANOVA

Just as the t-test has several forms, so does the ANOVA test. Excel has three versions available. While we will focus only on the single-factor test, a brief description of the other two versions will be presented.

ANOVA: Two-factor without replication

The ANOVA two-factor without replication tests mean differences for two different variables at the same time. If we are interested in knowing if the mean salary differs by grade and also by gender, we can perform one two-factor test rather than two separate tests. As mentioned in lecture two for this week, this is more efficient and maintains our desired alpha significance level.

Excel Example

To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows:
Step 1: Ho1: All salary means are equal across grades. Ha1: At least one mean differs. Ho2: All gender (male and female) means are equal. Ha2: At least one mean differs. Note that in this test, we need to have a hypothesis statement pair for each variable being tested.
Step 2: ANOVA: Two-factor without replication.
Step 3: Reject the null hypothesis if the p-value is < alpha = 0.05.
Step 4: Perform the test. While the input screen for this test is identical to that of the one-factor test, the data table used is a bit different. As seen below, it has one value for each variable pair cell. Since we have multiple values for each variable pair, this table was set up with the mean values for each group.

          A      B      C      D      E      F
Male      24.3   27.7   43.3   48.0   61.7   75.3
Female    23.3   34.8   41.5   52.5   67.0   76.0

The data entry box would include the entire table, labels and all. The
output for this test is: [Excel output table]

Step 5: Conclusions and Interpretation. As with the single-factor ANOVA, we start out with a summary table for each variable showing the sum, average, and variance for each variable label. The ANOVA table has an extra row and one renamed row. The Error row is what we knew as the Within row in the single-factor ANOVA. The two rows dedicated to the data are Rows and Columns; these refer to how the variables are presented in the data input table. The Rows line refers to our gender variable, since that is the row variable in the input. The p-value is 0.16 (rounded), so we do not reject the null hypothesis of equal means. The Columns line refers to the grade variable, as that was listed in the column position. This p-value is 3.76E-05, or 0.0000376. This is less than (
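
The interval arithmetic from Lecture 8 and the eta squared calculation from Lecture 9 can be checked outside Excel. The following is a minimal Python sketch, not part of the course materials. It uses only the rounded values quoted in the lectures, and it takes the t value of 2.015 as given in the text instead of recomputing it. The function and variable names are illustrative assumptions.

```python
import math

# Rounded values quoted in the Lecture 8 ANOVA output.
mean_a, mean_b = 23.5, 31.7   # grade A and grade B mean salaries
n_a, n_b = 15, 7              # counts for grades A and B
msw = 8.64                    # Mean Square Within (Within Groups row)
t_crit = 2.015                # two-tailed t for 0.05 and df = 44, as quoted in the lecture


def difference_interval(mean1, mean2, n1, n2, msw, t_crit):
    """Return (low, high) for (mean1 - mean2) +/- t * sqrt(MSW * (1/n1 + 1/n2))."""
    diff = mean1 - mean2
    margin = t_crit * math.sqrt(msw * (1 / n1 + 1 / n2))
    return diff - margin, diff + margin


low, high = difference_interval(mean_b, mean_a, n_b, n_a, msw, t_crit)

# The means differ significantly when 0 lies outside the interval.
differs = not (low <= 0 <= high)

# Eta squared from Lecture 9: SS(between) / SS(total).
ss_between, ss_total = 17686.02, 18066
eta_squared = ss_between / ss_total

print(f"interval: {low:.2f} to {high:.2f}, differs: {differs}")
print(f"eta squared: {eta_squared:.3f}")
```

Calling `difference_interval` once per pair of grades would reproduce the pair-by-pair comparison the lecture describes, provided the means and counts are read from the Summary table.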


Question


Posted on Oct 14, 2024


ID Salary 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 65.1 27 36.1 61.4 46.6 73.3 39.7 23.2 74.6 22.8 23.4 62 40.7 23 24 51.1 69.3 36 24.2 35.5 74.1 52.9 22 54.2 24.5 22.5 43.1 75.7 73.7 48 23.4 27.3 62.5 27.2 23.6 22.7 23.7 60 35.1 23.7 42.4 Compa Midpoint 1.142 0.870 1.166 1.078 0.971 1.094 0.992 1.007 1.114 0.990 1.019 1.088 1.017 0.999 1.043 1.277 1.216 1.162 1.053 1.144 1.106 1.103 0.955 1.130 1.065 0.979 1.078 1.130 1.100 1.000 1.016 0.880 1.096 0.876 1.027 0.985 1.032 1.052 1.132 1.030 1.060 57 31 31 57 48 67 40 23 67 23 23 57 40 23 23 40 57 31 23 31 67 48 23 48 23 23 40 67 67 48 23 31 57 31 23 23 23 57 31 23 40 Age 34 52 30 42 36 36 32 32 49 30 41 52 30 32 32 44 27 31 32 44 43 48 36 30 41 22 35 44 52 45 29 25 35 26 23 27 22 45 27 24 25 Performance Service Gender Rating 85 80 75 100 90 70 100 90 100 80 100 95 100 90 80 90 55 80 85 70 95 65 65 75 70 95 80 95 95 90 60 95 90 80 90 75 95 95 90 90 80 8 7 5 16 16 12 8 9 10 7 19 22 2 12 8 4 3 11 1 16 13 6 6 9 4 2 7 9 5 18 4 4 9 2 4 3 2 11 6 2 5 0 0 1 0 0 0 1 1 0 1 1 0 1 1 1 0 1 1 0 1 0 1 1 1 0 1 0 1 0 0 1 0 0 0 1 1 1 0 1 0 0 Raise Degree 5.7 3.9 3.6 5.5 5.7 4.5 5.7 5.8 4 4.7 4.8 4.5 4.7 6 4.9 5.7 3 5.6 4.6 4.8 6.3 3.8 3.3 3.8 4 6.2 3.9 4.4 5.4 4.3 3.9 5.6 5.5 4.9 5.3 4.3 6.2 4.5 5.5 6.3 4.3 0 0 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 0 1 0 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 0 0 42 43 44 45 46 47 48 49 50 22.5 75.8 62.1 51.4 57.1 62.7 69 60 63.8 0.978 1.131 1.090 1.071 1.001 1.100 1.210 1.053 1.120 23 67 57 48 57 57 57 57 57 32 42 45 36 39 37 34 41 38 100 95 90 95 75 95 90 95 80 8 20 16 8 20 5 11 21 12 1 1 0 1 0 0 1 0 0 5.7 5.5 5.2 5.2 3.9 5.5 5.3 6.6 4.6 1 0 1 1 1 1 1 0 0 Gender 1 Gr M M F M M M F F M F F M F F F M F F M F M F F F M F M F M M F M M M F F F M F M M E B B E D F C A F A A E C A A C E B A B F D A D A A C F F D A B E B A A A E B A C The ongoing question that the weekly assignments will focus on is: Are males and females p Note: to simplfy 
the analysis, we will assume that jobs within each grade comprise equal wor The column labels in the table mean: ID - Employee sample number Salary - Salary in thousands Age - Age in years Performance Rating - Appraisal rating (employee eva Service - Years of service (roundGender - 0 = male, 1 = female Midpoint - salary grade midpointRaise - percent of last raise Grade - job/pay grade Degree (0= BS\\BA 1 = MS) Gender1 (Male or Female) Compa - salary divided by midpoint F F M F M M F M M A F E D E E E E E Are males and females paid the same for equal work (under the Equal Pay Act)? grade comprise equal work. sal rating (employee evaluation score) This assignment covers the material presented in weeks 1 and 2. Six Questions Before starting this assignment, make sure the the assignment data from the Employee Salary Data Set file is copied o You can do this either by a copy and paste of all the columns or by opening the data file, right clicking on the Data tab (Weekly Assignment Sheet or whatever you are calling your master assignment file). It is highly recommended that you copy the data columns (with labels) and paste them to the right so that whatever yo To Ensure full credit for each question, you need to show how you got your results. For example, Question 1 asks for then the cells should have an "=XX" formula in them, where XX is the column and row number showing the value in value using fxfunctions, then each function should be located in the cell and the location of the data values should be So, Cell D31 - as an example - shoud contain something like "=T6" or "=average(T2:T26)". Having only a numerica The reason for this is to allow instructors to provide feedback on Excel tools if the answers are not correct - we need t In starting the analysis on a research question, we focus on overall descriptive statistics and seeing if differences exist 1 The first step in analyzing data sets is to find some summary descriptive statistics for key variables. 
Since t focus mostly on the compa-ratios, we need to find the mean, standard deviations, and range for our groups: Sorting the compa-ratios into male and females will require you copy and paste the Compa-ratio and Gende The values for age, performance rating, and service are provided for you for future use, and - if desired - to (see if you can replicate the values). You can use either the Data Analysis Descriptive Statistics tool or the Fx =average and =stdev functions. The range can be found using the difference between the =max and =min functions with Fx functions or fr Suggestion: Copy and paste the compa-ratio data to the right (Column T) and gender data in column U. If you use Descriptive statistics, Place the output table in row 1 of a column to the right. If you did not use Descriptive Statistics, make sure your cells show the location of the da Comparatio Overall Female Male Mean Standard Deviation Range Mean Standard Deviation Range Mean Standard Deviation Range Age 35.7 8.2513 30 32.5 6.9 26.0 38.9 8.4 28.0 Perf. Rat. Service 85.9 9.0 11.4147 5.7177 Note - remember the dat 45 21 84.2 7.9 13.6 4.9 45.0 18.0 87.6 10.0 8.7 6.4 30.0 21.0 A key issue in comparing data sets is to see if they are distributed/shaped the same. At this point we can do this by looking at the probabilities that males and females are distributed in the same way for a grade levels. 2 Empirical Probability: What is the probability for a: a. Randomly selected person being in grade E or above? b. Randomly selected person being a male in grade E or above? c. Randomly selected male being in grade E or above? d. Why are the results different? 3 Normal Curve based probability: For each group (overall, females, males), what are the values for each que A Probability Make sure your answer cells show the Excel function and cell location of the data used. The probability of being in the top 1/3 of the compa-ratio distribution. 
Note, we can find the cutoff value for the top 1/3 using the fx Large function: =large(range, value). Value is the number that identifies the x-largest value. For the top 1/3 value would be the value that starts t For the overall group, this would be the 50/3 or 17th (rounded), for the gender groups, it would be the 25/3 i. How nany salaries are in the top 1/3 (rounded to nearest whole number) for each group? ii What Compa-ratio value starts the top 1/3 of the range for each group? iii What is the z-score for this value? iv. What is the normal curve probability of exceeding this score? B How do you interpret the relationship between the data sets? What does this suggest about our equal pay fo 4 A Based on our sample data set, can the male and female compa-ratios in the population be equal to each othe First, we need to determine if these two groups have equal variances, in order to decide which t-test to use. What is the data input ranged used for this question: Step 1: Ho: Ha: Step 2: Decision Rule: Step 3: Statistical test: Why? Step 4: Conduct the test - place cell B77 in the output location box. Step 5: Conclusion and Interpretation What is the p-value: Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)? What is your decision: REJ or NOT reject the null? What does this result say about our question of variance equality? B Are male and female average compa-ratios equal? (Regardless of the outcome of the above F-test, assume equal variances for this test.) What is the data input ranged used for this question: Step 1: Ho: Ha: Step 2: Decision Rule: Step 3: Statistical test: Why? Step 4: Conduct the test - place cell B109 in the output location box. Step 5: Conclusion and Interpretation What is the p-value: Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)? What is your decision: REJ or NOT reject the null? What does your decision on rejecting the null hypothesis mean? 
If the null hypothesis was rejected, calculate the effect size value: If the effect size was calculated, what doe the result mean in terms of why the null hypothesis was rejected? What does the result of this test tell us about our question on salary equality? 5 Is the Female average compa-ratio equal to or less than the midpoint value of 1.00? This question is the same as: Does the company, pay its females - on average - at or below the grade midpo considered the market rate)? Suggestion: Use the data column T to the right for your null hypothesis value. What is the data input ranged used for this question: Step 1: Ho: Ha: Step 2: Decision Rule: Step 3: Statistical test: Why? Step 4: Conduct the test - place cell B162 in the output location box. Step 5: Conclusion and Interpretation What is the p-value: Is the P-value < 0.05 (for a one tail test) or 0.025 (for a two tail test)? What, besides the p-value, needs to be considered with a one tail test? Decision: Reject or do not reject Ho? What does your decision on rejecting the null hypothesis mean? If the null hypothesis was rejected, calculate the effect size value: If the effect size was calculated, what doe the result mean in terms of why the null hypothesis was rejected? What does the result of this test tell us about our question on salary equality? 6 Considering both the salary information in the lectures and your compa-ratio information, what conclusions Why - what statistical results support this conclusion? y Data Set file is copied over to this Assignment file. ht clicking on the Data tab, selecting Move or Copy, and copying the entire sheet to this file right so that whatever you do will not disrupt the original data values and relationships. mple, Question 1 asks for several data values. If you obtain them using descriptive statistics, mber showing the value in the descriptive statistics table. If you choose to generate each he data values should be shown. 
Having only a numerical value will not earn full credit. are not correct - we need to see how the results were obtained. seeing if differences exist. Probing into reasons and mitigating factors is a follow-up activity. for key variables. Since the assignment problems will and range for our groups: Males, Females, and Overall. e Compa-ratio and Gender1 columns, and then sort on Gender1. e use, and - if desired - to test your approach to the compa-ratio answers e and =stdev functions. ns with Fx functions or from Descriptive Statistics. der data in column U. of a column to the right. ow the location of the data (Example: =average(T2:T51) Note - remember the data is a sample from the larger company population oint we can do this rade levels. Probability re the values for each question below?: ge(range, value). d be the value that starts the top 1/3 of the range, ups, it would be the 25/3 = 8th (rounded) value. Overall Female Male est about our equal pay for equal work question? tion be equal to each other? ecide which t-test to use. All of the functions below are in the fx statistical list. Use the "=ROUND" function (found in Math or All list) Use the "=LARGE" function Use Excel's STANDARDIZE function Use "=1-NORM.S.DIST" function or below the grade midpoint (which is mation, what conclusions can you reach about equal pay for equal work? Week 3 ANOVA Three Questions Remember to show how you got your results in the appropriate cells. For questions using functions, show the input r 1 One interesting question is are the average compa-ratios equal across salary ranges of 10K each. While compa-ratios remove the impact of grade on salaries, are they different for different pay levels, that is are people at different levels paid differently relative to the midpoint? (Put data values at right.) What is the data input ranged used for this question: Step 1: Ho: Ha: Step 2: Decision Rule: Step 3: Statistical test: Why? 
Step 4: Conduct the test - place cell b16 in the output location box. Step 5: Conclusions and Interpretation What is the p-value? Is P-value < 0.05? What is your decision: REJ or NOT reject the null? If the null hypothesis was rejected, what is the effect size value (eta squared)? If calculated, what does the effect size value tell us about why the null hypothesis was rejected? What does that decision mean in terms of our equal pay question? 2 If the null hypothesis in question 1 was rejected, which pairs of means differ? Groups Compared G1 G2 G1 G3 G1 G4 G1 G5 G1 G6 Diff T +/- Term Low to G2 G3 G2 G4 G2 G5 G2 G6 G3 G4 G3 G5 G3 G6 G4 G5 G4 G6 G5 G6 3 Since compa is already a measure of pay for equal work, do these results impact your conclusion on equal pay for equal work? Why or why not? High ng functions, show the input range when asked. anges of 10K each. for different pay levels, (Put data values at right.) Group name: Salary Intervals: Compa-ratio values: G1 G2 G3 G4 G5 G6 22-29 30-39 40-49 50-59 60-69 70-79 Why? Difference Significant? Why? Regression and Corellation Five Questions Remember to show how you got your results in the appropriate cells. For questions using functions, show the inp 1 Create a correlation table using Compa-ratio and the other interval level variables, except for Suggestion, place data in columns T - Y. What range was placed in the Correlation input range box: Place C9 in output box. b What are the statistically significant correlations related to Compa-ratio? c Are there any surprises - correlations you though would be significant and are not, or non sign d Why does or does not this information help answer our equal pay question? 2 Perform a regression analysis using compa as the dependent variable and the variables used in including the dummy variables. Show the result, and interpret your findings by answering the Suggestion: Place the dummy variables values to the right of column Y. 
What range was placed in the Regression input range box: Note: be sure to include the appropriate hypothesis statements. Regression hypotheses Ho: Ha: Coefficient hyhpotheses (one to stand for all the separate variables) Ho: Ha: Place B36 in output box. Interpretation: For the Regression as a whole: What is the value of the F statistic: What is the p-value associated with this value: Is the p-value < 0.05? What is your decision: REJ or NOT reject the null? What does this decision mean? For each of the coefficients: What is the coefficient's p-value for each of the variables: Is the p-value < 0.05? Do you reject or not reject each null hypothesis: Midpoint Age Perf. Rat. What are the coefficients for the significant variables? Using the intercept coefficient and only the significant variables, what is the equation? Compa-ratio = Is gender a significant factor in compa-ratio? Regardless of statistical significance, who gets paid more with all other things being equal? How do we know? 3 What does regression analysis show us about analyzing complex measures? 4 Between the lecture results and your results, what else would you like to know before answering our question on equal pay? Why? 5 Between the lecture results and your results, what is your answer to the question of equal pay for equal work for males and females? Why? g functions, show the input range when asked. vel variables, except for Salary. T= Significant r = nt and are not, or non significant correlations you thought would be? and the variables used in Q1 along with findings by answering the following questions. Service Gender Degree he question Compa- Midpoint ratio Age Performa Service nce Rating Raise Degree Gender Lecture 4 (Sampling basics and Hypothesis test) This week we turn from descriptive statistics to inferential statistics and making decisions about our populations based on the samples we have. 
For example, our class case research question is really asking if, in the entire company population of employees, males and females receive the same pay for doing equal work. However, we are not analyzing the entire population; instead we have a sample of 25 males and 25 females to work with.

This brings us to the idea of sampling - taking a small group/sample from a larger population. To paraphrase, not all samples are created equal. For example, if you wanted to study religious feelings in the United States, would you only sample those leaving a fundamentalist church on a Wednesday? While this is a legitimate element of US religions, it does not represent the entire range of religious views - it is representative of only a portion of the US population, and not the entire population.

The key to ensuring that sample descriptive statistics can be used as inferential statistics - sample results that can be used to infer the characteristics (AKA parameters) of a population - is to have a random sample of the entire population. A random sample is one where, at the start, everyone in the population has the same chance of being selected. There are numerous ways to design a random sampling process, but these are more of a research class concern than a statistical class issue. For now, we just need to make certain that the samples we use are randomly selected rather than selected with an intent of ensuring desired outcomes are achieved.

An issue with using samples that students new to statistics often overlook is that the sample statistic values/outcomes will rarely be exactly the same as the population parameters we are trying to estimate. We will have, for each sample, some sampling error: the difference between the actual population value and the sample result. Researchers feel that this sampling error is generally small enough to use the data to make decisions about the population (Lind, Marchel, & Wathen, 2008).
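
To make the idea of sampling error concrete, here is a minimal Python sketch. It is not part of the course materials. The population is hypothetical and randomly generated (it is not the class salary data), and the sample size of 50 is chosen only to match the size of the class sample. Each draw gives every member of the population the same chance of selection, and each sample mean lands a little off the population mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 salaries in thousands (an assumption, not course data).
population = [random.uniform(22, 78) for _ in range(1000)]


def sampling_error(population, sample_size):
    """Draw a simple random sample and return sample mean minus population mean."""
    sample = random.sample(population, sample_size)  # every member is equally likely
    return statistics.mean(sample) - statistics.mean(population)


# Five independent samples of 50 give five different sampling errors.
errors = [sampling_error(population, 50) for _ in range(5)]
for error in errors:
    print(f"sampling error: {error:+.2f}")
```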
While we cannot tell for any given sample exactly what this difference is, we can estimate the maximum amount of the error. Later, we will look at doing this; for now, we just need to know that this error is incorporated into the statistical test outcomes that we will be studying.

Once we have our random sample (and we will assume that our class equal pay case sample was selected randomly), we can start with our analysis. After developing the descriptive statistics, we start to ask questions about them. In examining a data set, we need not only to identify whether important differences exist but also to identify reasons differences might exist. For our equal pay question, it would be legal to pay males and females different salaries if, for example, one gender performed the duties better, or had more required education, or had more seniority, etc. Equal pay for equal work, as we are beginning to see, is more complex than a simple single question about salary equality. As we go through the class, we will be able to answer increasingly complex questions. For this week, we will stay with questions involving ways to sort our salary results - looking for differences that might exist. Some of these questions for this week with our equal pay case could include:
Could the means for both males and females be the same, and the observed difference be due to sampling error only?
Could the variances for the males and females be the same (AKA statistically equal)?
Could salaries per grade be statistically equal?
Could salaries per degree (undergraduate and graduate) be the same? Etc.

Hypothesis Testing

As we might expect, research and statistics have a set procedure/process on how to go about answering these questions. The hypothesis testing procedure is designed to ensure that data is analyzed in a consistent and recognized fashion so everyone can accept the outcome.
Statistical tests focus on differences - is this difference large enough to be significant, that is, not simply a sampling error? If so, we say the difference is statistically significant; if not, the difference is not considered statistically significant. This phrasing is important: it is easy to measure a difference from some point, but much harder to measure "things are different." It is that pesky sampling error that interferes with assessing differences directly.

Before starting the hypothesis test, we need to have a clear research question. The questions above are good examples, as each clearly asks if some comparison is statistically equal or not. Once we have a clear question - and a randomly drawn sample - we can start the hypothesis testing procedure. The procedure itself has five steps:

Step 1: State the null and alternate hypothesis
Step 2: Form the decision rule
Step 3: Select the appropriate statistical test
Step 4: Perform the analysis
Step 5: Make the decision, and translate the outcome into an answer to the initial research question.

Step 1. The null hypothesis is the "testable" claim about the relationship between the variables. It always makes the claim that no difference exists in the populations. For the question of male and female salary equality, it would be: Ho: Male mean salary = Female mean salary. If this claim is found not to be correct, then we would accept the alternate hypothesis claim: Ha: Male mean salary =/= (not equal) Female mean salary. (Note, some alternate ways of phrasing these exist, and we will cover them shortly. For now, let's just go with this format.)

Step 2. This step involves selecting the decision rule for rejecting the null hypothesis claim. This will be constant for our class - we will reject the null hypothesis when the p-value is equal to or less than 0.05 (this probability is called alpha).
Other common values are .1 and .01 - the more severe the consequences of being wrong if we reject the null, the smaller the value of alpha we select. Recall that we defined the p-value last week as the probability of exceeding a value; the value in this case is the statistical outcome from our test.

Step 3. Selecting the appropriate statistical test is the next step. We start with a question about mean equality, so we will be using the T-test - the most appropriate test to determine if two population means are equal based upon sample results.

Step 4. Performing the analysis comes next. Fortunately for us, we can do all the arithmetic involved with Excel. We will go over how to select and run the appropriate T-test below.

Step 5. The final step is to interpret the test results: make a decision on rejecting or not rejecting the null hypothesis, and use this outcome to answer the research question. Excel output tables provide all the information we need to make our decision in this step.

Step 1: Setting up the hypothesis statements

In setting up a hypothesis test for looking at the male and female means, there are actually three questions we could ask, each with its associated hypothesis statements in step 1.

1. Are male and female mean salaries equal?
   a. Ho: Male mean salary = Female mean salary
   b. Ha: Male mean salary =/= Female mean salary
2. Is the male mean salary equal to or greater than the female mean salary?
   a. Ho: Male mean salary => Female mean salary
   b. Ha: Male mean salary < Female mean salary
3. Is the male mean salary equal to or less than the female mean salary?
   a. Ho: Male mean salary <= Female mean salary
   b. Ha: Male mean salary > Female mean salary

While they appear similar, each answers a different question. We cannot, for example, take the first question, determine the means are not equal, and then say that the male mean is greater than the female mean because the sample results show this. Our statistical test did not test for this condition.
If we are interested in a directional difference, we need to use a directional set of hypothesis statements as shown in statements 2 and 3 above.

Rules. There are several rules or guidelines in developing the hypothesis statements for any statistical test.

1. The variables must be listed in the same order in both claims.
2. The null hypothesis must always contain the equal (=) sign.
3. The null can contain an equal (=), equal to or less than (<=), or equal to or greater than (=>) claim.
4. The null and alternate hypothesis statements must, between them, account for all possible actual comparison outcomes.

So, if the null has the equal (=) claim, the alternate must contain the not equal (=/= or ≠) statement. If the null has the equal to or less than (<= or ≤) claim, the alternate must contain the greater than (>) claim. Finally, if the null has the equal to or greater than (=> or ≥) claim, the alternate must contain the less than (<) claim.

Deciding which pair of statements to use depends on the research question being asked - which is why we always start with the question. Look at the research question being asked: does it contain words indicating a simple equality (means are equal, the same, etc.) or inequality (not equal, different, etc.)? If so, we have the first example: Ho: variable 1 mean = variable 2 mean, Ha: variable 1 mean =/= variable 2 mean. If the research question implies a directional difference (larger, greater, exceeds, increased, etc., or smaller, less than, reduced, etc.), then it is often easier to use the question to frame the alternate hypothesis and back into the null. For example, the question "Is the male mean salary greater than the female mean salary?" would lead to an alternate of exactly what was said (Ha: Male mean salary > Female mean salary) and the opposite null (Ho: Male mean salary <= Female mean salary).
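The pairing rules above can be written down as a small lookup, sketched here in Python. This is only an illustration: the names PAIRS, OPS, hypothesis_pair, and exactly_one_holds are hypothetical, and the operator symbols follow the notation used in this lecture (=/=, <=, =>).

```python
import operator

# For each kind of research question: (null operator, alternate operator, tail).
# The direction of the question is the direction of the ALTERNATE hypothesis.
PAIRS = {
    "different": ("=", "=/=", "two-tail"),
    "greater": ("<=", ">", "right tail"),
    "less": ("=>", "<", "left tail"),
}

# What each symbol means when applied to two actual values.
OPS = {
    "=": operator.eq,
    "=/=": operator.ne,
    "<=": operator.le,
    ">": operator.gt,
    "=>": operator.ge,
    "<": operator.lt,
}


def hypothesis_pair(first, second, question):
    """Return (Ho text, Ha text, tail) with the variables in the same order."""
    null_op, alt_op, tail = PAIRS[question]
    ho = f"Ho: {first} {null_op} {second}"
    ha = f"Ha: {first} {alt_op} {second}"
    return ho, ha, tail


def exactly_one_holds(question, a, b):
    """Rule 4: for any two values, exactly one of the two claims is true."""
    null_op, alt_op, _ = PAIRS[question]
    return OPS[null_op](a, b) != OPS[alt_op](a, b)


ho, ha, tail = hypothesis_pair("Male mean salary", "Female mean salary", "greater")
print(ho)
print(ha)
print("Rejection region:", tail)
```

The exactly_one_holds check mirrors rule 4: whatever the true relationship between the two means, the null and alternate between them cover it, and never both at once.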
Step 2: Decision Rule

Once we have our hypothesis statements, we move on to deciding the level of evidence that will cause us to reject the null hypothesis. Note, we always test the null hypothesis, since that is where our claim of equality lies. And, our decision is either reject the null or fail to reject the null. If we reject the null, we are saying that the alternate hypothesis statement is the more accurate description of the relationship between the two variable population means. We never accept the alternate.

When we perform a statistical test, we are in essence asking: based on the evidence we have, is the difference we observe large enough to have been caused by something other than chance, or is it due to sampling error? A statistical test gives us a statistic as a result. We know the shape of the statistical distribution for each type of test, so we can easily find the probability of exceeding this test value. Remember, we called this the p-value.

Now all we need to decide is what is an acceptable level of chance - that is, when would the outcome be so rare that we would not expect to see it purely through sampling error alone? Most researchers agree that if the p-value is 5% (.05) or less, then chance is not the cause of the observed difference; something else must be responsible. This decision point is called alpha. Other values of alpha frequently used are 10% (often used in marketing tests) and 1% (frequently used in medical studies). The smaller the chosen alpha, the more serious we consider the error of rejecting the null when we should not have. For our analysis, we will use an alpha of .05 for all our tests.

Final Point

You may have noticed that we have two basic types of hypothesis statements - those testing equality and those testing directional differences. This leads to two different types of statistical tests - the two-tail and the one-tail.
In the one-tail test, the entire value of alpha is focused on one distribution tail - either the right or left tail, depending upon the phrasing of the alternate hypothesis. A neat hint: the arrow head in the alternate hypothesis shows which tail the result needs to be in to reject the null. In the case of the two-tail test (equality), we do not care if one variable is bigger or smaller than the other, only that they differ. This means that the rejection statistic could be in either tail, the right or left. Since the rejection region is split into two areas, we need to split alpha between them, so each tail holds alpha/2 (e.g., 0.05/2 = 0.025). A one-tail p-value is therefore compared with alpha/2 in a two-tail test, while a two-tail p-value, which already covers both tails, is compared with the full alpha. The example in Lecture 5 will review this in more detail.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Lecture 5

The T-Test

In the previous lecture, we introduced the hypothesis testing procedure, and developed the first two steps of a statistical test to determine if male and female mean salaries could be equal in the population - where our differences were caused simply by sampling errors. This lecture continues with this example by completing the final three steps. It also introduces our first statistical test, the t-test for mean equality.

Last week we looked at the normal curve and noted several of its characteristics: mean = median = mode, symmetry around the mean, and a curve height that drops off the further the score gets from the mean (meaning scores further from the mean are less likely to occur). Our first statistical test, the t-test, is based on a population that is distributed normally. The t-test is used when we do not have the population variance value - this is the situation every time we use a sample to make decisions about its related population.
While the t-test has several different versions, we will focus on the most commonly used form - the two sample test for mean equality assuming equal variance. When we are testing measures for mean equality, it is fairly rare for the variances to be much different, and the observed difference is often merely sampling error. (In Lecture 6, we will revisit this assumption.)

The logic of the test is that the difference between mean values divided by a measure of this difference's variation will provide a t statistic that is distributed approximately normally, with the mean equaling 0 and the standard deviation equaling 1. This outcome can then be tested to see what the likelihood is that we would get a value this large or larger purely by chance - our old friend the p-value. If this p-value is less than our decision criterion, alpha, then we reject the null hypothesis claim of no difference (Lind, Marchel, & Wathen, 2008).

Setting up the t-test

Before selecting any test from Excel, the data needs to be set up. For the t-test, there are a couple of steps needed. First, copy the data you want to test. In our question about male and female salaries, copy the gender variable column from the data page to a new worksheet page (the recommendation is the week 2 tab) and paste it to the right of the questions (such as in column T), then copy the salary values and paste them next to the gender data. Next, sort both columns by the gender column - this will give you the salary data sorted by gender. Then, in column V place the label Males, and in column W place the label Females. Now copy the male salaries and paste them under the Males label, and do the same for the female salaries and the Females label. The data is now set up for easy entry into the T-test data entry section.

The t-test is found in the Analysis Toolpak that was loaded into your Excel program last week.
To find it, click on the Data button in the top ribbon, then on the Data Analysis link in the Analyze box at the right, then scroll down to T-test: Two-Sample Assuming Equal Variances. For assistance in setting up the t-test, please see the discussion in the Week 2 Excel Help lecture.

Interpreting the T-test Output

The t-test output contains a lot of information, and not all of it is needed to interpret the result. The important elements of the t-test outcome will be shown with an example for our research case question.

Equal Pay Example - continued

In Lecture 4 we set up the first couple of steps for testing our research question: Do males and females receive equal pay for equal work? Our first examination of the data we have for answering this question involves determining if the average salaries are the same. Here is the completed hypothesis test for the question: Is the male average salary equal to the female average salary?

Step 1. Ho: Male mean salary = Female mean salary
        Ha: Male mean salary =/= Female mean salary
Step 2. Reject the null if the p-value is < (less than) alpha = .05.
Step 3. The selected test is the Two-Sample T-test assuming equal variances.
Step 4. The test results are below. The screen shot shows the output table.
Step 5. Interpretation and conclusions.

The first step is to ensure we have all of the correct data. We see that we have 25 males and 25 females in the Observations row, and that the respective means are equal to what we calculated earlier. The calculated t statistic is 2.74 (rounded). We have two ways to determine if our result rejects or fails to reject the null hypothesis; both involve the two-tail rows, as we have a two-tail test (equal or not equal hypothesis statements). The first is a comparison of the t-values - if the calculated t of 2.74 (rounded) is greater than the T-Critical two-tail value of 2.01, we reject the null hypothesis. The second way is to compare the p-value with our criterion of alpha = .05.
Remember, since this is a two-tail test, the alpha for each tail is half of the overall alpha, or .025. Excel's two-tail p-value already covers both tails, so it is compared with the full alpha of .05 (equivalently, the one-tail p-value would be compared with .025). If the p-value (shown as P(T<=t) two-tail), here 0.0085, is less than .05, then we reject the null hypothesis.

Note: at times Excel will report the p-value in an E format, such as 3.45E-04. This is called an Exponent format, and is the same as 3.45 * 10^-4. This means move the decimal point 4 places to the left, making 3.45E-04 = 0.000345. Virtually any p-value reported with an E-xx form will be less than our alpha of 0.05 (which would be 5E-02).

Since we rejected the null hypothesis in both approaches (and both will always provide the same outcome), we can answer our question with: No - the male and female mean salaries are not equal.

Note that for this set of data, we would have rejected the null for a one-tail test if and only if the null hypothesis had been Male mean salary <= Female mean salary and the alternate had been Male mean salary > Female mean salary. The arrow in the alternate points to the positive/right tail, and that is where the calculated t-statistic is. So, even if the p-value is smaller than alpha in a one-tail test, we need to ensure the t-statistic is in the correct tail for rejection.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Lecture 6

(Additional information on t-tests and hypothesis testing)

Lecture 5 focused on perhaps the most common of the t-tests, the two sample assuming equal variance. There are other versions as well; Excel lists two others, the two sample assuming unequal variance and the paired t-test. We will end with some comments about rejecting the null hypothesis.

Choosing between the t-test options

As the names imply, each of the three forms of the t-test deals with a different type of data set. The simplest distinction is between the equal and unequal variance tests.
Both require that the data be at least interval in nature, come from a normally distributed population, and be independent of each other - that is, collected from different subjects.

The F-test for variance. To determine if the population variances of two groups are statistically equal - in order to correctly choose the equal variance version of the t-test - we use the F statistic, which is calculated by dividing one variance by the other variance. If the outcome is less than 1.0, the rejection region is in the left tail; if the value is greater than 1.0, the rejection region is in the right tail. In either case, Excel provides the information we need.

To perform a hypothesis test for variance equality we use Excel's F-Test Two-Sample for Variances, found in the Data Analysis section under the Data tab. The test set-up is very similar to that of the t-test: entering data ranges, checking the Labels box if labels are included in the data ranges, and identifying the start of the output range. The only unique element in this test is the identification of our alpha level. Since we are testing for equality of variances, we have a two-tail test and the rejection region is again in both tails. This means that our rejection region in each tail is 0.025. The F-test identifies the p-value for the tail the result is in, but does not give us both a one-tail and a two-tail value, only the one-tail value. So, compare the calculated p-value against .025 to make the rejection decision. If the p-value is greater than this, we fail to reject the null; if smaller, we reject the null of equal variances.

Excel Example. To test for equality between the male and female salary variances in the population, we set up the following hypothesis test.

Research question: Are the male and female population variances for salary equal?

Step 1: Ho: Male salary variance = Female salary variance
        Ha: Male salary variance =/= Female salary variance
Step 2: Reject Ho if the p-value is less than alpha = 0.025 for one tail.
Step 3: Selected test is the F-test for variance
Step 4: Conduct the test
Step 5: Conclusion and interpretation.

The test resulted in an F-value less than 1.0, so the statistic is in the left tail. Had we put Females as the first variable, we would have gotten a right tail F-value greater than 1.0. This has no bearing on the decision. The F value is larger than the critical F (which is the value for a 1-tail probability of 0.025 - as that was entered for the alpha value). So, since our p-value (.44 rounded) is > .025 and/or our F (0.94 rounded) is greater than our F Critical, we fail to reject the null hypothesis of no difference in variance. The correct t-test would be the two-sample T-test assuming equal variances.

Other T-tests. We mentioned that Excel has three versions of the t-test. The equal and unequal variance versions are set up in the same way and produce very similar output tables. The only difference is that the equal variance version provides an estimate of the common variation called pooled variance, while this row is missing in the unequal variance version.

A third form of the t-test is the T-Test: Paired Two Sample for Means. A key requirement for the other versions of the t-test is that the data are independent - that means the data are collected on different groups. In the paired t-test, we generally collect two measures on each subject. An example of paired data would be a pre- and post-test given to students in a statistics class. Another example, using our class case study, would be comparing the salary and midpoint for each employee - both are measured in dollars and taken from each person. An example of NON-paired data would be the grades of males and females at the end of a statistics class. The paired t-test is set up in the same way as the other two versions. It provides the correlation coefficient (a measure of how closely one variable changes when another does - to be covered later in the class) as part of its output.

An Excel Trick.
You may have noticed that all of the Excel t-tests are for two samples, yet at times we might want to perform a one-sample test; for example, quality control might want to test a sample against a quality standard to see if things have changed or not. Excel does not expressly allow this. BUT, we can do a one-sample test using Excel. The reason is a bit technical, but boils down to the fact that the two-sample unequal variance formula will reduce to the one-sample formula when one of the variables has a variance equal to 0. So, using the unequal variance t-test, we enter the variable we are interested in - such as salary - as variable one and the hypothesized value we are testing against - such as 45 for our case - as variable two, ensuring that we have the same number of values in each column. Here is an example of this outcome.

Research question: Is the female population salary mean = 45?

Step 1: Ho: Female salary mean = 45
        Ha: Female salary mean =/= 45
Step 2: Reject the null hypothesis if the p-value is less than alpha = 0.05
Step 3: Selected test is the two sample unequal variance t-test
Step 4: Conduct the test
Step 5: Conclusions and Interpretation.

Since the two-tail p-value is greater than (>) .05 and/or the absolute value of the t-statistic is less than the critical two-tail t value, we fail to reject the null hypothesis. Our research question answer is that, based upon this sample, the overall female salary average could equal 45.

Miscellaneous Issues on Hypothesis Testing

Errors. Because statistical tests are based on probabilities, there is a possibility that we could make the wrong decision in either rejecting or failing to reject the null hypothesis. Rejecting the null hypothesis when it is true is called a Type I error. Accepting (failing to reject) the null when it is false is called a Type II error. Both errors are minimized somewhat by increasing the sample size we work with.
A Type I error is generally considered the more severe of the two (imagine saying a new medicine works when it does not), and is managed by the selection of our alpha value - the smaller the alpha, the harder it is to reject the null hypothesis (or, put another way, the more evidence is needed to convince us to reject the null). Managing the Type II error probability is slightly more complicated and is dealt with in more advanced statistics classes. Choosing an alpha of .05 for most test situations has been found to provide a good balance between these two errors.

Reason for Rejection. While we are not spending time on the formulas behind our statistical outcomes, there is one general issue with virtually all statistical tests: a larger sample size makes it easier to reject the null hypothesis. What is a non-statistically significant outcome based upon a sample size of 25 could very easily be found significant with a sample size of, for example, 25,000. This is one reason to be cautious of very large sample studies - far from meaning the results are better, it could mean the rejection of the null was due to the sample size and not the variables that were being tested.

The effect size measure helps us investigate the cause of rejecting the null. The name is somewhat misleading to those just learning about it; it does NOT mean the size of the difference being tested. The significance of that difference is tested with our statistical test. What it does measure is the effect the variables had on the rejection (that is, is the outcome practically significant and one we should use in making decisions) versus the impact of the sample size on the rejection (meaning the result is not particularly meaningful in the real world).

For the two-sample t-test, either equal or unequal variance, the effect size is measured by Cohen's D. Unfortunately, Excel does not yet provide this calculation automatically; however, it is fairly easy to generate.
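As a rough illustration of how little work the calculation takes, here is a minimal Python sketch. It assumes the definition used in this lecture: the absolute difference between the two sample means divided by the sample standard deviation of both samples combined (what Excel's STDEV.S returns when given all the values at once). The function name cohens_d and the two small data lists are made up for the example and are not the class data.

```python
import statistics


def cohens_d(sample_a, sample_b):
    """Effect size as defined in this lecture.

    |mean(a) - mean(b)| divided by the sample standard deviation of all
    values from both samples taken together.
    """
    mean_difference = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    # statistics.stdev uses the n - 1 (sample) formula, like Excel's STDEV.S
    combined_sd = statistics.stdev(list(sample_a) + list(sample_b))
    return mean_difference / combined_sd


# Hypothetical salaries (in thousands), for illustration only.
group_one = [50, 60, 70]
group_two = [40, 50, 60]
print(round(cohens_d(group_one, group_two), 2))
```

Note that many textbooks divide by the pooled standard deviation instead; this sketch deliberately follows the combined-sample version described in this lecture.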
Cohen's D = (absolute value of the difference between the means) / (the standard deviation of both samples combined).

Note: the total standard deviation is not given in the t-test outputs, and is not the same as the square root of the pooled variance estimate. To get this value, use the fx function STDEV.S on the entire data set - both samples at the same time.

Interpreting the effect size outcome is fairly simple. Effect sizes are generally between 0 and 1. A large effect (a value around .8 or larger) means the variables and their interactions caused the rejection of the null, and the result has a lot of practical significance for decision making. A small effect (a value around .2 or less) means the sample size was more responsible for the rejection decision than the variable outcomes. Medium effects (values around .5) are harder to interpret and would suggest additional study (Tanner & Youssef-Morgan, 2013).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.
Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgepoint Education.

Week 3 Lecture 7

We have so far seen how we can summarize data sets using descriptive statistics, showing several characteristics including mean and standard deviation. We also found that if our data comes from a random sample of a larger population, these descriptive statistics become inferential statistics, and can be used to make inferences about the population. These inferences can then be used in statistical tests to see if things have changed or not (equal to known standards or other data sets or not). We have looked at one and two sample mean tests (with the t-test) and two sample comparisons of variance equality (with the F test). This week we will look at the Analysis of Variance (ANOVA) test for mean equality between three or more groups.
ANOVA

The first question often asked is why not just do multiple t-tests comparing three or more different group means? One answer involves efficiency. Conducting multiple t-tests can become somewhat tedious. Comparing just three groups (A, B, and C) requires us to compare A and B, B and C, and A and C (3 tests). With 4 groups (A, B, C, and D) we have A and B, A and C, A and D, B and C, B and D, C and D (6 tests)! So a single test can save us a lot of time and is much more efficient.

A second and much more important reason is that we lose confidence in our results when multiple tests are performed on the same data. With an alpha of 0.05, we are 95% certain we are right with each test, but being certain we are right for all the tests involves multiplying the results together, so for three tests we would be .95*.95*.95 or 86% certain; with six tests, our confidence drops to .95^6 = .74, a long way from our desired 95% confidence. So, a single test maintains our desired level of confidence in the outcome (Lind, Marchel, & Wathen, 2008).

Logic

A second question asked comes from the name itself: how can analyzing variance tell us anything about mean differences? The answer lies in how ANOVA works. The key assumptions for an ANOVA analysis are that each of the groups is normally distributed AND that the groups have equal variances. This means that the distributions are shaped the same, which allows for an easy comparison. Take a look at the following two sets of normal curves.

[Exhibit A: normal curves for three sample groups, plotted from -5 to 5]

[Exhibit B: normal curves for three sample groups, plotted from -10 to 10]

The means of the three sample groups in Exhibit A could clearly come from three populations that have the same mean, and the differences seen are merely sampling errors. However, we cannot say the same thing about the sample groups in Exhibit B.
ANOVA takes the variation of all of the data in the groups being tested (three in this case) and compares it with the average variation for each of the groups using the F-test (discussed last week). For the Exhibit A groups, the overall variation will be only slightly larger than the average of the three (which are assumed to be equal). Since the resulting F value will not be statistically significant, we can say that the groups are closely distributed and the means are statistically equal. In Exhibit B, however, the variation of the entire group would be around three times the average variation of the groups. Just by looking at the average variance for the individual groups and comparing it to the variance for the entire group, we can make a judgement on how close the distributions are, and with that a judgement on mean equality. As with the t-test, ANOVA will let us know exactly how much difference in the population locations is enough to say the means differ or not; we cannot just "eyeball" it.

Hypothesis

Stating the null and alternate hypothesis for an ANOVA test is simple, as they are always the same:

Ho: All means equal.
Ha: At least one mean differs (Tanner & Youssef-Morgan, 2013).

You might recall from last week that we said the alternate always states the opposite of the null statement. If so, why isn't our alternate "all means differ," which seems like the opposite? The reason is that the ANOVA test will reject the null hypothesis if even one mean from the groups being examined is significantly different. So, the opposite of "all means are equal" is actually "at least one mean differs."

Data Set-up

Setting up the data for an ANOVA analysis is just a bit more complicated than for a t-test. While with the t-test we just highlighted the column or portion of a column of data (sometimes after sorting it by a variable such as gender), for an ANOVA test, we need to create a table.
For example, if we wanted to look at average salaries per grade (shown in the Week 3 Lecture 8 example), we would need a table with one column of salaries for each grade.

Doing this is fairly simple. Copy the grade and salary columns (separately) and paste them onto a new Excel sheet (probably in Week 3 to the right of the questions). Then, highlight both columns - from labels to last value - and select Data Sort. Select sorting on the grade variable and click on OK. Both columns are now in grade order, and you can highlight and cut the salaries for each grade and paste them into a new table you create with the grade letter as the head. When finished, you will have the input table used in setting up an Excel ANOVA test.

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.
Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgepoint Education.

Week 3 Lecture 8

Excel ANOVA Example

In our on-going investigation of whether or not males and females are paid equally for equal work, we have come up with contradicting results so far: average salaries are clearly different but average compa-ratios are not. We need to examine reasons that might impact these differences to see if we can explain what is going on. For each possible factor influencing individual salaries, we need to be able to, paraphrasing what they say in TV cop shows, "rule it out as a suspect" in causing differences or keep it in as a cause of differences between the gender pay practices.

One key issue in our question that has not clearly been examined yet is the impact of grades on salaries. Clearly, grade differences have the potential to complicate the issue, as the work done differs by grade. One question to ask here is, "Are average salaries equal across grade levels?" This becomes our research question.
Example

For the research question "Are average salaries equal across the grades?", we have the following hypothesis test.

Step 1: Ho: All salary means are equal.
        Ha: At least one mean differs.
Step 2: Reject the null if the p-value < alpha = .05.
Step 3: Statistical Test: Single-factor ANOVA. (Note: salary variance in some of the grades may violate the equal variance requirement. We will ignore this for the purposes of this example.)
Step 4: Perform Test.

In the input box for Excel's Single-factor ANOVA, the input range for this example would be D1:F16; we would click on Labels in the first row, and select any output range desired (this would be given in the assignment for consistency's sake). Completing the input screen and clicking OK gives us an output table.

Reading the ANOVA output tables

The first thing we see is the test name in cell K1: Anova: Single Factor. This is just a check to ensure we have the right test. Next we see a summary table. Under the Groups column we should see the data labels (in this case our grades). If not, and we see something such as a number, an input error has been made: the labels were not included but the Labels box was checked. If this happens, just redo the data set up and overwrite the output.

For each variable, we see the count, sum, average, and variance. If we had some question about having equal variance, we could perform an F-test on the variables with the extreme values. (Again, for purposes of this example, we are going to ignore the requirement for equal variances.)

The next table is the ANOVA output. While, technically, for our hypothesis test we only need to look at the p-value result, the other columns provide some useful information. Note: this is somewhat technical, and is presented only as an explanation of the table. The Source of Variation column gives us our two variation measures; Between Groups refers to the overall variation while Within Groups refers to the average variation for all the groups.
The SS column (Sum of Squares) is an estimate of the variation (slightly different from our variance formula). This value is divided by the df (degrees of freedom) value for each row. This df is conceptually the same as that discussed with the t-test, and the total df is N-1, where N is the number of data points. Looking at this value (49 in this example) confirms we entered the right number of data points, 50. MS stands for Mean Square and is the SS divided by the df. The F value is determined by dividing the MS for the Between Groups row by the MS for the Within Groups row. The p-value and the critical F statistic complete the table.

Step 5: Conclusion and Interpretation: The F is much larger than the F critical, and the p-value is much less than 0.05. (Note: 1.04E-35 means 1.04 times 10 to the power -35; that is, move the decimal point 35 places to the left, which leaves 34 zeros between the decimal point and the digits 104. If the E (for exponent) had been positive, we would have moved the decimal to the right; for example, 1.04E4 = 10400.) So, according to our decision rule, since the p-value is < (less than) 0.05, we reject the null hypothesis and conclude that at least one mean differs. This suggests that grade level has an impact on salary, and that measuring pay in salary terms could be creating some issues in answering our questions.

Determining Differences

When we reject the null hypothesis, a logical follow-up question is often: which differences are meaningful? There are several approaches to answering this question; all involve a pair-by-pair comparison, and most require access to statistical tables not available within Excel. One approach that we can use in our Excel worksheet involves developing confidence intervals around the difference in group means. (Note: Confidence intervals allow us to develop a range that contains the value we are looking for with a known level of confidence, such as 95%. We will discuss this again in Week 5.) All of the required information for these intervals is available from the ANOVA output.
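Before building the intervals, the table arithmetic described above (SS divided by df gives MS, and the ratio of the two MS values gives F) can be sketched in a few lines of Python. This is only an illustration, not Excel's own procedure: the two SS values are the ones quoted for this salary example in the Lecture 9 effect-size discussion, the group count of 6 assumes grades A through F, and the variable names are made up for the sketch.

```python
# Sketch of how the columns of a single-factor ANOVA table relate.
# Assumptions: SS values as quoted for the salary-by-grade example,
# 6 groups (grades A-F), and 50 data points.
ss_between = 17686.02
ss_total = 18066.0
n_points = 50
n_groups = 6

# Whatever variation is not between the groups is within them.
ss_within = ss_total - ss_between

# Degrees of freedom for each row of the table.
df_between = n_groups - 1          # number of groups minus 1
df_within = n_points - n_groups    # data points minus number of groups
df_total = n_points - 1            # the N-1 check value

# Mean Square = SS / df, and F = MS between / MS within.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

print(f"df: between={df_between}, within={df_within}, total={df_total}")
print(f"MS between = {ms_between:.2f}")
print(f"MS within  = {ms_within:.2f}")
print(f"F          = {f_value:.1f}")
```

The MS within value produced this way should agree with the MSW used in the confidence intervals that follow. The p-value is left out because it requires the F distribution, which Python's standard library does not provide.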
The basic approach is to:
1. Find the difference between each pair of means.
2. To this value, add and subtract a measure of the variation in the data. (Due to sampling error, we know our sample means are not exactly equal to the population parameters, so we need to take this error into account; the real difference might be a bit larger or smaller than the samples show.)
3. Examine the range to see if 0 is included (alternatively, do the endpoints have different signs, one + and one -?). If so, the real population difference could be 0, and the means do not significantly differ.

The formula for the interval that we will build in Excel is: (mean1 - mean2) +/- t * sqrt(MSW * (1/n1 + 1/n2)) (Lind, Marchel, & Wathen, 2008).

Here is an example of how we work out the formula, and what each term means. The value of the mean for each variable is found in the Summary table, as is the count (n) for each variable. The MSW is the MS for Within Groups found in the ANOVA table, and we find t with Excel's T.INV.2T function (the two-tailed version; T.INV returns a one-tailed value, which would be negative here). So, let's walk through constructing an interval for grades A and B, and then we can look at what it might look like in an Excel spreadsheet. From our example output above, we have:

Mean A = 23.5 (rounded)
Mean B = 31.7 (rounded)
n for A = 15
n for B = 7
MSW = 8.64 (rounded)

The t value has a df equal to that of MSW (44 in this case), and the probability is our 0.05 for a 95% interval. T.INV.2T(0.05, 44) equals 2.015 (rounded).

So, for grades A and B, our mean difference = 31.7 - 23.5 = 8.2. The +/- term is t * sqrt(MSW * (1/n1 + 1/n2)). Plugging in our values gives us 2.015 * sqrt(8.64 * (1/15 + 1/7)) = 2.71. So, our interval is 8.2 +/- 2.71 = 5.49 to 10.91 (rounded). Since 0 is not in this range, we can say that the mean salaries for grades A and B differ significantly.

Setting this up in Excel with cell references gives us an interval for each pair of grades. So, all of the grade average salaries are significantly different from each other.
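The interval calculation above can also be sketched outside Excel. The Python below is a minimal illustration using the rounded values from the grade A and B example; the function name is made up for this sketch, and the t value of 2.015 is simply typed in from the text rather than computed, since Python's standard library has no t-distribution function.

```python
import math

def mean_difference_interval(mean1, mean2, n1, n2, msw, t_crit):
    """Return (low, high) for (mean1 - mean2) +/- t * sqrt(MSW * (1/n1 + 1/n2))."""
    diff = mean1 - mean2
    margin = t_crit * math.sqrt(msw * (1 / n1 + 1 / n2))
    return diff - margin, diff + margin

# Grade B minus grade A, using the rounded values from the worked example.
# t_crit = 2.015 is the two-tailed 5% value for df = 44 quoted in the text.
low, high = mean_difference_interval(
    mean1=31.7, mean2=23.5, n1=7, n2=15, msw=8.64, t_crit=2.015
)
print(f"Interval: {low:.2f} to {high:.2f}")

# The means differ significantly when 0 lies outside the interval.
differs = not (low <= 0 <= high)
print("Significant difference:", differs)
```

Calling the same function once per pair of grades would give the full set of pairwise intervals.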
Grade is definitely a factor in an employee's salary, and it introduces a source of variation that is not an equal work measure. We have not yet found an answer to our question, as we have not yet figured out how to get a measure of equal work on which to base our comparisons. More to follow next week.

References
Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance. (13th Ed.) Boston: McGraw-Hill Irwin.

Week 3 Lecture 9
Effect Size

When we reject the null hypothesis with an ANOVA test, two questions arise. The first, which pair of means differs significantly, we have dealt with already. The second question, similar to what we asked after rejecting the null hypothesis with the t-test, is: what caused the rejection, the sample size or the variable interactions? This question is again answered using an effect size measure.

Recall that the effect size measure shows how likely it is that the variable interaction caused the null hypothesis rejection. Large values lead us to say the variables caused the outcome, while small values lead us to say the outcome has little to no practical significance, as the sample size was the most likely cause of the rejection of the null.

With the single factor ANOVA, the effect size measure is eta squared, which equals SS(between)/SS(total) (Tanner & Youssef-Morgan, 2013). For our salary example in Lecture 8, eta squared equals 17686.02 (SS(between)) / 18066 (SS(total)) = 0.979 (rounded). Eta squared effect size measures have different interpretation values than Cohen's d (from the t-test). According to Nandy (2012), a small eta squared effect size has a value of 0.01, a medium of 0.06, and a large value of 0.14 or more. This means we have a large effect size, and the interaction of the salary and grade variables, rather than the sample size, is the most likely cause of our rejecting the null hypothesis.
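The eta squared calculation and its interpretation can be sketched as follows. This is a minimal illustration: the function names are made up, the cutoffs are the ones quoted above from Nandy (2012), and the "negligible" label for values below 0.01 is an assumption added only so the function always returns something.

```python
def eta_squared(ss_effect, ss_total):
    """Effect size for ANOVA: SS for the factor divided by SS total."""
    return ss_effect / ss_total

def effect_size_label(eta_sq):
    """Label an eta squared value using the cutoffs 0.01, 0.06, and 0.14."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # assumed label; the text gives no name for this range

# SS values from the salary-by-grade example in Lecture 8.
eta_sq = eta_squared(ss_effect=17686.02, ss_total=18066)
print(round(eta_sq, 3), effect_size_label(eta_sq))
```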
Side note: Eta squared can also be interpreted as the percent of "differences between group scores that can be explained by the independent variable" (Tanner & Youssef-Morgan, 2013, p. 123). This is consistent with our saying the variable interactions caused the outcome.

Different Forms of ANOVA

Just as the t-test has several forms, so does the ANOVA test. Excel has three versions available. While we will focus only on the single factor test, a brief description of the other two versions will be presented.

ANOVA: Two factor without replication

The ANOVA - two factor without replication tests mean differences from two different variables at the same time. If we are interested in knowing whether the mean salary differs by grade and also by gender, we can perform one two-factor test rather than two separate tests. As mentioned in lecture two for this week, this is more efficient and maintains our desired alpha significance level.

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1: Ho1: All salary means are equal across grades. Ha1: At least one mean differs.
Ho2: All gender (male and female) means are equal. Ha2: At least one mean differs.
Note that in this test, we need to have a hypothesis statement pair for each variable being tested.
Step 2: ANOVA: Two-Factor Without Replication.
Step 3: Reject the null hypothesis if the p-value is < alpha = .05.
Step 4: Perform the test. While the input screen for this test is identical to that of the one factor test, the data table used is a bit different. As seen below, it has one value for each variable pair cell. Since we have multiple values for each variable pair, this table was set up with the mean values for each group.

          A      B      C      D      E      F
Male      24.3   27.7   43.3   48.0   61.7   75.3
Female    23.3   34.8   41.5   52.5   67.0   76.0

The data entry box would include the entire table, labels and all. Running the test produces a summary table and an ANOVA table.

Step 5: Conclusions and Interpretation.
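As a rough cross-check on the output discussed in this step, the sums of squares for a two-factor ANOVA without replication can be recomputed directly from the table of cell means. The sketch below is an illustration, not Excel's procedure, and the variable names are made up. Because the table holds rounded means, the results come out close to, but not identical to, the SS values quoted in the interpretation. P-values are not computed, since they require the F distribution, which Python's standard library does not provide.

```python
# Cell means from the table above: rows are genders, columns are grades A-F.
table = {
    "Male":   [24.3, 27.7, 43.3, 48.0, 61.7, 75.3],
    "Female": [23.3, 34.8, 41.5, 52.5, 67.0, 76.0],
}

rows = list(table.values())
n_rows = len(rows)       # 2 genders
n_cols = len(rows[0])    # 6 grades
grand_mean = sum(sum(r) for r in rows) / (n_rows * n_cols)

row_means = [sum(r) / n_cols for r in rows]
col_means = [sum(r[j] for r in rows) / n_rows for j in range(n_cols)]

# Sums of squares for the Rows (gender), Columns (grade), Total, and Error lines.
ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_error = ss_total - ss_rows - ss_cols

# Degrees of freedom, then F = MS for the factor / MS for Error.
df_rows = n_rows - 1
df_cols = n_cols - 1
df_error = df_rows * df_cols
ms_error = ss_error / df_error
f_rows = (ss_rows / df_rows) / ms_error
f_cols = (ss_cols / df_cols) / ms_error

# Effect size for the grade (Columns) factor.
eta_sq_cols = ss_cols / ss_total

print(f"SS rows={ss_rows:.2f}  cols={ss_cols:.2f}  "
      f"error={ss_error:.2f}  total={ss_total:.2f}")
print(f"F rows={f_rows:.2f}  F cols={f_cols:.2f}")
print(f"eta squared (grade) = {eta_sq_cols:.3f}")
```

The much larger F for the grade columns than for the gender rows mirrors the pattern in the p-values interpreted next.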
As with the single factor ANOVA, we start out with a summary table for each variable showing the sum, average, and variance for each variable label. The ANOVA table has an extra row, and one renamed row. The Error row is what we knew as the Within row in the single factor ANOVA. The two rows dedicated to the data are Rows and Columns; these refer to how the variables are presented in the data input table.

The Rows line refers to our gender variable, since that is the row variable in the input. The p-value is 0.16 (rounded), so we do not reject the null hypothesis of equal means. The Columns line refers to the grade variable, as that was listed in the column position. This p-value is 3.76E-05, or 0.0000376. This is less than (<) our alpha of .05, so we reject the null hypothesis of equal salary means in each grade. We can find which pair(s) of means differ using the same technique as with the single factor ANOVA discussed in Lecture 8.

The effect size measure for a two-factor ANOVA without replication is generally the same as with the single factor ANOVA. For each variable it is eta squared = SS(for variable) divided by SS(total) (Tanner & Youssef-Morgan, 2013). The effect size for our rejected null hypothesis is 3865.341/3917.059 = .987 (rounded), a very large effect - meaning the variable interaction caused the rejection of the null, and we have a practically significant outcome, one we can make decisions with.

But let's go back to the other result, the failure to reject the null hypothesis claiming that the male and female average salaries are equal. What is going on with this outcome? We have clear evidence from the t-test done in Week 2 that the average salaries are not equal. This brings us to the other reason for using this test: to reduce one cause of error or variation in the measurement of a variable.
For example, if we think that grade level may be a cause of differences in salaries (a reasonable assumption), then we can remove its impact by using this approach. It will take the grade variation out of the overall analysis of salary and include it only in the grade results. What does this mean? We have been concerned that we have not been able to measure salary for "equal work"; this approach does this for us. The salary average difference examined in this test has the impact of grade level differences removed. In essence, the salary that is analyzed is the salary impact of gender if everyone did "equal work" (at least as far as job duties go). There are still some questions around the impact of performance ratings, education, seniority, etc. But for now, we have a better view of "equal pay for equal work" salary differences. It appears that perhaps males and females are being paid equally for equal work, on average. Ah, the power of statistics to make things clearer.

ANOVA: Two-factor with Replication (AKA Factorial ANOVA)

This form of the ANOVA test is somewhat different from the previous two forms. While it can test for mean equality (or differences), this is not its primary purpose. The main purpose is to look at the impact of interaction between variables - that is, do the results show different patterns when graphed? Interaction means the variables react differently at different measurement levels (Lind, Marchel, & Wathen, 2008). An example is water and temperature: at cold temperatures water is a solid, at mid-range temperatures it is a liquid, and at high temperatures it is a gas; there is a clear interaction going on.

As with the without replication test, an example will help demonstrate this test. We will continue with our gender and grade impact on salary.
While our primary research question will be whether an interaction between gender and grade impacts salary, we will also repeat our questions about mean salary differences by gender and grade.

Excel Example. To test the mean salaries by grade and gender at the same time, we would set up our hypothesis test as follows.

Step 1: Ho1: All salary means are equal across grades. Ha1: At least one mean differs.
Ho2: All gender (male and female) means are equal. Ha2: At least one mean differs.
Ho3: The interaction impact is not significant. Ha3: The interaction is significant.
Note that in this test, we need to have a hypothesis statement pair for each variable being tested, as well as for the interaction.
Step 2: ANOVA: Two-Factor With Replication.
Step 3: Reject the null hypothesis if the p-value is < alpha = .05.
Step 4: Perform the test. The input screen for this test is similar to that of the other ANOVA forms; it also asks for the number of rows for each variable, which, as seen below, would be two. The data table used is a bit different: as seen below, it has multiple values for each cell. Since several grades have only two males or females, we can only use two values in each cell in our table. If your data has more counts per cell, you can include more values. The data entry table was set up with the minimum and maximum salary values for each cell.

          A      B      C      D      E      F
Male      24.0   27.0   40.0   47.0   62.0   72.0
          25.0   28.0   47.0   49.0   66.0   77.0
Female    22.0   34.0   41.0   50.0   65.0   75.0
          24.0   36.0   42.0   55.0   69.0   77.0

The data entry box would include the entire table, labels and all. Running the test produces the output interpreted in the next step.

Step 5: Conclusions and Interpretation. As with the other ANOVA forms, we start out with a summary of the variables. In the ANOVA table itself, we have added ano