

Running head: VALLO STAT REPORT

Statistical Report
Dr. Mohamed Elseifi
QSO-530 Applied Statistics for Managers
June 27, 2015

Executive Summary

This report summarizes my findings on the correlation between two different fitness scores for Marines who all belong to the same unit. This study was important to ensure that all Marines comply with the standard established by the unit commander. With the results of this study, the commander can either continue with the current physical training regimen or alter it as necessary. The report contains the following: (1) an explanation of the study, (2) the method by which the samples were selected from the population and the chosen sample size, (3) the data, (4) the statistical methodology, and (5) a summary of my results.

The Study

The Marine Corps requires each Marine to run two different fitness tests during each calendar year. A physical fitness test (PFT) is required between January and June, while the combat fitness test (CFT) is required between July and December. The two tests measure different physical abilities through a series of three different events. Because the tests' event composition is drastically different, performance could vary significantly; a high score on one test does not guarantee a high score on the other. I aimed to identify whether a significant correlation exists between the scores of these two tests. These tests are part of Marines' promotion performance standards whereby the integrity of these scores is legitimized.

One hypothesis tested is written as follows:

H0: ρ = 0
HA: ρ ≠ 0

We will write another hypothesis test by running a two-factor ANOVA.
H0: Factors A and B do not interact to affect the mean response
HA: Factors A and B do interact

The sample selection and size

Because access to these scores is restricted, and to remain in compliance with regulations on the sensitive nature of personally identifiable information, I submitted a request through the Operations Department for the PFT and CFT scores of a total of 20 Marines. The sample consists of approximately 20% of the permanent staff on hand, which fluctuates between 90 and 110. My personal hypothesis is that there will be a significant correlation between the two scores; this assumption is based on experience and on having performed these tests for over 15 years.

The Data

The data I collected consists of PFT scores and CFT scores for the same 20 Marines, all of whom belong to the same unit but to different departments. The Marines are unaware that their scores are being utilized to conduct this analysis. As noted in Exhibit 1, the CFT mean and median are significantly higher than those of the PFT; also, the PFT has a standard deviation of over twice that of the CFT.

Statistical Methodology

I intend to compare 20 samples to determine if there is a statistically significant correlation between the two fitness scores. In layman's terms: if someone has a high PFT score, we can presume with 95% certainty that they will also have a high CFT score. For the purposes of this test, the term high refers to a score above the respective sample mean. The two tests are independent; while the data is paired (where P1 = PFT score for subject 1 and C1 = CFT score for subject 1), the two test scores are composed of entirely different events, and one does not affect the other. The time that can lapse between one test being conducted and the other is quite significant, up to 12 months in some cases.
Therefore, if a correlation exists, it likely also indicates that the Marine maintains the same relative level of fitness throughout the year. In order to obtain the correlation value, r, I will use the following formula:

$$ r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \, \sum_{i=1}^{N} (y_i - \bar{y})^2}} = 0.6337 $$

$$ t = r\sqrt{\frac{N-2}{1-r^2}} = 0.6337\sqrt{\frac{20-2}{1-0.6337^2}} = 0.6337\sqrt{\frac{18}{0.598}} = 0.6337\sqrt{30.1} = 0.6337(5.49) = 3.48 $$

Utilizing a p-value from Pearson (r) calculator with r = 0.6337 and N = 20, I obtained p = 0.0027, as displayed in Exhibit 4.

Now that I have obtained the r value and the p-value, I will test the correlation coefficient for significance to determine the likelihood that this would hold true for an entire population, perhaps the entire Marine Corps. Reject H0 if α > p; otherwise, do not reject. Given that α = 0.05 > p = 0.0027, I will reject the null hypothesis that there is no correlation between the two samples.

I also conducted a two-factor ANOVA to test for an interaction between the two scores, with PFT scores labeled as Factor A and CFT scores labeled as Factor B. Utilizing Microsoft Excel's Data Analysis add-in, I conducted a two-factor ANOVA without replication, as displayed in Exhibit 5.

$$ F = \frac{MS_{AB}}{MSE} = \frac{21.86}{12.14} = 1.8 $$

$$ F_{0.05} = F(0.05, 12, 64) = 1.9 $$

Because F = 1.8 < F0.05 = 1.9, we do not reject the null hypothesis; factors A and B do not interact.

Results

In order to determine the correlation coefficient, r, I utilized Microsoft Excel's 'Correlation' function, =CORREL(array1, array2), which produced a value of 0.6337. I then manually computed the t-statistic utilizing the formula above to arrive at the value of 3.48. I utilized the Excel MegaStat Add-in to run a scatterplot with the PFT score on the x-axis and the CFT score on the y-axis. The graph, listed as Exhibit 2, shows that the data points tend to stay close to the linear regression line, which indicates a positive correlation.
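As a cross-check on the hand calculation, the following is a minimal Python sketch that recomputes r, the t-statistic, and the least-squares trend line from the 20 score pairs listed in Exhibit 1. It uses only the standard library; the variable names (`pft`, `cft`, `sxy`, and so on) are illustrative assumptions and are not part of the Excel or MegaStat workflow described in this report.

```python
import math

# PFT and CFT scores for subjects 1-20, as listed in Exhibit 1.
pft = [181, 272, 273, 261, 260, 279, 241, 251, 223, 281,
       284, 293, 300, 222, 237, 235, 287, 279, 282, 223]
cft = [275, 284, 258, 293, 288, 287, 279, 288, 273, 290,
       299, 300, 298, 261, 261, 280, 291, 297, 295, 281]

n = len(pft)
mean_x = sum(pft) / n
mean_y = sum(cft) / n

# Sums of squares and cross-products about the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pft, cft))
sxx = sum((x - mean_x) ** 2 for x in pft)
syy = sum((y - mean_y) ** 2 for y in cft)

# Pearson correlation and its t-statistic with n - 2 degrees of freedom.
r = sxy / math.sqrt(sxx * syy)
t = r * math.sqrt((n - 2) / (1 - r ** 2))

# Least-squares line with PFT on the x-axis and CFT on the y-axis.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"r = {r:.4f}, t = {t:.2f}")
print(f"CFT = {slope:.2f} * PFT + {intercept:.2f}")
```

Run against the Exhibit 1 data, the sketch should agree with the reported r of 0.6337, the t-statistic of 3.48, and the trend line shown on the scatterplot.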
I then utilized the MegaStat Add-in to run a regression analysis; the results are displayed in Exhibit 3. The t-value computed via Excel is identical to that manually computed above. Exhibit 6 shows a regression analysis summary output, which displays R² = 0.4016. This figure indicates that knowing a Marine's CFT score will account for 40.16% of the variation in their PFT scores. While all data up to this point indicates a significant correlation, the two-factor ANOVA did not confirm this. I believe this to be a Type II error in the ANOVA, for which I fail to reject the null hypothesis. The difference in F-values of 0.1 is not sufficient to outweigh the remainder of the data obtained. Given that I conducted these tests with α = 0.05, I am able to inform the commander with 95% confidence that a Marine with a high PFT score will also have a high CFT score.

Exhibit 1

Subject   PFT Score   CFT Score
1         181         275
2         272         284
3         273         258
4         261         293
5         260         288
6         279         287
7         241         279
8         251         288
9         223         273
10        281         290
11        284         299
12        293         300
13        300         298
14        222         261
15        237         261
16        235         280
17        287         291
18        279         297
19        282         295
20        223         281

Summary of selected variables

                     PFT            CFT
Mean                 258            284
Median               267            288
Standard Deviation   30.70162005    12.89593979
Correlation          0.633743768

Exhibit 2

[Scatterplot: Fitness Score Correlation. x-axis: PFT Score; y-axis: CFT Score; trend line f(x) = 0.27x + 215.17, R² = 0.4]

Exhibit 3

Regression Analysis

r²           0.40
r            0.63
Std. Error   10.25
n            20
k            1
Dep. Var.    Y

ANOVA table

Source       SS        df   MS        F       p-value
Regression   1269.07   1    1269.07   12.08   0.00
Residual     1890.73   18   105.04
Total        3159.80   19

Regression output

Variables   Coefficients   Std. error   t (df=18)   p-value   95% lower   95% upper
Intercept   215.17         19.91        10.81       0.00      173.35      256.99
X1          0.27           0.08         3.48        0.00      0.11        0.43

Exhibit 4

R Score: .6336
N: 20
Significance Level: 0.01 0.05 0.10
The P-Value is 0.002705. The result is significant at p < 0.05.

Exhibit 5

Anova: Two-Factor Without Replication

SUMMARY    Count    Sum        Average   Variance
Row 1      2.000    456.000    228.000   4418.000
Row 2      2.000    556.000    278.000   72.000
Row 3      2.000    531.000    265.500   112.500
Row 4      2.000    554.000    277.000   512.000
Row 5      2.000    548.000    274.000   392.000
Row 6      2.000    566.000    283.000   32.000
Row 7      2.000    520.000    260.000   722.000
Row 8      2.000    539.000    269.500   684.500
Row 9      2.000    496.000    248.000   1250.000
Row 10     2.000    571.000    285.500   40.500
Row 11     2.000    583.000    291.500   112.500
Row 12     2.000    593.000    296.500   24.500
Row 13     2.000    598.000    299.000   2.000
Row 14     2.000    483.000    241.500   760.500
Row 15     2.000    498.000    249.000   288.000
Row 16     2.000    515.000    257.500   1012.500
Row 17     2.000    578.000    289.000   8.000
Row 18     2.000    576.000    288.000   162.000
Row 19     2.000    577.000    288.500   84.500
Row 20     2.000    504.000    252.000   1682.000
Column 1   20.000   5164.000   258.200   942.589
Column 2   20.000   5678.000   283.900   166.305

ANOVA

Source of Variation   SS          df       MS         F        P-value   F crit
Rows                  15301.900   19.000   805.363    2.653    0.020     2.168
Columns               6604.900    1.000    6604.900   21.760   0.000     4.381
Error                 5767.100    19.000   303.532

Exhibit 6

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.633743768
R Square            0.401631163
Adjusted R Square   0.36838845
Standard Error      24.39980325
Observations        20

Milestone 4 Final Report QSO530 Q3768

Instructor feedback

Good job on your analysis. To answer your research question you do not need ANOVA, but a mean comparison test (assuming equal or unequal variances). You will need to include at least one more test to support your initial findings (chi-square, for example) or add more data related to this test and include one more analysis step. You will also need to elaborate on the conclusions of each test as well as the limitations of the results obtained. You will need to include a section describing the motivation for the study and possible future work to improve this project.
Leak Test
QSO 530: Applied Statistics for Managers 16TW3 Q3768
Southern New Hampshire University, College of Continuing Education
Submitted to Dr. Mohamed Elseifi
Shelby Burroughs
May 5th, 2016

Executive Summary

The purpose of this report is to analyze leak test machine downtime. The test compares two different shifts on one leak tester. Staff-level management requested the study to ensure the processes were being followed according to company policy. The standard for escalating downtime issues on any assembly line is a threshold of two hours. This standard applies to cumulative downtime or to a single event. The study will help to determine whether retraining the operators in proper escalation procedures is necessary or whether the equipment needs to be adjusted. The report is broken down into the following areas: a description of the report, the sample and sample size, the data that was collected, the statistical methods, and the conclusion.

Description of the Report

The company has a standard operating procedure to cover excessive downtime during a shift. The policy is to escalate downtime issues if total downtime during a shift, for the same reason, reaches two hours. This analysis was conducted to see if there is indeed excessive downtime and if the process is being followed. The report is the compilation of the data collected. The data was collected for one working month for both shifts. This involved one leak test machine with multiple operators, which allows for examining the difference of the means. The data collected and the calculations are shown throughout the charts and tables.

The hypothesis is written as:

H0: Shift 2 downtime > Shift 1 downtime
HA: population means are not the same

Stated in terms of the population means:

H0: μ1 = μ2
HA: μ1 ≠ μ2

The Sample and Size

The samples and the sample size were chosen to cover one month of assembly production.
The data includes 22 samples from one leak test machine and 22 samples from the total leak test machines. These are listed in Table 1 and Table 2 under Data Collected. There are a total of 5 stations set up for leak testers. The samples from the one leak tester represent 20% of the total population. The results of the calculations should be statistically accurate.

Data Collected

Table 1: Sample List, 1 Leak Tester

Sample   1st Shift   2nd Shift
1        61          157
2        65          136
3        62          160
4        67          160
5        69          144
6        70          188
7        67          123
8        67          144
9        69          195
10       70          159
11       61          171
12       67          117
13       70          160
14       65          126
15       67          166
16       62          196
17       62          154
18       65          116
19       70          157
20       64          155
21       63          169
22       61          198

Statistical Methodology

The report compares 22 samples from 1 leak test machine across 2 shifts. The goal is to determine whether there is a significant difference in the means of the two samples. The samples are independent of each other (Statistics Solutions, n.d.). To explain this, if second shift has a larger mean, we can assume with 95% probability that there is excessive downtime on second shift that is not being escalated in the correct time frame. Table 1 was converted to a chart to see the trend lines.

Independent t-test sample

[Chart 1: Trend Line of Downtime. Title: Leak Test DT; x-axis: samples 1 to 22; y-axis: Minutes; series: 1st Shift, 2nd Shift]

The summary of the statistical measures was calculated with Excel and is shown below.

Table 2: Statistical Summary

Groups             Count   Sum    Average   Variance   Median   Standard Deviation
1 Leak 1st Shift   22      1444   65.64     10.62      66       3.26
1 Leak 2nd Shift   22      3451   156.86    583.27     158      24.15

The coefficient of variation for the samples was also calculated using Excel.

Exhibit 1: Coefficient of Variation

Groups             Coefficient of Variation
1 Leak 1st Shift   0.05
1 Leak 2nd Shift   0.15

From this we know that second shift has greater variability as well.
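The Table 2 and Exhibit 1 figures can be recomputed from the Table 1 data. The sketch below is a minimal Python version using only the standard `statistics` module; it assumes sample (n − 1) variance and standard deviation, which is what Excel's VAR and STDEV functions use, and the helper name `summarize` is hypothetical.

```python
import statistics

# Downtime in minutes for samples 1-22, as listed in Table 1.
first_shift = [61, 65, 62, 67, 69, 70, 67, 67, 69, 70, 61,
               67, 70, 65, 67, 62, 62, 65, 70, 64, 63, 61]
second_shift = [157, 136, 160, 160, 144, 188, 123, 144, 195, 159, 171,
                117, 160, 126, 166, 196, 154, 116, 157, 155, 169, 198]


def summarize(minutes):
    """Return the summary measures reported in Table 2 and Exhibit 1."""
    mean = statistics.mean(minutes)
    sd = statistics.stdev(minutes)  # sample standard deviation (n - 1)
    return {
        "count": len(minutes),
        "sum": sum(minutes),
        "mean": mean,
        "median": statistics.median(minutes),
        "variance": statistics.variance(minutes),  # sample variance (n - 1)
        "sd": sd,
        "cv": sd / mean,  # coefficient of variation
    }


summary = {
    "1st Shift": summarize(first_shift),
    "2nd Shift": summarize(second_shift),
}

for shift, stats in summary.items():
    print(shift, {name: round(value, 2) for name, value in stats.items()})
```

The coefficient of variation divides each standard deviation by its own mean, so it compares the spread of the two shifts on a relative scale even though their average downtime differs widely.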
Conclusion

The hypothesis tested is H0: Shift 2 downtime > Shift 1 downtime and HA: the two population means are not the same. H0 is not rejected because the downtime on shift 2 is higher than on shift 1. The mean downtime in shift 1 is 65.64 minutes with a standard deviation of 3.26 minutes, and the median downtime in shift 1 is 66 minutes. The mean downtime in shift 2 is 156.86 minutes with a standard deviation of 24.15 minutes, and the median downtime in shift 2 is 158 minutes. There is sufficient evidence not to reject H0, the null hypothesis, at the 95% confidence level. We can say there is higher downtime on shift 2, and we can say the shifts do not have the same mean downtime.

References

Groebner, D., Shannon, P., & Fry, P. (2014). Business Statistics: A Decision-Making Approach (9th ed.). Retrieved from https://bookshelf.vitalsource.com/#/books/9780133557886/cfi/6/6!/4/4@0:0

Stockburger, D. (1998). Why Multiple Comparisons Using t-tests is NOT the Analysis of Choice. Retrieved from http://www.psychstat.missouristate.edu/introbook/sbk27.htm

Statistics Solutions. (n.d.). Conduct and Interpret an Independent Sample T-Test. Retrieved from http://www.statisticssolutions.com/independent-sample-t-test/

ANOVA: Single Factor

Summary

Groups             Count   Sum    Average   Variance
1 Leak 1st Shift   22      1444   65.64     10.62
1 Leak 2nd Shift   22      3451   156.86    583.27

Source           SS          df   MS         F        p-value       F crit
Between Groups   91546.57    1    91546.57   308.29   5.87834E-21   4.07
Within Groups    12471.68    42   296.94
Total            104018.25   43

[Chart: Leak Test Downtime. y-axis: Minutes; series: 1st Shift, 2nd Shift, Linear (2nd Shift); trend line f(x) = 10.5939393939x, R² = 0.7693827315]
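The single-factor ANOVA figures can be reproduced from the Table 1 downtime data. The following is a minimal Python sketch of the sums-of-squares arithmetic, using only built-in functions; the variable names are illustrative assumptions, and the p-value and critical value are left out because they require an F distribution that the standard library does not provide.

```python
# Downtime in minutes for samples 1-22, as listed in Table 1.
first_shift = [61, 65, 62, 67, 69, 70, 67, 67, 69, 70, 61,
               67, 70, 65, 67, 62, 62, 65, 70, 64, 63, 61]
second_shift = [157, 136, 160, 160, 144, 188, 123, 144, 195, 159, 171,
                117, 160, 126, 166, 196, 154, 116, 157, 155, 169, 198]

groups = [first_shift, second_shift]
all_values = [value for group in groups for value in group]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(group) / len(group) for group in groups]

# Between-groups: spread of each group mean around the grand mean.
ss_between = sum(len(group) * (mean - grand_mean) ** 2
                 for group, mean in zip(groups, group_means))

# Within-groups: spread of each observation around its own group mean.
ss_within = sum((value - mean) ** 2
                for group, mean in zip(groups, group_means)
                for value in group)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SS between = {ss_between:.2f} (df = {df_between})")
print(f"SS within  = {ss_within:.2f} (df = {df_within})")
print(f"F = {f_stat:.2f}")
```

An F statistic this far above the tabulated critical value of 4.07 indicates that the two shift means differ at the 0.05 level.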
