Hello,
I am really struggling to understand the following question and would really appreciate some assistance. The reference in part (a)(i) to pages 177-179 of Unit 4 is to a list of three types of investigation: summarising, comparing, and seeking a relationship.
This question is based on your work on MU123 up to and including Unit 4.

An IT company develops software for the hospitality sector. It tests its code using an industry standard software package. Aiming to expand their business and be more competitive in the sector, the company has decided to pilot an inhouse software package to test its code. A company researcher wishes to compare the times taken for each package to complete tests. Both software packages were run on 20 occasions testing the same code. The researcher records the times taken on each occasion and these are shown in Table 1.

Table 1 Comparison of detection times in minutes for the industry standard and inhouse software testing packages

Test number    Industry standard software package    Inhouse software package
 1             354                                   323
 2             327                                   322
 3             341                                   308
 4             390                                   332
 5             373                                   316
 6             386                                   324
 7             301                                   326
 8             356                                   322
 9             653                                   335
10             370                                   342
11             307                                   316
12             342                                   326
13             384                                   345
14             379                                   338
15             363                                   333
16             364                                   336
17             313                                   322
18             336                                   320
19             390                                   351
20             373                                   332

(a) (i) Which of the three types of investigation discussed on pages 177-9 of Unit 4 is this? Explain your answer briefly. [2]
    (ii) Are these primary or secondary data, from the researcher's point of view? Explain your answer briefly. [2]

(b) Spend a few minutes scanning these datasets by eye.
    (i) List any three features you should be looking for when scanning datasets by eye. [3]
    (ii) Comment on whether or not you think there might be a problem with any of the most extreme values in each column. [3]

(c) Copy and complete the following table using the datasets given above to work out the missing values. [4]

Number of minutes to complete the test

                             Industry standard    Inhouse
Minimum (Min)
Lower quartile (Q1)          338.5                322
Median                       363.5                326
Upper quartile (Q3)          381.5                335.5
Maximum (Max)
Mean                         370.1                328.5
Standard deviation (SD)      70.21                10.46
Interquartile range (IQR)
Range
Size of dataset (n)          20                   20

(d) (i) Identify the two measures of location from the table in part (c). Use both of these measures to determine which of the two datasets has the higher location. [4]
    (ii) Identify the three measures of spread from the table in part (c). Which of the two datasets has the wider spread, as measured by each of these three measures? [4]

(e) (i) The researcher concludes that the inhouse software package runs quicker than the industry standard software package. Is this a reasonable conclusion? Explain your answer briefly. [2]
    (ii) Which stage of the statistical investigation is used in part (e)(i)? Briefly justify your answer. [2]

(f) The researcher notices that the industry standard software package entry for the ninth test was a typing error. The correct value should have been 353. The revised mean and median for the industry standard software package data with the correct value for the ninth test are given in the following table.

Industry standard software package

                   With typing error    With correct value
Mean               370.1                355.1
Median             363.5                359.5
Size of dataset    20                   20

What is the effect on the mean and on the median of including the typing error instead of the correct value? Explain why this happens. [2]

(g) Would having the correct value affect the researcher's conclusion in part (e)(i)? Explain your answer briefly. [2]
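
In case it helps anyone checking the figures, here is a minimal Python sketch of how the values in the part (c) table and the part (f) table could be computed from Table 1. It rests on a few assumptions of mine. The list names and the helpers `quartiles` and `summary` are made up for illustration. The quartiles are taken as the medians of the lower and upper halves of the sorted data, which reproduces the Q1 and Q3 values printed in the question, but I am not certain it is the method Unit 4 intends. The standard deviation is the population form, which matches the printed 70.21 and 10.46.

```python
from statistics import mean, median, pstdev

# Table 1 values as transcribed in this post, in test-number order 1 to 20.
industry = [354, 327, 341, 390, 373, 386, 301, 356, 653, 370,
            307, 342, 384, 379, 363, 364, 313, 336, 390, 373]
inhouse = [323, 322, 308, 332, 316, 324, 326, 322, 335, 342,
           316, 326, 345, 338, 333, 336, 322, 320, 351, 332]


def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumed method; needs at least two values. For an odd-sized
    dataset the middle value is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])


def summary(values):
    """Collect the measures listed in the part (c) table."""
    q1, q3 = quartiles(values)
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
        "mean": mean(values),
        "sd": pstdev(values),  # population SD (assumption, see above)
        "iqr": q3 - q1,
        "range": max(values) - min(values),
        "n": len(values),
    }


# Part (f): replace the ninth entry (index 8) with the stated correct value.
corrected = list(industry)
corrected[8] = 353

for label, data in [("industry", industry), ("inhouse", inhouse),
                    ("industry corrected", corrected)]:
    print(label, summary(data))
```

Calling `summary(industry)` and `summary(inhouse)` gives the entries for the part (c) table, and comparing `summary(industry)` with `summary(corrected)` shows how much further the single mistyped value shifts the mean than the median.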