
Question


An Investigation into Obesity Rates

Researchers have been interested in studying worldwide obesity trends over the past 25 years. In 2014 the World Health Organization (WHO) declared that there were 1.6 billion overweight adults worldwide, 600 million of whom were considered obese. Because of concerns about obesity-related diseases such as diabetes and heart disease, it is of interest to see in which countries obesity rates are increasing and in which they are decreasing. This task looks at obesity rates in 1990 and 2010 for 26 countries. The rates are percentages of each country's total adult population. Using the data in the attached "obesity datasheet", answer the following questions, showing full working.

PART THREE: REGRESSION ANALYSIS
Here is a scatterplot of the relationship between the 1990 and 2010 obesity rates for the 26 countries listed in the data sample below.
A) Write down the explanatory variable and the response variable.
B) Perform a linear regression analysis and state the least squares regression line for the scatterplot above. Write the equation in the form y = a + bx and state values to 2 decimal places.
C) Draw the least squares regression line found in part B on the scatterplot above.
D) Interpret the slope of this least squares regression line in terms of the variables 1990 obesity rate and 2010 obesity rate.
E) Interpret the vertical intercept of the least squares regression line in terms of the variables 1990 obesity rate and 2010 obesity rate.
F) In 1990, 18.4% of the population of Belarus was obese, whereas in 2010, 17.1% was obese. Calculate the residual when the least squares regression line is used to predict the 2010 obesity rate from the 1990 obesity rate. Round to 1 decimal place.
G) Calculate the correlation coefficient (r) for the 26 countries and comment on what this indicates about the relationship between the prevalence of obesity in 1990 and 2010. State the value to 4 decimal places.
H) Calculate the coefficient of determination (r^2) for the 26 countries and comment on what this indicates about the relationship between the prevalence of obesity in 1990 and 2010. State the value to 4 decimal places.

PART FOUR: TRANSFORMING THE DATA
In an attempt to improve the linearity of the data, we will look at different transformations.
A) Suggest three possible transformations.
B) Apply each of the 3 transformations to your data. Complete the table below for the transformed data values. Use the same explanatory and response variables as in Part Three. State all equations in terms of the variables used and round the intercept and slope to 2 decimal places and the correlation coefficient to 4 decimal places.
   Table columns: Transformation; Equation of least squares regression line; Correlation coefficient
C) Complete the residual plots for each transformation.
   Table columns: Transformation; Residual plot
D) Compare and comment on the residual plots of each transformation.
E) Comment on the improvement or otherwise of each transformation equation over the original equation. Use mathematical reasoning to support your comments.
F) Choose the best equation for calculating predicted values of the four considered (the original from Part Three B plus the 3 transformations).
H) Use the provided equation to predict a country's obesity rate in 2010 if it was 31% in 1990.
I) Which prediction from part d) and e) is more reliable? Explain.

PART FIVE: TIME SERIES
The time series graph below shows the obesity trend in Australia from 2000 to 2016 per 100,000 people.
A) Use the time series graph to describe the trend of the obesity rate in Australia from 2000 to 2016.
B) Smooth the graph above using the 5 point median smoothing graphical method. Mark your points using X. Use the table below to complete a 5 point moving mean. Round all values to 1 decimal place.
Year    Obesity rate per 100,000    Five point moving mean
2000    5
2001    5.4
2002    6
2003    6.3
2004    6.3
2005    7.5
2006    7
2007    6.8
2008    7.4
2009    8
2010    8.5
2011    7.5
2012    8.8
2013    9.4
2014    9.1
2015    10.8
2016    11.3

C) Plot the 5 point moving mean data onto the scatterplot above.
D) Comment on the effectiveness of the two smoothing techniques in identifying any other trends.

MAV SACs 2017 CORE: DATA ANALYSIS APPLICATION TASK: SHOT PUT

A LOOK AT THE OLYMPIC SHOT PUT RESULTS
The modern Olympic Games were first held in 1896 in Athens, Greece. Since then the competition has moved all over the world and over 200 nations participate. There are many different sports to compete in, and in this task you will focus on shot put results for men and women. The Excel spreadsheet contains all the information you will need:
1. The year the Olympic Games were held
2. The city the Olympic Games were held in
3. The gold medallist men's shot put throw (metres)
4. The gold medallist women's shot put throw (metres)
Note: Women's results are only available from 1948 onwards, so there is a blank for the years before 1948.

PART ONE: UNIVARIATE DATA FOR MEN'S SHOT PUT
a) Create a stem and leaf plot for the men's shot put data using the key below.
Men's Olympic Shot Put Results in metres
Key: 11 | 2 = 11.2
Stems: 11 12 13 14 15 16 17 18 19 20 21 22
Leaves: 2 8 2 0 0 1 6 7 3 1 5 8 8 3 8 2 4 5 2 2 3 4 4 5 6 7 9 5
(2 marks)
b) Determine the range for the men's shot put results.
Range = 22.5 - 11.2 = 11.3 (1 mark)
c) Enter the men's shot put data into your calculator, calculate the five-number summary and place the values in the table below. Round values to 1 decimal place.
Minimum    11.2
Q1         15.6
Median     20
Q3         21.4
Maximum    22.5
(2 marks)
d) Conduct an outlier test for the men's shot put results and state whether there are any outliers.
IQR = 21.4 - 15.6 = 5.8
Lower fence: 15.6 - (1.5 × 5.8) = 6.9
Upper fence: 21.4 + (1.5 × 5.8) = 30.1
There are no outliers as no values are less than 6.9 or more than 30.1.
(2 marks)
e) Plot a box plot of the men's shot put results on the axes below. Use an appropriate scale. (2 marks)
f) Comment on the shape of the boxplot.
The boxplot shows a negatively skewed shape. (1 mark)
g) Calculate the mean and standard deviation for the men's shot put results and place the results in the table below. Round all values to 1 decimal place.
Mean                  18.6
Standard deviation    3.3
(2 marks)
h) Calculate the z-score for the men's shot put result from London (1948) and round to 2 decimal places.
$z = \dfrac{17.1 - 18.6}{3.3} = -0.45$ (1 mark)
i) The z-score for Athens (2004) is 0.8. Explain what this result means.
The shot put result for Athens 2004 is 0.8 standard deviations above the mean. (1 mark)
j) Which measure of central tendency, mean or median, would be most suitable to describe the men's shot put results? Explain your choice.
The median is the most suitable measure of central tendency because the data is skewed. (2 marks)

PART TWO: REGRESSION ANALYSIS
The table below shows some statistics that have been calculated for this data.
                              Mean       Standard deviation
Year (x)                      1958.43    37.28
Men's shot put result (y)     18.57      3.29
Correlation coefficient: r = 0.9583
a) Using the information in the table, show that the equation of the least squares regression line for the men's shot put result, y, in terms of the Olympic year, x, is given by $y = 0.08457x - 147$.
$b = r \dfrac{s_y}{s_x} = 0.9583 \times \dfrac{3.29}{37.28} = 0.08457$
$a = \bar{y} - b\bar{x} = 18.57 - 0.08457 \times 1958.43 = -147.1$
After rounding: $y = 0.08x - 147$ (2 marks)
Below is a scatterplot of the men's Olympic shot put results against year.
[Scatterplot: Throw (metres), 0.0 to 25.0, against Year, 1880 to 2040]
b) Draw the least squares regression line found in part (a) on the scatterplot above.
Shown on graph.
(1 mark)
c) Interpret the slope of this least squares regression line in terms of the variables Year and Shot Put Throw.
On average, the shot put throw increases by 0.08457 metres for each one-year increase in Olympic year. (1 mark)
d) Explain why it does not make sense to interpret the vertical intercept.
Since the vertical intercept is -147, it does not make sense to interpret it, as it is impossible to throw a shot put -147 metres. (1 mark)
e) Calculate the residual values for each Olympic year to 1 decimal place and place the values in the table. Plot the residual plot on the axes below.
Year    Residual
1896    -2.1
1900    0.2
1904    0.8
1908    -0.1
1912    0.7
1920    -0.5
1924    -0.6
1928    -0.2
1932    -0.3
1936    -0.5
1948    -0.6
1952    -0.6
1956    0.2
1960    1.0
1964    1.3
1968    1.1
1972    1.5
1976    1.0
1980    1.0
1984    0.6
1988    1.4
1992    0.3
1996    -0.2
2000    -0.7
2004    -1.2
2008    -1.3
2012    -1.2
2016    -0.9
(6 marks)
f) Comment on what the residual plot indicates.
The residual plot shows a slightly curved pattern, which could indicate a non-linear relationship. (1 mark)
g) Comment on what the correlation coefficient indicates about the relationship between the Olympic year and shot put throw distance.
r = 0.96, which indicates a strong, positive, linear association between Olympic year and shot put throw distance. (1 mark)
h) What percentage of the variation in the shot put throw distance is explained by the variation in the Olympic year? State the answer to 2 decimal places.
$r^2 = 0.9184 = 91.84\%$ (1 mark)

PART THREE: TRANSFORMATIONS
In an attempt to improve the linearity of the data, we will look at 3 different transformations: $1/x$, $\log(x)$ and $y^2$.
a) Apply each of the 3 transformations to your data. Complete the table below for the transformed data values. Use y to represent the shot put throw in metres and x to represent the year in your equations.
Round all values to 3 decimal places.
Transformation    Equation of least squares regression line         Correlation coefficient
$1/x$             $y = -324336 \times \dfrac{1}{x} + 184.240$       -0.961
$\log(x)$         $y = 381.590 \log(x) - 1237.557$                  0.960
$y^2$             $y^2 = 3.032x - 5582.73$                          0.963
(6 marks)
b) Comment on the improvement or otherwise of each transformation equation over the original equation. Use mathematical reasoning to support your comments. Choose the best equation for calculating predicted values of the four considered (the original from Q2a plus the 3 transformations).
Each transformation has improved the linearity, as all three r values are stronger in magnitude than the original r value of 0.9583. The best equation for calculating predictions is $y^2 = 3.032x - 5582.73$, as its r value of 0.963 is closest to 1. (3 marks)
c) Use your chosen equation to predict the Olympic shot put throw in 2020 to 1 decimal place.
$y^2 = 3.032(2020) - 5582.73 = 541.91$, so $y = \sqrt{541.91} = 23.3$ metres (1 mark)
d) Use your chosen equation to predict the Olympic shot put throw in 2032.
$y^2 = 3.032(2032) - 5582.73 = 578.29$, so $y = \sqrt{578.29} = 24.0$ metres (1 mark)
e) Which prediction, from part c) or part d), is more reliable? Explain.
The prediction from part c) (2020) is likely to be more reliable as it is closer to the range of the data set than 2032 is. (2 marks)

PART FOUR: REGRESSION ANALYSIS, WOMEN'S SHOT PUT
Below is a scatterplot of the women's shot put results against the Olympic year for the years in which the women's event was held.
[Scatterplot: Shot Put Throw (metres), 10.0 to 24.0, against Year, 1940 to 2020]
a) Describe the association between the shot put throw and the Olympic year.
Positive, moderate, linear association. (1 mark)
b) Perform a linear regression analysis and state the least squares regression line for the scatterplot above. Write the equation in the form $y = a + bx$ and state values to 3 decimal places.
$y = 0.075x - 129.235$ (2 marks)
c) Draw the least squares regression line found in part (b) on the scatterplot above.
Shown on graph above. (1 mark)
d) Calculate the r value to 3 decimal places.
r = 0.683 (1 mark)
e) Predict the women's shot put throw for the Olympics in 2020. Round your answer to 1 decimal place.
22.3 metres (1 mark)
f) In the women's shot put data there are a number of outliers that may be affecting the relationship. Remove the data points for the years 2004 and 2008 and recalculate the least squares regression line. State the new equation and correlation coefficient. Round all values to 3 decimal places.
$y = 0.089x - 155.959$, r = 0.736 (4 marks)
g) Compare the two least squares regression lines from part (b) and part (f). Which regression line would be best used to make predictions? Explain.
The best regression line to use is the equation from part (f), as its r value of 0.736 is closer to 1 than the r value of 0.683 from the equation in part (b). (2 marks)

PART FIVE: SMOOTHING
To improve the linearity of the women's shot put results, it is thought best to smooth the data.
a) In the table below, fill in the shaded cells by completing a 3 point moving mean. Round all values to 1 decimal place.
Olympic year    Women's shot put result    3 point moving mean
1948            13.8
1952            15.3                       15.2
1956            16.6                       16.4
1960            17.3                       17.3
1964            18.1                       18.3
1968            19.6                       19.6
1972            21                         20.6
1976            21.2                       21.5
1980            22.4                       21.4
1984            20.5                       21.7
1988            22.2                       21.1
1992            20.6                       21.1
1996            20.6                       20.6
2000            20.6                       20.3
2012            20.7                       20.3
2016            20.6
(4 marks)
b) Calculate the least squares regression line in the form $y = a + bx$ for the smoothed data in Q5a) and the r value. Round all values to 3 decimal places.
$y = 0.083x - 145.439$, r = 0.720 (2 marks)
c) Another method of identifying a trend in the data is to graphically smooth the data using a 3 point moving median. Complete this on the graph below.
[Graph: Women's Shot Put Throw (metres), 0.0 to 25.0, against Year, 1940 to 2020]
See graph below for the 3 point moving median.
[Graph: Smoothed Shot Put Throw (metres), 0.0 to 25.0, against Year, 1940 to 2020]
(2 marks)
d) Comment on the effectiveness of the two smoothing techniques conducted in part (a) and part (c) in revealing the underlying trend of the data. Identify which technique best smoothed this data.
Both smoothing techniques show an increasing trend that is beginning to flatten out and become constant. The 3 point moving median graphical technique clearly shows the constant trend from 1988 onwards. The 3 point moving mean technique smoothed the data best as it takes all 3 points into account. (2 marks)

PART FIVE (alternative: Deseasonalising Data)
The number of hours the Australian shot put competitors train each week fluctuates. A 4 week schedule for one competitor is shown below for Monday to Friday.
             Week 1    Week 2    Week 3    Week 4
Monday       4         5.5       4.8       5
Tuesday      6.2       6.7       6.0       5.8
Wednesday    3.5       3         2.8       3.8
Thursday     7.1       7.4       7.4       7
Friday       2.8       3.4       4.1       2
a) Plot the data as a time series plot, using and explaining an appropriate time code.
[Time series plot: vertical axis 0 to 8, time code 0 to 25]
Time code: 1 = Monday Week 1, 2 = Tuesday Week 1, etc. (2 marks)
b) Describe the time series plot in terms of pattern and trend.
The time series plot shows a seasonal trend. (1 mark)
c) Explain why it would be beneficial to deseasonalise the data.
It would be beneficial as the data is seasonal, and removing the high and low fluctuations would help to reveal any other trends. (1 mark)

A deseasonalised calculation has been started in the tables below.
a) Fill in the blank spaces (highlighted grey) to complete the deseasonalisation.
Step 1: Round answers to 2 decimal places.
             Week 1    Week 2    Week 3    Week 4
Monday       4         5.5       4.8       5
Tuesday      6.2       6.7       6.0       5.8
Wednesday    3.5       3         2.8       3.8
Thursday     7.1       7.4       7.4       7
Friday       2.8       3.4       4.1       2
Average      4.72      5.20      5.02      4.72
(2 marks)
Step 2: Round answers to 3 decimal places.
             Week 1    Week 2    Week 3    Week 4
Monday       0.847     1.058     0.956     1.059
Tuesday      1.314     1.288     1.195     1.229
Wednesday    0.742     0.577     0.558     0.805
Thursday     1.504     1.423     1.474     1.483
Friday       0.593     0.654     0.817     0.424
(5 marks)
Step 3: Round answers to 2 decimal places.
Seasonal indices
Monday       0.98
Tuesday      1.26
Wednesday    0.67
Thursday     1.47
Friday       0.62
(2 marks)
Step 4: Round answers to 1 decimal place. Deseasonalise the original data.
             Week 1    Week 2    Week 3    Week 4
Monday       4.1       5.6       4.9       5.1
Tuesday      4.9       5.3       4.8       4.6
Wednesday    5.2       4.5       4.2       5.7
Thursday     4.8       5.0       5.0       4.8
Friday       4.5       5.5       6.6       3.2
(4 marks)
b) Add the deseasonalised data to the graph in part (a). Label the new graph.
[Graph: Series1 and Series2 plotted on a vertical axis of 0 to 8 against time code 0 to 25]
(1 mark)
c) Comment on the effectiveness of deseasonalising the data in revealing any underlying trends.
Deseasonalising has removed the high and low fluctuations and allows us to see a more linear trend than in the original data. (1 mark)
d) Determine the equation of the least squares regression line for the deseasonalised data using the least squares regression program on your calculator. Round all values to 3 decimal places.
$y = -0.002x + 4.939$ (2 marks)
e) Would this least squares regression line be helpful in making future predictions? Explain.
The r value for this regression equation is -0.02, which indicates no association. The regression line would therefore not be helpful in making future predictions. (2 marks)

USE THE FOLLOWING DATA TO WORK OUT THE QUESTIONS FOR PARTS 1, 2, ETC.
OBESITY RATE %
Country         1990     2010
Australia       19.30    27.50
Austria         9.00     23.00
Belarus         18.40    17.10
Brazil          6.90     12.50
Canada          13.00    27.60
Chile           11.00    19.20
China           1.00     11.80
Congo           2.30     2.50
Cuba            2.70     19.70
Finland         19.90    20.40
India           2.50     11.00
Iran            2.50     14.70
Japan           1.80     3.80
Morocco         4.30     6.00
Netherlands     8.50     10.40
New Zealand     13.90    32.10
Nigeria         6.50     6.50
Oman            10.50    16.70
Russia          7.10     10.30
Saudi Arabia    26.40    24.10
Sweden          9.10     14.00
Thailand        3.00     4.70
Tunisia         6.70     13.30
Turkey          7.80     15.30
USA             27.70    42.00
Venezuela       6.70     34.20
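For readers who want to check Part Three of the obesity task by computer, the following Python sketch fits a least squares line to the 26 data pairs listed above. It assumes the 1990 rate is the explanatory variable and the 2010 rate is the response variable, which is the choice part F of the task implies. The function name `least_squares` is illustrative and not part of the task, and no output values are quoted here because the task asks for them as working.

```python
from math import sqrt

# 1990 and 2010 obesity rates (%), in the country order of the data table
# (Australia, Austria, Belarus, ..., USA, Venezuela).
rate_1990 = [19.3, 9.0, 18.4, 6.9, 13.0, 11.0, 1.0, 2.3, 2.7, 19.9, 2.5, 2.5, 1.8,
             4.3, 8.5, 13.9, 6.5, 10.5, 7.1, 26.4, 9.1, 3.0, 6.7, 7.8, 27.7, 6.7]
rate_2010 = [27.5, 23.0, 17.1, 12.5, 27.6, 19.2, 11.8, 2.5, 19.7, 20.4, 11.0, 14.7, 3.8,
             6.0, 10.4, 32.1, 6.5, 16.7, 10.3, 24.1, 14.0, 4.7, 13.3, 15.3, 42.0, 34.2]


def least_squares(x, y):
    """Return (a, b, r): intercept and slope of y = a + b*x, plus Pearson's r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / sqrt(sxx * syy)
    return a, b, r


a, b, r = least_squares(rate_1990, rate_2010)

# Residual = actual value - predicted value, here for Belarus (18.4% -> 17.1%).
belarus_residual = 17.1 - (a + b * 18.4)

print(f"2010 rate = {a:.2f} + {b:.2f} x 1990 rate")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"Belarus residual = {belarus_residual:.1f}")
```

The same function can be reused for Part Four by passing transformed lists, for example `[x ** 2 for x in rate_1990]`.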
Chapter 5
Core: Data analysis
Data transformation

Cambridge University Press. Cambridge Senior Maths AC/VCE Further Mathematics 3&4. ISBN 978-1-316-61622-2. Jones et al. 2016. Photocopying is restricted under law and this material must not be transferred to another party.

5A Introduction
You first encountered data transformation in Chapter 1, where you used a log scale to transform a skewed histogram into a more easily interpreted symmetric histogram. In this chapter, you will learn to use the squared, log and reciprocal transformations to linearise scatterplots, the first step towards solving problems involving non-linear associations.

The circle of transformations
The types of scatterplots that can be transformed by the squared, log or reciprocal transformations can be fitted together into what we call the circle of transformations.

[Figure: the circle of transformations, showing four curve shapes and, for each, the possible transformations drawn from $x^2$, $\log x$, $1/x$, $y^2$, $\log y$ and $1/y$.]

The purpose of the circle of transformations is to guide us in our choice of transformation to linearise a given scatterplot. There are two things to note when using the circle of transformations:
1 In each case, there is more than one type of transformation that might work.
2 These transformations only apply to scatterplots with a consistently increasing or decreasing trend.

For example, the scatterplot opposite has a consistently increasing trend, so the circle of transformations applies. Comparing the scatterplot to those in the circle of transformations, we see that there are three transformations, the $x^2$, the $1/y$ or the $\log x$, that have the potential to linearise this scatterplot.

At this stage you might find it helpful to use the interactive 'Data transformation' (accessible through the Interactive Textbook) to see how these different transformations can be used to linearise scatterplots.
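As a rough illustration of why more than one candidate transformation is worth trying, the Python sketch below computes the correlation coefficient r for a data set before and after two candidate transformations. It borrows the base jumper heights tabulated later in this chapter. The function name `pearson_r` and the choice of candidates are illustrative assumptions, and r is only one indicator: the transformed scatterplot (and a residual plot) should still be inspected.

```python
from math import sqrt

# Base jumper data from the worked example later in the chapter.
time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]            # seconds
height = [1560, 1555, 1540, 1516, 1482, 1438, 1383,   # metres
          1320, 1246, 1163, 1070]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Candidate transformations: each entry is (transformed x, transformed y).
candidates = {
    "none": (time, height),
    "x^2": ([t ** 2 for t in time], height),
    "y^2": (time, [h ** 2 for h in height]),
}

results = {name: pearson_r(x, y) for name, (x, y) in candidates.items()}
for name, r in results.items():
    print(f"{name:>5}: r = {r:.4f}")
```

A value of r closer to -1 or 1 after transformation suggests the transformed data lie closer to a straight line.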
Exercise 5A
1 The scatterplots below are non-linear. For each, identify the transformations $x^2$, $\log x$, $1/x$, $y^2$, $\log y$, $1/y$ or none that might be used to linearise the plot.
[Four scatterplots, a to d, each with a horizontal axis from 0 to 10 and a vertical axis from 0 to 5]

5B Using data transformation to linearise a scatterplot

The squared transformation
The squared transformation is a stretching transformation. It works by stretching out the upper end of the scale on either the x- or y-axis. The effect of applying an $x^2$ transformation to a scatterplot is illustrated graphically below.

Transformation: $x^2$
Outcome: Spreads out the high x-values relative to the lower x-values, leaving the y-values unchanged. This has the effect of straightening out curves like the one shown opposite.

The y-squared transformation works in a similar manner but stretches out the scale on the y-axis.
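A small numerical illustration of this stretching effect, using the made-up x-values 1 to 5 (hypothetical values chosen only for illustration): equally spaced values end up progressively further apart once they are squared.

```python
# Hypothetical, equally spaced x-values used only to illustrate the idea.
xs = [1, 2, 3, 4, 5]
squared = [x ** 2 for x in xs]

# Gaps between neighbouring values before and after squaring.
gaps_before = [b - a for a, b in zip(xs, xs[1:])]
gaps_after = [b - a for a, b in zip(squared, squared[1:])]

print(gaps_before)  # [1, 1, 1, 1]
print(gaps_after)   # [3, 5, 7, 9]
```

The gaps grow as x grows, which is what pulls the upper end of a curve back towards a straight line.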
The following example shows how the x-squared transformation works in practice.

Example 1 Applying the squared transformation
A base jumper leaps from the top of a cliff, 1560 metres above the valley floor. The scatterplot below shows the height (in metres) of the base jumper above the valley floor every second, for the first 10 seconds of the jump. After this time she opened her parachute to bring her safely to the ground.

[Scatterplot: Height (metres), 1000 to 1600, against Time (seconds), 0 to 10]

A scatterplot shows that there is a strong negative association between the height of the base jumper above the ground and time. However, the association is clearly non-linear, as can be seen from the dotted line on the scatterplot. Because the association is clearly non-linear, it makes no sense to try to model the association with a straight line. Before we can fit a least squares line to the data, we need to linearise the scatterplot.
The circle of transformations suggests that we could use either an $x^2$ or a $y^2$ transformation to linearise this scatterplot. We will use the $x^2$ transformation. That involves changing the scale on the time axis to time$^2$. When we make this change, we see that the association between height and time$^2$ is linear. See the plot opposite.

[Scatterplot: Height (metres), 1000 to 1600, against Time$^2$, 0 to 100]

Now that we have a linearised scatterplot, we can use a least squares line to model the association between height and time$^2$. The equation of this line is:
$\text{height} = 1560 - 4.90 \times \text{time}^2$

Like any regression line, we can use its equation to make predictions. For example, after 3.4 seconds, we predict that the height of the base jumper is:
$\text{height} = 1560 - 4.90 \times 3.4^2 = 1503$ m (to the nearest m)

Performing a data transformation is quite computationally intensive, but your CAS calculator is well suited to the task.

Using the TI-Nspire CAS to perform a squared transformation
The table shows the height (in m) of a base jumper for the first 10 seconds of her jump.
Time      0     1     2     3     4     5     6     7     8     9     10
Height    1560  1555  1540  1516  1482  1438  1383  1320  1246  1163  1070
a Construct a scatterplot displaying height (the RV) against time (the EV).
b Linearise the scatterplot and fit a least squares line to the transformed data.
c Use the regression line to predict the height of the base jumper after 3.4 seconds.

Steps
1 Start a new document by pressing ctrl + N.
2 Select Add Lists & Spreadsheet. Enter the data into lists named time and height, as shown.
3 Name column C as timesq (short for 'time squared').
4 Move the cursor to the grey cell below timesq. Enter the expression = time^2 by pressing =, then typing time^2. Pressing enter calculates and displays the values of timesq.
5 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of height against time. Let time be the explanatory variable and height the response variable. The plot is clearly non-linear.
6 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of height against time$^2$. The plot is now linear.
7 Press menu>Analyze>Regression>Show Linear (a + bx) to plot the line on the scatterplot with its equation.
Note: The x in the equation on the screen corresponds to the transformed variable time$^2$.
8 Write down the regression equation in terms of the variables height and time$^2$.
$\text{height} = 1560 - 4.90 \times \text{time}^2$
9 Substitute 3.4 for time in the equation to find the height after 3.4 seconds.
$\text{height} = 1560 - 4.90 \times 3.4^2 = 1503$ m

Using the CASIO Classpad to perform a squared transformation
The table shows the height (in m) of a base jumper for the first 10 seconds of her jump.
Time      0     1     2     3     4     5     6     7     8     9     10
Height    1560  1555  1540  1516  1482  1438  1383  1320  1246  1163  1070
a Construct a scatterplot displaying height (the RV) against time (the EV).
b Linearise the scatterplot and fit a least squares line to the transformed data.
c Use the regression line to predict the height of the base jumper after 3.4 seconds.
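The same three tasks can also be sketched in a general-purpose language. The Python below is a minimal sketch rather than the textbook's method: it squares the times, fits a least squares line using the standard formulas, and predicts the height at 3.4 seconds. The function name `fit_line` is hypothetical.

```python
# Base jumper data from the table above.
time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]            # seconds (EV)
height = [1560, 1555, 1540, 1516, 1482, 1438, 1383,   # metres (RV)
          1320, 1246, 1163, 1070]

# Squared transformation: replace time with time squared.
time_sq = [t ** 2 for t in time]


def fit_line(x, y):
    """Least squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(time_sq, height)
print(f"height = {a:.0f} + ({b:.2f}) x time^2")

# Prediction: the time must be squared before it is substituted.
predicted = a + b * 3.4 ** 2
print(f"predicted height after 3.4 s: {predicted:.0f} m")  # about 1503 m
```

Note that the fitted line is in terms of time$^2$, so the value substituted into it is $3.4^2$, not 3.4.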
Steps
1 In the Statistics application enter the data into lists named time and height.
2 Name the third list timesq (short for time squared).
3 Place the cursor in the calculation cell at the bottom of the third column and type time^2. This will calculate the values of time².
Let time be the explanatory variable (x) and height the response variable (y).
4 Construct a scatterplot of height against time.
- Tap and complete the Set StatGraphs dialog box as shown.
- Tap to view the scatterplot.
The plot is clearly non-linear.
5 Construct a scatterplot of height against time².
- Tap and complete the Set StatGraphs dialog box as shown.
- Tap to view the scatterplot.
The plot is now clearly linear.
6 Fit a regression line to the transformed data.
- Go to Calc, Regression, Linear Reg.
- Complete the Set Calculation dialog box as shown and tap OK.
Note: The 'x' in the linear equation corresponds to the transformed variable time².
- Tap OK a second time to plot and display the regression line on the scatterplot.
7 Write down the equation in terms of height and time².
height = 1560 - 4.90 × time²
8 Substitute 3.4 for time in the equation.
height = 1560 - 4.90 × 3.4² = 1503 m

Exercise 5B

The x-squared transformation: some prerequisite skills

1 Evaluate y in the following expressions, correct to one decimal place.
a y = 7 + 8x² when x = 1.25
b y = 7 + 3x² when x = 1.25
c y = 24.56 - 0.47x² when x = 1.23
d y = -4.75 + 5.95x² when x = 4.7

The x-squared transformation: calculator exercises

2 The scatterplot opposite was constructed from the data in the table below.

x   0   1   2   3   4
y   16  15  12  7   0

[Scatterplot: y against x]

From the scatterplot, it is clear that the association between y and x is non-linear.
a Linearise the scatterplot by applying an x-squared transformation and fit a least squares line to the transformed data.
b Give its equation.
c Use the equation to predict the value of y when x = -2.

3 The scatterplot opposite was constructed from the data in the table below.

x   1   2   3   4   5
y   3   9   19  33  51

[Scatterplot: y against x]

From the scatterplot, the association between y and x is non-linear.
a Linearise the scatterplot by applying an x-squared transformation and fit a least squares line to the transformed data.
b Give its equation.
c Use the equation to predict the value of y when x = 6.

The y-squared transformation: some prerequisite skills

4 Evaluate y in the following expressions. Give the answers correct to one decimal place.
a y² = 16 + 4x when x = 1.57
b y² = 1.7 - 3.4x when x = 0.03
c y² = 16 + 2x when x = 10 (y > 0)
d y² = 58 + 2x when x = 3 (y < 0)

The y-squared transformation: calculator exercises

5 The scatterplot opposite was constructed from the data in the table below.
x   0    2    4    6    8    10
y   1.2  2.8  3.7  4.5  5.1  5.7

[Scatterplot: y against x]

From the scatterplot, the association between y and x is non-linear.
a Linearise the scatterplot by applying a y-squared transformation and fit a least squares line to the transformed data.
b Give its equation. Write the coefficients, correct to two significant figures.
c Use the equation to predict the value of y when x = 9. Give the answer correct to one decimal place.

Applications of the squared transformation

6 The table gives the diameter (in m) of five different umbrellas and the number of people each umbrella is designed to keep dry. A scatterplot is also shown.

Diameter  0.50  0.70  0.85  1.00  1.10
Number    1     2     3     4     5

[Scatterplot: number of people against diameter (metres)]

a Apply the squared transformation to the variable diameter and determine the least squares regression line for the transformed data. Diameter is the EV. Write the slope and intercept of this line, correct to one decimal place, in the spaces provided.
number = ____ + ____ × diameter²
b Use the equation to predict the number of people who can be sheltered by an umbrella of diameter 1.3 m. Give your answer correct to the nearest person.
7 The time (in minutes) taken for a local anaesthetic to take effect is associated with the amount administered (in units). To investigate this association a researcher collected the data below.

Amount  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5
Time    3.7  3.6  3.4  3.3  3.2  3.0  2.9  2.7  2.5  2.3  2.1

The association between the variables amount and time is non-linear, as can be seen from the scatterplot below. A squared transformation applied to the variable time will linearise the scatterplot.
a Apply the squared transformation to the variable time and fit a least squares regression line to the transformed data. Amount is the EV. Write the equation of this line with the slope and intercept, correct to two significant figures.
b Use the equation to predict the time for the anaesthetic to take effect when the dose is 0.4 units. Give the answer correct to one decimal place.
[Scatterplot: time (minutes) against amount (units)]

5C The log transformation

The logarithmic transformation is a compressing transformation that compresses the upper end of the scale on either the x- or the y-axis.

The effect of applying a log x transformation to a scatterplot is illustrated graphically below.

Transformation: log x
Outcome: Compresses the higher x-values relative to the lower x-values, leaving the y-values unchanged. This has the effect of straightening out curves like the one shown. The log y transformation works in a similar manner, but compresses the scale on the y-axis.
Graph: [Graph: y against x]

Example 2 Applying the log transformation

The general wealth of a country, often measured by its Gross Domestic Product (GDP), is one of several variables associated with lifespan in different countries. However, the association is not linear, as can be seen in the scatterplot below, which plots lifespan (in years) against GDP (in dollars) for 13 different countries.

[Scatterplot: lifespan (years) against GDP]

Because the association is non-linear, it makes no sense to try to model the association with a straight line. Before we can fit a least squares regression line to the data, we need to transform the data.

The circle of transformation suggests that we could use the y², log x or 1/x transformation to linearise the data.

We will use the log x transformation. That is, we change the scale on the GDP-axis to log (GDP). When we make this change, we see that the association between the variables lifespan and log (GDP) is linear. See the plot below.

[Scatterplot: lifespan (years) against log (GDP)]

Note: On the plot, when log (GDP) = 4, the actual GDP is 10⁴ or $10 000.

We can now fit a least squares line to model the association between the variables lifespan and log (GDP). The equation of this line is:
lifespan = 54.3 + 5.59 × log (GDP)

Like any other regression line we can use its equation to make predictions. For example, for a country with a GDP of $20 000, the lifespan is predicted to be:
lifespan = 54.3 + 5.59 × log 20 000 = 78.3 years (correct to one decimal place)

Note: Following the normal convention, log x means log10 x.
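The prediction in Example 2 can be checked with a few lines of Python. This is a minimal sketch: the function name `predict_lifespan` is an assumption made for illustration, while the intercept 54.3 and slope 5.59 are the rounded coefficients quoted in the example. The one detail that matters is the base of the logarithm, which must be 10.

```python
import math


def predict_lifespan(gdp, intercept=54.3, slope=5.59):
    """Predicted lifespan (years) from lifespan = 54.3 + 5.59 * log10(GDP).

    Hypothetical helper; the default coefficients are those quoted in Example 2.
    """
    # math.log10 is the base-10 logarithm; math.log would give the natural log
    # and a different (wrong) prediction for this equation.
    return intercept + slope * math.log10(gdp)


lifespan_20000 = predict_lifespan(20_000)
print(f"predicted lifespan: {lifespan_20000:.1f} years")
```

Using the natural logarithm by mistake is the most common way to get a different answer from the one in the example, which is why the base is made explicit in the code.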
Using the TI-Nspire CAS to perform a log transformation

The table shows the lifespan (in years) and GDP (in dollars) of people in 12 countries. The association is non-linear. Using the log x transformation:
- linearise the data, and fit a regression line to the transformed data (GDP is the EV)
- write its equation in terms of the variables lifespan and GDP correct to three significant figures
- use the equation of the regression line to predict the lifespan in a country with a GDP of $20 000, correct to one decimal place.

Lifespan  GDP
80.4      36 032
79.8      34 484
79.2      26 664
77.4      41 890
78.8      26 893
81.5      25 592
74.9      7 454
72.0      1 713
77.9      7 073
70.3      1 192
73.0      631
68.6      1 302

Steps
1 Start a new document by pressing ctrl + N.
2 Select Add Lists & Spreadsheet. Enter the data into lists named lifespan and gdp.
3 Name column C as lgdp (short for log (GDP)). Now calculate the values of log (GDP) and store them in the list named lgdp.
4 Move the cursor to the grey cell below the lgdp heading. We need to enter the expression = log(gdp). To do this, press = then type in log(gdp). Pressing enter calculates and displays the values of lgdp.
5 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of lifespan against GDP. Let GDP be the explanatory variable and lifespan the response variable. The plot is clearly non-linear.
6 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of lifespan against log (GDP). The plot is now clearly linear.
7 Press menu>Analyze>Regression>Show Linear (a + bx) to plot the line on the scatterplot with its equation.
Note: The x in the equation on the screen corresponds to the transformed variable log (GDP).
8 Write the regression equation in terms of the variables lifespan and log (GDP).
9 Substitute 20 000 for GDP in the equation to find the lifespan of people in a country with GDP of $20 000.
lifespan = 54.3 + 5.59 × log (GDP)
lifespan = 54.3 + 5.59 × log 20 000 = 78.3 years

Using the CASIO Classpad to perform a log transformation

The table shows the lifespan (in years) and GDP (in dollars) of people in 12 countries. The association is non-linear. Using the log x transformation:
- linearise the data, and fit a regression line to the transformed data (GDP is the EV)
- write its equation in terms of the variables lifespan and GDP correct to three significant figures
- use the equation to predict the lifespan in a country with a GDP of $20 000 correct to one decimal place.

Lifespan  GDP
80.4      36 032
79.8      34 484
79.2      26 664
77.4      41 890
78.8      26 893
81.5      25 592
74.9      7 454
72.0      1 713
77.9      7 073
70.3      1 192
73.0      631
68.6      1 302

Steps
1 In the Statistics application enter the data into lists named Lifespan and GDP.
2 Name the third list logGDP.
3 Place the cursor in the calculation cell at the bottom of the third column and type log (GDP).
Let GDP be the explanatory variable (x) and lifespan the response variable (y).
4 Construct a scatterplot of lifespan against log (GDP).
- Tap and complete the Set StatGraphs dialog box as shown.
- Tap to view the scatterplot.
- The plot is linear.
5 Find the least squares regression equation and fit a regression line to the transformed data.
- Go to Calc, Regression, Linear Reg.
- Complete the Set Calculation dialog box as shown and tap OK. This generates the regression results.
Note: The x in the linear equation corresponds to the transformed variable log (GDP).
- Tap OK a second time to plot and display the regression line on the scatterplot.
6 Write the equation in terms of lifespan and log (GDP).
lifespan = 54.3 + 5.59 × log (GDP)
7 Substitute 20 000 for GDP in the equation.
lifespan = 54.3 + 5.59 × log 20 000 = 78.3 years

Exercise 5C

The log x transformation: some prerequisite skills

1 Evaluate the following expressions correct to one decimal place.
a y = 5.5 + 3.1 log 2.3
b y = 0.34 + 5.2 log 1.4
c y = -8.5 + 4.12 log 20
d y = 196.1 - 23.2 log 303

The log x transformation: calculator exercise

2 The scatterplot opposite was constructed from the data in the table below.

x   5    10   150  500  1000
y   3.1  4.0  7.5  9.1  10.0

[Scatterplot: y against x]

From the scatterplot, it is clear that the association between y and x is non-linear.
a Linearise the scatterplot by applying a log x transformation and fit a least squares line to the transformed data.
b Write down its equation and the coefficients, correct to one significant figure.
c Use the equation to predict the value of y when x = 100.

3 The scatterplot opposite was constructed from the data in the table below.

x   10    44    132  436  981
y   15.0  11.8  9.4  6.8  5.0

[Scatterplot: y against x]

From the scatterplot, it is clear that the relationship between y and x is non-linear.
a Linearise the scatterplot by applying a log x transformation and fit a least squares line to the transformed data.
b Write down its equation and the coefficients, correct to one significant figure.
c Use the equation to predict the value of y when x = 1000.

The log y transformation: some prerequisite skills

4 Find the value of y in the following, correct to one decimal place if not exact.
a log y = 2
b log y = 2.34
c log y = 3.5 + 2x where x = 1.25
d log y = -0.5 + 0.024x where x = 17.3

The log y transformation: calculator exercise

5 The scatterplot opposite was constructed from the data in the table below.

x   0.1   0.2   0.3   0.4   0.5
y   15.8  25.1  39.8  63.1  100.0

[Scatterplot: y against x]

From the scatterplot, it is clear that the relationship between y and x is non-linear.
a Linearise the scatterplot by applying a log y transformation and fit a least squares line to the transformed data.
b Write down its equation.
c Use the equation to predict the value of y when x = 0.6, correct to one decimal place.

Applications of the log transformation

6 The table below shows the level of performance achieved by 10 people on completion of a task. Also shown is the time spent (in minutes) practising the task. In this situation, time is the EV. The association between level and time is non-linear, as seen in the scatterplot.
Time   0.5  1    1.5  2  3  4    5  6    7    7
Level  1    1.5  2    3  3  3.5  4  3.5  3.9  3.6

[Scatterplot: level against time (minutes)]

A log transformation can be applied to the variable time to linearise the scatterplot.
a Apply the log transformation to the variable time and fit a least squares line to the transformed data. log (time) is the EV. Write the slope and intercept of this line, correct to two significant figures, in the spaces provided.
level = ____ + ____ × log (time)
b Use the equation to predict the level of performance (correct to one decimal place) for a person who spends 2.5 minutes practising the task.

7 The table below shows the number of internet users signing up with a new internet service provider for each of the first nine months of their first year of operation. A scatterplot of the data is also shown.

Month   1   2   3   4   5   6   7   8   9
Number  24  32  35  44  60  61  78  92  118

[Scatterplot: number against month]

The association between number and month is non-linear.
a Apply the log transformation to the variable number and fit a least squares line to the transformed data. Month is the EV. Write the slope and intercept of this line, correct to four significant figures, in the spaces provided.
log (number) = ____ + ____ × month
b Use the equation to predict the number of internet users after 10 months. Give the answer to the nearest whole number.

5D The reciprocal transformation

The reciprocal transformation is a transformation that compresses the upper end of the scale on either the x- or y-axis.

The effect of applying a reciprocal y transformation to a scatterplot is illustrated below.

Transformation: 1/y
Outcome: The reciprocal y transformation works by compressing larger values of y relative to lower values of y. This has the effect of straightening out curves like the one shown opposite. The reciprocal x transformation works the same way but in the x-direction.
Graph: [Graph: y against x]

The following example shows how the 1/y transformation works in
practice.

Example 3 Applying the reciprocal transformation

A homeware company makes rectangular sticky labels with a variety of lengths and widths. The scatterplot opposite displays the width (in cm) and length (in cm) of eight of their sticky labels.

[Scatterplot: width (cm) against length (cm)]

There is a strong negative association between the width of the sticky labels and their lengths, but it is clearly non-linear. Before we can fit a least squares regression line to the data, we need to linearise the scatterplot.

The circle of transformation suggests that we could use the log y, 1/y, 1/x or log x transformation to linearise the scatterplot.

We will use the 1/y transformation. That is, we will change the scale on the width axis to 1/width. This type of transformation is known as a reciprocal transformation.

When we make this change, we see that the association between 1/width and length is linear. See the plot opposite.

[Scatterplot: 1/width against length (cm)]

Note: On the plot opposite, when 1/width = 0.4, the actual width is 1/0.4 = 2.5 cm.

We can now fit a least squares line to model the association between 1/width and length. The equation of this line is:
1/width = 0.015 + 0.086 × length

Like any other regression line we can use its equation to make predictions. For example, for a sticky label of length 5 cm, we would predict that:
1/width = 0.015 + 0.086 × 5 = 0.445
or width = 1/0.445 = 2.25 cm (to 2 d.p.)

Using the TI-Nspire CAS to perform a reciprocal transformation

The table shows the length (in cm) and width (in cm) of eight sizes of sticky labels.

Length  6.8  5.6  4.6  4.2  3.5  4.0  5.0  5.5
Width   1.8  2.0  2.5  3.0  3.5  2.6  2.0  1.9

Using the 1/y transformation:
- linearise the data, and fit a regression line to the transformed data (length is the EV)
- write its equation in terms of the variables length and width
- use the equation to predict the width of a sticky label with a length of 5 cm.

Steps
1 Start a new document by pressing ctrl + N.
2 Select Add Lists & Spreadsheet. Enter the data into lists named length and width.
3 Name column C as recipwidth (short for 1/width). Calculate the values of recipwidth. Move the cursor to the grey cell below the recipwidth heading. Type in =1/width. Press enter to calculate the values of recipwidth.
4 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of width against length.
Let length be the explanatory variable and width the response variable. The plot is clearly non-linear.
5 Press ctrl + I and select Add Data & Statistics. Construct a scatterplot of recipwidth (1/width) against length. The plot is now clearly linear.
6 Press menu>Analyze>Regression>Show Linear (a + bx) to plot the line on the scatterplot with its equation.
Note: The y in the equation on the screen corresponds to the transformed variable 1/width.

Cambridge Senior Maths AC/VCE Further Mathematics 3&4 Cambridge University Press ISBN 978-1-107-56757-3 Jones et al. 2016 Photocopying is restricted under law and this material must not be transferred to another party.

7 Write down the regression equation in terms of the variables width and length.
1/width = 0.015 + 0.086 × length
8 Substitute 5 cm for length in the equation.
1/width = 0.015 + 0.086 × 5 = 0.445
or width = 1/0.445 = 2.25 cm (to 2 d.p.)

Using the CASIO Classpad to perform a reciprocal transformation

The table shows the length (in cm) and width (in cm) of eight sizes of sticky labels.

Length  6.8  5.6  4.6  4.2  3.5  4.0  5.0  5.5
Width   1.8  2.0  2.5  3.0  3.5  2.6  2.0  1.9

Using the 1/y transformation:
- linearise the data, and fit a regression line to the transformed data. Length is the EV.
- write its equation in terms of the variables length and width.
- use the equation to predict the width of a sticky label with a length of 5 cm.

Ste
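The reciprocal transformation of the sticky-label data can also be sketched in Python. This is an illustrative sketch only, under one loudly stated assumption: the data row running from 6.8 to 5.5 is treated as the lengths and the row running from 1.8 to 1.9 as the widths, because that assignment matches the scatterplot axes and the quoted equation 1/width = 0.015 + 0.086 × length. The helper `least_squares` and the variable names are hypothetical.

```python
# Sticky-label data.
# ASSUMPTION: the values 6.8 ... 5.5 are lengths (cm) and 1.8 ... 1.9 are
# widths (cm), which is the assignment consistent with the plotted axes.
length = [6.8, 5.6, 4.6, 4.2, 3.5, 4.0, 5.0, 5.5]
width = [1.8, 2.0, 2.5, 3.0, 3.5, 2.6, 2.0, 1.9]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x.

    Hypothetical helper written for this sketch.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Reciprocal (1/y) transformation of the response variable.
recip_width = [1 / w for w in width]

# Fit 1/width against length.
intercept, slope = least_squares(length, recip_width)

# The line predicts 1/width, so invert the result to recover the width.
recip_pred = intercept + slope * 5
width_pred = 1 / recip_pred

print(f"1/width = {intercept:.3f} + {slope:.3f} * length")
print(f"predicted width for a 5 cm label: {width_pred:.2f} cm")
```

The step that is easiest to forget is the last one: the regression line returns 1/width, so the prediction has to be inverted before it is reported as a width.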
