1. [10 marks] Researchers developed a linear regression model to predict the fuel consumption (mpg; in miles per gallon) from the weight (wt; in 1,000 pounds) of automobiles designed in the year 2018. The results appearing below were obtained from a statistical analysis of data from a random sample of 30 cars. Use this summary information to answer the following questions.

mpg = miles per (US) gallon
wt = weight of a car (in 1,000 lbs)

Dependent variable: mpg
Independent variable: wt
Linear model: Y = a + b*X

Parameter   Least Squares Estimate   Standard Error   T Statistic   P-Value
Intercept   37.2851                  1.87763          19.8576       0.0000
Slope       -5.34447                 0.559101         -9.55904      0.0000

Correlation Coefficient = -0.867659
Standard Error of Est. = 3.04588

[Figure: Plot of Fitted Model, mpg = 37.2851 - 5.34447*wt]

a) [1 mark] Identify the elements (subjects) of interest in this study.
b) [2 marks] For each variable described in this study, specify the type of data and the scale of measurement.
c) [3 marks] Is there a significant linear relationship between the fuel consumption and the weight of the car? Use an appropriate decision point and show all necessary steps to answer this question.
d) [1 mark] Write down the equation of the estimated regression line.
e) [2 marks] What is the value of the slope of the regression line? Explain what the number means in terms of the variables, fuel consumption and weight of a car, paying special attention to the units of measurement for these data.
f) [1 mark] Provide an estimate of the fuel consumption when the weight of the car is 3,000 pounds.

2. [6 marks] For each of the following situations, name (i) a graphical display and (ii) a statistic (for example, sample mean, sample proportion, sample correlation, Chi-square statistic, etc.) which are the most appropriate to summarize or to describe the data. No explanations and no drawing of graphs are needed.
a) [2 marks] An employer wants to investigate the relationship between his employees' daily commute time to work (in minutes) and their annual salary (in dollars).
Graphical display:
Statistic:
b) [2 marks] In order to estimate the average household income in Vancouver, a city councillor will survey 300 households. Each household will be asked to provide the household's total annual income (in dollars) in 2018.
Graphical display:
Statistic:
c) [2 marks] A marketer wants to investigate if consumers from different geographic regions (West, East, South, or North) prefer different types of automobile (Sedan, Coupe, or SUV).
Graphical display:
Statistic:

3. [6 marks] The Biology Department at a university plans to recruit a new faculty member. Data collected by a different university on the 400 possible candidates were available. The Biology Department is debating whether to put a requirement of 10 years of teaching experience in the job advertisement. The available data on the candidates are shown below:

[Table not recoverable from the source: candidates classified by sex and by years of experience (less than 10 years versus 10 or more years).]

a) [2 marks] What percentage of candidates is male or has 10 or more years of experience?
b) [1 mark] We randomly select a male candidate. What is the probability that the selected male candidate has 10 or more years of experience?
c) [3 marks] A candidate is randomly selected. Are the events A = {a candidate is male} and B = {a candidate has 10 or more years of experience} independent? Support your answer by using probability calculations.

4. [6 marks] Answer each of the following questions. To get full marks, you must show sufficient justification for your answer.
a) [2 marks] When the distribution of data values is highly skewed, which measure of variability/spread is better: the interquartile range or the standard deviation? Briefly explain your answer.
b) [2 marks] When surveying shoppers at a supermarket, is it appropriate to use a simple random sampling method to select a sample? Briefly explain your answer.
c) [2 marks] Suppose that a 90% confidence interval for the population mean grade (in percentage) of all Stat 4800 students at Langara College is (65.6, 70.2). Explain why the following interpretation of this interval is wrong: "90% of all Langara students have grades that are between 65.6 and 70.2."
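
The regression summary printed in Question 1 can be checked with a few lines of arithmetic. The sketch below is not part of the exam and is not a model answer. It only shows how the T statistic for the slope follows from the estimate and its standard error, and how the fitted line gives a prediction at a chosen weight. The critical value 2.048 is an assumption taken from a standard t table (two-sided, 5% level, 28 degrees of freedom), and the helper name predict_mpg is hypothetical.

```python
# Sketch: re-derive Question 1 quantities from the printed regression summary.
# All estimates are copied from the summary table in Question 1.
# ASSUMPTION: the critical value comes from a standard t table
# (two-sided test, alpha = 0.05, 28 degrees of freedom).

n = 30                   # sample size stated in the question
intercept = 37.2851      # least squares estimate of a
slope = -5.34447         # least squares estimate of b
se_slope = 0.559101      # standard error of the slope

# Test statistic for H0: b = 0 versus Ha: b != 0
t_slope = slope / se_slope
df = n - 2               # degrees of freedom in simple linear regression

T_CRIT_28 = 2.048        # assumed table value t_{0.025, 28}
reject_h0 = abs(t_slope) > T_CRIT_28


def predict_mpg(wt):
    """Fitted mpg for a car weighing `wt` thousand pounds (hypothetical helper)."""
    return intercept + slope * wt


print(f"t = {t_slope:.3f} on {df} df, reject H0: {reject_h0}")
print(f"fitted mpg at 3,000 lb: {predict_mpg(3.0):.2f}")
```

The ratio of the slope estimate to its standard error reproduces the printed T statistic of about -9.56. Note that wt is measured in thousands of pounds, so a 3,000-pound car is entered as 3.0, not 3000.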