

ECON 1203 Tutorial Workshop Questions Semester 1 2016

***This document will be periodically updated with questions to be discussed in succeeding tutorials, and re-posted to Moodle every fortnight.***

Weeks 1 and 2

1. (a) What is meant by a variable in a statistical sense? Distinguish between qualitative and quantitative statistical variables, and between continuous and discrete variables. Give examples.
   (b) Distinguish between (i) a statistical population and a sample; (ii) a parameter and a statistic. Give examples.

2. In order to know the market better, the second-hand car dealership, Anzac Garage, wants to analyze the age of second-hand cars being sold. A sample of 20 advertisements for passenger cars is selected from the second-hand car advertising/listing website www.drive.com.au. The ages in years of the vehicles at time of advertisement are listed below:

   5, 5, 6, 14, 6, 2, 6, 4, 5, 9, 4, 10, 11, 2, 3, 7, 6, 6, 24, 11

   (a) Calculate the frequency, cumulative frequency and relative frequency distributions for the age data using the following bin classes:
       More than 0 to less than or equal to 8 years
       More than 8 to less than or equal to 16 years
       More than 16 to less than or equal to 24 years
   (b) Sketch a frequency histogram using the calculations in part (a). What can you say about the distribution of the age of these second-hand cars? Is there anything that concerns you about the frequency table and histogram? Specifically, is the choice of bin classes appropriate? What needs to be done differently?
   (c) Halve the width of the bins (0 to 4, 4 to 8, etc.) and recalculate the frequency, cumulative frequency and relative frequency distributions. Using the new distributions and histogram, what can you now say about the distribution of the age of second-hand cars?

3. Health expenditure. A recent report by Access Economics provides a comparison of Australian expenditures on health with that of comparable OECD countries.
Data from that report relating to the year 2005 have been used to reproduce their Figure 2.2 (below denoted as Figure 2.1).
   (a) What are the key features of these data?
   (b) While this is a bivariate scatter plot, there are three variables involved: health expenditure, GDP and population. Why account for population by expressing health expenditure and GDP in per capita terms?

   [Figure 2.1: OECD Health Expenditure and GDP. Health expenditure per capita (US$000) plotted against GDP per capita (US$000).]

4. Australian housing prices. Recent research by Dr Nigel Stapledon at the UNSW School of Economics provides an extensive analysis of Australian housing prices since 1880. In Figure 2.2 his data are used to provide a comparison of Sydney and Melbourne housing prices over time.
   (a) What are the key features of these data?
   (b) Why have prices been expressed in constant dollars?

   [Figure 2.2: Comparison of Sydney and Melbourne median house prices in constant 2007-08 dollars. Price (thousands of dollars) plotted against year, with separate series for Sydney and Melbourne.]

5. Using the car data from Question 2:
   (a) Calculate the mean, median and mode for this sample of data and use these statistics to further describe the distribution of car ages.
   (b) If the largest observation were removed from this data set, how would the three measures of central tendency you have calculated change?

6. For the following statistical population, compute the mean, range, variance and standard deviation:

   3, 3, 5, 12, 13, 14, 17, 20, 21, 21

   What would happen to each of the measures you have calculated if:
   (a) ...4 were added to each data point (observation)?
   (b) ...each data point were multiplied by 2?

7. Migrant wealth. Suppose the Minister for Immigration is interested in research on the assimilation of migrant households (a household where the chief income-earner is foreign born).
The Household, Income and Labour Dynamics in Australia (HILDA) survey is a representative survey of Australian households. Using 4,669 household observations for 2002 from HILDA, we find there are 3,567 households classified as Australian-born and 1,102 classified as migrants. One key consideration is how migrant households are doing in terms of wealth compared with Australian-born households. Using these data, we find the following:

   Summary statistics for net household wealth ($A)

                      Australian-born    Migrant
   Mean               236,064            248,970
   10th percentile    1,545              1,720
   Median             123,020            131,152
   90th percentile    560,006            524,372

   (a) What can you say about the distribution of net household wealth, for both Australian-born and migrant households, by looking at just the mean and the median figures?
   (b) More generally, what can you say about the distribution of wealth for migrant households compared to that for Australian-born households? In particular, which type of household has greater variation in wealth?
   (c) Suppose the minister has net household wealth of $600,000. What can you say about his or her financial circumstances relative to other Australian-born households?

8. Sydney housing prices. Figure 3.2 depicts a scatter plot of Sydney-area housing prices versus distance from the CBD. The unit of observation is a suburb, price is the mean of the median price of houses sold in each suburb for two quarters (those ending in September and December 2002), and distance is measured in kilometers from downtown.
   (a) What would you expect the correlation to be between price and distance?
   (b) Does it appear that there is a linear relationship between the two variables?
   (c) What other key features of these data can be determined from the plot?

   [Figure 3.2: House prices in Sydney suburbs versus distance to CBD. Price ($) plotted against distance to CBD (kms).]

9.
Anzac Garage wants to develop guidelines for setting prices of cars according to the car's age. They hire a business consultant who chooses a sample of 117 second-hand passenger car advertisements collected from www.drive.com.au and retrieves data on the age and price of the cars.
   (a) The business consultant first calculates the correlation coefficient between age and price and finds it to be -0.278. Interpret this result.
   (b) Sketch what you think the scatter diagram from which this correlation coefficient was calculated might look like. Suppose the business consultant constructs a simple linear regression model using price as the dependent variable, and age as the independent variable. What do you think the estimated regression line might look like here? (We will return to this particular example later in the course and address this question more formally.)

10. Big Data. Suppose you are sitting at the NSW Department of Health and have access to information on hospital admissions, diagnosis, private insurance coverage, sex, age, smoking status, and length of hospital stay for all patients at all NSW hospitals for 2000 through 2015. A team of statisticians in your department are available to analyse these data following your direction.
   (a) You get a phone call from the State treasurer wanting to know how much of your budget you spend on smokers and smoking-related health problems. You promise to get back to her, and put down the phone. What do you tell your team?
   (b) You get a phone call from the Australian Council on Smoking and Health, asking about any evidence that the State has on the association between smoking and health outcomes. You promise to get back to them and put down the phone. What do you tell your team?

11. Work through problem 34 on page 165 of Sharpe (Chapter 4).

Weeks 3 and 4

1. (a) Explain what it means to say that two probabilistic events in a sample space are mutually exclusive of one another.
   (b) Explain what it means to say that two probabilistic events in a sample space are independent of one another.
   (c) Why can two events not at the same time be both mutually exclusive and independent of one another?

2. A department store wants to study the relationship between the way customers pay for an item and the price of the item. 250 transactions are recorded and the following table is formed.

                          Price category
   Payment        Under $20    $20-$100    Over $100
   Cash           15           11          6
   Credit card    9            53          38
   Debit card     18           52          48

   Convert the table to a joint distribution. Express each of the following questions in terms of probability statements, and then solve:
   (a) What is the probability that an item is under $20?
   (b) What is the probability that an item with a price tag of $43 is paid for in cash?
   (c) What is the probability that people pay for an item that is at least $20 by credit?
   (d) If somebody used a debit card to pay for an item, what is the probability that the item was less than $100?
   (e) Are price and means of payment independent?

3. In a small batch of 20 manufactured widgets, there are, in fact, 3 defective ones. You, as quality control officer for the company making the widgets, decide to examine a sample of 3 widgets, selected without replacement, to see how many defective ones are selected.
   (a) Use a probability tree to evaluate the probability distribution of the number of defectives sampled.
   (b) How would your answer change if the sampling were done with replacement?

4. Work through problem 16 on page 200 of Sharpe (Chapter 5).

5. Work through problem 18 on page 200 of Sharpe (Chapter 5).

6. Work through problem 44 on page 203 of Sharpe (Chapter 5).

7.
The manager of a factory has determined from past experience that X, the number of repairs required to machines in her factory on any one day, has the following probability distribution:

   x           0       1       2       3       4
   P(X = x)    0.41    0.25    0.18    0.10    0.06

   Calculate the following:
   (a) P(1 [...]

   [...] 50, n = 100, x̄ = 55, σ = 10, α = 0.05
   (b) H0: μ = 25, H1: μ < 25, n = 100, x̄ = 24, σ = 5, α = 0.1
   (c) H0: μ = 80, H1: μ ≠ 80, n = 100, x̄ = 80.5, σ = 4, α = 0.05

6. A real estate expert claims the current mean value of houses in a particular area is more than $250,000. A random sample of 150 recent sales prices in the area yields a sample mean of $265,000. It is known that house values in the area are approximately normally distributed with a standard deviation of $50,000.
   (a) Perform an upper tail test of the null hypothesis that the population mean house value in the area is $250,000. Use a 5% level of significance and state the rejection (critical) region in terms of both x̄ and z.
   (b) Why is an upper tail test most appropriate in this case?
   (c) What is the p-value associated with the test statistic used in the part (a) test? Interpret this value.
   (d) Define in words the type I and II errors that could afflict the part (a) test.

7. What effect does increasing the sample size have on the outcome of a hypothesis test? Explain your answer using the example of a one-tail test concerning the mean of a normally distributed population with known variance.

8. Work through problem 40 on page 420 of Sharpe (Chapter 12). Recalling Exercise 39: Then, re-do the analysis with all settings the same except supposing that:
   c) The professor's students scored 108 points on the final exam, having used the software (and nothing else changed).
   d) The number of students enrolled in the course decreased from 481 to 210 (and nothing else changed).
   e) The standard deviation of the students' scores increased from 6.3 to 25.2 points (and nothing else changed).

9.
Project Review: For the course project, you are only expected to use statistical methods covered in lectures and tutorials up to and including those in Week 9. Thus you should now have sufficient material to complete the project in a timely fashion. What might be useful at this stage is to think about presentation. See the Examples of Statistical Reports section of the Project folder on Moodle for some ideas in general. As a directed exercise for this tutorial, compare and contrast the presentation of material in the NSW BOCSAR report on driving under the influence of cannabis (driving-cannabis.pdf) and the Queensland Office of Economic and Statistical Research bulletin on computer and internet usage in Queensland (computer-internet-useageqld-c01.pdf). You should be able to read these reports comfortably, although there are a few methods in the cannabis report that may be unfamiliar (these methods will be covered later in the course).

Weeks 9 and 10

1. State whether the normal distribution, the t distribution, or neither would be the right type of sampling distribution to assume for the sample mean in order to test hypotheses regarding the population mean in the following situations:
   (a) Population variable normally distributed, σ² unknown, sample size less than 30.
   (b) Population variable normally distributed, σ² unknown, sample size greater than 30.
   (c) Population variable normally distributed, σ² known, sample size less than 30.
   (d) Population variable not normally distributed, σ² unknown, sample size greater than 30.
   (e) Population variable not normally distributed, σ² unknown, sample size less than 30.

2. Reconsider the example used earlier in the course in which a real estate expert claimed the current mean value of houses in a particular area was more than $250,000.
A random sample of 150 recent sales prices in the area yielded a sample mean of $265,000, and it is known that house values in the area are approximately normally distributed with a standard deviation of $50,000.
   (a) If in fact the population mean house value in the area is $260,000, what is the probability of committing a type II error in performing an upper-tail test of the null hypothesis that the mean house value price in the area is $250,000, as was done in Part (a) of the prior week's exercise? What is the power of the test in these circumstances? State in words what the power of the test means.
   (b) Illustrate your answer to part (a) above by showing on a diagram the areas representing the probability of a type II error and the power of the test.

3. A company running an urban rail service wishes to estimate its daily average number of late-running trains on weekdays. For 10 randomly selected weekdays, it finds the following numbers of late-running trains:

   32, 10, 9, 18, 25, 15, 14, 18, 22, 16

   (a) Assuming the number of late-running trains on a weekday is approximately normally distributed, calculate a 90% confidence interval for the mean number of late-running trains on a weekday.
   (b) If we did not have the assumption of normality, could we still calculate a confidence interval in this example? If not, suggest a way of overcoming this problem.

4. Reconsider the question from a previous week that used the Anzac Garage data, available from the course website (in the "Tutorial Questions and Information" folder) in an Excel file called Anzacg.xls. Would normality be a good approximation for the population distribution of distance travelled by used passenger cars? (Hint: look at the summary statistics and a histogram.) Do you need to assume normality? Redo the 95% confidence interval for the population mean distance travelled by used passenger cars without assuming a known population standard deviation.

5.
It is known that 80% of people suffering from a particular disease are cured by a certain standard medication. Test the claim of the developers of a new medication that their product is more effective than the standard medication in curing the disease, using a 5% significance level, given a random sample of 400 people with the disease of whom 330 are cured by using the new medication. (Hint: Use the normal approximation, and ignore the continuity correction.)

6. Download the data "Credit_Card_Bank" from the MyStatLab website (available under the heading of Chapter 1: Data and Decisions). Using the variables "Offer Status" and "Spendlift Positive"
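
The medication question in Weeks 9 and 10 (80% cure rate under the standard medication, 330 of 400 cured under the new one) can be checked numerically. The following is a minimal Python sketch, not an official solution: it assumes an upper-tail z test of H0: p = 0.80 against H1: p > 0.80, uses the normal approximation without a continuity correction as the hint suggests, and the function name `one_sample_proportion_z_test` is a hypothetical helper introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z_test(successes, n, p0, alpha=0.05):
    """Upper-tail z test of H0: p = p0 against H1: p > p0.

    Uses the normal approximation to the sampling distribution of the
    sample proportion, with no continuity correction.
    """
    p_hat = successes / n                      # sample proportion
    se = sqrt(p0 * (1 - p0) / n)               # standard error under H0
    z = (p_hat - p0) / se                      # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
    p_value = 1 - NormalDist().cdf(z)          # upper-tail p-value
    return {
        "p_hat": p_hat,
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "reject_h0": z > z_crit,
    }


# Values taken from the question: 330 cured out of 400, null cure rate 0.80.
result = one_sample_proportion_z_test(successes=330, n=400, p0=0.80)
print(result)
```

Under these assumptions the test statistic is about 1.25, which is below the 5% upper-tail critical value of about 1.645, so this sample alone would not lead to rejecting the null hypothesis.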
