Question
STAT 2263 - UNB Online College of Extended Learning, University of New Brunswick

Assignment #2: Probability

Instructions: Students are advised to submit this assignment within 2 months of their course start date. Assignment is to be submitted via pdf file attachment to: robert.krausz@unb.ca. Your submitted assignment must be your own, original work. Any submitted work that is found not to be your own original work is considered to be plagiarism, and will receive a grade of zero - and may result in an automatic grade of 'F' in this course, as well as academic disciplinary action, as per University of New Brunswick policy. Please answer all questions as indicated, and show all of your work for full marks. Some questions can have more than one possible correct answer, so your own choice of answer must be clear and well-justified.

Question 1:
a) Write the set A={x|x is a 2-digit perfect square whole number} in roster notation.
b) Write the set B={-9, -8, -7, -6, -5, -4, -3, -2, -1} using set-builder notation.

Question 2: For each of the following, draw a Venn diagram that satisfies the stated condition:
a) A B = A
b) (Cc Dc )c =

Question 3: For each of the following, draw a Venn diagram showing the indicated region as a shaded area. On each diagram, include only the sets indicated, and assume that all sets intersect each other, but none are subsets of any other.
a) Ac Bc
b) Cc Dc E
c) H (F G H)c (F G H c )

Question 4: A set is comprised of 5 distinct elements. Calculate the following:
a) the total number of different subsets of this set.
b) the total number of different proper subsets of this set.
c) the total number of different subsets of this set, which contain 2 or more elements.

Question 5: Calculate the total number of different:
a) 5-letter strings that can be made, using the letters of the word POLAND.
b) 9-letter strings that can be made, using the letters of the word MIRAMICHI.
c) 4-letter strings that can be made, using the letters of the word CANADA.
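For Questions 4 and 5, small counting problems like these can be checked by brute-force enumeration. The sketch below is one possible way to do that in Python. The helper names `count_subsets` and `count_strings` are made up for illustration, and the sketch assumes that a "string" is an ordered arrangement in which each letter is used at most as many times as it appears in the word.

```python
from itertools import combinations, permutations


def count_subsets(n, min_size=0):
    """Count subsets of an n-element set that have at least min_size elements."""
    items = range(n)
    return sum(1 for k in range(min_size, n + 1) for _ in combinations(items, k))


def count_strings(word, length):
    """Count distinct strings of the given length built from the letters of word.

    Assumption: each letter may be used at most as often as it occurs in word.
    Collecting the arrangements in a set removes duplicates caused by repeated letters.
    """
    return len(set(permutations(word, length)))


if __name__ == "__main__":
    print("all subsets of a 5-element set:", count_subsets(5))
    print("proper subsets (all but the set itself):", count_subsets(5) - 1)
    print("subsets with 2 or more elements:", count_subsets(5, min_size=2))
    for word, length in [("POLAND", 5), ("MIRAMICHI", 9), ("CANADA", 4)]:
        print(word, length, count_strings(word, length))
```

Enumeration is only practical for short words and small sets, but it gives an independent check on the closed-form counts (powers of 2 for subsets, factorials divided by the factorials of the repeat counts for arrangements with repeated letters).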
Question 6: A market gardener is planning her planting scheme for the new growing season, and she has 10 different crops to choose from - 7 of which are vegetables and 3 of which are grains.
a) Calculate the number of different ways she can select 6 crops to plant, if there are no restrictions.
b) Redo (a), with the extra restriction that she must select at least 2 grain crops.
c) Say that she has selected the following 6 crops: tomatoes, carrots, peas, radishes, beans, lettuce. She wishes to plant these in 6 adjacent rows, with 1 crop in each row. Calculate the number of different ways she can arrange these rows, if there are no other restrictions.
d) Redo (c), with the extra restriction that the tomatoes and carrots must be planted in adjacent rows.
e) Redo (d), with the further extra restriction that the peas and beans must not be planted in adjacent rows.

Question 7: In how many different ways can 4 couples be seated at a table with 8 seats, if:
a) all the seats are on the same side of the table, and there are no other restrictions?
b) all the seats are on the same side of the table, and each couple must be seated together?
c) Redo (b) if the table is round, and the seats are around the outside of the table.

Question 8: The following information is given about events A and B: ( ) = . (( ) ) = . ( ) = . 5 Based on this information, determine whether A and B are independent or dependent with respect to each other.

Question 9: A fair coin is tossed until 2 H are obtained. Calculate the probability that:
a) the minimum number of tosses is required.
b) exactly 8 tosses are required.
c) fewer than 2 tosses are required.
d) more than 4 tosses are required.

Question 10: A deck of 10 cards consists of 7 cards that are black on both sides, and 3 cards that are black on one side and white on the other. 2 cards are selected randomly and placed flat on the table. Calculate the probability that 1 black and 1 white side are showing.
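For Question 9, the number of tosses needed to reach a fixed number of heads follows a negative binomial pattern: the final toss must be a head, and the remaining heads fall somewhere among the earlier tosses. The sketch below works this out with exact fractions. It assumes that "until 2 H are obtained" means 2 heads in total, not necessarily in a row, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_tosses_needed(n, heads_needed=2, p_head=Fraction(1, 2)):
    """P(the heads_needed-th head arrives on exactly toss n).

    Assumption: heads do not have to be consecutive.
    """
    if n < heads_needed:
        return Fraction(0)
    # The last toss is a head; the other heads_needed - 1 heads can land
    # anywhere among the first n - 1 tosses.
    ways = comb(n - 1, heads_needed - 1)
    return ways * p_head**heads_needed * (1 - p_head) ** (n - heads_needed)


def prob_more_than(n, heads_needed=2, p_head=Fraction(1, 2)):
    """P(more than n tosses are required), via the complement."""
    at_most_n = sum(
        prob_tosses_needed(k, heads_needed, p_head) for k in range(1, n + 1)
    )
    return 1 - at_most_n


if __name__ == "__main__":
    print("minimum number of tosses (2):", prob_tosses_needed(2))
    print("exactly 8 tosses:", prob_tosses_needed(8))
    print("fewer than 2 tosses:", prob_tosses_needed(1))
    print("more than 4 tosses:", prob_more_than(4))
```

Using `Fraction` keeps the results exact, which makes them easy to compare with a hand calculation before rounding.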
Question 11: A jar contains 2 red balls, 2 blue balls, 2 green balls, and 1 orange ball. Balls are randomly selected, without replacement, until 2 of the same colour are obtained. Calculate the probability that more than 3 balls must be selected.

Question 12: On a TV game show, a contestant is selected randomly from the audience, and offered a chance to choose 1 of 3 closed doors. Behind 1 of the doors is a fantastic prize, while there is nothing behind the other 2 doors. Once the contestant makes his choice, the host (who knows where the prize is) always opens one of the other 2 doors, intentionally revealing an empty door. This leaves 2 remaining doors - the contestant's choice, and one other - with the prize behind 1 of those doors. Before revealing the outcome, the host offers the contestant one last choice: he can stick with his original choice, or else switch to the other remaining door. In order to have the greatest probability of winning the prize, what should the contestant do? Justify your answer by calculating the probabilities of winning, for each option. (Hint: there are 3 possible answers to choose from: either it is better to stick, or it is better to switch, or it makes no difference)

Question 13: Coin 1 is biased, with P(H)=0.3; Coin 2 is also biased, with P(H)=0.8; Coin 3 is fair. One of these coins is randomly selected, and then flipped. Calculate:
a) P(coin comes up H).
b) P(coin selected was the fair one, if the result of the flip is H).

STAT 2263 - UNB Online College of Extended Learning, University of New Brunswick

Assignment #3: Random Variables and Probability Distributions

Instructions: Students are advised to submit this assignment within 3 months of their course start date. Assignment is to be submitted via pdf file attachment to: robert.krausz@unb.ca. Your submitted assignment must be your own, original work.
Any submitted work that is found not to be your own original work is considered to be plagiarism, and will receive a grade of zero - and may result in an automatic grade of 'F' in this course, as well as academic disciplinary action, as per University of New Brunswick policy. Please answer all questions as indicated, and show all of your work for full marks. Some questions can have more than one possible correct answer, so your own choice of answer must be clear and well-justified.

Question 1: Given the following probability distribution for a random variable X:

x      P(X=x)
-2     0.10
-1     0.15
0      0.40
1      0.20
2      0.15

a) Explain why the above distribution is a valid probability distribution. There is one very specific reason that can be seen in the table.
b) Calculate E(X) and SD(X).
c) Determine the cdf(X), and write it as an additional column in the table.
d) Calculate P(2 < X 1) .
e) Draw a histogram that represents the probability distribution of X.

Question 2: A casino game works as follows: A player pays $1 to play. Then, the player draws a card randomly from a standard playing deck. If the card is an ace, the player gets back $10. Otherwise, the player gets back nothing.
a) Calculate, to the nearest cent, the expected net result to the player from playing this game.
b) Calculate, to the nearest cent, the payoff (instead of $10) that would make this a fair game.
c) Another way to make this game more attractive to players, other than changing the payoff amount (as in Part (b)), is to remove non-ace cards from the deck, to increase P(drawing an ace). Using the original payoff amount of $10 for an ace, calculate the minimum number of non-ace cards that would have to be removed from the deck, to make this a game that favours the player.

Question 3: Wine is shipped from a vineyard in cases of 6 bottles. Historical data indicates that, on average, 2.7% of bottles turn sour by the time they are uncorked.
Calculate, to the nearest %:
a) the probability that 1 case of this wine will contain no more than 1 sour bottle.
b) the probability that, if 10 cases of this wine are purchased, at least 8 of them will be 'perfect' - ie, contain 6 good bottles and 0 sour ones.
c) the probability that, if 1 shipped case is opened at a time until a perfect case is found, more than 2 cases will have to be opened.

Question 4: A university student is writing a multiple-choice test with 10 questions. Each question on the test has 5 choices for an answer, one of which is the correct one. To pass the test, the student must get at least 6 correct answers out of 10. The student has absolutely no idea what any of the correct answers are, and so guesses on all of the questions. Calculate, to the nearest %, the probability that:
a) the student passes the test.
b) if the student writes 4 different tests like this, they would pass none of them.
c) if the student were to write tests like this until they passed 1 of them, they would need to write more than 100 tests.

Question 5:
a) Based on historical data, 7% of all cedar logs brought into a lumber mill are of suitable quality for use in timber frame construction. Also based on historical data, the average number of cedar logs brought into the mill per day is 43. Calculate, to the nearest %, the probability that at least 3 cedar logs suitable for timber frame construction arrive at the mill on any given day.
b) If 4 cedar logs arriving at the mill from Part (a) are randomly selected, what is the probability, to the nearest %, that exactly half of them will be suitable for timber framing?
c) If 1001 cedar logs arriving at the mill from Part (a) are randomly selected, what is the probability that exactly half of them will be suitable for timber framing?

Question 6: In a plantation of sunflowers, the height of plants tends to follow a normal distribution, with a mean of 2.86 m and a standard deviation of 0.37 m.
a) What is the probability that a randomly-selected sunflower measures exactly 3.00 m in height?
b) What is the probability that a randomly-selected sunflower measures 3.00 m in height, using a measuring stick that is precise to the nearest cm?
c) Calculate, to the nearest %, the proportion of sunflowers that are shorter than 200 cm in height.
d) Calculate, to the nearest %, the probability that a randomly-selected sunflower from this plantation has a height between 200 and 286 cm.
e) It has been decided that the tallest 3% of sunflowers in the plantation are to be set aside for seed-saving. Calculate, to the nearest cm, the minimum height for a sunflower to be saved for seeds.
f) A certain species of crawling insect is known to favour the roots of the variety of sunflower grown in this plantation. If 8 of these insects arrive in the plantation and each one selects a different sunflower, and it is assumed that they are unable to discern the heights of the plants from their position on the ground, then what is the probability, to the nearest %, that less than half of these insects will choose a sunflower with a height between 200 and 286 cm?

Question 7: The mass of garbage put out for weekly curbside pickup, per household, in a certain community, is reported to have a mean of 6.1 kg and a standard deviation of 2.2 kg. Based on this information:
a) Assume that the population is very large. For a sample of 100 households from this community, calculate, to the nearest %, the probability that the sample mean will be between 5.9 and 6.3 kg.
b) Redo Part (a), for a sample size of 1,000 households instead of 100.
c) Redo Part (a) (ie, n=100), but this time assume a population of 400 households.
d) Assuming again that the population is very large, let us now suppose that 5 consecutive samples of 100 households are taken, and each one of them has a sample mean greater than 6.6 kg. Calculate the probability of obtaining this result.
e) Based on your answer for Part (d), is there reason to believe that the mean weekly garbage output per household is not actually around 6.1 kg? If so, in which direction has the mean garbage output shifted? Explain your answers using your previous calculations for this question, plus your knowledge of statistics fundamentals.

Question 8: Based upon historical data, the successful seed germination rate for a certain variety of lettuce is 31%. A plant nursery has recently sowed 10,000 of these seeds, and then shortly after received orders for lettuce plants that can only be filled from successful germinations from this sowing.
a) If the order is for 3,000 lettuce plants, what is the probability that they can fill all of these orders?
b) Redo Part (a) for an order of 3,300 lettuce plants.
c) Failing to deliver on orders is something that cannot be 100% avoided in this line of work. However, one can decide on a maximum acceptable level of risk of this failure occurring. If this particular nursery is willing to accept a risk of such failure - ie, of not having enough germinated plants from a sowing to meet all orders - of up to but not exceeding 5%, then what is the largest total amount of plants it should take orders for, for each sowing of 10,000 of these seeds?

STAT 2263 - UNB Online College of Extended Learning, University of New Brunswick

Assignment #4: Introduction to Inferential Statistics

Instructions: Students are advised to submit this assignment within 4 months of their course start date. Assignment is to be submitted via pdf file attachment to: robert.krausz@unb.ca. Your submitted assignment must be your own, original work. Any submitted work that is found not to be your own original work is considered to be plagiarism, and will receive a grade of zero - and may result in an automatic grade of 'F' in this course, as well as academic disciplinary action, as per University of New Brunswick policy.
Please answer all questions as indicated, and show all of your work for full marks. Some questions can have more than one possible correct answer, so your own choice of answer must be clear and well-justified.

Question 1: Given the following sample of tree diameters at breast height, for randomly-selected trees from a forest area (measurements in cm): 32, 15, 25, 20, 19, 25, 39, 42, 43, 18
a) Calculate x̄ and s for the sample, rounded to 2 dp where necessary.
b) Calculate confidence intervals for μ = mean dbh for trees in this forest area, for:
i. LOC = 90%
ii. LOC = 95%
iii. LOC = 99%
c) Recalculate the 95% CI, using the same values for x̄ and s but with:
i. n = 5 instead of 10.
ii. n = 100 instead of 10.
d) Estimate the minimum sample size required to generate a 95% CI for μ, with a margin of error within:
i. 2 cm
ii. 1 cm
iii. 0.5 cm

Question 2: An inspector visited a forestry clearcut area that had been replanted with seedlings a year earlier. Out of 80 seedlings inspected, 13 had survived.
a) Calculate confidence intervals for p = the overall 1-year survival rate for all replanted seedlings in this clearcut area, for:
i. LOC = 95%
ii. LOC = 99%
iii. LOC = 99.5%
b) Comment on whether or not the 95% CI would support a claim that the actual overall survival rate for all replanted seedlings in this clearcut area is 27%.
c) Would your answer for Part (b) remain the same or change for LOC = 99.5%? Explain your answer.
d) Estimate the minimum sample size required to generate an estimate of p to within 5% at LOC = 95%:
i. based on the sample results from the inspector above.
ii. assuming that there is no previous sample data to use.

Question 3: The table below shows random sample data of tree heights (in m), taken from two separate forest areas. Tree heights in these forests are assumed to follow an approximately normal distribution.
Forest Area 1: 11.6 8.3 12.8 13.3 12.0 13.2 10.1 12.1 14.7 14.0 11.9 12.4 12.5 11.2
Forest Area 2: 14.4 14.9 21.9 21.8 14.4 13.4 18.7 17.7 23.1 23.9 15.2 18.5 14.9 19.0

a) Calculate 90%, 95%, and 99% confidence intervals for the difference between the mean tree heights in these two forest areas.
b) Comment on what the results from Part (a) suggest about any claims that might be made suggesting that the average tree heights in these two forests are about the same. Explain your answer in reference to the confidence intervals which you calculated.

Question 4: The table below shows (in kg) the body-mass (aka: 'weight') of a group of study subjects from a town, who were weighed before and then 1 year after switching from driving to walking to and from their work:

Subject 1 2 3 4 5 6 7 8 9 10
Weight Before Weight After
93 87 67 77 71 70 80 111 74 96 97 88 95 88 83 103 92 89 94 86

a) Assuming that the net change in weight among individuals follows a roughly normal distribution, calculate 90%, 95%, and 99% confidence intervals for the average net change in weight among all people from this town who switch from driving to walking to work.
b) What do the results say about whether or not there is compelling evidence that switching modes of commuting leads to weight loss? Explain why or why not the answers to this question are the same, at all LOCs from Part (a).

Question 5: A proposal to amalgamate the two towns of Palookatown and Smallville into one municipality is scheduled to be put to a referendum vote at the next local election. A random survey of 100 voters in each town is conducted, with 57 voters in Palookatown indicating their support for the proposal, and 43 voters in Smallville indicating their support.
a) Calculate 90% and 95% confidence intervals for the difference between the levels of support for amalgamation between the two towns.
b) Comment on whether or not the results from Part (a) support the idea that one town is more supportive, overall, of the amalgamation proposal.
c) Redo Part (a) at LOC = 99%.
d) Redo Part (b) based on the answer from Part (c).

Question 6: A local manufacturer of wooden furniture orders timber from nearby mills, to make their various products such as tables, chairs, and bedframes. One of the woodworkers there expresses their dissatisfaction with the quality of the timber supplied by the mills, and states that in their opinion, over three-quarters of timber pieces of a certain size range contain unworkable flaws.
a) Test this claim, at the 95% level of confidence, against a subsequent random sample of 40 timber pieces in this size range, of which 35 pieces contain unworkable flaws. Use the critical-value method.
b) Use the p-value method to determine if the result from Part (a) would be any different at α = 0.001, 0.01, or 0.10.

Question 7: One of the local timber mills receives raw logs from a regional logging company. The mill manager believes that the balsam fir logs they are receiving from this company average less than 8.5 m in length. The claim is tested against a subsequent sample of balsam fir logs, which has a mean length of 8.7 m and a standard deviation of 0.6 m. Explain why the mill manager's claim would be rejected, for any commonly-used value of α. (Hint: a diagram would be helpful to explain this)

Question 8: A government report on career prospects for university graduates in Canada lists average starting salaries for graduates of various programmes of study. The average starting annual salary for Environmental Studies graduates is listed as $40,000. To test this reported figure, at LOC = 99%, a random sample of 20 recent Environmental Studies graduates' starting salaries was collected.
The raw data is below (in $/year): 34,500, 38,000, 41,000, 32,500, 28,000, 45,000, 30,000, 34,000, 37,500, 47,250, 30,500, 36,000, 29,750, 31,800, 42,250, 35,000, 34,500, 39,500, 40,000, 27,750
a) Assess the government report's claim, using the critical-value method.
b) Use the p-value method to determine if there are any commonly-used LOC values for which the conclusion would be opposite to your answer from Part (a).

Question 9: Referring to the data from Question 3, comparing tree heights in two different forest areas:
a) Assume that this data was collected after a claim was made that the mean tree heights in these two forest areas are equal. Test this claim at LOC = 95%, using the critical value method.
b) Explain how the 95% confidence interval for the difference in mean tree heights from these two forest areas (as calculated in Question 3(a)) confirms the result from Part (a) of this question above.
c) Use the p-value method to determine if your decision from Part (a) above would change for any of α = 0.10, 0.01, 0.005, 0.001.
d) Assuming that the samples were truly random, do you think that the average tree heights in these two forest areas are truly different, or are the differences observed probably just attributable to random sampling error? Explain in the context of your answers above (Note: there is no single right answer to this question - but your answer needs to be consistent with the arguments supporting it).

Question 10: Referring to the data from Question 4, comparing the body-mass of people before and after switching from driving to walking to/from work:
a) Assume that this data was collected after a claim was made that switching from driving to walking does not lead to a change in weight. Test this claim at LOC = 99%, using the critical value method.
b) Explain how the 99% confidence interval for the average net change in weight (as calculated in Question 4(a)) confirms the result from Part (a) of this question above.
c) Use the p-value method to determine if your decision from Part (a) above would change for any of α = 0.10, 0.05, 0.005, 0.001.
d) Assuming that the sampling in this study was done in a random and unbiased manner, do you think that switching from driving to walking is linked to weight change among this population, or are the observed weight changes probably just attributable to random sampling error? Explain in the context of your answers above (Note: there is no single right answer to this question - but your answer needs to be consistent with the arguments supporting it).

Question 11: Referring to the data from Question 5, comparing the levels of support for an amalgamation proposal in the two potentially-affected towns:
a) Assume that this data was collected after a claim was made that the level of support is different in the two towns. Test this claim at LOC = 95%, using the critical value method.
b) Use the p-value method to determine if your decision from Part (a) above would change for any of α = 0.10, 0.01, 0.005, 0.001.
c) Assuming that the sampling in this study was done in a random and unbiased manner, do you think that the level of support for amalgamation is equal in the two towns, or are observed differences probably just attributable to random sampling error? Explain in the context of your answers above (Note: there is no single right answer to this question - but your answer needs to be consistent with the arguments supporting it).

STAT 2263 - UNB Online College of Extended Learning, University of New Brunswick

Assignment #5: Additional Topics in Inferential Statistics

Instructions: Students are advised to submit this assignment within 5 months of their course start date. Assignment is to be submitted via pdf file attachment to: robert.krausz@unb.ca. Your submitted assignment must be your own, original work.
Any submitted work that is found not to be your own original work is considered to be plagiarism, and will receive a grade of zero - and may result in an automatic grade of 'F' in this course, as well as academic disciplinary action, as per University of New Brunswick policy. Please answer all questions as indicated, and show all of your work for full marks. Some questions can have more than one possible correct answer, so your own choice of answer must be clear and well-justified.

Question 1: A study is conducted to examine the influence of time spent watching TV or on the internet, on student performance on Statistics exams. A class of 8 students is observed over a period of time, with the independent variable being the total amount of time each student spends on TV/internet, and the dependent variable being their subsequent Statistics exam score, in %. The data is shown in the table below:
(a) Determine the equation of the line of best fit, relating Y = exam score (%) to X = hrs/wk spent watching TV or on the internet.
(b) Use the line of best fit calculated in Part (a) to calculate an estimate for the exam score that a student would get, to the nearest %, if they spent an average of 30 hrs/wk watching TV or on the internet. Repeat for an estimate of the exam score that would result after a student spent an average of 10 hrs/wk watching TV or on the internet.
(c) Plot the raw data from the table on an x-y graph, and then draw the line of best fit showing at least 2 calculated points that are on that line (Hint: your answers for Parts (a) and (b) provide you with 3 such points).
(d) What is the predicted exam score, to the nearest %, for a student who completely avoids the TV or internet? Repeat for a predicted exam score for a student who spends an average of 56 hrs/wk watching TV or on the internet. Comment briefly on your answers for these two estimates, and what it implies about the limitations of the linear regression model generated in Part (a).
(e) Conduct a hypothesis test on the significance of correlation between hrs/wk spent watching TV or on the internet, and exam performance, using the critical-value method at LOC = 95%.
(f) Use the p-value method to determine the common values of LOC (if any) for which your decision in Part (e) would be that there is no significant correlation between X and Y, and the common LOC values (if any) for which the opposite decision would be made.

Question 2: A study was undertaken to compare the waste-generating behaviour of residents in four remote, isolated communities: Ptaouchnok, Malakazoo, Erehwon, and Closna. 10 households were randomly selected from each of these communities, and the average daily garbage output measured over a specified period of time. The data obtained is shown in the table below (values in kg/day of waste):
a) At LOC = 95%, what would you conclude about whether or not there is any difference in garbage generation rates across these four towns? Use the critical-value method.
b) Using the p-value method, determine if your conclusion from Part (a) would be different for any common values of LOC.

Question 3: A market gardener grows four different varieties of tomato: cherry, black, plum, and brandywine. A test is conducted whereby 40 seeds of each variety are sown under similar conditions, with the number of germinated and non-germinated seeds recorded. The results are as follows:
a) Conduct a test, at LOC = 95%, for whether the seed germination rate for the market gardener's tomatoes are independent of, or dependent upon, the particular variety. Use the critical-value method.
b) Using the p-value method, determine if your conclusion from Part (a) would be different for any common values of LOC.
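The results table for Question 3 did not survive in this copy, so the actual test cannot be reproduced here. As a rough illustration of the critical-value method for a chi-square test of independence, the sketch below uses invented germination counts (40 seeds per variety). The counts, the function name `chi_square_independence`, and the variable names are all hypothetical. The critical value 7.815 is the standard table value for 3 degrees of freedom at α = 0.05.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# HYPOTHETICAL counts, not the assignment's data:
# rows = cherry, black, plum, brandywine; columns = germinated, not germinated.
hypothetical_counts = [
    [30, 10],
    [25, 15],
    [35, 5],
    [28, 12],
]

# Table value of the chi-square distribution for df = 3, alpha = 0.05.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815

if __name__ == "__main__":
    stat, df = chi_square_independence(hypothetical_counts)
    print(f"chi-square = {stat:.3f}, df = {df}")
    if stat > CRITICAL_VALUE_DF3_ALPHA_05:
        print("Reject independence at the 5% level.")
    else:
        print("Fail to reject independence at the 5% level.")
```

With the real counts substituted in, the same comparison of the statistic against the critical value gives the decision for Part (a); the p-value in Part (b) would come from a chi-square table or statistical software.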