#6 Theme: ANOVA
Due: Dec 1, 2015

For your homework, please submit your SAS program code (P), your output (O), and your written answers (W) as specified below, in the provided HW template (answer template available on Blackboard). Late submissions will receive zero credit. The number of possible points for each problem is specified at the beginning of each question.

Reading: Chapters 17, 13, 18 (KKNR)

1. One-way ANOVA. Use data from problem 19 of Ch. 17 (ex1719_XX.dat, where XX is your HWnumber) from the "Best Mutual Funds" of 1996 from U.S. News & World Report. Variables in this dataset (in order) are: fund, cat (fund category: 1=Aggressive growth, 2=Long-term growth, 3=Growth and income, 4=Income), load (N=no load, L=load), vol (volatility: letter grade from A+ to F, with A+ = least variability and F = most variability), OPI (Overall Performance Index: higher=better). See textbook for more details.
   a. (2: P) Create and apply a format that will appropriately identify the 4 levels of the fund category [cat] variable (Reminder: see lab 2). Also create meaningful labels for the OPI and cat variables. The labels/formatting will improve the readability of the boxplot below.
   b. (3: P,O) Create a boxplot of OPI by fund category. Make sure the mean is plotted for each group. (Hint: to allow the names of the 4 cat variable levels to be fully printed, create a horizontal boxplot. Do this by: (1) turning on ODS graphics if not already on and (2) using the horizontal option in the boxplot statement.)
   c. (2: P,O) In a small table, report the sample size, mean, standard deviation, minimum, and maximum of OPI for each fund category group. Comment on whether the standard deviations are similar across groups.
   d. (3: W) In the context of this problem, write three forms of the 1-way ANOVA model, along with the corresponding null and alternative hypotheses.
   e. (2: P,O) Use SAS to perform an ANOVA and to obtain both Tukey's and Scheffe's tests for multiple pairwise comparisons.
   f. (3: W) Summarize your conclusions.

2. One-way ANOVA using categorized continuous predictor. Use the individualized Louisiana cholesterol data set to investigate the association between triglycerides and BMI. We will use Log(Tg) rather than Tg as we have found it better satisfies linear model assumptions.
   a. (3: P,O) Create a new variable (BMICAT) that has four equal-sized categories based on quartiles of the BMI distribution, with values 1, 2, 3, 4 for each quartile that are appropriately labeled using a format. Verify that you have 48 subjects in each BMICAT group by using PROC FREQ.
   b. (3: W,P,O) Create a scatterplot of Log(Tg) vs. BMI (include a simple linear regression line), and then a boxplot displaying the mean of Log(Tg) for each BMICAT category. What do you conclude from these plots?
   c. (4: W,P,O) Perform one-way ANOVA to determine whether Log(Tg) is associated with BMICAT. Use an appropriate multiple comparisons procedure if your overall F-test is significant. Summarize your conclusions.

3. Nonparametric ANOVA. Use the ex1719_XX.dat dataset again and:
   a. (3: W,P,O) Report the median OPI in each fund category and perform the Kruskal-Wallis test for differences in median OPI among fund categories. Summarize your findings.
   b. (3: W,P,O) Use ANOVA on ranks to nonparametrically test for differences in OPI among fund categories. Use a multiple comparisons procedure to investigate pairwise differences between fund categories. Summarize your findings.
   c. (2: W) Compare your findings from a. and b., and summarize your overall conclusions based on nonparametric analysis.
   d. (2: W) Would you recommend the parametric approach you used in Question 1 or a nonparametric approach from this Question? Why?

4. ANOVA with a covariate/ANACOVA. Use the data from exercise 19 in Ch. 12 (ex1219_XX.dat, where XX is your HWnumber) from a random sample of residential home sales in a large city.
The variables in this dataset (in order) are: House ID, Y (sales price in $1,000s), X1 (area in hundreds of square feet), X2 (number of bedrooms), X3 (total number of rooms), X4 (age in years), and location (dummy variables Z1 = inner suburbs indicator and Z2 = outer suburbs indicator, with the reference of "in town").
   a. (4: W,P,O) Use ANOVA to determine whether the mean selling price differs by location (in town, inner suburbs, outer suburbs). If indicated, use an appropriate multiple comparisons procedure. Summarize your findings. (Hint: first create a single variable called 'loc' that has three categories based on the values of Z1 and Z2.)
   b. (4: W,P,O) Using a centered version of the area variable, determine whether area is associated with selling price and whether area is associated with location, and comment on the implications of these associations.
   c. (4: W,P,O) Use ANACOVA to determine whether the mean selling price differs by location after adjusting for (centered) area. Use an appropriate multiple comparisons procedure if necessary. Summarize your findings.
   d. (5: W,P,O) Create scatterplots of selling price vs. centered area, separately for each location, with a regression line for each of the three associations (Hint: PROC GPLOT with a plot statement like: plot y*x=categoricalvar may be helpful). Comment on your model from part (c) in light of the data visualization in part (d). Describe how you might change your model.
   e. (3: W,P,O) Implement an updated model based on (d), and discuss the results.

5. Two-way ANOVA: Randomized Blocks. Use the data on silkworms from exercise 7 in Ch. 18 (ex1807_XX.dat, where XX is your HWnumber). This was a study to compare body sizes of three genotypes of fourth-instar silkworm, with the mean lengths for separately reared cocoons of each type of silkworm determined at 5 laboratory sites.
The variables of this dataset, in order, are: GENE (genotype: heterozygous, homozygous, or wild type), site (one of 5 laboratory sites), and Y (the mean length in mm of the silkworms).
   a. (2: W) Viewing this as a randomized-blocks experiment, what are the "blocks" and what are the "treatments" in this experiment?
   b. (2: W,P,O) Create scatterplots of length vs. genotype and of length vs. site. Briefly describe your general impressions.
   c. (2: W) Write out the fixed effect indicator variable regression model representation of this design.
   d. (2: W,P,O) Formally state the null and alternative hypotheses for a test of "treatment effect". Perform the test using SAS and state your conclusion.
   e. (2: W,P,O) Perform a hypothesis test to assess whether blocking was justified in this experiment.

6. ANOVA in the literature. Read the article by Bellar et al. (2011) posted on Blackboard ("HW6 article") in which the authors perform analyses using ANOVA methods. Provide the following information.
   a. (2: W) Viewing this as a randomized-blocks/repeated measures experiment, what are the blocks and what are the treatments in this experiment? How many blocks and how many treatments are there?
   b. (2: W) Write out the fixed effect indicator variable regression model representation of this design, where time to exhaustion is the outcome.
   c. (2: W) Write out the null and alternative hypotheses for the model in Part b. for the treatment group effect.
   d. (3: W) Bellar et al. wrote: "Given the nature of the design, because subjects will be compared in a within-subject repeated-measures fashion, the caffeine usage of each will be statistically adjusted for." A colleague who has not taken PM511a asks you how this design automatically adjusts for the caffeine usage of each subject. Explain in 1-2 sentences.
   e. (1: W) Name another valid method that could be used to perform the hypothesis test in Part c.

HW Quiz (25 points)
The HW quiz is available on Blackboard. The quiz is timed and is open book, open notes, open HW. You may not discuss the HW quiz with others.
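
Note on the hint in Problem 4(a): the assignment itself asks for SAS, so the following is only a language-neutral sketch, written in Python, of the logic behind collapsing Z1 and Z2 into a single three-level 'loc' variable and then forming the one-way ANOVA F statistic. The function names (location_from_dummies, one_way_anova_f) and the six data rows are hypothetical and made up for illustration; they are not taken from ex1219_XX.dat, and this is not presented as the expected solution.

```python
from statistics import mean


def location_from_dummies(z1: int, z2: int) -> str:
    """Collapse the two indicator variables into one three-level category."""
    if z1 == 1 and z2 == 0:
        return "inner suburbs"
    if z1 == 0 and z2 == 1:
        return "outer suburbs"
    if z1 == 0 and z2 == 0:
        return "in town"  # reference level: both indicators are 0
    raise ValueError("Z1 and Z2 cannot both equal 1")


def one_way_anova_f(groups: dict[str, list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up rows of (Z1, Z2, price in $1,000s); NOT values from the homework data.
rows = [(0, 0, 10.0), (0, 0, 12.0), (1, 0, 14.0), (1, 0, 16.0), (0, 1, 18.0), (0, 1, 20.0)]

groups: dict[str, list[float]] = {}
for z1, z2, price in rows:
    groups.setdefault(location_from_dummies(z1, z2), []).append(price)

f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)  # 16.0 2 3
```

In SAS the same two steps would typically be an IF/ELSE block in a DATA step followed by an ANOVA procedure such as PROC GLM with loc in the CLASS statement; the sketch above only shows what those steps compute.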