Questions and Answers of Applied Statistics And Multivariate
In this problem you will modify the data set created in Problem 8.7 to make it suitable for the theoretical exercises in discriminant analysis. Generate the sample data for X1, X2, ..., X9 as in
(Continuation of Problem 11.2.) Now divide the companies into three groups: group I consists of those companies with a P/E of 7 or less, group II consists of those companies with a P/E of 8 to 10,
(Continuation of Problem 11.2.) Perform a variable selection analysis, using stepwise and best-subset programs. Compare the results with those of the variable selection analysis given in Chapter 9.
(Continuation of Problem 11.2.) Choose a different set of prior probabilities and costs of misclassification that seems reasonable and repeat the analysis.
(Continuation of Problem 11.2.) Test whether D/E alone does as good a classification job as all six variables.
For the data shown in Table 9.1, divide the chemical companies into two groups: group I consists of those companies with a P/E less than 9, and group II consists of those companies with a P/E greater
Using the depression data set, perform a stepwise discriminant function analysis with age, sex, log(income), bed days, and health as possible variables. Compare the results with those given in
For the variables describing the average number of cigarettes smoked during the past 3 months (SMOKEP3M) and the variable describing the mother’s education (EDUMO) in the Parental HIV data determine
Using the data from the Parents HIV/AIDS study, for those adolescents who have started to use alcohol, predict the age when they first start their use (AGEALC). Predictive variables should include
Using dummy variables, run a regression analysis that relates CESD as the dependent variable to marital status in the depression data set given in Chapter 3. Do it separately for males and females.
Using the family lung function data, find the regression of height for the oldest child on mother’s and father’s height. Include a dummy variable for the sex of the child and any necessary
Perform a ridge regression analysis of the family lung function data using FEV1 of the oldest child as the dependent variable and height, weight and age of the oldest child as the independent
Using the family lung function data, relate FEV1 to height for the oldest child in three ways: simple linear regression (Problem 7.9), regression of FEV1 on height squared, and spline regression
In the depression data set, define Y = the square root of total depression score (CESD), X1 = log(income), X2 = Age, X3 = Health and X4 = Bed days. Set X1 = missing whenever X3 = 4 (poor health). Also
Take the family lung function data described in Appendix A and delete (label as missing) the height of the middle child for every family with ID divisible by 6, that is, families 6, 12, 18, etc. (To
(Continuation of Problem 10.8.) Using the data in the table given in Problem 10.8, compute the midpoints of weight range for all frame sizes for men and women separately. Pretending that the results
Use the data described in Problem 8.7. Since some of the X variables are intercorrelated, it may be useful to do a ridge regression analysis of Y on X1 to X9. Perform such an analysis, and compare
Unlike the real data used in Problem 10.5, the accompanying data are “ideal” weights published by the Metropolitan Life Insurance Company for American men and women. Compute Y = midpoint of
(Continuation of Problem 10.5.) Do a similar analysis for the first boy and girl. Include age and age squared in the regression equation.
Another way to answer the question of interaction between the independent variables in Problem 8.13 is to define a dummy variable that indicates whether an observation is above the median weight, and
Use the lung function data described in Appendix A. For the parents we wish to relate Y = weight to X = height for both men and women in a single equation. Using dummy variables, write an equation for
Draw a ridge trace for the accompanying data.
Case  X1    X2    X3    Y
1     0.46  0.96  6.42  3.46
2     0.06  0.53  5.53  2.25
3     1.49  1.87  8.37  5.69
4     1.02  0.27  5.37  2.36
5     1.39  0.04  5.44  2.65
6     0.91  0.37  6.28
In the depression data set, determine whether religion has an effect on income when used as an independent variable along with age, sex, and educational level.
Repeat Problem 10.1, but now use a dummy variable for education. Divide the education level into three categories: did not complete high school, completed at least high school, and completed at least
In the depression data set described in Chapter 3, data on educational level, age, sex, and income are presented for a sample of adults from Los Angeles County. Fit a regression plane with income as
Using the Parental HIV data find the best model that predicts the age at which adolescents started drinking alcohol among those who have started drinking alcohol. Since the data were collected
Using the Parental HIV data consider performing a confirmatory data analysis investigating the relationship between the age at which children started drinking alcohol (if they have already started)
From among the candidate variables given in Problem 9.11, find the subset of three variables that best predicts height in the oldest child, separately for boys and girls. Are the two sets the same?
Using the methods described in this chapter and the family lung function data described in Appendix A, and choosing from among the variables OCAGE, OCWEIGHT, MHEIGHT, MWEIGHT, FHEIGHT, and FWEIGHT,
Force the variables you selected in Problem 9.9(a) into the regression equation with OCFEV1 as the dependent variable, and test whether including the FEV1 of the parents (i.e., the variables MFEV1
(a) For the lung function data set described in Appendix A with age, height, weight, and FVC as the candidate independent variables, use subset regression to find which variables best predict FEV1 in
In Problem 8.7 the population multiple R^2 of Y on X4, X5, ..., X9 is zero. However, from the sample alone we don’t know this result. Perform a variable selection analysis on X4 to X9, using your
For the data from Problem 8.7, perform a variable selection analysis, using the methods described in this chapter. Comment on the results in view of the population parameters.
Use the data you generated from Problem 8.7, where X1, X2, ..., X9 are the independent variables and Y is the dependent variable. Use the generalized linear hypothesis test to test the hypothesis
Using the data given in Table 9.1, repeat the analyses described in this chapter with (P/E)^(1/2) as the dependent variable instead of P/E. Do the results change much? Does it make sense to use the
For adult males it has been demonstrated that age and height are useful in predicting FEV1. Using the data described in Appendix A, determine whether the regression plane can be improved by also
Forbes gives, each year, the same variables listed in Table 9.1 for the chemical industry. The changes in lines of business and company mergers resulted in a somewhat different list of chemical
Repeat Problem 9.1 using subset regression, and compare the results.
Use the depression data set described in Table 3.4. Using CESD as the dependent variable, and age, income, and level of education as the independent variables, run a forward stepwise regression
For the Parental HIV data generate a variable that represents the sum of the variables describing the neighborhood where the adolescent lives (NGHB1–NGHB11). Is the age at which adolescents start
Repeat Problem 8.15(a) for fathers’ measurements instead of those of the oldest children. Are the regression coefficients more stable? Why?
(Continuation of Problem 8.13.) a) For the oldest child, find the regression of FEV1 on (i) weight and age; (ii) height and age; (iii) height, weight, and age. Compare the three regression equations.
(Continuation of Problem 8.13.) Find the partial correlation of FEV1 and age given height for the oldest child, and compare it to the simple correlation between FEV1 and age of the oldest child. Is
For the lung function data described in Appendix A, find the regression of FEV1 on weight and height for the fathers. Divide each of the two explanatory variables into two intervals: greater than, and
(Continuation of Problem 8.11.) For the regression of CESD on INCOME and AGE, choose 15 observations that appear to be influential or outlying. State your criteria, delete these points, and repeat
(Continuation of Problem 8.5.) Fit a regression plane for CESD on INCOME and AGE for males and females combined. Test whether the regression plane is helpful in predicting the values of CESD. Find a
(Continuation of Problem 8.7.) Perform a multiple regression analysis, with the dependent variable = Y and the independent variables = X1 to X9, on the 100 generated cases. Summarize the results and
(Continuation of Problem 8.7.) Calculate the population partial correlation coefficient between X2 and X3 after removing the linear effect of X1. Is it larger or smaller than r23? Explain. Also,
Repeat Problem 8.7 using another statistical package and see if you get the same sample.
Using a statistical package of your choice, create a hypothetical data set which you will use for exercises in this chapter and some of the following chapters. Begin by generating 100 independent
Search for a suitable transformation for CESD if the normality assumption in Problem 8.5 cannot be made. State why you are not able to find an ideal transformation if that is the case.
From the depression data set described in Table 3.4, predict the reported level of depression as given by CESD, using INCOME, SEX, and AGE as independent variables. Analyze the residuals and decide
Fit the regression plane for mothers with MFVC as the dependent variable and age and height as the independent variables. Summarize the results in a tabular form. Test whether the regression results
Write the results for Problem 8.2 so they would be suitable for inclusion in a report. Include table(s) that present the results the reader should see.
Fit the regression plane for the fathers using FFVC as the dependent variable and age and height as the independent variables.
Using the chemical companies’ data in Table 9.1, predict the price earnings (P/E) ratio from the debt to equity (D/E) ratio, the annual dividends divided by the 12-months’ earnings per share
Using the summary variable describing the neighborhood in Problem 7.15, generate a loess graph to examine the relationship between this variable and the age at which adolescents started using
For the Parental HIV data generate a variable that represents the sum of the variables describing the neighborhood where the adolescent lives (NGHB1–NGHB11). Does the age at which adolescents start
For the Parental HIV data produce a scatterplot of the age at which adolescents first started smoking versus the age at which they first started drinking alcohol. Based on the graph, do adolescents
For the mother, perform a regression of FEV1 on weight. Test whether the coefficients are zero. Plot the regression line on a scatter diagram of MFEV1 versus MWE1. On this plot, identify the
Examine the residual plot from the regression of FEV1 on height for the oldest child. Choose an appropriate transformation, perform the regression with the transformed variable, and compare the
What is the correlation between height and weight in the oldest child? How would your answer to the last part of Problem 7.10 change if r = 1? r = -1? r = 0?
For the oldest child, perform the following regression analyses: FEV1 on weight, FEV1 on height, FVC on weight, and FVC on height. Note the values of the slope and correlation coefficient for each
From the depression data set described in Table 3.4 create a data set containing only the variables AGE and INCOME. a) Find the regression of income on age. b) Successively add and then delete each of
(Continuation of Problem 7.7.) Calculate the variance of CESD for observations in each of the groups defined by income as follows: INCOME 59. For each observation, define a variable WEIGHT equal to 1
Using the depression data set (see Table 3.4), perform a regression analysis of depression, as measured by CESD, on income. Plot the residuals. Does the normality assumption appear to be met? Repeat
Examine the plot you produced in Problem 7.1 and choose some transformation for X and/or Y and repeat the analysis described there. Compare the correlation coefficients for the original and
Repeat Problem 7.2 using log(weight) and log(height) in place of the original variables. Using graphical and numerical devices, decide if the transformations help.
For the data in Problem 7.3, pretend that the index increases linearly in time and use linear regression to obtain an equation to forecast the index value as a function of time. Using “volume” as
In Problem 5.8, the New York Stock Exchange Composite Index and daily volume for August 9 through September 17, 1982, were presented. Describe how volume appears to be affected by the price index,
From the family lung function data set in Appendix A, perform a regression analysis of weight on height for fathers. Repeat for mothers. Determine the correlation coefficient and the regression
In Table 9.1, financial performance data of 30 chemical companies are presented. Use growth in earnings per share, labelled EPS5, as the dependent variable and growth in sales, labelled SALESGR5, as
The Parental HIV data include information on the age at which adolescents started smoking. Where does this variable fit into Stevens’s classification scheme? Particularly comment on the issue
Suppose you would like to analyze the relationship between the number of times an adolescent has been absent from school without a reason and how much the adolescent likes/liked going to school for
In the depression study, information was obtained on the respondent’s religion (Chapter 3). Describe why you think it is incorrect to obtain an average score for religion across the 294 respondents.
Using the lung function data described in the Appendix, an investigator would like to predict a child’s lung function based on that of the parents and the area they live in. What analyses would be
A psychologist would like to predict whether or not a respondent in the depression study described in Chapter 3 is depressed. To do this, she would like to use the information contained in the
Two methods are currently used to treat a particular type of cancer. It is suspected that one of the treatments is twice as effective as the other in prolonging survival regardless of the severity of
A member of the admissions committee notices that there are several women with high grade point averages but low SAT scores. He wonders if this pattern holds for both men and women in general, only
For the data described in Problem 6.6 we wish to relate health data such as infant mortality (the proportion of children dying before the age of one year) and life expectancy (the expected age at
For the data described in the prior problem we wish to put together similar countries into groups. Suggest possible analyses.
Large amounts of data are available from the United Nations and other international organizations such as the World Bank for each country and sovereign state of the world, including health,
Data on men and women who have died have been obtained from health maintenance organization records. These data include age at death, height and weight, and several physiological and lifestyle
A college admissions committee wishes to predict which prospective students will successfully graduate. To do so, the committee intends to obtain the college grade point averages for a sample of
A coach has made numerous measurements on successful basketball players, such as height, weight, and strength. He also knows which position each player is successful at. He would like to obtain a
An investigator is attempting to determine the health effects on families of living in crowded urban apartments. Several characteristics of the apartment have been measured, including square feet of
Compute an appropriate measure of the center of the distribution for the following variables from the depression data set: MARITAL, INCOME, AGE, and HEALTH.
Using the lung cancer data described in Appendix A, examine the distribution of the variable days separately for those who died (death=1) and for those who did not (death=0). Plot a normal
Using the Parental HIV data calculate an overall Brief Symptom Inventory (BSI) score for each adolescent (see the codebook for details). Log-transform the BSI score. Obtain a normal probability plot
Using the Parental HIV data set (see Appendix A), plot a histogram, a boxplot, and a normal probability plot for the variable AGESMOKE. This variable is the age in years when the respondent started
Repeat Problem 5.7 with weights expressed in ounces instead of pounds. How will your conclusions change? Obtain normal probability plots of the logarithm of mothers’ weights expressed in pounds and
Obtain a normal probability plot of the index given in Problem 5.8. Suppose that you had been ignorant of the lack of independence of these data and had treated them as if they were independent
Generate ten random normal deviates. Display a probability plot of these data. Suppose you didn’t know the origin of these data. Would you conclude they were normally distributed? What is your
The accompanying data are from the New York Stock Exchange Composite Index for the period August 9 through September 17, 1982. Run a program to plot these data in order to assess the lack of
Obtain normal probability plots of mothers’ and fathers’ weights from the lung function data set described in Appendix A. Discuss whether or not you consider weight to be normally distributed in
Take the logarithm of the CESD score plus 1 and compare the histograms of CESD and log(CESD + 1). (A small constant must be added to CESD because CESD can be zero.)
Use the two sets of 100 random numbers from Problem 5.4. Display boxplots of these two sets of values and state which of the three graphical methods (histograms, normal probability plots, and
Generate a set of 100 random normal deviates by using the random number generator in your software program of choice (R: rnorm, SAS: NORMAL, Stata: rnormal, SPSS: RV.NORMAL). Display a histogram and
Repeat Problems 5.1 and 5.2, taking the square root of income.
Take the logarithm to base 10 of the income variable in the depression data set. Compare the histogram of income with the histogram of log(INCOME). Also, compare the normal probability plots of
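The log(CESD + 1) question above notes that a constant must be added because CESD can be zero. A minimal Python sketch of that point follows. The scores are hypothetical values made up for illustration (they are not from the depression data set), the helper name `log_plus_one` is an assumption, and the natural logarithm is used because the question does not specify a base.

```python
import math

# Hypothetical CESD scores for illustration only; not taken from the depression data set.
cesd_scores = [0, 3, 7, 12, 25, 40]


def log_plus_one(scores):
    """Return the natural log of (score + 1) for each score."""
    return [math.log(score + 1) for score in scores]


transformed = log_plus_one(cesd_scores)

# A score of zero maps to log(1) = 0; math.log(0) itself would raise ValueError.
for raw, value in zip(cesd_scores, transformed):
    print(f"CESD={raw:2d}  log(CESD + 1)={value:.3f}")
```

The transformed values could then be passed to whatever histogram routine your statistical package provides, alongside the raw scores, to compare the two shapes.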