Questions and Answers of Statistical Techniques in Business
5.6 Repeat Problem 5.4 or 5.5 for D = 2, 5, 10, 20, and 50 multiple imputes and compare answers. For what value of D does the inference stabilize?
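A minimal sketch of the comparison asked for in 5.6, assuming the improper scheme of Problem 5.4 (regression of Y2 on Y1 fitted to the complete cases, conditional draws for the missing Y2) and Rubin's combining rules for the mean of Y2. The "yi2 missing when ui < 0" rule, the function names, and the normal reference value 1.645 are illustrative assumptions, not taken from the text.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n=100):
        # Problems 5.1-5.2: bivariate normal data, Y2 made missing through
        # the latent variable u = 2*(y1 - 1) + z3 ("missing when u < 0" is
        # an assumption about the truncated statement).
        z = rng.standard_normal((n, 3))
        y1 = 1 + z[:, 0]
        y2 = 5 + 2 * z[:, 0] + z[:, 1]
        miss = 2 * (y1 - 1) + z[:, 2] < 0
        return y1, y2, miss

    def impute_once(y1, y2, miss):
        # "Improper" imputation as in Problem 5.4: fit the regression of Y2
        # on Y1 to complete cases, impute conditional draws without
        # redrawing the regression parameters.
        b1, b0 = np.polyfit(y1[~miss], y2[~miss], 1)
        resid = y2[~miss] - (b0 + b1 * y1[~miss])
        s = resid.std(ddof=2)
        y2_imp = y2.copy()
        y2_imp[miss] = b0 + b1 * y1[miss] + s * rng.standard_normal(miss.sum())
        return y2_imp

    def rubin_mean_interval(y1, y2, miss, D):
        # Rubin's combining rules for the mean of Y2 across D imputations.
        ests, within = [], []
        for _ in range(D):
            y2_imp = impute_once(y1, y2, miss)
            ests.append(y2_imp.mean())
            within.append(y2_imp.var(ddof=1) / len(y2_imp))
        qbar = np.mean(ests)
        b = np.var(ests, ddof=1)
        t_var = np.mean(within) + (1 + 1 / D) * b     # total variance
        half = 1.645 * np.sqrt(t_var)                 # normal reference for brevity;
        return qbar, (qbar - half, qbar + half)       # Rubin's rules use a t reference

    y1, y2, miss = make_data()
    for D in (2, 5, 10, 20, 50):
        est, ci = rubin_mean_interval(y1, y2, miss, D)
        print(f"D={D:2d}  mean={est:.3f}  90% interval=({ci[0]:.3f}, {ci[1]:.3f})")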
5.5 As discussed in Section 5.4, the imputation method in Problem 5.4 is improper because it does not propagate the uncertainty in the regression parameter estimates. One way of making it proper is
5.4 For the data in Problem 5.3, create ten multiply imputed data sets with different sets of conditional draws of the parameters, using the method of Problem 5.3. Compute 90% intervals for the mean
5.3 Repeat Problem 5.2 with the same observed data, but with missing values imputed using conditional draws rather than conditional means. That is, add a random normal deviate with mean 0 and variance
5.2 Create missing values of Y2 for the data in Problem 5.1 by generating a latent variable U with values ui = 2∗(yi1 − 1) + zi3, where zi3 is a standard normal deviate, and setting yi2 as
Compute and compare estimated standard errors of estimates of (a) the mean of Y2 and (b) the coefficient of variation of Y2, computed using the bootstrap, the jackknife and analytical formulae (exact
5.1 As in Problem 1.6, generate 100 bivariate normal units {(yi1, yi2), i = 1, …, 100} on (Y1, Y2) as follows: yi1 = 1 + zi1, yi2 = 5 + 2∗zi1 + zi2,
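The generation in 5.1 and the bootstrap/jackknife comparison asked for in the part listed above can be sketched as follows; the code is illustrative (B = 1000 replicates is an arbitrary choice), and only the mean's analytical standard error is shown, the delta-method formula for the coefficient of variation being left as the exercise suggests.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100

    # Generate the bivariate normal units of Problem 5.1 / 1.6.
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    y1 = 1 + z1
    y2 = 5 + 2 * z1 + z2

    def cv(x):
        # coefficient of variation: sd / mean
        return x.std(ddof=1) / x.mean()

    def bootstrap_se(x, stat, B=1000):
        reps = [stat(x[rng.integers(0, len(x), len(x))]) for _ in range(B)]
        return np.std(reps, ddof=1)

    def jackknife_se(x, stat):
        m = len(x)
        loo = np.array([stat(np.delete(x, i)) for i in range(m)])
        return np.sqrt((m - 1) / m * ((loo - loo.mean()) ** 2).sum())

    print("mean of Y2:", round(y2.mean(), 3))
    print("  SE (analytic) :", y2.std(ddof=1) / np.sqrt(n))
    print("  SE (bootstrap):", bootstrap_se(y2, np.mean))
    print("  SE (jackknife):", jackknife_se(y2, np.mean))
    print("CV of Y2:", round(cv(y2), 3))
    print("  SE (bootstrap):", bootstrap_se(y2, cv))
    print("  SE (jackknife):", jackknife_se(y2, cv))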
(d) Hot-deck imputation, with adjustment cells computed by categorizing the complete units into quartiles based on the distribution of Y1. Suggest a situation where (d) might be a superior method to
(c) Stochastic regression imputation based on the normal model, where a random normal deviate N(0, s^2_{22⋅1}) is added to each of the conditional means from (b);
(b) Buck’s method, imputing the conditional mean of Y2 given Y1 from the linear regression based on complete units;
(a) Complete-case analysis;
4.14 For the artificial data sets generated for Problem 1.6, compute and compare estimates of the mean and variance of Y2 from the following methods:
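A sketch of the four-way comparison in 4.14 above, assuming the Problem 1.6 data-generation scheme and a "yi2 missing when ui < 0" rule (the exercise text is truncated here); the quartile cells and donor sampling are one plausible reading of part (d), not the book's code.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100

    # Data as in Problem 1.6, Y2 made missing through a latent variable.
    z = rng.standard_normal((n, 3))
    y1 = 1 + z[:, 0]
    y2 = 5 + 2 * z[:, 0] + z[:, 1]
    miss = 2 * (y1 - 1) + z[:, 2] < 0      # assumed missingness rule

    def summarize(label, values):
        print(f"{label:22s} mean={values.mean():.3f}  var={values.var(ddof=1):.3f}")

    # (a) complete-case analysis
    summarize("complete cases", y2[~miss])

    # (b) Buck's method: impute the conditional mean of Y2 given Y1 from
    #     the regression fitted to the complete units
    b1, b0 = np.polyfit(y1[~miss], y2[~miss], 1)
    buck = y2.copy()
    buck[miss] = b0 + b1 * y1[miss]
    summarize("Buck (cond. mean)", buck)

    # (c) stochastic regression imputation: add a N(0, s^2_{22.1}) draw,
    #     where s^2_{22.1} is the residual variance from (b)
    resid = y2[~miss] - (b0 + b1 * y1[~miss])
    s = resid.std(ddof=2)
    stoch = buck.copy()
    stoch[miss] = buck[miss] + s * rng.standard_normal(miss.sum())
    summarize("stochastic regression", stoch)

    # (d) hot deck within adjustment cells given by quartiles of Y1
    cells = np.digitize(y1, np.quantile(y1, [0.25, 0.5, 0.75]))
    hot = y2.copy()
    for c in range(4):
        donors = y2[(cells == c) & ~miss]
        need = (cells == c) & miss
        if need.any() and len(donors) > 0:
            hot[need] = rng.choice(donors, size=need.sum(), replace=True)
    summarize("hot deck (quartiles)", hot)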
4.13 Outline a situation where the “last observation carried forward” method of Example 4.12 gives poor estimates (see, for example, Little and Yau 1996).
4.12 Another method for generating imputations is the sequential hot deck, where responding and nonresponding units are treated in a sequence, and a missing value of Y is replaced by the nearest
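The sequential hot deck of 4.12 can be sketched as below, assuming (since the statement is cut off) that a missing value is replaced by the nearest preceding responding value; the random-donor fallback for a missing value at the start of the sequence is one possible convention, not the book's.

    import numpy as np

    def sequential_hot_deck(y, miss, rng=None):
        # Replace each missing value by the most recent responding value
        # seen earlier in the sequence; fall back to a random donor if no
        # preceding respondent exists (assumed convention).
        rng = rng or np.random.default_rng(3)
        y = np.asarray(y, dtype=float).copy()
        donors = y[~np.asarray(miss)]
        last = None
        for i, m in enumerate(miss):
            if m:
                y[i] = last if last is not None else rng.choice(donors)
            else:
                last = y[i]
        return y

    y = [3.1, 2.4, np.nan, np.nan, 5.0, np.nan]
    miss = [False, False, True, True, False, True]
    print(sequential_hot_deck(y, miss))   # -> [3.1, 2.4, 2.4, 2.4, 5.0, 5.0]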
4.11 Consider a hot deck like that of Example 4.8, except that imputations are by random sampling of donors without replacement. To define the procedure when there are fewer donors than recipients,
4.10 Derive expressions (4.6)–(4.8) for the simple hot deck where imputations are by simple random sampling with replacement. Assuming r/N is small and large samples, show that the proportionate
4.9 Derive the expressions for the large-sample sampling variance of the Cmean and Cdraw estimates of μ2 in the discussion of Example 4.6.
4.8 Derive the expressions for large-sample bias in Table 4.1.
4.7 Buck’s method (Example 4.3) might be applied to data with both continuous and categorical variables, by replacing the categorical variables by a set of dummy variables, numbering one less than
4.6 Show that Buck's (1960) method yields consistent estimates of the means when the data are MCAR.
4.5 Suppose data are an incomplete random sample on Y1 and Y2, where Y1 given θ = (μ1, σ11, β20⋅13, β21⋅13, β23⋅13, σ22⋅13) is N(μ1, σ11) and Y2 given Y1 and θ is
4.4 Derive the expressions for the biases of Buck's (1960) estimators of σjj and σjk, stated in Example 4.3.
4.3 Describe the circumstances where Buck’s (1960) method clearly dominates both complete-case analysis and available-case analysis.
4.2 Repeat Problem 4.1 when the missing values are filled in by Buck's (1960) method of Example 4.3 and compare the answers.
4.1 Consider a bivariate sample with n = 45; r = 20 complete units, 15 units with only Y1 recorded, and 10 units with only Y2 recorded. The data are filled in using unconditional means, as in Section
3.18 Review the results of Haitovsky (1968), Kim and Curry (1977), and Azen and Van Guilder (1981). Describe situations where CC analysis is more sensible than AC analysis, and vice versa.
3.17 When the data are not MCAR, consider the relative merits of CC analysis and AC analysis for estimating (a) means, (b) correlations, and (c)regression coefficients.
3.16 (a) Why does the estimated correlation (3.19) always lie in the range (−1, 1)? (b) Suppose the means ȳ_j^(jk) and ȳ_k^(jk) in the definitions of s_jj^(jk), s_jk^(jk), and s_kk^(jk) in (3.17) are
3.15 Construct a data set where the estimated correlation given by Eq. (3.18) lies outside the range (−1, 1).
3.14 For the data in Problem 3.13, compute the odds ratio of
3.13 Compute raked estimates of the class counts from the sample counts and respondent counts in (a) and (b) below, using population marginal counts in (c):
(a) Sample {njl}        (b) Respondent {rjl}        (c) Population {Njl}
 8  10  18               5   9  14                    ?    ?   300
15  17  32               5   8  13                    ?    ?   700
23  27  50              10  17  27                  500  500  1000
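Raking as asked for in 3.13 amounts to iterative proportional fitting of a count table to the population margins in (c); the sketch below rakes the respondent counts in (b), and the same loop applies to the sample counts in (a). The iteration cap and tolerance are arbitrary choices.

    import numpy as np

    # Respondent 2x2 counts r_{jl} from (b) and population margins from (c).
    r = np.array([[5.0, 9.0],
                  [5.0, 8.0]])
    row_margins = np.array([300.0, 700.0])   # population row totals
    col_margins = np.array([500.0, 500.0])   # population column totals

    N = r.copy()
    for _ in range(100):
        N *= (row_margins / N.sum(axis=1))[:, None]   # match row totals
        N *= (col_margins / N.sum(axis=0))[None, :]   # match column totals
        if np.allclose(N.sum(axis=1), row_margins, rtol=1e-10):
            break

    print(np.round(N, 1))   # raked estimates of the class counts N_{jl}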
3.12 Show that raking the class sample sizes and raking the class respondent sample sizes (as in Problem 3.11) yield the same estimate if and only if (pij pkl)∕(pil pjk) = 1 for all i, j, k, and l, where
3.11 Oh and Scheuren (1983, section 4.4.3) propose an alternative to the raked estimate ȳrake in Eq. (3.16), where the estimated counts N*jl are found by raking the respondent sample sizes
3.10 Generalize the response propensity method in Example 3.7 to a monotone pattern of missing data (see Little 1986; Robins et al. 1995).
(c) The mean from mean imputation within adjustment classes defined as in (b). Explain why adjusted estimates are higher than the unadjusted estimates.
(b) The weighted mean from response propensity stratification, with three strata defined by combining classes in the table with response rates less than 0.4, between 0.4 and 0.8, and greater than 0.8.
(a) The unadjusted mean based on complete units.
3.9 The following table shows respondent means of an incomplete variable Y (income in $1000), and response rates (respondent sample size/sample size), classified by three fully observed covariates:
3.8 Apply the Cassel et al. (1983) estimate, discussed in Example 3.7, to the data of Problem 3.7. Comment on the resulting weights as compared with those of the weighting class estimate.
3.7 Calculate Horvitz–Thompson and weighting class estimates of the overall mean of Y in the following artificial example of a stratified random sample, where the xi and yi values displayed are
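The xi and yi values of 3.7 are not reproduced in this listing, so the sketch below uses a small hypothetical stratified sample purely to show the form of the two estimators; the selection probabilities, response indicators, and y values are invented, and the Horvitz–Thompson estimate is shown in ratio (Hájek) form.

    import numpy as np

    # Hypothetical stratified sample (not the book's data): selection
    # probability pi_i, response indicator, y value, and weighting class.
    pi = np.array([0.1, 0.1, 0.1, 0.5, 0.5, 0.5])
    resp = np.array([1, 1, 0, 1, 0, 1], dtype=bool)
    y = np.array([10.0, 12.0, np.nan, 20.0, np.nan, 24.0])
    cell = np.array([0, 0, 0, 1, 1, 1])

    # Horvitz-Thompson-type (ratio form) estimate of the mean using the
    # respondents only, weighting each respondent by 1/pi_i.
    w = 1.0 / pi[resp]
    ht_mean = np.sum(w * y[resp]) / np.sum(w)

    # Weighting class estimate: inflate each respondent's design weight by
    # the inverse of the response rate within its weighting class.
    wc_w = w.copy()
    for c in np.unique(cell):
        rate = resp[cell == c].mean()
        wc_w[cell[resp] == c] /= rate
    wc_mean = np.sum(wc_w * y[resp]) / np.sum(wc_w)

    print("Horvitz-Thompson mean:", round(ht_mean, 2))
    print("Weighting class mean :", round(wc_mean, 2))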
3.6 Suppose census data yield the following age distribution for the county of interest in Problems 3.4 and 3.5: 20–30: 20%; 30–40: 40%; 40–50: 30%; and 50–60: 10%. Calculate the
3.5 Compute the weighting class estimate (3.6) of the mean cholesterol level in the population and its estimated mean squared error (3.7). Thereby construct an approximate 95% confidence interval for
3.4 Compute the mean cholesterol for the respondent sample and its standard error. Assuming normality, compute a 95% confidence interval for the mean cholesterol for respondents in the county. Should
3.3 Show that for dichotomous Y1 and Y2, the odds ratio based on complete units is a consistent estimate of the population odds ratio if the logarithm of the probability of response is an additive
3.2 Show that if missingness (of Y1 or Y2) depends only on Y2, and Y1 has a linear regression on Y2, then the sample regression of Y1 on Y2 based on complete units yields unbiased estimates of the
3.1 List some standard multivariate statistical analyses that are based on sample means, variances, and correlations.
2.13 Carry out a standard ANOVA for the following data, where three values have been deleted from a (5 × 5) Latin square (Snedecor and Cochran 1967, p. 313).
2.12 Carry out the computations leading to the results of Example 2.3.
2.11 Carry out the computations leading to the results of Example 2.2.
2.10 Show (2.22) and then (2.23) and (2.24).
2.9 Justify (2.17)–(2.20).
2.8 Carry out the computations leading to the results of Example 2.1.
2.7 Using the notation and results of Section 2.5.4, justify (2.16) and the method for calculating B and ???? that follows it.
2.6 Provide intermediate steps leading to (2.13), (2.14), and (2.15).
2.5 Prove that (2.12) follows from the definition of U−1.
2.4 Summarize the argument that Bartlett’s ANCOVA method leads to correct least squares estimates of missing values.
2.3 Outline the distributional results leading to (2.6) being distributed as F.
2.2 Prove that β̂ in (2.2) is (a) the least squares estimate of β, (b) the minimum variance unbiased estimate, and (c) the maximum likelihood estimate under normality. Which of these
2.1 Review the literature on missing values in ANOVA from Allan and Wishart (1930) through Dodge (1985).
C. Repeat parts (A) and (B) with (i) a = 2, b = 0 and (ii) a = 0 and b = 2.
B. Conduct a t-test comparing the means of Y1 for complete and incomplete units. Is there evidence from this test that the data are not (a) MCAR, (b) MAR, and (c) MNAR?
A. Display the marginal distributions of Y1 and Y2 for complete and incomplete units. (Note that in reality the marginal distribution of Y2 is not available for missing units.) Is this mechanism
1.6 Generate 100 triplets {(zi1, zi2, zi3), i = 1, …, 100} of independent standard normal (that is, mean 0, variance 1) deviates. From these triplets, create 100 trivariate normal observations
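A sketch of 1.6 and its parts above, assuming (the statement is truncated here) that the data are built as in Problem 5.1 with a latent ui = a(yi1 − 1) + b(yi2 − 5) + zi3 governing missingness of yi2, and that yi2 is set missing when ui < 0; part B's t-test is shown for the setting (a, b) = (2, 0) from part C. These completions are assumptions, not the book's exact wording.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    n, a, b = 100, 2, 0    # (a, b) = (2, 0) is setting (i) of part C

    z = rng.standard_normal((n, 3))
    y1 = 1 + z[:, 0]
    y2 = 5 + 2 * z[:, 0] + z[:, 1]
    u = a * (y1 - 1) + b * (y2 - 5) + z[:, 2]
    miss = u < 0           # assumed missingness rule for y2

    # Part A: marginal summaries of Y1 and Y2 for complete vs. incomplete
    # units (Y2 for incomplete units is only visible because this is a
    # simulation).
    for name, v in (("Y1", y1), ("Y2", y2)):
        print(f"{name}: complete mean={v[~miss].mean():.2f}, "
              f"incomplete mean={v[miss].mean():.2f}")

    # Part B: t-test comparing Y1 means for complete and incomplete units.
    # A significant difference is evidence against MCAR; by itself it
    # cannot separate MAR from MNAR.
    res = stats.ttest_ind(y1[~miss], y1[miss])
    print(f"t = {res.statistic:.2f}, p = {res.pvalue:.3f}")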
(d) Express the marginal distribution of yi in terms of the conditional distributions of yi given the various missingness patterns and their probabilities.
(c) Consider the simple situation where mij = 1 or mij = 0. When attention is focused only on the units that fully respond, the conditional distribution of yi given mi = (0, 0, …, 0) is being estimated,
(b) Nearly always it is assumed that M is fully observed. Describe a realistic case when it may make sense to regard part of M itself as missing. (Hint: Can you think of a situation where the meaning of
(a) Propose situations where two values of mij are not sufficient. (Hint:See Heitjan and Rubin (1990).)
1.5 Let Y = (yij) be the data matrix and let M = (mij) be the corresponding missingness indicator matrix, where mij = 1 indicates missing and mij = 0 indicates observed.
1.4 What impact does the occurrence of missing values have on (a) estimates and (b) tests and confidence intervals for the analyses in Problem 1.2? For example, are estimates consistent for
1.3 What assumptions about the missingness mechanism are implied by the statistical analyses used in Problem 1.2? Do these assumptions appear realistic?
1.2 List methods for handling missing values in an area of statistical application of interest to you, based on experience or relevant literature.
1.1 Find the monotone pattern for the data of Table 1.1 that involves minimal deletion of observed values. Can you think of better statistical criteria for deleting values than this one?
Let X = (X1, ..., Xn)^T and consider the linear model Xi = Σ_{j=1}^{s} a_{i,j} βj + σεi, where the εi are i.i.d. F, where F has mean 0 and variance 1. Here, the a_{i,j} are known, β = (β1, ..., βs)^T and σ are
Assume X1,...,Xn are i.i.d. according to a location scale model with distribution of the form F[(x − θ)/σ], where F is known, θ is a location parameter, and σ is a scale parameter. Suppose
Suppose (X1, Y1), ..., (Xn, Yn) are i.i.d. bivariate observations in the plane, and let ρ denote the correlation between X1 and Y1. Let ρ̂n be the sample correlation ρ̂n = Σ_i (Xi − X̄n)(Yi −
for testing equality of Poisson means λi based on the test statistic T, show how to construct a randomization test based on T. Examine the limiting behavior of the randomization distribution under
Under the setting of
Using Theorem 15.2.3, prove a result analogous to Theorem 15.2.5 with Tm,n replaced by T˜m,n defined in (15.19). Deduce that the two-sample permutation test is consistent in level for testing
In the two-sample problem of Example 15.2.6, suppose the underlying distributions are normal with common variance. For testing µ(PY) = µ(PZ) against µ(PY) > µ(PZ), compute the limiting power of
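The two preceding entries concern the two-sample permutation test; a generic Monte Carlo sketch follows, using the difference of sample means as the statistic. The statistic, B, and the normal example data are illustrative choices, not the Tm,n or T̃m,n of the text.

    import numpy as np

    def perm_test_pvalue(y, z, B=2000, rng=None):
        # Two-sample permutation test of mu(P_Y) = mu(P_Z) against
        # mu(P_Y) > mu(P_Z): difference of sample means as the statistic,
        # B - 1 random permutations plus the identity.
        rng = rng or np.random.default_rng(5)
        pooled = np.concatenate([y, z])
        m = len(y)
        t_obs = y.mean() - z.mean()
        count = 1                                  # identity permutation
        for _ in range(B - 1):
            perm = rng.permutation(pooled)
            if perm[:m].mean() - perm[m:].mean() >= t_obs:
                count += 1
        return count / B

    rng = np.random.default_rng(6)
    y = rng.normal(0.5, 1.0, size=30)    # mean shift, common variance
    z = rng.normal(0.0, 1.0, size=30)
    print("permutation p-value:", perm_test_pvalue(y, z))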
Provide the remaining details for the proof of Theorem 15.2.5.
Verify (15.15) and (15.16). Hint: Let S be the number of positive integers i ≤ m with Wi = 1, and condition on S.
In Theorem 15.2.4, show the conclusion may fail if ψP is not an odd function.
Suppose X1,...,Xn are i.i.d. according to a q.m.d. location model with finite variance. Show the ARE of the one-sample t-test with respect to the randomization t-test (based on sign changes) is 1
As an approximation to (15.9), let g1, ..., gB−1 be i.i.d. and uniform on G. Also, set gB to be the identity. Define R̃n,B(t) = (1/B) Σ_{i=1}^{B} I{Tn(giX) ≤ t}. Show, conditional on X, sup_t |R̃n,B(t)
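The stochastic approximation described in this entry can be sketched for the group of sign changes (one concrete choice of G); the one-sample t-type statistic and the value of B are illustrative.

    import numpy as np

    def approx_randomization_cdf(x, t, B=999, rng=None):
        # Approximate the randomization distribution by
        # R~_{n,B}(t) = (1/B) * sum_i I{T_n(g_i x) <= t}, with
        # g_1, ..., g_{B-1} drawn uniformly from the sign-change group and
        # g_B the identity.
        rng = rng or np.random.default_rng(7)
        def T(v):                       # one-sample t-type statistic
            return np.sqrt(len(v)) * v.mean() / v.std(ddof=1)
        vals = [T(rng.choice([-1.0, 1.0], size=len(x)) * x) for _ in range(B - 1)]
        vals.append(T(x))               # identity element
        return np.mean(np.array(vals) <= t)

    x = np.random.default_rng(8).standard_normal(25) + 0.3
    print(approx_randomization_cdf(x, t=1.0))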
With p̂ and p̃ defined in (15.5) and (15.7), respectively, show that p̂ − p̃ → 0 in probability.
(i) Suppose Y1, ..., YB are exchangeable real-valued random variables; that is, their joint distribution is invariant under permutations. Let q̃ be defined by q̃ = (1/B)[1 + Σ_{i=1}^{B−1} I{Yi ≥
With ˆp defined in (15.5), show that (15.6) holds.
with δ = δk → 0 as k → ∞. At what rate should δk → 0 as k → ∞ so that the limiting maximin power is strictly between α and 1?
Consider the setting of
Show why (14.77) is true.
Show that the expression (14.65) exceeds α if there exists a j for which aj > 0 and cj = 0. Also, show that (14.65) is an increasing function of |cj|.
What is the characteristic function of the limiting random variable W of Theorem 14.5.1? As a special case, show that the characteristic function of the limiting null distribution of the Cramér-von
Verify the claims made in Example 14.5.4.
Consider Wn with Tj(x) = √2 cos(πjx). Fix γj ≥ 0 with γj²
Show that the Anderson-Darling statistic (14.58) can be rewritten in the form (14.59).
Let F be a c.d.f. on (0, 1). If ∫_0^1 cos(πjx) dF(x) = 0 for all j = 1, 2, ..., then F must be the uniform distribution on (0, 1). Hint: Integrate by parts and use the fact the functions √2 sin(πjx)
Show that the Cramér-von Mises test statistic Cn given by (14.57) can be computed by Cn = 1/(12n) + Σ_{i=1}^{n} [X_(i) − (2i − 1)/(2n)]², where X_(1) ≤ ··· ≤ X_(n) denote the order statistics; see
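The computational form in this entry is easy to check numerically; a minimal sketch, assuming (as in the neighboring entries) that F0 is continuous so the probability-integral-transformed values are uniform under the null. The sample size and seed are arbitrary.

    import numpy as np

    def cramer_von_mises(u):
        # C_n = 1/(12 n) + sum_i [ U_(i) - (2 i - 1) / (2 n) ]^2,
        # where U_(1) <= ... <= U_(n) are the ordered F0-transformed values.
        u = np.sort(np.asarray(u, dtype=float))
        n = len(u)
        i = np.arange(1, n + 1)
        return 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

    u = np.random.default_rng(9).uniform(size=50)   # data already on the F0 scale
    print("C_n =", round(cramer_von_mises(u), 4))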
Show that the distribution of the Cramér-von Mises test statistic (14.57) under F0 is the same for all continuous distributions F0.
In Theorem 14.5.1, show that W has a continuous, strictly increasing distribution function on (0, ∞). Hint: Write W = ai Zi² + R for some i with ai > 0 and note that ai Zi² has a density.
Argue the validity of (14.51).