Questions and Answers of Categorical Data Analysis
3.3 Repeat Problem 3.1 using the score test.
3.2 Suppose that in the general population the proportion of obese men who have heart disease is 10%. In a random sample of 50 obese men who were placed on a low-fat diet, two men were found to have
3.1 In a certain school the proportion of students who are proficient in mathematics is 40%. A new teaching approach is introduced in the school, and after a year it is found that in a random sample
2.12 Provide a substantive illustration of a situation that would require the use of each of the five probability distributions described in this chapter.
2.11 Telephone calls are received by a college switchboard at the rate of four calls every 3 minutes. What is the probability of obtaining five calls in a 3-minute interval?
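A hand calculation for Problem 2.11 can be checked by evaluating the Poisson probability mass function directly. The sketch below assumes the number of calls in a 3-minute interval follows a Poisson distribution with mean 4 (the stated rate); the helper name `poisson_pmf` is illustrative and does not come from the textbook.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Assumed model: calls per 3-minute interval ~ Poisson(4).
p_five_calls = poisson_pmf(5, 4.0)
print(round(p_five_calls, 4))
```

Under that assumption the probability comes out at roughly 0.156.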
2.10 On average, 10 people enter a particular bookstore every 5 minutes. a. What is the probability that only four people enter the bookstore in a 5-minute interval? b. What is the probability that
2.9 An owner of a boutique store knows that 45% of the customers who enter her store will make purchases that total less than $200, 15% of the customers will make purchases that total more than $200,
2.8 The probability that an entering college freshman will obtain his or her degree in four years is 0.4. What is the probability that at least one out of five admitted freshmen will graduate in four
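The statement of Problem 2.8 is cut off here, so the following is only a sketch under two assumptions: that the question asks for the probability that at least one of five freshmen graduates in four years, and that the five students graduate independently of one another. The function name `prob_at_least_one` is hypothetical.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """P(at least one success in n independent trials, each with success probability p)."""
    # Complement rule: 1 - P(no successes).
    return 1.0 - (1.0 - p) ** n


# Assumed reading of Problem 2.8: p = 0.4, n = 5.
answer = prob_at_least_one(0.4, 5)
print(round(answer, 5))
```

Working through the complement (no student graduates) avoids summing five separate binomial terms.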
2.7 For a multiple-choice test item with four response options, the probability of obtaining the correct answer by simply guessing is 0.25. If a student simply guessed on all 20 items in a
2.6 Researchers at the Food Institute have determined that 67% of women tend to crave sweets over other alternatives. If 10 women are randomly sampled from across the country, what is the probability
2.5 Suppose that the principal of Learn More School from Problem 2.2 is only able to choose one second-grade student to represent the school in a poetry contest. If she randomly selects a student,
2.4 The CEO of a toy company would like to hire a vice president of sales and marketing. Only 2 of the 5 qualified applicants are female, and the CEO would really like to hire a female VP if at all
2.3 Suppose there are 48 Republican senators and 52 Democrat senators in the United States Senate and the president of the United States must appoint a special committee of 6 senators to study the
2.2 At Learn More School, 15 of the 20 students in second grade are proficient in reading. a. If the principal of the school were to randomly select two second-grade students to represent the school
2.1 A divorce lawyer must choose 5 out of 25 people to sit on the jury that is to help decide how much alimony should be paid to his client, the ex-husband of a wealthy business woman. As luck would
1.8 Provide a substantive research question that would need to be addressed using procedures for categorical data analysis. Be sure to specify how the dependent and independent variables would be
1.7 Determine whether procedures for analyzing categorical data are needed to address each of the following research questions. Indicate what analytic procedure (e.g., ANOVA, regression) you would
1.6 Determine whether procedures for analyzing categorical data are needed to address each of the following research questions. Provide a rationale for each of your answers by identifying the
1.5 Determine whether procedures for analyzing categorical data are needed to address each of the following research questions. Provide a rationale for each of your answers by identifying the
1.4 For each of the following research scenarios, identify the dependent and independent variables (or indicate if not applicable) as well as the scale of measurement used for each variable. Explain
1.3 For each of the following research scenarios, identify the dependent and independent variables (or indicate if not applicable) as well as the scale of measurement used for each variable. Explain
1.2 Indicate the scale of measurement used for each of the following variables and explain your answer by describing the probable scale: a. Self-efficacy, as measured by a 10-item scale. b. Race, as
1.1 Indicate the scale of measurement used for each of the following variables and explain your answer by describing the probable scale: a. Sense of belongingness, as measured by a 20-item scale. b.
10. Compute the HPD confidence interval in Example B.9 for a general K, and compare its length to the unadjusted and Bonferroni adjusted intervals.
9. In an errors-in-variables simple regression model, the least squares estimate of the regression slope (β) is biased toward 0, an example of attenuation. Specifically, if the true regression
8. Consider a linear regression having true model of the form Yi = α + βxi + εi, where the εi are i.i.d. with mean 0 and variance σ². Suppose you have a sample of size n, use least-squares
7. Let X be a random variable with mean μ and variance σ². You want to estimate μ under SEL, and propose an estimate of the form (1 − b)X. (a) Find b*, the b that minimizes the MSE. (Hint: Use
6. Find the frequentist risk under squared error loss (i.e., the MSE) for the estimator d_{a,c}(x) = a + cX̄. What true values of θ favor choosing a small value of c (high degree of shrinkage)? Does
5. For the exponential distribution based on a sample of size n, find the minimax rule for squared error loss.
4. For the binomial distribution with n trials, find the minimax rule for squared error loss and normalized squared error loss, L(θ, a) = (θ − a)² / [θ(1 − θ)]. (Hint: The rules are Bayes rules or
3. Consider the following loss function: l(θ, a) = p|θ − a| if θ > a, and l(θ, a) = (1 − p)|θ − a| if θ ≤ a
2. Suppose that the posterior distribution of θ, p(θ|x), is discrete with support points {θ1, θ2,...}. Show that the Bayes rule under 0–1 loss (B.2) is the posterior mode.
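A small numerical check can make the claim in this problem concrete before attempting the proof. The sketch assumes the usual 0–1 loss (loss 0 when the action equals θ, 1 otherwise), since expression (B.2) is not reproduced on this page, and it uses a made-up four-point posterior; none of the numbers come from the textbook.

```python
# Hypothetical discrete posterior p(theta | x) on four support points.
posterior = {0.1: 0.15, 0.3: 0.40, 0.5: 0.25, 0.9: 0.20}


def expected_zero_one_loss(action, post):
    """Posterior expected loss when the loss is 0 if theta == action and 1 otherwise."""
    return sum(prob for theta, prob in post.items() if theta != action)


# Bayes action: the support point with the smallest posterior expected loss.
bayes_action = min(posterior, key=lambda a: expected_zero_one_loss(a, posterior))
# Posterior mode: the support point with the largest posterior probability.
posterior_mode = max(posterior, key=posterior.get)
print(bayes_action, posterior_mode)
```

The expected loss of an action a is 1 − p(a|x), so minimizing it is the same as maximizing p(a|x), which is the step the proof needs.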
1. Show that the Bayes estimate of a parameter θ under weighted squared error loss, l(θ, a) = w(θ)(θ − a)², is given by d_π(x) = E[θw(θ)|x] / E[w(θ)|x].
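The ratio formula in this problem can be sanity-checked numerically. The sketch below assumes a made-up discrete posterior and an illustrative weight w(θ) = 1/[θ(1 − θ)]; both are assumptions chosen for the example. It compares the closed-form ratio with a brute-force grid search over actions.

```python
# Hypothetical discrete posterior: support points and their probabilities.
thetas = [0.1, 0.3, 0.5, 0.9]
probs = [0.15, 0.40, 0.25, 0.20]


def w(theta):
    """Illustrative weight function (an assumption, not taken from the problem)."""
    return 1.0 / (theta * (1.0 - theta))


def expected_loss(a):
    """Posterior expected weighted squared error loss of action a."""
    return sum(p * w(t) * (t - a) ** 2 for t, p in zip(thetas, probs))


# Closed form from the problem statement: E[theta * w(theta) | x] / E[w(theta) | x].
closed_form = sum(p * w(t) * t for t, p in zip(thetas, probs)) / sum(
    p * w(t) for t, p in zip(thetas, probs)
)

# Brute-force minimizer over a fine grid of candidate actions in [0, 1].
grid = [i / 10000 for i in range(10001)]
grid_min = min(grid, key=expected_loss)
print(closed_form, grid_min)
```

Because the expected loss is quadratic in a, the grid minimizer should agree with the closed form up to the grid spacing.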
15. Table 7.7 presents observed (Yi) and expected (Ei) cases of lip cancer in 56 counties in Scotland, which have been previously analyzed by Clayton and Kaldor (1987) and Breslow and Clayton (1993).
14. The data in Table 7.6 were originally analyzed by Tolbert et al. (2000), and record the horizontal (s1) and vertical (s2) coordinates (in meters × 10⁴) of 10 ozone monitoring stations in the
12. Suppose we have a collection of binary responses yi ∈ {0, 1}, i = 1,...,n, and associated k-dimensional predictor variables xi. Define the latent variables y*_i as y*_i = x_i^T β + ε_i, i =
11. Consider again the model of Example 7.2. Suppose that instead of a straight-line growth curve, we wish to investigate the exponential growth
10. Table 7.4 gives a dataset of male mortality experience originally presented and analyzed by Broffitt (1988). The rates are for one-year intervals, ages 35 to 64 inclusive (i.e., bracket i
9. Consider the estimation of human mortality rates between ages x and x + k, where x ≥ 30. Data available from a mortality study of a group of independent lives includes di, the number of deaths
8. (Devroye, 1986, p. 38) Suppose X is a random variable having cdf F, and Y is a truncated version of this random variable with support restricted to the interval [a, b]. Then Y has cdf G(y)
7. Prove that if the target parameters (θ1,...,θk) are independent and stochastically ordered, then for all j, P̃j(γ) = P̂j (and therefore P̃j(γ) does not depend on γ). (Hint: Use the
6. Table 7.1 shows that the AR(1) model “calms” longitudinal variation in the ranks. Discuss why this calming can be advantageous and why it might be disadvantageous.
5. Identify applications in which SEL in estimating ranks is the appropriate metric, and others in which (above γ)/(below γ) loss is appropriate. For the latter, discuss how you would select γ.
4. Let G_k^(0.5)(t) be the posterior median of Gk(t). (a) Show that for each t, G_k^(0.5)(t) minimizes expected, integrated absolute loss, E[∫ |G_k^est(t) − Gk(t)| dt]. (b) Outline an algorithm to
3. Compute and compare the mean and variance induced by edfs formed from the coordinate-specific MLEs, the standard Bayes estimates, and Ḡk for the normal/normal, Poisson/gamma, and beta/binomial
2. Let G*_k be a discrete distribution with mass 1/k at mass points (u1
1. Let Gk(t) be defined as in Subsection 7.1.2. Show that the posterior mode of Gk(t), denoted G̃k(t), has equal mass at exactly k mass points that are derived from an expression which for each t is
5. Consider again the simple beta-binomial setting of Example 6.5, where we have a binomial response and the preliminary data from Study A given in Table 6.1. We again seek to base posterior
4. Consider a clinical trial of a cholesterol-lowering drug, where we wish to use equal sample sizes in the experimental and control arms. Assume that LDL values (Y) in the population are Y ∼
3. Consider again the clinical trial design setting of O’Hagan and Stevens (2001), as described in Subsection 6.2.2. (a) Confirm that expression (6.11) is the result of the analysis objective P(β >
2. Using equations (6.2) and (6.4), verify that ñ ≥ nθ. Will this relation hold for all G? (Hint: see equation (6.3).)
1. Actually carry out the analysis in Example 6.4, where the prior g(θ) in (6.6) is (a) the Unif(0, 1) prior used in the example, g(θ) = 1. (b) a mixture of two Beta priors, g(θ) = .5 · 3θ² + .5 ·
17. Show that if the sampling variance is unknown and has an inverse gamma prior distribution, then in the limit as information in this inverse gamma goes to 0 and as τ² → ∞, Bayesian intervals
16. Study the frequentist and Bayesian coverage probabilities for Bayesian intervals based on the Gaussian/Gaussian model with a N(μ, τ²) prior and a sampling distribution, conditional on θ, that
15. In the interval estimation setting involving a beta prior and a binomial likelihood, (a) Under what condition on the prior will the HPD interval for θ be one-sided when X = n? Find this
14. In the beta/binomial point estimation setting of Subsection 5.6.2, (a) For μ = .5 and general n and M, find the region where the Bayes rule has smaller risk than the MLE.
13. In the bivariate Gaussian example of Subsection 5.5.2, show that if Σ = I, then B has the form (5.39).
12. Do a simulation comparison of the MLE and three EB approaches: Robbins, the Poisson/gamma, and the Gaussian/Gaussian model for estimating the rate parameter in a Poisson distribution. Note that
11. In the Gaussian/Gaussian EM example,
10. Consider again the data in Table 3.1, also reproduced here for convenience (Table 5.7). These are the numbers of pump failures, Yi, observed in ti thousands of hours for k = 10 different systems
9. Consider the PEB model Yi | θi ∼ Poisson(θi ti) independently and θi ∼ G(a, b) i.i.d., i = 1,..., k. (a) Find the marginal distribution of Y = (Y1,...,Yk)^T. (b) Use the method of moments to obtain closed form
8. Consider again Fisher’s sleep data: 1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4. Suppose these k = 10 observations arose from the Gaussian/Gaussian PEB model, Yi | θi ∼ N(θi, σ²) independently, i =
7. Prove result (5.26) on the maximum coordinate-specific loss for the James-Stein estimate.
6. Show how to evaluate (5.24) for a general θ. (Hint: A noncentral chi-square can be represented as a Poisson mixture of central chi-squares with mixing on the degrees of freedom.)
5. Prove result (5.24) using the completeness of the noncentral chi-square distribution.
4. In the Gaussian/Gaussian model (5.5), if σ² = 1 and μ = 0, the Yis are marginally independent with distribution Yi | τ ∼ N(0, 1 + τ²) ≡ N(0, 1/B), i.i.d. (a) Find the marginal MLE of B, B MLE,
3. Consider the gamma/inverse gamma model, i.e., Y1,...,Yk ∼ G(α, θi) independently, α a known tuning constant, and θ1,...,θk ∼ IG(η, 1) i.i.d. (a) Find the marginal density of yi, m(yi|η). (b) Suppose α
2. Under the compound sampling model (5.27) with fi = f for all i, show that (5.28) holds (i.e., that the Yis are marginally i.i.d.). What is the computational significance of this result for
1. (Berger, 1985, p. 298) Suppose Yi ∼ N(θi, 1) independently, i = 1,...,k, and that the θi are i.i.d. from a common prior G. Define the marginal density m(yi) in the usual way as m(yi) = N(yi | θi, 1)
16. Consider again the stack loss data originally presented in Example 2.16. Suppose we wish to compare the assumptions of normal, t4, and DE errors for these data. (a) Repeat the outlier analysis in
15. In the previous problem, implement both models in WinBUGS, and use DIC instead of Bayes factors to choose between them. Is your preference between the two models materially altered by this
14. Consider the dataset of Williams (1959), displayed in Table 4.7. For n = 42 specimens of radiata pine, the maximum compressive strength parallel to the grain yi was measured, along with the
13. Consider again the data in Table 3.3. Define model 1 to be the three variable model given in Example 3.7, and model 2 to be the reduced model having m1 = 1 (i.e., the standard logistic regression
12. In the previous problem, suppose we wish to evaluate the model using the model check pD given in equation (2.33) with an independent validation data sample z. Give a computational formula we
11. Suppose we have a convergent MCMC algorithm for drawing samples from p(θ|y) ∝ f(y|θ)π(θ). We wish to locate potential outliers by computing the conditional predictive ordinate f(yi|y(i))
10. Refer again to the cross-protocol data and model in Example 2.12. (a) Create WinBUGS code that will estimate cross validation (“leave one out”) residuals and CPO values via importance
9. Suppose we are estimating a set of residuals ri using a Monte Carlo approach, as described in Section 4.3. Due to a small sample size n, we are concerned that the approximation E(yi|y(i)) =
8. For the interval null prior partitioning setting of Subsection 4.2.2, derive an expression for the set of priors Hc that correspond to rejecting H0, similar to expression (4.12) for the point null
7. For the point null prior partitioning setting described in Subsection 4.2.2, show that requiring G ∈ Hc as given in (4.12) is equivalent to requiring BF ≤ [p/(1 − p)] · [(1 − π)/π], where BF is
6. Consider the data displayed in Table 4.6, originally collected by Treloar (1974) and reproduced in Bates and Watts (1988). These data record the “velocity” yi of an enzymatic reaction (in
5. Consider www.biostat.umn.edu/~brad/data/copresence_data.txt, a data set for which a few records are shown in Table 4.5. Here, Y is a binary variable indicating co-presence of two species in a
4. Spiegelhalter et al. (1995b) analyze the flour beetle mortality data in Table 3.3 using WinBUGS. These authors use only the usual, two-parameter parametrization for pi ≡ P(death|wi), but compare
3. Consider again the binary dugong modeling of Example 4.4. Suppose we wished to obtain side-by-side boxplots of the posteriors of the effect of log-age, β1, across all three link functions (logit,
2. Refer to the data in Table 2.3, also available on the web in WinBUGS format at http://www.biostat.umn.edu/~brad/data.html. These pharmacokinetic (PK) data of Wakefield et al. (1994) were initially
1. Steensma et al. (2005) presented the data in Table 4.4, from a randomized controlled trial comparing two dosing schedules for the drug erythropoietin. Serum hemoglobin (HGB, in g/dL) was recorded
3. Which model (or models) should I ultimately choose for the final presentation of my results?
2. How can I tell if my model is providing adequate fit to the data?
1. How can I tell if any of the assumptions I have made (e.g., the specific choice of prior distribution) is having an undue impact on my results?
21. To further study the relationship between identifiability and MCMC convergence, consider again the two-parameter likelihood model Y ∼ N(θ1 + θ2, 1), with prior distributions θ1 ∼ N(a1,
20. Consider the balanced, additive, one-way ANOVA model, Yij = μ + αi + εij, i = 1,...,I, j = 1,...,J, (3.34) where the εij are i.i.d. N(0, σ_e²), μ ∈ ℝ, αi ∈ ℝ, and σ_e² > 0. We adopt a prior
19. Consider the following two complete conditional distributions, originally analyzed by Casella and George (1992): f(x|y) ∝ y e^(−yx), 0
18. Repeat the preceding investigation for a multivariate Nk(0, Ik) target density, p(x) ∝ exp(−xᵀx/2), where x = (x1,...,xk). Now we wish to compare these three samplers:
17. Consider three approaches for sampling from a N(0, 1) target density, p(x) ∝ exp(−x²/2): a standard Metropolis algorithm using a Gaussian proposal density, x* | x^(t−1) ∼ N(x^(t−1), σ²),
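The first sampler named in this problem, a Metropolis algorithm with a Gaussian random-walk proposal, can be sketched in a few lines. This is only an illustration under assumptions: the proposal standard deviation `sigma=2.4`, the chain length, the starting value, and the seed are arbitrary choices and are not specified in the problem, and the function name is hypothetical. The other approaches are cut off on this page and are not attempted.

```python
import math
import random


def metropolis_std_normal(n_iter, sigma, x0=0.0, seed=0):
    """Random-walk Metropolis for the target p(x) proportional to exp(-x**2 / 2).

    sigma is the proposal standard deviation (a tuning choice assumed here).
    Returns the list of draws and the observed acceptance rate.
    """
    rng = random.Random(seed)
    x = x0
    draws = []
    n_accept = 0
    for _ in range(n_iter):
        # Propose x* ~ N(x, sigma^2); the proposal is symmetric.
        x_star = rng.gauss(x, sigma)
        # Log of the target ratio p(x*) / p(x).
        log_ratio = 0.5 * (x * x - x_star * x_star)
        # Accept with probability min(1, p(x*) / p(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = x_star
            n_accept += 1
        draws.append(x)
    return draws, n_accept / n_iter


draws, acc_rate = metropolis_std_normal(n_iter=20000, sigma=2.4)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"acceptance rate {acc_rate:.2f}, mean {mean:.3f}, variance {var:.3f}")
```

Rerunning with different values of `sigma` shows the trade-off the problem is probing: very small steps are accepted often but move slowly, while very large steps are rejected often.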
16. A random variable Z defined on (0, ∞) is said to have a D-distribution with parameters δ, β > 0 and k ∈ {0, 1, 2,...} if its density function is defined (up to a constant of
15. Show that Adler’s overrelaxation method (3.21) leaves the desired distribution invariant. That is, show that if θ_i^(g−1) is conditionally distributed as N(μi, σi²), then so is θ_i^(g).
14. Consider the generalized Hastings algorithm that uses the following candidate density: q(v, u) = p(v_i | u_{j≠i}) for v_{j≠i} = u_{j≠i}, and 0 otherwise. That is, the algorithm chooses (randomly or
13. Returning again to the flour beetle mortality data and model of Example 3.7, note that the decision to use Σ̃ = 2Σ̂ in equation (3.17) was rather arbitrary. That is, univariate Metropolis
12. Write an R program to reanalyze the flour beetle mortality data in Table 3.3, replacing the multivariate Metropolis algorithm used in Example 3.7 with (a) a Hastings algorithm employing
11. In the previous problem, replace the third-stage prior given above with b1 ∼ G(c1, d1), b2 ∼ G(c2, d2), b1 and b2 independent, thus destroying the conjugacy for these two complete
10. In the previous problem, assume k is unknown, and adopt the following prior: k ∼ Discrete Uniform(1,...,n), independent of θ and λ. Add k into the sampling chain, and obtain a marginal