
Question


A sample of colon tissues was collected, 40 of which were tumor tissues and 22 non-tumor tissues. Tissues were analyzed using an Affymetrix oligonucleotide array, and the expression of a particular gene was measured. A file with these data called "gene.txt" is posted on Canvas. To get the dataset into a data frame called gene in R's workspace, we ensure that the file "gene.txt" is in R's current working directory and then type gene=read.table("gene.txt"). Researchers are interested in studying the association between this gene and tumor status.

(a) (5 points) We wish to analyze these data with the two independent-samples model, where our goal is to make inference about whether the distribution of the expression of this gene is associated with tumor status, which has levels {tumor, healthy}. To do this, what must we assume about these measurements? If these assumptions are true, then what is the probability distribution of the random variable for which the first healthy tissue's gene expression measurement of 202.90000 is assumed to be a realization? Specify as much information about this distribution as possible.

(b) (3 points) One of the assumptions that the two-independent-samples t-test requires is that the distribution of the response has the same unknown standard deviation for both levels of the categorical explanatory variable. Is this a reasonable assumption for these data? In your response, please include either graphical or numerical support.

(c) (4 points) Suppose that the assumptions stated in your solution to part 1a are true. Is there statistical evidence at the 1% significance level that the distribution of the expression of this gene is associated with the tissue type? Perform a two-independent-samples t-test.

(d) (3 points) Repeat part 1c with the response changed to the natural logarithm of the expression of this gene. After this transformation, is it reasonable to assume that the distribution of the response has the same standard deviation for both levels of the categorical explanatory variable? Explain.
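
The workflow that parts (b) through (d) ask for could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the layout of "gene.txt" is not shown in the question, so the column names expression and status are hypothetical, and the data frame below is simulated stand-in data rather than the posted measurements. With the real file, the simulated block would be replaced by the read.table call from the question (possibly with header=TRUE, depending on the file).

```r
# Simulated stand-in data: NOT the values in gene.txt.
# The column names `expression` and `status` are assumptions made for
# illustration; the real file's layout is not shown in the question.
set.seed(1)
gene <- data.frame(
  expression = c(rlnorm(40, meanlog = 5.5, sdlog = 0.8),
                 rlnorm(22, meanlog = 6.0, sdlog = 0.8)),
  status = factor(rep(c("tumor", "healthy"), times = c(40, 22)))
)

# (b) Compare the spread of the two groups numerically and graphically.
sd_raw <- tapply(gene$expression, gene$status, sd)
boxplot(expression ~ status, data = gene)

# (c) Pooled (equal-variance) two-sample t-test, judged at the 1% level.
test_raw <- t.test(expression ~ status, data = gene, var.equal = TRUE)
reject_raw <- test_raw$p.value < 0.01

# (d) Same comparison and test after a natural-log transform of the response.
gene$log_expression <- log(gene$expression)
sd_log <- tapply(gene$log_expression, gene$status, sd)
test_log <- t.test(log_expression ~ status, data = gene, var.equal = TRUE)
reject_log <- test_log$p.value < 0.01
```

Setting var.equal = TRUE requests the pooled-variance test that the question describes; R's default is the Welch test, which does not assume equal standard deviations. With 40 and 22 observations, the pooled test has 40 + 22 - 2 = 60 degrees of freedom. Comparing sd_raw with sd_log, or the boxplots before and after the transform, is one way to support the answers to (b) and (d).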
