Question
For this stage of the study we will be comparing the gene activity levels in one healthy tissue sample to the gene activity levels in one diseased (lung cancer) tissue sample. Each sample will provide one activity reading for each of the 40 genes included in our hand-picked bundle of likely candidates for analysis. The readings are given in two columns, one for the healthy tissue sample (column A) and one for the diseased tissue sample (column B).
Question 1: Using GeoGebra (or a statistical software package of your choice), produce a side-by-side pair of boxplots, one for each of the two tissue samples being compared. These boxplots will show the overall distribution of gene activity levels within each tissue. Please attach this boxplot as the first figure of your report.
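A boxplot is drawn from a five-number summary (minimum, first quartile, median, third quartile, maximum), so comparing those five values for the two columns is one way to check a plot numerically. The sketch below is only illustrative: the function name `five_number_summary` and the sample readings are hypothetical placeholders, not the assignment's data, and the "inclusive" quartile convention used here is an assumption that may differ slightly from the one your plotting software applies.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of readings.

    Quartiles use the 'inclusive' method of statistics.quantiles;
    other software may use a different quartile convention.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))

# Placeholder readings for illustration only (NOT the assignment's data).
healthy = [1.0, 2.0, 3.0, 4.0, 5.0]
diseased = [1.1, 2.0, 2.9, 4.2, 5.0]

print("healthy :", five_number_summary(healthy))
print("diseased:", five_number_summary(diseased))
```

Two samples with nearly identical summaries will produce nearly identical boxplots, even though the summaries say nothing about how individual genes pair up between the columns.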
Question 2: The overall appearance of the two boxplots will be very similar. What does this suggest about the two tissue samples? Comment on what it means for the boxplots to be similar, and discuss whether you believe this makes it impossible for any individual genes to be showing differences in activity levels between the two samples.
Question 3: Using GeoGebra, produce a scatterplot comparing the values from healthy tissue (on the horizontal axis) to the values from diseased tissue (on the vertical axis). Include a linear regression line and provide the equation for the regression model. Please attach this scatterplot with its regression line to your report as the second figure.
Question 4: (2 marks) The regression model will be an equation of the form: y = bx + a. The value of b is close to 1. Discuss what this says about the two tissue samples being compared.
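For readers working outside GeoGebra, the slope b and intercept a of a model of the form y = bx + a can be obtained with an ordinary least-squares fit. The following is a minimal sketch under stated assumptions: `fit_line` is a hypothetical helper name, and the readings are placeholders rather than the assignment's data.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = b*x + a; returns (b, a)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return b, a

# Placeholder readings for illustration only (NOT the assignment's data).
healthy = [1.0, 2.0, 3.0, 4.0, 5.0]    # x: healthy tissue
diseased = [1.1, 2.0, 2.9, 4.2, 5.0]   # y: diseased tissue

b, a = fit_line(healthy, diseased)
print(f"y = {b:.3f}x + {a:.3f}")
```

A slope near 1 means that, on average, a one-unit change in the healthy reading corresponds to about a one-unit change in the diseased reading.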
Question 5: As a preliminary filter to further narrow down our list of genes to pursue, let's identify the three genes with the largest regression residuals. Include an image (either a second copy of your scatterplot or a "residual plot", see instructions) that makes it easy to identify the three genes with the largest residuals, and circle the data points corresponding to these three values. Checking back with the column of gene names in the provided Excel spreadsheet, determine the identifiers for these genes and add these identifiers as labels adjacent to the circled points on your image. Include this image as the third figure in your report.
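A residual is the observed diseased-tissue value minus the value the regression line predicts, so ranking genes by residual can be sketched as below. This is an illustration only: `largest_residuals` is a hypothetical helper, the gene names, readings, and coefficients are placeholders, and ranking by absolute value (so that large negative residuals also count as "largest") is an assumption about what the question intends.

```python
def largest_residuals(names, x, y, b, a, k=3):
    """Return the k (name, residual) pairs with the largest |residual|.

    residual = observed y - predicted y, where predicted y = b*x + a.
    """
    residuals = [
        (name, yi - (b * xi + a))
        for name, xi, yi in zip(names, x, y)
    ]
    # Sort by magnitude so that strong decreases rank alongside strong increases.
    return sorted(residuals, key=lambda pair: abs(pair[1]), reverse=True)[:k]

# Placeholder values for illustration only (NOT the assignment's data).
names = ["geneA", "geneB", "geneC", "geneD", "geneE"]
healthy = [1.0, 2.0, 3.0, 4.0, 5.0]
diseased = [1.1, 2.0, 4.5, 4.2, 3.0]
b, a = 1.0, 0.0  # hypothetical regression coefficients

for name, r in largest_residuals(names, healthy, diseased, b, a):
    print(f"{name}: residual = {r:+.2f}")
```

The sign of each residual indicates whether the diseased reading sits above or below the regression line for that gene.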
Question 6: Produce a summary statement for each of the three genes you identified in Question 5. For these statements, make use of the following template: [Gene Name] is a candidate. This gene had an activity level of [Value] in normal tissue and an activity level of [Value] in diseased tissue, suggesting an [increase)