Question
Complete the following questions and upload a PDF of your answers to Gradescope. Please be sure to describe your approach thoroughly, as partial credit may be awarded even if the final answer is incorrect.

1. In addition to the proteins from the SARS-CoV-2 virus, dataset MSV000085507 from class contains many spectra matching to human proteins, thus informing how a COVID infection might alter protein expression within an individual. One common type of protein in blood is hemoglobin, which is responsible for carrying oxygen throughout the body. Given the following hemoglobin protein, P69905:

        10         20         30         40         50
MVLSPADKTN VKAAWGKVGA HAGEYGAEAL ERMFLSFPTT KTYFPHFDLS
        60         70         80         90        100
HGSAQVKGHG KKVADALTNA VAHVDDMPNA LSALSDLHAH KLRVDPVNFK
       110        120        130        140
LLSHCLLVTL AAHLPAEFTP AVHASLDKFL ASVSTVLTSK YR

a. What are all the fully-tryptic peptides from P69905? (3 points)

b. What percentage of the protein sequence between amino acid 1 and amino acid 32 is covered by fully-tryptic peptides, based on the amino acid (AA) coordinates of the peptides identified in the COVID dataset from class? (1 point)

c. Consider the protein region between amino acid 1 and amino acid 12. Is there a semi-tryptic peptide with zero missed cleavages, found in the set of identifications from before, that is contained in that region? If so, what is it? (0.5 points) What is a peptide found in the set of identifications from before that has one missed cleavage and contains all or a portion of this region? (0.5 points)

d. If peptide VDPVNFKLLSH (at positions 94-104 in P69905) is matched to a spectrum of m/z 1270 and charge 1, what is the unit mass error? (1 point)

e. A protein subsequence FLASVSTVLTSKYR (positions 129-142) may be covered by various peptides with different start/end positions. Given the set of identifications from before with sequences overlapping this region, which of the identifications seems to be the best evidence for coverage of sequence FLASVSTVLTSKYR? Be sure to describe your reasoning. (1 point)

2. Given spectrum S, from the NCI60 cell lines, the identification for S was found to be KSM+15.995YEEEINETR. However, NEQDAYAINSYTR and FYKNEGGTWSVEK also have parent masses that are within 0.1 Da of the parent mass of S, and could also potentially match to S.

a. What is the quality of the match between each peptide and S? Is there a plausible alternative for the sequence of the peptide that can be found by inspecting the three different sequences as potential matches to S? (3 points)

b. Slides 19 to 23 in the slide deck for lecture 2 (02 - CSE190 overview part 2.pdf in Canvas) show how to view knowledge base (MassIVE-KB) reference spectra for any spectrum reported as identified in an online results view. Using the same steps, evaluate each of the 3 matches against the reference spectrum for each peptide in MassIVE-KB. For each sequence, how does the match to S compare with the match to the reference spectrum in MassIVE-KB? Describe each match both qualitatively and quantitatively (e.g., using USI cosines). (2 points)
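The in-silico digestion and mass arithmetic behind question 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference method: trypsin is assumed to cut after every K or R with no proline exception, the function names `tryptic_peptides` and `theoretical_mz` are hypothetical, and the residue masses are standard monoisotopic values rounded to five decimals. How the course defines "unit mass error" is not given in the question, so the sketch stops at the theoretical m/z and a plain difference from the observed value.

```python
import re

# P69905 sequence as printed in the question (142 residues).
P69905 = (
    "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLS"
    "HGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFK"
    "LLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR"
)

# Standard monoisotopic residue masses in Da (rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of one charge-carrying proton


def tryptic_peptides(seq, missed_cleavages=0):
    """Return (start, end, peptide) tuples using 1-based inclusive coordinates.

    Assumption: cleave after every K or R, with no proline exception.
    """
    ends = [m.end() for m in re.finditer(r"[KR]", seq)]
    if not ends or ends[-1] != len(seq):
        ends.append(len(seq))          # keep a C-terminal peptide without K/R
    starts = [0] + ends[:-1]
    peptides = []
    for i, start in enumerate(starts):
        last = min(i + missed_cleavages, len(ends) - 1)
        for j in range(i, last + 1):
            peptides.append((start + 1, ends[j], seq[start:ends[j]]))
    return peptides


def theoretical_mz(peptide, charge=1):
    """Monoisotopic m/z of an unmodified peptide at the given charge."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for start, end, pep in tryptic_peptides(P69905):
        print(f"{start:>3}-{end:<3} {pep}")
    observed = 1270.0  # m/z from question 1d, charge 1
    print(observed - theoretical_mz("VDPVNFKLLSH", charge=1))
```

Calling `tryptic_peptides(P69905, missed_cleavages=1)` additionally lists peptides that span one uncut K/R site, which is the kind of candidate question 1c asks about. Coverage for question 1b still depends on which of these peptides actually appear in the class identifications, which the code cannot know.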