Question
This assignment focuses on arrays and file/text processing. Turn in a file named DNA.java. You will also need the two input files dna.txt and ecoli.txt from below. Save these files in the same folder as your program.

The assignment involves processing data from genome files. Your program should work with the two given input files. If you are curious (this is not required), the National Center for Biotechnology Information publishes many other bacteria genome files. The last page tells you how to use your program to process other published genome files.

Background Information About DNA:

Note: This section explains some information from the field of biology that is related to this assignment. It is for your information only; you do not need to fully understand it to complete the assignment.

Deoxyribonucleic acid (DNA) is a complex biochemical macromolecule that carries genetic information for cellular lifeforms and some viruses. DNA is also the mechanism through which genetic information from parents is passed on during reproduction. DNA consists of long chains of chemical compounds called nucleotides. Four nucleotides are present in DNA: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). DNA has a double-helix structure (see diagram below) containing complementary chains of these four nucleotides connected by hydrogen bonds.

Certain regions of the DNA are called genes. Most genes encode instructions for building proteins (they're called "protein-coding" genes). These proteins are responsible for carrying out most of the life processes of the organism.

Nucleotides in a gene are organized into codons. Codons are groups of three nucleotides and are written as the first letters of their nucleotides (e.g., TAC or GGA).
Each codon uniquely encodes a single amino acid, a building block of proteins.

The process of building proteins from DNA has two major phases called transcription and translation, in which a gene is replicated into an intermediate form called mRNA, which is then processed by a structure called a ribosome to build the chain of amino acids encoded by the codons of the gene.

[Figure: DNA translation.]
[Figure: The chemical structure of DNA.]

The sequences of DNA that encode proteins occur between a start codon (which we will assume to be ATG) and a stop codon (which is any of TAA, TAG, or TGA). Not all regions of DNA are genes; large portions that do not lie between a valid start and stop codon are called intergenic DNA and have other (possibly unknown) functions. Computational biologists examine large DNA data files to find patterns and important information, such as which regions are genes. Sometimes they are interested in the percentages of mass accounted for by each of the four nucleotide types. Often high percentages of Cytosine (C) and Guanine (G) are indicators of important genetic data.

For more information, visit the Wikipedia page about DNA: http://en.wikipedia.org/wiki/DNA

In this assignment you read an input file containing named sequences of nucleotides and produce information about them. For each nucleotide sequence, your program counts the occurrences of each of the four nucleotides (A, C, G, and T). The program also computes the mass percentage occupied by each nucleotide type, rounded to one digit past the decimal point. Next the program reports the codons (trios of nucleotides) present in each sequence and predicts whether or not the sequence is a protein-coding gene.
For us, a protein-coding gene is a string that matches all of the following constraints*:

- begins with a valid start codon (ATG)
- ends with a valid stop codon (one of the following: TAA, TAG, or TGA)
- contains at least 5 total codons (including its initial start codon and final stop codon)
- Cytosine (C) and Guanine (G) combined account for at least 30% of its total mass

(*These are approximations for our assignment, not exact constraints used in computational biology to identify proteins.)

The DNA input data consists of line pairs. The first line has the name of the nucleotide sequence, and the second is the nucleotide sequence itself. Each character in a sequence of nucleotides will be A, C, G, T, or a dash character, "-". The nucleotides in the input can be either upper or lowercase.

Input file dna.txt (partial):

    cure for cancer protein
    ATGCCACTATGGTAG
    captain picard hair growth protein
    ATgCCAACATGgATGCCcGATAtGGATTgA
    bogus protein
    CCATt-AATgATCa-CAGTt
    ...

The dash "-" characters represent "junk" or "garbage" regions in the sequence. For most of the program they should be ignored in your computations, though they do contribute to the total mass of the sequence as described later.

Program Behavior:

Your program begins with an introduction and prompts for input and output file names. You may assume the user will type the name of an existing input file that is in the proper format. Your program reads the input file to process its nucleotide sequences and outputs the results into the given output file. Notice the nucleotide sequence is output in uppercase, and that the nucleotide counts and mass percentages are shown in A, C, G, T order. A given codon such as GAT might occur more than once in the same sequence.

Log of execution (user input follows each prompt):

    This program reports information about DNA nucleotide sequences that may encode proteins.
    Input file name? dna.txt
    Output file name? output.txt

Output file output.txt after above execution (partial):

    Region Name: cure for cancer protein
    Nucleotides: ATGCCACTATGGTAG
    Nuc. Counts: [4, 3, 4, 4]
    Total Mass%: [27.3, 16.8, 30.6, 25.3] of 1978.8
    Codons List: [ATG, CCA, CTA, TGG, TAG]
    Is Protein?: YES

    Region Name: captain picard hair growth protein
    Nucleotides: ATGCCAACATGGATGCCCGATATGGATTGA
    Nuc. Counts: [9, 6, 8, 7]
    Total Mass%: [30.7, 16.8, 30.5, 22.1] of 3967.5
    Codons List: [ATG, CCA, ACA, TGG, ATG, CCC, GAT, ATG, GAT, TGA]
    Is Protein?: YES

    Region Name: bogus protein
    Nucleotides: CCATT-AATGATCA-CAGTT
    Nuc. Counts: [6, 4, 2, 6]
    Total Mass%: [32.3, 17.7, 12.1, 29.9] of 2508.1
    Codons List: [CCA, TTA, ATG, ATC, ACA, GTT]
    Is Protein?: NO

Implementation Guidelines, Hints, and Development Strategy:

The main purpose of this assignment is to demonstrate your understanding of arrays and array traversals with for loops. Therefore, you should use arrays to store the various data for each sequence. In particular, your nucleotide counts, mass percentages, and codons should all be stored using arrays. Additionally, you should use arrays and for loops to transform the data from one form to another as follows:

- from the original nucleotide sequence string to nucleotide counts;
- from nucleotide counts to mass percentages; and
- from the original nucleotide sequence string to codon triplets.

These transformations are summarized by the following diagram using the "cure for cancer" protein data:

    Nucleotides: "ATGCCACTATGGTAG"

    What is computed                        Output to file
    Counts: {4, 3, 4, 4}                    Nuc. Counts: [4, 3, 4, 4]
    Mass %: {27.3, 16.8, 30.6, 25.3}        Total Mass%: [27.3, 16.8, 30.6, 25.3] of 1978.8
    Codons: {ATG, CCA, CTA, TGG, TAG}       Codons List: [ATG, CCA, CTA, TGG, TAG]
    Is protein?: YES

Recall that you can print any array using the method Arrays.toString. For example:

    int[] numbers = {10, 20, 30, 40};
    // my data is [10, 20, 30, 40]
    System.out.println("my data is " + Arrays.toString(numbers));

To compute mass percentages, use the following as the mass of each nucleotide (grams/mol).
The dashes representing "junk" regions are excluded from many parts of your computations, but they do contribute mass to the total.

    Adenine (A):  135.128
    Cytosine (C): 111.103
    Guanine (G):  151.128
    Thymine (T):  125.107
    Junk (-):     100.000

For example, the mass of the sequence ATGG-AC is (135.128 + 125.107 + 151.128 + 151.128 + 100.000 + 135.128 + 111.103) or 908.722. Of this, 270.256 (29.7%) is from the two Adenines; 111.103 (12.2%) is from the Cytosine; 302.256 (33.3%) is from the two Guanines; 125.107 (13.8%) is from the Thymine; and 100.000 (11.0%) is from the "junk" dash.

We suggest that you start this program by writing the code to read the input file. Try writing code to simply read each protein's name and sequence of nucleotides and print them. Read each line from the input file using Scanner's nextLine method. This will read an entire line of input and return it as a String.

Next, write code to pass over a nucleotide sequence and count the number of As, Cs, Gs, and Ts. You can use a String's charAt method to get individual characters. Put your counts into an array of size 4. To map between nucleotides and array indexes, you may want to write a method that converts a single character (i.e., A, C, G, or T) into an index (i.e., 0 to 3).

Once you have the counts working correctly, you can convert your counts into a new array of percentages of mass for each nucleotide using the preceding nucleotide mass values. If you've written code to map between nucleotide letters and array indexes, it may also help you to look up mass values in an array such as the following:

    double[] masses = {135.128, 111.103, 151.128, 125.107};

You may store your mass percentages already rounded to one digit past the decimal or you can round when printing the mass percentages array using printf.
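The counting and mass steps described above can be sketched as follows. This is only one possible arrangement, not the required solution: the class name DnaMassSketch and the method names indexOf, countNucleotides, countJunk, totalMass, and massPercentages are made up for illustration, and the sketch assumes percentages are stored already rounded.

```java
import java.util.Arrays;

class DnaMassSketch {
    // Number of distinct nucleotides (A, C, G, T).
    static final int UNIQUE_NUCLEOTIDES = 4;
    // Masses in A, C, G, T order, taken from the assignment text.
    static final double[] MASSES = {135.128, 111.103, 151.128, 125.107};
    static final double JUNK_MASS = 100.000;

    // Maps a nucleotide letter (either case) to an index 0..3,
    // or -1 for any other character such as the junk dash.
    static int indexOf(char c) {
        return "ACGT".indexOf(Character.toUpperCase(c));
    }

    // Counts As, Cs, Gs, and Ts; dashes are skipped.
    static int[] countNucleotides(String sequence) {
        int[] counts = new int[UNIQUE_NUCLEOTIDES];
        for (int i = 0; i < sequence.length(); i++) {
            int index = indexOf(sequence.charAt(i));
            if (index >= 0) {
                counts[index]++;
            }
        }
        return counts;
    }

    // Counts the junk dashes, which still contribute mass.
    static int countJunk(String sequence) {
        int junk = 0;
        for (int i = 0; i < sequence.length(); i++) {
            if (sequence.charAt(i) == '-') {
                junk++;
            }
        }
        return junk;
    }

    // Total mass of the sequence, including junk.
    static double totalMass(int[] counts, int junkCount) {
        double total = junkCount * JUNK_MASS;
        for (int i = 0; i < counts.length; i++) {
            total += counts[i] * MASSES[i];
        }
        return total;
    }

    // Percentage of the total mass per nucleotide, rounded to one decimal.
    static double[] massPercentages(int[] counts, double totalMass) {
        double[] percentages = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            double percent = counts[i] * MASSES[i] / totalMass * 100.0;
            percentages[i] = Math.round(percent * 10.0) / 10.0;
        }
        return percentages;
    }

    public static void main(String[] args) {
        String sequence = "ATGG-AC";  // the worked example from the text
        int[] counts = countNucleotides(sequence);
        double total = totalMass(counts, countJunk(sequence));
        double[] percentages = massPercentages(counts, total);
        System.out.println(Arrays.toString(counts));       // [2, 1, 2, 1]
        System.out.println(Arrays.toString(percentages));  // [29.7, 12.2, 33.3, 13.8]
    }
}
```

Looking the letter up in the string "ACGT" keeps the letter-to-index mapping in one place, so the same index can be used for both the counts array and the masses array.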
If you choose to store the percentages pre-rounded, use Math.round as follows:

    double num = 1.6666667;
    double rounded = Math.round(num * 10.0) / 10.0;
    // the answer is 1.7
    System.out.print("the answer is " + rounded);

Remember that the "junk" dashes do contribute mass to the total. For other parts of your program you may want to remove dashes from the input; consider using the replace method on the nucleotide string to eliminate these characters.

After computing mass percentages, you must break apart the sequence into codons and examine each codon. You may wish to review the methods of String objects as presented in Chapters 3 and 4, such as substring, charAt, indexOf, replace, toUpperCase, and toLowerCase.

We also suggest that you first get your program working correctly printing its output to the console before you save the output to a file. Once you have your program printing correct output to the console, save the output to a file by using a PrintStream as described in Section 6.4 of the textbook.

You may assume that the input file exists, is readable, and contains valid input. (In other words, you should not re-prompt for input or output file names.) You may assume that each sequence's number of nucleotides (without dashes) will be a multiple of 3, although the nucleotides on a line might be in either uppercase or lowercase or a combination.
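Given the multiple-of-3 assumption, splitting a sequence into codons might look like the sketch below. The class name CodonSketch and the method name toCodons are hypothetical; the approach (strip dashes with replace, uppercase, then take substrings of length 3) is one reading of the hints above rather than the only valid one.

```java
import java.util.Arrays;

class CodonSketch {
    // Number of nucleotides in each codon.
    static final int NUCLEOTIDES_PER_CODON = 3;

    // Removes junk dashes, uppercases the sequence, and cuts it into codons.
    // Assumes the dash-free length is a multiple of NUCLEOTIDES_PER_CODON.
    static String[] toCodons(String sequence) {
        String cleaned = sequence.replace("-", "").toUpperCase();
        String[] codons = new String[cleaned.length() / NUCLEOTIDES_PER_CODON];
        for (int i = 0; i < codons.length; i++) {
            int start = i * NUCLEOTIDES_PER_CODON;
            codons[i] = cleaned.substring(start, start + NUCLEOTIDES_PER_CODON);
        }
        return codons;
    }

    public static void main(String[] args) {
        // The "bogus protein" sequence from the partial dna.txt listing.
        String[] codons = toCodons("CCATt-AATgATCa-CAGTt");
        System.out.println(Arrays.toString(codons));
        // [CCA, TTA, ATG, ATC, ACA, GTT]
    }
}
```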
Your program should overwrite any existing data in the output file (this is the default PrintStream behavior).

Style Guidelines:

For this assignment you are required to have the following four class constants:

- one for the minimum number of codons a valid protein must have, as an integer (default of 5)
- a second for the percentage of mass from C and G in order for a protein to be valid, as an integer (default of 30)
- a third for the number of unique nucleotides (4, representing A, C, G, and T)
- a fourth for the number of nucleotides per codon (3)

For full credit it should be possible to change the first two constant values (minimum codons and minimum mass percentage) and cause your program to change its behavior for evaluating protein validity. The other two constants won't ever be changed but are still useful to make your program more readable. Refer to these constants in your code and do not refer to the bare number such as 4 or 3 directly. You may use additional constants if they make your code clearer.

We will grade your method structure strictly on this assignment. Use at least four nontrivial methods besides main. These methods should use parameters and returns, including arrays, as appropriate. The methods should be well-structured and avoid redundancy. No one method should do too large a share of the overall task. The textbook's case study at the end of Chapter 7 is a good example of a larger program with methods that pass arrays as parameters.

In particular, we require that you have the following particular method in your program:

- A method to print all file output for a given potential protein (nucleotides, counts, %, is it a protein, etc.)

In other words, all output to the file should be done through one method called on each nucleotide sequence from the input. Your other methods should do the computations to gather information to be passed to this output method.

Your main method should be a concise summary of the overall program.
It is okay for main to contain some code such as println statements. But main should not perform too large a share of the overall work itself, such as examining each character of an input line. Also avoid "chaining," when many methods call each other without ever returning to main.

We will also check strictly for redundancy on this assignment. If you have a very similar piece of code that is repeated several times in your program, eliminate the redundancy such as by creating a method, by using for loops over the elements of arrays, and/or by factoring if/else code as described in section 4.3 of the textbook.

Since arrays are a key component of this assignment, part of your grade comes from using arrays properly. For example, you should reduce redundancy as appropriate by using traversals over arrays (for loops over the array's elements). This is preferable to writing out a separate statement for each array element (a statement for element [0], then another for [1], then for [2], etc.). Also carefully consider how arrays should be passed as parameters and/or returned from methods as you are decomposing your program. Recall that arrays use reference semantics when passed as parameters, meaning that an array passed to a method can be modified by that method and the changes will be seen by the caller.

You are limited to features in Chapters 1 through 7. Follow past style guidelines such as indentation, names, variables, types, line lengths, and comments (at the beginning of your program, on each method, and on complex sections of code).
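To show how the two adjustable class constants could drive the protein test, here is a minimal sketch. The names ProteinCheckSketch, MIN_CODONS, MIN_CG_PERCENT, and isProtein are assumptions for illustration. The sketch also assumes the mass percentages arrive in A, C, G, T order and that comparing the already-rounded percentages against the threshold is acceptable; the assignment text does not say whether rounded or unrounded values should be compared.

```java
class ProteinCheckSketch {
    // Minimum number of codons a valid protein must have.
    static final int MIN_CODONS = 5;
    // Minimum percentage of total mass that C and G must account for together.
    static final int MIN_CG_PERCENT = 30;

    // Returns true if the codons and mass percentages satisfy all four
    // constraints from the assignment. massPercentages is in A, C, G, T order.
    static boolean isProtein(String[] codons, double[] massPercentages) {
        if (codons.length == 0 || codons.length < MIN_CODONS) {
            return false;
        }
        String first = codons[0];
        String last = codons[codons.length - 1];
        boolean validStart = first.equals("ATG");
        boolean validStop = last.equals("TAA") || last.equals("TAG")
                || last.equals("TGA");
        double cgPercent = massPercentages[1] + massPercentages[2];
        return validStart && validStop && cgPercent >= MIN_CG_PERCENT;
    }

    public static void main(String[] args) {
        // Values copied from the "cure for cancer protein" sample output.
        String[] codons = {"ATG", "CCA", "CTA", "TGG", "TAG"};
        double[] percentages = {27.3, 16.8, 30.6, 25.3};
        if (isProtein(codons, percentages)) {
            System.out.println("Is Protein?: YES");
        } else {
            System.out.println("Is Protein?: NO");
        }
    }
}
```

Because the thresholds appear only as constants, changing MIN_CODONS or MIN_CG_PERCENT changes the verdict without touching the method body, which is the behavior the style guidelines ask for.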
dna.txt:
cure for cancer protein
ATGCCACTATGGTAG
captain picard hair growth protein
ATgCCAACATGgATGCCcGATAtGGATTgA
bogus protein
CCATt-AATgATCa-CAGTt
michael jordan mad hops protein
ATgAGATCCgtgatGTGggaTCCTaCTCATTaa
paris hilton phony protein
AtgCCaacaTGGATGCCCTAAGATAtgGATTagtgA
george w bush approval rating protein
atgataattagttttaatatcagactgtaa
jimi hendrix guitar talent protein
ATGCAATTGCTCGATTAG
tyler durden's brain protein
ATGATAcctatgagtaaTGTGGACCatatccaaACTATAGGCATtgtcggACCAACGATcgattggtTATACTGA
mini me growth hormone
AtGgGaCGCTgA
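The assignment describes the input as line pairs: a name line followed by a sequence line. A minimal sketch of reading such pairs with Scanner's nextLine is shown below. The class name LinePairSketch and the method name echoPairs are hypothetical, and the Scanner here reads from an in-memory string so the example runs without any file on disk.

```java
import java.io.PrintStream;
import java.util.Scanner;

class LinePairSketch {
    // Reads name/sequence line pairs from input and echoes each pair to out,
    // with the sequence converted to uppercase. Returns the number of pairs.
    static int echoPairs(Scanner input, PrintStream out) {
        int pairs = 0;
        while (input.hasNextLine()) {
            String name = input.nextLine();
            if (!input.hasNextLine()) {
                break;  // a name without a sequence line; stop reading
            }
            String nucleotides = input.nextLine().toUpperCase();
            out.println("Region Name: " + name);
            out.println("Nucleotides: " + nucleotides);
            out.println();
            pairs++;
        }
        return pairs;
    }

    public static void main(String[] args) {
        String data = "cure for cancer protein\nATGCCACTATGGTAG\n"
                + "mini me growth hormone\nAtGgGaCGCTgA\n";
        int pairs = echoPairs(new Scanner(data), System.out);
        System.out.println(pairs + " sequences read");
    }
}
```

To read the real files, the Scanner would instead be built from new File(inputName) and the output sent to new PrintStream(new File(outputName)); both constructors can throw FileNotFoundException, so main would need a throws clause.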
ecoli.txt:
PLEASE answer in Java code. I've found the Python one already, but I'm having a rough time getting my Java code to work. Thanks!
thr operon leader peptide ATGAAACGCATTAGCaCCACCATTACCACCACCATCaCCATTACCACAGGTAACGGTGCGGGCTGA aspartokinase I/homoserine dehydrogenase I ATGCGAGE GTTGAAGTTCGGCGGTCATCASTGGCAAATGCAGAACGTDTTCTGCGGGTTGCCGATAttCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCCTGCCCCCGCCAAAATCACCAACCATCtGGTaGCGATGATtGaaAAaACCATTAGCGGTCAGGADGCt TTCcCATATCAGCGATGCCGAACGTATTTTTGCCGAACTICTGACGGACTCGCCGCCGCCCAGCCGGGATTTCCGCTGGCACAATTgAAAACTTTCGTCGACCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCatCAGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCT GCGCTGATTTGCCGTGgCGAGAAAATGTcGaTcgccattaTGGCCGGCGTGTTAGAAGCGCGTGGTCACAACGTTACCGTTATCGATCCGGTCGAAAAACTGCTGCAGTGGGTCATTACCtCgAaTCTACCGTTGATaTtGCTGAATCCACCCGCCGTATTGCGGCAAGCCGCATTCCg GCTGACCACATEGtGCTGATGGCTGGTTTCACTGcCggTAATGAAAAAGGCGaGCTGGEGGTECTGGGACGCAACGGTTCCGACI CGGTCCTGGCGGCCTGTTTACGCGCCGATTGTTGCGAgaTCTGGACGGATGTTGACGGTGTTTATACCTGCG CGCGTCAGGTG CCCGATGCGAGGTTGTTGAAGTCGATGTCCTATCAGASGCGATGGAGCTTTCTTACTTCGGCGCTAAA CacccccccAC ATCGCCCAG CAGATCCCTtgCCUGATTAAAAATACCGgAAAECCCCAAGC TACGCECATTG AGCCGTGAT GAAGACGAATTACCGGTCAAGGGCATTTCCAATSTGAATACATGGCAATETTCAGCGTTTCCGCCCCGGGGADGAAAGGGATggTTEGCATGGCGGCGCGCGTCTTTGCAGCGTGTCACGCGCCCGTaTTUCCGTGGTGCUGATTACGC TCTTCCGAAT GTATCAGTTTC TGCGTTCCGCAAGCGACTGTGTGCGAGCTgAaCGGGCA TGCAGGAAGAGETCTACCTGGAaCTGAAGAAGGCTTACTGGAGCCGTTGGCgGt GACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCAC taCGTGGGAUCTCGgCGAAATECTETGCCGCGCTS GCCCGCGCCAATATCAACATTGTCGCCATTGCtCaGGGaTCTTTGAaCGCTCAAUCTCTGTCGTGGTCATAACGATgATGCGACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGgCGTCGGTGGCGTTGCCGGTGCGCTG CTGGAGCAACTGAAGCGTCAGCAAAGCTGGTTGAAGAATAA CATATCGCTTACGTGTCTGCGGTGTTGCTAACTCGAAGECACEGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCcAAAGAGCCGTTTAATCTCGCGCGcTtAATTCGCCTC GTGAAAGAATATCATCTGCEGAaCCCGGTCATTGTTGACTWTACTTCCACCAGGCTGTGGCAGATCAATATECCGACTECCTgCGCGAAGGTTTCCACGTTGTTACGCCGAaCAAAAAGGCCACACCTCGTCSATGGATTACTACCATCAGTIGCGTTATGCGGCGGAAAAATCGCGG 
CGTAATTCCTCLATGACACCACGTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCEGGTGATGAATTGATGAAGTTCTCCGGCATTCTTTCAGGTTCGCTTTCTTALATCTTCGGCAAGTTAGACGAAGGCaTGAGTETCTCCGAGGCGACCACACTGGCG CGGGAAATGGGTTATACCGAACCGGACCCGCGAGATGATCTTECtGGTATGJATGTGGCGCGTAagCTADTGATECTCGCTCGTGAAACGGGACGTGAACTGGAGCEGGCGGATATTGAAATTGAACCTgTGCTGCCCGCaGaGTTTAACGCCGAGGGTGATGTCGCCGCTTTTATGGCG AATCTGTCACAGCTCGACGaTCECTTTGCCGCGCGTGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATALOGATGAAGATGGCgTCTGCCGCGTGAAGTTGCCGAAGTGGATGSTAATGaTCCGCTGTTCAAAGTGAaAaATGGCGAAAACGCCCTGGCCTTC TATAGCCACTATLATCAGCCGCTGCCGTTGGTACTGCGCGGATATGGTGCGGGCATGACGTTACAGCTGCCGGTGTCTTTGCTGATCTGCTACGEACCCTCTCATGGAGTTAGGAGTCTGA homoserine kinase ATGGTTAAAGTTTAEGCCCCGGCUTCCAGTGCCATATGaGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGTTGATGGTGCATTGCTCGGAGATGTagTcaCGGTTGAGGCGGCAGAGACATTCASTCTCAACAACCTCGGACGCTTTGCCGAEAAGCTGCCGTCAGAGCCA CGgGaaAATATCGTTLATCAGTGcTGGGAGCGTTTTTGCCAGGAGCTTGGCAAGCAAATTCCAGTGGCGATGACTCTGGAAAAGAATatGCCGAECOGTTCGGGCTTAGGCTCCAGCGCCGTTCAGTGGTCGCGGCgCTgAt GGCGATGAATGAACACTGCGGCAGCCGCTTAATGAC ACTCGTTTGCTGGCTTUGATGGGCGAGTTGGAAGGGCGTATCTCCGGCAGCALTCATTACGACAACGEGGCACCGTGTTtTCETGGTGGTATGCAGTUGATGATCGAAGAAACGACATCATCAGCCAGCAGTGCCAGGGTTTGATGAGEGGCTGTGGGTGCTGGCGTATCCGGGGAET AAAGTCECGCGGCAGAAGCCAGGGCTaTTTTACCGGCGCAGTATCGCCGCCAGGATTGCATTGCGCACGGGCGACATCTGCCAGGCTTCATTCACGCCTGCTATTCCCGTCAGCTTGAGCTTGCCGCGAAGCTGATgAAAGTGTTATCGCTGAACCCTACCGTGACGGTTACTGCCA GGCTTCCGGCAGGCGCGGCAGGCGGTTGCGGAAATCGGCGCGGTAGCGAGCGGTATCTCCGGCTCCGGCCCGACTETGTTCGCTCTGTGEGACAAGCCGGATACCGCCCAGCGCGTTGCCGACTGGTTGGGTAAGAACtACCTGCAAAATCAGGAAGGTTTTGTTCATATTTGCCGGCTG GATACGGCGGGCGCACGACTACTGGAAAACTAA threonine synthase ATGAAACTCtacaATCTGAAAGATCACAATGAGCAGGTCaGCTTTGCGCAAGCCGTAACCCAGGGOTTAGGCAAAAATCAGGGGCtGTTTTTTCCGCACgaCCTGCCGGaaTTCAGCcTgACTGAAaTTGATGAGATCCTGAAGCtGGATTTTGTCACCCGCAGTGCGAAGATCCTCTCg GCGTTTATTGGTGATGAAATCCCGCAGGAAaTCCTGGAAGAGCGCGTACGTGCGGCGTTTGCCTTCCCGGCTCCGGTCGCCAATGTTGAASGCGATGTCGGTTGTCTGGA TTGTTCCACGGGCCAACGCTGGCATTTAAAGATTTCGGCGGTCGCTTTATGGCACAAATGCTgACCCAT 
ATTGCGGGCGATAAGCCAGTGACCATTCTGACCGCGACATCCGGTGATACTGGAGCGGCAGTGGCTCATGcTTTCEACGGTETACCGAATGTGAAAGTGGTTATCCTCTATCCACGAGGCAAAATCAGTCCACTGCAAGAAAAACTgTTCTGTACATTG9GCggCAATATCGaAACTGTT GCCATCGAcggCGaTTTCGATGCCTGTCAGGCGCTGGTGAAGCAGGCGTTTGATGATGAAGAACTGAAAGTGgCgCtGGGGCE GAATTCTGCTAACTCCATCAACATCAGTCGCTTGCTGGCGCAGATTTGTTATTACTTTGAGGCTGTCGCACAGTE GCCGCAAGAAGCACGTAACCAG TTGGTTGTCTCGT CCGAGTGCAAACETCGGCGATETGACGGGGGGTCTGCTGGCGAaGTCACTCGGTCDGCCGGTAAAACGTETTATTGCE GCGACCAACGTGAACGALACCGTACCACGTTTCCTGCaCGCGGTCAGTGGTCACCCAAaGCGACTCAGOCGAcgTtaTCCAATGCG ATGGATGTTAGCCAGCCAACAACTGGCCGCGTGTGGAAGAGTTGE TCCGCCGCAAAATCTGGCAACTGAAAGAGCTGGETTATGCAGCCGTGGATGATGAAACCACGCAACAGACAATGCGTGAGE TAALAGAACTGGGCTATACCTCGGAGCCGCACGCTGCCGTAGCTTATCGTGCG CTGCGTGACCAGTTGAAECCAGGCGAATATGGCTTGTU CCTCGGCACCGCGCATCCGGCGAAatTtAAAGAGAGCGTGGAAGCGATTCTCGGTGAAACGTTGGatCTGCCAAAAGAGCTGGCAGAACGTGCTgATTTACCCTTGCTTTCGCATAACCTGCCCGCCGATTTTGCTGCGTTG CGTAAatTgaTGATGAaTCATCAGTAA hypothetical protein AtGCAGCCCGGCTETTTTTATGAAGAAAATATGGAGaAaAACGACagGGAAAAAGGAGAAATTCECAATAAATGCGGEAACTTAGASATTAGGATTGCGGAGAAT ACAACTGCCGTTCTCATCGCGTAATCTCCGGATATCGACCCATAACGGGCAATGATAAAAGGAGTAACCTGTGA Non-protein region aAAAACTGCTGGAAACAATGAAAGACGTACCGGACGACCAACGTCAGGCGC transaldolase B ATGACGGACAAATTGaCCTCCCTTCGTCAGTACACCACCGTAGTGGCCGACACTGGGGACATCGCGGCAATGAAGCTGTATCAACCGCAGGATGCCACAACCAACCCTECTCTCATTCTTAACGCAGCGCAGATTCCGGAATACCGTAASTTgATTGATGATGCTGTCGCCTGGGGAA CaGCAGAGCAACGATCGCgCgCAGCAGATCGEGGACGCGACCGACAAACTGGCAGTAATATTOGTCT GAAaTCCTGAAACTGETTCCGCGCCgTATCTCAACUGAAGTUGATGCGCGTCTTTCCTATGACaCCGAAGCGTCAATTGCGAAAGCAAAACGCCTGATCAAACTCTACAAC GATGCAGGTATTAGCAACGATCgTaTTCTGATCAAACTGGCTTCTACCTGGCAGGGTATCCGTGCTGCAGAACAGCTGGAAAAAGAaCGTATTAACTGTAACCTGACCCTGCTgt TCTCct TCGCCAGGCTCGTGCTTGTGCGGAGCGCGCGTGTTCCTGaTCTCGCCGTTTGTTGGC CGTATTCTTGACTGGTACAAaGCGAATACCGATAAGAAAGAGEACGCTCCGGCAGAAGATCCGGGCGTGGTTTCTGTatCtGAAATCEACCAGEACTACAAGAGCATGGTTaTgAAACCGTGGTTATGGGCGCAAGCTTCCGTAACATCGGCGAAATTCTGGAACTGGCAGGCTGCGAC CGTCTGACCat 
CGCACCGCACTGCTGAAAGAGCTGCGGAGAGCGAAGGGGCTATC,AACGTAAACTOTCTTACACTGgTGAAGTGAAAGCgCGTCCGGCGCGTATCACEGAGE CCGAGTTCCTGTGCAOCACAACCAGGATCCAATGGCAGTAGATAAACTgGcGGAGETATCCGT AAGTTTGCTGTTGACCAGGAAAAACTGGAAAAAATGATCGGCGATCTGCEGTAA molybdopterin biosynthesis mog protein ATGAATACTTTACGTATTGGCTTAGTtTcCaTCTCTGATCGCGCATCCAGCGGCGTTTALCAGgaTAAAGGCATCCCTGCGCTGGAAGAATGGCTGACAUCGGCGCTAACCACGCCGTTTGAaCTGGAAACCCgcTTaATCCCCGATGAGCAGGCGATCATCGAGCAAACGTTGTGTGAG CTGGTGGATGAATGAGETGCCATCTGGTGCTCACCACGGGCGGAACTGGCCCTGCGCGTCGTGACETAACGCCCGATGCGACGCTGGCAGTAGCGGACCGCGAGATOCCAGGCTTTGGTGAACAGATGCGCCAGATCAGCCTGCATTTTGTACCaaCTGCGATCCTTTCGCGTCAGGTS gGGGTgATTCGCAAACAGGCGCTGATCCTTAACTT CCCGGTCAACCGAAGECTATTAAAGAGACGCE GSAAGGTGEGAAGGACGCTGAGEGTAACGTTGTGGTGCACGSTATTTTTGCCGCGTaCcGTACTGCATTCAGTTGCTGGAAGGGCCATACGTTGAACGGCaCCgGaAGTG GTTGCAGCATTCAGCCGAAGAGTGCAGACGCGAAGE TAGCGAATAA chaperone protein DnaK a TGGGTAAAATAaTTGGTATCGACCTGGGTACTACCAaCTCTTGTGTagCGATTAEGGATGGCACCACTCCUCGTGEACTGGAGAACGCCGAAGGCGATCGCACCACGCCTTCTATCATTGCCTATACCCAGGADGGTGAAACTCTGGTTGGTCAGCCGGCTAAACGTCAGGCAgt GACG AACCCGCAAACACCCTGTU TGCGATTAAACGCCUGATTGGCCGCCGCTTCCAGGACGAAGAAGTACAGCGXGATOTTTCCATCATGCCGTTCAAAATTAUTGCTGCtgatAACGGCGACGCATGGGTCGAAGETAAAGGCCAGAAAATGGCACCGCCGCAGAUCTCTGCTGAAGTGCTG AAAAAAAEGAAGAAAACCGCTGAAGATTACCTGGGTGAACCGGTAACTGaAGCTgt TATTACCGTACCGGCATACTEtaACGATGCTCAGCGTCAGGCAACCAAAGACGCAGGCCGTATCGCTGGTCTGGAAGTAAAaCGTATCATCAACGAaCCGACCGCAGCTGCGCTGGCTEACGGt CTGGACAAAGSTACTGGCAACCgtACTATCGCGGTTTATGACCTGGGTGGTGGTACTTTCGATATTTCCATTATCGAATCGACGAAGTTGACGGCgAAAAAACCUECGAAGTTCTGGCAACCAACGGTGATACCCACCTGGGTGGEGAAGACTTCGACAGTCGTCTGATCAACTACTG GTTGAAGAATTCAAGAAAGATCAGGGCATTGacCtGCGCAACGaTcCGCTGGCAATGCAGCGCCTGAAGAAGCGGCAGAAAAAGCgAAAATCGAACTGTctTCCGCTCAGCAGACCGCGTTAACCTGCCGTACATCACTGCAGACGCGACCGGTCCGAAACACATGAACATCAASTG act CGTGCGAAACTGGAAAGCCTgGtTGAAGAUCTGGTAAACCGt TCCATTGAGCCGCTGAAAGTTGCACTGCAGGACGCTGGCCTGTCCGTATCTGATAECGACgaCGTTATTCTCGTTGGTGGTCAGACTCGTATGCCAATGGETCAGAAGAAAGTTGCTGaATTCTTTGGTAAAGAG 
CCGCGTAAAGATGTTAACCCGGACGAAGCTGTGCCATCGGTGCTGCTGTTCAGGGTGGTGTTCTGACTGGEGACGTAAAAGaCGTacTGCTgCtGGACGTTACCCCGCTGTCtCTGGGTATCGaAACCaTGGGCGGTGTGATGACCACGCTGATCGCgAAAACACCACTATCCCGACC AaGcAcaGCCAGGTGTTCTCTACCGCTG AGACAACCAGTCTGCGGTAACCATCCATgtGCTGCAGGGTGAACGTAACGTGCGGCTGATAACAATCTCTggSTCAGTTCAACCTGGATGGTATCACCCGGCACCGCGCGGCATECCGCAGATCGAAGUTACCETCGATATCGTGCT GACGGTATCCTGCaCGTTTCCGCGAAAGACAAAAACAGCGGTAAAGAGCAGAAGATCACTATCAGGCTTCTTCTGGECTGAaCGAAGAT GAAATCCAGAAAATGGTACGCGCGCAGAAGCTAACGCCGAAGCTGACCGTAGTTTGAAGAGCTGGTACAGACtcGCAACCAGGGCGAC CATCTGCTGCACAOCACCCGTAAGCAGGTTGAAGAAGCAGGCGACACACTGCCGGCTGACGACAAAACTGCTATCGAGTCTGCGCTGACTGCACTGGAAACXGCTCTGAAGGTGAAGACAAAGCCCTATCGAAGCGAAAATGCAGGAACTGGCACAGGTTTCCCAGAAACTGATGGAA ATCGCCCaGCAGCAACATGCCCAGCAGCAGACTGCCGGTGCTgATCCTTCEGCAAACAACGCGAAAGATGACGATGTTGTCGACGCEGAATTTGAAGAAGTCAAAGACAAAAAATAA chaperone protein Dnaj GTGCatTCatCTAGGGGCAATTTAAAAAAGATGGCTAAGCAAGATTATTACGAGTTTTAGGCGTTTCCAAAACAGCGGAAGAGCGXGAaa TCAAAAAGGCCTACAAACGCCTGGCCATGAAa TACCaCCCGGaCcGTAACCAGGSTGACAAAGGGCCGAGGCGAAATTTAAAGAGATC AAGGAGCTTATGAAGTTCTGACCGACECGCAAAAACGTGCGCATACGATCAGTATGGTCATGCTGCGTTTGAGCAAGGTGGCATGGGCGGCGGCGGETTTGGCGGCGGCGCAGACTTCAGCGATAETTETGGTGACGETTTCGGCGATATTTTTGGCGGCGGACGTGGTCGTCAACGT GCGGCGCGCGGTGCTGATTTACGCTATAACATGGAGE CACCCLCGAAGAAGCTGTACGUgGCGtGaCCAAAGGATCCGCATECCGACTCEGGAAGAGTGTGACGTTTGCCACSGTAGC GTGCAAAACCAGGTACACACC CAGACCTGTCCGACCTETCATGGTTCTGGCCAGGEG CAGATGCGCCAGGGTTTCTTT AGACCTgTCCACACTGTCAGGGCCGCGGTACGCTGaTCAAAGATCCGTGCAACAAATGTCATGGTCATGGTCGTGETGAGCGCaGCAAAACGCTGTCCGTTAAAATCCCGGCaGGGGTGGACACTGGAGACCGCATCCGTCTTGCGCGC GAAGGTGAAGCGGGTGAACACGCCGCACCGGCAGGCGATCTETACGTTCAGGTECAGGTTAAACAGCACCCGATTTTCGAGCGTGAAGGCAACAACCTGTATTGCGAAGTCCCGATCAACTTCGCTATGgCGGCGCTGGGTGGTgaAATCGAAGTACCGACCcTTGATGGTCGCGTCaaA CTGAAAGTGCCTGGCGAAACCCAGACCGGTAAGCTGETCCgTaTGCGCGGTAAAGGCGTCAAGTCUGTCCGCGGTGGcgCACAGGGTGATTEGCTATGCCGCGTTGTTGTCgaAACACCGGTAGGTTTGAACAGAAGCAGAAACAGCTGCTGCAAGaGct GCAAGAAAGCETTGGTGGC 
CCAACCGGCGAGCACAACAGCCCGCGTTCAAAGAGCETCTTUGATGGCGTGAAGAAGTTTTTTGACGaCCTGACTCGCTAA hypothetical protein TTGCTCTTCTCGGATTCGTAAGCCGTGAAAACAGCACCTCCGtCTGGCCAGTTCGGATGTGAACCTCACAGAGGTCTTTTCTCGTTACCAGCGCCGCCACTACGGCGGTGATACAGATGACGATCAGOGcgACaAtcAtCgCcTTATGCTGCTTCATTGCTCECTECTCCTTGACCTT TCGGTCAGTAAGAGGCACTCTACATGTGTTCTGCATATAGGGGGCCTCG9GTUGATGETAAAATAT CACTCGGGGCTTTTCTCTACTGCCGTTCAGCTAATSCcTGA hypothetical protein aTGTCTGCCAAaa GACGACTTCTTATTGCGTGTACCTTGATACAGCTATCTATCAETTTCCTGCATATTCTTCATTASAATATAAAGGAECCTTTGGTTCAATAATGCGGGTTATGCAGACTGGAATAGTGGETTTETAACACTCACCGTGGTGAAGTATGGAAAGTGACEGCGGAT TTTGGGGTAATTTTAAAGAAGCAGAATTTTACTCAUTTTATGAAGTAATGTACTCAATCATGCTGTAGCAGGGAGAAATCATACOGETTCAGCAATGACGCATGTCAGACTCUTTGactCTGATATGACATTCTTTGGCAAAATTTTGOCCAATGGGATAACTCATEGggTGAGAT CTCGACATGTTTTATGGATTCGGTTACCTCGGCTGGAACGGCCAGTGGGGCTTTTTTAAACCGTATATTGGATTGCATAATCAATCTGGTGACTACGTATCAGCTAAATATGGTCAAACGAAT GTTGGAATGGETATGTTGTTGGCTGGACAGCAGTATTACCATTTACGTTATTTGAC GAAAAATTTGTTTTATCTAACTGGAATGAATAGAACTGGACAGGACGATGCTTACACGGAGCAGCAATTTGGCCGGAACGGETTaAaTGGCGGETTAACTATTGCCTGGAAGTTCTATCCTCGCTGGAAAGCCAGTGTGACGTGGCGTTATTTCGATAALAGCTGGGCTACGATGGC TTTgGcgaTCAAATGATTTALATGCTTGGTTATGATTTCEAA putative secreted sulfatase ATGCAGAAAACGTTAATGGCCAGTTTGATCGGCCTTGCAGTTTGCACAGGGAALGCTTTTAGECCTGCCTTAGCCGCAGAGGCTAAACAACCTAATTTAGTCATtaTTATGGCGGTGATE TAGGTtaTGGCGAETTAGAaCaTATGGTCATCAGATCGTTAAAACACCIAATATCGAC AGGCUTGCCCAGGAAGGGGTCAATTEACTGACTACTATGCCCCCGCTCCTTLAGTTCACCTECACGCGCaGGGCTATTAACCGGCCGGATGCCATTLCGTACTGGAATTCGCTCATGGATECCtt SGCAAAGATGTTGCCD TAGGGCGTAACGAAC TCACGATTGCTAaTCTACTC AaAgCGCAAGGGTACGACACOCAATGATGGGTAAGCTGCATCTGAATGCAGGCGGCGaTCCC CCACAAGCACACATATGGGcTTTGATTACTCACTGGTTAATACSGCGGG GCCACGCTC TAAAGAACGCCCGCGTTATGGCATGGTT tACCCGACAGGCtgGCUACGTAACGGGCAACCCACTCCACGaGCTGATAAAAEGAGCGGTGAGTATGTCGTTCGGAAGTCGTCAACTGGCTGGATAACAAAaaGGACaGCAAGCCTTTCTT TTGCTTTTACCGAAG CATAGCCCCCTGGCTTCGCCCAAAaaATACCTC GATTGTCTCACATATATGAGCGCGTATCAG GCATCCTGATTTAUTTTATGGCGACTGGGCASACAAACCCTSGCGT GTGTGGGGGAATAT AATATCAGCTATCUGGAT 
AGGTTGGAAAAGTGCTGGATAAAA AAAGCTGTGGgtGaGaaGaTAACACA ATCGTTATTTTTACCAGTGatAACGGTCCgGTAaCGCGTGAAGCGCGCAAAGTGTATGAGCTGAATTTGGCAGGGGASACGGA TGGATTACGCGGTCGCAAGGATAACCTTTGGGAAGGCGGAATTCGTGTTCCAGCCATTATTAAATATGGTAAACATCTACCACAGGGAATGGTTTCA GATACACCCGTTTATGGUCTOGACTGGATGCCTACETTaGCgAAAATGATGAACTTCAAATTACCTACAGACCGTACTTTCGATGGTGAATCGCTGGTTCCTGtTcTTGAGCAAAAGCATTGAAACGCGAAAAGCCATTAATTTTCGGGATTGATATGCCATTCCAGGATJATCCAACC GATGAATGGGCGATCCGTGATGGTGACTGGAAGAT GATTATCGATCGCATAATAAACCGAAATATCTCTACAATCTGAAATCTGATCGTTATGAAaCacTTaAtCTGATCGGTAAAAAAACAGATATTGAAAAACAGATGTATGGTAGETTETAAAATATAAAACTGATATTGATAAT GATECTCTAATGAAAGCCAGAGGTGATAAACCAGAAGCGGTGACCTGGGGCTAa putative cytoplasmic protein ATGTTTACCAMCGTAAATGTTGATTGTTGCAAAACACCAGGAUGTAAaaACCTGGGGTTGCTGAATAGCCAGGATTATGTCGCACAGOGTaAaAATATTTTATGCCGTGAATGT GTTCTTGTVTCCAGUGATATCTGAACAGTCGCTTAALATTTATCGTAATATTGTGAAUCACTCC TGGAGAGGTTTGATTTGCCAATGTTCAACTEGCGGAGGCACGTCCCTCAAAAAATATGJATATECtGCACAagGCCAGAGAAGAATSTATTGCCATCAETGTGAGAAAACATTLATCACTCTGGAACAUGTAATTACCACACCACGAGGAGCCCTGTTAGCATTGATGATTGAGCAAGGG GAGGCACTTGCGGATATCAGAAAGTCATTACGTCTTAACA SGACTTAGCCGTGAACTGTTAAAATTAGCGCGTGAAGCAAACTATAAAGAAAGTCGACAGTGTTTCCCTGCTTCTGATATTACCCTGAGEACCCGCGCTTtTCGCGTCAAGTATAATGGTAGCAATAACTCTCTT TATGCTCTTGTTACCGCAGAAGAACAAAGGGCAGGGTGGTTGcCaTCTCAACCAATTACTCCCCATC GCCGTAGagCaaCATTATCAATACCATCGAACUATGAAGAGCGTATGTCTCCAGGGACGCTGGCACAECATGTCCAGCGCAAAGAGEDACTTACTATGCGGCGGGATACC TTGTTTGATATTGATTACGGCCCGCCAGTTTTACATCAAAACGATCCGGGAATGE TGGTAaAaCCGGTTCTTCCGGCATATCGTCATTTTGAACTGGTCAGAATACTGACCGATGAGCATECCAACAACGTTCAGCATTACCTTGATCACGAATGCTTTATATTGGGCGGCTGCCTGATG GCTAATTTGCAGCATATTCATCAGGTCGCTGCCATATTTCcTTTGTCAA GAGCGCGGTGTGGCACCCGCCACCATTGTTTTCCACCGCGATEATTCCTTAGTgGt GGGGTA GAAATAATGTCTGGCGTGCATTTTCTAACCGCAATTATTCATGGCTGTATGCAAUCTCaCTGGC AGTAAGAAAGTCCGCGAGATGCGGCATGCAACATEGAACAGTGCGACGCgTTITATCCACTTTGTGCaGAACCATCCTTTCCTTATATCATTGAACCGAATGUCTCCTGCGaaTGTCgETTCTACATTAGATATCCTCAAACATCTGTGGAATAaAaAACTAGASCATGGAACAATTIAA sodium/proton antiporter 1 
GTGAAACATCTGCATCGATTCTTTAGCaGTGATGCCTCGGGAGGCATTATTCTCATTATTGCCGCTGTATTAGCGATGATTATGGCCAACAGCGGTgcAACCAGTGGATGGTATCACGACTTTCTTGAGACGCCGGTTCAGCTCCGGGTTGGGACACTTGAGATCAACAAGAACATGCTG CTATGGATCAATGCGCTCTGaTgGCGGTATTTTTCCTGTTGGTTGGTCTGGAGTTAAACGCGAGCTGaTGCAGGTTCGCTGGCCAGTCEGCGCCAGGCGGCatTTCCTGTTATTGCCGCAATCGGCGGGATGATTGTCCCGGCATTGCTCTATCTGGCTETTAACTATGCCGATCCG aTTaCCCGCGAAGGCTGGGCATCCCGGCGGCGACTGECATTGCCTTTGCACTTggTgTGTTGGCGCTGTTGGGAAGTCGTGTTCCGTTAGCGCE GAAGATCTTTTEGATGGCTCTGGCUATTATCGACGATCTTGGGGCCATCATLATCATCGCATTGTTCTACACTAATGACTTATCG ATGGCCTCTCTTGGCGTCGCGCCTGTAGCAATTGCGgEACTCGCGGTATTGAA CTGTGTGGTGTACGCCGCACGGGCGTUTATATTCTGGTTGGCGTGGTGCE GTGGACAGCGGTGTTGAAATCGGGGGTTCACGCAACCcTGGCTGGCGECATEGICGGCTTCTTTATTCCTTTGAAA GAGAAGCATGGGCGCTCTCCGCCTAAACGTCTGGAGCATGTTTTGCAECCATGGGTGGCGTATCTGATUTTGCCGCTGTTTGCATTTGCTAATGCTGGCGTTTCACTGCAGGTGTCACGCtggAaGGTTTgACCECCATTCTGCCATTAGGGATCATCGCTGGTTTGCTGaTTGGCAAG CCACEGCGTAtTaGTCTgttcTGCTGGETGGcgCTGCGTTTGAAATTGGCACATCTGCCAGAGGGAACgACTUACCAGCAAATTATGGCGGETGGTaTCCTGTGCGCTATCGGTTtTACTatGTCTATCTTTATTGCCAGCCTGGCATTTGGTAGCGTAGATCCAGAaCTGaTTAACUGG GCAAAATTAGGTATCCTTGTCGGTTCAATTTCE TcGgCGGTAATTGGATATAGCTGGTTACGCGTTCGTTTACGTCCATCAGTTTGA transcriptional activator protein NhaR ATGAGCATGTCTCATATCAATTACAACCACTUGTATTACTTCTGGCaTGTCTACAAAAAGGTTCTGEGGTTGGCGCAGCGGAGGCGCTTTATTTAACACCACAAACCATTACCGGGCaGATCCGGGCGCTGGAGAGCGCCTGCAAGGGAAACTATTTAAGCGTAAAGGACGTGGTCTG GAACCCAGCGAACTGGGGGAACTGGTCTATCGCEATGCCGATAAAATGTTCACCTTAAGCCAGGAAATGCTGGATATCGTCAACTATCGCAAAGAGTCCAACTUATTGETTGATGTTGGTGTGGCAGATGCACTTUCCAAACGtCTGGTCAGCAGTGTTCUGGATECCGCAGTIGTGGAA GACGAGCAGAECCATCTACGCTGTTTCGAaTCGACGCACGAGATECTTTTGAGCAgt TGAGTCAGCATA ACTGGATATGATCATCTCTGACTGTCCGaTCGATTCCACTCAGCAGGAAGGGCTGTTTTCCATGAAAATEGGCGAATGTGGTGTCASETTCTGGTGCACTAACCCACTA CCAGAAAAGCCGTTTCCTGCCUGTCTTGAAGAGCGTCGETEACTTATTCCGGGGCGTCGCTCAaTGTTGGGGCGTAAACTATTAAACTGGTTTAACTCCCAGGGCTTGAACGTCGAAATTTTGGGTGAGTTTGATGATGCTGCGTTGATGAAAGCCTTTGGGGCGACGCATAACGCTATT 
TTCGTTGCACCTTCGCETTACGCTAATSATTTCTATAACSATGACTCGSTUGTGSAGATAGGCCGTGTTGAGACGTGATGGAAGAGTACCACGCGATTTTTGCCGAAGgaTGAETCASCACCCTGCAGTACAGCGTATCTGCAATACAgacTATTCTGCGCESTTTACTCCAGCTTCA AAATAA riboflavin kinase ATGAAGCTGATACGCGCALACATAATCTCAGCCAGGCCCCCCAAGAAGGGTGTGTGCTGACTATTGGT ATTTCGACGGCGTGCATCGCggTCATCGCGCGCTGTTACAGGGCUTGCAGGAAGAAGGGCGCAAGCGCAACUTACCGGTGATGGTGATGCTTTTtGaACCTCAACCACTG GAACTGTTTGCTACTGATAAAGCCCCGGCACGGcTCACCCGGCTGCgGGAAAAACTGCgTtaTcTTGCAGAGTGTGGCGTTGATTACGTGCTGTGCGEGCGTUTTGCaGGCGTUTTGCGGCGTTAACCGCGCAAAACTTCATCASTGATOTECTGGTGAAGCACTTGCGGGTAAAATTT CTTGCCGTAGGTGACGAETTCCGCTTTggCGCTGgTCGTGAAGGCGAETTCTTGTTATTACAGAAGGGGCATGGAATACGGCTTCGATATCACCAGCaCGCAAATTUTTGCGAAGGTGGTGTGCGTATCAGCAGCACCGCCGtGCGTCAGGCGCt TGCGGATGACAATCTGGCTCTG GCAGAAAGTTTACTGGGGCACCCGTTTGCTATCTCCGGGCGTGTAGTCCACGGTGATGATTAGGGCGCACTATAGGTTTCCCGACGGCGATGTACCGcTaCgCCGTCAGGTTTCCCCGGTGAAAGGGGTTTATGCGGTAGaAgTGTTGGOCCE TGgCGAAAGCCGTTACCCGGcgTT GCAAACATCGGAACACCCCCAACGGTTGCCGGTATTCGCCAGCAACTGCAGTGCATTTGTTAGATGTTGCAATGGCCTTTATGGTCGCCATATACAAGTAGTGCTGCGEAAAAAATACGCAATGAGCAGCGATTTGCATCGCTGGACGAACTGAAAGCGCAGATTGCGCGTGATGAA TTAACCGCCCGCGaaTTTTTTGGGCTAACAAAACCGGCTTAa Isoleucyl-tRNA synthetase ATGAGTGACTATAAATCACCCTgAATTTGCCGGAAACAGGTECCCGATECGTGGCGATCTCGCCAAGCGCGAACCGGGATGCTGGCGCGTTGGATGATGATGATCTgTaCGGCATCATCCGTGCGGCTAAAAAGGCAAAAACCTTCATTCTGCATgATGGCCCTCCTTATGCG AATGGCAGCALTCATATTGGTCACTCGGTTAACAAGATTCTGAAAGACATTATCATTAAGTCCAAAGGGCTIECTGGATATGACTCGCCGTATGTGCCTGGCTGGGACTGTCaTGGtCTGCCAATCGAACTGAAAGTAGAGCAAGAATACGGTAAGCCGGGGGAGATTCACCGCCGcT GAGTU CCGCGCCAAGTGCCGCGAATACGCTGCgACCCAGGTTGACGGTCAGCGCAAAGACTTTaTCCGTCTGGGCGTGCTGGGCGActgGTCGCACCCGTACCTGACCATGGACE TCAAAACTGAAGCCAACATCATCCGCGCGCTGGGCAAAATCATCGGCAACGGTCACCTGCACAA GGCGCGAAGCCGGTGCACTGGTGCGTTGACTGCCGTTCTGCACTGGCAGAAGCGGAAGUTSAGTATTACGCAAAACTECTCCGTCCATCGACGTCGCTTUCCAGGCGGTCGATCAGGATGCGCTGAAAACGAAATTTGGCGTAAGCAATETTAACGGCCCAATTTCGCEGGTTATCTGG 
aCCACCACGCCGTGCACGCTGCCTGCTAacCGCGCAATCTCCATEGCACCTGATTTTGALEATGCGCTGGTGCaAatCGACGGTCAGGCCGTGATCCTCGCGAAAGATCUGGtTGAAGCGTAAEGCAGCGTATCGGCGTTAGCGATTACACCATTCTTGGCACGGEGAAAGGTGCCGAG CtGGAACTGTTGCGCTTTACCCATCCGTTUATGGACETCGATGTTCCGGCATTCTCGGCGACCACGTTACGCTGGATGCCGGTACCGGTGCCG AGGCCACGGTCCGGCGACTATGTGATCGGTCAAAAATATGGTCTGGAAaCCGCTAACCCgGTTgGCCCGGAC GECACtTaTCTGCCGCGTACTTACCCGACTCUGGATEGCGTTAACGTCTTCAAAGCGAACGSTATTGTCATTGCGTTGTTGCAGGAAAAAGG TGTTGCA TTGAGAAAATGCAACACAGCTATCCGTCCTGCEGGCGTCATAAAACGCCGATCAUCTTCCGcgCGACGCCGCAG TGGTTCGTCASCATOGATCAGAAAGGTCTGCGTGCGCAGTCACTGAAAGAGATCAAAGGCGTGCAGTGGATCC STTGCTAACCGTCCTGACTGGTGTATCT CGTCaGCGTACCTGGGGGGTGCCGATGTCACTGTTCGTgCaCAaa GACACAGAAGAaCTGCA GTACTCtcAACTGa TGGAAGAAGTGGCAAAACGCGTTGAAGTEGAC CCTGG" TCCTCGGCGa AGTACCGGATACGCEGOATGTATGGTETGACTCCGGATCTACC CACTCTTCCGTTGTTGA CGGAATETGCCGGTC CATGTaTTGAGGTTCT cGTGgCT TCTCTACCGC CACGGCTTTACCGTGGATGGTCAGGGT CGCAAGATGTCTAAATCCA tAACaCCGTTTCGCCGCA AATAAACtGGGGCGG TGGCGAAATGGCC ACGEGCTGO ATCGTCGTATCCGTAACACCgCGCGC TTCCTGCTGGCAAACCTGAACSGTTtTGAECCGGCAAAAGTATGGTGAAACCGGAAGAGATGGTGGTACTGGI CGCTGGGCCGEAGGTTGTGCGAAAGCGGCACAGGAAGACATCCECAAGGCGTACGAAGCATACGATTTCCACGAAGTGGTCAGCGTCTGaTGCGCETCTGCTCC GTTGAGATGGGTTCCTTCTACCTCGACATCATCAAAGACCGTCAGTATACCGCCAAAGCGGCAGCGTGGCGCGTCGTAGCTGCCAGACTGCGCTGTATCACATCGCaGAAGCGCTGGTTCGCTGGATGGCACCAATCCTCTCCTTCaCcGCTGaTGAAGTGTGGGGt TaCCTGCCggGC GAACGTGAAAAATACGTCTTCACCGGCgAgTGSTACGAAGGCCTGETTGGTCTGGCAGACAGTGAAGCAATGAACGTGCGTTCTGGGACGAGCTGTTGAAAGTGCGTGGCGAAGTGAACAAAGTCaTTGAGCAAGCGCGTGCCGATAAGAACGTGGGGGGCTCGCTGGAAGCGGCAGTA ACCTTGTATGCAGAACCGGAaCTGGCGCGAaaCTGaCCGCGCTGGGCGAT GAATTACGATTTGTCCTGtTGACCTCCGCGCTACCGTTGCAGACEATAACGACGCACCTCCTGATGCCCAGCAGaGCGA GTCCTCAAAGGGCTGAAAgICGCGTTGAGTAAAGCCGAAGGtGaGAAG TGTCCtcGct GCTGCACTACACCCAGGATGTCGCAAGGTGGCGGACACGCAGAAATCTGCGGCCGCTGTGTCASCACGTCGCCGGTGACGGTGAAAAaCGTAAGTTTGCCTGA Non-protein region GCTTGCGCCAACGCCATTTCATCGCCATCCCGCCASCATACAGGCCTCGGAAGAACCATGGTGTTGGTGCCAACGGCC 
GACCATTTTTCGGTGCAGGCGCATGCCACAGATCGGCAACCATGTTTACGCAACGCAGATCGATTGCTGCAGITTGCGGATATTCTTCTTTGTCGATCC AGTTTTTGTTAATGGAEAAAECCA FKBP-type 16 kDa peptidyl-prolyl cis-trans isomerase ATGTCTGAATCTGTACAGaGCATASCGCCGTCCTGGTGCACTTCACGCTAAAACTCGACGAT GGCACCACCGCTGAGTCTACCCGCACACGGTAACCGGCGCTGTTCCGCCTG GTGATCCTTCTCTTTCTGAGGCTGGAGCAACACCTGCTGGGGCTGAAAGTGGECGATAAA ACCaCCTTCLCGCTGGAGCCAGATGCCGCgTTEGGCGTGCCGTCACCgGACCTGATECAGTACTTCTCCCGCCGTGAATTTATGJATGCAGGCGAGCCAGAAATTGGCGCAATCATECTTTTTACCGCAATGGATGGCAGTGAGATGCCTGGCGTGTCCGCgAAATTAACGGCGACTCC ATTACCGTTGATTTCAACCaTCCGCTGGCCGGGCAGACCGTTCATTTTGATATTGaagTGCTGGAATCGATCCGGCACTGGAGGCGTA thr operon leader peptide ATGAAACGCATTAGCaCCACCATTACCACCACCATCaCCATTACCACAGGTAACGGTGCGGGCTGA aspartokinase I/homoserine dehydrogenase I ATGCGAGE GTTGAAGTTCGGCGGTCATCASTGGCAAATGCAGAACGTDTTCTGCGGGTTGCCGATAttCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCCTGCCCCCGCCAAAATCACCAACCATCtGGTaGCGATGATtGaaAAaACCATTAGCGGTCAGGADGCt TTCcCATATCAGCGATGCCGAACGTATTTTTGCCGAACTICTGACGGACTCGCCGCCGCCCAGCCGGGATTTCCGCTGGCACAATTgAAAACTTTCGTCGACCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCatCAGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCT GCGCTGATTTGCCGTGgCGAGAAAATGTcGaTcgccattaTGGCCGGCGTGTTAGAAGCGCGTGGTCACAACGTTACCGTTATCGATCCGGTCGAAAAACTGCTGCAGTGGGTCATTACCtCgAaTCTACCGTTGATaTtGCTGAATCCACCCGCCGTATTGCGGCAAGCCGCATTCCg GCTGACCACATEGtGCTGATGGCTGGTTTCACTGcCggTAATGAAAAAGGCGaGCTGGEGGTECTGGGACGCAACGGTTCCGACI CGGTCCTGGCGGCCTGTTTACGCGCCGATTGTTGCGAgaTCTGGACGGATGTTGACGGTGTTTATACCTGCG CGCGTCAGGTG CCCGATGCGAGGTTGTTGAAGTCGATGTCCTATCAGASGCGATGGAGCTTTCTTACTTCGGCGCTAAA CacccccccAC ATCGCCCAG CAGATCCCTtgCCUGATTAAAAATACCGgAAAECCCCAAGC TACGCECATTG AGCCGTGAT GAAGACGAATTACCGGTCAAGGGCATTTCCAATSTGAATACATGGCAATETTCAGCGTTTCCGCCCCGGGGADGAAAGGGATggTTEGCATGGCGGCGCGCGTCTTTGCAGCGTGTCACGCGCCCGTaTTUCCGTGGTGCUGATTACGC TCTTCCGAAT GTATCAGTTTC TGCGTTCCGCAAGCGACTGTGTGCGAGCTgAaCGGGCA TGCAGGAAGAGETCTACCTGGAaCTGAAGAAGGCTTACTGGAGCCGTTGGCgGt GACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGCGCAC taCGTGGGAUCTCGgCGAAATECTETGCCGCGCTS 
GCCCGCGCCAATATCAACATTGTCGCCATTGCtCaGGGaTCTTTGAaCGCTCAAUCTCTGTCGTGGTCATAACGATgATGCGACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGgCGTCGGTGGCGTTGCCGGTGCGCTG CTGGAGCAACTGAAGCGTCAGCAAAGCTGGTTGAAGAATAA CATATCGCTTACGTGTCTGCGGTGTTGCTAACTCGAAGECACEGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCcAAAGAGCCGTTTAATCTCGCGCGcTtAATTCGCCTC GTGAAAGAATATCATCTGCEGAaCCCGGTCATTGTTGACTWTACTTCCACCAGGCTGTGGCAGATCAATATECCGACTECCTgCGCGAAGGTTTCCACGTTGTTACGCCGAaCAAAAAGGCCACACCTCGTCSATGGATTACTACCATCAGTIGCGTTATGCGGCGGAAAAATCGCGG CGTAATTCCTCLATGACACCACGTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCEGGTGATGAATTGATGAAGTTCTCCGGCATTCTTTCAGGTTCGCTTTCTTALATCTTCGGCAAGTTAGACGAAGGCaTGAGTETCTCCGAGGCGACCACACTGGCG CGGGAAATGGGTTATACCGAACCGGACCCGCGAGATGATCTTECtGGTATGJATGTGGCGCGTAagCTADTGATECTCGCTCGTGAAACGGGACGTGAACTGGAGCEGGCGGATATTGAAATTGAACCTgTGCTGCCCGCaGaGTTTAACGCCGAGGGTGATGTCGCCGCTTTTATGGCG AATCTGTCACAGCTCGACGaTCECTTTGCCGCGCGTGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATALOGATGAAGATGGCgTCTGCCGCGTGAAGTTGCCGAAGTGGATGSTAATGaTCCGCTGTTCAAAGTGAaAaATGGCGAAAACGCCCTGGCCTTC TATAGCCACTATLATCAGCCGCTGCCGTTGGTACTGCGCGGATATGGTGCGGGCATGACGTTACAGCTGCCGGTGTCTTTGCTGATCTGCTACGEACCCTCTCATGGAGTTAGGAGTCTGA homoserine kinase ATGGTTAAAGTTTAEGCCCCGGCUTCCAGTGCCATATGaGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGTTGATGGTGCATTGCTCGGAGATGTagTcaCGGTTGAGGCGGCAGAGACATTCASTCTCAACAACCTCGGACGCTTTGCCGAEAAGCTGCCGTCAGAGCCA CGgGaaAATATCGTTLATCAGTGcTGGGAGCGTTTTTGCCAGGAGCTTGGCAAGCAAATTCCAGTGGCGATGACTCTGGAAAAGAATatGCCGAECOGTTCGGGCTTAGGCTCCAGCGCCGTTCAGTGGTCGCGGCgCTgAt GGCGATGAATGAACACTGCGGCAGCCGCTTAATGAC ACTCGTTTGCTGGCTTUGATGGGCGAGTTGGAAGGGCGTATCTCCGGCAGCALTCATTACGACAACGEGGCACCGTGTTtTCETGGTGGTATGCAGTUGATGATCGAAGAAACGACATCATCAGCCAGCAGTGCCAGGGTTTGATGAGEGGCTGTGGGTGCTGGCGTATCCGGGGAET AAAGTCECGCGGCAGAAGCCAGGGCTaTTTTACCGGCGCAGTATCGCCGCCAGGATTGCATTGCGCACGGGCGACATCTGCCAGGCTTCATTCACGCCTGCTATTCCCGTCAGCTTGAGCTTGCCGCGAAGCTGATgAAAGTGTTATCGCTGAACCCTACCGTGACGGTTACTGCCA 
GGCTTCCGGCAGGCGCGGCAGGCGGTTGCGGAAATCGGCGCGGTAGCGAGCGGTATCTCCGGCTCCGGCCCGACTETGTTCGCTCTGTGEGACAAGCCGGATACCGCCCAGCGCGTTGCCGACTGGTTGGGTAAGAACtACCTGCAAAATCAGGAAGGTTTTGTTCATATTTGCCGGCTG GATACGGCGGGCGCACGACTACTGGAAAACTAA threonine synthase ATGAAACTCtacaATCTGAAAGATCACAATGAGCAGGTCaGCTTTGCGCAAGCCGTAACCCAGGGOTTAGGCAAAAATCAGGGGCtGTTTTTTCCGCACgaCCTGCCGGaaTTCAGCcTgACTGAAaTTGATGAGATCCTGAAGCtGGATTTTGTCACCCGCAGTGCGAAGATCCTCTCg GCGTTTATTGGTGATGAAATCCCGCAGGAAaTCCTGGAAGAGCGCGTACGTGCGGCGTTTGCCTTCCCGGCTCCGGTCGCCAATGTTGAASGCGATGTCGGTTGTCTGGA TTGTTCCACGGGCCAACGCTGGCATTTAAAGATTTCGGCGGTCGCTTTATGGCACAAATGCTgACCCAT ATTGCGGGCGATAAGCCAGTGACCATTCTGACCGCGACATCCGGTGATACTGGAGCGGCAGTGGCTCATGcTTTCEACGGTETACCGAATGTGAAAGTGGTTATCCTCTATCCACGAGGCAAAATCAGTCCACTGCAAGAAAAACTgTTCTGTACATTG9GCggCAATATCGaAACTGTT GCCATCGAcggCGaTTTCGATGCCTGTCAGGCGCTGGTGAAGCAGGCGTTTGATGATGAAGAACTGAAAGTGgCgCtGGGGCE GAATTCTGCTAACTCCATCAACATCAGTCGCTTGCTGGCGCAGATTTGTTATTACTTTGAGGCTGTCGCACAGTE GCCGCAAGAAGCACGTAACCAG TTGGTTGTCTCGT CCGAGTGCAAACETCGGCGATETGACGGGGGGTCTGCTGGCGAaGTCACTCGGTCDGCCGGTAAAACGTETTATTGCE GCGACCAACGTGAACGALACCGTACCACGTTTCCTGCaCGCGGTCAGTGGTCACCCAAaGCGACTCAGOCGAcgTtaTCCAATGCG ATGGATGTTAGCCAGCCAACAACTGGCCGCGTGTGGAAGAGTTGE TCCGCCGCAAAATCTGGCAACTGAAAGAGCTGGETTATGCAGCCGTGGATGATGAAACCACGCAACAGACAATGCGTGAGE TAALAGAACTGGGCTATACCTCGGAGCCGCACGCTGCCGTAGCTTATCGTGCG CTGCGTGACCAGTTGAAECCAGGCGAATATGGCTTGTU CCTCGGCACCGCGCATCCGGCGAAatTtAAAGAGAGCGTGGAAGCGATTCTCGGTGAAACGTTGGatCTGCCAAAAGAGCTGGCAGAACGTGCTgATTTACCCTTGCTTTCGCATAACCTGCCCGCCGATTTTGCTGCGTTG CGTAAatTgaTGATGAaTCATCAGTAA hypothetical protein AtGCAGCCCGGCTETTTTTATGAAGAAAATATGGAGaAaAACGACagGGAAAAAGGAGAAATTCECAATAAATGCGGEAACTTAGASATTAGGATTGCGGAGAAT ACAACTGCCGTTCTCATCGCGTAATCTCCGGATATCGACCCATAACGGGCAATGATAAAAGGAGTAACCTGTGA Non-protein region aAAAACTGCTGGAAACAATGAAAGACGTACCGGACGACCAACGTCAGGCGC transaldolase B ATGACGGACAAATTGaCCTCCCTTCGTCAGTACACCACCGTAGTGGCCGACACTGGGGACATCGCGGCAATGAAGCTGTATCAACCGCAGGATGCCACAACCAACCCTECTCTCATTCTTAACGCAGCGCAGATTCCGGAATACCGTAASTTgATTGATGATGCTGTCGCCTGGGGAA 
CaGCAGAGCAACGATCGCgCgCAGCAGATCGEGGACGCGACCGACAAACTGGCAGTAATATTOGTCT GAAaTCCTGAAACTGETTCCGCGCCgTATCTCAACUGAAGTUGATGCGCGTCTTTCCTATGACaCCGAAGCGTCAATTGCGAAAGCAAAACGCCTGATCAAACTCTACAAC GATGCAGGTATTAGCAACGATCgTaTTCTGATCAAACTGGCTTCTACCTGGCAGGGTATCCGTGCTGCAGAACAGCTGGAAAAAGAaCGTATTAACTGTAACCTGACCCTGCTgt TCTCct TCGCCAGGCTCGTGCTTGTGCGGAGCGCGCGTGTTCCTGaTCTCGCCGTTTGTTGGC CGTATTCTTGACTGGTACAAaGCGAATACCGATAAGAAAGAGEACGCTCCGGCAGAAGATCCGGGCGTGGTTTCTGTatCtGAAATCEACCAGEACTACAAGAGCATGGTTaTgAAACCGTGGTTATGGGCGCAAGCTTCCGTAACATCGGCGAAATTCTGGAACTGGCAGGCTGCGAC CGTCTGACCat CGCACCGCACTGCTGAAAGAGCTGCGGAGAGCGAAGGGGCTATC,AACGTAAACTOTCTTACACTGgTGAAGTGAAAGCgCGTCCGGCGCGTATCACEGAGE CCGAGTTCCTGTGCAOCACAACCAGGATCCAATGGCAGTAGATAAACTgGcGGAGETATCCGT AAGTTTGCTGTTGACCAGGAAAAACTGGAAAAAATGATCGGCGATCTGCEGTAA molybdopterin biosynthesis mog protein ATGAATACTTTACGTATTGGCTTAGTtTcCaTCTCTGATCGCGCATCCAGCGGCGTTTALCAGgaTAAAGGCATCCCTGCGCTGGAAGAATGGCTGACAUCGGCGCTAACCACGCCGTTTGAaCTGGAAACCCgcTTaATCCCCGATGAGCAGGCGATCATCGAGCAAACGTTGTGTGAG CTGGTGGATGAATGAGETGCCATCTGGTGCTCACCACGGGCGGAACTGGCCCTGCGCGTCGTGACETAACGCCCGATGCGACGCTGGCAGTAGCGGACCGCGAGATOCCAGGCTTTGGTGAACAGATGCGCCAGATCAGCCTGCATTTTGTACCaaCTGCGATCCTTTCGCGTCAGGTS gGGGTgATTCGCAAACAGGCGCTGATCCTTAACTT CCCGGTCAACCGAAGECTATTAAAGAGACGCE GSAAGGTGEGAAGGACGCTGAGEGTAACGTTGTGGTGCACGSTATTTTTGCCGCGTaCcGTACTGCATTCAGTTGCTGGAAGGGCCATACGTTGAACGGCaCCgGaAGTG GTTGCAGCATTCAGCCGAAGAGTGCAGACGCGAAGE TAGCGAATAA chaperone protein DnaK a TGGGTAAAATAaTTGGTATCGACCTGGGTACTACCAaCTCTTGTGTagCGATTAEGGATGGCACCACTCCUCGTGEACTGGAGAACGCCGAAGGCGATCGCACCACGCCTTCTATCATTGCCTATACCCAGGADGGTGAAACTCTGGTTGGTCAGCCGGCTAAACGTCAGGCAgt GACG AACCCGCAAACACCCTGTU TGCGATTAAACGCCUGATTGGCCGCCGCTTCCAGGACGAAGAAGTACAGCGXGATOTTTCCATCATGCCGTTCAAAATTAUTGCTGCtgatAACGGCGACGCATGGGTCGAAGETAAAGGCCAGAAAATGGCACCGCCGCAGAUCTCTGCTGAAGTGCTG AAAAAAAEGAAGAAAACCGCTGAAGATTACCTGGGTGAACCGGTAACTGaAGCTgt TATTACCGTACCGGCATACTEtaACGATGCTCAGCGTCAGGCAACCAAAGACGCAGGCCGTATCGCTGGTCTGGAAGTAAAaCGTATCATCAACGAaCCGACCGCAGCTGCGCTGGCTEACGGt 
CTGGACAAAGSTACTGGCAACCgtACTATCGCGGTTTATGACCTGGGTGGTGGTACTTTCGATATTTCCATTATCGAATCGACGAAGTTGACGGCgAAAAAACCUECGAAGTTCTGGCAACCAACGGTGATACCCACCTGGGTGGEGAAGACTTCGACAGTCGTCTGATCAACTACTG GTTGAAGAATTCAAGAAAGATCAGGGCATTGacCtGCGCAACGaTcCGCTGGCAATGCAGCGCCTGAAGAAGCGGCAGAAAAAGCgAAAATCGAACTGTctTCCGCTCAGCAGACCGCGTTAACCTGCCGTACATCACTGCAGACGCGACCGGTCCGAAACACATGAACATCAASTG act CGTGCGAAACTGGAAAGCCTgGtTGAAGAUCTGGTAAACCGt TCCATTGAGCCGCTGAAAGTTGCACTGCAGGACGCTGGCCTGTCCGTATCTGATAECGACgaCGTTATTCTCGTTGGTGGTCAGACTCGTATGCCAATGGETCAGAAGAAAGTTGCTGaATTCTTTGGTAAAGAG CCGCGTAAAGATGTTAACCCGGACGAAGCTGTGCCATCGGTGCTGCTGTTCAGGGTGGTGTTCTGACTGGEGACGTAAAAGaCGTacTGCTgCtGGACGTTACCCCGCTGTCtCTGGGTATCGaAACCaTGGGCGGTGTGATGACCACGCTGATCGCgAAAACACCACTATCCCGACC AaGcAcaGCCAGGTGTTCTCTACCGCTG AGACAACCAGTCTGCGGTAACCATCCATgtGCTGCAGGGTGAACGTAACGTGCGGCTGATAACAATCTCTggSTCAGTTCAACCTGGATGGTATCACCCGGCACCGCGCGGCATECCGCAGATCGAAGUTACCETCGATATCGTGCT GACGGTATCCTGCaCGTTTCCGCGAAAGACAAAAACAGCGGTAAAGAGCAGAAGATCACTATCAGGCTTCTTCTGGECTGAaCGAAGAT GAAATCCAGAAAATGGTACGCGCGCAGAAGCTAACGCCGAAGCTGACCGTAGTTTGAAGAGCTGGTACAGACtcGCAACCAGGGCGAC CATCTGCTGCACAOCACCCGTAAGCAGGTTGAAGAAGCAGGCGACACACTGCCGGCTGACGACAAAACTGCTATCGAGTCTGCGCTGACTGCACTGGAAACXGCTCTGAAGGTGAAGACAAAGCCCTATCGAAGCGAAAATGCAGGAACTGGCACAGGTTTCCCAGAAACTGATGGAA ATCGCCCaGCAGCAACATGCCCAGCAGCAGACTGCCGGTGCTgATCCTTCEGCAAACAACGCGAAAGATGACGATGTTGTCGACGCEGAATTTGAAGAAGTCAAAGACAAAAAATAA chaperone protein Dnaj GTGCatTCatCTAGGGGCAATTTAAAAAAGATGGCTAAGCAAGATTATTACGAGTTTTAGGCGTTTCCAAAACAGCGGAAGAGCGXGAaa TCAAAAAGGCCTACAAACGCCTGGCCATGAAa TACCaCCCGGaCcGTAACCAGGSTGACAAAGGGCCGAGGCGAAATTTAAAGAGATC AAGGAGCTTATGAAGTTCTGACCGACECGCAAAAACGTGCGCATACGATCAGTATGGTCATGCTGCGTTTGAGCAAGGTGGCATGGGCGGCGGCGGETTTGGCGGCGGCGCAGACTTCAGCGATAETTETGGTGACGETTTCGGCGATATTTTTGGCGGCGGACGTGGTCGTCAACGT GCGGCGCGCGGTGCTGATTTACGCTATAACATGGAGE CACCCLCGAAGAAGCTGTACGUgGCGtGaCCAAAGGATCCGCATECCGACTCEGGAAGAGTGTGACGTTTGCCACSGTAGC GTGCAAAACCAGGTACACACC CAGACCTGTCCGACCTETCATGGTTCTGGCCAGGEG CAGATGCGCCAGGGTTTCTTT 
AGACCTgTCCACACTGTCAGGGCCGCGGTACGCTGaTCAAAGATCCGTGCAACAAATGTCATGGTCATGGTCGTGETGAGCGCaGCAAAACGCTGTCCGTTAAAATCCCGGCaGGGGTGGACACTGGAGACCGCATCCGTCTTGCGCGC GAAGGTGAAGCGGGTGAACACGCCGCACCGGCAGGCGATCTETACGTTCAGGTECAGGTTAAACAGCACCCGATTTTCGAGCGTGAAGGCAACAACCTGTATTGCGAAGTCCCGATCAACTTCGCTATGgCGGCGCTGGGTGGTgaAATCGAAGTACCGACCcTTGATGGTCGCGTCaaA CTGAAAGTGCCTGGCGAAACCCAGACCGGTAAGCTGETCCgTaTGCGCGGTAAAGGCGTCAAGTCUGTCCGCGGTGGcgCACAGGGTGATTEGCTATGCCGCGTTGTTGTCgaAACACCGGTAGGTTTGAACAGAAGCAGAAACAGCTGCTGCAAGaGct GCAAGAAAGCETTGGTGGC CCAACCGGCGAGCACAACAGCCCGCGTTCAAAGAGCETCTTUGATGGCGTGAAGAAGTTTTTTGACGaCCTGACTCGCTAA hypothetical protein TTGCTCTTCTCGGATTCGTAAGCCGTGAAAACAGCACCTCCGtCTGGCCAGTTCGGATGTGAACCTCACAGAGGTCTTTTCTCGTTACCAGCGCCGCCACTACGGCGGTGATACAGATGACGATCAGOGcgACaAtcAtCgCcTTATGCTGCTTCATTGCTCECTECTCCTTGACCTT TCGGTCAGTAAGAGGCACTCTACATGTGTTCTGCATATAGGGGGCCTCG9GTUGATGETAAAATAT CACTCGGGGCTTTTCTCTACTGCCGTTCAGCTAATSCcTGA hypothetical protein aTGTCTGCCAAaa GACGACTTCTTATTGCGTGTACCTTGATACAGCTATCTATCAETTTCCTGCATATTCTTCATTASAATATAAAGGAECCTTTGGTTCAATAATGCGGGTTATGCAGACTGGAATAGTGGETTTETAACACTCACCGTGGTGAAGTATGGAAAGTGACEGCGGAT TTTGGGGTAATTTTAAAGAAGCAGAATTTTACTCAUTTTATGAAGTAATGTACTCAATCATGCTGTAGCAGGGAGAAATCATACOGETTCAGCAATGACGCATGTCAGACTCUTTGactCTGATATGACATTCTTTGGCAAAATTTTGOCCAATGGGATAACTCATEGggTGAGAT CTCGACATGTTTTATGGATTCGGTTACCTCGGCTGGAACGGCCAGTGGGGCTTTTTTAAACCGTATATTGGATTGCATAATCAATCTGGTGACTACGTATCAGCTAAATATGGTCAAACGAAT GTTGGAATGGETATGTTGTTGGCTGGACAGCAGTATTACCATTTACGTTATTTGAC GAAAAATTTGTTTTATCTAACTGGAATGAATAGAACTGGACAGGACGATGCTTACACGGAGCAGCAATTTGGCCGGAACGGETTaAaTGGCGGETTAACTATTGCCTGGAAGTTCTATCCTCGCTGGAAAGCCAGTGTGACGTGGCGTTATTTCGATAALAGCTGGGCTACGATGGC TTTgGcgaTCAAATGATTTALATGCTTGGTTATGATTTCEAA putative secreted sulfatase ATGCAGAAAACGTTAATGGCCAGTTTGATCGGCCTTGCAGTTTGCACAGGGAALGCTTTTAGECCTGCCTTAGCCGCAGAGGCTAAACAACCTAATTTAGTCATtaTTATGGCGGTGATE TAGGTtaTGGCGAETTAGAaCaTATGGTCATCAGATCGTTAAAACACCIAATATCGAC 
AGGCUTGCCCAGGAAGGGGTCAATTEACTGACTACTATGCCCCCGCTCCTTLAGTTCACCTECACGCGCaGGGCTATTAACCGGCCGGATGCCATTLCGTACTGGAATTCGCTCATGGATECCtt SGCAAAGATGTTGCCD TAGGGCGTAACGAAC TCACGATTGCTAaTCTACTC AaAgCGCAAGGGTACGACACOCAATGATGGGTAAGCTGCATCTGAATGCAGGCGGCGaTCCC CCACAAGCACACATATGGGcTTTGATTACTCACTGGTTAATACSGCGGG GCCACGCTC TAAAGAACGCCCGCGTTATGGCATGGTT tACCCGACAGGCtgGCUACGTAACGGGCAACCCACTCCACGaGCTGATAAAAEGAGCGGTGAGTATGTCGTTCGGAAGTCGTCAACTGGCTGGATAACAAAaaGGACaGCAAGCCTTTCTT TTGCTTTTACCGAAG CATAGCCCCCTGGCTTCGCCCAAAaaATACCTC GATTGTCTCACATATATGAGCGCGTATCAG GCATCCTGATTTAUTTTATGGCGACTGGGCASACAAACCCTSGCGT GTGTGGGGGAATAT AATATCAGCTATCUGGAT AGGTTGGAAAAGTGCTGGATAAAA AAAGCTGTGGgtGaGaaGaTAACACA ATCGTTATTTTTACCAGTGatAACGGTCCgGTAaCGCGTGAAGCGCGCAAAGTGTATGAGCTGAATTTGGCAGGGGASACGGA TGGATTACGCGGTCGCAAGGATAACCTTTGGGAAGGCGGAATTCGTGTTCCAGCCATTATTAAATATGGTAAACATCTACCACAGGGAATGGTTTCA GATACACCCGTTTATGGUCTOGACTGGATGCCTACETTaGCgAAAATGATGAACTTCAAATTACCTACAGACCGTACTTTCGATGGTGAATCGCTGGTTCCTGtTcTTGAGCAAAAGCATTGAAACGCGAAAAGCCATTAATTTTCGGGATTGATATGCCATTCCAGGATJATCCAACC GATGAATGGGCGATCCGTGATGGTGACTGGAAGAT GATTATCGATCGCATAATAAACCGAAATATCTCTACAATCTGAAATCTGATCGTTATGAAaCacTTaAtCTGATCGGTAAAAAAACAGATATTGAAAAACAGATGTATGGTAGETTETAAAATATAAAACTGATATTGATAAT GATECTCTAATGAAAGCCAGAGGTGATAAACCAGAAGCGGTGACCTGGGGCTAa putative cytoplasmic protein ATGTTTACCAMCGTAAATGTTGATTGTTGCAAAACACCAGGAUGTAAaaACCTGGGGTTGCTGAATAGCCAGGATTATGTCGCACAGOGTaAaAATATTTTATGCCGTGAATGT GTTCTTGTVTCCAGUGATATCTGAACAGTCGCTTAALATTTATCGTAATATTGTGAAUCACTCC TGGAGAGGTTTGATTTGCCAATGTTCAACTEGCGGAGGCACGTCCCTCAAAAAATATGJATATECtGCACAagGCCAGAGAAGAATSTATTGCCATCAETGTGAGAAAACATTLATCACTCTGGAACAUGTAATTACCACACCACGAGGAGCCCTGTTAGCATTGATGATTGAGCAAGGG GAGGCACTTGCGGATATCAGAAAGTCATTACGTCTTAACA SGACTTAGCCGTGAACTGTTAAAATTAGCGCGTGAAGCAAACTATAAAGAAAGTCGACAGTGTTTCCCTGCTTCTGATATTACCCTGAGEACCCGCGCTTtTCGCGTCAAGTATAATGGTAGCAATAACTCTCTT TATGCTCTTGTTACCGCAGAAGAACAAAGGGCAGGGTGGTTGcCaTCTCAACCAATTACTCCCCATC 
GCCGTAGagCaaCATTATCAATACCATCGAACUATGAAGAGCGTATGTCTCCAGGGACGCTGGCACAECATGTCCAGCGCAAAGAGEDACTTACTATGCGGCGGGATACC TTGTTTGATATTGATTACGGCCCGCCAGTTTTACATCAAAACGATCCGGGAATGE TGGTAaAaCCGGTTCTTCCGGCATATCGTCATTTTGAACTGGTCAGAATACTGACCGATGAGCATECCAACAACGTTCAGCATTACCTTGATCACGAATGCTTTATATTGGGCGGCTGCCTGATG GCTAATTTGCAGCATATTCATCAGGTCGCTGCCATATTTCcTTTGTCAA GAGCGCGGTGTGGCACCCGCCACCATTGTTTTCCACCGCGATEATTCCTTAGTgGt GGGGTA GAAATAATGTCTGGCGTGCATTTTCTAACCGCAATTATTCATGGCTGTATGCAAUCTCaCTGGC AGTAAGAAAGTCCGCGAGATGCGGCATGCAACATEGAACAGTGCGACGCgTTITATCCACTTTGTGCaGAACCATCCTTTCCTTATATCATTGAACCGAATGUCTCCTGCGaaTGTCgETTCTACATTAGATATCCTCAAACATCTGTGGAATAaAaAACTAGASCATGGAACAATTIAA