Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

I need help understanding this project. I don't get how Rabin-Karp or a brute foroce substring works. I have posted the shell code and missing

I need help understanding this project. I don't get how Rabin-Karp or a brute foroce substring works. I have posted the shell code and missing code from the follwoing methods. I don't just want code but some understand of how this works and how to think/figure out how to code it.

These methods are in the shell code with details regarding implementation of the methods but these are the ones I need help with:

public void checkMatch(DNA matchCandidate)

private int rabinKarp(DNA substrandCandidate)

private int rabinKarpHashes(int[] hashes, int pos, int length)

private void setStrand(java.lang.String strandIn) throws InvalidDNAStrandException

public int dnaHash()

public static int charNumericValue(char molecule)

public DNA dnaSubstrand(int pos, int length)

private void setStrand(java.lang.String strandIn) throws InvalidDNAStrandException

public DNA(java.lang.String strandIn)

---------------------------------------------------------------------------------------------------------

package dna;

/**

* Used to find the best sequential match for a DNA strand.

* Holds a master DNA strand and accepts candidate strands, storing the best match that has been tested.

* Best matches are determined by the longest common substring.

* Longest common substrings are determined by using a recursive approach to the Rabin-Karp algorithm.

*/

public class DNAMatcher extends java.lang.Object

{

/**

* Master strand of DNA to be matched.

*/

private DNA masterStrand;

/**

* Current best DNA match found.

*/

private DNA bestMatch;

/**

* Longest common substring match between master strand and current best match DNA.

*/

private DNA substrandMatch;

/**

* DNAMatcher constructor.

* Sets the master strand to the DNA passed. Nulls current best match and substrand match.

* @param masterStrandIn Incoming master DNA strand for this DNAMatcher.

*/

public DNAMatcher(DNA masterStrandIn)

{

setMasterStrand(masterStrandIn);

setBestMatch(null);

setSubstrandMatch(null);

}

/**

* Returns the DNAMatcher's masterStrand.

* @return DNAMatcher's masterStrand.

*/

public DNA getMaster()

{

return masterStrand;

}

/**

* Sets the private field of masterStrand.

*/

public void setMasterStrand(DNA masterStrandIn)

{

masterStrand = masterStrandIn;

}

/**

* Returns the DNAMatcher's bestMatch.

* @return DNAMatcher's bestMatch.

*/

public DNA getBestMatch()

{

return bestMatch;

}

/**

* Sets initial value to null.

* @param holder Sets bestMatch field variable.

*/

public void setBestMatch(DNA holder)

{

bestMatch = holder;

}

/**

* Returns the DNAMatcher's substrandMatch.

* @return DNAMatcher's substrandMatch.

*/

public DNA getSubstrandMatch()

{

return substrandMatch;

}

/**

* Sets initial value to null.

* @param holder Sets substrandMatch field variable.

*/

public void setSubstrandMatch(DNA holder)

{

substrandMatch = holder;

}

/**

* Formats this DNAMatcher into a String. If a match has been found returns a String in the format:

*

* Master: MASTERSTRAND

* Best Match: BESTMATCH

* Matching Substring: SUBSTRANDMATCH

*

* If a match has not been found returns a String in the format:

* Master: MASTERSTRAND

* NO MATCH FOUND

*

* @return Formatted String of this DNAMatcher.

*/

@Override

public java.lang.String toString()

{

}

/**

* Given a DNA match candidate, checks if the passed DNA candidate has a longer common substring than is currently stored.

* Uses the Rabin-Karp algorithm for substring matching.

* Performs this by breaking the candidate strand into all possible substrings and sending them to the RabinKarp method.

* This is a bit, Brute Force (there are more efficient but complex ways of doing this), however the work is mitigated

* by eliminating many possible substrings. A substring from the candidate that is longer than master strand can be skipped.

* All substrings whose length is shorter than the current best match are also skipped.

* Additionally, substrings of the candidate are processed from longest to shortest and stops once a match is

* found and doesn't bother with the shorter substrings.

*

* @param matchCandidate Candidate for matching master strand.

*/

public void checkMatch(DNA matchCandidate)

{

// NEED CODE HERE

}

/**

* Implements Rabin-Karp alogrithm for finding longest common substring. Does the preprocessing of finding the hash for

* the candidate's substrand using the stringHash method and the hashes of substrings in the master strand String of

* the same length as the candidate substrand using the recursive RabinKarpHashes method. Uses a search to determine

* if the substring hash is in the collection of hashes and returns the result.

*

* @param substrandCandidate DNA substrand to find in the master strand.

* @return Location in master strand if candidate substrand is found. If not found returns -1.

*/

private int rabinKarp(DNA substrandCandidate)

{

// NEED CODE HERE

}

/**

* Finds and stores the hash values of all substrings of passed size in the master strand. Starts at the passed position,

* and works backwards through the master strand's String. The hash values are stored in the passed array.

* The position in the array corresponds to the position of the substring in the master strand's String.

* THIS METHOD MUST BE RECURSIVE USING THE TECHNIQUE AS DESCRIBED IN THE RABIN-KARP LECTURE.

*

* @param hashes Array of hashes to be populated.

* @param pos Current position in master strand's String to be hashed.

* @param length Length of the substrings in the master strand to be hashed.

*

* @return Returns the hash of the substring in the master strand at the given position of the given length.

* This must be returned so that a general case hashing a substring can recursively call upon the

* method to get the hash of the substring that is found before it.

*/

private int rabinKarpHashes(int[] hashes, int pos, int length)

{

// CODE HERE

}

}

-------------------------------------------------------------------------------------------------------------------------------------------

package dna;

/**

* DNA strand. Internally represented by a String containing only valid DNA molecules as follows

* A: Adenine

* C: Cytosine

* G: Guanine

* T: Thymine

* Capable of handling a DNA strand of a maximum of 15 molecules.

*/

public class DNA extends java.lang.Object

{

/**

* Holds the molecule strand characters.

*/

private java.lang.String strandString;

/**

* Constructor - DNA constructor. Accepts a String which represents the molecule sequence.

* @param strandIn Molecule sequence for this DNA.

*/

public DNA(java.lang.String strandIn)

{

// NEED CODE HERE

}

/**

* Returns the molecule sequence String.

* @return strandString String.

*/

public java.lang.String getStrand()

{

}

/**

* Returns the number of molecules in the DNA strand.

* @return Length of the DNA strand.

*/

public int getStrandLength()

{

}

/**

* toString in class java.lang.Object

*

* @return String representation

*/

@Override

public java.lang.String toString()

{

//TOSTRING CODE

}

/**

* Sets the strand String for this DNA. Ensures the requested stand does not exceed maximum

* length or contain invalid molecules.

* @param strandIn Desired strand String.

* @throws InvalidDNAStrandException Throws an InvalidDNAStrandException if incoming strand is invalid.

*/

private void setStrand(java.lang.String strandIn) throws InvalidDNAStrandException

{

// NEED CODE HERE

}

/**

* Returns the hash value of the entire string for this DNA object. Since there are only four possible

* characters in the strand, it uses a base 4 number system to create the hash as described in the

* Rabin-Karp lecture. This will be needed only to find the hash of the substring that is being

* searched for and the base case for finding all substring hashes in the master strand string.

* This should not be used to find ALL substring hashes of the master strand.

* @return Hash of the string for this DNA object.

*/

public int dnaHash()

{

// NEED CODE HERE

}

/**

* Given the character of a molecule, returns its numeric mapping.

* 'A' - 0

* 'C' - 1

* 'G' - 2

* 'T' - 3

* @param molecule Character of molecule.

* @return Numeric mapping of passed molecule character.

*/

public static int charNumericValue(char molecule)

{

// NEED CODE HERE

}

/**

* Generates and returns a new instance of DNA that is a substrand of this DNA. The DNA generated

* starts at the given position and is of the given length.

* @param pos Beginning of the desired substrand of DNA.

* @param length Length of the desired substrand of DNA.

* @return New DNA that is a substrand of this DNA.

*/

public DNA dnaSubstrand(int pos, int length)

{

// NEED CODE HERE

}

}

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Design Application Development And Administration

Authors: Michael V. Mannino

4th Edition

0615231047, 978-0615231044

More Books

Students also viewed these Databases questions

Question

What are the Five Phases of SDLC? Explain each briefly.

Answered: 1 week ago

Question

How can Change Control Procedures manage Project Creep?

Answered: 1 week ago