
Experiment 1.

Q. What is the effect of the load factor on hash table performance?

Procedure:

Locate the loadFactor variable in the putExperiment method.

Change it to each of the values specified in the report table (see the sketch after this procedure).

Run the program and record the data.

Plot the data.

What conclusions do the data support?

Any other observations?
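For reference, the loadFactor variable is hard-coded inside putExperiment (double loadFactor = 0.75;) and sets the table size for each run via expTableSize = (int)(expSize / loadFactor). If re-editing and re-running the program for every value in the report table becomes tedious, one convenience (purely a sketch, not part of the assignment code) is to give putExperiment an extra loadFactor parameter and drive it from main:

    // Hypothetical driver for Experiment 1.
    // Assumes putExperiment has been modified to take the load factor as a third
    // parameter; the original code hard-codes loadFactor = 0.75 inside the method.
    public static void main(String[] args) {
        double[] loadFactors = { 0.75 /* replace/extend with the values from the report table */ };
        for (double lf : loadFactors) {
            putExperiment(8, "data/tale.txt", lf);   // minlen 8 and data/tale.txt, as in the original main
        }
    }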

Experiment 2.

Q. What is the effect of the hash function on hash table performance?

Procedure:

Run the program using the original hash function with the specified loadFactor.

Collect the data.

Change the hash function to be a really terrible (but valid) hash function; one possible choice is sketched at the end of this procedure.

(Be sure the value returned is in [0, M-1].)

(Save the original hash function code for Experiment 3.)

Run the program using the new hash function with the specified loadFactor.

Collect the data.

Plot the data.

What conclusions do the data support? Any other observations?
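For Experiment 2, any mapping that always lands in [0, M-1] will do. The sketch below is only one possible choice, not prescribed by the assignment; it is deliberately terrible because it squeezes every key into just two slots, so almost every put collides:

    // Experiment 2 sketch: a deliberately bad (but valid) hash function.
    // Every key maps to slot 0 or 1, so clusters form immediately,
    // yet the return value is always in [0, M-1] whenever M >= 2.
    private int hash(K key) {
        return (key.hashCode() & 0x7fffffff) % 2;
    }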

Experiment 3.

This hash implementation uses linear probing to resolve collisions. The probing starts at the location of the collision, which can lead to clustering.

Q. Can we reduce the effect of clustering by choosing a different starting place to perform linear probing?

Procedure:

Run the program using the original hash function and original put implementation with the specified loadFactor.

Collect the data.

Change the put function so that the starting point for linear probing is the location as far away as possible from the collision location (i.e., half the table size away). One way to implement this is sketched after this procedure.

That is: put will still do linear probing, but the starting point will not be the collision location.

Clearly comment your code to explain how it implements this scheme.

Run the program using the new put implementation.

Collect the data.

Plot the data.

What conclusions do the data support? Any other observations?
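One way to implement the Experiment 3 scheme is sketched below. This is only a sketch: it changes put exactly as the procedure describes (still linear probing, but starting half the table away from the collision slot) and keeps the putCount accounting so the compare counts stay comparable with the original. Note that get and delete in the class still probe from hash(key), so they would need the same adjustment if you want lookups to keep working; the put experiment itself only exercises put.

    // Experiment 3 sketch: start linear probing half the table away from the collision.
    public void put(K key, V val) {
        if (val == null) { delete(key); return; }

        int home = hash(key);              // the slot the key hashes to
        int i = home;
        if (keys[home] != null) {          // home slot occupied: either the key itself or a collision
            putCount++;                    // count this compare, as the original put does
            if (keys[home].equals(key)) { vals[home] = val; return; }
            i = (home + M / 2) % M;        // collision: jump half the table size away and probe from there
        }
        for ( ; keys[i] != null; i = (i + 1) % M) {   // linear probing from the new starting point
            putCount++;
            if (keys[i].equals(key)) { vals[i] = val; return; }
        }
        keys[i] = key;
        vals[i] = val;
        N++;
    }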

package CSC301SPSolutions; // change the package if you want

import stdlib.*;
import algs13.Queue;
import algs31.SequentialSearchST;

/* ***********************************************************************
 *  Textbook symbol table implementation with linear probing hash table.
 *
 *  Some modifications made to support empirical testing.
 *  See comments in the putExperiment method.
 *
 *  CSC 301 Fall 2018 version 1.0
 * *************************************************************************/

public class CS301LinearProbingHashST<K, V> {
    private static final int INIT_CAPACITY = 4;

    private int N;            // number of key-value pairs in the symbol table
    private int M;            // size of linear probing table
    private K[] keys;         // the keys
    private V[] vals;         // the values
    private int putCount;     // for experimental data collection
    private int getCount;     //

    // create an empty hash table with the default capacity (INIT_CAPACITY)
    public CS301LinearProbingHashST() {
        this(INIT_CAPACITY);
    }

    // create linear probing hash table of given capacity
    @SuppressWarnings("unchecked")
    public CS301LinearProbingHashST(int capacity) {
        M = capacity;
        keys = (K[]) new Object[M];
        vals = (V[]) new Object[M];
    }

    public void resetPutCount(int val) { putCount = 0; }   // for experimental data collection
    public void resetGetCount(int val) { getCount = 0; }   // for experimental data collection
    public int getGetCount() { return getCount; }          // for experimental data collection
    public int getPutCount() { return putCount; }          // for experimental data collection

    // return the number of key-value pairs in the symbol table
    public int size() { return N; }

    // is the symbol table empty?
    public boolean isEmpty() { return size() == 0; }

    // does a key-value pair with the given key exist in the symbol table?
    public boolean contains(K key) { return get(key) != null; }

    // hash function for keys - returns value between 0 and M-1
    private int hash(K key) {
        return (key.hashCode() & 0x7fffffff) % M;
        //ToDo experiment 2: modify this function as described in the instructions
    }

    // resize the hash table to the given capacity by re-hashing all of the keys
    private void resize(int capacity) {
        CS301LinearProbingHashST<K, V> temp = new CS301LinearProbingHashST<>(capacity);
        for (int i = 0; i < M; i++) {
            if (keys[i] != null) {
                temp.put(keys[i], vals[i]);
            }
        }
        keys = temp.keys;
        vals = temp.vals;
        M    = temp.M;
    }

    // insert the key-value pair into the symbol table
    public void put(K key, V val) {
        if (val == null) { delete(key); return; }

        // double table size if 50% full; commented out for experiments
        // if (N >= M/2) resize(2*M);

        //toDo experiment 3
        // Modify the code to choose an alternate starting point for linear probing
        // as described in the instructions
        int i;
        for (i = hash(key); keys[i] != null; i = (i + 1) % M) {   // linear probing
            putCount++;
            if (keys[i].equals(key)) { vals[i] = val; return; }
        }
        keys[i] = key;
        vals[i] = val;
        N++;
    }

    // return the value associated with the given key, null if no such value
    public V get(K key) {
        int i;
        for (i = hash(key); keys[i] != null; i = (i + 1) % M) {
            getCount++;
            if (keys[i].equals(key)) return vals[i];
        }
        return null;
    }

    // delete the key (and associated value) from the symbol table
    public void delete(K key) {
        if (!contains(key)) return;

        // find position i of key
        int i = hash(key);
        while (!key.equals(keys[i])) {
            i = (i + 1) % M;
        }

        // delete key and associated value
        keys[i] = null;
        vals[i] = null;

        // rehash all keys in the same cluster
        i = (i + 1) % M;
        while (keys[i] != null) {
            // delete keys[i] and vals[i] and reinsert
            K keyToRehash = keys[i];
            V valToRehash = vals[i];
            keys[i] = null;
            vals[i] = null;
            N--;
            put(keyToRehash, valToRehash);
            i = (i + 1) % M;
        }

        N--;

        // halve the size of the array if it's 12.5% full or less
        if (N > 0 && N <= M/8) resize(M/2);

        assert check();
    }

    // return all of the keys as an Iterable
    public Iterable<K> keys() {
        Queue<K> queue = new Queue<>();
        for (int i = 0; i < M; i++)
            if (keys[i] != null) queue.enqueue(keys[i]);
        return queue;
    }

    // integrity check - don't check after each put() because
    // integrity is not maintained during a delete()
    private boolean check() {

        // check that hash table is at most 50% full
        if (M < 2*N) {
            System.err.println("Hash table size M = " + M + "; array size N = " + N);
            return false;
        }

        // check that each key in the table can be found by get()
        for (int i = 0; i < M; i++) {
            if (keys[i] == null) continue;
            else if (get(keys[i]) != vals[i]) {
                System.err.println("get[" + keys[i] + "] = " + get(keys[i]) + "; vals[i] = " + vals[i]);
                return false;
            }
        }
        return true;
    }

    /* putExperiment
     *
     * determine the average number of compares required to 'put' a specified number of keys
     * into a hash table
     *
     * method:
     *   1) read in some strings from a file, store them in an auxiliary table
     *      to be used to populate our hash table
     *   2) put the desired number of keys from the auxiliary table into the hash table
     *      (the put method counts the number of compares required for each put)
     *   3) print the average number of compares
     *   4) repeat steps 2-3 for the other table sizes
     */
    public static void putExperiment(int minlen, String file) {
        // 'read in' data into an auxiliary ST
        // (this eliminates any duplicates and guarantees that every
        //  'put' adds a new value to the table)
        // other iterable collections could be made to work

        SequentialSearchST<String, Integer> sourceData = new SequentialSearchST<>();

        In in = new In(file);   // reading from the file passed to the function

        while (!in.isEmpty()) {
            String key = in.readString();
            if (key.length() < minlen) continue;   // only keys of at least minlen are used
            sourceData.put(key, 1);                // all values are set to 1, since we don't really
                                                   // use the values at all in the experiments
        }

        // do 10 experiments with expSize ranging from 100, 200, ..., 1000
        // for each experiment size, we create a table with extra space specified by the loadFactor
        // loadFactor: 0.50 means the table will be only half full; there is twice as much space as we need
        // loadFactor: 0.80 means the table will be 80% full; 20% of the space will not be used
        double loadFactor = 0.75;
        StdOut.format(" Put Experiment results. LoadFactor %5.3f%n", loadFactor);
        StdOut.format(" ----------------------------------------%n");

        for (int expSize = 100; expSize <= 1000; expSize += 100) {
            int expTableSize = (int) (expSize / loadFactor);
            CS301LinearProbingHashST<String, Integer> aHashST = new CS301LinearProbingHashST<>(expTableSize);

            aHashST.resetPutCount(0);
            // fill the hash table from the aux table
            int count = 0;
            for (String s : sourceData.keys()) {
                aHashST.put(s, 1);
                count++;
                if (count == expSize) break;   // all puts done
            }

            double avgNumComp = aHashST.getPutCount() / (double) expSize;
            StdOut.format(" expSize %4d average number of compares %6.2f%n", expSize, avgNumComp);
        }
    }

    /* *********************************************************************
     *  Unit test client.
     ***********************************************************************/
    public static void main(String[] args) {
        putExperiment(8, "data/tale.txt");
    }
}
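Before collecting data for Experiment 2, a quick sanity check that the modified hash function still yields a working symbol table can save confusion later. A minimal check, using only methods the class already has (the test itself is not part of the assignment code and can be dropped into main temporarily):

    // Sanity check: insert a few keys, then confirm each can be retrieved.
    CS301LinearProbingHashST<String, Integer> st = new CS301LinearProbingHashST<>(16);
    String[] words = { "alpha", "beta", "gamma", "delta" };
    for (int i = 0; i < words.length; i++) st.put(words[i], i);
    for (int i = 0; i < words.length; i++) {
        if (st.get(words[i]) == null || st.get(words[i]) != i) {
            StdOut.println("sanity check failed for key " + words[i]);
        }
    }
    StdOut.println("size = " + st.size());   // should print size = 4

(For Experiment 3 this check only passes if get is adjusted to match the new put, since get still probes from hash(key).)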
