Answered step by step

Verified Expert Solution

Link Copied!

Question

1 Approved Answer

Posted on Sep 22, 2024

DATA STRUCTURE...please help In this question, you will be implementing a hash table with a set of functionalities specified below. As you know, we already

DATA STRUCTURE...please help

In this question, you will be implementing a hash table with a set of functionalities specified below. As you know, we already covered this topic in class and even provided an implementation, which should help guide you. In fact, you should already be quite familiar with the details of that implementation. In your case here, you will be using a Python list to implement the internal table.

Each entry in the internal table must be a key_value_pair as defined for you:

key_value_pair = namedtuple( 'key_value_pair', 'key value' ) This namedtuple defines a special tuple with fields key and value. For example:

>>> kvp = key_value_pair( 'year', 2020 ) >>> print( f"--- key: {kvp.key}, value: {kvp.value}" ) --- key: year, value: 2020 In addition, when the hash table is full, your implementation will raise a HashTableFullError defined already as:

class HashTableFullError( Exception ): pass A set of probing method implementations will be provided to you below.

Your task is to complete or provide the entire implementation of each of the methods needed in the hashtable class.

All code cells below use file magic so that their contents are stored as separate files. Please do not change the file magic directives at the top of these code cells.

__init__( self, M, probing_method, null_key='', deleted_key='', table_name="slots" )already defined for you. You may add new code, but do not modify the code provided to you. The internal table shall be kept in the instance variable, table, for example, and so on. See the implementation provided for you further below. hash( self, key )already defined for you. Do not modify. is_available( self, index ) returns True if the specified slot at the given index does not already have an entry. If there is an existing entry at index, this method returns False. set_value( self, index, key, new_value ) sets the value of the slot at the given index to the (key, new_value) pair using the key_value_pair definition provided to you. Remember that the internal table must contain entries of type key_value_pair. get_key( self, index ) returns the key at the given index in the hash table. No error checking required. get_value( self, index ) returns the value at the given index in the hash table. No error checking required. keys( self ) returns all the keys in the hash table as a Python list values( self ) returns all the values in the hash table as a Python list __contains__( self, key ) returns True if the specified key, key, is in the hash table; otherwise, it returns False __str__( self )already defined for you. No need to modify. This method returns a string representation of the hash table entries using the DataFrame class of the pandas module. __repr__( self )already defined for you. No need to modify. search( self, key, is_verbose=True ) returns the index of the internal table slot for the specified key and a list of internal table slot indices in the order they were visited, if that key is present in the hash table. Otherwise, it returns None and slot indices visited in the order that they were visited. Note that there is a challenge in implementing this method. That is, when should the search end when the specified key is not in the hash table.

For example, say that some key, , hashes to slot 2 using linear probing, but, due to collisions, value was stored at index 9. When this method is invoked with as argument when the hash table is full, search will start at index 2 and last until and including index 9. Therefore, for , search must return the following:

>>> ht.search( K ) (9, [2, 3, 4, 5, 6, 7, 8, 9]) Suppose that, another key, , which hashes to index 10 but does not exist in the hash table. Then, given a full hash table, this method will return None along with all the slot indices the search procedure visited, without cycles!

>>> ht.search( 400 ) (None, [10, 11, 12, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) remove( self, key, is_verbose=True ) removes the hash table entry with the key, key, if that key exists. If the key does not exist, this method raises an AssertionError. A removed internal table entry should be replaced by key_value_pair( self.deleted_key, None ) and you code must be implemented such that hash table search procedure can still work properly. That is, a deleted entry must still behave as if it is still there while searching, but it must accept a new (key, value) pair during addition.

add( self, key, value, is_verbose=True ) adds the provided (key, value) pair as a key_value_pair namedtuple (defined above) to the hash table if the specified key is not there already. If the key already exists, it overwrites the existing value at that key with value. This method will also keep track of the number of collisions while attempting to add a new entry to the hash table in the instance variable named n_collisions. Each call to add() should not reinitialize this value. It should simply count up how many collisions were experienced during the lifetime of the current hashtable object. If the internal table is full, then this method should raise a HashTableFullError. rehash( self, new_M, probing_method=None ) rehashes the existing entries of this hash table according to the new M value. If a new probing method is specified, that method must be used. Otherwise, the existing probing method should be used. Additional Specifications You will develop your code from scratch. You cannot use any import statements except for the ones already present in the code provided to you. Additional Specifications and Suggestions You must develop your solution from scratch! That means, you are not allowed to use any import statements other than the ones provided for you below. The hashtable code imports namedtuple from collections, and it also imports the pandas module only for easier printingSee __str__() and __repr__() implementations.

Your implementation absolutely needs the definition of the namedtuple class. You may not use the pandas module other than what it has been used for in the code provided for you.

Even if your code uses a single import statement other than these allowed ones, you will receive zero automatically for this entire test! Failure to follow specifications is a major flaw on the part of an engineer, and, therefore, no excuses will be accepted, even if your existing code with disallowed import statements works perfectly! That is, it will not matter at all whether your code works given that you disregarded and violated provided specifications. In other words, anything that works is and will not be accepted! This test is not about you providing anything that works. It is about providing a correct solution, given these specifications. Your code will be tested automatically and whether you have used disallowed import statements will also be checked automatically.

Failure to see or comprehend these specifications cannot be used as excuse to ask for reevaluation for code that clearly violates specifications! Such an attitude will not be tolerated. Do what seems easiest first. Otherwise, you may run out of time if you spend a lot of time on the more difficult parts of this question. If you first work on the "easier" methods, at least, you will have some of the functions implemented and tested so that you can get some points. The most difficult methods to write are search() and add(). Plan accordingly and use your time wisely. No time extension will be provided. Please do not ask for it. So do as much as you can within the 3 hours you have. You may want to work elsewhere, such as using GVIM, where you can type faster and be more efficient. Then, once you make sure your code works, you can transfer it and run the tests, after already running them separately to make sure you on the right track. Since this notebook is relatively long, it may be easier for you to work like this. This is purely a suggestion. %%file probing_methods.py class probing_methods: """ Collection of probing methods """ @staticmethod def linear_probe( self, key, i ): """ Implements linear probing. Note that linear probing is the same as modified linear probing with c=1. Therefore, this method only calls the method __modified_linear_probe__ with c=1. By reusing an existing method we end up taking advantage of this essential refactoring. Therefore, we will no longer need the following code:

original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value

Instead all this will be a single line call! """ return probing_methods.__modified_linear_probe__( self, key, i, c=1 )

@staticmethod def __modified_linear_probe__( self, key, i, c ): """ This method will be called from the inner method returned by modified_linear_probe to implment modified linear probing. It will also be called directly by linear_probe() method. """ original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i * c) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value

@staticmethod def modified_linear_probe( c ): """ Note that we do not call a function here but return an inner function within which the c-value passed to this function will be captured. Note that all probing methods must have the call signature ( self, key, i ). That is, addition parameters must be captured by creating inner methods that will be returned from methods that will return those inner methods. """ def inner( self, key, i ): """ """ return probing_methods.__modified_linear_probe__( self, key, i, c ) return inner

@staticmethod def quadratic_probe( self, key, i ): """ Implements quadratic probing """ original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i**2) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value

@staticmethod def double_hashing( self, key, i, P=8 ): """ Implements double hashing """ hp = lambda key: 1 + key % P original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i * hp( key )) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value When you run the following cell, a file called hashtable.py will be created, since the below cell starts with a file magic directive. DO NOT DELETE THIS DIRECTIVE! Otherwise, your work will not be graded! %%file hashtable.py from collections import namedtuple import pandas

pandas.options.display.width = 999 pandas.options.display.max_colwidth = 999

key_value_pair = namedtuple( 'key_value_pair', 'key value' )

class HashTableFullError( Exception ): pass

class hashtable: """ A hash table implementation """ def __init__( self, M, probing_method, null_key='', deleted_key='', table_name="slots" ): """ Intializes the hash table data structures """ self.M = M self.probing_method = probing_method self.n_collisions = 0

self.null_key = null_key self.deleted_key = deleted_key self.table_name = table_name self.table = [key_value_pair( null_key, None )] * M

def hash( self, key ): """ Returns the hash value for the specified key. """ return key % self.M

def is_available( self, index ): """ Returns True if the specified slot at the given slot index is available to store a value/key. If not, it returns False. """ # YOUR CODE HERE raise NotImplementedError() def set_value( self, index, key, value ): """ Sets the value of the slot at the given index to the provivded value. """ # YOUR CODE HERE raise NotImplementedError() def get_key( self, index ): """ Returns the key at the given index in the hash table """ # YOUR CODE HERE raise NotImplementedError()

def get_value( self, index ): """ Returns the value at the given index in the hash table """ # YOUR CODE HERE raise NotImplementedError()

def keys( self ): """ Returns the keys in the hash table """ # YOUR CODE HERE raise NotImplementedError()

def values( self ): """ Returns the values in the hash table """ # YOUR CODE HERE raise NotImplementedError() def __contains__( self, key ): """ Returns True if the specified key is in the current hash table; otherwise, it returns False """ # YOUR CODE HERE raise NotImplementedError() def todataframe( self ): """ Returns a pandas dataframe for easier viewing in notebooks """ return pandas.DataFrame( [self.table], index=["slots"] )

def __str__( self ): """ Returns a string representation of the hash table key index. """ df = pandas.DataFrame( [self.table] ) df.index = [self.table_name] return str( df ) def __repr__( self ): """ See __str__() """ return self.__str__() def search( self, key, is_verbose=False ): """ Returns the index of the slot for the specified key if that key is present in the hash table. Otherwise, it returns None. """ # YOUR CODE HERE raise NotImplementedError()

def remove( self, key, is_verbose=False ): """ Removes the indicated entry by key from the hash table, if that key key exists. If the key does not exist, this method raises an AssertionError """ # YOUR CODE HERE raise NotImplementedError()

def add( self, key, value, is_verbose=False ): """ Adds the provided (key, value) pair as a key_value_pair (defined above) to the hash table if the specified key is not there already. If the key already exists, it overwrites the existing value at that key. """ # YOUR CODE HERE raise NotImplementedError() def rehash( self, new_M, probing_method=None ): """ Rehashes the existing entries of this hash table according to the new M value. If a new probing method is specified, that method must be used. Otherwise, the existing probing method should be used. """ # YOUR CODE HERE raise NotImplementedError()

TEST PREPARATION Run the following code cell as preparation for upcoming tests if you wish to run the test code to find out if your implementation works. However, you do not need to run any of the following tests. The following tests will be used to grade your work. In any case, you can take advantage of the existence of these tests. Note that you will not be able to see all the test code that will be run against your submission.

# # T E S T #1 # # # This test is to warn you in case you used 'import' statements in your code that are disallowed. # Be informed that all subsequent tests will check for the same implicitly. That is, if you # used 'import' in your code even once, all remaining tests will fail, which means # you will automatically get zero for this test! If you used 'import', it means you # did not heed the warnings provided to you already. If you have used 'import' that has been flagged by # this test, it is time to go back and FIX your code. # # Please do not be bothered how the following test code works. It checks whether you # have used any 'import' statements in your solution that are disallowed in code saved in `hashtable.py`. # # Free points are awarded if you code passes this test :) # import ast

def check_import( program_filepath, exceptions=["collections", "namedtuple", "pandas"] ): """ Returns all AST nodes that contain import statements, both import and impor from will be included """ import_statements = [] exceptions = set( exceptions )

with open( program_filepath ) as infile: # source = infile.read() parsed_ast = ast.parse( source ) nodes = [node for node in ast.walk( parsed_ast )]

for node in nodes: # if isinstance( node, ast.ImportFrom ) or isinstance( node, ast.Import ): # imported_module_names = set( [cur_node.name for cur_node in node.names] )

if imported_module_names - exceptions: # # Keep import statements that mention anything other than # the exceptions # import_statements.append( node )

return import_statements

def assert_no_imports( program_filepath ): """ Asserts that no import statements have been used in the provided program file """ import_statements = check_import( program_filepath ) import_linenos = [imp_ast_node.lineno for imp_ast_node in import_statements]

if len( import_statements ) > 0: # assert False, f"Disallowed import statements found at line(s) in {program_filepath}: {import_linenos}" else: print( "--- GOOD! You did not use any disallowed 'import' statements." )

program_filepath = "hashtable.py"

assert_no_imports( program_filepath )

print( ">>> TEST 1 PASSED <<<" )

# # T E S T #2 # # from hashtable import hashtable, HashTableFullError from probing_methods import probing_methods

print( "=" * 80 ) print( "LINEAR PROBING" ) print( "=" * 80 )

ht = hashtable( M, probing_methods.linear_probe )

for cur_key, cur_value in zip( keys, values ): # ht.add( cur_key, cur_value )

print( f"--- KEYS: {ht.keys()} " ) print( f"--- VALUES: {ht.values()} " ) print( f"--- #Collisions: {ht.n_collisions} " )

assert ht.keys() == [388, '', 431, '', '', 96, 226, 579, 903, '', '', 765, 142] assert ht.values() == ['H', None, 'B', None, None, 'C', 'F', 'E', 'G', None, None, 'A', 'D'] assert ht.n_collisions == 5

print( ">>> TEST 2 PASSED <<<" )

# # T E S T #3 # # from hashtable import hashtable, HashTableFullError from probing_methods import probing_methods

print( "=" * 80 ) print( "LINEAR PROBING" ) print( "=" * 80 )

ht = hashtable( M, probing_methods.linear_probe )

for cur_key, cur_value in zip( keys, values ): # ht.add( cur_key, cur_value )

print( f"--- KEYS: {ht.keys()} " ) print( f"--- VALUES: {ht.values()} " ) print( f"--- #Collisions: {ht.n_collisions} " )

assert ht.search( 388 ) == (0, [11, 12, 0]) assert ht.search( 395 ) == (None, [5, 6, 7, 8, 9])

print( ">>> TEST 3 PASSED <<<" )

%%file probing_methods.py class probing_methods: """ Collection of probing methods """ @staticmethod def linear_probe( self, key, i ): """ Implements linear probing. Note that linear probing is the same as modified linear probing with c=1. Therefore, this method only calls the method __modified_linear_probe__ with c=1. By reusing an existing method we end up taking advantage of this essential refactoring. Therefore, we will no longer need the following code:

original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value

Instead all this will be a single line call! """ return probing_methods.__modified_linear_probe__( self, key, i, c=1 )

#####_______________________________________________________#####

%%file hashtable.py from collections import namedtuple import pandas

pandas.options.display.width = 999 pandas.options.display.max_colwidth = 999

key_value_pair = namedtuple( 'key_value_pair', 'key value' )

class HashTableFullError( Exception ): pass

self.null_key = null_key self.deleted_key = deleted_key self.table_name = table_name self.table = [key_value_pair( null_key, None )] * M

def hash( self, key ): """ Returns the hash value for the specified key. """ return key % self.M

def get_value( self, index ): """ Returns the value at the given index in the hash table """ # YOUR CODE HERE raise NotImplementedError()

def keys( self ): """ Returns the keys in the hash table """ # YOUR CODE HERE raise NotImplementedError()

#####_______________________________________________________#####

import string

M = 13 c = 3 keys = [765, 431, 96, 142, 579, 226, 903, 388] values = [c for c in string.ascii_uppercase]

#####_______________________________________________________#####

with open( program_filepath ) as infile: # source = infile.read() parsed_ast = ast.parse( source ) nodes = [node for node in ast.walk( parsed_ast )]

for node in nodes: # if isinstance( node, ast.ImportFrom ) or isinstance( node, ast.Import ): # imported_module_names = set( [cur_node.name for cur_node in node.names] )

if imported_module_names - exceptions: # # Keep import statements that mention anything other than # the exceptions # import_statements.append( node )

return import_statements

program_filepath = "hashtable.py"

assert_no_imports( program_filepath )

print( ">>> TEST 1 PASSED <<<" )

#####_______________________________________________________#####

# # T E S T #2 # # from hashtable import hashtable, HashTableFullError from probing_methods import probing_methods

print( "=" * 80 ) print( "LINEAR PROBING" ) print( "=" * 80 )

ht = hashtable( M, probing_methods.linear_probe )

for cur_key, cur_value in zip( keys, values ): # ht.add( cur_key, cur_value )

print( f"--- KEYS: {ht.keys()} " ) print( f"--- VALUES: {ht.values()} " ) print( f"--- #Collisions: {ht.n_collisions} " )

print( ">>> TEST 2 PASSED <<<" )

The question is: i want just (report) that explain the program for test 1 and 2 ?

original_hash_val = self.hash( key ) # Original hash value modified_hash_value = (original_hash_val + i) % self.M #print( f"--- original_hash_val: {original_hash_val}" ) return modified_hash_value

Instead all this will be a single line call! """ return probing_methods.__modified_linear_probe__( self, key, i, c=1 )

%%file hashtable.py from collections import namedtuple import pandas

pandas.options.display.width = 999 pandas.options.display.max_colwidth = 999

key_value_pair = namedtuple( 'key_value_pair', 'key value' )

class HashTableFullError( Exception ): pass

self.null_key = null_key self.deleted_key = deleted_key self.table_name = table_name self.table = [key_value_pair( null_key, None )] * M

def hash( self, key ): """ Returns the hash value for the specified key. """ return key % self.M

def get_value( self, index ): """ Returns the value at the given index in the hash table """ # YOUR CODE HERE raise NotImplementedError()

def keys( self ): """ Returns the keys in the hash table """ # YOUR CODE HERE raise NotImplementedError()

import string

M = 13 c = 3 keys = [765, 431, 96, 142, 579, 226, 903, 388] values = [c for c in string.ascii_uppercase]

with open( program_filepath ) as infile: # source = infile.read() parsed_ast = ast.parse( source ) nodes = [node for node in ast.walk( parsed_ast )]

for node in nodes: # if isinstance( node, ast.ImportFrom ) or isinstance( node, ast.Import ): # imported_module_names = set( [cur_node.name for cur_node in node.names] )

if imported_module_names - exceptions: # # Keep import statements that mention anything other than # the exceptions # import_statements.append( node )

return import_statements

program_filepath = "hashtable.py"

assert_no_imports( program_filepath )

print( ">>> TEST 1 PASSED <<<" )

# # T E S T #2 # # from hashtable import hashtable, HashTableFullError from probing_methods import probing_methods

print( "=" * 80 ) print( "LINEAR PROBING" ) print( "=" * 80 )

ht = hashtable( M, probing_methods.linear_probe )

for cur_key, cur_value in zip( keys, values ): # ht.add( cur_key, cur_value )

print( f"--- KEYS: {ht.keys()} " ) print( f"--- VALUES: {ht.values()} " ) print( f"--- #Collisions: {ht.n_collisions} " )

print( ">>> TEST 2 PASSED <<<" )