Question
Answer The 2nd Part Of Question Part 2 Solve Using Python Here Are The Data Structure Please Answer These And Explain Them Also using Comment
Answer The 2nd Part Of Question Part 2 Solve Using PythonHere Are The Data Structure Please Answer These And Explain Them Also using Comment
HashSet.py
from dataclasses import dataclass
from typing import List
@dataclass
class HashSet:
buckets: List[List] = None
size: int = 0
def init(self):
self.size = 0
self.buckets = [[] for i in range(8)]
# Computes hash value for a word (a string)
def get_hash(self, word):
pass # Placeholder code ==> to be replaced
# Doubles size of bucket list
def rehash(self):
pass # Placeholder code ==> to be replaced
# Adds a word to set if not already added
def add(self, word):
pass # Placeholder code ==> to be replaced
# Returns a string representation of the set content
def to_string(self):
pass # Placeholder code ==> to be replaced
# Returns current number of elements in set
def get_size(self):
pass # Placeholder code ==> to be replaced
# Returns True if word in set, otherwise False
def contains(self, word):
pass # Placeholder code ==> to be replaced
# Returns current size of bucket list
def bucket_list_size(self):
pass # Placeholder code ==> to be replaced
# Removes word from the set if there does nothing
# if word not inset
def remove(self, word):
pass # Placeholder code ==> to be replaced
# Returns the size of the bucket with most elements
def max_bucket_size(self):
pass # Placeholder code ==> to be replaced
2nd structure
hash_main.py
import HashSet as hset
# Program starts
# Initialize word set
words = hset.HashSet() # Create new empty HashSet
words.init() # Initialize with eight empty buckets
# Add names to word set. Notice: a) contains duplicate names,
# b) more than eight names ==> will trigger rehash
names = ["Ella", "Owen", "Fred", "Zoe", "Adam", "Ceve", "Adam", "Ceve", "Jonas", "Ola", "Morgan", "Fredrik", "Simon", "Albin", "Jonas", "Amer", "David"]
for name in names:
words.add(name)
print(" to_string():", words.to_string()) # { Adam David Amer Ceve Owen Ella Jonas Morgan Fredrik Zoe Fred Albin Ola Simon }
print("get_size():", words.get_size()) # 14
print("contains(Fred):", words.contains("Fred")) # True
print("contains(Bob):", words.contains("Bob")) # False
# Hash specific data
mx = words.max_bucket_size()
print(" max bucket:", mx) # 2
buckets = words.bucket_list_size()
print("bucket list size:", buckets) # 16
# Remove elements
delete = ["Ceve", "Adam", "Ceve", "Jonas", "Ola"]
for s in delete:
words.remove(s)
print(" get_size:", words.get_size()) # 10
print("to_string():", words.to_string()) # { David Amer Owen Ella Morgan Fredrik Zoe Fred Albin Simon }
Project Task The project is about understanding hashing and binary search trees. We will use the two large text files you used in Assignment 3 as input data. We strongly recommend that you look at Lecture 10 before you start to work on the project exercises. The problem can be divided in five parts: 1. Count unique words using Python's set and dictionary 2. Implement two data structures suitable for working with words as data: a) A hash based set, and b) a binary search tree (BST) based map (dictionary). 3. Use your two data structures to repeat Part 1 (counting unique words) 4. Present word related plots using matplotlib (VG Exercise) 5. Measure the time to perform certain operations on your map and set implementation. (VG Exercise) Part 1 - Count unique words 1 (G exercise) In Exercise 6 in Assignment 3 you saved all words from the two text files eng_news_100K-sentences.txt and holy grail.txt in two separate files. (Do it now if you haven't done this exercise already.) Your task here is to 1) use Python's set class to count the number of unique words in each file, and 2) use Python's dictionary class to produce a Top 10 list of the ten most frequently used words having a length larger than 4 in each file. In Part 3 you will repeat the same computations using your own hash and BST based implementations. Part 2 - Implementing data structures (G exercise) Lecture 10 outlines the basic ideas of two techniques suitable for implementing maps (dictionaries) and sets: 1) Binary search trees (BST), and 2) Hashing. Your task is to implement a set (suitable for words) based on hashing and a map based on binary search trees. Additional limitations: The BST based map is a linked implementation where each node has four fields (key, value, left-child, right-child). The hash-based set is built using a Python list to store the buckets where each bucket is another Python list. The initial bucket list size is 8 and rehashing (double the bucket list size) takes place when the number of elements equals the number of buckets. Furthermore, code skeletons outlining which methods we expect for each data structure are available here. They also contains an example program showing how the various methods can be used. Notice: You are not allowed to make any changes of the method signatures in the given skeletons. Also, the demo programs should work as outlined in the provided example programs once your implementations are complete. Your task is simply to complete the given code fragments in the skeletons. However, feel free to add additional methods. Part 3 - Count unique words 2 (G exercise)Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started