Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

# basic Q class class QueueADT: def __init__(self): self.q = [] # this is the data structure the ADT manages def push(self, value): # add

# basic Q class class QueueADT: def __init__(self): self.q = [] # this is the data structure the ADT manages def push(self, value): # add an item to the queue self.q.append(value) def pop(self): # return first item in the queue, if it exists first_item = self.q.pop(0) return first_item def is_not_empty(self): return self.q != [] def is_empty(self): return self.q == [] class Graph: def __init__(self): self.graph = {} # key is node label, value is edge list -- the adjacency list for that node label def add_node(self, label): self.graph[label] = [] def add_edge(self, label1, label2): # add an edge from label1 to label2 # could check to make sure that label1 and label2 are keys in our dictionary adjacency_list = self.graph[label1] # could check that label2 is not already in label1's list adjacency_list.append(label2) def add_edges(self, edge_list): for edge in edge_list: # edge is a list, it's 2 items long # edge[0], edge[1] are our two labels edge[0] -> edge[1] self.add_edge(edge[0], edge[1]) def adjacency_list(self, label): # return the value associated with key stored in label # we can check and see if label is in the dictionary and return [] if it's not return self.graph.get(label, []) #if label in self.graph: # return self.graph[label] #else: # return [] #return self.graph[label] def bfs(self, starting_node): q = QueueADT() #create a new, empty queue Q q.push(starting_node) #push starting_node onto Q visited_list = [] visited_list.append(starting_node) #add starting_node to visited list # visited_list = [starting_node] while q.is_not_empty(): #while Q is not empty v = q.pop() #v = pop from Q for w in self.adjacency_list(v): #for vertices w in the adjacency list of v if w not in visited_list: #if w has not been visited visited_list.append(w) #add w to visited list q.push(w) #push w onto Q 

Here is what I need to be add in this code above -

Use my solution (Links to an external site.)Links to an external site. if needed as a starting point for this project. For this classwork assignment you want to create a class WebGraph: that does the following:

  1. Uses URLs stored as strings as the labels for the nodes. Are any changes necessary to the code for this part?
  2. Uses a dictionary rather than list to keep track of visited nodes. Instead of visited.append(edge), we can use visited[edge] = None to store information (should we need to later) about web pages we've accessed.
  3. Replace the adjacency_list in bfs with the "discovery" of the adjacency list by reading in the contents of a web page given its URL and determining the URLs that page links to. Each of these URLs must be added to the WebGraph network if they're have not already been visited. Write a function that takes a URL as a parameter and returns the URLs it links to as a list of strings. Call this function from bfs.
  4. Sleep for 3 seconds between each access of a new URL. This is easy: import time and use time.sleep() to pause your code. Google for how it works if you must.

Be careful testing this! If your changes work, your BFS routine will start rapidly crawling the web. This is bad. So add one more important requirement:

  1. The bfs member function will take an additional parameter max_nodes_visited with a default value of None. When max_nodes_visited is not None, the BFS will terminate after visiting max_nodes_visited web pages. When it is None, it will terminate when the entire world wide web has been crawled. Do not call with this option!! But technically it will only visit the connected component of the world wide web that the seed URL is in. That's just one small fraction of the web!

Test your BFS with max_nodes_visited of 1 or 2 at first.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Sybase Database Administrators Handbook

Authors: Brian Hitchcock

1st Edition

0133574776, 978-0133574777

More Books

Students also viewed these Databases questions