Question
Add an optional parameter limit, with a default of 10, to the crawl() function; this is the maximum number of web pages to download. Save files to the pages directory using the MD5 hash of the page's URL, and only crawl URLs that are in the landmark.edu domain (*.landmark.edu).
Use a regular expression when examining discovered links. Hints:

    import hashlib
    filename = 'pages/' + hashlib.md5(url.encode()).hexdigest() + '.html'

    import re
    p = re.compile('ab*')
    if p.match('abc'):
        print("yes")
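One way the pieces fit together is sketched below. This is a minimal illustration, not the assignment's official solution: the fetch parameter (a caller-supplied function mapping a URL to its HTML text) and the href-scraping regex are assumptions made so the sketch stays self-contained; the course's own download and link-extraction helpers would go in their place.

    import hashlib
    import os
    import re

    # Only URLs on landmark.edu or any subdomain (*.landmark.edu) qualify.
    LANDMARK_RE = re.compile(r'^https?://([\w-]+\.)*landmark\.edu(/.*)?$')

    def crawl(start_url, fetch, limit=10):
        """Download at most `limit` pages reachable from start_url.

        `fetch` is a caller-supplied url -> HTML function (an assumption
        here, standing in for the assignment's download helper).
        """
        os.makedirs('pages', exist_ok=True)
        queue = [start_url]
        seen = set()
        downloaded = 0
        while queue and downloaded < limit:
            url = queue.pop(0)
            # Skip duplicates and anything outside *.landmark.edu.
            if url in seen or not LANDMARK_RE.match(url):
                continue
            seen.add(url)
            html = fetch(url)
            # File name is the MD5 hash of the page's URL.
            filename = 'pages/' + hashlib.md5(url.encode()).hexdigest() + '.html'
            with open(filename, 'w') as f:
                f.write(html)
            downloaded += 1
            # Examine discovered links with a regular expression.
            queue.extend(re.findall(r'href="([^"]+)"', html))
        return downloaded

Note how the limit check sits in the loop condition, so the crawler stops as soon as the 10th page (by default) is saved, and how the domain regex filters links before they are fetched.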