Question

Python: complete the code for class AnchorParser, a subclass of the HTMLParser class

Write a class AnchorParser that is a subclass of the HTMLParser class. It will find and collect the contents of all the link descriptions in an HTML file fed to it. The parser works by identifying when an 'a' tag has been encountered and setting a boolean variable in the class to indicate that. When the data handler for the class is called and the boolean indicates that an 'a' tag is currently open, the data inside the anchor is added to a list. Finally, when a closing 'a' tag is encountered, the boolean variable is unset.

To implement this parser you will need to override/implement the following methods of the HTMLParser class:

a. __init__: The constructor calls the parent class constructor, sets the boolean variable in the object appropriately, and sets the list of link descriptions to the empty list.
b. handle_starttag: If the tag that resulted in this method being called is an 'a' tag, the anchor indicator should be set.
c. handle_endtag: If the tag that resulted in this method being called is an 'a' tag, the anchor indicator should be unset.
d. handle_data: If the parser is currently inside an anchor, then the data should be added to the list of anchor contents. Make sure that you strip any leading or trailing spaces or newlines off the contents of the anchor before adding it to the list.
e. getLabels: The function returns the list of link descriptions gathered by the parser.

You can find a template for the class AnchorParser and a test function testParser() in the zip file. The following shows what that test function would display on several sample web pages. Note that your solution must work on any page, not just the ones provided here. Think carefully about what it means to collect link descriptions in a general context:

    >>> testParser('http://zoko.cdm.depaul.edu/csc242/helloworld.html')
    ['here']
    >>> testParser('http://zoko.cdm.depaul.edu/csc242/random.html')
    ['Click here to e-mail Mr. Zoko', 'Go to the image sample']

Template:

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    def testParser(url):
        content = urlopen(url).read().decode()
        parser = AnchorParser()
        parser.feed(content)
        return parser.getLabels()

    class AnchorParser(HTMLParser):
        def __init__(self):
            pass

        def handle_starttag(self, tag, attrs):
            pass

        def handle_endtag(self, tag):
            pass

        def handle_data(self, data):
            pass

        def getLabels(self):
            pass
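The methods described above could be filled in roughly as follows. This is a minimal sketch, not the posted expert solution: the attribute names in_anchor and labels are hypothetical choices of mine, and skipping text that is empty after stripping is an assumption the assignment does not state. The demo feeds an inline HTML string instead of fetching a URL so it runs without network access.

```python
from html.parser import HTMLParser


class AnchorParser(HTMLParser):
    """Collects the text found between <a> and </a> tags."""

    def __init__(self):
        # Let the parent class set up its own parsing state first.
        super().__init__()
        self.in_anchor = False  # hypothetical name: True while an 'a' tag is open
        self.labels = []        # hypothetical name: collected link descriptions

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase, so <A> also matches.
        if tag == 'a':
            self.in_anchor = True

    def handle_endtag(self, tag):
        if tag == 'a':
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            text = data.strip()  # drop leading/trailing spaces and newlines
            if text:             # assumption: ignore whitespace-only data
                self.labels.append(text)

    def getLabels(self):
        return self.labels


# Small offline demo on an inline document.
demo = AnchorParser()
demo.feed('<p>Click <a href="page.html">\n  here  </a> to continue.</p>')
print(demo.getLabels())
```

Because the flag is a single boolean, any text that appears outside an open 'a' tag (such as "Click" and "to continue." in the demo) is ignored, which matches the behavior the problem statement asks for.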
