Question
Need help with a Python problem about Parsers:
Write class HeaderParser that is a subclass of the HTMLParser class (no global variables). It will find and collect the contents of all the headings in an HTML file fed to it. The parser works by identifying when a header tag has been encountered and setting a boolean variable in the class to indicate that. When the data handler for the class is called and the boolean in the class indicates that a header is currently open, the data inside the header is added to a list. Finally, when a closing header tag is encountered the boolean variable is unset. To implement this parser you will need to override the following methods of the HTMLParser class:
__init__: The constructor calls the parent class constructor, sets the boolean variable in the object appropriately, and sets the list of headings to the empty list.
handle_starttag: If the tag that resulted in this method being called is a header, the header indicator should be set.
handle_endtag: If the tag that resulted in this method being called is a header, the header indicator should be unset.
handle_data: If the parser is currently inside a header, then the data should be added to the list of header contents. Make sure that you strip any leading or trailing spaces or newlines off the contents of the header before adding it to the list.
getHeadings: The function returns the list of headings gathered by the parser.
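The five methods described above can be sketched roughly as follows. This is a minimal sketch, not the official solution: it assumes "header" means the tags h1 through h6, the attribute names in_heading and headings and the class constant HEADING_TAGS are my own choices, and skipping data that is empty after stripping is an extra assumption the assignment does not spell out.

```python
from html.parser import HTMLParser


class HeaderParser(HTMLParser):
    """Collects the text found inside h1-h6 elements of HTML fed to it."""

    # Class-level constant (not a global): tag names treated as headings.
    # Assumption: "header" in the assignment means h1 through h6.
    HEADING_TAGS = ('h1', 'h2', 'h3', 'h4', 'h5', 'h6')

    def __init__(self):
        super().__init__()        # run the parent class constructor
        self.in_heading = False   # True while a heading tag is open
        self.headings = []        # heading contents collected so far

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in self.HEADING_TAGS:
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS:
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            text = data.strip()
            # Assumption: whitespace-only data is not worth recording.
            if text:
                self.headings.append(text)

    def getHeadings(self):
        return self.headings


parser = HeaderParser()
parser.feed('<html><body><h1> Heading one </h1><p>text</p>'
            '<h3>Small Heading\n</h3></body></html>')
print(parser.getHeadings())   # ['Heading one', 'Small Heading']
```

Keeping the flag and the list on self is what satisfies the "no global variables" requirement: each parser object carries its own state.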
You can find a template for the class and a test function testHParser() in the given template file. The following shows what that test function would display on several sample web pages. Note that your solution must work on any page, not just the ones provided here. Think carefully about what it means to collect headings in a general context:
--------------------------------------------------------------------------------------------------
This is the template file that is given (Copy/Paste):
from html.parser import HTMLParser
from urllib.request import urlopen

class HeaderParser(HTMLParser):
    '''Subclass of the imported HTMLParser class. Finds and collects the
    contents of headings in an HTML file that is fed to it. Identifies when
    a heading tag has been encountered and sets a boolean variable to
    indicate it. When the data handler is called and the boolean in the
    class indicates the header is currently open, the data inside the
    header is added to a list. Finally, when a closing header tag is
    encountered, the boolean variable is unset. Override the following
    methods.'''

    def __init__(self):
        '''The constructor calls the parent class constructor, sets the
        boolean variable in the object appropriately, and sets the list
        of headings to the empty list'''
        pass

    def handle_starttag(self, tag, attrs):
        '''If the tag that resulted in this method being called is a
        header, the header indicator should be set'''
        pass

    def handle_endtag(self, tag):
        '''If the tag that resulted in this method being called is a
        header, the header indicator should be unset'''
        pass

    def handle_data(self, data):
        '''If the parser is currently inside a header, then the data
        should be added to the list. Strips any leading or trailing
        white space.'''
        pass

    def getHeadings(self):
        '''The function returns the list of headings gathered by the
        parser'''
        pass

def testHParser(url):
    '''Opens the url, reads it, and converts it to a string saved in a
    variable. Creates an object of the HeaderParser class and feeds it the
    string. Returns the result of the getHeadings method'''
    pass
--------------------------------------------------------------------------------------------------------
[Screenshot of a Python 3.4.1 shell session, recovered from image text. It shows testHParser called on four pages. On facweb.cdm.depaul.edu, headings.html appears to return 'Heading one', 'Small Heading', 'Third Heading'; test.html appears to return 'Hello World', 'Bigger heading'; cookie.html returns 'Cookie the cat'. For http://www.depaul.edu the result is a longer list; legible fragments include Upcoming Admission Open House Events, Headlines, Academics, Admission, Financial Aid, Student Life, Resources, Information For and Quick links, and some entries contain \u200b (zero-width space) characters.]
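When thinking about headings "in a general context", it may help to see exactly which handler calls HTMLParser makes for a heading that uses uppercase tag names or contains nested inline markup. The sketch below uses a hypothetical EventLogger subclass (not part of the assignment) that simply records every call in order.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records each handler call in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(('start', tag))

    def handle_endtag(self, tag):
        self.events.append(('end', tag))

    def handle_data(self, data):
        self.events.append(('data', data))


logger = EventLogger()
logger.feed('<H2>Hello <em>World</em></H2>')
logger.close()
for event in logger.events:
    print(event)
# ('start', 'h2')
# ('data', 'Hello ')
# ('start', 'em')
# ('data', 'World')
# ('end', 'em')
# ('end', 'h2')
```

Two things stand out: tag names arrive in lowercase regardless of how the page wrote them, and one heading can trigger several handle_data calls, so a parser built the way the assignment describes would record 'Hello' and 'World' as separate list entries.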