Develop a class MyHTMLParser as a subclass of HTMLParser that, when fed an HTML file, prints the names of the start and end tags in the order in which they appear in the document, with an indentation proportional to the element’s depth in the tree structure of the document. Ignore HTML elements that do not require an end tag, such as p and br.
>>> infile = open('w3c.html')
>>> content = infile.read()
>>> infile.close()
>>> myparser = MyHTMLParser()
>>> myparser.feed(content)
html start
    head start
        title start
        title end
    head end
    body start
        h1 start
        h1 end
        h2 start
        h2 end
        ul start
            li start
...
        a end
    body end
html end
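One possible way to approach the exercise is sketched below. It is not necessarily the intended textbook solution. It relies on the standard-library html.parser.HTMLParser and its handle_starttag and handle_endtag hooks; the names IGNORED_TAGS, depth, and indent_width are assumptions introduced here for illustration, and the ignored set contains only the two elements the exercise names.

```python
from html.parser import HTMLParser

# Assumption: only the two elements named in the exercise are ignored.
# Extend this set if other tags without end tags should be skipped too.
IGNORED_TAGS = {'p', 'br'}


class MyHTMLParser(HTMLParser):
    """Print start and end tags, indented according to nesting depth."""

    def __init__(self, indent_width=4):
        super().__init__()
        self.depth = 0                    # current depth in the element tree
        self.indent_width = indent_width  # spaces per level (hypothetical parameter)

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED_TAGS:
            return
        print(' ' * (self.indent_width * self.depth) + tag + ' start')
        self.depth += 1                   # children are one level deeper

    def handle_endtag(self, tag):
        if tag in IGNORED_TAGS:
            return
        self.depth -= 1                   # return to the level of the start tag
        print(' ' * (self.indent_width * self.depth) + tag + ' end')
```

The depth counter is incremented after a start tag is printed and decremented before the matching end tag is printed, so both lines of a pair share the same indentation. The sketch assumes well-formed input in which every tag that is not ignored is eventually closed; with unbalanced markup the depth would drift.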
We assume that opening tags in HTML have form