
Author diegorubenarias
Recipients diegorubenarias
Date 2007-12-19.20:05:20
Message-id <1198094721.16.0.0968009293382.issue1663@psf.upfronthosting.co.za>
In-reply-to
Content
Hello, my name is Diego. I needed to parse HTML and retrieve only the text,
but I could not work out how to do that with the HTMLParser class, so I
changed the class to support it. The code to use is:
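import HTMLParser  # Python 2 module name

class ParsearHTML(HTMLParser.HTMLParser):

    def __init__(self, datos):
        HTMLParser.HTMLParser.__init__(self)
        self.feed(datos)
        self.close()

    def handle_data(self, data):
        # Relies on the attached, modified HTMLParser.py; the stock class
        # ignores this return value.
        return data

# onTmp holds the HTML text to parse.
parser = ParsearHTML(onTmp)
# feed() is expected to return the text only with the attached changes;
# the stock feed() returns None.
data = parser.feed(onTmp)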
The changes to the class are attached. Thank you very much. Diego.
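For comparison, text-only extraction also seems possible with the unmodified class, by collecting the data inside handle_data instead of returning it. The following is only a minimal sketch under some assumptions: it uses the Python 3 module name html.parser (in Python 2 the import would be `from HTMLParser import HTMLParser`), and the names TextExtractor, chunks, get_text and extract_text are hypothetical, not part of the attached patch or of the standard library.

```python
from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: accumulates the text nodes seen while parsing."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []  # text fragments in document order

    def handle_data(self, data):
        # The stock parser ignores the return value of this method,
        # so the text is stored on the instance instead.
        self.chunks.append(data)

    def get_text(self):
        return "".join(self.chunks)


def extract_text(html):
    """Feed the HTML once and return only the collected text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.get_text()


if __name__ == "__main__":
    print(extract_text("<p>Hello <b>world</b>!</p>"))
```

Storing the fragments on the instance keeps the stock feed() and handle_data() contracts unchanged, so no patch to the library class would be needed for this use case.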
Files
File name Uploaded
HTMLParser.py diegorubenarias, 2007-12-19.20:05:20
History
Date User Action Args
2007-12-19 20:05:21  diegorubenarias  set  spambayes_score: 0.287731 -> 0.287731
recipients: + diegorubenarias
2007-12-19 20:05:21  diegorubenarias  set  spambayes_score: 0.287731 -> 0.287731
messageid: <1198094721.16.0.0968009293382.issue1663@psf.upfronthosting.co.za>
2007-12-19 20:05:21  diegorubenarias  link  issue1663 messages
2007-12-19 20:05:20  diegorubenarias  create