
Author ezio.melotti
Recipients ezio.melotti, moonflow
Date 2012-11-20.14:43:52
Message-id <1353422632.8.0.170653726289.issue16513@psf.upfronthosting.co.za>
In-reply-to
Content
Sorry, I misread your code; it looks like you want the href *without* 'cve'.
In that case, change my code to use "'cve' not in attrs['href']". Also, avoid s.find('cve') == -1 and use the more readable and idiomatic 'cve' not in s instead.
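A minimal sketch of the two spellings side by side. The helper name lacks_cve and the sample hrefs are made up for illustration and are not from the original script:

```python
def lacks_cve(s):
    """Return True when the substring 'cve' does not occur in s."""
    # Idiomatic membership test; equivalent to s.find('cve') == -1.
    return 'cve' not in s


# Hypothetical sample data, only to show the filter in action.
hrefs = ['/advisories/cve-2012-0001', '/downloads/patch.tar.gz']
kept = [h for h in hrefs if lacks_cve(h)]
```

Both forms give the same answer for any string; the "not in" form simply states the intent directly.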

I think your original script doesn't work for two reasons:
1) you are looking for a table with class="tablesorter", but in the HTML the table doesn't have that class, so self.is_table is never set to True;
2) you are finding the href of the <a> with a "style" attribute and correctly setting it to self.href_name, but the value is then replaced by "" when the following <a> without "style" is found.
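Both problems can be reproduced and avoided in a small parser. The following is only a sketch, not the original script: it assumes Python 3's html.parser rather than sgmllib, the class name StyledLinkFinder and the table_class parameter are hypothetical, and the sample HTML is invented.

```python
from html.parser import HTMLParser


class StyledLinkFinder(HTMLParser):
    """Remember the href of the last styled <a> found inside a table."""

    def __init__(self, table_class=None):
        super().__init__()
        self.table_class = table_class  # None means "accept any table"
        self.is_table = False
        self.href_name = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            classes = (attrs.get("class") or "").split()
            # Reason 1: if the requested class is absent from the HTML,
            # this condition never holds and is_table stays False.
            if self.table_class is None or self.table_class in classes:
                self.is_table = True
        elif (tag == "a" and self.is_table
              and "style" in attrs and attrs.get("href")):
            # Reason 2: assign only on a match. There is deliberately no
            # else branch resetting href_name for links without "style".
            self.href_name = attrs["href"]

    def handle_endtag(self, tag):
        if tag == "table":
            self.is_table = False


# Invented sample: a table without any class, a styled link, then a plain one.
sample_html = (
    '<table><tr><td>'
    '<a style="color:red" href="/one">1</a><a href="/two">2</a>'
    '</td></tr></table>'
)

any_table = StyledLinkFinder()
any_table.feed(sample_html)

only_tablesorter = StyledLinkFinder(table_class="tablesorter")
only_tablesorter.feed(sample_html)
```

With this input, any_table keeps the styled link's href, while only_tablesorter never enters the table and so never records anything.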

That said, I still suggest that you abandon sgmllib and use HTMLParser, or possibly an external module like BeautifulSoup or lxml.
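As a rough idea of what the HTMLParser route could look like, here is a sketch that collects every href that does not contain 'cve'. It assumes Python 3, where the module is html.parser (in Python 2 it is HTMLParser); the names NonCveLinkCollector and collect_non_cve_hrefs are hypothetical, and the real page may need extra conditions such as the table or "style" checks discussed above.

```python
from html.parser import HTMLParser


class NonCveLinkCollector(HTMLParser):
    """Collect the href of every <a> whose href does not contain 'cve'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value is None when the
        # attribute has no value, so .get() covers both missing and empty.
        href = dict(attrs).get("href")
        if href is not None and "cve" not in href:
            # Append instead of overwriting a single attribute, so earlier
            # matches are not lost when later links are seen.
            self.hrefs.append(href)


def collect_non_cve_hrefs(html):
    """Parse an HTML string and return the matching hrefs in document order."""
    parser = NonCveLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs
```

Collecting into a list rather than a single attribute sidesteps the overwriting problem entirely, at the cost of leaving the final selection to the caller.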
History
Date User Action Args
2012-11-20 14:43:52 ezio.melotti set recipients: + ezio.melotti, moonflow
2012-11-20 14:43:52 ezio.melotti set messageid: <1353422632.8.0.170653726289.issue16513@psf.upfronthosting.co.za>
2012-11-20 14:43:52 ezio.melotti link issue16513 messages
2012-11-20 14:43:52 ezio.melotti create