Message147766
16ed15ff0d7c was not in the current stable py3.2, so I missed it.
Now that the comma is reported as an attribute name, the problem is moved up to the higher level anyway, where it can be handled easily by the usual methods.
(Still, I guess locatestarttagend_tolerant matches a free-standing extra comma after an attribute.)
"should be generic enough to work with all kind of invalid markup": I think we would be rather complete then (-> missing space issue), at least in terms of the percentage of real cases. And it could be improved with a few touches over time if something is missing. 100% is not the point unless it is meant to drive the official W3C checker. The call of self.warning, as in the old patch, costs nothing otherwise, and I see no real increase in complexity or CPU time.
"HTMLParser won't do any check about the validity of the elements' names or attributes' names/values": yes, that's of course up to the next-level handler (BTDT), so the possibility of error handling is not killed. It's about what HTMLParser _hides_ irrecoverably.
"there should be a valid use case for this": Almost any app which parses HTML (self-authored or remote) can have (should have?) a no-fuss/collateral warning log option (-> no need to run an expensive W3C checker session). I mostly have this in use, as said, since it was there anyway.
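To make the "next-level handler" idea concrete, here is a minimal sketch of a subclass that collects warnings about odd attribute names (such as a stray comma) instead of silently passing them on. The class name `WarningHTMLParser`, the `warning` method, and the pattern for "plausible" names are my own assumptions for illustration; they are not part of the stdlib HTMLParser API or of the patch discussed here.

```python
import re
from html.parser import HTMLParser

# Assumption: a deliberately loose idea of a "plausible" attribute name.
# This is not the HTML specification's definition, just a cheap sanity check.
_PLAUSIBLE_ATTR_NAME = re.compile(r"^[A-Za-z_:][A-Za-z0-9_:.\-]*$")


class WarningHTMLParser(HTMLParser):
    """Hypothetical subclass that records warnings instead of hiding oddities."""

    def __init__(self):
        super().__init__()
        self.warnings = []  # list of (line, column, message) tuples

    def warning(self, message):
        # Hypothetical hook, not a stdlib method; getpos() is the real
        # HTMLParser call that returns the current (line, offset).
        line, col = self.getpos()
        self.warnings.append((line, col, message))

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs as delivered by HTMLParser.
        for name, _value in attrs:
            if not _PLAUSIBLE_ATTR_NAME.match(name):
                self.warning("suspicious attribute name %r in <%s>" % (name, tag))


if __name__ == "__main__":
    parser = WarningHTMLParser()
    parser.feed('<a href="x", title="y">link</a>')
    parser.close()
    for line, col, message in parser.warnings:
        print("%d:%d: %s" % (line, col, message))
```

Whether the stray comma actually arrives in `handle_starttag` as its own attribute name depends on the parser version in use, so the output of the demo at the bottom is not guaranteed. The check itself only relies on whatever names the parser hands over.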
Well, as for me, I use a private backport of this to Python 2 anyway. I try to avoid Python 3 as far as possible. (No real plus, too many problems.) So for me it's just about joining Python 4 in the future perhaps, which can do true CPython multithreading, stackless, psyco/static typing ... and the print statement again without typing so many extra parentheses ;-)
I considered extra libs like the HTML tidy binding, but this is all too much fuss for most cases. And HTMLParser already has nearly everything, with the few calls inserted ...
Date | User | Action | Args
2011-11-16 12:16:26 | kxroberto | set | recipients: + kxroberto, fdrake, terry.reedy, jjlee, orsenthil, ezio.melotti, Neil Muller, eric.araujo, r.david.murray
2011-11-16 12:16:26 | kxroberto | set | messageid: <1321445786.44.0.978807070872.issue1486713@psf.upfronthosting.co.za>
2011-11-16 12:16:25 | kxroberto | link | issue1486713 messages
2011-11-16 12:16:24 | kxroberto | create |