classification
Title: HTMLParser parses attributes incorrectly.
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: Michael.Brooks, ezio.melotti, python-dev
Priority: high Keywords:

Created on 2011-11-06 19:09 by Michael.Brooks, last changed 2011-11-17 15:25 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description
red_test.html Michael.Brooks, 2011-11-06 19:09 HTML incorrectly parsed by HTMLParser
Messages (7)
msg147169 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 19:09
Open the attached file "red_test.html" in a browser. The "bad" elements are blue because the style tag isn't parsed by any known browser. However, the HTMLParser library incorrectly recognizes them.
msg147170 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-11-06 19:14
Thanks for the report.
Could you try with the latest 2.7 and see if you can reproduce the problem? (see the devguide for instructions.)

If you can reproduce the issue even on the latest 2.7, it would be great if you could provide a patch with a test case like the ones in Lib/test/test_htmlparser.py.
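
A minimal sketch of how such a check could be structured, for readers who want to see what HTMLParser reports for a given snippet. It is not taken from Lib/test/test_htmlparser.py or from red_test.html: the `AttrCollector` class, the `parse_attrs` helper, and the sample markup are hypothetical names and input chosen for illustration, and the import assumes Python 3 (in Python 2.7 the module is named `HTMLParser`).

```python
from html.parser import HTMLParser  # Python 3 location (assumption); Python 2.7 uses the HTMLParser module


class AttrCollector(HTMLParser):
    """Hypothetical helper: records each start tag with its parsed attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs as the parser understood them.
        self.seen.append((tag, attrs))


def parse_attrs(markup):
    """Feed markup to the parser and return the recorded (tag, attrs) pairs."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Illustrative, well-formed input; NOT the content of red_test.html.
sample = '<p style="color: red" class=plain>text</p>'
for tag, attrs in parse_attrs(sample):
    print(tag, attrs)
```

Replacing `sample` with the markup under suspicion and comparing the recorded pairs against what a browser applies is one way to turn a report like this into a test case.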
msg147177 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 19:54
Yes, I am running the latest version, which is Python 2.7.2.

On Sun, Nov 6, 2011 at 12:14 PM, Ezio Melotti <report@bugs.python.org> wrote:

>
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
>
> Thanks for the report.
> Could you try with the latest 2.7 and see if you can reproduce the
> problem? (see the devguide for instructions.)
>
> If you can reproduce the issue even on the latest 2.7, it would be great
> if you could provide a patch with a test case like the ones in
> Lib/test/test_htmlparser.py.
>
> ----------
> nosy: +ezio.melotti
> stage:  -> test needed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue13357>
> _______________________________________
>
msg147179 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-11-06 19:56
I mean 2.7.3 (i.e. the development version).
You need to get a clone of Python as explained here: http://docs.python.org/devguide/
msg147182 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 20:26
Python 2.7.3 is still affected by both of these issues.

msg147615 - Author: Roundup Robot (python-dev) Date: 2011-11-14 16:57
New changeset 3c3009f63700 by Ezio Melotti in branch '2.7':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser.
http://hg.python.org/cpython/rev/3c3009f63700

New changeset 16ed15ff0d7c by Ezio Melotti in branch '3.2':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser.
http://hg.python.org/cpython/rev/16ed15ff0d7c

New changeset 426f7a2b1826 by Ezio Melotti in branch 'default':
#1745761, #755670, #13357, #12629, #1200313: merge with 3.2.
http://hg.python.org/cpython/rev/426f7a2b1826
msg147804 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-11-17 15:25
I verified with the red_test.html you provided and now HTMLParser seems to parse everything correctly, so I'm closing this.
History
Date User Action Args
2011-11-17 15:25:10 ezio.melotti set status: open -> closed
versions: + Python 3.2, Python 3.3
messages: + msg147804
resolution: fixed
stage: test needed -> committed/rejected
2011-11-14 16:57:16 python-dev set nosy: + python-dev
messages: + msg147615
2011-11-14 12:44:10 ezio.melotti set assignee: ezio.melotti
2011-11-07 05:45:58 rhettinger set priority: normal -> high
2011-11-06 20:26:10 Michael.Brooks set messages: + msg147182
2011-11-06 19:56:24 ezio.melotti set messages: + msg147179
2011-11-06 19:54:06 Michael.Brooks set messages: + msg147177
2011-11-06 19:14:18 ezio.melotti set nosy: + ezio.melotti
messages: + msg147170
stage: test needed
2011-11-06 19:09:06 Michael.Brooks create