HTMLParser parses attributes incorrectly. #57566

Closed
MichaelBrooks mannequin opened this issue Nov 6, 2011 · 7 comments

@MichaelBrooks
Mannequin

MichaelBrooks mannequin commented Nov 6, 2011

BPO 13357
Nosy @ezio-melotti
Files
  • red_test.html: HTML incorrectly parsed by HTMLParser
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ezio-melotti'
    closed_at = <Date 2011-11-17.15:25:10.856>
    created_at = <Date 2011-11-06.19:09:06.024>
    labels = ['type-bug', 'library']
    title = 'HTMLParser parses attributes incorrectly.'
    updated_at = <Date 2011-11-17.15:25:10.854>
    user = 'https://bugs.python.org/MichaelBrooks'

    bugs.python.org fields:

    activity = <Date 2011-11-17.15:25:10.854>
    actor = 'ezio.melotti'
    assignee = 'ezio.melotti'
    closed = True
    closed_date = <Date 2011-11-17.15:25:10.856>
    closer = 'ezio.melotti'
    components = ['Library (Lib)']
    creation = <Date 2011-11-06.19:09:06.024>
    creator = 'Michael.Brooks'
    dependencies = []
    files = ['23618']
    hgrepos = []
    issue_num = 13357
    keywords = []
    message_count = 7.0
    messages = ['147169', '147170', '147177', '147179', '147182', '147615', '147804']
    nosy_count = 3.0
    nosy_names = ['ezio.melotti', 'Michael.Brooks', 'python-dev']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue13357'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @MichaelBrooks
    Mannequin Author

    MichaelBrooks mannequin commented Nov 6, 2011

    Open the attached file "red_test.html" in a browser. The "bad" elements are rendered blue because no known browser parses their style attribute. HTMLParser, however, incorrectly recognizes these attributes.
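One way to see what HTMLParser reports for a given piece of markup is to record the attributes passed to `handle_starttag`. The following is a minimal sketch, not the reporter's reproduction: the contents of red_test.html are not shown in this thread, so `sample` is a hypothetical, well-formed input, and `AttrRecorder` and `start_tags` are illustrative names. It uses the Python 3 module `html.parser`; on Python 2.7 the class lives in the `HTMLParser` module instead.

```python
from html.parser import HTMLParser


class AttrRecorder(HTMLParser):
    """Collect (tag, attrs) pairs for every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples as parsed by HTMLParser.
        self.seen.append((tag, attrs))


def start_tags(markup):
    """Feed markup to a fresh parser and return the recorded start tags."""
    parser = AttrRecorder()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Hypothetical input; substitute a line from the file under investigation.
sample = '<p style="color: red">good</p>'
for tag, attrs in start_tags(sample):
    print(tag, attrs)
```

Comparing the recorded attributes with how a browser renders the same markup shows whether the parser accepts an attribute that browsers ignore.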

    @MichaelBrooks MichaelBrooks mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Nov 6, 2011
    @ezio-melotti
    Member

    Thanks for the report.
    Could you try with the latest 2.7 and see if you can reproduce the problem? (see the devguide for instructions.)

    If you can reproduce the issue even on the latest 2.7, it would be great if you could provide a patch with a test case like the ones in Lib/test/test_htmlparser.py.
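A test case of the kind requested above might look roughly like the sketch below. It is only loosely modeled on the event-collecting style of the standard library's HTMLParser tests; `StartTagCollector`, `AttributeParsingTest`, `check`, and `run_tests` are hypothetical names, and the inputs are simple cases chosen for illustration rather than lines from red_test.html. It is written for Python 3 (`html.parser`).

```python
import unittest
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record each start tag as a ("starttag", tag, attrs) event."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))


class AttributeParsingTest(unittest.TestCase):
    def check(self, source, expected):
        # Parse the source and compare the collected events.
        parser = StartTagCollector()
        parser.feed(source)
        parser.close()
        self.assertEqual(parser.events, expected)

    def test_quoted_value(self):
        self.check('<a href="x.html">',
                   [("starttag", "a", [("href", "x.html")])])

    def test_unquoted_value(self):
        self.check("<a href=x.html>",
                   [("starttag", "a", [("href", "x.html")])])


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AttributeParsingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

A regression test for this issue would add one `check` call per problematic attribute from the attached file, with the expected events set to what browsers actually honor.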

    @MichaelBrooks
    Mannequin Author

    MichaelBrooks mannequin commented Nov 6, 2011

    Yes, I am running the latest release, which is Python 2.7.2.


    @ezio-melotti
    Member

    I mean 2.7.3 (i.e. the development version).
    You need to get a clone of Python as explained here: http://docs.python.org/devguide/

    @MichaelBrooks
    Mannequin Author

    MichaelBrooks mannequin commented Nov 6, 2011

    Python 2.7.3 is still affected by both of these issues.



    @ezio-melotti ezio-melotti self-assigned this Nov 14, 2011
    @python-dev
    Mannequin

    python-dev mannequin commented Nov 14, 2011

    New changeset 3c3009f63700 by Ezio Melotti in branch '2.7':
    bpo-1745761, bpo-755670, bpo-13357, bpo-12629, bpo-1200313: improve attribute handling in HTMLParser.
    http://hg.python.org/cpython/rev/3c3009f63700

    New changeset 16ed15ff0d7c by Ezio Melotti in branch '3.2':
    bpo-1745761, bpo-755670, bpo-13357, bpo-12629, bpo-1200313: improve attribute handling in HTMLParser.
    http://hg.python.org/cpython/rev/16ed15ff0d7c

    New changeset 426f7a2b1826 by Ezio Melotti in branch 'default':
    bpo-1745761, bpo-755670, bpo-13357, bpo-12629, bpo-1200313: merge with 3.2.
    http://hg.python.org/cpython/rev/426f7a2b1826

    @ezio-melotti
    Member

    I verified the fix against the red_test.html you provided, and HTMLParser now seems to parse everything correctly, so I'm closing this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022