Parameter type error for xml.sax.parseString(string, ...) #54799

Closed
ThomasRyan mannequin opened this issue Nov 30, 2010 · 11 comments
Assignees
Labels
stdlib Python modules in the Lib dir topic-XML type-feature A feature request or enhancement

Comments

@ThomasRyan
Mannequin

ThomasRyan mannequin commented Nov 30, 2010

BPO 10590
Nosy @loewis, @birkenfeld, @terryjreedy, @tiran, @ezio-melotti, @serhiy-storchaka
Dependencies
  • bpo-2175: Expat sax parser silently ignores the InputSource protocol
  • bpo-17089: Expat parser parses strings only when XML encoding is UTF-8
Files
  • sax_parse_3.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-04-04.07:14:44.926>
    created_at = <Date 2010-11-30.19:37:35.249>
    labels = ['expert-XML', 'type-feature', 'library']
    title = 'Parameter type error for xml.sax.parseString(string, ...)'
    updated_at = <Date 2015-04-04.07:14:44.926>
    user = 'https://bugs.python.org/ThomasRyan'

    bugs.python.org fields:

    activity = <Date 2015-04-04.07:14:44.926>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-04-04.07:14:44.926>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)', 'XML']
    creation = <Date 2010-11-30.19:37:35.249>
    creator = 'Thomas.Ryan'
    dependencies = ['2175', '17089']
    files = ['38810']
    hgrepos = []
    issue_num = 10590
    keywords = ['patch']
    message_count = 11.0
    messages = ['122933', '180045', '180141', '183408', '183409', '183455', '183457', '231781', '239945', '239946', '240048']
    nosy_count = 11.0
    nosy_names = ['loewis', 'georg.brandl', 'terry.reedy', 'ygale', 'christian.heimes', 'ezio.melotti', 'eli.bendersky', 'Thomas.Ryan', 'tshepang', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue10590'
    versions = ['Python 3.5']

    @ThomasRyan
    Mannequin Author

    ThomasRyan mannequin commented Nov 30, 2010

    In 3.1.3, 3.1.2, maybe earlier...

    xml.sax.parseString(string, handler, error_handler=handler.ErrorHandler())

    Source code requires bytes, not a string as implied by function name and by the documentation.

    Exception thrown for strings.

    Since the name includes "string" the source should probably be fixed.
    Or at least update the documentation.

    Someday replace/augment parseString() with parseBytes()?
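
To make the report concrete, here is a minimal sketch of the two calls being compared. The `NameCollector` handler and the `element_names` helper are illustrative names, not part of `xml.sax`. On the versions named above the `str` call is expected to fail (assumed here to surface as a `TypeError`), while interpreters that include the later fix should accept both inputs.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names(data):
    """Parse `data` with xml.sax.parseString() and return the element names."""
    handler = NameCollector()
    xml.sax.parseString(data, handler)
    return handler.names


doc = "<root><item/><item/></root>"

# Bytes input is what the affected versions require.
names_from_bytes = element_names(doc.encode("utf-8"))

# Text input is what the function name suggests; assumed to raise
# TypeError on affected versions, so the failure is recorded, not hidden.
try:
    names_from_text = element_names(doc)
except TypeError:
    names_from_text = None

print(names_from_bytes, names_from_text)
```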

    @ThomasRyan ThomasRyan mannequin added type-bug An unexpected behavior, bug, or error topic-XML labels Nov 30, 2010
    @serhiy-storchaka
    Member

    Indeed, xml.dom.minidom.parseString() and xml.etree.ElementTree.fromstring() accept both bytes and strings, and xml.dom.minidom.parse(), xml.etree.ElementTree.parse() and even xml.sax.parse() accept both byte and text streams. Only xml.sax.parseString() rejects strings, in contrast to its name. This looks like a 2-to-3 porting bug.
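
A short sketch of the comparison above, for readers who want to check it on their own interpreter. The `RootRecorder` handler and the `sax_root` helper are made up for this illustration; the parsing calls are the standard library functions named in the comment.

```python
import io
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax
from xml.sax.handler import ContentHandler

text = "<root><item>a</item></root>"
data = text.encode("utf-8")

# minidom.parseString() and ElementTree.fromstring() take str or bytes.
dom_tags = [xml.dom.minidom.parseString(src).documentElement.tagName
            for src in (text, data)]
et_tags = [ET.fromstring(src).tag for src in (text, data)]


class RootRecorder(ContentHandler):
    """Hypothetical handler that remembers the first element it sees."""

    def __init__(self):
        super().__init__()
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


def sax_root(stream):
    """Run xml.sax.parse() on a stream and return the root element name."""
    handler = RootRecorder()
    xml.sax.parse(stream, handler)
    return handler.root


# xml.sax.parse() takes a text stream or a byte stream.
sax_roots = [sax_root(io.StringIO(text)), sax_root(io.BytesIO(data))]

print(dom_tags, et_tags, sax_roots)
```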

    @serhiy-storchaka
    Member

    Here is a patch which fixes this issue and a couple of related issues: bpo-1483, bpo-2174, bpo-2175, bpo-10590.

    @elibendersky
    Mannequin

    elibendersky mannequin commented Mar 3, 2013

    I'm not very knowledgeable about the other XML modules, but I hate to see this patch linger. Also, it seems to be a prerequisite for bpo-16986.

    Serhiy, since the patch is large could you give a short summary of the things it fixes? Note that the best approach IMHO is to submit and push minimal patches that fix specific issues and not lump several fixes together, unless it doesn't make sense to separate them.

    @tiran
    Member

    tiran commented Mar 3, 2013

    Please hold off on any modifications to the XML code until we have decided how we are going to fix the XML exploits.

    Also I think this is a new feature and not a fix. parseString() is documented as 'parses from a buffer string'. It doesn't say that it can parse text.

    @serhiy-storchaka
    Member

    The low-level part has already been extracted to bpo-17089 and committed. bpo-16986 has a similar patch for cElementTree. The main part of the patch was moved to bpo-2175, which is now a prerequisite for bpo-16986 and for this issue. It contains additional tests and additional fixes. It is hard, and makes little sense, to split them into separate patches. Let's move the discussion to bpo-2175.

    After that, the patch for this issue will be small and simple: only a few lines and one test. In any case, this issue is less important and can actually be considered a new feature.

    @serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Mar 4, 2013
    @elibendersky
    Mannequin

    elibendersky mannequin commented Mar 4, 2013

    Serhiy, OK - I'll look at bpo-2175 first. But yes, Christian is right, let's wait for the security issues to be resolved first.

    @serhiy-storchaka
    Member

    There has been no significant progress toward fixing the XML security issues. Maybe resolve bpo-2175 first?

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 2, 2015

    New changeset 3ac1b21fbb42 by Serhiy Storchaka in branch '2.7':
    Issue bpo-10590: Added tests for xml.sax.parse() and xml.sax.parseString().
    https://hg.python.org/cpython/rev/3ac1b21fbb42

    New changeset ca8666310eb3 by Serhiy Storchaka in branch '3.4':
    Issue bpo-10590: Added tests for xml.sax.parse() and xml.sax.parseString().
    https://hg.python.org/cpython/rev/ca8666310eb3

    New changeset 846c165cf643 by Serhiy Storchaka in branch 'default':
    Issue bpo-10590: Added tests for xml.sax.parse() and xml.sax.parseString().
    https://hg.python.org/cpython/rev/846c165cf643

    @serhiy-storchaka
    Member

    After resolving bpo-2175 and committing tests that work with the current code, only minimal changes are left. Here is a patch that adds support for a string argument in xml.sax.parseString().
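
For illustration only, the idea can be sketched as a standalone helper. This is not the attached patch: `parse_text_or_bytes` and `NameCollector` are hypothetical names. The sketch assumes the parser honors a character stream set on an `InputSource`, which is the behavior bpo-2175 was about, and it picks the stream type from the argument type.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


def parse_text_or_bytes(data, handler):
    """Hypothetical helper: parse `data` whether it is str or bytes."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)

    source = InputSource()
    if isinstance(data, str):
        # Text goes in as a character stream (already decoded).
        source.setCharacterStream(io.StringIO(data))
    else:
        # Bytes go in as a byte stream; the parser detects the encoding.
        source.setByteStream(io.BytesIO(data))
    parser.parse(source)


class NameCollector(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


doc = "<root><item/></root>"

text_handler = NameCollector()
parse_text_or_bytes(doc, text_handler)

bytes_handler = NameCollector()
parse_text_or_bytes(doc.encode("utf-8"), bytes_handler)

print(text_handler.names, bytes_handler.names)
```

Dispatching on the argument type keeps existing bytes callers unchanged while letting text callers skip a manual `encode()` step.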

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir and removed topic-unicode labels Apr 2, 2015
    @python-dev
    Mannequin

    python-dev mannequin commented Apr 4, 2015

    New changeset fca669149d8a by Serhiy Storchaka in branch 'default':
    Issue bpo-10590: xml.sax.parseString() now supports string argument.
    https://hg.python.org/cpython/rev/fca669149d8a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022