Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pure Python xml.etree.ElementTree is missing default attribute values #86317

Closed
obfusk mannequin opened this issue Oct 26, 2020 · 10 comments
Closed

Pure Python xml.etree.ElementTree is missing default attribute values #86317

obfusk mannequin opened this issue Oct 26, 2020 · 10 comments
Labels
3.10 only security fixes topic-XML type-bug An unexpected behavior, bug, or error

Comments

@obfusk
Copy link
Mannequin

obfusk mannequin commented Oct 26, 2020

BPO 42151
Nosy @rhettinger, @scoder, @serhiy-storchaka, @mattip, @corona10, @obfusk
PRs
  • bpo-42151: don't set specified_attributes=1 in pure Python ElementTree #22987
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-02-24.02:28:26.727>
    created_at = <Date 2020-10-26.03:51:30.021>
    labels = ['expert-XML', 'type-bug', '3.10']
    title = 'Pure Python xml.etree.ElementTree is missing default attribute values'
    updated_at = <Date 2021-02-24.02:28:26.726>
    user = 'https://github.com/obfusk'

    bugs.python.org fields:

    activity = <Date 2021-02-24.02:28:26.726>
    actor = 'corona10'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-02-24.02:28:26.727>
    closer = 'corona10'
    components = ['XML']
    creation = <Date 2020-10-26.03:51:30.021>
    creator = 'obfusk'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42151
    keywords = ['patch']
    message_count = 10.0
    messages = ['379637', '379672', '379739', '379796', '379824', '383493', '383558', '387569', '387601', '387602']
    nosy_count = 7.0
    nosy_names = ['rhettinger', 'scoder', 'eli.bendersky', 'serhiy.storchaka', 'mattip', 'corona10', 'obfusk']
    pr_nums = ['22987']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue42151'
    versions = ['Python 3.10']

    @obfusk
    Copy link
    Mannequin Author

    obfusk mannequin commented Oct 26, 2020

    I originally reported this as a bug in PyPy, but it turns out that CPython's C implementation (_elementtree) behaves differently than the pure Python version (b/c it sets specified_attributes = 1).

    PyPy issue with example code: https://foss.heptapod.net/pypy/pypy/-/issues/3333

    @obfusk obfusk mannequin added stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes type-bug An unexpected behavior, bug, or error labels Oct 26, 2020
    @scoder
    Copy link
    Contributor

    scoder commented Oct 26, 2020

    The patch looks right. I'm not sure if this can still be changed in Py3.8, though, since that has been around for quite a while now.

    Admittedly, few people will disable the C accelerator module and thus whitness this issue, but for them, this is a breaking change, and some code might rely on the current behaviour. I have no way to tell how much, and whether it intentionally relies on it.

    I'd definitely change this for 3.9 and later. Maybe for 3.8, but it's at least a bit of a risk, given that there will only be very few more minor releases for it, and given that this is how things have been working for years. So, rather not, unless there is a convincing argument for backporting the change.

    @scoder scoder added topic-XML and removed stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 only security fixes labels Oct 26, 2020
    @serhiy-storchaka
    Copy link
    Member

    specified_attributes = True is also set in xml.dom.expatbuilder. Should not it be set to true in the C implementation of ElementTree?

    @obfusk
    Copy link
    Mannequin Author

    obfusk mannequin commented Oct 27, 2020

    specified_attributes = True is also set in xml.dom.expatbuilder.

    That's good to know and should perhaps be addressed as well.

    Should not it be set to true in the C implementation of ElementTree?

    That would break existing code. Including mine.

    I also think the current behaviour of the C implementation makes a lot more sense, especially as there is currently no way to request the alternative.

    I think using specified_attributes=False as the default behaviour for both implementations is the best solution. But I certainly would not oppose adding e.g. a keyword argument to override the default behaviour for those who would prefer the alternative.

    @scoder
    Copy link
    Contributor

    scoder commented Oct 28, 2020

    In general, since the C accelerator is enabled by default, and few people would consider disabling it explicitly, I generally consider the behaviour of the C implementation to be "right", if both implementations differ.

    As a single data point, the reason why the difference was found in this case was differing behaviour in PyPy (which uses only the Python implementation). It was only later found to be a problem on the CPython side.

    Changing the behaviour of the C implementation would certainly break a lot more code than changing the Python implementation.

    @mattip
    Copy link
    Contributor

    mattip commented Dec 21, 2020

    Is there an owner for the XML module that can make a decision? The PR has a test that shows this fix brings the python implementation into sync with the C implementation, which is, unintuitively, the "reference implementation".

    @rhettinger
    Copy link
    Contributor

    Changing the behaviour of the C implementation would
    certainly break a lot more code than changing the Python
    implementation.

    +1 for changing only the Python implementation.

    @mattip
    Copy link
    Contributor

    mattip commented Feb 23, 2021

    PyPy issue https://foss.heptapod.net/pypy/pypy/-/issues/3181 shows another problem with the pure-python ElementTree implementation, that again is not reflected in the C implementation. Is there a code owner for this stdlib module?

    @corona10
    Copy link
    Member

    New changeset 1f43340 by Felix C. Stegerman in branch 'master':
    bpo-42151: don't set specified_attributes=1 in pure Python ElementTree (GH-22987)
    1f43340

    @corona10
    Copy link
    Member

    @obfusk
    Thank you Felix for reporting and contributing!

    @corona10 corona10 removed the 3.9 only security fixes label Feb 24, 2021
    @corona10 corona10 removed the 3.9 only security fixes label Feb 24, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes topic-XML type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants