Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minidom xmlns not handling spaces in xmlns attribute value field #56429

Closed
hfischer mannequin opened this issue May 30, 2011 · 10 comments
Closed

minidom xmlns not handling spaces in xmlns attribute value field #56429

hfischer mannequin opened this issue May 30, 2011 · 10 comments
Labels
topic-XML type-bug An unexpected behavior, bug, or error

Comments

@hfischer
Copy link
Mannequin

hfischer mannequin commented May 30, 2011

BPO 12220
Nosy @terryjreedy, @bitdancer
Files
  • test.xml
  • minidom_space_char_in_namespace.patch
  • minidom_space_char_in_namespace_with_test.patch
  • minidom_space_char_in_namespace_unsupported.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-04-20.04:49:39.125>
    created_at = <Date 2011-05-30.21:06:13.340>
    labels = ['expert-XML', 'type-bug']
    title = 'minidom xmlns not handling spaces in xmlns attribute value field'
    updated_at = <Date 2014-04-20.04:49:39.124>
    user = 'https://bugs.python.org/hfischer'

    bugs.python.org fields:

    activity = <Date 2014-04-20.04:49:39.124>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-04-20.04:49:39.125>
    closer = 'r.david.murray'
    components = ['XML']
    creation = <Date 2011-05-30.21:06:13.340>
    creator = 'hfischer'
    dependencies = []
    files = ['22203', '29812', '34819', '34866']
    hgrepos = []
    issue_num = 12220
    keywords = ['patch']
    message_count = 10.0
    messages = ['137329', '137609', '186780', '187979', '187992', '216099', '216221', '216282', '216897', '216898']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'r.david.murray', 'hfischer', 'python-dev', 'amathew', 'mstepniowski']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue12220'
    versions = ['Python 3.4']

    @hfischer
    Copy link
    Mannequin Author

    hfischer mannequin commented May 30, 2011

    Minidom raises an exception if there's a space anywhere in the URI of an xmlns, but it is legal (but terrible practice) to have spaces in URIs. I think this should work or politely raise a syntax error. E.g., this fails: xmlns:abc="http:abc.com/de f g/hi/j k".

    The attachment xml file from an end user has this xmlns:

    xmlns:verrels=" http://xbrl.org/2010/versioning-relationship-sets"

    which causes minidom to raise a ValueError exception, instead of a sensible syntax error message.

    The relevant python code is expabuilder.py, method _parse_ns_name, which does not have an elif for len(parts) != 2 (to raise a syntax error which identifies the bad construct).

    @hfischer hfischer mannequin added type-crash A hard crash of the interpreter, possibly with a core dump topic-XML labels May 30, 2011
    @ned-deily ned-deily added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels May 30, 2011
    @terryjreedy
    Copy link
    Member

    SyntaxErrors refer to Python syntax errors; they are raised during parsing of *Python* code. An error in the value given to a Python sensibly raises a ValueError unless a module does something more specific.

    From the xml.dom doc
    "DOM Level 2 recommendation defines a single exception, DOMException"
    One subclass is "exception xml.dom.SyntaxErr -- Raised when an invalid or illegal string is specified." which would be appropriate here.

    However, "The xml.dom.minidom module is essentially a DOM 1.0-compatible DOM with some DOM 2 features (primarily namespace features)." In particular, "DOMException is currently not supported in xml.dom.minidom. Instead, xml.dom.minidom uses standard Python exceptions such as TypeError and AttributeError." or ValueError.

    An improved error report could go into 2.7/3.2.
    A change in minidom spec to use DOMException would be a feature request for 3.3 or later (and a bigger project -- code welcome). For the moment, I am assuming that you are requesting the former.

    A Python exception is not a crash. A crash is a Segmentation Fault (*nix) or 'Your program stopped unexpectedly' (Windows)

    @amathew
    Copy link
    Mannequin

    amathew mannequin commented Apr 13, 2013

    I added a more descriptive error message for invalid namespaces. I agree that it would be great to eventually move to DOMException's.

    @bitdancer
    Copy link
    Member

    Thanks for the patch. It would be nice to have a test before we commit this. The tests should use assertRaisesRegex to look for something specific to this error...probably the word 'syntax'...in the error text.

    On the other hand, if the spaces are technically legal, is calling it a syntax error appropriate? Perhaps the message should instead say something like "spaces in URIs is not supported"?

    @terryjreedy
    Copy link
    Member

    'unsupported syntax' would be more accurate, but I agree that saying what it is that is unsupported is even better.

    @mstepniowski
    Copy link
    Mannequin

    mstepniowski mannequin commented Apr 14, 2014

    Added test to amathew's patch.

    @bitdancer
    Copy link
    Member

    Thanks. Could you also change 'Invalid syntax' to 'Unsupported syntax', per the last bit of the discussion between Terry and I?

    @mstepniowski
    Copy link
    Mannequin

    mstepniowski mannequin commented Apr 15, 2014

    I agree that "Unsupported syntax" is a more accurate message. Changed in the newest patch.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 20, 2014

    New changeset 13c1c5e3d2ee by R David Murray in branch '3.4':
    bpo-12220: improve minidom error when URI contains spaces.
    http://hg.python.org/cpython/rev/13c1c5e3d2ee

    New changeset 3e67d923a0df by R David Murray in branch 'default':
    Merge: bpo-12220: improve minidom error when URI contains spaces.
    http://hg.python.org/cpython/rev/3e67d923a0df

    @bitdancer
    Copy link
    Member

    Thanks, amathew and Marek.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-XML type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants