minidom xmlns not handling spaces in xmlns attribute value field #56429
Comments
Minidom raises an exception if there's a space anywhere in the URI of an xmlns attribute, but it is legal (though terrible practice) to have spaces in URIs. I think this should either work or politely raise a syntax error. E.g., this fails: xmlns:abc="http:abc.com/de f g/hi/j k". The attached XML file from an end user has this xmlns: xmlns:verrels=" http://xbrl.org/2010/versioning-relationship-sets", which causes minidom to raise a ValueError exception instead of a sensible syntax error message. The relevant Python code is expatbuilder.py, function _parse_ns_name, which does not have an elif for len(parts) != 2 (to raise a syntax error which identifies the bad construct).
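A minimal reproduction sketch of the report above. The `abc:foo` child element and the helper name `try_parse` are illustrative assumptions; the child is there because the failure appears to need a name that actually uses the namespace, not just the declaration. Which exception type is raised may depend on the Python and Expat versions in use (some Expat releases appear to reject such URIs themselves), so the sketch catches both.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Namespace URI with embedded spaces, as in the report. The abc:foo
# element is an assumption added so the namespace is actually used.
DOC = '<element xmlns:abc="http:abc.com/de f g/hi/j k"><abc:foo /></element>'


def try_parse(text):
    """Return (exception type name, message) if parsing fails, else None."""
    try:
        parseString(text)
    except (ValueError, ExpatError) as exc:
        # ValueError comes from minidom's builder; ExpatError from the
        # underlying parser. Either may occur depending on versions.
        return type(exc).__name__, str(exc)
    return None


print(try_parse(DOC))
```

The same namespace declaration without spaces in the URI should parse cleanly, which helps confirm that the spaces are the trigger.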
SyntaxErrors refer to Python syntax errors; they are raised during parsing of *Python* code. An error in a value given to a Python function sensibly raises a ValueError unless a module does something more specific. From the xml.dom doc: "The xml.dom.minidom module is essentially a DOM 1.0-compatible DOM with some DOM 2 features (primarily namespace features)." In particular, "DOMException is currently not supported in xml.dom.minidom. Instead, xml.dom.minidom uses standard Python exceptions such as TypeError and AttributeError." or ValueError. An improved error report could go into 2.7/3.2. A Python exception is not a crash. A crash is a Segmentation Fault (*nix) or 'Your program stopped unexpectedly' (Windows).
I added a more descriptive error message for invalid namespaces. I agree that it would be great to eventually move to DOMExceptions.
Thanks for the patch. It would be nice to have a test before we commit this. The test should use assertRaisesRegex to look for something specific to this error (probably the word 'syntax') in the error text. On the other hand, if the spaces are technically legal, is calling it a syntax error appropriate? Perhaps the message should instead say something like "spaces in URIs is not supported"?
'unsupported syntax' would be more accurate, but I agree that saying what it is that is unsupported is even better.
Added test to amathew's patch.
Thanks. Could you also change 'Invalid syntax' to 'Unsupported syntax', per the last bit of the discussion between Terry and me?
I agree that "Unsupported syntax" is a more accurate message. Changed in the newest patch.
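The branch and test discussed above can be sketched roughly as follows. This is a standalone illustration, not the committed patch: `split_ns_name` is a hypothetical stand-in for `_parse_ns_name`, the "uri localname prefix" name format is an assumption about how the parser reports namespaced names, and the exact message wording after "Unsupported syntax" is also assumed.

```python
import unittest


def split_ns_name(name):
    """Split a space-joined namespaced name into (uri, localname, prefix).

    Hypothetical helper for illustration. Assumes names arrive as
    "uri localname prefix" or "uri localname", joined by single spaces.
    """
    parts = name.split(" ")
    if len(parts) == 3:
        uri, localname, prefix = parts
    elif len(parts) == 2:
        uri, localname = parts
        prefix = None
    else:
        # A space inside the URI produces extra parts. Report the bad
        # construct instead of failing on tuple unpacking.
        raise ValueError(
            "Unsupported syntax: spaces in URIs not supported: %r" % name
        )
    return uri, localname, prefix


class SplitNsNameTest(unittest.TestCase):
    def test_plain_uri(self):
        self.assertEqual(
            split_ns_name("http://example.com/ns foo abc"),
            ("http://example.com/ns", "foo", "abc"),
        )

    def test_space_in_uri(self):
        # Match on text specific to this error, as suggested above.
        with self.assertRaisesRegex(ValueError, "Unsupported syntax"):
            split_ns_name("http:abc.com/de f g/hi/j k foo abc")
```

Saved to a file, the test case can be run with `python -m unittest`. Matching on "Unsupported syntax" rather than the full message keeps the test stable if the rest of the wording changes.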
New changeset 13c1c5e3d2ee by R David Murray in branch '3.4':
New changeset 3e67d923a0df by R David Murray in branch 'default':
Thanks, amathew and Marek.