xml.etree.ElementTree - XMLParser and TreeBuilder's doctype() method missing #58215
The C accelerator of xml.etree.ElementTree (used by default since bpo-13988) does not define or use the doctype() methods of the XMLParser and TreeBuilder classes, although this method is documented. As far as I can tell, this problem exists in 3.2 as well and wasn't introduced by the changes in bpo-13988. |
The problem is deeper. _elementtree does not expose XMLParser and TreeBuilder as types at all, just as factory functions. XMLParser: not sure if it was meant to be subclassed. If not, it should at least be documented. In any case, the XMLParser in _elementtree does not use the doctype() method on its target. TreeBuilder: was definitely meant to be subclassed (it's documented explicitly), so it must be exposed as a type, not as a factory function. |
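The distinction between a type and a factory function can be checked directly. The following is a minimal sketch, not part of the original report; the helper name `is_subclassable` is hypothetical, and it assumes a CPython release in which these names are exposed as classes.

```python
import xml.etree.ElementTree as ET


def is_subclassable(obj):
    """Return True if obj is a class that accepts a trivial subclass.

    Hypothetical helper for illustration only.
    """
    if not isinstance(obj, type):
        # A factory function is callable but is not a type.
        return False
    try:
        # Build a throwaway subclass; a type that forbids
        # subclassing raises TypeError here.
        type("Probe", (obj,), {})
    except TypeError:
        return False
    return True


# Check the three names discussed in this thread.
results = {
    name: is_subclassable(getattr(ET, name))
    for name in ("Element", "TreeBuilder", "XMLParser")
}
print(results)
```

A factory function would report False in this check, which is the situation described above for the C implementation at the time.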
For the doctype() issue, the method might be removed, since it was already deprecated in 3.2 (in compliance with PEP-387). However, the deprecation cycle is under discussion for the 3.x series (bpo-13248). For subclassing, you hit the same issue for all the functions exposed by _elementtree, including the Element factory. |
It doesn't really matter if something was *meant* to be subclassed. If it can be in 3.x, and can't be in 3.x+1, that's a sort of backwards compatibility bug we want to avoid pretty strongly because it's gratuitous breakage. |
The class-versus-factory issue has moved to bpo-14128. The current issue is only about the doctype() method missing in the C implementation. I propose to drop this deprecated method from the Python version, instead of implementing the deprecated method in the C version. |
Florent, the deprecation should probably be raised separately on pydev. From the recent discussions on this and similar topics, I doubt that removal of these methods will be accepted. |
I understand the point about compatibility. I am not opposed to adding the deprecated method in the C implementation, but it will need someone to do the patch, taking care of raising the deprecation warning correctly, and taking care of the case where XMLParser is subclassed. Is it worth the hassle? Please note that, contrary to what is stated in the first message (msg153328), the doctype() method is not defined on the default TreeBuilder (Python) class. The documentation suggests adding it on custom TreeBuilder implementations. |
This last feature (doctype handler on custom TreeBuilder) does not have tests... So, it is certainly broken with the C implementation. |
Two other differences:
And I confirm that if you implement the "doctype()" method on a custom TreeBuilder object, the C XMLParser ignores it, while the Python version works fine. I propose:
|
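For reference, the custom-target case described above can be sketched as follows. This is an illustration rather than code from the thread: the class name `DoctypeRecordingBuilder` and its `doctypes` attribute are hypothetical, and the sketch assumes a Python 3 release in which XMLParser forwards the declaration to a doctype() method on its target.

```python
import xml.etree.ElementTree as ET


class DoctypeRecordingBuilder(ET.TreeBuilder):
    """Hypothetical target that records every doctype declaration."""

    def __init__(self):
        super().__init__()
        self.doctypes = []

    def doctype(self, name, pubid, system):
        # Called by the parser with the doctype name, the public
        # identifier (None when absent) and the system identifier.
        self.doctypes.append((name, pubid, system))


builder = DoctypeRecordingBuilder()
parser = ET.XMLParser(target=builder)
parser.feed(b'<!DOCTYPE blaua SYSTEM "id"><elem/>')
root = parser.close()  # returns whatever the target's close() returns

print(builder.doctypes)
print(root.tag)
```

A test along these lines would exercise the doctype handler on a custom TreeBuilder, which the comments above note was untested.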
On Sun, Feb 26, 2012 at 23:49, Florent Xicluna <report@bugs.python.org>wrote: Yes, these suggestions sound reasonable to me. Moving toward two more Eli |
New changeset 39cc025968f1 by Florent Xicluna in branch 'default': New changeset 47016103185f by Florent Xicluna in branch 'default': |
New changeset 717632ae7b3f by Eli Bendersky in branch 'default': |
I'm working on completing this issue. It will be done in stages, with each stage committed as a separate chunk of functionality.
The first bullet was implemented and committed --^ |
New changeset 20b8f0ee3d64 by Eli Bendersky in branch 'default': |
|
New changeset a29ae1c2b8b2 by Eli Bendersky in branch 'default': |
New changeset 6f9bfcc1896f by Eli Bendersky in branch 'default': |
The latest commit completes this issue. |
This is causing buildbot failures on some of the buildbots: http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%203.x/builds/2529/steps/test/logs/stdio |
Thanks. Fixed in changeset eb1d633fe307. I'll watch the bots to see no problems remain. |
It seems like your changes have introduced a segfault: bpo-16089 |
There is an issue in Python 2.7 (and 3.2 if that matters) with the DeprecationWarning for the doctype() method being triggered internally. It is a bit different from this issue but I think a solution has already been committed for 3.3, and the commit message references this issue. Let me know if I should raise a separate report. The warning is triggered for the Python version of the ElementTree module:
$ python2.7 -Wall
Python 2.7.5 (default, Sep 6 2013, 09:55:21)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.etree import ElementTree
>>> ElementTree.XML(b'<!DOCTYPE blaua SYSTEM "id"><elem/>')
/usr/lib/python2.7/xml/etree/ElementTree.py:1627: DeprecationWarning: This method of XMLParser is deprecated. Define doctype() method on the TreeBuilder target.
DeprecationWarning,
/usr/lib/python2.7/xml/etree/ElementTree.py:1627: DeprecationWarning: This method of XMLParser is deprecated. Define doctype() method on the TreeBuilder target.
DeprecationWarning,
<Element 'elem' at 0xd47910>
>>>
A possible solution is the patch hunk below, taken from this commit: http://hg.python.org/cpython/diff/47016103185f/Lib/xml/etree/ElementTree.py#l1.99
@@ -1636,7 +1627,7 @@ class XMLParser:
                     pubid = pubid[1:-1]
                 if hasattr(self.target, "doctype"):
                     self.target.doctype(name, pubid, system[1:-1])
-                elif self.doctype is not self._XMLParser__doctype:
+                elif self.doctype != self._XMLParser__doctype:
                     # warn about deprecated call
                     self._XMLParser__doctype(name, pubid, system[1:-1])
                     self.doctype(name, pubid, system[1:-1]) |
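The one-character change in that hunk matters because each attribute access on an instance creates a fresh bound-method object, so an identity test between two accesses does not detect overriding, while an equality test compares the underlying function and instance. A minimal sketch, using hypothetical class names `Base` and `Sub` and an alias `_base_doctype` that stands in for the name-mangled `_XMLParser__doctype`:

```python
class Base:
    def doctype(self, name, pubid, system):
        pass

    # Class-level alias to the original method, standing in for the
    # private alias kept by XMLParser (assumption for illustration).
    _base_doctype = doctype


class Sub(Base):
    def doctype(self, name, pubid, system):
        # Overrides the method, so the alias now points elsewhere.
        pass


plain = Base()
# Two separate attribute accesses yield two distinct bound-method
# objects, so the identity test reports a difference even though
# nothing was overridden.
identity_says_different = plain.doctype is not plain._base_doctype
# Equality compares the wrapped function and the bound instance.
equality_says_different = plain.doctype != plain._base_doctype

overriding = Sub()
override_detected = overriding.doctype != overriding._base_doctype

print(identity_says_different, equality_says_different, override_detected)
```

With the identity test, the deprecated branch is taken on every parse, which matches the spurious warnings in the session above; with the equality test it is taken only when a subclass overrides doctype().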
Martin, do you see a way to work around the problem? I'm not sure it's serious enough to warrant committing to the 2.7.x branch at this point. |
The best way to work around it for me is just to ignore the warning. It doesn’t really worry me that much, I only noticed it while porting a program to Python 3 anyway. So if you don’t want to touch the 2.7 branch I can live with that :) |
On Mon, Oct 28, 2013 at 6:00 AM, Martin Panter <report@bugs.python.org>wrote:
I'd rather not. 2.7.x is in maintenance mode - we'll fix serious bugs and |