msg153328 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-02-14 04:19 |
The C accelerator of xml.etree.ElementTree (used by default since issue 13988) does not use or define or use the doctype() methods of the XMLParser and TreeBuilder classes, although this method is documented.
As far as I can tell, this problem exists in 3.2 as well and wasn't introduced by the changes in issue 13988.
|
msg153330 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-02-14 04:51 |
The problem is deeper. _elementtree does not expose XMLParser and TreeBuilder as types at all, just as factory functions.
XMLParser: not sure if it was meant to be subclassed. If not, it should at least be documented. In any case, the XMLParser in _elementtree does not use the doctype() method on its target.
TreeBuilder: was definitely meant to be subclassed (it's documented explicitly), so it must be exposed as a type, not as a factory function.
|
msg153332 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-02-14 07:33 |
For the doctype() issue, it might be removed, since it was already deprecated in 3.2 (in compliance with PEP 387). However the deprecation cycle is under discussion for the 3.x serie. (issue 13248)
For subclassing, you hit the same issue for all the functions exposed by _elementtree, including Element fatory.
|
msg153934 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2012-02-22 08:28 |
It doesn't really matter if something was *meant* to be subclassed. If it can be in 3.x, and can't be in 3.x+1, that's a sort of backwards compatibility bug we want to avoid pretty strongly because it's gratuitous breakage.
|
msg154329 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-02-26 11:55 |
the class versus factory issue is gone to issue 14128.
The current issue is only about the doctype() method missing in the C implementation.
I propose to drop this deprecated method from the Python version, instead of implementing the deprecated method in the C version.
|
msg154333 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-02-26 12:02 |
Florent,
The deprecation should be probably raised separately on pydev. From the recent discussions on this and similar topics, I doubt that removal of these methods will be accepted.
|
msg154338 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-02-26 12:29 |
I understand the point about compatibility.
However it is slightly different here, because the method is already deprecated in Python 2.7 and 3.2, with a warning in the documentation and a DeprecationWarning at runtime.
This method was never available in the C version, and the documentation is clear about the recommended way of writing a custom doctype handler.
Arguably, it is not the most popular feature of ElementTree.
I am not opposed to adding the deprecated method in the C implementation, but it will need someone to do the patch, taking care of raising the deprecation warning correctly, and taking care of the case where XMLParser is subclassed. Is it worth the hassle?
Please not that contrary to what is stated in the first message (msg153328), the doctype() method is not defined on the default TreeBuilder (Python) class. The documentation suggests to add it on custom TreeBuilder implementations.
|
msg154343 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-02-26 13:13 |
This last feature (doctype handler on custom TreeBuilder) does not have tests...
So, it is certainly broken with the C implementation.
|
msg154398 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-02-26 21:49 |
Two other differences:
* the C TreeBuilder has an undocumented and unused method "xml"
* if you omit one of the TreeBuilder method (start(), end(), data(), close()) on you custom TreeBuilder implementation, the C XMLParser works fine, but the Python XMLParser raises an attribute error.
And I confirm that if you implement the "doctype()" method on a custom TreeBuilder object, the C XMLParser ignores it, while the Python version works fine.
I propose:
- to drop the undocumented, empty "TreeBuilder().xml" method.
- to relax the Python XMLParser, in order to accept incomplete TreeBuilder implementation (like the C version)
- to implement the "doctype()" handler in the C XMLParser (this needs some work)
|
msg154408 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-02-26 22:18 |
On Sun, Feb 26, 2012 at 23:49, Florent Xicluna <report@bugs.python.org>wrote:
Yes, these suggestions sound reasonable to me. Moving toward two more
consistent implementations of the API, while not disabling existing
features is the way to go. Things like relaxing the Python XMLParser just
add features rather than restricting them, so they should not result in
breakage due to backward-incompatibility.
Eli
|
msg154935 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-03-05 09:43 |
New changeset 39cc025968f1 by Florent Xicluna in branch 'default':
Issue #14007: drop unused TreeBuilder().xml.
http://hg.python.org/cpython/rev/39cc025968f1
New changeset 47016103185f by Florent Xicluna in branch 'default':
Issue #14007: accept incomplete TreeBuilder objects (missing start/end/data/close) for the Python implementation as well. Add disabled tests for the doctype() method.
http://hg.python.org/cpython/rev/47016103185f
|
msg161873 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-05-29 12:46 |
New changeset 717632ae7b3f by Eli Bendersky in branch 'default':
Issue #14007: make TreeBuilder an actual type exposed from _elementtree, and subclassable.
http://hg.python.org/cpython/rev/717632ae7b3f
|
msg161875 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-05-29 13:07 |
I'm working on completing this issue. It will be done in stages, with each stage committed as a separate chunk of functionality.
* Make TreeBuilder an actual exported type (subclassable) from _elementtree
* TreeBuilder constructor should accept arguments as documented
* Make XMLParser an actual exported type (subclassable) from _elementtree
* Make XMLParser call doctype on a given target builder, and on itself, as is done in the Python version
The first bullet was implemented and committed --^
|
msg161961 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-05-30 14:59 |
New changeset 20b8f0ee3d64 by Eli Bendersky in branch 'default':
Issue #14007: implemented the 'element_factory' feature of TreeBuilder in
http://hg.python.org/cpython/rev/20b8f0ee3d64
|
msg161987 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-05-31 09:38 |
* Another problem: the C implementation of XMLParser does not take 3 positional args, but only 2. Although the 'html' arg is documented as "unsupported", it should still be taken and silently ignored, similarly to the Python version
|
msg162048 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-06-01 04:16 |
New changeset a29ae1c2b8b2 by Eli Bendersky in branch 'default':
Issue #14007: make XMLParser a real subclassable type exported from _elementtree. +cleanups
http://hg.python.org/cpython/rev/a29ae1c2b8b2
|
msg162058 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-06-01 08:34 |
New changeset 6f9bfcc1896f by Eli Bendersky in branch 'default':
Issue #14007: implement doctype() method calling in XMLParser of _elementtree.
http://hg.python.org/cpython/rev/6f9bfcc1896f
|
msg162059 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-06-01 08:36 |
The latest commit completes this issue.
|
msg162179 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2012-06-02 21:31 |
This is causing buildbot failures on some of the buildbots:
http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%203.x/builds/2529/steps/test/logs/stdio
http://www.python.org/dev/buildbot/all/builders/x86%20Windows7%203.x/builds/5082/steps/test/logs/stdio
|
msg162189 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-06-03 03:11 |
Thanks. Fixed in changeset eb1d633fe307.
I'll watch the bots to see no problems remain.
|
msg171604 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2012-09-30 02:53 |
It seems like your changes have introduced a segfault: #16089
|
msg201519 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2013-10-28 09:18 |
There is an issue in Python 2.7 (and 3.2 if that matters) with the DeprecationWarning for the doctype() method being triggered internally. It is a bit different from this issue but I think a solution has already been committed for 3.3, and the commit message references this issue. Let me know if I should raise a separate report.
The warning is triggered for the Python version of the ElementTree module:
$ python2.7 -Wall
Python 2.7.5 (default, Sep 6 2013, 09:55:21)
[GCC 4.8.1 20130725 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.etree import ElementTree
>>> ElementTree.XML(b'<!DOCTYPE blaua SYSTEM "id"><elem/>')
/usr/lib/python2.7/xml/etree/ElementTree.py:1627: DeprecationWarning: This method of XMLParser is deprecated. Define doctype() method on the TreeBuilder target.
DeprecationWarning,
/usr/lib/python2.7/xml/etree/ElementTree.py:1627: DeprecationWarning: This method of XMLParser is deprecated. Define doctype() method on the TreeBuilder target.
DeprecationWarning,
<Element 'elem' at 0xd47910>
>>>
Possible solution is the patch hunk below, taken from this commit:
http://hg.python.org/cpython/diff/47016103185f/Lib/xml/etree/ElementTree.py#l1.99
@@ -1636,7 +1627,7 @@ class XMLParser:
pubid = pubid[1:-1]
if hasattr(self.target, "doctype"):
self.target.doctype(name, pubid, system[1:-1])
- elif self.doctype is not self._XMLParser__doctype:
+ elif self.doctype != self._XMLParser__doctype:
# warn about deprecated call
self._XMLParser__doctype(name, pubid, system[1:-1])
self.doctype(name, pubid, system[1:-1])
|
msg201533 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2013-10-28 12:53 |
Martin, do you see a way to work around the problem?
I'm not sure it's serious enough to warrant committing to the 2.7.x branch at this point.
|
msg201538 - (view) |
Author: Martin Panter (martin.panter) *  |
Date: 2013-10-28 13:00 |
The best way to work around it for me is just to ignore the warning. It doesn’t really worry me that much, I only noticed it while porting a program to Python 3 anyway. So if you don’t want to touch the 2.7 branch I can live with that :)
|
msg201539 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2013-10-28 13:06 |
On Mon, Oct 28, 2013 at 6:00 AM, Martin Panter <report@bugs.python.org>wrote:
>
> Martin Panter added the comment:
>
> The best way to work around it for me is just to ignore the warning. It
> doesn’t really worry me that much, I only noticed it while porting a
> program to Python 3 anyway. So if you don’t want to touch the 2.7 branch I
> can live with that :)
>
I'd rather not. 2.7.x is in maintenance mode - we'll fix serious bugs and
may change trivial things. But in this case, since the effects are hardly
serious we should consider the danger of mucking with code in a very stable
maintenance branch. While this issue may seem simple at the surface, it's
not. That's because ET in 3.3+ is quite different from ET in 3.2-
(including 2.7-). In 3.3 ET underwent major changes, and its testing
infrastructure was greatly improved too. Back-porting across the 3.3
boundary is risky because it's hard to foresee the effects (i.e.
interactions between the C accelerator and the Python module).
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:26 | admin | set | github: 58215 |
2013-10-28 13:06:10 | eli.bendersky | set | messages:
+ msg201539 |
2013-10-28 13:00:00 | martin.panter | set | messages:
+ msg201538 |
2013-10-28 12:53:47 | eli.bendersky | set | messages:
+ msg201533 |
2013-10-28 09:18:43 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg201519
|
2012-09-30 02:53:28 | christian.heimes | set | nosy:
+ christian.heimes messages:
+ msg171604
|
2012-06-03 03:12:00 | eli.bendersky | set | status: open -> closed
messages:
+ msg162189 |
2012-06-02 21:31:51 | r.david.murray | set | status: closed -> open nosy:
+ r.david.murray messages:
+ msg162179
|
2012-06-01 08:36:56 | eli.bendersky | set | status: open -> closed resolution: fixed messages:
+ msg162059
stage: test needed -> resolved |
2012-06-01 08:34:31 | python-dev | set | messages:
+ msg162058 |
2012-06-01 04:16:50 | python-dev | set | messages:
+ msg162048 |
2012-05-31 09:38:09 | eli.bendersky | set | messages:
+ msg161987 |
2012-05-30 14:59:11 | python-dev | set | messages:
+ msg161961 |
2012-05-29 13:07:57 | eli.bendersky | set | messages:
+ msg161875 |
2012-05-29 12:46:53 | python-dev | set | messages:
+ msg161873 |
2012-04-27 21:59:38 | Arfrever | set | priority: normal -> release blocker |
2012-03-05 09:43:03 | python-dev | set | nosy:
+ python-dev messages:
+ msg154935
|
2012-02-26 22:18:46 | eli.bendersky | set | messages:
+ msg154408 |
2012-02-26 21:49:37 | flox | set | messages:
+ msg154398 |
2012-02-26 13:13:33 | flox | set | messages:
+ msg154343 stage: needs patch -> test needed |
2012-02-26 12:29:31 | flox | set | priority: high -> normal
messages:
+ msg154338 |
2012-02-26 12:02:39 | eli.bendersky | set | messages:
+ msg154333 |
2012-02-26 11:55:24 | flox | set | messages:
+ msg154329 |
2012-02-22 08:28:19 | georg.brandl | set | nosy:
+ georg.brandl messages:
+ msg153934
|
2012-02-14 10:49:41 | Arfrever | set | nosy:
+ Arfrever
|
2012-02-14 07:33:49 | flox | set | messages:
+ msg153332 |
2012-02-14 04:51:46 | eli.bendersky | set | messages:
+ msg153330 |
2012-02-14 04:19:50 | eli.bendersky | set | messages:
+ msg153328 |
2012-02-14 04:19:29 | eli.bendersky | set | -> (no value) messages:
- msg153324 |
2012-02-14 04:16:26 | eli.bendersky | create | |