Issue 17902: Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/62102

classification

Title:	Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
Type:	behavior	Stage:	resolved
Components:	Documentation, XML	Versions:	Python 3.3, Python 3.4

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	Aaron.Oakley, docs@python, eli.bendersky, python-dev
Priority:	normal	Keywords:	patch

Created on 2013-05-03 20:56 by Aaron.Oakley, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
elementtree.rst-340a0.patch	Aaron.Oakley, 2013-05-03 20:56	xml.etree.ElementTree Documentation patch

Messages (7)
msg188329 - (view)	Author: Aaron Oakley (Aaron.Oakley) *	Date: 2013-05-03 20:56
It would really help to document that the C API can only use the default xml.etree.ElementTree.TreeBuilder for targets with iterparse (and by extension, IncrementalParser). I got a nice surprise about that when I went from 3.2 to 3.3 and started getting "TypeError: event handling only supported for ElementTree.TreeBuilder targets". I included a patch to add notes to iterparse and IncrementalParser, but I'm not sure what to refer to the C module as since xml.etree.cElementTree is deprecated.
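For contrast, here is a minimal sketch of a case the restriction described above does not cover: a custom target driven directly through XMLParser.feed() and close(), with no iterparse event collection involved. The TagCounter class and the sample XML are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET


class TagCounter:
    """Hypothetical duck-typed parser target that counts start tags."""

    def __init__(self):
        self.counts = {}

    def start(self, tag, attrib):
        # Called for each opening tag.
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def end(self, tag):
        # Called for each closing tag; nothing to do here.
        pass

    def data(self, data):
        # Called for text content; ignored in this sketch.
        pass

    def close(self):
        # Whatever the target returns here is returned by XMLParser.close().
        return self.counts


parser = ET.XMLParser(target=TagCounter())
parser.feed("<root><item/><item/></root>")
counts = parser.close()
```

The parser only calls the target's start(), end(), data(), and close() methods, so no event queue is needed. The TypeError reported above arises only when parse events are requested from a parser whose target is not the default TreeBuilder.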
msg189562 - (view)	Author: Eli Bendersky (eli.bendersky) *	Date: 2013-05-18 23:18
Aaron, could you please sign the PSF CLA (http://www.python.org/psf/contrib/contrib-form/)? This will make accepting patches from you easier. Other than that, I agree it's a legit patch. The alternative would be to fix _elementtree to actually allow arbitrary TreeBuilders there, although I'm not sure it's worth the effort.
msg191362 - (view)	Author: Aaron Oakley (Aaron.Oakley) *	Date: 2013-06-17 19:04
So sorry, I just found the emails from the bug tracker in my spam folder. Anyhow, I've now signed the CLA.
msg194323 - (view)	Author: Roundup Robot (python-dev)	Date: 2013-08-04 01:55
New changeset a5a5ba4f71ad by Eli Bendersky in branch '3.3':
Issue #17902: Clarify doc of ElementTree.iterparse
http://hg.python.org/cpython/rev/a5a5ba4f71ad

New changeset 96f45011957e by Eli Bendersky in branch 'default':
Issue #17902: Clarify doc of ElementTree.iterparse and IncrementalParser
http://hg.python.org/cpython/rev/96f45011957e
msg196256 - (view)	Author: Eli Bendersky (eli.bendersky) *	Date: 2013-08-27 01:21
Aaron - could you describe your use case of passing a custom parser into iterparse? We're currently considering deprecating the feature of passing a parser into iterparse in a future release (this is being discussed in issue 17741).
msg196259 - (view)	Author: Aaron Oakley (Aaron.Oakley) *	Date: 2013-08-27 02:11
From memory, the use case at the time was using a custom TreeBuilder sub-class fed into a builtin XMLParser object. The code would construct a builder separately and keep a reference to it around. The builder would delegate calls to start(), data(), end(), and close() to super and save the completed tree when its close() was called.

    my_builder = CustomTreeBuilder()
    et_parser = ET.XMLParser(target=my_builder)

    for (evt, elem) in ET.iterparse("...", events, parser=et_parser):
        pass  # Do first processing

    tree = my_builder.root  # Saved tree

It was done like this initially so that some data (I can't recall exactly what) from the XML input could be processed first very conveniently using the parse events from iterparse while allowing the whole tree to be retrieved afterwards.

That said, the project later moved to using lxml for various features not contained in xml.etree.ElementTree, and I don't think the process I described is still being used.
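The goal described here (process parse events first, then keep the whole tree) can probably be reached without a custom builder. The iterator returned by iterparse is documented to expose a root attribute once the source has been fully read. A minimal sketch, assuming a small in-memory document; the sample XML and variable names are illustrative and not taken from the original project:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input standing in for the real XML source.
source = io.BytesIO(b"<root><item>a</item><item>b</item></root>")

context = ET.iterparse(source, events=("start", "end"))

texts = []
for event, elem in context:
    # "First processing": collect the text of each completed <item>.
    if event == "end" and elem.tag == "item":
        texts.append(elem.text)

# After the iterator is exhausted, the complete tree is available.
root = context.root
```

Elements are deliberately not cleared inside the loop, so the tree reachable from root is still fully populated afterwards.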
msg196261 - (view)	Author: Eli Bendersky (eli.bendersky) *	Date: 2013-08-27 03:41
Thanks for the information, Aaron; much appreciated.

History
Date	User	Action	Args
2022-04-11 14:57:45	admin	set	github: 62102
2013-08-27 03:41:41	eli.bendersky	set	messages: + msg196261
2013-08-27 02:11:20	Aaron.Oakley	set	messages: + msg196259
2013-08-27 01:21:36	eli.bendersky	set	messages: + msg196256
2013-08-04 01:55:45	eli.bendersky	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2013-08-04 01:55:30	python-dev	set	nosy: + python-dev messages: + msg194323
2013-06-17 19:04:32	Aaron.Oakley	set	messages: + msg191362
2013-05-18 23:18:18	eli.bendersky	set	messages: + msg189562
2013-05-03 23:24:55	pitrou	set	nosy: + eli.bendersky stage: patch review versions: + Python 3.3
2013-05-03 20:56:18	Aaron.Oakley	create