classification
Title: ElementTree: allow passing XMLPullParser instance into iterparse()
Type: enhancement
Stage:
Components: Library (Lib), XML
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eli.bendersky, scoder
Priority: normal
Keywords:

Created on 2014-01-10 19:55 by scoder, last changed 2022-04-11 14:57 by admin.

Messages (1)
msg207877 - Author: Stefan Behnel (scoder) (Python committer) Date: 2014-01-10 19:55
In the XMLPullParser ticket

http://bugs.python.org/issue17741

specifically here:

http://bugs.python.org/msg196177

it says:

"""
* [The pull parser] will *not* accept a "parser" argument in the constructor.
Rationale: the parser argument of iterparse is broken anyway. This will
make it much easier to modify the implementation of EventParser in the
future when the C internals are fixed w.r.t problems mentioned in this issue.

* iterparse's "parser" argument will be deprecated, and the documentation
will be more detailed w.r.t to the limitations on its current "parser"
argument (the limitations are there in the code, but they're not fully
documented).
"""

And the "parser" argument to iterparse is now deprecated, according to the
docs:

http://docs.python.org/3.4/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse

In lxml, however, I'm noticing that it would be really helpful to pass a
pull parser into iterparse(). Essentially, iterparse() is now stripped down
to a wrapper around the pull parser (lxml has two, one for XML and one for
HTML) that simply serves the feeding side of the interface for the user's
convenience.
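
To illustrate the "wrapper around the pull parser" idea, here is a minimal
sketch that uses the standard library's XMLPullParser rather than lxml. The
name iterparse_sketch and the chunk_size parameter are assumptions made for
this example; neither library necessarily implements iterparse() this way.

```python
import io
import xml.etree.ElementTree as ET


def iterparse_sketch(source, events=("end",), chunk_size=16384):
    """Hypothetical helper: serve the feeding side of a pull parser."""
    parser = ET.XMLPullParser(events=events)
    while True:
        data = source.read(chunk_size)
        if not data:
            break
        parser.feed(data)
        # Hand over whatever events the parser has collected so far.
        yield from parser.read_events()
    parser.close()
    # Some events may only become available once the parser is closed.
    yield from parser.read_events()


for event, elem in iterparse_sketch(io.BytesIO(b"<root><a/><b/></root>")):
    print(event, elem.tag)
```

The wrapper only reads from the source and forwards events, which is why all
parser configuration could in principle live on the pull parser itself.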

Note that lxml's iterparse() never had a "parser" argument. That's for
historical reasons: it originally *was* a parser itself, but it no longer
is.

I'd like to allow passing pull parsers into iterparse(), so that users can
configure them on their own. Currently, iterparse() must duplicate
basically all of the parser configuration arguments. I'd like to deprecate
that in lxml and replace it with the same simple interface as in ET, i.e.
pass in *either* a set of events *or* a readily configured pull parser,
preferably raising an error if users pass both.
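
A rough sketch of what such an either/or interface might look like, written
against the standard library's XMLPullParser. The function name
iterparse_proposed, its signature and the choice of TypeError are
assumptions for illustration only; this is not an existing API in ET or lxml.

```python
import io
import xml.etree.ElementTree as ET


def iterparse_proposed(source, events=None, parser=None):
    """Hypothetical interface: take events OR a configured pull parser."""
    # Validate eagerly, before any generator is created.
    if events is not None and parser is not None:
        raise TypeError("pass either 'events' or 'parser', not both")
    if parser is None:
        # events=None falls back to the pull parser's default events.
        parser = ET.XMLPullParser(events=events)

    def generate():
        while True:
            data = source.read(16384)
            if not data:
                break
            parser.feed(data)
            yield from parser.read_events()
        parser.close()
        yield from parser.read_events()

    return generate()


# The user configures the pull parser and hands it over.
configured = ET.XMLPullParser(events=("start", "end"))
source = io.BytesIO(b"<root><a/></root>")
for event, elem in iterparse_proposed(source, parser=configured):
    print(event, elem.tag)
```

With this shape, iterparse() would no longer need to duplicate the parser
configuration arguments.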

Could we change the deprecation from "argument is deprecated" to "passing a
normal (non-pull) parser into iterparse is deprecated", and then allow
passing a pull parser in the future?
History
Date                 User    Action  Args
2022-04-11 14:57:56  admin   set     github: 64418
2014-01-10 19:55:57  scoder  create