Expose the C implementation of ElementTree by default when importing ElementTree #58196

Closed
elibendersky mannequin opened this issue Feb 10, 2012 · 59 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@elibendersky
Mannequin

elibendersky mannequin commented Feb 10, 2012

BPO 13988
Nosy @scoder, @pjenvey, @ezio-melotti, @merwok, @florentx
Files
  • issue13988_prepare_pep399.diff
  • issue13988_prepare_pep399_v2.diff
  • issue13988_fold_cET_behind_ET.diff
  • issue13988_doc_news.1.patch
  • issue13988_fold_cET_behind_ET_v2.diff
  • findall_takes_no_keywords_anymore.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-02-14.03:32:56.115>
    created_at = <Date 2012-02-10.14:52:56.428>
    labels = ['type-bug', 'library']
    title = 'Expose the C implementation of ElementTree by default when importing ElementTree'
    updated_at = <Date 2012-06-15.04:49:46.656>
    user = 'https://bugs.python.org/elibendersky'

    bugs.python.org fields:

    activity = <Date 2012-06-15.04:49:46.656>
    actor = 'eli.bendersky'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-02-14.03:32:56.115>
    closer = 'eli.bendersky'
    components = ['Library (Lib)']
    creation = <Date 2012-02-10.14:52:56.428>
    creator = 'eli.bendersky'
    dependencies = []
    files = ['24479', '24485', '24486', '24492', '24498', '25599']
    hgrepos = []
    issue_num = 13988
    keywords = ['patch']
    message_count = 59.0
    messages = ['153052', '153053', '153056', '153057', '153058', '153059', '153061', '153062', '153063', '153064', '153065', '153066', '153068', '153075', '153078', '153086', '153089', '153101', '153109', '153110', '153111', '153112', '153114', '153115', '153119', '153120', '153122', '153151', '153155', '153158', '153160', '153165', '153175', '153180', '153203', '153204', '153221', '153223', '153230', '153249', '153255', '153258', '153259', '153265', '153319', '153321', '153323', '153326', '153453', '153480', '153564', '160745', '160746', '160747', '160748', '160749', '160754', '161044', '162843']
    nosy_count = 11.0
    nosy_names = ['effbot', 'scoder', 'pjenvey', 'ezio.melotti', 'eric.araujo', 'Arfrever', 'eli.bendersky', 'flox', 'tshepang', 'python-dev', 'cmn']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue13988'
    versions = ['Python 3.3']

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    Following the discussion on python-dev [1], this issue will track the re-organization of Lib/xml/etree to expose the C implementation (_elementtree) by default when importing ElementTree. The test suite will also have to be updated - it's required that it exercises both the C and the Python implementations.

    I would like to make the import "magic" simple. Thus, the idea I currently plan to pursue is:

    • xml/etree/ElementTree.py will be a simple facade that attempts to 'import *' from _elementtree, and on failure does 'import *' from pyElementTree
    • The current contents of xml/etree/ElementTree.py will move to xml/etree/pyElementTree.py
    • xml/etree/cElementTree.py disappears.

    The test suite will be modified accordingly.
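
The facade idea above can be sketched as a small helper that tries candidate modules in order. This is only an illustration, not the code that was eventually committed: `load_first_available` and `_hypothetical_fast_impl` are made-up names, and the proposal itself used `import *` rather than returning a module object.

```python
import importlib


def load_first_available(*module_names):
    """Return the first module in module_names that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(f"none of {module_names!r} could be imported")


# `_hypothetical_fast_impl` does not exist, so the fallback is selected.
impl = load_first_available("_hypothetical_fast_impl", "xml.etree.ElementTree")
root = impl.fromstring("<a><b/></a>")
```

Callers only ever see `impl`; which candidate was picked stays an implementation detail, which is the property the facade was meant to provide.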

    I'll be working on creating a patch for this. Any help, ideas, comments and discussions are more than welcome.

    [1] http://mail.python.org/pipermail/python-dev/2012-February/116278.html

    @elibendersky elibendersky mannequin added the stdlib Python modules in the Lib dir label Feb 10, 2012
    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    Oh, and not to forget: the documentation has to be updated to just not mention cElementTree any longer. For the user, the fact that a fast C library is invoked underneath should be invisible.

    @scoder
    Contributor

    scoder commented Feb 10, 2012

    Eli Bendersky, 10.02.2012 15:52:

    > • The current contents of xml/etree/ElementTree.py will move to xml/etree/pyElementTree.py

    IIRC, there is a well specified way how accelerator modules should be used
    by Python modules. I recall a lengthy discussion on python-dev (or the py3k
    list?) back in the old pre-3.0 days, maybe there's even a PEP?

    > • xml/etree/cElementTree.py disappears.

    Careful with backwards compatibility here. It's the accelerator module
    (_elementtree.so, IIRC) which is to be moved behind ElementTree.py.

    I don't see a compelling enough reason to break imports in existing code by
    removing the cElementTree module, so we should not do that.

    Stefan

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    > IIRC, there is a well specified way how accelerator modules should be used
    > by Python modules. I recall a lengthy discussion on python-dev (or the py3k
    > list?) back in the old pre-3.0 days, maybe there's even a PEP?

    If there's a convention, I'll happily follow it. It would be great if someone could dig up the relevant details (I'll try to look for them myself).

    > I don't see a compelling enough reason to break imports in existing code by
    > removing the cElementTree module, so we should not do that.

    Agreed. Perhaps it should just be deprecated?

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    Hmm, that may be PEP-399:

    If an acceleration module is provided it is to be named the same as the module it is accelerating with an underscore attached as a prefix, e.g., _warnings for warnings. The common pattern to access the accelerated code from the pure Python implementation is to import it with an import *, e.g., from _warnings import *. This is typically done at the end of the module to allow it to overwrite specific Python objects with their accelerated equivalents.

    However, it's hardly a rule, just describing a "common pattern". I wonder why this approach is preferable to the one I proposed (explicit facade module)?
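
A minimal sketch of the pattern quoted from PEP 399, assuming a made-up accelerator named `_hypothetical_accel` that does not exist, so the pure-Python definition stays in place. An explicit name is imported instead of `import *` only so the snippet can be pasted anywhere; `count_closing_tags` is likewise illustrative.

```python
def count_closing_tags(xml_text):
    """Pure-Python reference implementation (illustrative only)."""
    return xml_text.count("</") + xml_text.count("/>")


# At the end of the module, let the accelerator (if any) rebind the name.
try:
    from _hypothetical_accel import count_closing_tags  # assumed name
except ImportError:
    pass  # keep the pure-Python version defined above

result = count_closing_tags("<a><b/><c></c></a>")
```

The cost Eli raises later in the thread is visible here: the pure-Python body is always executed before the override is attempted.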

    @scoder
    Contributor

    scoder commented Feb 10, 2012

    Eli Bendersky, 10.02.2012 16:43:
    >>> I don't see a compelling enough reason to break imports in existing code by
    >>> removing the cElementTree module, so we should not do that.
    > 
    > Agreed. Perhaps it should just be deprecated?

    Given that its mere existence is currently almost undocumented (except for
    one tiny sentence in the docs), I vote for clearly documenting it as
    deprecated, yes, with a mention to the fact that it's automatically used by
    xml.etree.ElementTree starting with 3.3.

    I wouldn't want to see more than that done, though, specifically not a
    visible warning when it's being imported. There's far too much code out
    there that uses it in previous Python versions. Such a warning would strike
    even if it's only being imported conditionally with a try-except, which is
    the most common way of doing it. So it would hit most users and require an
    awful lot of code to be touched to fix it, for basically zero benefit.

    Stefan
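
The conditional import Stefan refers to usually looks like the following. It is a sketch of the common idiom, not code from this issue, and the sample XML is made up.

```python
try:
    import xml.etree.cElementTree as ET  # accelerated alias on older Pythons
except ImportError:
    import xml.etree.ElementTree as ET   # always available

root = ET.fromstring("<a><b/></a>")
child_tags = [child.tag for child in root]
```

A warning emitted at import time would fire on the first branch for every project that uses this idiom, even though the code already handles both cases.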

    @ezio-melotti
    Member

    A note in the doc is easy to miss IMHO, and since DeprecationWarnings are silenced by default, I don't think they will affect the final users.

    A different "problem" is that developers will have to check for the Python version if they want to use ElementTree on Python >=3.3 and keep using cElementTree on <3.3 (unless another way is provided).

    If possible I would avoid pyElementTree, and keep ElementTree that imports from _elementtree and the deprecated cElementTree (until it can be removed).
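
To illustrate the point about silenced warnings: a DeprecationWarning is dropped by the default filters in most contexts, so a developer has to opt in to see it. The sketch below uses a made-up stand-in function, `import_deprecated_alias`, rather than a real deprecated import.

```python
import warnings


def import_deprecated_alias():
    """Stand-in for importing a deprecated module (hypothetical helper)."""
    warnings.warn(
        "cElementTree is deprecated; use ElementTree",
        DeprecationWarning,
        stacklevel=2,
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # override the default filters
    import_deprecated_alias()

messages = [
    str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)
]
```

Running a test suite with `python -W always` (or `-W error`) has the same effect without touching the code.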

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    > If possible I would avoid pyElementTree,

    I suppose it's possible, but I'm genuinely interested in a technical reason for doing so. The approach suggested in PEP-399 is useful for modules in which part of the functionality is implemented in C, and then augmented in Python. ElementTree is different - it's pretty much two separate implementations of the same API.

    So, I think there's little question in terms of simplicity and clarity. Having pyElementTree and cElementTree (keeping it for backwards compat), and a facade named ElementTree that chooses between them is simple, clean and intuitive.

    From a performance point of view, consider the (by far) common case - where _elementtree *is* successfully imported.

    Option 1: from _elementtree import *, at the end of the Python implementation in ElementTree.py - so for each invocation, the whole import of the Python code has to be done, just to reach the overriding import * at the end.

    Option 2: ElementTree is a facade that attempts to import _elementtree first. So the Python implementation in pyElementTree doesn't even have to be parsed and imported

    @florentx
    Mannequin

    florentx mannequin commented Feb 10, 2012

    > If possible I would avoid pyElementTree,

    Me too:

    • __name__ and __qualname__ would be less confusing
    • the cElementTree accelerator uses large parts of Python implementation

    > ElementTree is different - it's pretty much two separate implementations of the same API.

    Not fully separated... there's some python code hidden in the C module.

    > From a performance point of view, consider the (by far) common case
    > - where _elementtree *is* successfully imported.
    > ... for each invocation, the whole import of the Python code has
    > to be done, just to reach the overriding import * at the end.

    This point is wrong... the _elementtree.c accelerator imports Python ElementTree already.

    As you can see on lines 2938 to 2945, the change could lead to an import cycle:
    http://hg.python.org/cpython/file/705b56512287/Modules/_elementtree.c#l2938

    Trying to sort this out, it already gives me a headache.
    I would like to remove the Python bootstrap code from the C module and try to do it differently, in a more standard way.

    @florentx florentx mannequin added the performance Performance or resource usage label Feb 10, 2012
    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    >> From a performance point of view, consider the (by far) common case
    >> - where _elementtree *is* successfully imported.
    >> ... for each invocation, the whole import of the Python code has
    >> to be done, just to reach the overriding import * at the end.
    >
    > This point is wrong... the _elementtree.c accelerator imports Python ElementTree already.
    >
    > As you can see on lines 2938 to 2945, the change could lead to an import cycle:
    > http://hg.python.org/cpython/file/705b56512287/Modules/_elementtree.c#l2938
    >
    > Trying to sort this out, it already gives me a headache.
    > I would like to remove the Python bootstrap code from the C module and try to do it differently, in a more standard way.

    The Python code inside _elementtree could be moved to Python code,
    which would then import the Python stuff it needs from pyElementTree.
    Since pyElementTree doesn't import _elementtree, there will be
    circular dependencies.

    So this is a point *in favor* of pyElementTree being pure-Python :-)

    In other words:

    In xml/etree there is:

    • ElementTree: tries to import cElementTree. On success, done. On
      ImportError, imports pyElementTree
    • pyElementTree: the pure Python implementation
    • cElementTree: sets up the bootstrap Python code and tries to import
      _elementtree. In case of an error, propagates an ImportError up.

    Would that work?

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    Oops, in last message:

    s/there will be circular dependencies/there will not be circular dependencies/

    @elibendersky elibendersky mannequin removed the performance Performance or resource usage label Feb 10, 2012
    @ezio-melotti
    Member

    > In xml/etree there is:
    >
    > • ElementTree: tries to import cElementTree. On success, done. On
    >   ImportError, imports pyElementTree
    > • pyElementTree: the pure Python implementation
    > • cElementTree: sets up the bootstrap Python code and tries to import
    >   _elementtree. In case of an error, propagates an ImportError up.

    What I had in mind is more like:

    • ElementTree: defines the python code and if _elementtree is available overrides part of it with the functions imported from it;
    • cElementTree: at this point it could just be a deprecated alias for ElementTree

    @ezio-melotti ezio-melotti added the performance Performance or resource usage label Feb 10, 2012
    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 10, 2012

    > What I had in mind is more like:
    > - ElementTree: defines the python code and if _elementtree is available overrides part of it with the functions imported from it;

    The problem with this is the bootstrap Python code executed by
    _elementtree. That should not be executed when _elementtree (the C
    parts) can't be imported. Keeping this code in ElementTree will
    probably complicate matters since it will add import conditions.

    @ezio-melotti
    Member

    >> - ElementTree: defines the python code and if _elementtree is
    >> available overrides part of it with the functions imported from it;
    >
    > The problem with this is the bootstrap Python code executed by
    > _elementtree.

    This might become unnecessary if ElementTree becomes the main module and _elementtree only contains a few faster functions/classes that are supposed to replace the ones written in Python.
    So basically you only have a single fully functional Python module (ElementTree) plus an optional C module (_elementtree) that only provides faster replacements for ElementTree.

    > That should not be executed when _elementtree (the C parts) can't be
    > imported.

    We are assuming that _elementtree might be missing, but what are the cases where this might actually happen? Other implementations like PyPy? Exotic platforms that can't compile _elementtree?

    > Keeping this code in ElementTree will probably complicate
    > matters since it will add import conditions.

    Wouldn't that be as simple as having in ElementTree.py:

        # ... full Python code here ...
        try:
            # override a few functions/classes with the faster versions
            from _elementtree import *
        except ImportError:
            # _elementtree is missing, so we just keep the "slow" versions
            pass
        else:
            # do the rest here if at all needed (e.g. plug the faster
            # versions in the right places)
            pass

    I'm not familiar with ElementTree (I just looked at the bootstrap bit quickly), so what I'm saying might not be applicable here, but I've seen other modules doing something similar to what I'm proposing (json, heapq, maybe even warnings and others).

    @florentx
    Mannequin

    florentx mannequin commented Feb 10, 2012

    The first step is to strip out the cElementTree bootstrap code from the C module.
    I did it in the attached patch (plus removal of obsolete code for copy() in Python 2.4).
    This passes the unmodified tests "test_xml_etree" and "test_xml_etree_c".

    Then I think the right approach is to fold cElementTree completely behind ElementTree.
    The cElementTree alias can simply be declared in Lib/xml/etree/__init__.py.

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 11, 2012

    Ezio,

    > We are assuming that _elementtree might be missing, but what are the cases where this might actually happen? Other implementations like PyPy? Exotic platforms that can't compile _elementtree?

    I guess both. To make the stdlib work on PyPy without changes, it has to be able to load the pure Python modules in a fallback.

    As for platforms that can't compile _elementtree, keep in mind that there's also expat which _elementtree uses, so it's a lot of code to compile. Python works on some embedded systems, I'm not sure all of them can compile this code.

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 11, 2012

    Florent, thanks for the patch - at this point code is more useful than talk :-)

    Anyhow, I tried to apply it and a few tests in test_xml_etree_c fail, because it can't find fromstring and fromstringlist. This gets fixed when I import fromstringlist in cElementTree.py from ElementTree, and in the same file assign:

      fromstring = XML

    Which is similar to what ElementTree itself does.

    In general, I agree that a good first step would be to refactor the code to extract the bootstrapping from _elementtree.c to cElementTree.py. As long as the tests pass, this can be committed regardless of this issue's original intent.

    However, why did you leave some bootstrapping code inside? Can't all of it go away?

    @merwok
    Member

    merwok commented Feb 11, 2012

    I strongly feel that existing code importing ElementTree or cElementTree should not be broken. Let’s add transparent import from _elementtree to ElementTree without breaking existing uses of cET.

    I think that 3.2 and 2.7 should get a doc note about cET, do we have a bug for this?

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 11, 2012

    > I strongly feel that existing code importing ElementTree or cElementTree should not be broken.  Let’s add transparent import from _elementtree to ElementTree without breaking existing uses of cET.

    AFAICS there's currently no disagreement on this point. The import
    from cElementTree will keep working in 3.3 as it always has. However,
    the explicit mention of cElementTree should be removed from the
    documentation of ElementTree. The only remaining question is whether a
    silent deprecation warning should be added in cElementTree.

    > I think that 3.2 and 2.7 should get a doc note about cET, do we have a bug for this?

    What doc note? Something in the spirit of: "Note that in 3.3, the
    accelerated C implementation will be provided by default when
    importing ElementTree" - or do you mean something else?

    I don't think there's an open bug for this.

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 11, 2012

    The more I think about it, the more the bootstrap code in _elementtree.c annoys me. It's the only instance of calling PyRun_String in Modules/ !

    It's hackish and causes ugly import problems. If the C code needs stdlib functionality like copy.deepcopy, it should use PyImport_ImportModule like everyone else and not through a PyRun_String hack.

    Since we've already decided to do some refactoring, I suggest all trace of the bootstrap is removed from _elementtree.c

    @scoder
    Contributor

    scoder commented Feb 11, 2012

    Eli Bendersky, 11.02.2012 09:08:

    > The more I think about it, the more the bootstrap code in _elementtree.c
    > annoys me. It's the only instance of calling PyRun_String in Modules/ !
    >
    > It's hackish and causes ugly import problems. If the C code needs stdlib
    > functionality like copy.deepcopy, it should use PyImport_ImportModule
    > like everyone else and not through a PyRun_String hack.

    I find it perfectly legitimate to run Python code from a C module.
    Certainly not a hack. We all know that most non-trivial functionality can
    be expressed much easier in Python than in C, that's why we use Python
    after all. In particular, defining a class with attributes and methods is a
    couple of lines of code in Python, but a huge amount of code in C. Avoiding
    the complexity of writing everything in C, or even of splitting the code in
    a harder to understand way, is worth it.

    That being said, I think it's worth removing any clear *redundancy* with
    ET.py, as Florent's patch did. The goal is to keep _elementtree.c a pure
    accelerator module that improves plain ElementTree, and redundancy is
    counterproductive in this context. But if the implementation differs for
    some reason, I would tend towards leaving it as is.

    Stefan

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 11, 2012

    > I find it perfectly legitimate to run Python code from a C module.
    > Certainly not a hack. We all know that most non-trivial functionality can
    > be expressed much easier in Python than in C, that's why we use Python
    > after all. In particular, defining a class with attributes and methods is a couple of lines of code in Python, but a huge amount of code in C. Avoiding the complexity of writing everything in C, or even of splitting the code in a harder to understand way, is worth it.

    There can be arguments both ways, but if we follow the lead of existing standard extension modules, the tendency is clearly not to use PyRun_String. Many C extensions use functionality from Python, but none does it the "bootstrap way". Why is that? Is there a good reason, or is it just convention?

    @florentx
    Mannequin

    florentx mannequin commented Feb 11, 2012

    > Anyhow, I tried to apply it and a few tests in test_xml_etree_c fail,
    > because it can't find fromstring and fromstringlist.

    Ooops, I cut some redundancy after running the tests, and I forgot to re-add the import. You're right.

    > However, why did you leave some bootstrapping code inside?
    > It's the only instance of calling PyRun_String in Modules/

    I just tried to cut the import cycle and import it the other way.
    I think it was done like that historically, for some reason, when
    the module was first developed (for Python 1.5 maybe ...)
    It is not necessary to remove all the Python code at once, and I am better at Python than at C.
    We can delay this additional clean-up to a later time, it does not
    block the PEP-399 implementation.

    @florentx
    Mannequin

    florentx mannequin commented Feb 11, 2012

    Updated patch:

    • fixed missing import and missing alias
    • moved the XMLTreeBuilder alias to the Python module

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 11, 2012

    New changeset 31dfb4be934d by Florent Xicluna in branch 'default':
    Issue bpo-13988: move the python bootstrap code to cElementTree.py, and remove obsolete code for Python 2.4 and 2.5.
    http://hg.python.org/cpython/rev/31dfb4be934d

    @florentx
    Mannequin

    florentx mannequin commented Feb 11, 2012

    I've pushed this first part, which is just a code refactoring.

    I tried to work out a patch for the second part.
    The tricky part is that the xml.etree tests still use doctests.
    The patch for the tests seems to be small and readable enough.

    We have small differences between C and Python regarding the warnings being raised. In general, the C implementation does not raise the deprecation warnings. IMHO, this could be fixed later.

    Still missing is the patch for the documentation.

    @florentx
    Mannequin

    florentx mannequin commented Feb 12, 2012

    Updated patch:

    • add 'XMLID' and 'register_namespace' to the ElementTree.__all__
    • the comment says explicitly that cElementTree is deprecated
    • exercise the deprecated module with the tests

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 12, 2012

    Florent,

    Your updated patch looks good. I think that the explicit import of _namespace_map into cElementTree is just to satisfy some weird magic in the tests and can probably be removed as well (along with the weird magic :-), but that's not really important and can be left for later cleanups.

    Regarding the documentation, alright let's not mention the implementation detail, and your "versionchanged" addition makes sense. I don't think adding directly to whatsnew/3.3.rst is necessary, updating Misc/NEWS is enough.

    I'll apply the documentation patch after you apply the code patch. Or if you want, you can apply it yourself, I don't mind.

    Thanks for the cooperation!

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 12, 2012

    By the way, I see that if the explicit import of _namespace_map is commented out, the test_xml_etree_c test fails because it's not in the __all__ list. So the test can just import it directly with:

    from xml.etree.ElementTree import _namespace_map

    And the import in cElementTree won't be necessary. After all, _namespace_map is definitely not a public API!

    This will keep cElementTree a nice-and-clean:

    from xml.etree.ElementTree import *

    @florentx
    Mannequin

    florentx mannequin commented Feb 12, 2012

    > from xml.etree.ElementTree import _namespace_map
    >
    > And the import in cElementTree won't be necessary.
    > After all, _namespace_map is definitely not a public API!

    Because of the interaction of support.import_fresh_module with the CleanContext context manager, it's not so easy to remove the black magic.
    I can't find anything better than:

            if hasattr(ET, '_namespace_map'):
                _namespace_map = ET._namespace_map
            else:
                from xml.etree.ElementTree import _namespace_map

    This is why I kept the import in the deprecated "cElementTree" at first.
    It does not hurt (it's private API), and it makes the test easier.

    ( If you have doubts, try ./python -m test test_xml_etree{,_c} or variants. )

    I will probably commit code and documentation at once. It makes things easier regarding traceability.

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 13, 2012

    Alright, it's not really important at this point and can be cleaned up
    later.

    > I will probably commit code and documentation at once. It makes things
    > easier regarding traceability.

    Sounds good

    @ezio-melotti
    Member

    FWIW the JSON doc doesn't even mention the acceleration module _json, but since people here are used to importing cElementTree I think it should be mentioned that it's now deprecated and accelerations are used automatically, so something like this would work:

    .. versionchanged:: 3.3
       The :mod:`xml.etree.cElementTree` module is now deprecated.
       A fast implementation will be used automatically whenever available.

    I also agree with Éric that there's no need to mention _elementtree (people might try to import that instead, and other implementations might use a different name).

    Lib/test/test_xml_etree_c.py could also be removed, and the other tests could import cElementTree too (even though I'm not sure that works too well with doctests).

    Shouldn't cElementTree raise an error when _elementtree is missing?
    A DeprecationWarning should be added too.

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 13, 2012

    New changeset 65fc79fb4eb2 by Florent Xicluna in branch 'default':
    Issue bpo-13988: cElementTree is deprecated and the _elementtree accelerator is automatically used whenever available.
    http://hg.python.org/cpython/rev/65fc79fb4eb2

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 13, 2012

    New changeset e9cf34d56ff1 by Florent Xicluna in branch 'default':
    Fix xml_etree_c test error (follow up of issue bpo-13988).
    http://hg.python.org/cpython/rev/e9cf34d56ff1

    @florentx
    Mannequin

    florentx mannequin commented Feb 13, 2012

    Now the merge is done. Thank you Eli for the effort, and to the other contributors for the review.

    Following topics may need further work:

    • add a Deprecation warning for cElementTree? It will annoy the package maintainers who support both 3.2 and >= 3.3, because either they'll use the non-accelerated version in 3.2, or they will have the Deprecation warning in 3.3 ... IMHO, it's better to do nothing, and just keep the mention in the documentation that it is deprecated.

    • raise the Deprecation warnings for the functions and methods which are marked as deprecated in the documentation (the Python code does it, but not the C accelerator)

    • convert _elementtree.c Python bootstrap code to C

    • refactor the test suite

    These topics are not high priority. A specific issue should be opened if any of them require some attention.

    @florentx florentx mannequin closed this as completed Feb 13, 2012
    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 14, 2012

    I would add to the TODO - improve the documentation of the module. Opened bpo-14006 for this.

    @elibendersky elibendersky mannequin reopened this Feb 14, 2012
    @elibendersky elibendersky mannequin closed this as completed Feb 14, 2012
    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 14, 2012

    I started going over the deprecated methods in ElementTree and ran into a more serious problem. XMLParser.doctype() is listed as deprecated, and indeed ElementTree (the Python version) issues a deprecation warning. However, the C implementation doesn't have doctype() at all so I get AttributeError.

    @pjenvey
    Member

    pjenvey commented Feb 14, 2012

    DeprecationWarnings aren't that annoying anymore now that they're silent by default. It should at least have a PendingDeprecationWarning

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 14, 2012

    Opened bpo-14007 to track the doctype() problem

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Feb 16, 2012

    Emitting a deprecation warning on importing cElementTree has been rejected on the python-dev list. The other remaining tasks have new issues on them, so this issue is done now.

    @ezio-melotti
    Member

    I'm still not sure that's the best option. Without deprecation people will keep using cElementTree and we will have to keep it around forever (or at least until Python4 and then have a 3to4 to fix the import).
    This might be fine, but as a developer I would still like Python to tell me "You can just import ElementTree now, there's no need to use cElementTree". Maybe the deprecation can be added to 3.4?

    P.S. I'm fine with keeping it around for several more versions, but if we eventually have to remove it, we would still have to warn the users beforehand. The more we wait, the more users will be still using cElementTree by the time we will actually remove it.

    @merwok
    Member

    merwok commented Feb 17, 2012

    I don’t see benefits in removing cET.

    @cmn
    Mannequin

    cmn mannequin commented May 15, 2012

    Hi,

    the C implementation of ElementTree does not support namespaces for find/all/... .

    To me this is a serious regression, as I rely on ElementTree namespace support, and 3.3 would break it with this change.
    Breaking namespace support is a fundamental problem.

    Please reconsider this therefore.

    Code to reproduce is attached - works fine with python 3.2.

    As the C implementation of ElementTree and Element lack the namespace keyword for findall (and *all* the other methods),
    where namespaces are very important when dealing with xml,
    and it is not possible to prevent using the C implementation of ElementTree without changing the python install,
    I propose to revert this change.

    Until the C implementation can do namespaces as well.

    @cmn cmn mannequin added type-bug An unexpected behavior, bug, or error and removed performance Performance or resource usage labels May 15, 2012
    @cmn
    Mannequin

    cmn mannequin commented May 15, 2012

    The file was bad, sorry.
    re-attached

    @Arfrever
    Mannequin

    Arfrever mannequin commented May 15, 2012

    Markus (cmn): Please file a separate issue, which will be a release blocker for 3.3 release. (It's not the only regression.)

    @Arfrever
    Mannequin

    Arfrever mannequin commented May 15, 2012

    Temporary very ugly workaround (before importing xml.etree.ElementTree) is:

    import sys
    sys.modules["_elementtree"] = None
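
The workaround above can be wrapped so that it does not leave `sys.modules` permanently altered. This is a sketch under stated assumptions rather than an endorsed recipe: `import_pure_python_elementtree` is a made-up helper, and checking `Element.__init__` with `inspect.isfunction` is just one heuristic for telling the pure-Python class from a C type.

```python
import importlib
import inspect
import sys

_MISSING = object()


def import_pure_python_elementtree():
    """Import xml.etree.ElementTree with the C accelerator blocked."""
    names = ("_elementtree", "xml.etree.ElementTree")
    saved = {name: sys.modules.get(name, _MISSING) for name in names}
    sys.modules["_elementtree"] = None  # `import _elementtree` now raises ImportError
    sys.modules.pop("xml.etree.ElementTree", None)  # force a fresh import
    try:
        return importlib.import_module("xml.etree.ElementTree")
    finally:
        # Put sys.modules back the way it was.
        for name, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module


py_ET = import_pure_python_elementtree()
uses_python_element = inspect.isfunction(py_ET.Element.__init__)
```

Note that the `xml.etree` package attribute is left pointing at the freshly imported module, so this is best treated as a test-only trick.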

    @ezio-melotti
    Member

    It seems to me that namespaces are actually supported, but they are accepted only as positional args and not keyword args, so this should be easy to fix.
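
A short sketch of the positional form Ezio describes, which avoids depending on keyword support in `findall`. The sample document and the `urn:example` namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<a xmlns:n="urn:example"><n:b/><n:b/></a>')
ns = {"n": "urn:example"}

# Pass the prefix map positionally instead of as `namespaces=...`.
matches = root.findall("n:b", ns)
tags = [el.tag for el in matches]
```

Each matched tag comes back in Clark notation, i.e. with the namespace URI in braces in front of the local name.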

    @cmn
    Mannequin

    cmn mannequin commented May 15, 2012

    As advised I opened a new bug on this:
    http://bugs.python.org/issue14818

    @cmn
    Mannequin

    cmn mannequin commented May 18, 2012

    New bug - C implementation of ElementTree: Inheriting from Element breaks text member
    http://bugs.python.org/issue14849

    @elibendersky
    Mannequin Author

    elibendersky mannequin commented Jun 15, 2012

    Note: last traces of Python bootstrap code were removed from _elementtree in changeset 652d148bdc1d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022