Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xml.etree.ElementTree.TreeBuilder.start differs between pure Python and C implementations #83676

Closed
hauntsaninja opened this issue Jan 30, 2020 · 4 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir

Comments

@hauntsaninja
Copy link
Contributor

BPO 39495
Nosy @scoder, @serhiy-storchaka, @hauntsaninja
PRs
  • bpo-39495: Remove default value from C impl of TreeBuilder.start #18275
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-03-02.07:53:39.024>
    created_at = <Date 2020-01-30.02:11:47.506>
    labels = ['library', '3.9']
    title = 'xml.etree.ElementTree.TreeBuilder.start differs between pure Python and C implementations'
    updated_at = <Date 2020-03-02.07:53:39.023>
    user = 'https://github.com/hauntsaninja'

    bugs.python.org fields:

    activity = <Date 2020-03-02.07:53:39.023>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-03-02.07:53:39.024>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2020-01-30.02:11:47.506>
    creator = 'hauntsaninja'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39495
    keywords = ['patch']
    message_count = 4.0
    messages = ['361002', '361005', '361015', '363127']
    nosy_count = 4.0
    nosy_names = ['scoder', 'eli.bendersky', 'serhiy.storchaka', 'hauntsaninja']
    pr_nums = ['18275']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue39495'
    versions = ['Python 3.9']

    @hauntsaninja
    Copy link
    Contributor Author

    The C accelerated version of xml.etree.ElementTree.TreeBuilder.start has a default value for attrs, whereas the pure Python version does not.

    In [41]: sys.version                                                                                                               
    Out[41]: '3.8.1 (default, Jan 23 2020, 23:36:06) \n[Clang 11.0.0 (clang-1100.0.33.17)]'
    
    In [42]: import xml.etree.ElementTree                                                                                              
    
    In [43]: inspect.signature(xml.etree.ElementTree.TreeBuilder.start)                                                                
    Out[43]: <Signature (self, tag, attrs=None, /)>
    
    In [44]: from test.support import import_fresh_module                                                                              
    
    In [45]: pyElementTree = import_fresh_module('xml.etree.ElementTree', blocked=['_elementtree'])                                    
    
    In [46]: inspect.signature(pyElementTree.TreeBuilder.start)                                                                        
    Out[46]: <Signature (self, tag, attrs)>
    

    From PEP-399 (https://www.python.org/dev/peps/pep-0399/)

    Acting as a drop-in replacement also dictates that no public API be provided in accelerated code that does not exist in the pure Python code. Without this requirement people could accidentally come to rely on a detail in the accelerated code which is not made available to other VMs that use the pure Python implementation. 
    

    @hauntsaninja hauntsaninja added the stdlib Python modules in the Lib dir label Jan 30, 2020
    @hauntsaninja
    Copy link
    Contributor Author

    Based on https://github.com/python/cpython/blob/master/Modules/_elementtree.c#L2700 and the behaviour at runtime, something like the following patch could work:

    diff --git a/Lib/xml/etree/ElementTree.py b/Lib/xml/etree/ElementTree.py
    index c8d898f328..bbfc1afe08 100644
    --- a/Lib/xml/etree/ElementTree.py
    +++ b/Lib/xml/etree/ElementTree.py
    @@ -1452,7 +1452,7 @@ class TreeBuilder:
             """Add text to current element."""
             self._data.append(data)
     
    -    def start(self, tag, attrs):
    +    def start(self, tag, attrs=None):
             """Open new element and return it.
     
             *tag* is the element name, *attrs* is a dict containing element
    @@ -1460,6 +1460,8 @@ class TreeBuilder:
     
             """
             self._flush()
    +        if attrs is None:
    +            attrs = {}
             self._last = elem = self._factory(tag, attrs)
             if self._elem:
                 self._elem[-1].append(elem)
    

    Happy to submit a PR!

    @serhiy-storchaka
    Copy link
    Member

    According to the documentation the attrs parameter does not have default value. I think that the C implementation should be changed.

    https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.TreeBuilder.start

    @serhiy-storchaka
    Copy link
    Member

    New changeset 4edc95c by Shantanu in branch 'master':
    bpo-39495: Remove default value from C impl of TreeBuilder.start (GH-18275)
    4edc95c

    @serhiy-storchaka serhiy-storchaka added the 3.9 only security fixes label Mar 2, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants