classification
Title: xml.etree.ElementTree.TreeBuilder.start differs between pure Python and C implementations
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eli.bendersky, hauntsaninja, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-01-30 02:11 by hauntsaninja, last changed 2020-03-02 07:53 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 18275 merged hauntsaninja, 2020-01-30 06:49
Messages (4)
msg361002 - (view) Author: Shantanu (hauntsaninja) * Date: 2020-01-30 02:11
The C accelerated version of `xml.etree.ElementTree.TreeBuilder.start` has a default value for `attrs`, whereas the pure Python version does not.

```
In [41]: sys.version                                                                                                               
Out[41]: '3.8.1 (default, Jan 23 2020, 23:36:06) \n[Clang 11.0.0 (clang-1100.0.33.17)]'

In [42]: import xml.etree.ElementTree                                                                                              

In [43]: inspect.signature(xml.etree.ElementTree.TreeBuilder.start)                                                                
Out[43]: <Signature (self, tag, attrs=None, /)>

In [44]: from test.support import import_fresh_module                                                                              

In [45]: pyElementTree = import_fresh_module('xml.etree.ElementTree', blocked=['_elementtree'])                                    

In [46]: inspect.signature(pyElementTree.TreeBuilder.start)                                                                        
Out[46]: <Signature (self, tag, attrs)>
```

From PEP 399 (https://www.python.org/dev/peps/pep-0399/)
```
Acting as a drop-in replacement also dictates that no public API be provided in accelerated code that does not exist in the pure Python code. Without this requirement people could accidentally come to rely on a detail in the accelerated code which is not made available to other VMs that use the pure Python implementation. 
```
msg361005 - (view) Author: Shantanu (hauntsaninja) * Date: 2020-01-30 03:05
Based on https://github.com/python/cpython/blob/master/Modules/_elementtree.c#L2700 and the behaviour at runtime, something like the following patch could work:
```
diff --git a/Lib/xml/etree/ElementTree.py b/Lib/xml/etree/ElementTree.py
index c8d898f328..bbfc1afe08 100644
--- a/Lib/xml/etree/ElementTree.py
+++ b/Lib/xml/etree/ElementTree.py
@@ -1452,7 +1452,7 @@ class TreeBuilder:
         """Add text to current element."""
         self._data.append(data)
 
-    def start(self, tag, attrs):
+    def start(self, tag, attrs=None):
         """Open new element and return it.
 
         *tag* is the element name, *attrs* is a dict containing element
@@ -1460,6 +1460,8 @@ class TreeBuilder:
 
         """
         self._flush()
+        if attrs is None:
+            attrs = {}
         self._last = elem = self._factory(tag, attrs)
         if self._elem:
             self._elem[-1].append(elem)
```

Happy to submit a PR!
msg361015 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-01-30 06:32
According to the documentation the attrs parameter does not have default value. I think that the C implementation should be changed.

https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.TreeBuilder.start
msg363127 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-02 06:33
New changeset 4edc95cf0a2960431621eee9bc194f6225f1690b by Shantanu in branch 'master':
bpo-39495: Remove default value from C impl of TreeBuilder.start (GH-18275)
https://github.com/python/cpython/commit/4edc95cf0a2960431621eee9bc194f6225f1690b
History
Date User Action Args
2020-03-02 07:53:39serhiy.storchakasetstatus: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: + Python 3.9
2020-03-02 06:33:27serhiy.storchakasetmessages: + msg363127
2020-01-30 06:49:37hauntsaninjasetkeywords: + patch
stage: patch review
pull_requests: + pull_request17650
2020-01-30 06:32:32serhiy.storchakasetmessages: + msg361015
2020-01-30 03:05:24hauntsaninjasetmessages: + msg361005
2020-01-30 02:56:54xtreaksetnosy: + scoder, eli.bendersky, serhiy.storchaka
2020-01-30 02:11:47hauntsaninjacreate