
classification
Title: In xml.etree.ElementTree bytes tag or attributes raises on serialization
Type: behavior
Stage: resolved
Components: Library (Lib), XML
Versions: Python 3.6

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: py.user, scoder
Priority: normal
Keywords:

Created on 2016-09-21 12:19 by py.user, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg277128 - Author: py.user (py.user) Date: 2016-09-21 12:19
https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element
"The element name, attribute names, and attribute values can be either bytestrings or Unicode strings."


An element can be created with bytes as its name, attribute names, and attribute values, but it then cannot be serialized:

>>> import xml.etree.ElementTree as etree
>>>
>>> root = etree.Element(b'x')
>>> root
<Element b'x' at 0xb739934c>
>>>
>>> elem = etree.SubElement(root, b'y', {b'a': b'b'})
>>> elem
<Element b'y' at 0xb7399374>
>>> elem.attrib
{b'a': b'b'}
>>>
>>> etree.dump(root)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1224, in dump
    elem.write(sys.stdout, encoding="unicode")
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 826, in write
    qnames, namespaces = _namespaces(self._root, default_namespace)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 937, in _namespaces
    _raise_serialization_error(tag)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1105, in _raise_serialization_error
    "cannot serialize %r (type %s)" % (text, type(text).__name__)
TypeError: cannot serialize b'x' (type bytes)
>>>
>>> etree.tostring(root)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1171, in tostring
    ElementTree(element).write(stream, encoding, method=method)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 826, in write
    qnames, namespaces = _namespaces(self._root, default_namespace)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 937, in _namespaces
    _raise_serialization_error(tag)
  File "/usr/lib/python3.3/xml/etree/ElementTree.py", line 1105, in _raise_serialization_error
    "cannot serialize %r (type %s)" % (text, type(text).__name__)
TypeError: cannot serialize b'x' (type bytes)
>>>


Also, a bytes attribute name on its own does get serialized, but the output contains its repr, including the letter "b" and the single quotes:

>>> import xml.etree.ElementTree as etree
>>> 
>>> e = etree.Element('a', {b'x': '1'})
>>> etree.tostring(e)
b'<a b\'x\'="1" />'
>>>


The same attempt with the third-party package lxml works in all cases, because lxml converts bytes to Unicode strings right away:

>>> import lxml.etree
>>>
>>> root = lxml.etree.Element(b'x')
>>> root
<Element x at 0xb6ff00cc>
>>>
>>> elem = lxml.etree.SubElement(root, b'y', {b'a': b'b'})
>>> elem
<Element y at 0xb73a834c>
>>>
>>> elem.attrib
{'a': 'b'}
>>>
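
The lxml behaviour shown above can be approximated with the standard library by decoding bytes before they reach ElementTree. The following is only a sketch: the helper names as_text, text_attrib and make_element are hypothetical (they are not part of xml.etree.ElementTree), and UTF-8 is an assumed encoding for the incoming bytes.

```python
import xml.etree.ElementTree as ET


def as_text(value, encoding="utf-8"):
    """Decode bytes to str; return any other value unchanged.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def text_attrib(attrib, encoding="utf-8"):
    """Return a copy of an attribute mapping with bytes keys/values decoded."""
    return {
        as_text(name, encoding): as_text(value, encoding)
        for name, value in (attrib or {}).items()
    }


def make_element(tag, attrib=None, parent=None, encoding="utf-8"):
    """Create an Element (or a SubElement of parent) from str or bytes input."""
    tag = as_text(tag, encoding)
    attrib = text_attrib(attrib, encoding)
    if parent is None:
        return ET.Element(tag, attrib)
    return ET.SubElement(parent, tag, attrib)


# Same tree as in the report, but built through the decoding helpers.
root = make_element(b"x")
make_element(b"y", {b"a": b"b"}, parent=root)
print(ET.tostring(root))  # b'<x><y a="b" /></x>'
```

Decoding at construction time keeps only str objects in the tree, so both dump() and tostring() see the types they are written for.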
msg351721 - Author: Stefan Behnel (scoder) (Python committer) Date: 2019-09-10 16:09
Arguably, writing out "b'x'" as an attribute name instead of raising an exception isn't ideal. On the other hand, I think it's reasonable to accept anything that is serialisable as a string, not just strings. That makes it difficult to draw a line.

In any case, it's ok for ElementTree to not allow bytes in general.

I'll close this issue, because I don't think it's worth being called a bug.
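
Since the behaviour was left unchanged, code that wants a bytes attribute name to fail loudly rather than be written out as "b'x'" can check the tree itself before serializing. This is a minimal sketch of such a check, not anything ElementTree provides: check_text_only is a hypothetical name, and the decision to accept only str and QName objects is an assumption that is stricter than the "anything serialisable as a string" policy described above.

```python
import xml.etree.ElementTree as ET


def check_text_only(root):
    """Raise TypeError if a tag, attribute name or value is not str/QName.

    Hypothetical pre-serialization check; stricter than ElementTree itself.
    """
    allowed = (str, ET.QName)
    for elem in root.iter():
        tag = elem.tag
        # Comments and processing instructions use a function object as tag,
        # so callable tags are let through.
        if not (isinstance(tag, allowed) or callable(tag)):
            raise TypeError("non-text tag: %r" % (tag,))
        for name, value in elem.attrib.items():
            if not isinstance(name, allowed):
                raise TypeError("non-text attribute name: %r" % (name,))
            if not isinstance(value, allowed):
                raise TypeError("non-text attribute value: %r" % (value,))


good = ET.Element("a", {"x": "1"})
check_text_only(good)  # passes silently

bad = ET.Element("a", {b"x": "1"})
try:
    check_text_only(bad)
except TypeError as exc:
    print("rejected:", exc)
```

Calling the check immediately before tostring() or write() turns the silent repr output into an explicit error at a predictable place.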
History
Date                 User     Action  Args
2022-04-11 14:58:37  admin    set     github: 72424
2019-09-10 16:09:54  scoder   set     status: open -> closed
                                      nosy: + scoder
                                      messages: + msg351721
                                      resolution: wont fix
                                      stage: resolved
2016-09-21 12:19:15  py.user  create