This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: ElementTree.write() raises TypeError when xml_declaration = True and encoding is a unicode string
Type: behavior
Stage: resolved
Components: Library (Lib), XML
Versions: Python 2.7

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: David.Buxton, amaury.forgeotdarc, eli.bendersky, flox
Priority: normal
Keywords: patch

Created on 2012-08-29 13:18 by David.Buxton, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
xml_declaration_test.patch  David.Buxton, 2012-08-29 16:21
Messages (6)
msg169373 - Author: David Buxton (David.Buxton) Date: 2012-08-29 13:18
The problem is an inconsistency between the ElementTree.write() method on Python 2 and 3 when xml_declaration is True. For Python 2.7 the encoding argument MUST NOT be a unicode string. For Python 3.2 the encoding argument MUST be a unicode string.

On Python 2.7.3 (ElementTree 1.3.0) you can only use byte strings as the encoding argument when including the XML declaration. If you use a unicode object, a TypeError is raised:


    >>> from xml.etree import ElementTree as ET
    >>> from io import BytesIO
    >>> 
    >>> tree = ET.ElementTree(ET.Element(u'example'))
    >>> tree.write(BytesIO(), xml_declaration=True, encoding='utf-8')
    >>> tree.write(BytesIO(), xml_declaration=True, encoding=u'utf-8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 813, in write
        write("<?xml version='1.0' encoding='%s'?>\n" % encoding)
    TypeError: 'unicode' does not have the buffer interface


So the encoding argument must be a byte string.

However, on Python 3.2.3 (ElementTree 1.3.0) the same argument must be a unicode string. If you pass in a byte string, it raises TypeError.

This only happens when you pass in an encoding and xml_declaration=True. This is a (small) problem when writing Py 2/3 compatible code since the version of ElementTree is supposed to be the same.
msg169391 - Author: David Buxton (David.Buxton) Date: 2012-08-29 16:21
A patch against the current default branch to add tests for the xml_declaration keyword argument. This passes when applied to the 2.7 branch too.

This does NOT test whether one can use both bytes/unicode for the encoding argument. It just uses the native string type for each interpreter. I don't know whether you would actually consider that part of things worth testing yet.
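
For illustration, a test along these lines might look like the sketch below. It is not the content of the attached patch: the class and method names are made up, and it only passes the native string type as the encoding, as described above.

```python
import unittest
from io import BytesIO
from xml.etree import ElementTree as ET


class XmlDeclarationSketch(unittest.TestCase):
    """Illustrative only; names here are hypothetical, not taken from the patch."""

    def serialize(self, **kwargs):
        # Write a one-element tree into an in-memory byte buffer.
        tree = ET.ElementTree(ET.Element('example'))
        buf = BytesIO()
        tree.write(buf, **kwargs)
        return buf.getvalue()

    def test_declaration_written(self):
        # An unprefixed 'utf-8' is the native str type on both 2.7 and 3.x.
        data = self.serialize(xml_declaration=True, encoding='utf-8')
        self.assertTrue(
            data.startswith(b"<?xml version='1.0' encoding='utf-8'?>"))

    def test_declaration_omitted(self):
        # Explicitly disabling the declaration should leave only the element.
        data = self.serialize(xml_declaration=False, encoding='utf-8')
        self.assertFalse(data.startswith(b"<?xml"))
```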
msg169392 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-08-29 16:34
Thanks for the patch, David.

Alas, due to personal reasons I will not be able to work on core Python in the next 2-3 months. I may throw in a review here and there, but that's not a promise ;-)
msg169395 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-08-29 16:45
Why is it a problem? Encoding must be a text string:
    encoding='utf-8'
works with both Python 2 and 3.

Or if you used "from __future__ import unicode_literals",
    str('utf-8')
works as well.
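
A minimal sketch of the second suggestion. The `u''` prefix (accepted on Python 2.7 and 3.3+) stands in for what `unicode_literals` does to a plain literal on Python 2; the variable names are illustrative.

```python
from io import BytesIO
from xml.etree import ElementTree as ET

# u'utf-8' mimics what a plain 'utf-8' literal becomes on Python 2 under
# "from __future__ import unicode_literals".
encoding = str(u'utf-8')  # native str on both Python 2.7 and 3.x

tree = ET.ElementTree(ET.Element('example'))
buf = BytesIO()
tree.write(buf, xml_declaration=True, encoding=encoding)
xml_bytes = buf.getvalue()  # serialized document, declaration included
```

The coercion is safe here because encoding names are plain ASCII, so `str()` never needs to guess a codec on Python 2.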
msg169396 - Author: David Buxton (David.Buxton) Date: 2012-08-29 17:00
Only a problem because I am using unicode_literals and it didn't occur to me to use `str('utf-8')` to get a native string on both 2+3. Much the best solution, thank you.

But that is still a little smelly - I think what I want ideally is for ElementTree to accept str or unicode on 2.7. Either way I appreciate this is very low priority and indeed debatable as to whether it is "wrong".
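
Short of changing ElementTree, the same effect can be approximated in calling code. A sketch, assuming two hypothetical helpers named `to_native_str` and `write_with_declaration`; neither is part of ElementTree.

```python
import sys
from io import BytesIO
from xml.etree import ElementTree as ET


def to_native_str(name):
    """Coerce an ASCII encoding name to the interpreter's native str type."""
    if isinstance(name, str):
        return name
    if sys.version_info[0] >= 3:
        return name.decode('ascii')  # bytes -> str on Python 3
    return name.encode('ascii')      # unicode -> str on Python 2


def write_with_declaration(tree, fileobj, encoding):
    """Hypothetical wrapper: accept a bytes or text encoding name."""
    tree.write(fileobj, xml_declaration=True,
               encoding=to_native_str(encoding))


tree = ET.ElementTree(ET.Element('example'))
out_text, out_bytes = BytesIO(), BytesIO()
write_with_declaration(tree, out_text, u'utf-8')   # text encoding name
write_with_declaration(tree, out_bytes, b'utf-8')  # bytes encoding name
```

Both calls should produce identical output, since the encoding name reaches `write()` as a native string either way.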
msg172966 - Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-10-15 12:55
Finally found time to look at this, sorry for the delay.

I agree with Amaury and don't think any change is necessary in 2.7; the behavior there is quite consistent with what was usually done in 2.x. For porting or keeping the code 2/3 compatible, Amaury's solution is appropriate.
History
Date User Action Args
2022-04-11 14:57:35  admin  set  github: 60015
2012-10-15 12:55:57  eli.bendersky  set  status: open -> closed
    resolution: wont fix
    messages: + msg172966
    stage: resolved
2012-08-29 17:00:53  David.Buxton  set  messages: + msg169396
2012-08-29 16:45:15  amaury.forgeotdarc  set  nosy: + amaury.forgeotdarc
    messages: + msg169395
2012-08-29 16:34:16  eli.bendersky  set  messages: + msg169392
2012-08-29 16:21:07  David.Buxton  set  files: + xml_declaration_test.patch
    keywords: + patch
    messages: + msg169391
2012-08-29 13:22:31  flox  set  nosy: + eli.bendersky, flox
    components: + Library (Lib)
2012-08-29 13:18:39  David.Buxton  create