Title: xml.etree.ElementTree.tostring returns type bytes, expected type str
Type: behavior Stage: resolved
Components: Documentation, XML Versions: Python 3.1
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Gunnar.Eikman, JTMoon79, docs@python, r.david.murray, serhiy.storchaka
Priority: normal Keywords:

Created on 2011-01-19 00:12 by JTMoon79, last changed 2015-11-26 17:17 by serhiy.storchaka. This issue is now closed.

Messages (5)
msg126506 - Author: JamesThomasMoon1979 (JTMoon79) Date: 2011-01-19 00:12
method xml.etree.ElementTree.tostring from module returns type bytes.
The documentation reads
"""Returns an encoded string containing the XML data."""
(from as of 2011-01-18)

Here is a test program:
# created for python 3.1

import sys
print(sys.version) # for help verifying version tested
from xml.etree import ElementTree

sampleinput = """<?xml version="1.0"?><Hello></Hello>"""
xmlobj = ElementTree.fromstring(sampleinput)
xmlstr = ElementTree.tostring(xmlobj,'utf-8')
print("xmlstr value is '", xmlstr, "'", sep="")
print("xmlstr type is '", type(xmlstr), "'", sep="")
test program output:
3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)]
xmlstr value is 'b'<Hello />''
xmlstr type is '<class 'bytes'>'

The cheap "fix" for this bug may simply be a change in documentation.
However, a method called "tostring" really should return something nearer to the built-in str.
msg126507 - Author: JamesThomasMoon1979 (JTMoon79) Date: 2011-01-19 00:14
Some other bugs affecting the tostring method (for consideration by the reviewer):
msg126517 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-19 02:10
This is indeed a doc problem, although there was some discussion of working toward a method rename.  See issue 8047 (but be prepared to read a novel to understand why tostring returns bytes...)  The doc for 3.2 is slightly clearer, but both 3.1 and 3.2 could be made clearer by referring to an 'encoded byte string' rather than just an 'encoded string'.  (An encoded string has to be a byte string, but that isn't obvious unless you've dealt with encode/decode a bunch.)

Technically this could be closed as a duplicate of issue 8047, since that issue proposes that the API fix (which would include the doc change) be backported to 3.1.  But no one has proposed a patch there, so at a minimum the 3.1 docs should be clarified.
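The point about an "encoded string" being a byte string can be illustrated with a minimal sketch. The sample value below is made up for illustration and is not taken from the report:

```python
# A str holds Unicode text; encoding it always yields bytes.
text = "<Hello />"                 # illustrative sample value
encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(type(text).__name__, type(encoded).__name__, type(decoded).__name__)
print(decoded == text)
```

So any API that promises an "encoded" result in Python 3 is promising bytes, even when its name says "string".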
msg159021 - Author: Gunnar Eikman (Gunnar.Eikman) Date: 2012-04-23 14:42
I moved a working script from Ubuntu (Python 3.1.2) to Windows (Python 3.2.3) today.

Had to revise script. The tostring method returns a string on Linux (contradicts this issue), but bytes on Windows (as described in this issue)...

I used tostring with a single argument "tostring(theXml)"

Is there an explanation for this? I am not an advanced Python hacker...

Be careful when moving from one environment to another!
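One defensive way to avoid depending on the return type when a script moves between environments is to normalize the result. This is only a sketch: the helper name element_to_text is hypothetical and not part of ElementTree, and it assumes the bytes were produced with the same encoding that was passed in.

```python
from xml.etree import ElementTree


def element_to_text(element, encoding="utf-8"):
    """Hypothetical helper: serialize an element and always return str."""
    data = ElementTree.tostring(element, encoding=encoding)
    if isinstance(data, bytes):
        # Decode with the same encoding that was requested above.
        return data.decode(encoding)
    return data


root = ElementTree.fromstring("<Hello></Hello>")
result = element_to_text(root)
print(type(result).__name__)
```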
msg255058 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-21 15:09
The documentation now explains the return type of tostring():
Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding [1] is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml").

It looks like this issue can be closed.
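The behavior described in the quoted documentation can be checked with a short sketch. It assumes a Python version where encoding="unicode" is supported (3.2 or later) and reuses the element from the original test program:

```python
from xml.etree import ElementTree

root = ElementTree.fromstring("<Hello></Hello>")

# Default encoding (US-ASCII): the result is a bytes object.
as_bytes = ElementTree.tostring(root)

# encoding="unicode": the result is a str.
as_text = ElementTree.tostring(root, encoding="unicode")

print(type(as_bytes).__name__)
print(type(as_text).__name__)
```

Passing encoding="unicode" is the documented way to get a str directly, without a separate decode step.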
History
Date                 User              Action  Args
2015-11-26 17:17:05  serhiy.storchaka  set     status: pending -> closed; resolution: out of date; stage: needs patch -> resolved
2015-11-21 15:09:25  serhiy.storchaka  set     status: open -> pending; nosy: + serhiy.storchaka; messages: + msg255058
2012-04-23 14:42:06  Gunnar.Eikman     set     nosy: + Gunnar.Eikman; messages: + msg159021
2011-01-19 02:10:56  r.david.murray    set     nosy: + r.david.murray; messages: + msg126517; stage: needs patch
2011-01-19 00:14:28  JTMoon79          set     messages: + msg126507
2011-01-19 00:12:25  JTMoon79          create