
Author Neil Muller
Recipients Neil Muller, effbot, hodgestar
Date 2009-06-07.21:30:58
Message-id <1244410260.68.0.534028988218.issue6233@psf.upfronthosting.co.za>
In-reply-to
Content
In py3k, ElementTree no longer correctly converts characters to entities
when they can't be represented in the requested output encoding.

Python 2:

>>> import xml.etree.ElementTree as ET
>>> e = ET.XML("<?xml version='1.0' encoding='iso-8859-1'?><body>t\xe3t</body>")
>>> ET.tostring(e, 'ascii')
"<?xml version='1.0' encoding='ascii'?>\n<body>t&#227;t</body>"

Python 3:

>>> import xml.etree.ElementTree as ET
>>> e = ET.XML("<?xml version='1.0' encoding='iso-8859-1'?><body>t\xe3t</body>")
>>> ET.tostring(e, 'ascii')
.....
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)


It looks like _encode_entity is never called inside ElementTree
anymore. It should probably be called from _encode for characters
that the output encoding can't represent.
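
For comparison, the character-reference fallback that the Python 2 output shows can be approximated in Python 3 with the standard "xmlcharrefreplace" codec error handler. This is only a minimal sketch of the expected behaviour: encode_text is a hypothetical helper written for illustration, not ElementTree's actual _encode or _encode_entity, and it handles text content only (not attribute values).

```python
def encode_text(text, encoding="ascii"):
    """Hypothetical helper for illustration; not ElementTree's own code."""
    # Escape XML markup characters first, so the "&" of the character
    # references produced below is not escaped a second time.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    # "xmlcharrefreplace" is a standard codec error handler: any character
    # the target encoding cannot represent becomes a numeric reference.
    return text.encode(encoding, "xmlcharrefreplace")


print(encode_text("t\xe3t"))             # b't&#227;t'
print(encode_text("t\xe3t", "latin-1"))  # b't\xe3t'
```

With 'ascii' the unrepresentable character is written as &#227;, matching the Python 2 result, while an encoding that can represent it (such as latin-1) leaves it untouched.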
History
Date                 User         Action  Args
2009-06-07 21:31:00  Neil Muller  set     recipients: + Neil Muller, effbot, hodgestar
2009-06-07 21:31:00  Neil Muller  set     messageid: <1244410260.68.0.534028988218.issue6233@psf.upfronthosting.co.za>
2009-06-07 21:30:58  Neil Muller  link    issue6233 messages
2009-06-07 21:30:58  Neil Muller  create