Message100333
The xml.etree.ElementTree package in the Python 3.x standard library breaks compatibility with existing ET 1.2 code. The serialiser returns a unicode string when no encoding is passed. Previously, the serialiser was guaranteed to return a byte string. By default, the string was 7-bit ASCII compatible.
This behavioural change breaks all code that relies on the default behaviour of ElementTree. Since Python 3 no longer converts implicitly between unicode strings and byte strings, the two are incompatible, which means that the result of the serialisation can no longer be written to a file opened in binary mode, for example.
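To illustrate the incompatibility, here is a minimal sketch. It assumes a current Python 3 release, where `encoding="unicode"` is the spelling that requests a text string from `xml.etree.ElementTree.tostring`; the release this message was written against may have behaved differently. The element names and the in-memory `BytesIO` sink are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document, just for illustration.
root = ET.Element("root")
ET.SubElement(root, "child").text = "caf\u00e9"

# An explicit byte encoding yields a byte string.
as_bytes = ET.tostring(root, encoding="utf-8")

# Assumption: in current releases, encoding="unicode" yields a text string.
as_text = ET.tostring(root, encoding="unicode")

# A binary sink (standing in for a file opened with "wb") accepts bytes only.
binary_sink = io.BytesIO()
binary_sink.write(as_bytes)

try:
    binary_sink.write(as_text)
    text_write_failed = False
except TypeError:
    # str cannot be written to a binary stream without encoding it first.
    text_write_failed = True
```

Code that passes the serialiser's result straight to a binary stream only keeps working if that result is a byte string.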
XML is well defined as a stream of bytes. Redefining it as a unicode string *by default* is hard to understand at best.
Finally, it would have been good to look at the other ET implementation before introducing such a change. The lxml.etree package has had support for serialising XML into a unicode string for years, and does so in a clear, safe and explicit way. It requires the user to pass the 'unicode' (Py3 'str') type as the encoding parameter, e.g.
etree.tostring(tree, encoding=str)
which is explicit enough to make it clear that this is different from a normal encoding.

Date | User | Action | Args
2010-03-03 07:15:25 | scoder | set | recipients: + scoder
2010-03-03 07:15:25 | scoder | set | messageid: <1267600525.56.0.547982490868.issue8047@psf.upfronthosting.co.za>
2010-03-03 07:15:23 | scoder | link | issue8047 messages
2010-03-03 07:15:22 | scoder | create |