Message 100633 - Python tracker


Author	scoder
Recipients	effbot, flox, pitrou, r.david.murray, scoder
Date	2010-03-08.09:01:16
SpamBayes Score	1.0967088e-10
Marked as misclassified	No
Message-id	<1268038878.95.0.129268603057.issue8047@psf.upfronthosting.co.za>
In-reply-to

Content
Antoine, in the same comment, you say that it was not backported to Py2 in order to prevent breaking existing code, and then you ask if it's difficult to support in lxml. ;-)

Supporting the same behaviour in lxml would mean either that it breaks existing code in Py2 (when making the API consistent), or that you can safely (and correctly) write the return value to a file in Py2 but can't do the same in Py3 (when adopting the change only in Py3).

Previously, in ElementTree, serialising without an explicit encoding was a way to get a byte-encoded serialisation without an XML declaration header, so I expect there to be code that depends on this. Since ElementTree 1.3 uses the same keyword argument as lxml for this feature, I assume that Florent's patches provide at least an alternative here, even if it requires users to adapt their code.
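
For illustration, here is a minimal sketch of the serialisation modes discussed above. It assumes `xml.etree.ElementTree` as it behaves in current Python 3 releases, which is not necessarily the 2010 state of the code this message refers to. The element names and text are made up, and the keyword argument is assumed to be `xml_declaration`, as accepted by `ElementTree.write()`.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
root = ET.Element("root")
ET.SubElement(root, "child").text = "caf\u00e9"

# No explicit encoding: a byte string, serialised as US-ASCII with
# character references, and without an XML declaration header.
default_out = ET.tostring(root)

# encoding="unicode": a text (str) result instead of bytes.
text_out = ET.tostring(root, encoding="unicode")

# Explicitly requesting the declaration via the keyword argument
# (assumed here to be ``xml_declaration`` on ElementTree.write()).
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="utf-8", xml_declaration=True)
declared_out = buf.getvalue()
```

Code that relied on the default call returning declaration-free bytes corresponds to the first case. The last case shows the explicit alternative, which requires callers to adapt.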

I just wish this backwards-incompatible feature had been advertised at the time, or at least *documented* in some way. Even the latest 3.2-dev docs still state that the default encoding of the serialiser is US-ASCII. There is not a word about *ever* returning a unicode string, especially not by default, and certainly not the required big fat warning that writing to a file will fail with mysterious errors if no encoding is specified.
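
To make the file-writing problem concrete, here is a small sketch. It assumes that a unicode serialisation is obtained with `encoding="unicode"` in current Python 3, standing in for the default behaviour complained about above. An in-memory `io.BytesIO` stands in for a file opened in binary mode.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("root")  # hypothetical sample element

# A text (str) serialisation, standing in for the unicode result
# described in the message.
text = ET.tostring(root, encoding="unicode")

binary_file = io.BytesIO()  # stand-in for open(path, "wb")

# Writing text to a binary file fails with a TypeError.
try:
    binary_file.write(text)
except TypeError as exc:
    write_error = exc
else:
    write_error = None

# Encoding explicitly before writing avoids the failure.
binary_file.write(text.encode("utf-8"))
written = binary_file.getvalue()
```

The failure only shows up at the point of the write, far from the serialisation call that produced the string. This is presumably why a warning in the documentation matters here.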

History
Date	User	Action	Args
2010-03-08 09:01:19	scoder	set	recipients: + scoder, effbot, pitrou, r.david.murray, flox
2010-03-08 09:01:18	scoder	set	messageid: <1268038878.95.0.129268603057.issue8047@psf.upfronthosting.co.za>
2010-03-08 09:01:17	scoder	link	issue8047 messages
2010-03-08 09:01:16	scoder	create