Message 145998 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	ncoghlan
Recipients	belopolsky, benjamin.peterson, cben, eric.araujo, flox, georg.brandl, gvanrossum, lemburg, loewis, ncoghlan, ssbarnea, vstinner
Date	2011-10-20.01:53:07
SpamBayes Score	1.7946755e-13
Marked as misclassified	No
Message-id	<1319075588.9.0.343117960561.issue7475@psf.upfronthosting.co.za>
In-reply-to

Content
I'm fine with people needing to drop down to the lower level lookup() API if they want the filtering functionality in Python code. For most purposes, constraining the expected codec input and output formats really isn't a major issue - we just need it in the core in order to emit sane error messages when people misuse the convenience APIs based on things that used to work in 2.x (like 'a'.encode('base64')). At the C level, I'd adjust _PyCodec_Lookup to accept the two extra arguments and add _PyCodec_EncodeText, _PyCodec_DecodeBinary, _PyCodec_TransformText and _PyCodec_TransformBinary to support the convenience APIs (rather than needing the individual objects to know about the details of the codec tagging mechanism). Making new codecs available isn't a backwards compatibility problem - anyone relying on a particular key being absent from an extensible registry is clearly doing the wrong thing. Regarding the particular formats, I'd suggest that hex, base64, quopri, uu, bz2 and zlib all be flagged as binary transforms, but rot13 be implemented as a text transform (Florent's patch has rot13 as another binary transform, but it makes more sense in the text domain - this should just be a matter of adjusting some of the data types in the implementation from bytes to str)

I'm fine with people needing to drop down to the lower level lookup() API if they want the filtering functionality in Python code. For most purposes, constraining the expected codec input and output formats really isn't a major issue - we just need it in the core in order to emit sane error messages when people misuse the convenience APIs based on things that used to work in 2.x (like 'a'.encode('base64')).

At the C level, I'd adjust _PyCodec_Lookup to accept the two extra arguments and add _PyCodec_EncodeText, _PyCodec_DecodeBinary, _PyCodec_TransformText and _PyCodec_TransformBinary to support the convenience APIs (rather than needing the individual objects to know about the details of the codec tagging mechanism).

Making new codecs available isn't a backwards compatibility problem - anyone relying on a particular key being absent from an extensible registry is clearly doing the wrong thing.

Regarding the particular formats, I'd suggest that hex, base64, quopri, uu, bz2 and zlib all be flagged as binary transforms, but rot13 be implemented as a text transform (Florent's patch has rot13 as another binary transform, but it makes more sense in the text domain - this should just be a matter of adjusting some of the data types in the implementation from bytes to str)

History
Date	User	Action	Args
2011-10-20 01:53:09	ncoghlan	set	recipients: + ncoghlan, lemburg, gvanrossum, loewis, georg.brandl, cben, belopolsky, vstinner, benjamin.peterson, eric.araujo, ssbarnea, flox
2011-10-20 01:53:08	ncoghlan	set	messageid: <1319075588.9.0.343117960561.issue7475@psf.upfronthosting.co.za>
2011-10-20 01:53:08	ncoghlan	link	issue7475 messages
2011-10-20 01:53:07	ncoghlan	create