Author ncoghlan
Recipients belopolsky, benjamin.peterson, cben, eric.araujo, flox, georg.brandl, gvanrossum, lemburg, loewis, ncoghlan, ssbarnea, vstinner
Date 2011-10-20.01:53:07
Message-id <1319075588.9.0.343117960561.issue7475@psf.upfronthosting.co.za>
In-reply-to
Content
I'm fine with people needing to drop down to the lower-level lookup() API if they want the filtering functionality in Python code. For most purposes, constraining the expected codec input and output formats really isn't a major issue - we just need it in the core in order to emit sane error messages when people misuse the convenience APIs based on things that used to work in 2.x (like 'a'.encode('base64')).
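As a rough illustration of the two levels discussed here, the sketch below uses codecs.lookup() to reach the base64 codec directly and then tries the str.encode() convenience API with the same codec name. It assumes a later CPython 3 release (3.4 or newer), where the 'base64' alias is registered and str.encode() refuses non-text encodings. That behaviour is not something this message itself specifies.

```python
import codecs

# Lower-level API: fetch the CodecInfo and call its stateless encoder directly.
info = codecs.lookup("base64")
encoded, consumed = info.encode(b"a")   # bytes in, bytes out

# Convenience API: str.encode() only accepts text encodings (str -> bytes),
# so asking it for a bytes-to-bytes codec fails with a LookupError.
error = None
try:
    "a".encode("base64")
except LookupError as exc:
    error = str(exc)

print(encoded, consumed)
print(error)
```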

At the C level, I'd adjust _PyCodec_Lookup to accept the two extra arguments and add _PyCodec_EncodeText, _PyCodec_DecodeBinary, _PyCodec_TransformText and _PyCodec_TransformBinary to support the convenience APIs (rather than needing the individual objects to know about the details of the codec tagging mechanism).
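The shape of that proposal can be sketched in Python, purely as an illustration. The tag table CODEC_TAGS and the helpers filtered_lookup, encode_text and transform_binary are hypothetical names invented for this sketch; they are not the C functions named above, and the real tagging mechanism may look quite different.

```python
import codecs

# Hypothetical tags: (input type, output type) for each codec's encode direction.
CODEC_TAGS = {
    "utf-8": (str, bytes),
    "base64": (bytes, bytes),
    "rot13": (str, str),
}

def filtered_lookup(name, input_type, output_type):
    """Look up a codec, but only return it if its tags match the request."""
    info = codecs.lookup(name)
    if CODEC_TAGS.get(name) != (input_type, output_type):
        raise LookupError(
            f"{name!r} is not a {input_type.__name__} -> "
            f"{output_type.__name__} codec"
        )
    return info

def encode_text(text, name):
    # Stand-in for a str.encode()-style convenience API: str -> bytes only.
    return filtered_lookup(name, str, bytes).encode(text)[0]

def transform_binary(data, name):
    # Stand-in for a bytes-to-bytes transform API.
    return filtered_lookup(name, bytes, bytes).encode(data)[0]
```

With this arrangement the calling objects never inspect the tags themselves; encode_text("a", "base64") fails inside the filtered lookup with a message naming the mismatch, while transform_binary(b"a", "base64") succeeds.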

Making new codecs available isn't a backwards compatibility problem - anyone relying on a particular key being absent from an extensible registry is clearly doing the wrong thing.

Regarding the particular formats, I'd suggest that hex, base64, quopri, uu, bz2 and zlib all be flagged as binary transforms, but that rot13 be implemented as a text transform. Florent's patch has rot13 as another binary transform, but it makes more sense in the text domain - this should just be a matter of adjusting some of the data types in the implementation from bytes to str.
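For a concrete picture of the binary/text split, the sketch below runs a few of these codecs through codecs.encode() and codecs.decode(). It assumes a later CPython 3 release (3.4 or newer) in which these aliases are registered; only hex, base64 and zlib are exercised to keep the example short.

```python
import codecs

# Binary transforms: bytes in, bytes out, in both directions.
round_trips = {}
for name in ("hex", "base64", "zlib"):
    encoded = codecs.encode(b"spam", name)
    assert isinstance(encoded, bytes)
    round_trips[name] = codecs.decode(encoded, name)

# Text transform: rot13 maps str to str.
rotated = codecs.encode("spam", "rot13")
restored = codecs.decode(rotated, "rot13")

print(round_trips)
print(rotated, restored)
```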
History
Date                 User      Action  Args
2011-10-20 01:53:09  ncoghlan  set     recipients: + ncoghlan, lemburg, gvanrossum, loewis, georg.brandl, cben, belopolsky, vstinner, benjamin.peterson, eric.araujo, ssbarnea, flox
2011-10-20 01:53:08  ncoghlan  set     messageid: <1319075588.9.0.343117960561.issue7475@psf.upfronthosting.co.za>
2011-10-20 01:53:08  ncoghlan  link    issue7475 messages
2011-10-20 01:53:07  ncoghlan  create