Author lemburg
Recipients lemburg, martin.panter, ncoghlan
Date 2013-11-10.11:40:06
Message-id <527F7086.1080708@egenix.com>
In-reply-to <1384075247.27.0.0100141647279.issue19543@psf.upfronthosting.co.za>
Content
On 10.11.2013 10:20, Nick Coghlan wrote:
> 
> The long discussion in issue 7475 and some subsequent discussions I had with Armin Ronacher have made it clear to me that the key distinction between the codec systems in Python 2 and Python 3 is the following difference in the type signatures of various operations:
> 
> Python 2 (8 bit str):
> 
>     codecs module: object <-> object
>     convenience methods: basestring <-> basestring
>     available codecs: unicode <-> str, str <-> str, unicode <-> unicode
> 
> Python 3 (Unicode str):
> 
>     codecs module: object <-> object
>     convenience methods: str <-> bytes
>     available codecs: str <-> bytes, bytes <-> bytes, str <-> str
> 
> The significant distinction is the fact that, in Python 2, the convenience methods covered all standard library codecs, but for Python 3, the codecs module needs to be used directly for the bytes <-> bytes codecs and the one str <-> str codec (since those codecs no longer satisfy the constraints of the text model related convenience methods).
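
For reference, the distinction described above can be seen in a short Python 3 session. This is only a sketch using the stdlib `hex_codec` and `rot_13` codecs as examples; the exact exception type raised by the convenience method differs between Python 3 versions, so both are caught here.

```python
import codecs

# bytes <-> bytes codec: reachable through the codecs module functions.
hex_encoded = codecs.encode(b"hello", "hex_codec")
hex_decoded = codecs.decode(hex_encoded, "hex_codec")

# str <-> str codec: also reachable through the codecs module functions.
rot = codecs.encode("hello", "rot_13")

# The str.encode() convenience method is limited to str -> bytes,
# so it refuses the str <-> str codec.
try:
    "hello".encode("rot_13")
    method_rejected = False
except (LookupError, TypeError):
    method_rejected = True

print(hex_encoded, hex_decoded, rot, method_rejected)
```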

Please remember that the codec subsystem is extensible: more
codecs can easily be added by registering codec search
functions.

Whatever you add as a warning has to take into account that
there may be codecs in the system that are not part of the
stdlib and that use type combinations other than the ones
you listed above.
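
As a rough illustration of the point, here is a minimal sketch of a third-party codec registered through a search function. The codec name `csv_ints` and the helper functions are hypothetical and exist only for this example; the codec maps list-of-int <-> str, a type combination that is not among those listed above.

```python
import codecs

def csv_ints_encode(obj, errors="strict"):
    # Hypothetical encoder: list of ints -> str.
    # Codec functions return (output, length of input consumed).
    return ",".join(str(i) for i in obj), len(obj)

def csv_ints_decode(obj, errors="strict"):
    # Hypothetical decoder: str -> list of ints.
    return [int(part) for part in obj.split(",")], len(obj)

def search(name):
    # Search functions return a CodecInfo for names they handle,
    # and None for everything else.
    if name == "csv_ints":
        return codecs.CodecInfo(
            encode=csv_ints_encode,
            decode=csv_ints_decode,
            name="csv_ints",
        )
    return None

codecs.register(search)

encoded = codecs.encode([1, 2, 3], "csv_ints")
decoded = codecs.decode(encoded, "csv_ints")
print(encoded, decoded)
```

Since `codecs.encode()` and `codecs.decode()` pass objects through without type checks, such a codec works there even though no convenience method could use it.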
History
Date User Action Args
2013-11-10 11:40:06 lemburg set recipients: + lemburg, ncoghlan, martin.panter
2013-11-10 11:40:06 lemburg link issue19543 messages
2013-11-10 11:40:06 lemburg create