Author lemburg
Recipients belopolsky, lemburg, loewis, vstinner
Date 2010-11-17.00:19:07
Alexander Belopolsky wrote:
> Alexander Belopolsky <> added the comment:
> On Tue, Nov 16, 2010 at 5:54 PM, Marc-Andre Lemburg
> <> wrote:
>> Marc-Andre Lemburg <> added the comment:
>> Please note that PyCodec_Encode()/PyCodec_Decode() will return whatever the codec returns for these operations.
>> The codec system is not limited to converting between Unicode and bytes only.
> Not according to the latest reST documentation:
> """
> * Encoding converts a string object to a bytes object using a
> particular character set encoding (e.g., cp1252 or iso-8859-1).
> * Decoding converts a bytes object encoded using a particular
> character set encoding to a string object.
> """

That's another documentation bug, then. The codec system has always
supported other type combinations for encoding/decoding as well.

Only certain methods on str and bytes objects in 3.x restrict the
types to str or bytes, which is probably where the idea comes from
that Python codecs don't support anything else.
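
To illustrate the difference, here is a minimal sketch, assuming a Python 3 version in which the rot_13 codec is available (3.2 or later). The variable names are only illustrative. The codecs module functions hand back whatever the codec returns, while the str method rejects a codec that does not produce bytes:

```python
import codecs

# codecs.encode()/codecs.decode() return whatever the codec returns;
# rot_13 is a same-type codec, so str goes in and str comes out.
encoded = codecs.encode("Hello, World!", "rot_13")
decoded = codecs.decode(encoded, "rot_13")

# str.encode() is restricted to str -> bytes codecs, so the same codec
# is refused there (LookupError on recent versions, TypeError on 3.2/3.3).
try:
    "Hello, World!".encode("rot_13")
    method_rejected = False
except (LookupError, TypeError):
    method_rejected = True

print(encoded, decoded, method_rejected)
```

The restriction sits in the str/bytes methods, not in the codec registry itself.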

The text from the 2.7 documentation is correct, also for 3.x:

>> A typical example is a same-type codec such as rot13 that only transforms Unicode data.
> I thought rot13 would only transform English (or Latin) alphabet.

Right, everything else passes through as-is.
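
A small sketch of that pass-through behaviour, again assuming the rot_13 codec as shipped with Python 3.2 or later (the sample string is made up for the example):

```python
import codecs

# Only the 26 ASCII letters (upper and lower case) are rotated;
# umlauts, Cyrillic letters, digits and punctuation are left untouched.
text = "Grüße, Привет 123!"
rotated = codecs.encode(text, "rot_13")

# Applying rot13 a second time restores the original text.
restored = codecs.encode(rotated, "rot_13")

print(rotated)
print(restored)
```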

Other examples are codecs that escape certain code points using e.g.
XML entity sequences, backslash notations or other such techniques.

For bytes, you have the zip, base64 and hex codecs which work in
a similar way.
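
As a rough sketch of the bytes case, assuming the codec names hex_codec, base64_codec and zlib_codec as registered in Python 3.2 and later (the payload is arbitrary sample data):

```python
import codecs

payload = b"hello world " * 4

# All three codecs take bytes and return bytes.
hexed = codecs.encode(b"hello", "hex_codec")
b64 = codecs.encode(payload, "base64_codec")
packed = codecs.encode(payload, "zlib_codec")

# Decoding through the same codec gives the original bytes back.
unhexed = codecs.decode(hexed, "hex_codec")
unb64 = codecs.decode(b64, "base64_codec")
unpacked = codecs.decode(packed, "zlib_codec")

print(hexed, unhexed)
print(unb64 == payload, unpacked == payload)
```
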
Date                 User     Action  Args
2010-11-17 00:19:09  lemburg  set     recipients: + lemburg, loewis, belopolsky, vstinner
2010-11-17 00:19:07  lemburg  link    issue10435 messages
2010-11-17 00:19:07  lemburg  create