
Author lemburg
Recipients doerwalter, eric.araujo, lemburg, loewis, pitrou, vstinner
Date 2010-05-28.13:23:29
Message-id <4BFFC3CF.8090406@egenix.com>
In-reply-to <1275051738.3101.7.camel@localhost.localdomain>
Content
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> class BinaryDataCodec(codecs.Codec):
>>
>>     # Note: Binding these as C functions will result in the class not
>>     # converting them to methods. This is intended.
>>     encode = codecs.readbuffer_encode
>>     decode = codecs.latin_1_decode
> 
> What's the point, though? Creating a non-symmetrical codec doesn't sound
> like a very useful or recommendable thing to do.

Why not? If you're only interested in the binary data and
don't care about the original input object type, that's a
very natural thing to do.

E.g. you could use a memory-mapped file as input to the encoder.
Would you really expect the codec to recreate such a file object when
decoding the binary data?
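
A minimal sketch of that use case, assuming a current CPython 3 where
codecs.readbuffer_encode is still available (it is undocumented). The
anonymous 8-byte mmap merely stands in for a real mapped file, and the
variable names are only illustrative:

```python
import codecs
import mmap


class BinaryDataCodec(codecs.Codec):
    # Built-in functions are not converted to bound methods, so these
    # are called without an implicit self argument.
    encode = codecs.readbuffer_encode
    decode = codecs.latin_1_decode


codec = BinaryDataCodec()

# Anonymous, zero-filled memory map used in place of a mapped file.
with mmap.mmap(-1, 8) as m:
    m[:5] = b"hello"
    # The encoder only sees the buffer contents, not the mmap object.
    encoded, consumed = codec.encode(m)

# Decoding yields a str; nothing tries to rebuild the mmap.
decoded, _ = codec.decode(encoded)
```

The encoder returns a plain bytes object and the decoder returns a str,
so the round trip deliberately does not restore the input type.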

> Especially in the py3k
> codec model where encode() only works on unicode objects.

That's a common misunderstanding. The codec system does not
mandate a specific type combination. Only the helper methods
.encode() on str objects and .decode() on bytes objects in
Python 3 do.
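
To illustrate the difference, here is a short sketch assuming a recent
CPython 3 release in which the hex_codec and rot_13 codecs are
registered. The module-level codecs.encode() and codecs.decode() pass
any object through to the codec, while the str method rejects codecs
that are not text encodings:

```python
import codecs

# bytes -> bytes through the codec registry
hex_encoded = codecs.encode(b"abc", "hex_codec")
hex_decoded = codecs.decode(hex_encoded, "hex_codec")

# str -> str through the codec registry
rot = codecs.encode("abc", "rot_13")

# The helper method on str only allows text encodings, so the same
# codec is refused here.
try:
    "abc".encode("hex_codec")
except (LookupError, TypeError) as exc:
    str_method_error = exc
else:
    str_method_error = None
```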

>> While it's possible to emulate the functions via other methods,
>> these methods always introduce intermediate objects, which isn't
>> necessary and only costs performance.
> 
> The bytes() constructor doesn't (shouldn't) create any more intermediate
> objects than read/charbuffer_encode() do.

Looking at the code, the data takes quite a long path through
the whole machinery. For non-Unicode objects, the bytes() constructor
always tries to create an integer first and only if that fails does it
fall back to the buffer interface after a few more function calls.

Furthermore, the bytes() constructor accepts a lot more
objects than the "s#" parser marker, e.g. lists of integers,
plain integers, arbitrary iterators, which a codec
just interested in the binary representation of an
object via the buffer interface most likely doesn't
want to accept.
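
A small sketch of the difference in accepted inputs, again assuming a
current CPython 3 with codecs.readbuffer_encode available. The helper
encode_via_buffer is a hypothetical name used only for this example:

```python
import codecs

# bytes() happily accepts inputs that are not binary buffers at all.
from_list = bytes([104, 105])            # list of integers
from_int = bytes(2)                      # plain integer: two zero bytes
from_iter = bytes(iter([104, 105]))      # arbitrary iterator


def encode_via_buffer(obj):
    """Return the bytes of obj via readbuffer_encode, or None if rejected."""
    try:
        return codecs.readbuffer_encode(obj)[0]
    except TypeError:
        return None


from_buffer = encode_via_buffer(bytearray(b"hi"))
rejected_list = encode_via_buffer([104, 105])
rejected_int = encode_via_buffer(2)
```

A codec built on the buffer-based function therefore fails early on
lists and integers instead of silently converting them.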

> And all this doesn't address the fact that these functions have never
> been documented, and don't seem used in the outside world
> (understandably so, since there's no way to know about their existence,
> and their intended use).

That's a documentation bug and probably the result of the fact
that none of the exposed encoder/decoder APIs are documented.
History
Date                 User     Action  Args
2010-05-28 13:23:31  lemburg  set     recipients: + lemburg, loewis, doerwalter, pitrou, vstinner, eric.araujo
2010-05-28 13:23:29  lemburg  link    issue8838 messages
2010-05-28 13:23:29  lemburg  create