Message 96374 - Python tracker


Author	lemburg
Recipients	benjamin.peterson, flox, georg.brandl, lemburg, loewis, skip.montanaro
Date	2009-12-14.10:30:08
Message-id	<4B2613AE.2000007@egenix.com>
In-reply-to	<4B23EE1B.5070903@v.loewis.de>

Content
Martin v. Löwis wrote:
> 
> Martin v. Löwis <martin@v.loewis.de> added the comment:
> 
>> So, after reading the above comments, I think we may end up with
>> following changes:
>>  * restore the "bytes-to-bytes" codecs in the "encodings" package

+1

>>  * then create new helpers on bytes objects (either
>>    ".transform()/.untransform()" or ".encodebytes()/.decodebytes")

+1 - the names are still up for debate, IIRC.

> I would still be opposed to such a change, and I think it needs a PEP.

All this has already been discussed and the only reason it didn't
go in earlier was timing. No need for a PEP.

> If the codecs are restored, one half of them becomes available to
> .encode/.decode methods, since the codec registry cannot tell which
> ones implement real character encodings, and which ones are other
> conversion methods. So adding them would be really confusing.

Not at all. The helper methods check the return types and raise an
exception if the types don't match the expected types.
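A minimal sketch of that return-type check, assuming hypothetical helper names (str_encode, bytes_transform and checked_encode are illustrative only, not an agreed API). It relies on codecs.encode() and on the "hex_codec" and "rot_13" codecs as they exist in current Python 3 releases:

```python
import codecs


def checked_encode(obj, encoding, expected_type):
    """Run the named codec, then verify the type of what it returned."""
    result = codecs.encode(obj, encoding)
    if not isinstance(result, expected_type):
        raise TypeError(
            "codec %r returned %s, expected %s"
            % (encoding, type(result).__name__, expected_type.__name__)
        )
    return result


def str_encode(text, encoding):
    # Hypothetical stand-in for str.encode(): the result must be bytes.
    return checked_encode(text, encoding, bytes)


def bytes_transform(data, encoding):
    # Hypothetical stand-in for a bytes .transform() helper:
    # bytes in, bytes out.
    return checked_encode(data, encoding, bytes)


print(str_encode("abc", "utf-8"))
print(bytes_transform(b"abc", "hex_codec"))
try:
    # rot_13 is a str-to-str codec, so the text helper rejects its result.
    str_encode("abc", "rot_13")
except TypeError as exc:
    print("rejected:", exc)
```

With a check like this, the registry can hold both kinds of codec and each helper only lets through results of the type it promises.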

The codecs registry itself doesn't need to know about the possible
input/output types of codecs, since this information is not
required to match a name to an implementation.

What we could do is add that information to the CodecInfo object
used for registering the codec. codecs.lookup() would then
return the information to the application.

E.g.

.encode_input_types = (str,)
.encode_output_types = (bytes,)
.decode_input_types = (bytes,)
.decode_output_types = (str,)

Codecs not supporting these CodecInfo attributes would simply
report None for them.
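As a rough illustration of how an application could read such attributes, here is a sketch. The four attribute names are the ones proposed above and are not part of the stdlib CodecInfo; annotated_codec_info() and codec_types() are hypothetical helpers introduced only for this example:

```python
import codecs

# Proposed attribute names from this message; not defined by the stdlib.
TYPE_ATTRS = (
    "encode_input_types",
    "encode_output_types",
    "decode_input_types",
    "decode_output_types",
)


def annotated_codec_info(name, **types):
    """Build a fresh CodecInfo carrying the proposed type attributes."""
    base = codecs.lookup(name)
    # Create a new object so the registry's cached CodecInfo stays untouched.
    info = codecs.CodecInfo(base.encode, base.decode, name=base.name)
    for attr in TYPE_ATTRS:
        setattr(info, attr, types.get(attr))
    return info


def codec_types(info):
    """Return the type attributes, using None where a codec lacks them."""
    return {attr: getattr(info, attr, None) for attr in TYPE_ATTRS}


utf8 = annotated_codec_info(
    "utf-8",
    encode_input_types=(str,),
    encode_output_types=(bytes,),
    decode_input_types=(bytes,),
    decode_output_types=(str,),
)
print(codec_types(utf8))
# A codec registered without the attributes yields None for each of them.
print(codec_types(codecs.lookup("hex_codec")))
```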

> I also wonder why you are opposed to the import statement. My
> recommendation is indeed that you use the official API for these
> libraries (and indeed, there is an official API for each of them,
> unlike real codecs, which don't have any other documented API).

That's not the point. The codec API provides a standardized API for
all these encodings. The hex, zlib, bz2, etc. codecs are just
adapters of the different pre-existing APIs to the codec API.
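To illustrate the adapter idea, the sketch below puts the module-specific APIs next to the uniform codec API. It assumes a current Python 3 release, where these codecs are reachable through codecs.encode()/codecs.decode() under the names "hex_codec", "zlib_codec" and "bz2_codec"; that was not yet the case in 3.x when this message was written:

```python
import binascii
import bz2
import codecs
import zlib

data = b"spam and eggs " * 10

# Module-specific APIs: a different function name for every library.
hex_direct = binascii.hexlify(data)
zlib_direct = zlib.compress(data)
bz2_direct = bz2.compress(data)

# Codec API: one pair of calls, only the codec name changes.
for name in ("hex_codec", "zlib_codec", "bz2_codec"):
    encoded = codecs.encode(data, name)
    assert codecs.decode(encoded, name) == data
    print(name, len(encoded))

# Both routes interoperate, since the codec is a thin wrapper.
assert codecs.decode(hex_direct, "hex_codec") == data
assert zlib.decompress(codecs.encode(data, "zlib_codec")) == data
assert codecs.decode(bz2_direct, "bz2_codec") == data
```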

History
Date	User	Action	Args
2009-12-14 10:30:12	lemburg	set	recipients: + lemburg, loewis, skip.montanaro, georg.brandl, benjamin.peterson, flox
2009-12-14 10:30:11	lemburg	link	issue7475 messages
2009-12-14 10:30:08	lemburg	create