Message 117668 - Python tracker

Author	ncoghlan
Recipients	eric.araujo, eric.smith, ncoghlan, pitrou, r.david.murray
Date	2010-09-29.22:23:39
Message-id	<AANLkTi=X08tqYbm8iSqf9UC22mObpeZtrUXpxjhGFQ3g@mail.gmail.com>
In-reply-to	<1285798167.3194.56.camel@localhost.localdomain>

Content
> I think it's quite misguided. latin1 encoding and decoding is blindingly
> fast (on the order of 1GB/s. here). Unless you have multi-megabyte URLs,
> you won't notice any overhead.

Ah, I didn't know that (although it makes sense now I think about it).
I'll start exploring ideas along those lines then. Having to name all
the literals as I do in the patch is really quite ugly.

A general sketch of such a strategy would be to stick the following
near the start of affected functions:

encode_result = not isinstance(url, str)  # or whatever the main parameter is called
if encode_result:
    url = url.decode('latin-1')
    # decode any other arguments that need it
    # Select the bytes versions of any relevant globals
else:
    # Select the str versions of any relevant globals
    pass

Then, at the end, do an encoding step. However, the encoding step may
get a little messy when it comes to the structured data types. For
that, I'll probably take a leaf out of the email6 book and create a
parallel bytes API, with appropriate encode/decode methods to
transform one into the other.
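
For illustration only, the decode / process / encode strategy sketched above might look roughly like the following. This is a minimal sketch under stated assumptions, not the actual patch: split_scheme, SplitResult and SplitResultBytes are hypothetical names invented for this example and are not claimed to match anything in urllib.parse. The str result type gets an encode() method and the bytes result type a matching decode() method, so the structured data can be transformed in either direction.

```python
from collections import namedtuple


# Hypothetical str-based result type (an assumption for this sketch).
class SplitResult(namedtuple("SplitResult", "scheme rest")):
    def encode(self, encoding="latin-1"):
        # Encode every field to produce the parallel bytes result.
        return SplitResultBytes(*(part.encode(encoding) for part in self))


# Hypothetical parallel bytes result type (also an assumption).
class SplitResultBytes(namedtuple("SplitResultBytes", "scheme rest")):
    def decode(self, encoding="latin-1"):
        # Decode every field to produce the str result.
        return SplitResult(*(part.decode(encoding) for part in self))


def split_scheme(url):
    """Toy function: split 'scheme:rest', returning the same type family as the input."""
    encode_result = not isinstance(url, str)
    if encode_result:
        # latin-1 maps each byte to one code point, so this round-trips losslessly.
        url = url.decode("latin-1")

    # All real work happens on str, so no bytes literals need to be named.
    scheme, sep, rest = url.partition(":")
    if not sep:
        scheme, rest = "", url
    result = SplitResult(scheme, rest)

    # Encoding step at the end, only if the caller passed bytes.
    return result.encode() if encode_result else result
```

Calling split_scheme with a str returns a SplitResult of str fields, and calling it with bytes returns a SplitResultBytes of bytes fields. Because latin-1 is a one-to-one mapping between bytes and the first 256 code points, non-ASCII bytes survive the round trip unchanged.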

History
Date	User	Action	Args
2010-09-29 22:23:41	ncoghlan	set	recipients: + ncoghlan, pitrou, eric.smith, eric.araujo, r.david.murray
2010-09-29 22:23:39	ncoghlan	link	issue9873 messages
2010-09-29 22:23:39	ncoghlan	create