Author ncoghlan
Recipients eric.araujo, eric.smith, ncoghlan, pitrou, r.david.murray
Date 2010-09-29.22:23:39
Message-id <AANLkTi=X08tqYbm8iSqf9UC22mObpeZtrUXpxjhGFQ3g@mail.gmail.com>
In-reply-to <1285798167.3194.56.camel@localhost.localdomain>
Content
> I think it's quite misguided. latin1 encoding and decoding is blindingly
> fast (on the order of 1 GB/s here). Unless you have multi-megabyte URLs,
> you won't notice any overhead.

Ah, I didn't know that (although it makes sense now I think about it).
I'll start exploring ideas along those lines then. Having to name all
the literals as I do in the patch is really quite ugly.

A general sketch of such a strategy would be to stick the following
near the start of affected functions:

# "url" stands in for whatever the main parameter is called
encode_result = not isinstance(url, str)
if encode_result:
    url = url.decode('latin-1')
    # decode any other arguments that need it
    # Select the bytes versions of any relevant globals
else:
    # Select the str versions of any relevant globals
    pass

Then, at the end, do an encoding step. However, the encoding step may
get a little messy when it comes to the structured data types. For
that, I'll probably take a leaf out of the email6 book and create a
parallel bytes API, with appropriate encode/decode methods to
transform one into the other.
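
To make the shape of that concrete, here is a minimal sketch of the whole round trip: decode at the top, work purely in str, then encode the structured result on the way out. The names split_scheme, PartsStr and PartsBytes are invented for illustration and are not taken from the patch; the real result types would carry more fields.

```python
from collections import namedtuple


# Hypothetical structured result types; the names are invented for this sketch.
class PartsStr(namedtuple("PartsStr", "scheme rest")):
    __slots__ = ()

    def encode(self, encoding="latin-1"):
        # Transform the str result into its bytes counterpart.
        return PartsBytes(*(part.encode(encoding) for part in self))


class PartsBytes(namedtuple("PartsBytes", "scheme rest")):
    __slots__ = ()

    def decode(self, encoding="latin-1"):
        # Transform the bytes result into its str counterpart.
        return PartsStr(*(part.decode(encoding) for part in self))


def split_scheme(url):
    """Split 'scheme:rest' for either str or bytes input (illustrative only)."""
    encode_result = not isinstance(url, str)
    if encode_result:
        # latin-1 maps every byte value to a code point, so this never fails
        # and the original bytes survive the round trip unchanged.
        url = url.decode("latin-1")

    # From here on the body only ever deals with str literals.
    scheme, sep, rest = url.partition(":")
    if not sep:
        scheme, rest = "", url
    result = PartsStr(scheme, rest)

    # The encoding step: hand back the same kind of data we were given.
    return result.encode() if encode_result else result
```

With this layout the function body needs no bytes literals at all, and callers get PartsBytes back for bytes input and PartsStr for str input. Whether latin-1 is the right choice for every argument is an assumption of the sketch.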
History
Date User Action Args
2010-09-29 22:23:41 ncoghlan set recipients: + ncoghlan, pitrou, eric.smith, eric.araujo, r.david.murray
2010-09-29 22:23:39 ncoghlan link issue9873 messages
2010-09-29 22:23:39 ncoghlan create