Author janssen
Recipients gvanrossum, janssen, jimjjewett, lemburg, loewis, mgiuca, orsenthil, pitrou, thomaspinckney3
Date 2008-08-08.00:18:54
Message-id <4b3e516a0808071718p1621b455j86933e2f1a56f144@mail.gmail.com>
In-reply-to <ca471dc20808071623v74ca2f35m947484f381f2a3fe@mail.gmail.com>
Content
On Thu, Aug 7, 2008 at 4:23 PM, Guido van Rossum <report@bugs.python.org> wrote:

>
> >> However I fear that this middle ground will in practice cause:
> >>
> >> (a) more in-the-field failures, since devs are notorious for testing
> >> with ASCII only; and
> >
> > Returning bytes deals with this problem.
>
> In an unpleasant way. We might as well consider changing all APIs that
> deal with URLs to insist on bytes.
>

That seems a bit over-the-top.  Most URL operations *are* about strings, and
most of the APIs should deal with strings; we're talking about the return
value of one operation specifically designed to extract binary data from the
one place where it's allowed to occur.  That is a vastly smaller change than
"changing all APIs that deal with URLs".

By the way, I see that the email package dodges this by converting the bytes
to strings using the codec "raw-unicode-escape".  In other words, it carries
byte sequences in the outward form of a string.  I'd be OK with that: make
the default codec for "unquote" be "raw-unicode-escape".  All the bytes
will come through unscathed, and people who are naively expecting ASCII
strings will still receive them, so their code won't break.  This actually
seems to be closest to the current usage, so I'm going to change my patch to
do that.
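
A minimal sketch of what that proposal could look like, for illustration only.
It is not the patch under discussion: it uses urllib.parse.unquote_to_bytes
from present-day Python 3 as a stand-in for the byte-extraction step, and the
name unquote_raw is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def unquote_raw(quoted):
    """Hypothetical helper: percent-decode, then expose the bytes as a str.

    Assumption: "raw_unicode_escape" (the same codec as the hyphenated
    "raw-unicode-escape") stands in for the proposed default codec.
    """
    raw = unquote_to_bytes(quoted)           # percent-escapes -> bytes
    return raw.decode("raw_unicode_escape")  # each byte -> one code point


# Pure-ASCII input comes back as an ordinary string.
print(unquote_raw("abc%20def"))

# A non-ASCII byte survives as a single code point and can be recovered.
text = unquote_raw("caf%E9")
print(repr(text), text.encode("raw_unicode_escape"))
```

One caveat with this codec: on decoding, a literal backslash-u sequence in the
data is interpreted as an escape, so bytes containing such a sequence do not
round-trip exactly.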
Files
File name Uploaded
unnamed janssen, 2008-08-08.00:18:53
History
Date User Action Args
2008-08-08 00:18:55 janssen set recipients: + janssen, lemburg, gvanrossum, loewis, jimjjewett, orsenthil, pitrou, thomaspinckney3, mgiuca
2008-08-08 00:18:54 janssen link issue3300 messages
2008-08-08 00:18:54 janssen create