Author janssen
Recipients janssen, loewis, mgiuca, orsenthil, thomaspinckney3
Date 2008-08-06.05:59:41
Message-id <1218002384.67.0.585194589097.issue3300@psf.upfronthosting.co.za>
Content
Here's my version of how quote and unquote should be implemented in
Python 3.0.  I haven't looked at how the library uses them, but I'd
expect improper uses (and there are lots of them) to break, so that
they can be found and fixed.

Basically, percent-quoting is about creating an ASCII string that can be
safely used in a URI from an arbitrary sequence of octets.  So, my version
of quote() takes either a byte sequence or a string, percent-quotes
the unsafe octets, and returns a str.  If a str is supplied on input,
it is first encoded as UTF-8, then the octets of that encoding are
percent-quoted.
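
To make the quote() behaviour described above concrete, here is a minimal
sketch.  It is not the attached patch: the set of characters treated as safe
(the RFC 3986 unreserved characters plus "/") and the `safe` parameter are
assumptions made for illustration only.

```python
# Assumption: "safe" octets are the RFC 3986 unreserved characters,
# plus whatever the caller passes in `safe` ("/" by default).
_ALWAYS_SAFE = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789"
    b"-._~"
)


def quote(s, safe="/"):
    """Percent-quote a str or byte sequence and return an ASCII-only str."""
    if isinstance(s, str):
        # A str is first encoded as UTF-8; its octets are then quoted.
        s = s.encode("utf-8")
    allowed = _ALWAYS_SAFE | frozenset(safe.encode("ascii"))
    # Iterating over bytes yields ints, one per octet.
    return "".join(
        chr(octet) if octet in allowed else "%{:02X}".format(octet)
        for octet in bytes(s)
    )
```

With this sketch, quote("é") and quote(b"\xc3\xa9") give the same result,
because the str is encoded as UTF-8 before quoting, while quote(b"\xe9")
quotes the single Latin-1 octet unchanged.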

For unquote, there's no way to tell what the octets of the quoted
sequence may mean, so it takes the percent-quoted ASCII string and
returns a byte sequence with the unquoted bytes.  For convenience, since
the unquoted bytes are often a string in some particular character set
encoding, I've also supplied unquote_as_string(), which takes an
optional character set, first unquotes the bytes, then decodes them
to a str using that character set encoding, and returns the resulting
string.
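
A rough sketch of the two unquoting functions described above follows.
Again, this is illustrative rather than the attached patch: the parameter
name `charset`, its UTF-8 default, and the choice to leave malformed escapes
untouched are all assumptions.

```python
import re

# Matches one well-formed %XX escape in an ASCII byte string.
_HEX_ESCAPE = re.compile(rb"%([0-9A-Fa-f]{2})")


def unquote(s):
    """Decode %XX escapes in an ASCII str and return the raw bytes."""
    raw = s.encode("ascii")  # a quoted string is ASCII by construction
    # Assumption: escapes that are not valid hex pairs are left as-is.
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def unquote_as_string(s, charset="utf-8"):
    """Unquote to bytes, then decode them with the given character set."""
    # Assumption: UTF-8 is the default when no character set is given.
    return unquote(s).decode(charset)
```

The same quoted input can then be read under different encodings:
unquote("%E9") returns the single byte b"\xe9", which
unquote_as_string("%E9", "latin-1") decodes to "é", whereas decoding it as
UTF-8 raises UnicodeDecodeError.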
History
Date User Action Args
2008-08-06 05:59:44  janssen  set     recipients: + janssen, loewis, orsenthil, thomaspinckney3, mgiuca
2008-08-06 05:59:44  janssen  set     messageid: <1218002384.67.0.585194589097.issue3300@psf.upfronthosting.co.za>
2008-08-06 05:59:43  janssen  link    issue3300 messages
2008-08-06 05:59:42  janssen  create