Author dmahn
Recipients dmahn
Date 2009-03-10.14:45:11
Message-id <1236696316.6.0.478633045289.issue5468@psf.upfronthosting.co.za>
Content
urllib.parse.urlencode() relies on quote_plus() to build a complete
query string, but does not take advantage of the flexibility built
into quote_plus().  Namely:

1) Instances of type "bytes" are not properly encoded, because str() is
applied before the value is passed to quote_plus().  This produces the
repr of the object, a nonsensical string such as "b'1234'", whereas
quote_plus() handles bytes properly if they are passed intact.  The
ability to encode this type is particularly useful for putting binary
data into the query string, or for text that has already been encoded
in a non-standard character encoding.

2) Sometimes it is desirable to encode query strings entirely in
"latin-1" or possibly "ascii" instead of "utf-8".  Passing through the
extra parameters now present on quote_plus() (encoding and errors)
would provide that functionality easily.
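
To illustrate both points, here is a minimal sketch.  It is not the
attached patch: encode_pairs() and _quote_item() are hypothetical names
made up for this example, and the sketch assumes only the documented
quote_plus(string, safe, encoding, errors) signature.  Note that
quote_plus() refuses encoding/errors for bytes input, so those
arguments are forwarded for str values only.

```python
from urllib.parse import quote_plus


def _quote_item(item, safe, encoding, errors):
    """Hypothetical helper: quote one key or value for a query string."""
    if isinstance(item, bytes):
        # Pass bytes intact; quote_plus() percent-encodes the raw bytes.
        return quote_plus(item, safe)
    if not isinstance(item, str):
        # Only non-str, non-bytes objects (ints, etc.) go through str().
        item = str(item)
    return quote_plus(item, safe, encoding, errors)


def encode_pairs(pairs, safe="", encoding=None, errors=None):
    """Hypothetical stand-in for urlencode() on a sequence of 2-tuples."""
    return "&".join(
        _quote_item(k, safe, encoding, errors)
        + "="
        + _quote_item(v, safe, encoding, errors)
        for k, v in pairs
    )


# Point 1: str() on bytes yields the repr, which is then quoted literally.
print(quote_plus(str(b"1234")))                          # b%271234%27
print(encode_pairs([("data", b"\xa0\x24")]))             # data=%A0%24

# Point 2: the same text encoded as utf-8 (default) versus latin-1.
print(encode_pairs([("name", "\xe9")]))                  # name=%C3%A9
print(encode_pairs([("name", "\xe9")], encoding="latin-1"))  # name=%E9
```

The first print shows the behaviour described in point 1; the remaining
three show what passing values through intact and forwarding the
encoding would give.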

I have attached a new version of urlencode() that provides both of the
above enhancements.  In addition, an unused code path in the existing
function has been removed, and some doctests are included.