Message 70855 - Python tracker



Author	janssen
Recipients	gvanrossum, janssen, jimjjewett, loewis, mgiuca, orsenthil, pitrou, thomaspinckney3
Date	2008-08-07.20:49:23
Message-id	<1218142166.03.0.564419825964.issue3300@psf.upfronthosting.co.za>
In-reply-to

Content
Just to reply to Antoine's comments on my patch:

- it would be nice to have more unit tests, especially for the various
bytes/unicode possibilities, and perhaps also roundtripping (Matt's
patch has a lot of tests)

Yes, I completely agree.

- quote_as_bytes() should return a bytes object, not a bytearray

Good point.

- using the "%02X" format looks clearer to me than going through the
_hextable lookup table...

Really?  I see it the other way.
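
For readers comparing the two styles, here is a minimal sketch of both. The names _HEXTABLE, escape_with_format and escape_with_table are hypothetical, and the table contents are an assumption about what a lookup table like _hextable might hold, not the patch's actual code.

```python
# Hypothetical sketch: two ways to build the "%XX" escape for one byte value.
# _HEXTABLE only mirrors the idea of a precomputed lookup table; its layout
# here is assumed, not copied from the patch.
_HEXTABLE = tuple(("%%%02X" % i).encode("ascii") for i in range(256))


def escape_with_format(byte_value):
    """Format the escape on every call ("%%" is a literal percent sign)."""
    return ("%%%02X" % byte_value).encode("ascii")


def escape_with_table(byte_value):
    """Look the escape up in the table built once at import time."""
    return _HEXTABLE[byte_value]
```

Both functions produce the same bytes for every value from 0 to 255, so the choice between them is about readability and per-call cost rather than behaviour.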

- when the argument is of the wrong type, quote_as_bytes() should raise
a TypeError rather than a ValueError

Good point.
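
A rough sketch of what the two agreed changes (bytes return value, TypeError on bad input) could look like together. This is not the patch itself: the _SAFE set below is an assumption (the RFC 3986 unreserved characters), and the real function may well accept a "safe" argument.

```python
# Hypothetical sketch of quote_as_bytes(); not the code from the patch.
# _SAFE is an assumed set: the RFC 3986 unreserved characters.
_SAFE = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def quote_as_bytes(data):
    """Percent-encode bytes and return an immutable bytes object."""
    if not isinstance(data, (bytes, bytearray)):
        # Wrong argument type: TypeError, not ValueError.
        raise TypeError(
            "quote_as_bytes() expects bytes or bytearray, not %s"
            % type(data).__name__
        )
    out = bytearray()  # mutable buffer used only while building the result
    for byte_value in data:
        if byte_value in _SAFE:
            out.append(byte_value)
        else:
            out.extend(("%%%02X" % byte_value).encode("ascii"))
    return bytes(out)  # hand back bytes, not the bytearray
```

The bytearray stays an internal detail; callers always get bytes, and passing a str fails immediately with TypeError.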

- why is quote_as_string() hardwired to utf8 while unquote_as_string()
provides a charset parameter? wouldn't it be better for them to be
consistent with each other?

To encourage the use of UTF-8.  The caller can always explicitly encode
to some other character set, then pass in the bytes that result from
that encoding.  Remember that the RFC for percent-encoding really takes
bytes in, and produces bytes out.  The string-in and string-out versions
are to support naive programming (what a nice way of putting it!).
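
To illustrate the "encode first, then pass in the bytes" route, here is a short sketch. It uses urllib.parse.quote(), quote_from_bytes() and unquote() as they exist in today's Python 3 standard library, which is an assumption of convenience: those are not the quote_as_string() / quote_as_bytes() names from the patch, and their signatures may differ from it.

```python
from urllib.parse import quote, quote_from_bytes, unquote

text = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# String in, string out: the text is encoded as UTF-8 by default.
utf8_quoted = quote(text)  # 'caf%C3%A9'

# Caller picks another character set: encode explicitly, then quote the bytes.
latin1_quoted = quote_from_bytes(text.encode("latin-1"))  # 'caf%E9'

# Decoding needs the same character set that was used for encoding.
roundtrip = unquote(latin1_quoted, encoding="latin-1")
```

The quoting step itself never needs to know about the character set in the second case; it only sees bytes, which matches the bytes-in, bytes-out view of percent-encoding.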

History
Date	User	Action	Args
2008-08-07 20:49:26	janssen	set	recipients: + janssen, gvanrossum, loewis, jimjjewett, orsenthil, pitrou, thomaspinckney3, mgiuca
2008-08-07 20:49:26	janssen	set	messageid: <1218142166.03.0.564419825964.issue3300@psf.upfronthosting.co.za>
2008-08-07 20:49:25	janssen	link	issue3300 messages
2008-08-07 20:49:23	janssen	create