Author janssen
Recipients gvanrossum, janssen, jimjjewett, lemburg, loewis, mgiuca, orsenthil, pitrou, thomaspinckney3
Date 2008-08-12.17:38:38
Message-id <1218562720.33.0.227607266543.issue3300@psf.upfronthosting.co.za>
Content
Larry Masinter is off on vacation, but I did get a brief message saying
that he will dig up similar discussions that he was involved in when he
gets back.

Out of curiosity, I sent a note off to the www-international mailing
list, and received this:

``For the authority (server name) portion of a URI, RFC 3986 is pretty
clear that UTF-8 must be used for non-ASCII values (assuming, for a
moment, that IDNA addresses are not Punycode encoded already). For the
path portion of URIs, a large-ish proportion of them are, indeed, UTF-8
encoded because that has been the de facto standard in Web browsers for
a number of years now. For the query and fragment parts, however, the
encoding is determined by context and often depends on the encoding of
some page that contains the form from which the data is taken. Thus, a
large number of URIs contain non-UTF-8 percent-encoded octets.''

http://lists.w3.org/Archives/Public/www-international/2008JulSep/0041.html
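
To make the distinction in the quoted reply concrete, here is a minimal sketch using the urllib.parse functions as they exist in current Python 3 (an assumption; this is not necessarily the API under discussion in this issue). The host name, path, and query value are made-up examples. It shows a Punycode-encoded host, a path percent-encoded as UTF-8, and a query value percent-encoded as Latin-1, as a form on a Latin-1 page might produce.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Authority: a non-ASCII host name already converted to Punycode (IDNA).
# "bücher.example" is a hypothetical host used only for illustration.
host = "bücher.example".encode("idna").decode("ascii")

# Path: UTF-8 is the de facto convention, and the default for quote().
path = quote("/café")

# Query value: encoding depends on context; here we assume the form
# lived on a Latin-1 page, so the octets are Latin-1, not UTF-8.
query_value = quote("café", encoding="latin-1")

# Decoding the path with the default (UTF-8) round-trips cleanly.
decoded_path = unquote(path)

# Decoding the Latin-1 octets as UTF-8 fails softly: the default
# errors="replace" substitutes U+FFFD for the invalid byte.
wrong_guess = unquote(query_value)

# Supplying the right encoding recovers the text.
right_guess = unquote(query_value, encoding="latin-1")

# When the encoding is unknown, the raw octets can be kept instead.
raw_octets = unquote_to_bytes(query_value)

print(host, path, query_value)
print(repr(decoded_path), repr(wrong_guess), repr(right_guess), raw_octets)
```

The point of the sketch is that the percent-encoded string alone does not say which character encoding produced it; a caller who cannot know the context may be better served by the bytes-returning variant.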
History
Date User Action Args
2008-08-12 17:38:40  janssen  set  recipients: + janssen, lemburg, gvanrossum, loewis, jimjjewett, orsenthil, pitrou, thomaspinckney3, mgiuca
2008-08-12 17:38:40  janssen  set  messageid: <1218562720.33.0.227607266543.issue3300@psf.upfronthosting.co.za>
2008-08-12 17:38:39  janssen  link  issue3300 messages
2008-08-12 17:38:38  janssen  create