
Author gvanrossum
Recipients gvanrossum, janssen, jimjjewett, loewis, mgiuca, orsenthil, pitrou, thomaspinckney3
Date 2008-08-13.17:17:19
Message-id <ca471dc20808131017t2176cdbfwc70439529887feb6@mail.gmail.com>
In-reply-to <1218647121.49.0.344183013642.issue3300@psf.upfronthosting.co.za>
Content
> Bill Janssen <bill.janssen@gmail.com> added the comment:
>
> Erik van der Poel at Google has now chimed in with stats on current URL
> usage:
>
> ``...the bottom line is that escaped non-utf-8 is still quite prevalent,
> enough (in my opinion) to require an implementation in Python, possibly
> even allowing for different encodings in the path and query parts (e.g.
> utf-8 path and gb2312 query).''
>
> http://lists.w3.org/Archives/Public/www-international/2008JulSep/0042.html
>
> I think it's worth remembering that a very large proportion of the use
> of Python's urllib.unquote() is in implementations of Web server
> frameworks of one sort or another.  We can't control what the browsers
> that talk to such frameworks produce; the IETF doesn't control that,
> either.  In this case, "practicality beats purity" is the clarion call
> of the browser designers, and we'd better be able to support them.

I think we're supporting these sufficiently by allowing developers to
override the encoding and errors values. I see no argument here against
having a default encoding of UTF-8.
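
To make the point concrete, here is a minimal sketch of how such an override can look. It assumes the present-day Python 3 function urllib.parse.unquote(string, encoding='utf-8', errors='replace'), which postdates this message; the sample strings and variable names are purely illustrative.

```python
from urllib.parse import quote, unquote

# Default behaviour: percent-escapes are decoded as UTF-8.
utf8_path = quote("café")  # 'caf%C3%A9'
decoded_path = unquote(utf8_path)

# A query escaped in a legacy encoding (gb2312 here) needs an explicit override.
gb_query = quote("中文", encoding="gb2312")
decoded_query = unquote(gb_query, encoding="gb2312")

# With the UTF-8 default, undecodable bytes are replaced with U+FFFD
# rather than raising, because errors defaults to 'replace'.
lossy = unquote(gb_query)

# Passing errors='strict' surfaces the mismatch as an exception instead.
try:
    unquote(gb_query, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

A framework that has to accept a UTF-8 path together with a gb2312 query could, under these assumptions, call unquote separately on each part with the appropriate encoding.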
History
Date                 User        Action  Args
2008-08-13 17:17:20  gvanrossum  set     recipients: + gvanrossum, loewis, jimjjewett, janssen, orsenthil, pitrou, thomaspinckney3, mgiuca
2008-08-13 17:17:20  gvanrossum  link    issue3300 messages
2008-08-13 17:17:19  gvanrossum  create