Author loewis
Recipients Arfrever, ezio.melotti, gregory.p.smith, lemburg, loewis, pitrou, vstinner
Date 2010-05-03.21:30:59
Message-id <4BDF4091.5000702@v.loewis.de>
In-reply-to <4BDF3A5E.6080101@egenix.com>
Content
> Here's one (RFC 3875, sections 4.1.7 and 4.1.5):
> 
> LANG = 'en_US.utf8'
> CONTENT_TYPE = 'application/x-www-form-urlencoded'
> QUERY_STRING = 'type=example&name=Löwis'
> PATH_INFO = '/home/löwis/bin/mycgi.py'
> 
> (HTML uses Latin-1 as default encoding and so do many of the
>  protocols invented for it !)

BTW, I think you are misinterpreting the RFC. It doesn't actually say
that QUERY_STRING is Latin-1 encoded, but instead, it says

"the details of the parsing, reserved characters and support for non
US-ASCII characters depends on the context"

Latin-1 is only given as a possible example. Apache passes the URL from
the HTTP request through as-is, without unescaping it, and browsers will
most likely percent-escape any non-ASCII characters. So most likely, it
will be

QUERY_STRING = 'type=example&name=L%F6wis'
or
QUERY_STRING = 'type=example&name=L%C3%B6wis'

IMO, applications are much better off treating QUERY_STRING as a
character string.
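
To illustrate the point, here is a minimal sketch, assuming Python 3's
urllib.parse and the two escaped forms shown above. The helper name
decode_name is hypothetical, and the choice of encoding is an assumption
the application has to make from context; it is not something CGI
provides. Because the escaped QUERY_STRING is pure ASCII, it can be held
as a character string, and the encoding question only comes up when the
percent escapes are decoded.

```python
from urllib.parse import parse_qs

# The two escaped forms of the same query, as plain ASCII character strings.
latin1_query = 'type=example&name=L%F6wis'
utf8_query = 'type=example&name=L%C3%B6wis'


def decode_name(query, encoding):
    """Hypothetical helper: return the 'name' field, decoding the
    percent escapes with the encoding the application assumes."""
    fields = parse_qs(query, encoding=encoding, errors='strict')
    return fields['name'][0]


# Each form decodes to the same text once the matching encoding is chosen.
name_from_latin1 = decode_name(latin1_query, 'latin-1')
name_from_utf8 = decode_name(utf8_query, 'utf-8')

# Picking the wrong encoding fails loudly with errors='strict'.
try:
    decode_name(latin1_query, 'utf-8')
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True
```

With errors='strict', a wrong guess raises UnicodeDecodeError instead of
silently producing replacement characters, which makes a fallback from
UTF-8 to Latin-1 straightforward if an application wants one.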
History
Date                 User    Action  Args
2010-05-03 21:31:02  loewis  set     recipients: + loewis, lemburg, gregory.p.smith, pitrou, vstinner, ezio.melotti, Arfrever
2010-05-03 21:30:59  loewis  link    issue8603 messages
2010-05-03 21:30:59  loewis  create