Author mgiuca
Recipients gvanrossum, janssen, jimjjewett, lemburg, loewis, mgiuca, orsenthil, pitrou, thomaspinckney3
Date 2008-08-10.11:25:34
Message-id <1218367586.43.0.507851434689.issue3300@psf.upfronthosting.co.za>
In-reply-to
Content
> Invalid user input? What if the query string comes from filling
> a form?
> For example if I search the word "numéro" in a latin1 Web site,
> I get the following URL:
> http://www.le-tigre.net/spip.php?page=recherche&recherche=num%E9ro

Yes, that is a concern. I suppose the idea should be that as the
programmer _you_ write the website, so you make it UTF-8 and you use our
defaults. Or you make it Latin-1, and you override our defaults (which
is tricky if you use cgi.FieldStorage, for example).
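
To make the "override our defaults" case concrete, here is a rough sketch using urllib.parse.parse_qs as it exists in present-day Python 3, which may not match the patch under discussion exactly. The query string is the one from the URL quoted above; the variable names are just for illustration.

```python
from urllib.parse import parse_qs

# Query string taken from the Latin-1 site quoted above.
query = "page=recherche&recherche=num%E9ro"

# With the defaults (encoding='utf-8', errors='replace'), the lone
# byte 0xE9 is not valid UTF-8, so it becomes U+FFFD.
default_fields = parse_qs(query)

# A programmer who knows the site is Latin-1 overrides the encoding.
latin1_fields = parse_qs(query, encoding="latin-1")

print(default_fields["recherche"])
print(latin1_fields["recherche"])
```

The override is easy when you call parse_qs yourself; the difficulty mentioned above is when a higher-level wrapper makes the call for you.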

But anyway, how do you propose to handle that (other than the programmer
setting the correct default)? With errors='replace', the above query
will result in "num�ro", but with errors='strict', it will result in a
UnicodeDecodeError (which you could handle, if you remembered). As a
programmer I don't really want to handle that error every time I use
unquote or anything that calls unquote. I'd rather accept the
possibility of '�'s in my input.
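
For reference, a small sketch of the two behaviours being compared, using urllib.parse.unquote from present-day Python 3 (the signature in the patch under review may differ). The names lenient and strict_failed are only illustrative.

```python
from urllib.parse import unquote

quoted = "num%E9ro"  # Latin-1 percent-encoding of the search term

# errors='replace': the invalid UTF-8 byte turns into U+FFFD and
# the call always returns a string.
lenient = unquote(quoted, errors="replace")

# errors='strict': the same input raises, so every caller of
# unquote (direct or indirect) has to be ready to catch this.
try:
    unquote(quoted, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(lenient), strict_failed)
```

With 'replace' the bad byte shows up as data; with 'strict' it shows up as control flow that each call site has to handle.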

I'm not going to dig in my heels though, this time :) I just want to
make sure the consequences of this decision are known before we commit.
History
Date                 User    Action  Args
2008-08-10 11:26:26  mgiuca  set     recipients: + mgiuca, lemburg, gvanrossum, loewis, jimjjewett, janssen, orsenthil, pitrou, thomaspinckney3
2008-08-10 11:26:26  mgiuca  set     messageid: <1218367586.43.0.507851434689.issue3300@psf.upfronthosting.co.za>
2008-08-10 11:25:34  mgiuca  link    issue3300 messages
2008-08-10 11:25:34  mgiuca  create