Author mgiuca
Recipients BreamoreBoy, adamnelson, ajaksu2, collinwinter, ezio.melotti, haypo, mastrodomenico, merwok, mgiuca, nagle, orsenthil, pitrou, vak, varmaa
Date 2010-07-22.02:18:53
Message-id <1279765135.91.0.81278225784.issue1712522@psf.upfronthosting.co.za>
Content
If you're going the way of option 2, I would strongly advise against relying on the KeyError. The fact that urllib.quote raises a KeyError is not part of its specification; it's a bug/quirk in the implementation (which is now unlikely to be changed, but it's still unsafe to rely on).

Robotparser should encode the string, if and only if it is a unicode string, with ('ascii', 'strict'), catch the UnicodeEncodeError, and raise the TypeError you suggested. This will have precisely the same behaviour as your proposed option 2 (will work fine for byte strings and Unicode strings with ASCII-only characters, but raise a TypeError on Unicode strings with non-ASCII characters) without relying on the KeyError from urllib.quote.
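
The approach described above can be sketched roughly as follows. This is only an illustration, not the actual robotparser patch: the helper name quote_ascii_only is hypothetical, and the sketch is written for Python 3, where the Unicode string type is str and quote lives in urllib.parse. On Python 2 the check would be against unicode and the call would be urllib.quote.

```python
from urllib.parse import quote


def quote_ascii_only(url):
    """Hypothetical helper: quote a URL, rejecting non-ASCII Unicode input.

    Byte strings are passed through to quote() untouched. Unicode strings
    are first encoded with ('ascii', 'strict'); a failure there is reported
    as a TypeError instead of relying on whatever quote() happens to raise.
    """
    if isinstance(url, str):  # assumption: Python 3; use `unicode` on Python 2
        try:
            url = url.encode('ascii', 'strict')
        except UnicodeEncodeError:
            raise TypeError(
                "URL must be a byte string or an ASCII-only Unicode string")
    return quote(url)
```

With this in place, byte strings and ASCII-only Unicode strings are quoted as usual, while a Unicode string containing non-ASCII characters raises TypeError, and nothing depends on the KeyError quirk.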