Message 111147 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	mgiuca
Recipients	BreamoreBoy, adamnelson, ajaksu2, collinwinter, eric.araujo, ezio.melotti, mastrodomenico, mgiuca, nagle, orsenthil, pitrou, vak, varmaa, vstinner
Date	2010-07-22.02:18:53
SpamBayes Score	0.0076496713
Marked as misclassified	No
Message-id	<1279765135.91.0.81278225784.issue1712522@psf.upfronthosting.co.za>
In-reply-to

Content
If you're going the way of option 2, I would strongly advise against relying on the KeyError. The fact that a KeyError is raised by urllib.quote is not part of it's specification, it's a bug/quirk in the implementation (which is now unlikely to be change, but it's unsafe to rely on it). Robotparser should encode the string, if and only if it is a unicode string, with ('ascii', 'strict'), catch the UnicodeEncodeError, and raise the TypeError you suggested. This will have precisely the same behaviour as your proposed option 2 (will work fine for byte strings and Unicode strings with ASCII-only characters, but raise a TypeError on Unicode strings with non-ASCII characters) without relying on the KeyError from urllib.quote.

If you're going the way of option 2, I would strongly advise against relying on the KeyError. The fact that a KeyError is raised by urllib.quote is not part of it's specification, it's a bug/quirk in the implementation (which is now unlikely to be change, but it's unsafe to rely on it).

Robotparser should encode the string, if and only if it is a unicode string, with ('ascii', 'strict'), catch the UnicodeEncodeError, and raise the TypeError you suggested. This will have precisely the same behaviour as your proposed option 2 (will work fine for byte strings and Unicode strings with ASCII-only characters, but raise a TypeError on Unicode strings with non-ASCII characters) without relying on the KeyError from urllib.quote.

History
Date	User	Action	Args
2010-07-22 02:18:56	mgiuca	set	recipients: + mgiuca, collinwinter, varmaa, nagle, orsenthil, pitrou, vstinner, ajaksu2, ezio.melotti, eric.araujo, mastrodomenico, vak, adamnelson, BreamoreBoy
2010-07-22 02:18:55	mgiuca	set	messageid: <1279765135.91.0.81278225784.issue1712522@psf.upfronthosting.co.za>
2010-07-22 02:18:54	mgiuca	link	issue1712522 messages
2010-07-22 02:18:54	mgiuca	create