Message 240230
All hell breaks loose when unicode is passed as the second argument to urllib.quote in Python 2:
>>> import urllib
>>> urllib.quote('\xce\x91', u'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 1292, in quote
    if not s.rstrip(safe):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)
On its own this wouldn't be that bad: just another bit of Python 2 unicode wonkiness. However, the quote function caches its quoters keyed on the second parameter, and u'' == ''. So an unrelated earlier call to quote, from an entirely different place in the application, can break your code. Compare a fresh interpreter with one where quote was first called with u'':
$ python2
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.quote('\xce\x91', '')
'%CE%91'
>>>
$ python2
Python 2.7.9 (default, Dec 11 2014, 04:42:00)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.quote('a', u'')
'a'
>>> urllib.quote('\xce\x91', '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/urllib.py", line 1292, in quote
    if not s.rstrip(safe):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)
Good luck debugging that.
So, one of two things needs to happen:
- a TypeError when unicode is passed as the second parameter, or
- a cast of the second parameter to str
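The two options above could look roughly like the following caller-side wrappers. This is only a sketch: `quote_strict` and `quote_coerced` are hypothetical names, not part of urllib, and the version check is there so the snippet also imports on Python 3, where the problem does not arise and both wrappers simply delegate.

```python
import sys

try:
    from urllib import quote  # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

PY2 = sys.version_info[0] == 2


def quote_strict(s, safe='/'):
    """Option 1 (hypothetical): raise TypeError for a unicode `safe` on Python 2."""
    # On Python 2, `bytes` is an alias for `str`, so this rejects unicode.
    if PY2 and not isinstance(safe, bytes):
        raise TypeError("safe must be a byte string, got %r" % type(safe))
    return quote(s, safe)


def quote_coerced(s, safe='/'):
    """Option 2 (hypothetical): cast `safe` to str before quoting on Python 2."""
    if PY2:
        # str() on a non-ASCII unicode value raises UnicodeEncodeError here.
        safe = str(safe)
    return quote(s, safe)


print(quote_strict('a b', ''))
print(quote_coerced('/', ''))
```

Wrappers like these only protect code that goes through them. Any other caller passing u'' directly to urllib.quote can still poison the shared cache, which is why the check or cast belongs inside quote itself.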
Date | User | Action | Args
2015-04-07 21:10:15 | koriakin | set | recipients: + koriakin
2015-04-07 21:10:15 | koriakin | set | messageid: <1428441015.17.0.142718858432.issue23885@psf.upfronthosting.co.za>
2015-04-07 21:10:15 | koriakin | link | issue23885 messages
2015-04-07 21:10:15 | koriakin | create |