Message132812
Mixing byte and unicode strings should always be avoided, because the implicit coercion to unicode works only if the byte string contains only ASCII characters, and fails otherwise.
Several modules -- including shutil, glob, and os.path -- have APIs that work with both byte and unicode strings, but fail when you mix the two:
>>> os.path.join('א', 'א') # both byte strings -- works
'\xd7\x90/\xd7\x90'
>>> os.path.join(u'א', u'א') # both unicode -- works
u'\u05d0/\u05d0'
>>> os.path.join('a', u'א') # mixed, ASCII-only byte string -- works
u'a/\u05d0'
>>> os.path.join(u'א', 'א') # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 1: ordinal not in range(128)
>>> os.path.join('א', u'א') # mixed, non-ASCII -- fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)
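
One way to avoid the failures above is to decode every byte string explicitly before joining, so the implicit ASCII coercion never happens. The following is only a sketch: the helper names to_text and join_text are hypothetical (they are not part of os.path), and it assumes the byte strings are UTF-8 encoded, as they are in the session above. It uses posixpath so the separator is always '/'.

```python
import posixpath


def to_text(value, encoding="utf-8"):
    """Return value as a unicode string, decoding byte strings explicitly.

    The default encoding is an assumption; use whatever encoding the
    byte strings were actually produced with.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_text(*parts, **options):
    """Join path components after converting all of them to unicode.

    Hypothetical helper: accepts an optional 'encoding' keyword argument.
    """
    encoding = options.get("encoding", "utf-8")
    return posixpath.join(*[to_text(part, encoding) for part in parts])


# Mixed, non-ASCII input no longer fails, because nothing is coerced implicitly.
mixed = join_text(b"\xd7\x90", u"\u05d0")  # equals u'\u05d0/\u05d0'
```

Decoding at the boundary like this keeps the rest of the code working with a single string type, which is the point of the advice above.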
Date | User | Action | Args
2011-04-02 20:34:01 | ezio.melotti | set | recipients: + ezio.melotti, Adam.Matan
2011-04-02 20:34:01 | ezio.melotti | set | messageid: <1301776441.04.0.155816606179.issue11741@psf.upfronthosting.co.za>
2011-04-02 20:34:00 | ezio.melotti | link | issue11741 messages
2011-04-02 20:34:00 | ezio.melotti | create |