On Linux/ext3, filenames are stored natively as sequences of octets. On
Win32/NTFS, they are stored natively as sequences of Unicode code points.
In Python 2.x, the way to unambiguously open a particular file was to
pass the filename as a str object on Linux/ext3 and as a unicode object
on Win32/NTFS. os.listdir(".") would return every filename as a str
object, and os.listdir(u".") would return every filename as a unicode
object---based on the current locale settings---except for filenames
that couldn't be decoded that way.
Consider this bash script (executed on Linux under a UTF-8 locale):
export LC_CTYPE=en_CA.UTF-8 # requires the en_CA.UTF-8 locale to be built
mkdir /tmp/foo
cd /tmp/foo
touch $'UTF-8 compatible filename\xc2\xa2'
touch $'UTF-8 incompatible filename\xc0'
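The same two files can also be created from Python by passing the names as bytes. This is only a sketch: it assumes a current Python 3 release on Linux (not the 3.0b3 build discussed here) and a filesystem that accepts arbitrary octets in names. The helper name make_test_files is hypothetical.

```python
import os
import tempfile


def make_test_files(directory):
    """Create the two files from the bash script inside `directory`.

    `directory` must be a bytes path so that it can be joined with the
    bytes filenames below. (Hypothetical helper, for illustration only.)
    """
    names = [
        b"UTF-8 compatible filename\xc2\xa2",  # 0xC2 0xA2 is valid UTF-8 (U+00A2)
        b"UTF-8 incompatible filename\xc0",    # a lone 0xC0 is not valid UTF-8
    ]
    for name in names:
        # Create an empty file, like `touch` does for a missing file.
        with open(os.path.join(directory, name), "wb"):
            pass
    return names


if __name__ == "__main__":
    # Use a fresh temporary directory instead of /tmp/foo.
    workdir = os.fsencode(tempfile.mkdtemp())
    print(make_test_files(workdir))
```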
Under Python 2.5.2, you get this:
>>> import os
>>> os.listdir(u".")
['UTF-8 incompatible filename\xc0', u'UTF-8 compatible filename\xa2']
>>> os.listdir(".")
['UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename\xc2\xa2']
>>> [open(f, "r") for f in os.listdir(u".")]
[<open file 'UTF-8 incompatible filename\xc0', mode 'r' at 0xb7cee578>,
<open file 'UTF-8 compatible filename¢', mode 'r' at 0xb7cee6e0>]
Under Python 3.0b3, you get this:
>>> import os
>>> os.listdir(".")
[b'UTF-8 incompatible filename\xc0', 'UTF-8 compatible filename¢']
>>> os.listdir(b".")
[b'UTF-8 incompatible filename\xc0', b'UTF-8 compatible filename\xc2\xa2']
>>> [open(f, "r") for f in os.listdir(".")]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 284, in __new__
    return open(*args, **kwargs)
  File "/home/dwon/python3.0b3/lib/python3.0/io.py", line 184, in open
    raise TypeError("invalid file: %r" % file)
TypeError: invalid file: b'UTF-8 incompatible filename\xc0'
os.listdir(".") returns undecodable names as bytes, but open() rejects
bytes filenames. This makes it impossible to write code that opens
arbitrarily-named files on Linux/ext3.
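For comparison, here is a minimal sketch of the byte-oriented approach this report is asking for. It assumes a current Python 3 release on Linux, where os.listdir() returns bytes when given a bytes path and open() accepts bytes filenames. It does not run on 3.0b3, as the traceback above shows. The function name open_all is hypothetical.

```python
import os
import tempfile


def open_all(directory=b"."):
    """Open every regular file in `directory`, treating names as raw bytes.

    Passing a bytes path to os.listdir() yields bytes names, so no
    decoding step can fail on names that are not valid in the locale.
    (Hypothetical helper, for illustration only.)
    """
    handles = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)  # bytes + bytes
        if os.path.isfile(path):
            handles.append(open(path, "rb"))
    return handles


if __name__ == "__main__":
    # Build a scratch directory with one decodable and one undecodable name.
    workdir = os.fsencode(tempfile.mkdtemp())
    for name in (b"compatible\xc2\xa2", b"incompatible\xc0"):
        with open(os.path.join(workdir, name), "wb"):
            pass
    for handle in open_all(workdir):
        print(repr(handle.name))
        handle.close()
```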