Title: os.getcwd() should raise UnicodeDecodeError for arbitrary bytes
Created on 2010-01-13 22:35 by flox, last changed 2010-01-13 22:46 by pjenvey.

Messages (3)
Author: Florent Xicluna (flox) Date: 2010-01-13 22:35
When the current working directory is not decodable, the os.getcwd() function should raise an error.

>>> import os, sys
>>> sys.getfilesystemencoding()
>>> cwd=b'/tmp/\xe7'
>>> os.mkdir(cwd); os.chdir(cwd)
>>> os.getcwdb()
>>> os.getcwd()  # Should raise UnicodeDecodeError

Python 2 raises the error.
Author: Florent Xicluna (flox) Date: 2010-01-13 22:44
Actually, it is the documented behaviour.

>>> b'\xe7'.decode('utf-8', 'surrogateescape')
Author: Philip Jenvey (pjenvey) Date: 2010-01-13 22:46
Right, this is an intentional change in behavior in Python 3.1: non-decodable bytes are now decoded to utf8b (via the surrogateescape error handler). Furthermore, the unicode string returned from getcwd can be passed around to other fs functions; on POSIX they simply encode it back to the original bytes via surrogateescape.

See PEP 383
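
For illustration, the round trip described above can be sketched with the codec calls alone, without touching the filesystem. This is a minimal sketch that assumes UTF-8 as the encoding; the variable names (raw, text, strict_error) are made up for the example and do not come from the report.

```python
# Bytes from the report: 0xE7 on its own is not valid UTF-8.
raw = b'/tmp/\xe7'

# With surrogateescape, the undecodable byte 0xE7 becomes the lone
# surrogate U+DCE7 instead of raising an error.
text = raw.decode('utf-8', 'surrogateescape')
print(ascii(text))  # '/tmp/\udce7'

# Encoding with the same error handler restores the original bytes,
# which is why the str can be handed back to other fs functions.
assert text.encode('utf-8', 'surrogateescape') == raw

# A strict decode is what raises UnicodeDecodeError.
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    strict_error = exc
else:
    strict_error = None
print(type(strict_error).__name__)  # UnicodeDecodeError
```

Note that the resulting str contains a lone surrogate, so encoding it strictly (for example when printing to a UTF-8 terminal) will fail with UnicodeEncodeError; os.getcwdb() remains available when the raw bytes are needed.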
