classification
Title: UnicodeDecodeError on OSError on Windows with undecodable (bytes) filename
Type: behavior Stage:
Components: Unicode, Windows Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, flox, ishimoto, loewis, python-dev, sbt, serhiy.storchaka, skrah, tim.golden, vstinner
Priority: normal Keywords: patch

Created on 2012-07-28 12:49 by vstinner, last changed 2012-11-13 21:17 by vstinner. This issue is now closed.

Files
File name Uploaded
oserror_filename.patch vstinner, 2012-08-03 12:19
oserror_filename_windows.patch vstinner, 2012-10-30 01:42
Messages (19)
msg166652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-07-28 12:49
On Windows, if an OS function fails with a bytes filename that cannot be decoded, Python raises a UnicodeDecodeError instead of an OSError. The problem is that Python decodes the filename to fill the OSError.filename field. See issue #15441 for the initial report.

There are different options to solve this issue:
 - always keep the filename parameter unchanged, so OSError.filename can be a str or a bytes string, depending on the input parameter
 - try to decode the filename from the filesystem encoding, or keep the filename unchanged: OSError.filename is only a bytes string if the filename cannot be decoded
 - don't fill OSError.filename (= None) if the filename cannot be decoded
 - use "surrogateescape", "replace" or "backslashreplace" error handler to decode the filename
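
The decoding options listed above can be compared on a small example. This is only a sketch: the filename `raw` and the helper `decode_or_keep` are hypothetical, UTF-8 stands in for whatever code page the filesystem encoding uses on Windows, and decoding with "backslashreplace" needs Python 3.5 or newer.

```python
# Hypothetical filename containing a byte (0xff) that is invalid in UTF-8.
# UTF-8 is an assumption standing in for the real filesystem encoding.
raw = b"caf\xff.txt"


def decode_or_keep(name, encoding="utf-8"):
    """Second option: decode when possible, otherwise keep the bytes unchanged."""
    try:
        return name.decode(encoding)
    except UnicodeDecodeError:
        return name


# Strict decoding is the step that raises UnicodeDecodeError.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Last option: decode with a lenient error handler.
escaped = raw.decode("utf-8", "surrogateescape")
replaced = raw.decode("utf-8", "replace")
backslashed = raw.decode("utf-8", "backslashreplace")

# surrogateescape is reversible; "replace" discards the original byte.
round_trip = escaped.encode("utf-8", "surrogateescape")

print(strict_failed, ascii(escaped), ascii(replaced), ascii(backslashed))
print(round_trip == raw, type(decode_or_keep(raw)).__name__)
```

Note that `decode_or_keep` returns str for most names and bytes only for the rare undecodable one, which is the mixed behaviour discussed below.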

This issue is specific to Windows: on other platforms, the filename is decoded using the "surrogateescape" error handler, so decoding the filename cannot fail.

I don't know whether OSError.filename is only used to display more information to the user, or whether it is also used to perform another operation on the file (e.g. os.chmod).

I like solutions keeping the filename unchanged, because they do not lose information, and the user can decide how to handle the undecodable filename.
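
If the filename is kept unchanged, the caller decides how to treat it. A minimal sketch, assuming a hypothetical helper `filename_for_display` and an OSError built by hand (UTF-8 again stands in for the real filesystem encoding, and decoding with "backslashreplace" needs Python 3.5 or newer):

```python
import errno
import os


def filename_for_display(exc, encoding="utf-8"):
    """Return exc.filename as text for a message (hypothetical helper)."""
    name = exc.filename
    if isinstance(name, bytes):
        # Lossy, display-only conversion; the bytes on the exception are untouched.
        return name.decode(encoding, "backslashreplace")
    return name


# Build the exception by hand so the sketch does not touch the filesystem.
exc = OSError(errno.ENOENT, os.strerror(errno.ENOENT), b"caf\xff.txt")

shown = filename_for_display(exc)   # text that can be shown to the user
reusable = exc.filename             # original bytes, usable for another os call
print(shown, type(reusable).__name__)
```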

I don't like the option of trying to decode the filename and keeping it unchanged if decoding fails, because applications will work in most cases, but "crash" when someone comes with an unusual code page, a special USB key, or a filename with a non-ASCII character.

So the best option is maybe to always keep the bytes filename unchanged.

Such a change can no longer be made in Python 3.3; it is too late to test it properly.
msg166654 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-07-28 12:53
In Python 2, it looks like open(arg) does pass its filename argument unchanged to OSError constructor (so it can be bytes or unicode). OSError.filename is always bytes for os.chdir() on UNIX, but OSError.filename can be bytes or unicode for os.chdir() on Windows.
msg166777 - (view) Author: Atsuo Ishimoto (ishimoto) * Date: 2012-07-29 15:32
+1 for keeping the file name unchanged. This solution is not very
compatible with prior versions, but it is simple and the least surprising.

I would prefer platforms other than Windows to use the same method to build OSError.
msg167314 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-08-03 12:19
Attached patch modifies all functions of the os module taking filenames to keep the filename unmodified in OSError.filename.

The patch also changes os.link(), os.rename() and os.replace() to use the source, not the destination, in the error message. It may be a mistake, because these functions can also fail if the directory of the destination does not exist.
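
The intended behaviour can be checked with a short script. This is a sketch rather than part of the patch: the helper `failing_filename` and the file names are made up, and it assumes an interpreter that already includes the change.

```python
import os
import tempfile


def failing_filename(func, *args):
    """Call func(*args), expect an OSError, and return the exception's filename."""
    try:
        func(*args)
    except OSError as exc:
        return exc.filename
    raise AssertionError("expected OSError")


with tempfile.TemporaryDirectory() as tmp:
    # Made-up names inside a fresh directory, so none of them exist.
    str_path = os.path.join(tmp, "missing.txt")
    bytes_path = os.fsencode(str_path)
    src = os.path.join(tmp, "missing_src")
    dst = os.path.join(tmp, "dst")

    # The filename keeps the type and value that was passed in.
    from_str = failing_filename(os.stat, str_path)
    from_bytes = failing_filename(os.stat, bytes_path)

    # For two-path functions, the source is reported.
    from_rename = failing_filename(os.rename, src, dst)

print(type(from_str).__name__, type(from_bytes).__name__, from_rename == src)
```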
msg174172 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-30 01:18
New changeset 67d69f943b7f by Victor Stinner in branch 'default':
Issue #15478: Raising an OSError doesn't decode or encode the filename anymore
http://hg.python.org/cpython/rev/67d69f943b7f
msg174175 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-30 01:27
New changeset 27a3b19ee792 by Victor Stinner in branch 'default':
Issue #15478: Fix compilation on Windows
http://hg.python.org/cpython/rev/27a3b19ee792
msg174178 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-30 01:42
The commit is incomplete: some remaining functions still need to be patched. Here is a new (untested) patch for more Windows functions.
msg174192 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-30 09:56
See also issue16074.

> The patch also changes os.link(), os.rename() and os.replace() to use the source, not the destination, in the error message. It may be a mistake, because these functions can also fail if the directory of the destination does not exist.

Yes, in different cases it can be the source, the destination, both, unknown or none of them.
msg174245 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-31 00:07
New changeset 01cc9fb52887 by Victor Stinner in branch 'default':
Issue #15478: Fix test_os on Windows (os.chown is missing)
http://hg.python.org/cpython/rev/01cc9fb52887
msg174248 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-31 00:13
New changeset ef87bd0797de by Victor Stinner in branch 'default':
Issue #15478: Fix test_os on FreeBSD
http://hg.python.org/cpython/rev/ef87bd0797de
msg174373 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-31 21:25
New changeset 13ebaa36d87d by Victor Stinner in branch 'default':
Issue #15478: Use path_error() in more posix functions, especially in Windows
http://hg.python.org/cpython/rev/13ebaa36d87d

New changeset 9f696742dbda by Victor Stinner in branch 'default':
Issue #15478: Fix again to fix test_os on Windows
http://hg.python.org/cpython/rev/9f696742dbda
msg174375 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-31 21:48
New changeset 6903f5214e99 by Victor Stinner in branch 'default':
Issue #15478: Use source filename in OSError, not destination filename
http://hg.python.org/cpython/rev/6903f5214e99
msg174377 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-31 21:54
All issues should now be fixed.
msg174378 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-10-31 22:01
New changeset b3434c1ae503 by Victor Stinner in branch 'default':
Issue #15441, #15478: Reenable test_nonascii_abspath() on Windows
http://hg.python.org/cpython/rev/b3434c1ae503
msg174536 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-02 15:52
One of 13ebaa36d87d, 9f696742dbda or 6903f5214e99 causes test failures in test_pep277:


======================================================================
FAIL: test_failures (test.test_pep277.UnicodeFileTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pep277.py", line 120, in test_failures
    self._apply_failure(os.listdir, name)
  File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pep277.py", line 105, in _apply_failure
    self.assertEqual(wildcard, '*.*')
AssertionError: '7_\u05d4\u05e9\u05e7\u05e6\u05e5\u05e1' != '*.*'
- 7_\u05d4\u05e9\u05e7\u05e6\u05e5\u05e1
+ *.*
msg174537 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-02 16:05
Additionally, some of the changes cause a failure in test_subprocess:


======================================================================
ERROR: test_no_leaking (test.test_subprocess.ProcessTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\stefan\pydev\cpython\lib\test\test_subprocess.py", line 823, in test_no_leaking
    handles.append(os.open(tmpfile, os.O_WRONLY|os.O_CREAT))
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'c:\\users\\stefan\\appdata\\local\\temp\\tmpa41o4x\\@test_2236_tmp'

----------------------------------------------------------------------
msg174850 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-05 00:21
New changeset 817a90752470 by Victor Stinner in branch 'default':
Issue #15478: Oops, fix regression in os.open() on Windows
http://hg.python.org/cpython/rev/817a90752470
msg174852 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-05 00:28
New changeset 11ea4eb79e9d by Victor Stinner in branch 'default':
Issue #15478: Fix test_pep277 on Windows
http://hg.python.org/cpython/rev/11ea4eb79e9d
msg175490 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-13 08:31
New changeset ee7b713fec71 by Victor Stinner in branch 'default':
Issue #15478: os.lchflags() is not always available when os.chflags() is available
http://hg.python.org/cpython/rev/ee7b713fec71
History
Date User Action Args
2012-11-13 21:17:42 vstinner set status: open -> closed; resolution: fixed
2012-11-13 08:31:27 python-dev set messages: + msg175490
2012-11-05 00:28:17 python-dev set messages: + msg174852
2012-11-05 00:21:14 python-dev set messages: + msg174850
2012-11-02 16:05:59 skrah set nosy: + sbt; messages: + msg174537
2012-11-02 15:52:09 skrah set status: closed -> open; nosy: + skrah; messages: + msg174536; resolution: fixed -> (no value)
2012-10-31 22:01:33 python-dev set messages: + msg174378
2012-10-31 21:54:09 vstinner set status: open -> closed; resolution: fixed; messages: + msg174377
2012-10-31 21:48:02 python-dev set messages: + msg174375
2012-10-31 21:25:37 python-dev set messages: + msg174373
2012-10-31 00:13:03 python-dev set messages: + msg174248
2012-10-31 00:07:07 python-dev set messages: + msg174245
2012-10-30 09:56:09 serhiy.storchaka set type: behavior; messages: + msg174192; nosy: + serhiy.storchaka
2012-10-30 01:42:02 vstinner set files: + oserror_filename_windows.patch; messages: + msg174178
2012-10-30 01:27:13 python-dev set messages: + msg174175
2012-10-30 01:18:55 python-dev set nosy: + python-dev; messages: + msg174172
2012-08-03 12:19:58 vstinner set files: + oserror_filename.patch; keywords: + patch; messages: + msg167314
2012-07-29 15:32:08 ishimoto set messages: + msg166777
2012-07-28 12:53:42 vstinner set messages: + msg166654
2012-07-28 12:52:06 vstinner set title: UnicodeDecodeError on OSError -> UnicodeDecodeError on OSError on Windows with undecodable (bytes) filename
2012-07-28 12:49:04 vstinner create