classification
Title: OSError [WinError 123] when testing if pathlib.Path('*') (asterisks) exists
Type:
Stage: patch review
Components: Library (Lib), Windows
Versions: Python 3.8, Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, jimbo1qaz_, paul.moore, pitrou, serhiy.storchaka, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords: patch

Created on 2018-11-24 05:55 by jimbo1qaz_, last changed 2019-01-16 01:14 by eryksun.

Pull Requests
URL Status Linked Edit
PR 11133 open v2m, 2018-12-12 20:15
Messages (6)
msg330371 - Author: jimbo1qaz_ via Gmail (jimbo1qaz_) Date: 2018-11-24 05:55
I'm writing a program taking paths from user input through CLI.

`path` is a pathlib.Path().

Since the Windows shell doesn't expand asterisks, I check whether the path exists. If it doesn't, I expand it using Path().glob(path).

Unfortunately on Windows, if `path` (type: Path) contains asterisks, checking `path.exists()` or `path.is_dir()` raises WinError 123.

Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> Path('*').exists()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jimbo1qaz\Miniconda3\envs\python37\lib\pathlib.py", line 1318, in exists
    self.stat()
  File "C:\Users\jimbo1qaz\Miniconda3\envs\python37\lib\pathlib.py", line 1140, in stat
    return self._accessor.stat(self)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: '*'
>>> Path('*').is_dir()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jimbo1qaz\Miniconda3\envs\python37\lib\pathlib.py", line 1330, in is_dir
    return S_ISDIR(self.stat().st_mode)
  File "C:\Users\jimbo1qaz\Miniconda3\envs\python37\lib\pathlib.py", line 1140, in stat
    return self._accessor.stat(self)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: '*'

I also reproduced on Miniconda 3.6.6, 3.7.0, and official Python 3.7.1.

According to https://bugs.python.org/issue29827 , os.path.exists() (not Path.exists() ) returns False on any OSError.
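
A minimal workaround sketch for the CLI use case above, assuming it is acceptable to treat any OSError or ValueError as "does not exist" (which is what os.path.exists does for OSError). The helper names safe_exists and expand_argument are hypothetical, and glob.glob is an assumption swapped in for Path().glob() so that absolute patterns also work:

```python
import glob
from pathlib import Path


def safe_exists(path):
    """Return False instead of raising when the OS rejects the path."""
    try:
        return Path(path).exists()
    except (OSError, ValueError):
        # e.g. WinError 123 for '*' on Windows, or an embedded null byte
        return False


def expand_argument(arg):
    """Return [arg] if it names an existing path, else its glob matches."""
    if safe_exists(arg):
        return [Path(arg)]
    return sorted(Path(p) for p in glob.glob(str(arg)))
```

Checking existence first means a file that is literally named like a pattern (possible on Unix) is still passed through unchanged.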

-----------------

On Linux, checking paths with null bytes (a less common occurrence) raises a different error:

>>> import pathlib
>>> pathlib.Path("\x00").exists()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/pathlib.py", line 1336, in exists
    self.stat()
  File "/usr/lib/python3.6/pathlib.py", line 1158, in stat
    return self._accessor.stat(self)
  File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
ValueError: embedded null byte
msg330394 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2018-11-25 20:32
Path.exists should ignore all OSError exceptions, as os.path.exists does. Barring that, I suppose for Windows we could add EINVAL to pathlib's _IGNORED_ERROS.

The problem with ValueError was already addressed in issue 33721.
msg333092 - Author: jimbo1qaz_ via Gmail (jimbo1qaz_) Date: 2019-01-06 08:10
Should Path.resolve() also avoid raising OSError?

Path('*').resolve()

Traceback (most recent call last):
...truncated
  File "<ipython-input-5-4fa2fec5c8b3>", line 1, in <module>
    Path('*').resolve()
  File "C:\Users\jimbo1qaz\AppData\Local\Programs\Python\Python37\lib\pathlib.py", line 1134, in resolve
    s = self._flavour.resolve(self, strict=strict)
  File "C:\Users\jimbo1qaz\AppData\Local\Programs\Python\Python37\lib\pathlib.py", line 192, in resolve
    s = self._ext_to_normal(_getfinalpathname(s))
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: '*'


os.path.realpath('"*')
Out[8]: 'C:\\Users\\jimbo1qaz\\Dropbox\\encrypted\\code\\corrscope\\"*'
os.path.abspath('*"')
Out[13]: 'C:\\Users\\jimbo1qaz\\Dropbox\\encrypted\\code\\corrscope\\*"'

(sidenote: what os.path operation does Path.resolve() match? Path('nonexistent').resolve() returns a relative path on Python 3.7.1, whereas Path().resolve() returns an absolute path.)
msg333728 - Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-01-15 18:40
Pathlib doesn't necessarily directly follow os on its error handling - adding Antoine for comment.

Passing strict=False to resolve() should be able to handle an invalid name like that. If not, I propose that we change it so that it does.
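
Until resolve() handles such names itself, a caller-side fallback might look like the following sketch. resolve_lenient is a hypothetical helper, and falling back to os.path.abspath (a purely textual operation) is an assumption about what result the caller wants:

```python
import os
from pathlib import Path


def resolve_lenient(path):
    """Resolve a path, falling back to a textual absolute path on OSError."""
    path = Path(path)
    try:
        resolved = path.resolve(strict=False)
    except OSError:
        # e.g. WinError 123 for names such as '*' on Windows
        resolved = None
    if resolved is None or not resolved.is_absolute():
        # Also covers the relative result reported above for
        # nonexistent paths on Windows in Python 3.7.
        return Path(os.path.abspath(path))
    return resolved
```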
msg333730 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-01-15 19:02
I'm fine with swallowing the error in both exists() and resolve(). We should be careful not to swallow errors too broadly, though. The code paths should be audited to check that EINVAL can't mean something else.
msg333746 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-01-16 01:14
> (sidenote: what os.path operation does Path.resolve() match? 
> Path('nonexistent').resolve() returns a relative path on Python 
> 3.7.1, whereas Path().resolve() returns an absolute path.)

pathlib should resolve 'nonexistent' in Windows. It works as expected in Unix:

    >>> os.getcwd()
    '/etc'
    >>> os.fspath(Path('nonexistent').resolve())
    '/etc/nonexistent'

A PR to implement ntpath.realpath is in development for issue 14094. The proposed implementation calls ntpath.abspath at the start, unless it's an extended path (i.e. prefixed by \\?\). Unlike Unix, Windows normalizes a path in user mode as a text operation before passing it to the kernel and file system. This means there's no problem if abspath removes a reparse point (e.g. symlink or mountpoint) when it resolves a ".." component.
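
The textual nature of this normalization can be illustrated with ntpath, which is importable on any platform. In this small sketch the directory names are made up, no file system is consulted, and is_extended is a hypothetical, simplified stand-in for the extended-path check mentioned above:

```python
import ntpath

# 'link' could be a symlink or mount point; normpath never checks.
# The ".." simply cancels the preceding component as text.
path = r'C:\data\link\..\file.txt'
normalized = ntpath.normpath(path)
print(normalized)


def is_extended(p):
    """Report whether a path carries the \\\\?\\ extended-path prefix."""
    return p.startswith('\\\\?\\')
```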

> The code paths should be audited to check that EINVAL can't mean something else.

We'd have to use the Windows error code (e.g. ERROR_INVALID_NAME) if it has to be specific. EINVAL is the default errno value. In particular, EINVAL includes some low-level device failures such as ERROR_IO_DEVICE and errors for operations that a device doesn't implement, which are commonly ERROR_INVALID_PARAMETER, ERROR_INVALID_FUNCTION, and ERROR_NOT_SUPPORTED. 
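
A check keyed on the Windows error code rather than errno could be sketched as follows. The sets and the is_ignorable helper are hypothetical and are not pathlib's actual configuration; 123 is the WinError value shown in the tracebacks earlier in this issue:

```python
import errno

ERROR_INVALID_NAME = 123  # WinError reported for Path('*') above

# Hypothetical sets chosen for illustration only.
IGNORED_ERRNOS = {errno.ENOENT, errno.ENOTDIR}
IGNORED_WINERRORS = {ERROR_INVALID_NAME}


def is_ignorable(exc):
    """Decide whether an OSError should be read as 'path does not exist'."""
    # The winerror attribute only exists on Windows, so default to None.
    winerror = getattr(exc, 'winerror', None)
    if winerror in IGNORED_WINERRORS:
        return True
    # A bare EINVAL is deliberately not ignored: it may be a device failure.
    return exc.errno in IGNORED_ERRNOS
```

Matching on winerror keeps device failures that also map to EINVAL, such as ERROR_IO_DEVICE, visible to the caller.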

Also, a few device and file-system errors are mapped to EACCES (e.g. ERROR_NOT_READY and ERROR_SECTOR_NOT_FOUND). If we include EACCES, then files that exist but are inaccessible (e.g. the user isn't allowed to list the parent directory) will be reported as not existing instead of raising an error. It's what os.path.exists does, but I guess pathlib wants to be more nuanced.

When using C runtime I/O (e.g. open, read, write), it can help to get the last Windows error code, _doserrno [1]. Its value gets set when errno is set by mapping an OS error. The last NT status value may also help in some cases. It gets set whenever an NT status code is mapped to a Windows error via RtlNtStatusToDosError (usually followed immediately by RtlSetLastWin32Error). It would be nice if OSError always included these two values, maybe as "last_winerror" (differentiated from "winerror") and "last_ntstatus".

For example, here's a case of trying to open a file on a CD drive that has no disc in it.

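    import ctypes

    # __doserrno() returns a pointer to the CRT's last OS error value;
    # errcheck dereferences it so the call returns the integer directly.
    doserrno = ctypes.WinDLL('ucrtbase').__doserrno
    doserrno.restype = ctypes.POINTER(ctypes.c_ulong)
    doserrno.errcheck = lambda r, f, a: r[0]

    # NTSTATUS values are unsigned 32-bit codes.
    get_last_nt_status = ctypes.WinDLL('ntdll').RtlGetLastNtStatus
    get_last_nt_status.restype = ctypes.c_ulong

    def test():
        try:
            open('D:\\test.txt')
        except OSError:
            # Read both values immediately, before another call overwrites them.
            winerror, ntstatus = doserrno(), get_last_nt_status()
            print('Windows error:', winerror)
            print('NT status:', format(ntstatus, '#010x'))
            raise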

    >>> test()
    Windows error: 21
    NT status: 0xc0000013
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    PermissionError: [Errno 13] Permission denied: 'D:\\test.txt'

Windows error 21 is ERROR_NOT_READY, so we're already much better informed than EACCES (13). NT status 0xC0000013 is STATUS_NO_MEDIA_IN_DEVICE.

[1]: https://docs.microsoft.com/en-us/cpp/c-runtime-library/errno-doserrno-sys-errlist-and-sys-nerr?view=vs-2017
History
Date                 User         Action  Args
2019-01-16 01:14:21  eryksun      set     messages: + msg333746
2019-01-15 19:02:30  pitrou       set     messages: + msg333730
                                          versions: + Python 3.8, - Python 3.6
2019-01-15 18:40:28  steve.dower  set     nosy: + pitrou
                                          messages: + msg333728
2019-01-06 08:10:56  jimbo1qaz_   set     messages: + msg333092
2018-12-12 20:15:52  v2m          set     keywords: + patch
                                          stage: patch review
                                          pull_requests: + pull_request10365
2018-11-25 20:32:24  eryksun      set     nosy: + eryksun, paul.moore, tim.golden, serhiy.storchaka, zach.ware, steve.dower
                                          messages: + msg330394
                                          components: + Windows
2018-11-24 05:55:59  jimbo1qaz_   create