classification
Title: os.path.isfile returns false on Windows when file path is longer than 260 characters
Type: behavior
Stage: resolved
Components: Library (Lib), Windows
Versions: Python 3.8, Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, ldconejo, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2018-03-19 18:28 by ldconejo, last changed 2021-03-01 13:02 by eryksun. This issue is now closed.

Messages (5)
msg314109 - Author: Luis Conejo-Alpizar (ldconejo) Date: 2018-03-19 18:28
Windows has a maximum path length limitation of 260 characters. This limitation, however, can be bypassed in the scenario described below. When this occurs, os.path.isfile() returns False, even though the affected file does exist. On Windows systems, os.path.isfile() should instead raise an exception in this case, indicating that the maximum path length has been exceeded.

Sample scenario:

1. Let's say you have a folder named F1, located on your local machine at this path:

C:\tc\proj\MTV\cs_fft\Milo\Fries\STL\BLNA\F1\

2. Inside of that folder, you have a log file with this name:

This_is_a_really_long_file_name_that_by_itself_is_not_capable_of_exceeding_the_path_length_limitation_Windows_has_in_pretty_much_every_single_version_of_Wind.log

3. The combined length of the path and the file name is exactly 260 characters, so Windows lets you get away with it when the file is initially created and/or placed there.

4. Later, you decide to make the F1 folder available on your network, under this name:

\\tst\tc\proj\MTV\cs_fft\Milo\Fries\STL\BLNA\F1\

5. Your log file continues to be in the folder, but its full network path is now 263 characters, effectively violating the maximum path length limitation.

6. If you use os.listdir() on the networked folder, the log file will come up.

7. Now, if you try os.path.isfile(os.path.join(networked_path, logfile_name)), it will return False, even though the file is indeed there and is indeed a file.
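
The mismatch between steps 6 and 7 can be surfaced with a short check. This is a minimal Python 3 sketch, not part of the original report; the helper name entries_missing_from_stat is hypothetical. It collects the names that os.listdir() reports for a directory but that os.path.lexists() cannot reach through the joined path.

```python
import os


def entries_missing_from_stat(directory):
    """Return names that os.listdir() reports but os.path.lexists() cannot see.

    Hypothetical helper, for illustration only.
    """
    missing = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        # lexists() returns False whenever the underlying stat call fails,
        # which includes a joined path that is too long for the process.
        if not os.path.lexists(full_path):
            missing.append(name)
    return sorted(missing)
```

On a directory whose entries are all reachable, the function returns an empty list. In the scenario above, it would presumably return the long log file name when called with the networked path.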
msg314114 - Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-03-19 19:17
Given basically every other file operation on Windows XP will also break on this file, I don't think it's worth us fixing in 2.7.

If it occurs on Python 3.6 on Windows 7, we can consider it. But considering how well known this limitation is (and the workarounds for newer OSs), I don't think it's particularly urgent.
msg314123 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2018-03-20 00:55
Python 2.7 is all but set in stone. Changes to its behavior have to correct serious bugs, not work around limits in an OS. You can do that yourself. For example, use an extended local-device path, i.e. a path that's prefixed by u"\\\\?\\". This path type must be unicode, fully-qualified, and use only backslash as the path separator. Also, the UNC device has to be used explicitly. Python 2 has poor support for unicode raw strings (\u and \U escapes aren't disabled), so you can instead use forward slashes and normpath(). For example: 

    f1_path = os.path.normpath(u"//?/UNC/tst/tc/proj/MTV/cs_fft/Milo/Fries/STL/BLNA/F1") 
    log_path = os.path.join(f1_path, log_filename)
    assert os.path.isfile(log_path)
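
The same prefixing rules can be wrapped in a small helper. What follows is a sketch in Python 3 rather than Python 2.7, and the function name to_extended_path is hypothetical. It uses ntpath directly so that the string handling behaves the same on any platform, and it assumes the caller passes a fully-qualified drive or UNC path.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"          # \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # \\?\UNC\


def to_extended_path(path):
    """Convert a fully-qualified Windows path to its extended (\\\\?\\) form.

    Hypothetical helper, for illustration only.
    """
    # normpath() converts forward slashes to backslashes and collapses "..".
    path = ntpath.normpath(path)
    if path.startswith(EXTENDED_PREFIX):
        return path  # already an extended path
    if not ntpath.isabs(path):
        raise ValueError("extended paths must be fully qualified: %r" % path)
    if path.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path
```

For instance, to_extended_path("//tst/tc/proj/F1") gives the \\?\UNC\tst\tc\proj\F1 form, which can then be joined with the log file name as in the example above.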
msg314126 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2018-03-20 01:12
> If you use os.listdir() on the networked folder, the log file 
> will come up.

Querying a file's parent directory (e.g. via os.scandir in Python 3) can provide a basic stat (i.e. attributes, reparse tag, size, and timestamps) when opening the file directly fails. Currently the os.[l]stat implementation in Python 3 falls back on querying the directory for ERROR_ACCESS_DENIED (e.g. due to a file's security, delete disposition, or an exclusive open) and ERROR_SHARING_VIOLATION (e.g. a system paging file). 

This could be expanded to ERROR_PATH_NOT_FOUND, which, among other reasons, can indicate that the path was too long if long-path support isn't available. This would extend the reach to all files for which the path of the parent directory is shorter than MAX_PATH. It would also keep os.[l]stat consistent with os.listdir and os.scandir, which it currently is not. For example:

    >>> parent_path, filename = os.path.split(path)
    >>> len(path), len(parent_path), filename
    (264, 255, 'spam.txt')

    >>> os.path.exists(path)
    False
    >>> entry = next(os.scandir(parent_path))
    >>> entry.name
    'spam.txt'
    >>> entry.stat()
    os.stat_result(st_mode=33206, st_ino=0, st_dev=0, st_nlink=0,
                   st_uid=0, st_gid=0, st_size=0, st_atime=1521507879,
                   st_mtime=1521507879, st_ctime=1521507879)
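
The directory-query idea can also be approximated in user code, without any change to os.stat() itself. This is only a sketch; the helper name stat_with_directory_fallback is hypothetical, and it relies on the assumption that entry.stat() for a non-symlink entry on Windows is served from the directory listing rather than by opening the file.

```python
import os


def stat_with_directory_fallback(path):
    """Try os.stat(); if the path is not found, look it up in its parent directory.

    Hypothetical helper, for illustration only.
    """
    try:
        return os.stat(path)
    except FileNotFoundError:
        parent, name = os.path.split(path)
        wanted = os.path.normcase(name)
        # Scan the parent directory for an entry with the same name.
        with os.scandir(parent or ".") as entries:
            for entry in entries:
                if os.path.normcase(entry.name) == wanted:
                    return entry.stat()
        # No matching entry was found: re-raise the original error.
        raise
```

As in the session above, a result obtained through the fallback on Windows may carry zeros for st_ino, st_dev, and st_nlink, so callers should not rely on those fields.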
msg387861 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-03-01 13:02
I'm closing this as not a bug. If the process limits DOS paths to MAX_PATH, then checking os.path.isfile() should not be special cased to support longer DOS paths because calling open() on such a path will raise FileNotFoundError. My suggestion in msg314126 to have stat() fall back on querying the directory in this case is too magical, even if it does make os.stat() more consistent with os.listdir() and the stat result from os.scandir().
History
Date User Action Args
2021-03-01 13:02:07 eryksun set status: open -> closed; resolution: not a bug; messages: + msg387861; stage: needs patch -> resolved
2018-03-20 01:12:55 eryksun set stage: needs patch; messages: + msg314126; components: + Windows; versions: + Python 3.8
2018-03-20 00:55:02 eryksun set nosy: + eryksun; messages: + msg314123; title: os.isfile returns false on Windows when file path is longer than 260 characters -> os.path.isfile returns false on Windows when file path is longer than 260 characters
2018-03-19 19:17:44 steve.dower set messages: + msg314114
2018-03-19 18:30:05 ned.deily set nosy: + paul.moore, tim.golden, zach.ware, steve.dower
2018-03-19 18:28:31 ldconejo create