This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Python 3.9.6 scan_dir returns filenotfound on long paths, but os_walk does not
Type: crash
Stage: patch review
Components: IO, Tests, Windows
Versions: Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, jkloth, jschwar313, paul.moore, serhiy.storchaka, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords: patch

Created on 2021-12-15 14:54 by jschwar313, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
test_dir_scan_dir.py jschwar313, 2021-12-15 14:54 can be used to show filenotfound issue with long directory names. os_walk does not do the same thing.
test_os_walk.py jschwar313, 2021-12-15 15:03 this script works just fine on the same directory structure.
test_dir_scan_dir.py jschwar313, 2021-12-15 18:43 can be used to show filenotfound issue with long directory names. os_walk does not do the same thing.
Messages (13)
msg408607 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 14:54
Python 3.9.6 scan_dir returns filenotfound on long paths, but os_walk does not.  I've enclosed a sample script that shows this happening and can be used for testing. Enabling the Windows 10 registry setting for long path names fixes the issue (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem\LongPathsEnabled set to 1). I have a script that does the same thing using os_walk, but I can't attach two scripts to this issue.
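For anyone reproducing this, it may help to check the registry value mentioned above before running the scripts. The following is a minimal sketch, not part of the attached files; `long_paths_enabled` is a hypothetical helper name, and it only reads the system-wide value (it does not check whether a given application opts in to long paths).

```python
import sys
from typing import Optional


def long_paths_enabled() -> Optional[bool]:
    """Return True/False for the LongPathsEnabled value, or None if unknown.

    Hypothetical helper: returns None on non-Windows platforms or when the
    registry value cannot be read.
    """
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            r"SYSTEM\CurrentControlSet\Control\FileSystem",
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        # Key or value is missing, or access was denied.
        return None
    return value == 1


if __name__ == "__main__":
    print("LongPathsEnabled:", long_paths_enabled())
```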
msg408608 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 15:03
Here's the second file, which works just fine under Python 3.9 (by the way, I am using 64-bit Windows).  I didn't test this on later Python versions, however, nor did I test it on 32-bit versions.  I see that many people on the internet have said to change the working directory as a work around.  Could this possibly be why?
msg408617 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-12-15 16:55
> Python 3.9.6 scan_dir returns filenotfound on long paths, 
> but os_walk does not.

This would be surprising. os.walk() has been implemented via os.scandir() since Python 3.5. Do you have a concrete example of the directory structure to test? 

> I see that many people on the internet have said to 
> change the working directory as a work around.  

Changing the working directory is a workaround in Unix, not Windows. Without long-path support, the working directory in Windows is limited to 258 (MAX_PATH - 2) characters.

Without long-path support, the workaround in Windows is to use an extended path, i.e. a Unicode path that's fully-qualified and normalized -- as returned by os.path.abspath() -- and prefixed by "\\\\?\\" or "\\\\?\\UNC\\" (e.g. r"\\?\C:\spam" or r"\\?\UNC\server\share\spam"). This allows the native path length limit of about 32760 characters. Some API functions and applications do not support extended paths. In particular setting an extended path as the working directory is unsupported and buggy, even if long-path support is enabled. But extended paths work fine with most file functions in the os and shutil modules, such as os.scandir(), os.stat(), os.open(), shutil.copytree(), and shutil.rmtree().
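The extended-path workaround described above can be sketched as a small helper. This is an illustration under stated assumptions, not code from the issue: `to_extended_path` is a hypothetical name, it uses `ntpath` so the string handling behaves the same on any platform, and it expects an already fully-qualified path (on Windows, callers would typically pass the result of `os.path.abspath()`).

```python
import ntpath


def to_extended_path(path: str) -> str:
    """Return an extended-length form of a fully-qualified Windows path.

    Hypothetical helper for illustration. Relative paths are rejected
    because an extended path must be fully qualified.
    """
    # Already an extended path or a device path: leave it untouched.
    if path.startswith("\\\\?\\") or path.startswith("\\\\.\\"):
        return path
    if not ntpath.isabs(path):
        raise ValueError("path must be fully qualified: %r" % (path,))
    # Extended paths bypass normalization in the Windows API, so
    # normalize here (collapses '..', '.', and forward slashes).
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC path: \\server\share\spam becomes \\?\UNC\server\share\spam
        return "\\\\?\\UNC\\" + path[2:]
    # Drive path: C:\spam becomes \\?\C:\spam
    return "\\\\?\\" + path
```

On Windows, the result could then be passed to functions such as `os.scandir()` or `os.walk()`, for example `os.scandir(to_extended_path(os.path.abspath(p)))`.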
msg408619 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 17:14
yes, I do.  C:\Users\Jim\Documents\jschw_uiowtv3_old\AppData\Local\Google\Chrome\User Data\Default\Extensions\nenlahapcbofgnanklpelkaejcehkggg\0.1.823.675_0\notifications\pages\Cashback\components\CashBackResolve\components\RewardsActivation\components\CashbackSectionSimple

It's over the 260-character limit that's the default for Windows 10.
msg408621 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-12-15 17:32
It works as expected for me:

    >>> len(p)
    261
    >>> print(p)
    C:\Temp\Jim\Documents\jschw_uiowtv3_old\AppData\Local\Google\Chrome\User Data\Default\Extensions\nenlahapcbofgnanklpelkaejcehkggg\0.1.823.675_0\notifications\pages\Cashback\components\CashBackResolve\components\RewardsActivation\components\CashbackSectionSimple

os.walk() can list the files in directory `p` if the \\?\ prefix is prepended:

    >>> next(os.walk('\\\\?\\' + p))[-1]
    ['spam.txt']

Without the prefix, the internal os.scandir() call fails, but by default the error is ignored:

    >>> next(os.walk(p))[-1]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration

We can print the exception to see that it's the expected ERROR_PATH_NOT_FOUND (3) error for a path that's too long:

    >>> next(os.walk(p, onerror=print))[-1]
    [WinError 3] The system cannot find the path specified: 'C:\\Temp\\Jim\\Documents\\jschw_uiowtv3_old\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Extensions\\nenlahapcbofgnanklpelkaejcehkggg\\0.1.823.675_0\\notifications\\pages\\Cashback\\components\\CashBackResolve\\components\\RewardsActivation\\components\\CashbackSectionSimple'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
msg408623 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 17:47
Do you have this registry entry set to 1: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem\LongPathsEnabled?  It works if you do.  What version of Windows do you have?  I have version 21H2 (OS Build 19044.1387).  I don't have Windows 11 yet.
msg408625 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-12-15 17:56
If I had long paths enabled, then next(os.walk(p, onerror=print)) would not have printed the error that I showed in the example and would not have immediately raised StopIteration. Instead it would have returned a (dirpath, dirnames, filenames) result for directory `p`. Did you repeat the simple examples that I showed, exactly as shown, and get a different result?
msg408626 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-12-15 17:56
os.walk() has been implemented via os.scandir(), but by default it ignores OSErrors raised by os.scandir(), DirEntry.is_dir() and DirEntry.is_symlink(). You can get the errors raised by os.scandir() if you specify the onerror argument, but errors in DirEntry.is_dir() and DirEntry.is_symlink() are always ignored, so directories that are too deep, or links to directories, can be treated as files.
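To make the silently ignored os.scandir() errors visible, the onerror argument can collect them instead of dropping them. A minimal sketch, assuming a hypothetical helper named `walk_with_errors`; as noted above, errors from DirEntry.is_dir() and DirEntry.is_symlink() would still not be reported this way.

```python
import os
from typing import List, Tuple


def walk_with_errors(top: str) -> Tuple[List[str], List[OSError]]:
    """Walk `top`, returning the files found and the scandir errors hit.

    Hypothetical helper: os.walk() calls onerror with the OSError
    instance raised by its internal os.scandir() call.
    """
    files: List[str] = []
    errors: List[OSError] = []
    for dirpath, _dirnames, filenames in os.walk(top, onerror=errors.append):
        files.extend(os.path.join(dirpath, name) for name in filenames)
    return files, errors
```

If the returned error list is non-empty, the walk skipped at least one directory, which would explain a run that finishes "just fine" while missing part of the tree.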
msg408632 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-12-15 18:30
> but errors in DirEntry.is_dir() and DirEntry.is_symlink() 
> are always ignored

In Windows, is_symlink() won't fail due to a long path, since that information comes from the directory listing, but is_dir() might fail for a long path if it's a symlink to a directory. Windows requires that a symlink to a directory is also a directory (i.e. the symlink reparse point is set on an empty directory), but it's not enough to check that it's a directory symlink. is_dir() requires checking that the target exists, which may fail if the path of the link is too long to open and resolve the link target.
msg408634 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 18:43
The issue is with the scandir script, not the os_walk script.  I tried to upload the scandir Python script before, but I guess it didn't upload.  When I was running the two scripts, I used C:\\ as the input parameter.  Hope that helps.
msg408636 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 19:12
when I run the following command:

python "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py" "C:\\"

I get this output:

...
Traceback (most recent call last):
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 54, in <module>
    main(sys.argv[0:])
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 30, in main
    for file in get_files_in_dir(source):
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 11, in get_files_in_dir
    yield from get_files_in_dir(entry.path)
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 11, in get_files_in_dir
    yield from get_files_in_dir(entry.path)
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 11, in get_files_in_dir
    yield from get_files_in_dir(entry.path)
  [Previous line repeated 19 more times]
  File "H:\Users\LindaJim\Documents\AWS Python Learning\test_dir_scan_dir.py", line 9, in get_files_in_dir
    for entry in os.scandir(source):
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Users\\Jim\\Documents\\jschw_uiowtv3_old\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Extensions\\nenlahapcbofgnanklpelkaejcehkggg\\0.1.823.675_0\\notifications\\pages\\Cashback\\components\\CashBackResolve\\components\\RewardsActivation\\components\\CashbackSectionSimple'

when I run the following command:

python "H:\Users\LindaJim\Documents\AWS Python Learning\test_os_walk.py" "C:\\"

I get this:

...
file is  C:\winutils\bin\winutils.exe
End time is  2021-12-15.13:11:54
Duration is  0:06:05

I don't think this should happen, right?
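The difference between the two runs is consistent with a recursive os.scandir() generator that lets OSError propagate, while os.walk() swallows it. The attached script is not reproduced here, so the following is only a sketch of how such a generator could tolerate unreadable directories the way os.walk() does; `iter_files` and its `onerror` parameter are hypothetical names.

```python
import os
from typing import Callable, Iterator, Optional


def iter_files(
    top: str, onerror: Optional[Callable[[OSError], None]] = None
) -> Iterator[str]:
    """Recursively yield file paths under `top`, skipping unreadable dirs."""
    try:
        # Read the whole listing up front so errors raised while
        # iterating are handled in the same place as open errors.
        with os.scandir(top) as it:
            entries = list(it)
    except OSError as exc:
        # For example, [WinError 3] for a path that is too long.
        if onerror is not None:
            onerror(exc)
        return
    for entry in entries:
        try:
            is_dir = entry.is_dir(follow_symlinks=False)
        except OSError:
            # Treat entries that cannot be examined as non-directories.
            is_dir = False
        if is_dir:
            yield from iter_files(entry.path, onerror)
        else:
            yield entry.path
```

Calling `list(iter_files("C:\\", onerror=print))` would then print each directory that could not be listed instead of stopping at the first one.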
msg408643 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 20:23
My C drive and H drive are both internal drives, and I run the Python script from my user directory on my C drive.  Not sure if that makes any difference.  Just trying to think of things that might help you reproduce and fix this.
msg408649 - Author: Jim Schwartz (jschwar313) Date: 2021-12-15 21:26
Please let me know if you are able to reproduce this issue.
History
Date User Action Args
2022-04-11 14:59:53 admin set github: 90242
2022-03-21 18:46:36 jkloth set pull_requests: - pull_request30123
2022-03-21 18:45:41 jkloth set pull_requests: + pull_request30123
2022-03-21 18:45:11 jkloth set pull_requests: - pull_request30121
2022-03-21 18:44:02 jkloth set keywords: + patch; nosy: + jkloth; pull_requests: + pull_request30121; stage: patch review
2021-12-15 21:26:39 jschwar313 set messages: + msg408649
2021-12-15 20:23:57 jschwar313 set messages: + msg408643
2021-12-15 19:12:59 jschwar313 set messages: + msg408636
2021-12-15 18:43:56 jschwar313 set files: + test_dir_scan_dir.py; messages: + msg408634
2021-12-15 18:30:10 eryksun set messages: + msg408632
2021-12-15 17:56:20 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg408626
2021-12-15 17:56:01 eryksun set messages: + msg408625
2021-12-15 17:47:43 jschwar313 set messages: + msg408623
2021-12-15 17:32:36 eryksun set messages: + msg408621
2021-12-15 17:14:58 jschwar313 set messages: + msg408619
2021-12-15 16:55:42 eryksun set nosy: + eryksun; messages: + msg408617
2021-12-15 15:03:40 jschwar313 set files: + test_os_walk.py; messages: + msg408608
2021-12-15 14:54:30 jschwar313 create