classification
Title: os.walk() consumes a lot of file descriptors
Type: resource usage Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: benhoyt, gvanrossum, larry, martin.panter, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016-01-02 17:57 by serhiy.storchaka, last changed 2016-02-24 11:05 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
walk_consume_fds_opt1.patch serhiy.storchaka, 2016-01-03 08:48 review
walk_consume_fds_opt2.patch serhiy.storchaka, 2016-01-03 08:49 review
Messages (13)
msg257352 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-02 17:57
Since 3.5 os.walk() consumes a lot of file descriptors. Its number equals to the deep of directories tree. Since the number of file descriptors is limited, this can cause problems.

This was the main reason for rejecting fwalk-based implementation of os.walk() (issue15200).
msg257404 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-03 08:48
Here are two alternative solutions for this issue.
msg258018 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-01-11 21:10
I don't like the first solution (it just collects all scandir() results in a list, which defeats one of the purposes of using scandir()), but I don't understand the second solution.

Ideally the number of FDs used should be limited by the depth of the tree being walked, right?
msg258022 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-11 21:35
Both patches are basically equivalent. The first one collects all scandir() results in a list, the second one collects only directory names in a list. The purpose of using os.scandir() in os.walk() was a speed up (issue23605), and both patches preserve it.

Yes, the number of FDs used is equivalent to the depth of the tree which can be very deep (I just created a tree depth of 1000 levels). And what is worse, all these FDs can be effectively leaked if the walking was not finished. This is unwanted behavior change.
msg258025 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-01-11 21:42
I am all for preventing the leaks. But using FDs proportional to the tree
depth seems reasonable to me. (If you are worried about some kind of DoS
attack on the algorithm by someone who can build a tree with depth 1000,
well, if they can do that they can also create a flat folder with a million
files in it.)

Is there a potential hybrid strategy, where for small directories we close
the FD but for large directories we keep it open?
msg258096 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-12 10:06
> But using FDs proportional to the tree depth seems reasonable to me.

This was the main reason for rejecting issue15200. May be we can allow this for globbing, but only explicitly enabled with special parameter or separate functions. The benefit for globbing is starting to yield results without internal caching. The benefit for os.walk() is smaller, because in any case it yields files and directories not one-by-one, but collected into lists.

> Is there a potential hybrid strategy, where for small directories we close
> the FD but for large directories we keep it open?

May be with more complicated code. I think this needs explicit close() method, so it can go only in 3.6. For now I think we have to commit one of my patches (the difference only stylistic).
msg258116 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-01-12 18:36
I like them both, if I had to choose I'd pick patch 2.

But yes, we need to add a close() method to the scandir iterator object.

In the meantime, I am still worried about what would happen if somehow the loop got interrupted and the frame got kept alive and the iterator wasn't closed by its dealloc until much later.  This kind of thing was common in asyncio and we had to resort to similar tricks to break some cycles. Maybe you can add a try/finally that *deletes* scandir_it to force it to close itself (at least in CPython)? That can go into 3.5.
msg258238 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-14 22:26
Yes, OSError, MemoryError and KeyboardInterrupt can be raised during iterating scandir object or calling is_dir() or is_symlink() methods of the result. They stop iterating and left file descriptor open. If close file descriptor on error during iteration (issue26117), patch 1 becomes more safe, because it minimizes the lifetime of opened descriptor.
msg260089 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-02-11 11:32
New changeset e951d76f1945 by Serhiy Storchaka in branch '3.5':
Issue #25995: os.walk() no longer uses FDs proportional to the tree depth.
https://hg.python.org/cpython/rev/e951d76f1945

New changeset 6197a09a56b1 by Serhiy Storchaka in branch 'default':
Issue #25995: os.walk() no longer uses FDs proportional to the tree depth.
https://hg.python.org/cpython/rev/6197a09a56b1
msg260095 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-11 11:43
walk_consume_fds_opt1.patch is applied in 3.5 (it is more reliable with the absence of close()) and walk_consume_fds_opt2.patch is applied in 3.6.
msg260784 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-24 10:40
The change seems to be causing some Windows buildbot failures <http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.5/builds/666/steps/test/logs/stdio>:

======================================================================
ERROR: test_walk_bad_dir (test.test_os.BytesWalkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\buildbot.python.org\3.5.kloth-win64\build\lib\test\test_os.py", line 931, in test_walk_bad_dir
    root, dirs, files = next(walk_it)
  File "C:\buildbot.python.org\3.5.kloth-win64\build\lib\test\test_os.py", line 1027, in walk
    for broot, bdirs, bfiles in os.walk(os.fsencode(top), **kwargs):
  File "C:\buildbot.python.org\3.5.kloth-win64\build\lib\os.py", line 372, in walk
    entries = list(scandir(top))
TypeError: os.scandir() doesn't support bytes path on Windows, use Unicode instead

======================================================================
ERROR: test_walk_bottom_up (test.test_os.BytesWalkTests)
ERROR: test_walk_prune (test.test_os.BytesWalkTests)

Was this line

entries = list(scandir(top))

meant to be

entries = list(scandir_it)
msg260788 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-02-24 11:05
New changeset 9bffe39e8273 by Serhiy Storchaka in branch '3.5':
Fixed a bug in os.walk() with bytes path on Windows caused by merging fixes
https://hg.python.org/cpython/rev/9bffe39e8273
msg260789 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-24 11:05
Thanks Martin!
History
Date User Action Args
2016-02-24 11:05:58serhiy.storchakasetstatus: open -> closed

messages: + msg260789
2016-02-24 11:05:05python-devsetmessages: + msg260788
2016-02-24 10:40:27martin.pantersetstatus: closed -> open
nosy: + martin.panter
messages: + msg260784

2016-02-11 11:43:04serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg260095

stage: patch review -> resolved
2016-02-11 11:32:19python-devsetnosy: + python-dev
messages: + msg260089
2016-02-08 15:59:59serhiy.storchakasetassignee: serhiy.storchaka
2016-01-14 22:26:51serhiy.storchakasetmessages: + msg258238
2016-01-12 18:36:17gvanrossumsetmessages: + msg258116
2016-01-12 10:06:48serhiy.storchakasetmessages: + msg258096
2016-01-11 21:42:12gvanrossumsetmessages: + msg258025
2016-01-11 21:35:22serhiy.storchakasetmessages: + msg258022
2016-01-11 21:10:38gvanrossumsetnosy: + gvanrossum
messages: + msg258018
2016-01-03 08:49:03serhiy.storchakasetfiles: + walk_consume_fds_opt2.patch
2016-01-03 08:48:40serhiy.storchakasetfiles: + walk_consume_fds_opt1.patch
keywords: + patch
messages: + msg257404

stage: patch review
2016-01-02 17:57:05serhiy.storchakacreate