classification
Title: iglob should try to use `readdir`
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: duplicate
Dependencies: 11406 Superseder: Use scandir() to speed up the glob module
View: 25596
Assigned To: Nosy List: neologix, o11c, tebeka
Priority: normal Keywords:

Created on 2013-10-13 01:42 by tebeka, last changed 2017-02-07 19:05 by serhiy.storchaka. This issue is now closed.

Messages (3)
msg199649 - (view) Author: Miki Tebeka (tebeka) * Date: 2013-10-13 01:42
Currently, glob.iglob calls os.listdir internally, which means that if a directory contains many files, a large list of all of them is created in memory.

iglob should try to use readdir and be a "true" iterator, not consuming a lot of memory.

See one possible implementation using ctypes at http://stackoverflow.com/questions/4403598/list-files-in-a-folder-as-a-stream-to-begin-process-immediately
msg199657 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-10-13 06:47
Actually, it should probably be using a generator-based version of os.listdir().
See #11406.
msg287251 - (view) Author: Ben Longbons (o11c) * Date: 2017-02-07 18:46
This is a duplicate of bug 25596, which is now fixed.
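
For illustration, the lazy directory iteration requested in this issue can be sketched with os.scandir(), which has been in the standard library since Python 3.5 and returns an iterator rather than a list. This is only a minimal sketch, not the code that glob actually uses: the helper names iter_matching and demo are hypothetical, the match is limited to a single directory level, and glob's special handling of hidden files is not reproduced.

```python
import fnmatch
import os
import tempfile


def iter_matching(dirname, pattern):
    """Lazily yield the names in dirname that match a shell-style pattern.

    Hypothetical helper for illustration; not part of the glob module.
    """
    # os.scandir() returns an iterator of directory entries, so no
    # complete list of names is built up front.
    with os.scandir(dirname) as entries:
        for entry in entries:
            if fnmatch.fnmatch(entry.name, pattern):
                yield entry.name


def demo():
    """Create a few files in a temporary directory and filter them lazily."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt", "c.log"):
            open(os.path.join(tmp, name), "w").close()
        # Directory order is not guaranteed, so sort for a stable result.
        return sorted(iter_matching(tmp, "*.txt"))
```

Because iter_matching is a generator, a caller can begin processing the first matching name before the rest of the directory has been read, which is the behaviour asked for in the original report.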
History
Date User Action Args
2017-02-07 19:05:40 serhiy.storchaka set status: open -> closed
superseder: Use scandir() to speed up the glob module
resolution: duplicate
stage: resolved
2017-02-07 18:46:38 o11c set nosy: + o11c
messages: + msg287251
2013-10-13 06:47:06 neologix set nosy: + neologix
dependencies: + There is no os.listdir() equivalent returning generator instead of list
messages: + msg199657
2013-10-13 01:42:57 tebeka create