Title: glob returns results in nondeterministic order
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6, Python 3.3, Python 3.4, Python 3.5, Python 2.7
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: bmwiedemann, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-05-24 19:29 by bmwiedemann, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg294381 - Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017-05-24 19:29
Because POSIX readdir does not guarantee any order,
glob often returns its results in an unexpectedly random order.

Some background:
for openSUSE Linux we build packages in the Open Build Service (OBS)
which tracks dependencies, so when e.g. a new glibc is submitted,
all packages depending on glibc are rebuilt
and if those depending binaries changed,
the new version is pushed to the mirrors.

Many Python modules build their .so files from the result of a call like glob.glob(os.path.join(path, "*.cpp")).

The old glob behaviour would often lead to the linker
ordering functions randomly in the resulting object files,
so we were not able to auto-detect
that the package did not actually change,
which wastes the bandwidth of distribution mirrors and users.

See also on that topic.

There are plenty of affected packages out there
msg294391 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-24 20:53
This looks like a duplicate of issue21748. That behavior was explicitly documented in issue25615.
msg295117 - Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2017-06-04 09:02
From my performance measurements, the overhead was negligible (not even counting the processing done on files returned by glob).
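
A rough way to check this kind of overhead is sketched below. It is not the measurement referred to above: the helper names, the file count, and the repeat count are all made up for illustration, and the numbers will vary by filesystem and machine.

```python
import glob
import os
import tempfile
import timeit


def make_files(directory, count):
    """Create `count` empty .cpp files (hypothetical test data)."""
    for i in range(count):
        open(os.path.join(directory, "file%05d.cpp" % i), "w").close()


def time_glob(directory, sort, repeat=20):
    """Average seconds per glob call, with or without sorting the result."""
    pattern = os.path.join(directory, "*.cpp")
    if sort:
        stmt = lambda: sorted(glob.glob(pattern))
    else:
        stmt = lambda: glob.glob(pattern)
    return timeit.timeit(stmt, number=repeat) / repeat


with tempfile.TemporaryDirectory() as tmp:
    make_files(tmp, 200)
    plain = time_glob(tmp, sort=False)
    ordered = time_glob(tmp, sort=True)
    print("unsorted: %.6f s, sorted: %.6f s" % (plain, ordered))
```

Comparing the two printed figures gives a feel for how much of the total time is spent sorting rather than reading the directory.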

Also, glob in C, bash, and Perl all sort by default. These are generally pretty fast languages, yet they still chose consistency over performance.

I updated my PR to also update the documentation accordingly.
msg295133 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-06-04 17:23
Sorry, we're going to reject this patch for the reasons discussed in the two other referenced patches.

If a user wants sorted order, they can effortlessly specify that with sorted(glob('*.cpp')).
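
As a minimal sketch of that suggestion applied to a build-style source list: the helper name sorted_sources and the file names are hypothetical, chosen only for illustration.

```python
import glob
import os
import tempfile


def sorted_sources(path, pattern="*.cpp"):
    """Return the files in `path` matching `pattern`, in sorted order."""
    return sorted(glob.glob(os.path.join(path, pattern)))


with tempfile.TemporaryDirectory() as tmp:
    # Create the files in a deliberately non-alphabetical order.
    for name in ("b.cpp", "a.cpp", "c.cpp", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    names = [os.path.basename(p) for p in sorted_sources(tmp)]
    print(names)  # ['a.cpp', 'b.cpp', 'c.cpp']
```

Because the list is sorted before use, the order no longer depends on what the underlying directory listing happens to return.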
