classification
Title: os.cpu_count is problematic on sparc/solaris
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: phantal, r.david.murray, vstinner
Priority: normal
Keywords:

Created on 2017-01-13 17:48 by phantal, last changed 2022-04-11 14:58 by admin.

Messages (5)
msg285430 - Author: Brian Vandenberg (phantal) Date: 2017-01-13 17:48
I'm attempting to build Python 3.6.0 on SPARC/Solaris 10.  After the initial configure and compile completed, I ran "make test" and saw:

$ make test
running build
running build_ext
(...)
running build_scripts
copying and adjusting (...)
changing mode of (...)
renaming (...)
(...)
Run tests in parallel using 258 child processes


I'm fairly sure the issue stems from the fact that each core on the machine has 8 "threads" and there are 32 cores (for a total of 256 virtual cores).

Each core can execute 8 parallel tasks only in very specific circumstances.  It's intended for use by things like lapack/atlas where you might be doing many computations on the same set of data.

Outside of these more restricted circumstances, each core can only handle 2 parallel tasks (or so I gathered from the documentation), so at best this machine could handle 64 background jobs.  I normally restrict my builds to the actual core count or less.

The most common way to get a "realistic" core count on these machines from shell scripts is:

$ core_count=`kstat -m cpu_info | grep core_id | sort -u | wc -l`

... though I'm not sure how the test suite is determining the core count.  I didn't see any mention of "kstat" anywhere.
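
For illustration, here is a minimal Python sketch of the same idea as the shell pipeline above: count the distinct core_id values reported by "kstat -m cpu_info". The helper names count_core_ids and physical_core_count are hypothetical, and the parsing assumes each statistic line is a name followed by a single value. It is not how the test suite or os.cpu_count determines the count; it only shows one way a "realistic" count might be derived.

```python
import os
import shutil
import subprocess


def count_core_ids(kstat_text):
    """Count distinct core_id values in `kstat -m cpu_info` style output.

    Assumption: statistic lines look like "<name> <value>".
    """
    core_ids = set()
    for line in kstat_text.splitlines():
        fields = line.split()
        # Keep only lines whose statistic name is exactly "core_id".
        if len(fields) == 2 and fields[0] == "core_id":
            core_ids.add(fields[1])
    return len(core_ids)


def physical_core_count():
    """Return the distinct-core count from kstat, else os.cpu_count()."""
    if shutil.which("kstat") is None:
        # No kstat on this platform: fall back to the logical CPU count.
        return os.cpu_count()
    result = subprocess.run(
        ["kstat", "-m", "cpu_info"],
        capture_output=True,
        text=True,
        check=False,
    )
    # An empty or unparsable result also falls back to os.cpu_count().
    return count_core_ids(result.stdout) or os.cpu_count()
```

On a machine like the one described here, the expectation would be 32 from physical_core_count() versus 256 from os.cpu_count(), but that has not been verified on real hardware.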
msg285431 - Author: Brian Vandenberg (phantal) Date: 2017-01-13 17:51
I forgot to mention, this wasn't an issue in 3.5.1 though I never did check how many jobs it was using.

I ran into other issues building that version and moved to a newer version because at least one of them (logging test race condition) was fixed after 3.5.1.
msg285432 - Author: Brian Vandenberg (phantal) Date: 2017-01-13 17:54
This is odd.  I just went back and re-ran the 3.5.1 tests to see how many cores it would use, and it's having the same problem now.  So, scratch that last comment.
msg285441 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-01-13 20:59
You don't know it, but you are actually reporting a possible sparc/Solaris specific (I think) bug against os.cpu_count.  There was some discussion about how cpu_count might be problematic in this regard.

It doesn't cause any real problem with the tests, though.  I routinely run with -j40 on my 2 cpu test box because the test run completes faster that way due to the way many tests spend time waiting for various things.
msg285446 - Author: Brian Vandenberg (phantal) Date: 2017-01-13 22:28
> It doesn't cause any real problem with the tests, though.  I routinely run with -j40 on my 2 cpu test box because the test run completes faster that way due to the way many tests spend time waiting for various things.

In my case it did: so many file descriptors were allocated that the run hit the cap on open file handles.
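
One way to guard against this kind of failure is to cap the number of worker processes by the soft limit on open files. The following is a rough Python sketch of that approach, not something the test suite is known to do. The function names are hypothetical, and fds_per_worker=16 and reserved=64 are placeholder guesses rather than measured values.

```python
import os
import resource


def cap_jobs(requested, soft_limit, fds_per_worker=16, reserved=64):
    """Limit a worker count so estimated descriptor use fits soft_limit.

    fds_per_worker and reserved are illustrative assumptions.
    """
    # Leave some descriptors for the parent, but allow at least one worker.
    usable = max(soft_limit - reserved, fds_per_worker)
    return max(1, min(requested, usable // fds_per_worker))


def suggested_jobs():
    """Suggest a job count from os.cpu_count() and RLIMIT_NOFILE."""
    requested = os.cpu_count() or 1
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # No descriptor limit to respect.
        return requested
    return cap_jobs(requested, soft)
```

With these placeholder numbers, cap_jobs(258, 256) returns 12, so a request for 258 children under a 256-descriptor soft limit would be cut down substantially. The result of suggested_jobs() could then be passed to the test runner's -j option.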
History
Date                 User              Action  Args
2022-04-11 14:58:42  admin             set     github: 73451
2017-01-13 22:28:44  phantal           set     messages: + msg285446
2017-01-13 20:59:38  r.david.murray    set     title: test suite is attempting to spawn 258 child processes to run tests -> os.cpu_count is problematic on sparc/solaris
                                               nosy: + r.david.murray
                                               messages: + msg285441
                                               components: + Library (Lib), - Tests
2017-01-13 18:02:48  serhiy.storchaka  set     nosy: + vstinner
2017-01-13 17:54:06  phantal           set     messages: + msg285432
2017-01-13 17:51:08  phantal           set     messages: + msg285431
2017-01-13 17:48:34  phantal           create