os.cpu_count() returns wrong number of processors on specific systems #77347
Comments
The wrong number of CPUs is reported on some specific platforms. First platform: multiprocessing using concurrent.futures is able to fully utilize the server. Second platform: multiprocessing using concurrent.futures is able to utilize only 1/4 of the server's power. |
You mean os.cpu_count() reports *more* CPUs than exist on the machine? How can that happen? |
Yup. |
The difference between os.cpu_count() and psutil.cpu_count() is because one uses GetMaximumProcessorCount() and the other dwNumberOfProcessors. This is tracked as a bug in psutil bug tracker but it's not fixed yet: As for Python this is where it was discussed and changed: In summary: psutil is wrong and you should rely on os.cpu_count(). |
Maybe I'm missing something, and would appreciate clarification. Perhaps psutil is wrong, but it gives an answer that has something to do with the actual situation. On platform 2, I have 2 Intel Xeon Gold 6138, each with 20 physical processors, 40 logicals. You are saying I need to rely on os.cpu_count(), which outputs '128'. Can you elaborate on this? Moreover, when attempting to parallelize on the processors, I reach 25% utilization, which suggests Python 'sees' only one processor group. |
Oh! So both os.cpu_count() and psutil.cpu_count() are wrong? How do you determine the actual number of logical/physical CPUs on your machine? |
Yes. Both are wrong, and os.cpu_count() is completely off. It seems that, aside from the os.cpu_count() issue, Python itself has some problem: it 'sees' only 1 CPU group. This is evident from the fact that when parallelizing, the utilization level is only 25%. |
Let's not conflate different issues. The parallelization issue is distinct from the os.cpu_count() issue (and I'm skeptical Python is at fault there). |
Ok, no problem. makes sense? |
Exactly. |
On re-reading, GetMaximumProcessorCount (now used by os.cpu_count()) returns "the maximum number of logical processors that a processor group or the system CAN have", not the actual number. That would explain why in OP's case os.cpu_count() returns 128 instead of 40. As per https://bugs.python.org/issue30581#msg295255, dwNumberOfProcessors wasn't good because it doesn't take multiple processor groups into account (hence the number may be too small), and GetLogicalProcessorInformationEx may be the way to go. This is based on the assumption that os.cpu_count() should report the number of CPUs in the system (including the non-usable ones, as in the case of processor groups). |
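To illustrate the distinction between "CPUs in the system" and "CPUs this process can actually use", here is a minimal sketch. It assumes a platform where `os.sched_getaffinity` exists (for example Linux) and falls back to `os.cpu_count()` elsewhere. The helper name `usable_cpu_count` is hypothetical, and the sketch does not address Windows processor groups.

```python
import os


def usable_cpu_count():
    """Return the number of CPUs the current process may run on.

    Hypothetical helper: uses the scheduler affinity mask where the
    platform exposes it, otherwise falls back to os.cpu_count(),
    which may itself return None.
    """
    if hasattr(os, "sched_getaffinity"):
        # Set of CPU indices the calling process (pid 0 = self) is allowed to use.
        return len(os.sched_getaffinity(0))
    return os.cpu_count()


if __name__ == "__main__":
    print("os.cpu_count():    ", os.cpu_count())
    print("usable_cpu_count():", usable_cpu_count())
```

The two numbers can differ whenever the process is restricted to a subset of the machine's processors.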
Adding Chris Wilcox who wrote the original patch using GetMaximumProcessorCount. |
I created a psutil branch using GetLogicalProcessorInformation() to determine both logical and physical CPUs: |
The fallback for older versions of Windows is dwNumberOfProcessors from GetSystemInfo. This can be removed from 3.7 and 3.8, which no longer support Windows versions prior to Windows 7.
GetActiveProcessorCount and GetMaximumProcessorCount are implemented via GetLogicalProcessorInformationEx (i.e. NtQuerySystemInformation, SystemLogicalProcessorAndGroupInformation). They query the RelationGroup information. For ALL_PROCESSOR_GROUPS, they respectively sum the ActiveProcessorCount and MaximumProcessorCount over all groups.
These functions were added in Windows 7 to support the implementation of logical processor groups, which allows up to 64 logical processors per group. Each process is created in a single group, which is assigned round-robin. A thread can call SetThreadGroupAffinity to manually switch to another group.
Apparently someone at Microsoft advised calling GetMaximumProcessorCount (see bpo-30581), but I don't follow this decision. Why should os.cpu_count() include CPUs that may or may not come online? Also, on POSIX it reports sysconf(_SC_NPROCESSORS_ONLN), not sysconf(_SC_NPROCESSORS_CONF), so for Windows it should instead call GetActiveProcessorCount. I assume on BSD that HW_NCPU is similar, though I'm not sure for macOS. Also on macOS it appears to be deprecated in favor of HW_LOGICALCPU, HW_LOGICALCPU_MAX, HW_PHYSICALCPU, and HW_PHYSICALCPU_MAX.
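As a rough illustration of the active-versus-maximum distinction (not the patch itself), the sketch below queries both counts from Python. On Windows it calls `GetActiveProcessorCount` and `GetMaximumProcessorCount` through `ctypes` with `ALL_PROCESSOR_GROUPS`. On POSIX it reads `SC_NPROCESSORS_ONLN` and `SC_NPROCESSORS_CONF` via `os.sysconf`. The function name `processor_counts` is hypothetical, and error handling is kept deliberately minimal.

```python
import ctypes
import os
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # constant from the Windows SDK headers


def processor_counts():
    """Return (active_or_online, maximum_or_configured) logical CPU counts.

    Hypothetical helper; either element is None when it cannot be determined.
    """
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        for name in ("GetActiveProcessorCount", "GetMaximumProcessorCount"):
            func = getattr(kernel32, name)
            func.argtypes = [ctypes.c_ushort]  # WORD GroupNumber
            func.restype = ctypes.c_ulong      # DWORD, 0 on failure
        active = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)
        maximum = kernel32.GetMaximumProcessorCount(ALL_PROCESSOR_GROUPS)
        return (active or None, maximum or None)

    def _sysconf(name):
        # os.sysconf raises ValueError for names the platform does not define.
        try:
            value = os.sysconf(name)
        except (ValueError, OSError):
            return None
        return value if value > 0 else None

    return (_sysconf("SC_NPROCESSORS_ONLN"), _sysconf("SC_NPROCESSORS_CONF"))


if __name__ == "__main__":
    active, maximum = processor_counts()
    print("active/online:     ", active)
    print("maximum/configured:", maximum)
    print("os.cpu_count():    ", os.cpu_count())
```

Comparing the first value with `os.cpu_count()` on an affected machine should show whether the interpreter in use reports the active or the maximum count.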
32-bit Windows and WOW64 emulation are limited to 32 CPUs. Applications that need more logical processors should be 64-bit |
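Since the 32-CPU limit applies to 32-bit processes, it can be useful to check which kind of interpreter is running. A minimal sketch follows; the helper name `interpreter_bits` is hypothetical, and it reports the pointer width of the running Python build, not of the operating system.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width, in bits, of the running interpreter."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Interpreter is %d-bit" % interpreter_bits())
    print("sys.maxsize > 2**32:", sys.maxsize > 2**32)
```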
That makes sense to me. Thanks for deciphering this. |
Added bpo-32592 as a dependency since that is removing the Vista code mentioned here. Once that change is merged, then this would be a simpler change to make. |
Thank you for clarifying this muddy topic, Eryk! (Dropping bpo-32592 dependency; we've done the update in a basically-compatible way.) |