classification
Title: os and multiprocessing.cpu_count do not respect cpuset/affinity
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, davin, eryksun, giampaolo.rodola, jtaylor, neologix, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-02-26 14:30 by jtaylor, last changed 2017-09-05 06:32 by giampaolo.rodola. This issue is now closed.

Files
File name Uploaded Description Edit
0001-Issue-23530-Update-documentation-clarify-relation-of.patch jtaylor, 2015-03-07 11:03 review
Messages (16)
msg236671 - (view) Author: Julian Taylor (jtaylor) Date: 2015-02-26 14:30
multiprocessing.cpu_count and os.cpu_count which are often used to determine how many processes one can run in parallel do not respect the cpuset which may limit the process to only a subset of online cpus leading to heavy oversubscription in e.g. containerized environments:

$ taskset -c 0 python3.4 -c 'import multiprocessing; print(multiprocessing.cpu_count())'
32
$ taskset -c 0 python3.4 -c 'import os; print(os.cpu_count())'
32

While the correct result here should be 1.

This requires programs to have to use less portable methods like forking to gnu nproc or having to read the /proc filesystem.

Having a keyword argument to switch between online and available cpus would be fine too.
msg236690 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2015-02-26 17:23
Detecting the number of processors available to a process is a distinct concept from reporting the number of processors present on a system.  cpu_count is currently focused on the latter.

Functionality to report the number of effectively-available processors is indeed valuable information to have in certain situations but it is also something very OS-dependent (I mean the support for it is not present in all OSes, let alone the complication of how to access such information differing across OSes).

I think a strong case can be made that cpu_count's current functionality should not change -- the information it provides is very important to have access to.

Cool tools like psutil (https://pypi.python.org/pypi/psutil) help cover a great range of genuine needs when making use of processor affinity.
msg236695 - (view) Author: Julian Taylor (jtaylor) Date: 2015-02-26 17:59
I do agree that its probably safer to not change the default return value.
But adding a option (or new function) would still be good, the number of available cpus is more often the number you actually want in practice.
To the very least the documentation should be improved to clearly state that this number does not guarantee that this amount of cpus are actually available to run on and you should use psutils instead.

Code for getting this information for the major operating systems linux, bsd and windows is available in gnu coreutils.
I can possibly work on a patch if it would get accepted but I can only actually test it linux.
msg236781 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2015-02-27 16:31
Adding an option does sound like a better possibility.  Still, when I start looking through the examples that psutil provides, it reminds me how this is but one small piece of a much larger picture which psutil has done a nice, focused job of working to address.

If the patch you create were to depend upon gnu coreutils, I do not think it can be accepted for licensing reasons.  Interestingly psutil does not appear to depend upon that library.

Regarding the docs, what text would you propose instead of what's currently there for describing cpu_count?
msg236805 - (view) Author: Julian Taylor (jtaylor) Date: 2015-02-27 17:40
certainly for anything that needs good control over affinity psutils is the best choice, but I'm not arguing to implement full process control in python. I only want python to provide the number of cores one can work on to make best use of the available resources.

If you code search python files for cpu_count you find on github 18000 uses, randomly sampling a few every single one was to determine the number of cpus to start worker jobs to get best performance. Every one of these will oversubscribe a host that restricts the cpus a process can use. This is an issue especially for the increasingly popular use of containers instead of full virtual machines.

as a documentation update I would like to have a note saying that this number is the number of (online) cpus in the system may not be the number of of cpus the process can actually use. Maybe with a link to len(psutils.Process.get_affinity()) as a reference on how to obtain that number.

there would be no dependence on coreutils, I just mentioned it as you can look up the OS api you need to use to get the number there (e.g. sched_getaffinity). It is trivial API use and should not be a licensing issue, one could also look at the code from psutil which most likely looks very similar.
msg236827 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2015-02-27 20:08
Well, we already expose CPU affinity:

>>> import os
>>> os.sched_getaffinity(0)
{0}

IMO the current implementation is sufficient (and talking about
overcommitting for CPU is a bit moot if you're using virtual machine
anyways).

The current documentation says:
   Return the number of CPUs in the system. Returns None if undetermined.

Which to me is clear enough, although if you want to add an explicit
note that this doesn't take cpu affinity into account that wouldn't
hurt.
msg236828 - (view) Author: Julian Taylor (jtaylor) Date: 2015-02-27 20:13
oh thats great so python already has what I want. Then just an small documentation update would be good, I'll have a go at a patch later.
msg236867 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-02-28 06:00
> Well, we already expose CPU affinity:
> 
> >>> import os
> >>> os.sched_getaffinity(0)
> {0}

os.sched_getaffinity only exists on some POSIX systems, such as Linux. For Windows, here's a ctypes version of sched_getaffinity and sched_setaffinity:

    import sys
    from ctypes import *
    from ctypes.wintypes import *

    __all__ = ['sched_getaffinity', 'sched_setaffinity']

    kernel32 = WinDLL('kernel32')

    DWORD_PTR = WPARAM
    PDWORD_PTR = POINTER(DWORD_PTR)

    GetCurrentProcess = kernel32.GetCurrentProcess
    GetCurrentProcess.restype = HANDLE

    OpenProcess = kernel32.OpenProcess
    OpenProcess.restype = HANDLE
    OpenProcess.argtypes = (DWORD, # dwDesiredAccess,_In_
                            BOOL,  # bInheritHandle,_In_
                            DWORD) # dwProcessId, _In_

    GetProcessAffinityMask = kernel32.GetProcessAffinityMask
    GetProcessAffinityMask.argtypes = (
        HANDLE,     # hProcess, _In_
        PDWORD_PTR, # lpProcessAffinityMask, _Out_
        PDWORD_PTR) # lpSystemAffinityMask, _Out_

    SetProcessAffinityMask = kernel32.SetProcessAffinityMask
    SetProcessAffinityMask.argtypes = (
      HANDLE,    # hProcess, _In_
      DWORD_PTR) # dwProcessAffinityMask, _In_

    PROCESS_SET_INFORMATION = 0x0200
    PROCESS_QUERY_INFORMATION = 0x0400
    PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
    if sys.getwindowsversion().major < 6:
        PROCESS_QUERY_LIMITED_INFORMATION = PROCESS_QUERY_INFORMATION

    def sched_getaffinity(pid):
        if pid == 0:
            hProcess = GetCurrentProcess()
        else:
            hProcess = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION,
                                   False, pid)
        if not hProcess:
            raise WinError()
        lpProcessAffinityMask = DWORD_PTR()
        lpSystemAffinityMask = DWORD_PTR()
        if not GetProcessAffinityMask(hProcess, 
                                      byref(lpProcessAffinityMask),
                                      byref(lpSystemAffinityMask)):
            raise WinError()
        mask = lpProcessAffinityMask.value
        return {c for c in range(sizeof(DWORD_PTR) * 8) if (1 << c) & mask}

    def sched_setaffinity(pid, mask):
        if pid == 0:
            hProcess = GetCurrentProcess()
        else:
            hProcess = OpenProcess(PROCESS_SET_INFORMATION,
                                   False, pid)
        if not hProcess:
            raise WinError()
        bitmask = 0
        for cpu in mask:
            if not isinstance(cpu, int):
                raise TypeError('expected an iterator of ints, but '
                                'iterator yielded %r' % type(cpu))
            if cpu < 0:
                raise ValueError('negative CPU number')
            if cpu >= sizeof(DWORD_PTR) * 8:
                raise ValueError('CPU number too large')
            bitmask |= 1 << cpu
        if not SetProcessAffinityMask(hProcess, bitmask):
            raise WinError()
msg237436 - (view) Author: Julian Taylor (jtaylor) Date: 2015-03-07 11:03
attached documentation update patch.
msg246524 - (view) Author: Julian Taylor (jtaylor) Date: 2015-07-09 20:10
any comments on the doc changes?
msg246698 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-07-13 20:02
New changeset b9e3838e9664 by Charles-François Natali in branch 'default':
Issue #23530: Improve os.cpu_count() description.
https://hg.python.org/cpython/rev/b9e3838e9664
msg246699 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2015-07-13 20:41
Committed.
Julian, thanks for the patch!
msg246700 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2015-07-13 20:57
It would be nice to backport the patch to the 3.4 and 3.5 branches.
msg246773 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2015-07-15 18:32
It's just a doc improvement, I'm not convinced it really needs backporting.
msg248503 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-13 08:12
Argument Clinic code was not regenerated. Actually the commit breaks Argument Clinic.

$ make clinic
./python -E ./Tools/clinic/clinic.py --make
Error in file "./Modules/posixmodule.c" on line 11211:
Docstring for os.cpu_count does not have a summary line!
Every non-blank function docstring must start with
a single line summary followed by an empty line.
make: *** [clinic] Error 255
msg248538 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-08-13 19:37
New changeset 23c6cc5d5b8f by Charles-François Natali in branch 'default':
Issue #23530: fix clinic comment.
https://hg.python.org/cpython/rev/23c6cc5d5b8f
History
Date User Action Args
2017-09-05 06:32:11giampaolo.rodolasetnosy: + giampaolo.rodola
2016-02-10 22:52:55neologixsetstatus: open -> closed
stage: needs patch -> resolved
2015-08-13 19:37:30python-devsetmessages: + msg248538
2015-08-13 08:12:12serhiy.storchakasetstatus: closed -> open

nosy: + serhiy.storchaka
messages: + msg248503

resolution: fixed ->
stage: resolved -> needs patch
2015-07-15 18:32:33neologixsetmessages: + msg246773
2015-07-13 20:57:38berker.peksagsetnosy: + berker.peksag
messages: + msg246700
2015-07-13 20:41:41neologixsetstatus: open -> closed
resolution: fixed
messages: + msg246699

stage: resolved
2015-07-13 20:02:19python-devsetnosy: + python-dev
messages: + msg246698
2015-07-09 20:10:08jtaylorsetmessages: + msg246524
2015-03-07 11:03:07jtaylorsetfiles: + 0001-Issue-23530-Update-documentation-clarify-relation-of.patch
keywords: + patch
messages: + msg237436
2015-02-28 06:00:51eryksunsetnosy: + eryksun
messages: + msg236867
2015-02-27 20:13:42jtaylorsetmessages: + msg236828
2015-02-27 20:08:49neologixsetmessages: + msg236827
2015-02-27 17:40:44jtaylorsetmessages: + msg236805
2015-02-27 16:31:04davinsetmessages: + msg236781
2015-02-26 17:59:17jtaylorsetmessages: + msg236695
2015-02-26 17:23:10davinsetnosy: + davin
messages: + msg236690
2015-02-26 14:40:30serhiy.storchakasetnosy: + neologix
2015-02-26 14:30:54jtaylorcreate