subprocess: use PROC_THREAD_ATTRIBUTE_HANDLE_LIST with STARTUPINFOEX on Windows Vista #63963

Closed
vstinner opened this issue Nov 25, 2013 · 28 comments
Labels
3.7 (EOL) end of life OS-windows performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@vstinner
Member

BPO 19764
Nosy @gpshead, @pfmoore, @vstinner, @tjguk, @codeape2, @zware, @eryksun, @zooba, @xflr6, @segevfiner
PRs
  • bpo-19764: Implemented support for subprocess.Popen(close_fds=True) on Windows #1218
Files
  • windows-subprocess-close-fds.patch: Implement subprocess.Popen(close_fds=True) with stdio on Windows
  • windows-subprocess-close-fds-v2.patch: Second version of the patch
  • windows-subprocess-close-fds-v3.patch: Third version of the patch
  • windows-subprocess-close-fds-v3-vista7-hack.patch: Third version with a hack around Vista/7 nastiness (Requires testing)
  • windows-subprocess-close-fds-v4.patch: 4th version of the patch
  • windows-subprocess-close-fds-v5.patch
  • windows-subprocess-close-fds-v6.patch: 6th version of the patch (Now on top of Git :P)
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-12-18.09:35:37.441>
    created_at = <Date 2013-11-25.09:01:55.948>
    labels = ['3.7', 'library', 'OS-windows', 'performance']
    title = 'subprocess: use PROC_THREAD_ATTRIBUTE_HANDLE_LIST with STARTUPINFOEX on Windows Vista'
    updated_at = <Date 2018-07-04.14:56:46.872>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-07-04.14:56:46.872>
    actor = 'xflr6'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-12-18.09:35:37.441>
    closer = 'vstinner'
    components = ['Library (Lib)', 'Windows']
    creation = <Date 2013-11-25.09:01:55.948>
    creator = 'vstinner'
    dependencies = []
    files = ['46175', '46178', '46185', '46282', '46814', '46819', '46820']
    hgrepos = []
    issue_num = 19764
    keywords = ['patch']
    message_count = 28.0
    messages = ['204314', '204637', '204638', '204675', '284817', '284834', '284843', '284865', '284866', '284898', '284903', '285427', '291421', '291900', '291915', '291989', '291995', '291996', '294164', '294791', '294810', '308470', '308528', '308530', '308531', '320784', '320806', '321048']
    nosy_count = 11.0
    nosy_names = ['gregory.p.smith', 'paul.moore', 'vstinner', 'tim.golden', 'Bernt.Røskar.Brenna', 'sbt', 'zach.ware', 'eryksun', 'steve.dower', 'xflr6', 'Segev Finer']
    pr_nums = ['1218']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue19764'
    versions = ['Python 3.7']

    @vstinner
    Member Author

    subprocess.Popen has a race condition on Windows with file descriptors: if two threads spawn subprocesses at the same time, unwanted file descriptors may be inherited, which leads to annoying issues like "cannot delete a file because it is open by another process". See issue bpo-19575 for an example of such a bug.

    Since Windows Vista, a list of handles which should be inherited can be specified in CreateProcess() using PROC_THREAD_ATTRIBUTE_HANDLE_LIST with STARTUPINFOEX. It avoids the need to mark the handles temporarily inheritable.

    For more information, see:
    http://www.python.org/dev/peps/pep-0446/#only-inherit-some-handles-on-windows

    @vstinner vstinner added the type-feature A feature request or enhancement label Nov 25, 2013
    @vstinner
    Member Author

    The purpose of this issue is to avoid having to call CreateProcess() with the bInheritHandles parameter set to TRUE on Windows, and to avoid calls to self._make_inheritable() in subprocess.Popen._get_handles().

    Currently, bInheritHandles is set to TRUE if the stdin, stdout and/or stderr parameter of the Popen constructor is set to something other than None.

    Using PROC_THREAD_ATTRIBUTE_HANDLE_LIST, handles don't need to be marked as inheritable in the parent process, and CreateProcess() can be called with bInheritHandles parameter set to FALSE.

    @vstinner
    Member Author

    UpdateProcThreadAttribute() documentation says that "... handles must be created as inheritable handles ..." and a comment says that "If using PROC_THREAD_ATTRIBUTE_HANDLE_LIST, pass TRUE to bInherit in CreateProcess. Otherwise, you will get an ERROR_INVALID_PARAMETER."

    http://msdn.microsoft.com/en-us/library/windows/desktop/ms686880%28v=vs.85%29.aspx

    Seriously? What is the purpose of PROC_THREAD_ATTRIBUTE_HANDLE_LIST if it does not avoid the race condition? It's "just" to not inherit some inheritable handles? In Python 3.4, files and sockets are created non-inheritable by default, so PROC_THREAD_ATTRIBUTE_HANDLE_LIST may not improve anything :-/
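The PEP 446 default mentioned above can be checked with a minimal sketch like the following. It only uses `os.pipe()`, `os.get_inheritable()` and `os.set_inheritable()`; the variable names are illustrative.

```python
import os

# Since Python 3.4 (PEP 446), newly created descriptors are non-inheritable.
r, w = os.pipe()
default_flags = (os.get_inheritable(r), os.get_inheritable(w))

# Opting in is explicit and done per descriptor.
os.set_inheritable(w, True)
after_opt_in = os.get_inheritable(w)

os.close(r)
os.close(w)
print(default_flags, after_opt_in)
```

So the remaining inheritable handles in a typical process are the ones subprocess itself marks while setting up stdio for a child.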

    @vstinner
    Member Author

    I reread the following blog post:
    http://blogs.msdn.com/b/oldnewthing/archive/2011/12/16/10248328.aspx

    I now understand the purpose of PROC_THREAD_ATTRIBUTE_HANDLE_LIST.

    Let's say that two Python threads each create a Popen object with a pipe for stdout:

    • Thread A : pipe 1
    • Thread B : pipe 2
    • Main thread has random inheritable files and sockets

    The handles of both pipes are inheritable. Currently, the child spawned by thread A may inherit pipe 2 and the child spawned by thread B may inherit pipe 1, depending on exactly when the pipes are created and marked as inheritable, and when CreateProcess() is called.

    Using PROC_THREAD_ATTRIBUTE_HANDLE_LIST, the child of thread A will only inherit pipe 1, not pipe 2 nor the inheritable handles of the other threads. The child of thread B will only inherit pipe 2, no other handle. It does not matter that CreateProcess() is called with bInheritHandles=TRUE nor that there are other inheritable handles.
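The scenario can be illustrated with a toy model. This is not Windows code: plain Python sets stand in for the handle table, and `spawn` is a hypothetical function that only models which handles a child would receive.

```python
# Toy model only: sets of names stand in for kernel handles.

def spawn(process_handles, inheritable, handle_list=None):
    """Return the set of handles a child would receive.

    process_handles: all handles open in the parent.
    inheritable: the subset currently flagged inheritable.
    handle_list: optional whitelist, modelling PROC_THREAD_ATTRIBUTE_HANDLE_LIST.
    """
    inherited = process_handles & inheritable
    if handle_list is not None:
        inherited &= set(handle_list)
    return inherited


handles = {"pipe1", "pipe2", "socket"}
inheritable = {"pipe1", "pipe2", "socket"}  # both threads have marked their pipes

# Without a handle list, the child of thread A receives everything inheritable.
child_a_racy = spawn(handles, inheritable)
# With a handle list, each child receives only its own pipe.
child_a = spawn(handles, inheritable, ["pipe1"])
child_b = spawn(handles, inheritable, ["pipe2"])
print(sorted(child_a_racy), sorted(child_a), sorted(child_b))
```

The whitelist filters the inheritable set per call, which is why the timing of the other thread no longer matters.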

    @vstinner vstinner added OS-windows performance Performance or resource usage and removed type-feature A feature request or enhancement labels Apr 12, 2016
    @segevfiner
    Mannequin

    segevfiner mannequin commented Jan 6, 2017

    Though Python has taken measures to mark handles as non-inheritable there is still a possible race due to having to create inheritable handles while creating processes with stdio pipes (subprocess).

    Attached is a patch that implements subprocess.Popen(close_fds=True) with stdio handles on Windows using PROC_THREAD_ATTRIBUTE_HANDLE_LIST, which plugs that race completely.

    I implemented this by adding the attribute STARTUPINFO._handleList, which, when passed to _winapi.CreateProcess, will be passed to CreateProcess as a PROC_THREAD_ATTRIBUTE_HANDLE_LIST. subprocess.py can then use this attribute as needed with inherit_handles=True to only inherit the stdio handles.

    The STARTUPINFO._handleList attribute can also be used to implement pass_fds later on. Though the exact behavior of how to convert a file descriptor list to a handle list might be a bit sensitive, so I left that out for now.

    This patch obviously doesn't support Windows XP but Python 3 doesn't support XP anymore either.

    @eryksun
    Contributor

    eryksun commented Jan 6, 2017

    Implementing pass_fds on Windows is a problem if Popen has to implement the undocumented use of the STARTUPINFO cbReserved2 and lpReserved2 fields to inherit CRT file descriptors. I suppose we could implement this ourselves in _winapi since it's unlikely that the data format will ever change. Just copy what the CRT's accumulate_inheritable_handles() function does, but constrained by an array of file descriptors.

    @eryksun eryksun added stdlib Python modules in the Lib dir 3.7 (EOL) end of life labels Jan 6, 2017
    @segevfiner
    Mannequin

    segevfiner mannequin commented Jan 6, 2017

    Second version of the patch after review by eryksun.

    Please pay attention to the hack in _execute_child due to having to
    temporarily override the handle_list if the user supplied
    one.

    As for pass_fds: as you noted, it has its own share of complexities and issues and I think it's best to leave it to a separate patch/issue.

    @vstinner
    Member Author

    vstinner commented Jan 6, 2017

    Python already has a multiprocessing module which is able to pass handles (maybe also FD? I don't know) to child processes on Windows. I found some code in Lib/multiprocessing/reduction.py:

    • duplicate()
    • steal_handle()
    • send_handle()

    But the design doesn't really fit the subprocess module, since it requires that the child process communicate with the parent process. On UNIX, fork()+exec() is used, so we can execute a few instructions after fork, which allows passing an exception from the child to the parent. On Windows, CreateProcess() is used, which does not allow executing code directly before running the final child process.

    PEP 446 describes a solution using a wrapper process, so parent+wrapper+child, 3 processes. IMHO the best design for subprocess is really PROC_THREAD_ATTRIBUTE_HANDLE_LIST.

    @vstinner
    Member Author

    vstinner commented Jan 6, 2017

    I dislike adding a lpAttributeList attribute: it's too close to the exact implementation on Windows, which may change in the future. I would prefer a higher-level API.

    Since the only known use case today is to pass handles, I propose to focus on this use case: add a new pass_handles parameter to Popen, similar to pass_fds.

    I see that your patch is able to set close_fds to True on Windows: great job! It would be a great achievement to finally fix this last known race condition of subprocess on Windows!

    So thank you for working on this!

    > As for pass_fds: as you noted, it has its own share of complexities and issues and I think it's best to leave it to a separate patch/issue.

    pass_fds would be "nice to have", but I prefer to stick first to the native and well supported handles on Windows. For me, using file descriptors on Windows is more a "hack" to be able to write code working on Windows and UNIX, but since it's not natively supported on Windows, it comes with its own set of issues.

    IMHO working with handles is better supported.

    @segevfiner
    Mannequin

    segevfiner mannequin commented Jan 7, 2017

    I removed previous_handle_list in _execute_child since I noticed subprocess already clobbers the other attributes in startupinfo anyhow.

    I figured there will be some discussion about how to pass the handle list, so here's my two cents:

    • subprocess already exposes a few Windows-specific flags like creationflags and STARTUPINFO.

    • Windows doesn't really break its API in backwards incompatible ways often (Heck it barely breaks it ever, which is why we have so many Ex functions and reserved parameters :P).

    • The _winapi module tries to expose WinAPI functions as is. So I implemented this as an internal attribute on STARTUPINFO, in the first version, since I wasn't sure we want this exposed to users, but I still wanted to try and mimic the original WinAPI functions internally. The lpAttributeList is a change requested by eryksun that brings it even closer to WinAPI and exposes it for further extension with additional attributes.
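For readers who want to see what the lpAttributeList approach looks like from calling code, here is a minimal sketch assuming the API as it is named later in this thread (`STARTUPINFO.lpAttributeList` with a `handle_list` entry, Python 3.7+ on Windows). `popen_with_handles` is a hypothetical helper for illustration, not part of subprocess.

```python
import os
import subprocess
import sys


def popen_with_handles(args, handles):
    """Start `args`, letting the child inherit only `handles` (Windows only).

    Hypothetical helper for illustration; it is not part of subprocess.
    """
    if sys.platform != "win32":
        raise OSError("handle_list is only available on Windows")
    handles = list(handles)
    startupinfo = subprocess.STARTUPINFO(
        lpAttributeList={"handle_list": handles}
    )
    # Handles in the list must be inheritable when CreateProcess() runs.
    for handle in handles:
        os.set_handle_inheritable(handle, True)
    try:
        return subprocess.Popen(args, startupinfo=startupinfo, close_fds=True)
    finally:
        # Restore the non-inheritable default once the child has been created.
        for handle in handles:
            os.set_handle_inheritable(handle, False)
```

While the handles are marked inheritable, a concurrent CreateProcess() call that inherits everything could still pick them up, so this narrows the window rather than closing it for code outside subprocess.Popen.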

    @eryksun
    Contributor

    eryksun commented Jan 7, 2017

    > Python already has a multiprocessing module which is able to pass
    > handles (maybe also FD? I don't know) to child processes on
    > Windows.

    Popen doesn't implement the undocumented CRT protocol that's used to smuggle the file-descriptor mapping in the STARTUPINFO cbReserved2 and lpReserved2 fields. This is a feature of the CRT's spawn and exec functions. For example:

        import os

        fdr, fdw = os.pipe()
        os.set_inheritable(fdw, True)
        # The CRT's spawn passes the inheritable fd to cmd.exe, which writes to it.
        os.spawnl(os.P_WAIT, os.environ['ComSpec'], 'cmd /c "echo spam >&%d"' % fdw)

        >>> os.read(fdr, 10)
        b'spam \r\n'

    We don't have to worry about implementing fd inheritance so long as os.spawn* uses the CRT. Someone that needs this functionality can simply be instructed to use os.spawn.

    > I dislike adding a lpAttributeList attribute: it's too close to
    > the exact implementation of Windows may change in the future.

    If you're going to worry about lpAttributeList, why stop there? Aren't dwFlags, wShowWindow, hStdInput, hStdOutput, and hStdError also too close to the exact implementation? My thoughts when suggesting this were actually to make this as close to the underlying API as possible, and extensible to support other attributes if there's a demand for it.

    Passing a list of handles is atypical usage, and since Python and subprocess use file descriptors instead of Windows handles, I prefer isolating this in a Windows structure such as STARTUPINFO, rather than adding even more confusion to Popen's constructor.

    > Since the only known use case today is to pass handles

    In the review of the first patch, I listed 3 additional attributes that might be useful to add in 3.7: IDEAL_PROCESSOR, GROUP_AFFINITY, and PREFERRED_NODE (simplified by the fact that 3.7 no longer supports Vista). Currently the way to set the latter two is to use the built-in start command of the cmd shell.

    > I propose to focus on this use case: add a new pass_handles parameter
    > to Popen, similar to pass_fds.

    This is a messy situation. Python 3's file I/O is built on the CRT's POSIX layer. If it had been implemented directly on the Windows API using handles, then pass_fds would obviously use handles. That's the current situation with the socket module because Winsock makes no attempt to hide AFD handles behind POSIX file descriptors.

    Popen's constructor accepts file descriptors -- not Windows handles -- for its stdin, stdout, and stderr arguments, and the parameter to control inheritance is named "close_fds". It seems out of place to add a "pass_handles" parameter.

    @segevfiner
    Mannequin

    segevfiner mannequin commented Jan 13, 2017

    I have read some of https://github.com/rprichard/win32-console-docs and it documents quite a bunch of nastiness with PROC_THREAD_ATTRIBUTE_HANDLE_LIST in Windows Vista/7. Windows is so much fun sometimes :P

    Essentially console handles in Windows before Windows 8 are user mode handles and not real kernel handles. Those user mode handles are inherited by a different mechanism than kernel handles and regardless of PROC_THREAD_ATTRIBUTE_HANDLE_LIST, and if passed in PROC_THREAD_ATTRIBUTE_HANDLE_LIST will cause it to fail in weird ways. Those user mode console handles have the lower two bits set. The lower two bits in Windows are reserved for tagging such special handles.

    Also in all versions you can't pass in an empty handle list, but a list with just a NULL handle works fine.

    See: https://github.com/rprichard/win32-console-docs/blob/master/src/HandleTests/CreateProcess_InheritList.cc

    I attached a version of the patch with a hack around those issues based on what I read, but I can't test that it actually fixes the issues since I don't have a Windows Vista or 7 system around.
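The two quirks described in this comment can be sketched as a small filter. This is only an illustration of the stated rules (tagged console handles have both low bits set; an empty list is rejected but a lone NULL handle is accepted); the function names are hypothetical and this is not the code from the attached patch.

```python
def is_legacy_console_handle(handle):
    """True if both low bits are set, the tag described for pre-Windows 8 console handles."""
    return (handle & 0b11) == 0b11


def sanitize_handle_list(handles):
    """Drop tagged console handles and avoid passing an empty list.

    Hypothetical helper following the rules stated in the comment above.
    """
    kept = [h for h in handles if not is_legacy_console_handle(h)]
    # An empty list is reported to be rejected; a single NULL (0) handle is not.
    return kept or [0]


print(sanitize_handle_list([0x3, 0x7, 0x44, 0x1F8]))
print(sanitize_handle_list([0x3]))
```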

    @segevfiner
    Mannequin

    segevfiner mannequin commented Apr 10, 2017

    It's been a while since this got any attention...

    @eryksun
    Contributor

    eryksun commented Apr 19, 2017

    In case you didn't get notified by Rietveld, I made a couple suggestions on your latest patch. Also, if you wouldn't mind, please update the patch to apply cleanly to 3.7 -- especially since STARTUPINFO now has an __init__ method.

    @segevfiner
    Mannequin

    segevfiner mannequin commented Apr 19, 2017

    Added the 4th version after review by eryksun (In rietveld).

    @segevfiner
    Mannequin

    segevfiner mannequin commented Apr 20, 2017

    Added the 5th version after another review by eryksun (In rietveld).

    @segevfiner
    Mannequin

    segevfiner mannequin commented Apr 20, 2017

    Oh LOL!!! I missed the fact that Python finally moved to GitHub!
    Rebased the patch on top of the Git master XD (And removed accidentally committed code... sorry...)

    I still submitted as a patch since I don't know if the infrastructure handles moving a patch to a PR well :P

    @segevfiner
    Mannequin

    segevfiner mannequin commented Apr 20, 2017

    OK Rietveld definitely punted on the git patch (I guess it's only for the old Mercurial repo, I don't think it actually even supports Git...)

    I will try re-submitting the patch as a PR so that it can be reviewed easily.

    @segevfiner
    Mannequin

    segevfiner mannequin commented May 22, 2017

    GitHub PR bit rotting away... :P

    Just a friendly reminder :)

    @gpshead
    Member

    gpshead commented May 30, 2017

    I am not a Windows person... but is there a reason that handle_list must be an attribute of a STARTUPINFO class rather than just a special case of pass_fds such that people could supply windows handles in the pass_fds parameter rather than using STARTUPINFO? (we can detect which is which based on type right? or are they both integers and thus indistinguishable?)

    The whole STARTUPINFO thing feels like we are exposing Windows internals here and not offering an abstract API that people could use to write portable code with the same behavior across OSes, without platform conditionals of their own.

    But maybe that is required here given how little I know of Windows? Food for thought.

    @eryksun
    Contributor

    eryksun commented May 31, 2017

    We can't reliably distinguish file descriptors from OS handles. They're overlapping sets of integers. A separate pass_handles parameter would be needed. The bigger problem with that idea is that the handles in the list have to be made inheritable before calling CreateProcess. Thus using pass_fds or pass_handles would have a race condition with concurrent CreateProcess calls that inherit all inheritable handles, such as Popen with close_fds=False, spawn*(), and system(). That's not consistent with how pass_fds works on Unix.

    This proposed change doesn't solve the race condition problem in general, but it solves the problem if only subprocess.Popen is used and child processes are limited to inheriting the STARTUPINFO standard handles and handle_list.
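The point about file descriptors and OS handles being overlapping sets of integers can be seen with a short sketch. `os_handle_for_fd` is a hypothetical helper; on Windows it relies on `msvcrt.get_osfhandle()`, and on other platforms it simply returns the descriptor.

```python
import os
import sys


def os_handle_for_fd(fd):
    """Return the OS-level identifier behind a file descriptor.

    Hypothetical helper: on Windows this is the handle wrapped by the CRT
    descriptor; elsewhere the descriptor itself is the OS-level value.
    """
    if sys.platform == "win32":
        import msvcrt
        return msvcrt.get_osfhandle(fd)
    return fd


r, w = os.pipe()
fd_value, handle_value = r, os_handle_for_fd(r)
os.close(r)
os.close(w)
# Both are plain ints, so a mixed list could not tell them apart by type.
print(type(fd_value).__name__, type(handle_value).__name__)
```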

    @segevfiner
    Mannequin

    segevfiner mannequin commented Dec 16, 2017

    The PR has been sitting there for quite a while now...

    @vstinner
    Member Author

    New changeset b2a6083 by Victor Stinner (Segev Finer) in branch 'master':
    bpo-19764: Implemented support for subprocess.Popen(close_fds=True) on Windows (#1218)
    b2a6083

    @vstinner
    Member Author

    Copy of my comment on the PR.

    #1218 (comment)

    > Merged from master... Again... Hopefully this won't end up missing 3.7 entirely... 😔

    Oops sorry, I wanted this feature but I didn't follow the PR closely.

    I don't know the Windows API well, so I didn't want to take the responsibility of reviewing (approving) such a PR. But I see that @zooba and @gpshead approved it, so I'm now comfortable merging it :-) Moreover, AppVeyor validated the PR, so let me merge it.

    I prefer to merge the PR right now to not miss the Python 3.7 feature freeze, and maybe fix issues later if needed, before 3.7 final.

    Thank you @segevfiner for this major subprocess enhancement. I really love to see close_fds default changing to True on Windows. It will help to fix many corner cases which are very tricky to debug.

    Sorry for the slow review, but subprocess is a critical module of Python, and we lack Windows developers to review changes specific to Windows.

    @vstinner
    Member Author

    Thank you Segev Finer for finishing the implementation of my PEP 446. Supporting inheritance of only a set of Windows handles was a "small note" in my PEP 446, mostly because I didn't feel able to implement the feature, but also because we still supported Windows versions which didn't implement this feature (PROC_THREAD_ATTRIBUTE_HANDLE_LIST) if I recall correctly.

    Thanks Eryk Sun, Gregory P. Smith and Steve Dower for the reviews and help on getting this nice feature into Python 3.7!

    @xflr6
    Mannequin

    xflr6 mannequin commented Jun 30, 2018

    AFAIU, this change broke the following usage of subprocess on Windows
    (re-using a subprocess.STARTUPINFO instance to hide the command window):

        import os, subprocess
    
        STARTUPINFO = subprocess.STARTUPINFO()
        STARTUPINFO.dwFlags |= subprocess.STARTF_USESHOWWINDOW
        STARTUPINFO.wShowWindow = subprocess.SW_HIDE
    
        # raises OSError: [WinError 87]
        # in the second loop iteration starting with Python 3.7
        for i in range(2):
            print(i)
            with open(os.devnull, 'w') as stderr:
                subprocess.check_call(['attrib'], stderr=stderr,
                                      startupinfo=STARTUPINFO)

    AFAICT, this works on Python 2.7, 3.4, 3.5, and 3.6

    @eryksun
    Contributor

    eryksun commented Jun 30, 2018

    Sebastian, the problem in this case is that startupinfo.lpAttributeList['handle_list'] contains the duplicated standard-handle values from the previous call, which were closed and are no longer valid. subprocess.Popen has always modified STARTUPINFO in place, including dwFlags, hStdInput, hStdOutput, hStdError, and wShowWindow. This update follows suit to also modify lpAttributeList in place.

    This issue is closed. Please create a new issue if you think Popen should use a deep copy of startupinfo instead, to allow callers to reuse a single STARTUPINFO instance. Or the new issue could propose only to document the existing behavior.
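Given the in-place modification described above, one possible workaround on the caller's side is to build a fresh STARTUPINFO for every call instead of sharing one instance. The sketch below assumes that approach; `make_hidden_startupinfo` is a hypothetical helper, and the loop only runs on Windows.

```python
import os
import subprocess
import sys


def make_hidden_startupinfo():
    """Return a new STARTUPINFO that hides the child's window (Windows only)."""
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    startupinfo.wShowWindow = subprocess.SW_HIDE
    return startupinfo


if sys.platform == "win32":
    for i in range(2):
        with open(os.devnull, "w") as stderr:
            # A new instance per call, so no stale handle_list is carried over.
            subprocess.check_call(["attrib"], stderr=stderr,
                                  startupinfo=make_hidden_startupinfo())
```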

    @xflr6
    Mannequin

    xflr6 mannequin commented Jul 4, 2018

    Thanks Eryk. Done: https://bugs.python.org/issue34044

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022