test_multiprocessing_spawn fails on AMD64 Windows8 3.x #80297

Closed
pablogsal opened this issue Feb 26, 2019 · 12 comments

Labels: 3.8 (only security fixes), OS-windows, tests (Tests in the Lib/test dir)

Comments

@pablogsal
Member

BPO 36116
Nosy @pfmoore, @pitrou, @vstinner, @tjguk, @ambv, @ericsnowcurrently, @zware, @eryksun, @zooba, @pablogsal

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2019-03-06.01:37:25.497>
created_at = <Date 2019-02-26.07:13:43.594>
labels = ['3.8', 'tests', 'OS-windows']
title = 'test_multiprocessing_spawn fails on AMD64 Windows8 3.x'
updated_at = <Date 2019-03-06.01:37:25.497>
user = 'https://github.com/pablogsal'

bugs.python.org fields:

activity = <Date 2019-03-06.01:37:25.497>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2019-03-06.01:37:25.497>
closer = 'vstinner'
components = ['Tests', 'Windows']
creation = <Date 2019-02-26.07:13:43.594>
creator = 'pablogsal'
dependencies = []
files = []
hgrepos = []
issue_num = 36116
keywords = []
message_count = 12.0
messages = ['336625', '336626', '336627', '336774', '336778', '336782', '336928', '337039', '337072', '337077', '337221', '337265']
nosy_count = 10.0
nosy_names = ['paul.moore', 'pitrou', 'vstinner', 'tim.golden', 'lukasz.langa', 'eric.snow', 'zach.ware', 'eryksun', 'steve.dower', 'pablogsal']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue36116'
versions = ['Python 3.8']

@pablogsal
Member Author

test test_multiprocessing_spawn failed
test_import (test.test_multiprocessing_spawn._TestImportStar) ... ok
======================================================================
FAIL: test_mymanager_context (test.test_multiprocessing_spawn.WithManagerTestMyManager)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\_test_multiprocessing.py", line 2747, in test_mymanager_context
    self.assertIn(manager._process.exitcode, (0, -signal.SIGTERM))
AssertionError: 3221225477 not found in (0, -15)

Ran 344 tests in 328.196s
FAILED (failures=1, skipped=40)
1 test failed again:
test_multiprocessing_spawn
== Tests result: FAILURE then FAILURE ==

https://buildbot.python.org/all/#/builders/32/builds/2204/steps/3/logs/stdio

@pablogsal added the 3.8, tests, and OS-windows labels on Feb 26, 2019
@pablogsal
Member Author

It seems that return code 3221225477 in Windows is:

#define STATUS_ACCESS_VIOLATION  ((NTSTATUS)0xC0000005L)

so this is a segfault in the manager.
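
For anyone decoding similar buildbot exit codes, here is a minimal sketch of the conversion. The helper name `describe_exit_code` and the small lookup table are illustrative assumptions, not part of the test suite; the two NTSTATUS values are the standard ones from ntstatus.h.

```python
# Two standard NTSTATUS values (ntstatus.h). The table is deliberately tiny;
# it is an illustration, not a complete mapping.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_code(exitcode: int) -> str:
    """Return a readable description of a process exit code (hypothetical helper)."""
    # Windows exit codes are unsigned 32-bit values; mask so that a negative
    # (signal-style) code is still rendered as 8 hex digits.
    code = exitcode & 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(code)
    if name is not None:
        return f"{name} (0x{code:08X})"
    return f"exit code {exitcode} (0x{code:08X})"


print(describe_exit_code(3221225477))  # prints: STATUS_ACCESS_VIOLATION (0xC0000005)
```

The decimal value in the assertion failure and the hex constant are the same number, which is why the test's expected tuple `(0, -15)` can never match it.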

@pablogsal
Member Author

Adding Łukasz, as I think this is a release blocker (the multiprocessing module may be causing segfaults on Windows 7 and 8).

@zooba
Member

zooba commented Feb 27, 2019

It's also possible that the child process is causing the segfault because of misconfiguration (e.g. broken environment variables).

And depending on the OS, abort() calls (via Py_FatalError) sometimes appear to be segfaults, so it could be any number of issues. (Aside - I'd love to replace the abort() calls with specific exit codes for configuration errors - they really mess up the crash data we see on Windows.)

I'll try some tests locally to see if this is reproducible, but if anyone can extract the original stdout/stderr from the buildbot, that would be helpful.

@eryksun
Contributor

eryksun commented Feb 27, 2019

> And depending on the OS, abort() calls (via Py_FatalError) sometimes
> appear to be segfaults, so it could be any number of issues.
> (Aside - I'd love to replace the abort() calls with specific exit
> codes for configuration errors - they really mess up the crash data
> we see on Windows.)

In particular, with the Universal CRT, an unhandled abort() is implemented by a __fastfail intrinsic [1] (int 0x29 instruction in x86) with the argument FAST_FAIL_FATAL_APP_EXIT (7).

Prior to Windows 8 this appears as an access violation. In Windows 8+ it's implemented as a second-chance STATUS_STACK_BUFFER_OVERRUN (0xC0000409) exception, which is overloaded from its previous use to support failure codes. (The old usage appears as the failure code FAST_FAIL_LEGACY_GS_VIOLATION, defined to be 0.) It starts as a second-chance exception in order to bypass normal exception handling (i.e. SEH, VEH, UnhandledExceptionFilter). The second-chance exception event is sent to an attached debugger and/or the session server (csrss.exe).

Python's normal signal handling for SIGABRT can't prevent this, since the C handler just sets a flag and returns. But enabling faulthandler sets a C signal handler that restores the previous handler and calls raise(SIGABRT). The default SIGABRT handler for the explicit raise() code path simply calls _exit(3).
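
A rough way to observe this difference is to run a child interpreter that calls `os.abort()`, once without and once with `-X faulthandler`, and compare the exit codes. This is only a sketch: the helper name `run_abort_child` is made up for illustration, and the codes you see depend on the OS and CRT (on POSIX the child is simply killed by SIGABRT in both cases).

```python
import subprocess
import sys


def run_abort_child(enable_faulthandler: bool) -> int:
    """Run a child interpreter that calls os.abort(); return its exit code."""
    cmd = [sys.executable]
    if enable_faulthandler:
        # Equivalent to calling faulthandler.enable() at startup.
        cmd += ["-X", "faulthandler"]
    cmd += ["-c", "import os; os.abort()"]
    # Discard the child's output; faulthandler dumps a traceback to stderr.
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return proc.returncode


if __name__ == "__main__":
    for enabled in (False, True):
        rc = run_abort_child(enabled)
        # Show the code in hex as well, since NTSTATUS values are usually
        # written that way.
        print(f"faulthandler={enabled}: exit code {rc} (0x{rc & 0xFFFFFFFF:08X})")
```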

Alternatively, we could prevent the __fastfail call via _set_abort_behavior [2], if implemented in msvcrt. For example: msvcrt.set_abort_behavior(0, msvcrt.CALL_REPORTFAULT).
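
Since `msvcrt` has no such function at the time of writing, the same idea can be sketched with ctypes. This is an assumption-laden illustration, not a proposed patch: it assumes a release build of Python that shares `ucrtbase.dll` with the interpreter (a debug build would use `ucrtbased.dll`), and the helper name `disable_abort_fault_report` is hypothetical. The flag values are the ones the UCRT headers define for `_WRITE_ABORT_MSG` and `_CALL_REPORTFAULT`.

```python
import ctypes
import sys

# Flag values as defined in the UCRT's <stdlib.h>.
WRITE_ABORT_MSG = 0x1
CALL_REPORTFAULT = 0x2


def disable_abort_fault_report():
    """Clear the CALL_REPORTFAULT abort flag (hypothetical helper).

    Returns the previous flags on Windows, or None on other platforms,
    where the function does nothing.
    """
    if sys.platform != "win32":
        return None
    # Assumption: the interpreter is a release build using ucrtbase.dll.
    ucrt = ctypes.CDLL("ucrtbase")
    set_abort_behavior = ucrt._set_abort_behavior
    set_abort_behavior.argtypes = (ctypes.c_uint, ctypes.c_uint)
    set_abort_behavior.restype = ctypes.c_uint
    # flags=0 with mask=CALL_REPORTFAULT clears only that bit.
    return set_abort_behavior(0, CALL_REPORTFAULT)


if __name__ == "__main__":
    print("previous abort flags:", disable_abort_fault_report())
```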

@vstinner
Member

vstinner commented Mar 1, 2019

https://buildbot.python.org/all/#/builders/32/builds/2219

FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager)
Re-running failed tests in verbose mode
Re-running test 'test_multiprocessing_spawn' in verbose mode
FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager)

@pablogsal
Member Author

See also https://bugs.python.org/issue36114

@vstinner
Member

vstinner commented Mar 4, 2019

> It's also possible that the child process is causing the segfault because of misconfiguration (e.g. broken environment variables).

Maybe, but the test also produces a core dump on FreeBSD: bpo-36114. It looks more like a real bug.

I'm setting the priority back to release blocker so that this regression isn't forgotten.

@vstinner
Member

vstinner commented Mar 4, 2019

test_mymanager and test_mymanager_context of test_multiprocessing_spawn.WithManagerTestMyManager failed in this build:

https://buildbot.python.org/all/#/builders/58/builds/1983/steps/3/logs/stdio

ERROR: test_multiprocessing (test.test_venv.BasicTest)
FAIL: test_async_gen_asyncio_gc_aclose_09 (test.test_asyncgen.AsyncGenAsyncioTest)
FAIL: test_daemon_threads_shutdown_stderr_deadlock (test.test_io.CMiscIOTest)
self.assertIn("Fatal Python error: could not acquire lock "
AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stderr>'> at interpreter shutdown, possibly due to daemon threads" not found in (...)
FAIL: test_daemon_threads_shutdown_stdout_deadlock (test.test_io.CMiscIOTest)
self.assertIn("Fatal Python error: could not acquire lock "
AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stdout>'> at interpreter shutdown, possibly due to daemon threads" not found in ''
Re-running failed tests in verbose mode
Re-running test 'test_venv' in verbose mode
ERROR: test_multiprocessing (test.test_venv.BasicTest)
Re-running test 'test_asyncgen' in verbose mode
Re-running test 'test_multiprocessing_spawn' in verbose mode
FAIL: test_mymanager (test.test_multiprocessing_spawn.WithManagerTestMyManager)
FAIL: test_mymanager_context (test.test_multiprocessing_spawn.WithManagerTestMyManager)
Re-running test 'test_io' in verbose mode
FAIL: test_daemon_threads_shutdown_stderr_deadlock (test.test_io.CMiscIOTest)
self.assertIn("Fatal Python error: could not acquire lock "
AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stderr>'> at interpreter shutdown, possibly due to daemon threads" not found in (...)
FAIL: test_daemon_threads_shutdown_stdout_deadlock (test.test_io.CMiscIOTest)
self.assertIn("Fatal Python error: could not acquire lock "
AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stdout>'> at interpreter shutdown, possibly due to daemon threads" not found in ''

@ericsnowcurrently
Member

This is resolved with #56368, no?

@vstinner
Member

vstinner commented Mar 6, 2019

> This is resolved with #56368, no?

I was waiting to see whether the buildbot workers would recover. They have, so I'm closing the issue.

@vstinner vstinner closed this as completed Mar 6, 2019
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022