classification
Title: test_multiprocessing failure under Windows
Type: crash Stage: resolved
Components: Library (Lib), Tests Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: georg.brandl Nosy List: Gabi.Davar, Kristian.Vlaardingerbroek, asksol, brett.cannon, brian.curtin, georg.brandl, jnoller, michael.foord, ncoghlan, pitrou, python-dev, terry.reedy, vstinner
Priority: critical Keywords: patch

Created on 2011-01-06 10:26 by pitrou, last changed 2015-11-19 02:59 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
issue10845_mitigation.diff ncoghlan, 2011-01-29 14:42 Skip rerunning main module if it is actually a __main__.py file review
Messages (21)
msg125547 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-06 10:26
This is under a Windows 7 VM. I don't understand what the assertion means:

Z:\>__svn__\PCbuild\amd64\python_d.exe -m test -v test_multiprocessing
== CPython 3.2b2+ (py3k, Jan 6 2011, 10:56:48) [MSC v.1500 64 bit (AMD64)]
==   Windows-7-6.1.7600 little-endian
==   Z:\__svn__\build\test_python_1360
Testing with flags: sys.flags(debug=0, division_warning=0, inspect=0, interactiv
e=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_enviro
nment=0, verbose=0, bytes_warning=0, quiet=0)
[1/1] test_multiprocessing
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 369, in main
    prepare(preparation_data)
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 477, in prepare
    assert main_name not in sys.modules, main_name
AssertionError: __main__
[49434 refs]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 369, in main
    prepare(preparation_data)
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 477, in prepare
    assert main_name not in sys.modules, main_name
AssertionError: __main__
[49434 refs]
test test_multiprocessing crashed -- <class 'EOFError'>:
Traceback (most recent call last):
  File "Z:\__svn__\lib\test\regrtest.py", line 969, in runtest_inner
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 369, in main
    indirect_test()
  File "Z:\__svn__\lib\test\test_multiprocessing.py", line 2102, in test_main
    prepare(preparation_data)
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 477, in prepare
    assert main_name not in sys.modules, main_name
AssertionError: __main__
    ManagerMixin.manager.start()
  File "/home/antoine/py3k/__svn__/Lib/multiprocessing/managers.py", line 531, i
n start
[49434 refs]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 369, in main
    self._address = reader.recv()
EOFError
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    prepare(preparation_data)
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 477, in prepare
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 369, in main
1 test failed:
    assert main_name not in sys.modules, main_name
AssertionError: __main__
    prepare(preparation_data)
  File "Z:\__svn__\lib\multiprocessing\forking.py", line 477, in prepare
[49434 refs]
    assert main_name not in sys.modules, main_name
AssertionError: __main__
[49434 refs]
    test_multiprocessing
Traceback (most recent call last):
  File "Z:\__svn__\lib\test\support.py", line 468, in temp_cwd
    yield os.getcwd()
  File "Z:\__svn__\lib\test\__main__.py", line 13, in <module>
    regrtest.main()
  File "Z:\__svn__\lib\test\regrtest.py", line 708, in main
    sys.exit(len(bad) > 0 or interrupted)
SystemExit: True
msg125548 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-06 10:35
Actually, it only happens if I use "-m test". If I use "-m test.regrtest" or "Lib/test/regrtest.py" instead, it works fine.
msg125549 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-06 10:36
(I guess this means that multiprocessing under Windows is not compatible with execution of a package through a __main__.py file...)
msg125620 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-07 04:40
I'm curious as to the results you get running "python -m test.__main__" and "python Lib/test/__main__.py" as well. I suspect both will fail.

Looking at the forking.py code, I think multiprocessing makes some assumptions that are flat out invalid in the presence of the -m switch - there are no guarantees that you can reverse engineer the name of an arbitrary module from the __file__ attribute. (This actually ties in with the current thread on python-ideas about making the real name of the main module readily available - then the parent process could just pass that through the pipe as a "main_name" value and this problem would go away)

In the short term, however, I think multiprocessing just needs to either special case the situation where "main_name" actually *is* __main__ or just drop the assertion across the board.
msg125628 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-01-07 08:03
On xp, changing from -m test to -m test.regrtest removed the extra craziness during and after the test run that I reported on pydev.

I think making at least a tempory fix to whatever -m test runs should be a release blocker so only individual tests and not the test process are broken. If necessary, just refuse to run under windows and say to use test.regrtest as before. (I believe I did use -m test.regrtest on 3.1 and 2.7 as -m test is not implemented for those.)
msg127311 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-28 15:09
There isn't really much "-m test" can do to flag this - multiprocessing is making assumptions about the meaning of __file__ that may be flat out invalid when -m is used to execute the main module.

Fixing that properly is going to require a PEP so the interpreter preserves the information that multiprocessing needs in order to spawn the child process correctly on Windows. (I already have that on my personal todo list for 3.3)

I'm not sure what to do for 3.2. We could comment out the assert, since that will be slightly less broken than the current total failure (it will still be slightly broken, though).
msg127413 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-29 13:14
Nick, a more problematic issue is that __main__ is always called __main__, regardless of whether it is actually imported as the real "main module" or through a regular import. This means that it is impossible to discriminate between both uses by using "if __name__ == '__main__'", which in turn means that top-level code will always get executed as a side-effect of importing, which means the "__main__.py" feature is completely broken for use with multiprocessing under Windows!

This also shows, IMO, how uselessly complicated and misleading the import system has become.
msg127414 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-29 13:29
Actually, forget that (!). The following patch seems to work:

Index: Lib/multiprocessing/forking.py
===================================================================
--- Lib/multiprocessing/forking.py	(révision 88224)
+++ Lib/multiprocessing/forking.py	(copie de travail)
@@ -474,7 +474,8 @@
             else:
                 dirs = [os.path.dirname(main_path)]
 
-            assert main_name not in sys.modules, main_name
+            if main_name != '__main__':
+                assert main_name not in sys.modules, main_name
             file, path_name, etc = imp.find_module(main_name, dirs)
             try:
                 # We would like to do "imp.load_module('__main__', ...)"
Index: Lib/test/__main__.py
===================================================================
--- Lib/test/__main__.py	(révision 88224)
+++ Lib/test/__main__.py	(copie de travail)
@@ -1,13 +1,16 @@
 from test import regrtest, support
 
+# Issue #10845: avoid executing toplevel code if imported by multiprocessing
+# (which is smart enough to import the module under another name).
+if __name__ == "__main__":
 
-TEMPDIR, TESTCWD = regrtest._make_temp_dir_for_build(regrtest.TEMPDIR)
-regrtest.TEMPDIR = TEMPDIR
-regrtest.TESTCWD = TESTCWD
+    TEMPDIR, TESTCWD = regrtest._make_temp_dir_for_build(regrtest.TEMPDIR)
+    regrtest.TEMPDIR = TEMPDIR
+    regrtest.TESTCWD = TESTCWD
 
-# Run the tests in a context manager that temporary changes the CWD to a
-# temporary and writable directory. If it's not possible to create or
-# change the CWD, the original CWD will be used. The original CWD is
-# available from support.SAVEDCWD.
-with support.temp_cwd(TESTCWD, quiet=True):
-    regrtest.main()
+    # Run the tests in a context manager that temporary changes the CWD to a
+    # temporary and writable directory. If it's not possible to create or
+    # change the CWD, the original CWD will be used. The original CWD is
+    # available from support.SAVEDCWD.
+    with support.temp_cwd(TESTCWD, quiet=True):
+        regrtest.main()
msg127415 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-01-29 13:31
I'm fine with that tweak antoine
msg127419 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-29 14:40
"Useless" is a rather strong word. Every change we've made to the import system (or, more accurately, the main module identification system) has been backed up with decent use cases - nearly all of the subsequent problems can be traced back to the information that gets thrown away due to the backwards compatibility concerns that forced the main module to always be called "__main__" without any record of its real name.

In the current case, it is the 3-way combination of that behaviour with the lack of real fork() functionality on Windows and an invalid assumption in multiprocessing as to the relationship between __file__ and __name__ that is causing problems.

Back on topic, Antoine's tweak looks very similar to the solution I was working on (although I hadn't made it as far as figuring out the need to change test/__main__.py to skip the execution of the top level code).

Seeing the need for that change, I think the right answer is actually to skip the find_module/load_module call completely when the module is a file called __main__.py. Such modules are implicitly part of a "if __name__ == '__main__'" clause, as that's their whole reason for existence.

The attached patch works along those lines, and I tested it by making the substantive change (the additional if clause) to an installed copy of 3.2rc1 on my Windows gaming rig.
msg127421 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-29 14:42
(Replaced patch file to fix a typo in a comment)
msg127422 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-29 14:48
Georg, it would be very good to have this fix in for RC2. Otherwise any use of multiprocessing in conjunction with zipfile, package or directory execution will fail on Windows in 3.2.
msg127423 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-29 14:52
With issue10845_mitigation.diff applied, Antoine's patch is no longer necessary?
msg127424 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-29 14:53
> (Replaced patch file to fix a typo in a comment)

Does that patch still work if the objects marshalled between parent and
child refer to classes living in the __main__.py module?
msg127426 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-29 15:02
__main__.py files aren't like ordinary modules - they should *not* be defining classes or anything like that. Instead, they should be treated as if the entire file was implicitly inside an "if __name__ == '__main__':" clause - when "imported" they don't do anything.

That's why Antoine's patch set off alarm bells in my head - no normal __main__.py will include an explicit "if __name__ == '__main__':" check, so multiprocessing should just skip the file completely instead.

There's no doubt that either approach is a hack that will sometimes fail to work, but I think skipping the file execution completely will be the right decision most of the time (and we can easily say "don't do that then" if people rely on class definitions in __main__.py files).
msg127427 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-29 15:04
To answer Antoine's question directly:

With the patch, it works just as well as multiprocessing on Windows currently does if the objects marshalled between parent and
child refer to classes defined inside a "if __name__ == '__main__':" clause in the main module (i.e. it doesn't, but that isn't really multiprocessing's fault)
msg127445 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-29 17:20
Looks like Antoine agreed, so this should be fine to go in.
msg127503 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-30 01:25
Committed for 3.2 in r88247.
msg139043 - (view) Author: Kristian Vlaardingerbroek (Kristian.Vlaardingerbroek) Date: 2011-06-25 10:44
This has been committed a while ago, can this issue be closed?
msg139057 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-06-25 13:07
Yeah, the broader issue with multiprocessing, Windows and state held in the main module is covered by PEP 395, so there's no need to double up by keeping this open.
msg254874 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-11-19 02:59
New changeset 5d88c1d413b9 by Nick Coghlan in branch '2.7':
Close #10128: don't rerun __main__.py in multiprocessing
https://hg.python.org/cpython/rev/5d88c1d413b9
History
Date User Action Args
2015-11-19 02:59:57python-devsetnosy: + python-dev
messages: + msg254874
2015-07-18 15:17:22Gabi.Davarsetnosy: + Gabi.Davar
2011-06-25 13:07:15ncoghlansetstatus: open -> closed
resolution: fixed
messages: + msg139057

stage: resolved
2011-06-25 10:44:45Kristian.Vlaardingerbroeksetnosy: + Kristian.Vlaardingerbroek
messages: + msg139043
2011-01-30 08:13:34georg.brandlsetpriority: release blocker -> critical
nosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
2011-01-30 01:25:20ncoghlansetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127503
2011-01-29 17:20:00georg.brandlsetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127445
2011-01-29 15:04:55ncoghlansetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127427
2011-01-29 15:02:31ncoghlansetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127426
2011-01-29 14:53:47pitrousetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127424
2011-01-29 14:52:59georg.brandlsetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127423
2011-01-29 14:48:35ncoghlansetpriority: critical -> release blocker

messages: + msg127422
assignee: georg.brandl
nosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
2011-01-29 14:42:57ncoghlansetfiles: - issue10845_mitigation.diff
nosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
2011-01-29 14:42:49ncoghlansetfiles: + issue10845_mitigation.diff
nosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127421
2011-01-29 14:40:15ncoghlansetfiles: + issue10845_mitigation.diff

messages: + msg127419
keywords: + patch
nosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
2011-01-29 13:31:29jnollersetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127415
2011-01-29 13:29:32pitrousetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127414
2011-01-29 13:14:50pitrousetnosy: brett.cannon, georg.brandl, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg127413
2011-01-28 15:09:17ncoghlansetnosy: + georg.brandl
messages: + msg127311
2011-01-20 21:00:19brett.cannonsetpriority: normal -> critical
nosy: brett.cannon, terry.reedy, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
2011-01-07 08:03:36terry.reedysetnosy: + terry.reedy
messages: + msg125628
2011-01-07 04:40:08ncoghlansetnosy: brett.cannon, ncoghlan, pitrou, vstinner, jnoller, michael.foord, brian.curtin, asksol
messages: + msg125620
2011-01-06 22:38:51brett.cannonsetnosy: + brett.cannon
2011-01-06 10:48:57vstinnersetnosy: + vstinner
2011-01-06 10:36:14pitrousetnosy: ncoghlan, pitrou, jnoller, michael.foord, brian.curtin, asksol
messages: + msg125549
2011-01-06 10:35:06pitrousetnosy: + ncoghlan, michael.foord
messages: + msg125548
2011-01-06 10:28:05pitrousetnosy: + brian.curtin
2011-01-06 10:26:17pitroucreate