classification
Title: test_notify_all() of test_multiprocessing_forkserver failed on Python on x86 Tiger 3.6 with "ValueError: cannot convert float NaN to integer"
Type: Stage: resolved
Components: Tests Versions: Python 3.6
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: mark.dickinson, vstinner
Priority: normal Keywords:

Created on 2017-11-23 08:32 by vstinner, last changed 2017-12-15 15:10 by vstinner. This issue is now closed.

Messages (7)
msg306787 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 08:32
test_notify_all() of test_multiprocessing_forkserver failed on Python on x86 Tiger 3.6 with "ValueError: cannot convert float NaN to integer":

http://buildbot.python.org/all/#/builders/46/builds/60

0:52:21 load avg: 1.46 [378/406/1] test_multiprocessing_forkserver crashed (Exit code 1)

Traceback (most recent call last):
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/forkserver.py", line 196, in main
    _serve_one(s, listener, alive_r, old_handlers)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/forkserver.py", line 231, in _serve_one
    code = spawn._main(child_r)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 114, in _main
    prepare(preparation_data)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 223, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 249, in _fixup_main_from_name
    alter_sys=True)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 201, in run_module
    mod_name, mod_spec, code = _get_module_details(mod_name)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 153, in _get_module_details
    code = loader.get_code(mod_name)
  File "<frozen importlib._bootstrap_external>", line 762, in get_code
ValueError: cannot convert float NaN to integer

Timeout (0:25:00)!
Thread 0xa000d000 (most recent call first):
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 379 in _recv
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 250 in recv
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/managers.py", line 757 in _callmethod
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/managers.py", line 987 in acquire
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/_test_multiprocessing.py", line 1044 in test_notify_all
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/case.py", line 605 in run
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/case.py", line 653 in __call__
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/runner.py", line 176 in run
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/support/__init__.py", line 1888 in _run_suite
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/support/__init__.py", line 1932 in run_unittest
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 172 in test_runner
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 173 in runtest_inner
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 127 in runtest
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest_mp.py", line 71 in run_tests_slave
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 516 in _main
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 509 in main
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 584 in main
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/regrtest.py", line 46 in _main
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/regrtest.py", line 50 in <module>
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 85 in _run_code
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 193 in _run_module_as_main


Moreover, when test_multiprocessing_forkserver was run a second time, a different test failed, test_forkserver_sigkill():


Re-running test 'test_multiprocessing_forkserver' in verbose mode
(...)
test_forkserver_sigint (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... ok
test_forkserver_sigkill (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... FAIL
test_lose_target_ref (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... ok
(...)
======================================================================
FAIL: test_forkserver_sigkill (test.test_multiprocessing_forkserver.WithProcessesTestProcess)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/_test_multiprocessing.py", line 496, in test_forkserver_sigkill
    self.check_forkserver_death(signal.SIGKILL)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/_test_multiprocessing.py", line 486, in check_forkserver_death
    self.assertTrue(evt.is_set())
AssertionError: False is not true
msg306788 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 08:44
"ValueError: cannot convert float NaN to integer": this error occurred on int(float), where the float comes from os.stat()

File "<frozen importlib._bootstrap_external>", line 762, in get_code

Lib/importlib/_bootstrap_external.py:

def get_code(self, fullname):
    ...
    st = self.path_stats(source_path)
    ...
    source_mtime = int(st['mtime']) <~~~~~~ HERE
    ...

def path_stats(self, path):
    st = _path_stat(path)
    return {'mtime': st.st_mtime, 'size': st.st_size}

def _path_stat(path):
    return _os.stat(path)

=> The float NaN comes from os.stat().st_mtime.


Implementation of os.stat().st_mtime in Modules/posixmodule.c:

static PyObject*
_pystat_fromstructstat(STRUCT_STAT *st)
{
    ...
#if defined(HAVE_STAT_TV_NSEC)
    ansec = st->st_atim.tv_nsec;
    mnsec = st->st_mtim.tv_nsec;
    cnsec = st->st_ctim.tv_nsec;
#elif defined(HAVE_STAT_TV_NSEC2)
    ansec = st->st_atimespec.tv_nsec;
    mnsec = st->st_mtimespec.tv_nsec;
    cnsec = st->st_ctimespec.tv_nsec;
#elif defined(HAVE_STAT_NSEC)
    ansec = st->st_atime_nsec;
    mnsec = st->st_mtime_nsec;
    cnsec = st->st_ctime_nsec;
#else
    ansec = mnsec = cnsec = 0;
#endif
    fill_time(v, 7, st->st_atime, ansec);
    fill_time(v, 8, st->st_mtime, mnsec);
    fill_time(v, 9, st->st_ctime, cnsec);
    ...
}

static void
fill_time(PyObject *v, int index, time_t sec, unsigned long nsec)
{
    ...
    float_s = PyFloat_FromDouble(sec + 1e-9*nsec);  <~~~~~~ HERE
    ...
    PyStructSequence_SET_ITEM(v, index+3, float_s);
    ...
}


Extract of the ./configure script output:

checking for tv_nsec in struct stat... no
checking for tv_nsec2 in struct stat... yes

=> configure defines HAVE_STAT_TV_NSEC2


We need sec and nsec values from fill_time() to debug this issue.
msg306789 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 08:48
It may also be compiler issue.

* CPU: Intel x86
* compiler: GCC 4.0.1 (Apple Computer, Inc. build 5367)
* CFLAGS: -fno-strict-aliasing -Wsign-compare -g -O0 -Wall -Wstrict-prototypes
* OS: macOS Tiger
* Kernel: Darwin Kernel Version 8.11.1: Wed Oct 10 18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386

Extract of pythoninfo:

os.uname: posix.uname_result(sysname='Darwin', nodename='osx', release='8.11.1', version='Darwin Kernel Version 8.11.1: Wed Oct 10 18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386', machine='i386')
platform.architecture: 32bit
platform.platform: Darwin-8.11.1-i386-32bit
platform.python_implementation: CPython
sys.version: 3.6.3+ (heads/3.6:ae3c5c7, Nov 22 2017, 20:39:07)  [GCC 4.0.1 (Apple Computer, Inc. build 5367)]
sysconfig[CFLAGS]: -fno-strict-aliasing -Wsign-compare -g -O0 -Wall -Wstrict-prototypes
msg306790 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 08:49
This issue is very similar, likely the same bug, than bpo-13986.
msg306792 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 08:57
Similar error: bpo-14028, "Using a 32-bit Python 2.6.5 on a Linux system", reported at 2012-02-16.
msg306793 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-23 09:24
Interesting info about NaN on Intel x86:

https://stackoverflow.com/questions/14021763/c-multiplication-or-addition-floats-results-nan

"The multiplication is being performed in the x87 registers, and a floating-point stack overflow has occurred due to (possibly unrelated) earlier operations in your program's execution. When the processor is in this failure state, all computations performed on the x87 registers produce NaN results."

--

For FreeBSD, Python main() starts with:

	/* 754 requires that FP exceptions run in "no stop" mode by default,
	 * and until C vendors implement C99's ways to control FP exceptions,
	 * Python requires non-stop mode.  Alas, some platforms enable FP
	 * exceptions by default.  Here we disable them.
	 */
#ifdef __FreeBSD__
	fedisableexcept(FE_OVERFLOW);
#endif
msg308400 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-12-15 15:10
I really have no clue on the root bug, and the bug only occurred once, so I choose to close it.
History
Date User Action Args
2017-12-15 15:10:40vstinnersetstatus: open -> closed
resolution: out of date
messages: + msg308400

stage: resolved
2017-11-23 09:24:15vstinnersetnosy: + mark.dickinson
2017-11-23 09:24:09vstinnersetmessages: + msg306793
2017-11-23 08:57:27vstinnersetmessages: + msg306792
2017-11-23 08:49:38vstinnersetmessages: + msg306790
2017-11-23 08:48:51vstinnersetmessages: + msg306789
2017-11-23 08:44:11vstinnersetmessages: + msg306788
2017-11-23 08:32:46vstinnercreate