classification
Title: AMD64 FreeBSD Non-Debug 3.x: out of swap space (test process killed by signal 9)
Type:
Stage: resolved
Components: Tests
Versions: Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: koobs
Nosy List: koobs, pablogsal, vstinner
Priority: normal
Keywords:

Created on 2020-01-13 14:18 by vstinner, last changed 2020-03-25 21:34 by vstinner.

Messages (12)
msg359904 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-13 14:18
https://buildbot.python.org/all/#/builders/214/builds/152

...
0:08:21 load avg: 3.66 [240/420] test_wait3 passed -- running: test_multiprocessing_forkserver (1 min 51 sec)
0:08:22 load avg: 3.66 [241/420] test_uuid passed -- running: test_multiprocessing_forkserver (1 min 53 sec)
0:08:25 load avg: 3.53 [242/420] test_tuple passed -- running: test_multiprocessing_forkserver (1 min 55 sec)
0:08:32 load avg: 3.56 [243/420] test___all__ passed -- running: test_multiprocessing_forkserver (2 min 3 sec)
*** Signal 9
Stop.
make: stopped in /usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build
program finished with exit code 1
elapsedTime=519.823452
msg359905 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-13 14:20
It seems like Signal 9 is SIGKILL.
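
The mapping can be checked from Python itself. A minimal sketch, assuming a POSIX platform (FreeBSD, Linux, macOS) where signal number 9 is defined:

```python
import signal

# Look up the symbolic name of signal number 9.
# On POSIX platforms this is SIGKILL, which cannot be caught or ignored.
sig = signal.Signals(9)
print(sig.name)
```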
msg359907 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-13 14:25
Same error on https://buildbot.python.org/all/#builders/214/builds/148
msg359908 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-13 14:25
Same error https://buildbot.python.org/all/#builders/214/builds/138
msg359943 - Author: Kubilay Kocak (koobs) (Python triager) Date: 2020-01-14 03:29
Identified a kernel/userland mismatch which may have caused this. Have restarted the server and worker, and will rebuild https://buildbot.python.org/all/#/builders/214/builds/152
msg359944 - Author: Kubilay Kocak (koobs) (Python triager) Date: 2020-01-14 03:55
Rebuilding now
msg359950 - Author: Kubilay Kocak (koobs) (Python triager) Date: 2020-01-14 07:48
Looks OK now: https://buildbot.python.org/all/#/builders/214

If it fails again in the same manner, please re-open
msg359953 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-14 07:51
> Identified a kernel/userland mismatch which may have caused this. Have restarted the server and worker, and will rebuild https://buildbot.python.org/all/#/builders/214/builds/152

Aha, interesting bug. Thanks for fixing it ;-)
msg363733 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-09 14:20
> If it fails again in the same manner, please re-open

The issue is back. Two examples.

--

Today: https://buildbot.python.org/all/#/builders/214/builds/405

(...)
0:15:40 load avg: 3.11 [366/420] test_pickletools passed -- running: test_multiprocessing_forkserver (2 min 43 sec), test_multiprocessing_fork (1 min 17 sec)
0:15:40 load avg: 3.11 [367/420] test_webbrowser passed -- running: test_multiprocessing_forkserver (2 min 43 sec), test_multiprocessing_fork (1 min 18 sec)
0:15:42 load avg: 3.11 [368/420] test_codecmaps_hk passed -- running: test_multiprocessing_forkserver (2 min 45 sec), test_multiprocessing_fork (1 min 19 sec)
fetching http://www.pythontest.net/unicode/BIG5HKSCS-2004.TXT ...
0:15:43 load avg: 3.11 [369/420] test_pprint passed -- running: test_multiprocessing_forkserver (2 min 45 sec), test_multiprocessing_fork (1 min 20 sec)
*** Signal 9

--

1 day ago: https://buildbot.python.org/all/#/builders/214/builds/395

(...)
0:14:53 load avg: 3.29 [269/420/1] test_keywordonlyarg passed -- running: test_multiprocessing_forkserver (2 min 25 sec)
0:15:00 load avg: 2.94 [270/420/1] test_pprint passed -- running: test_multiprocessing_forkserver (2 min 31 sec)
0:15:00 load avg: 2.94 [271/420/2] test_io crashed (Exit code -9) -- running: test_multiprocessing_forkserver (2 min 31 sec)
0:15:05 load avg: 2.87 [272/420/2] test_positional_only_arg passed -- running: test_multiprocessing_forkserver (2 min 37 sec)
*** Signal 9
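
The "Exit code -9" reported for test_io matches the subprocess convention that a return code of -N means the child was terminated by signal N. A minimal sketch, assuming a POSIX platform; the sleeping child is only a stand-in for a test worker and is not how regrtest actually spawns its workers:

```python
import signal
import subprocess
import sys

# Start a child Python process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Send SIGKILL; the child cannot catch or ignore it.
child.send_signal(signal.SIGKILL)
child.wait()

# subprocess reports death by signal as a negative return code.
print(child.returncode)
killed_by = signal.Signals(-child.returncode).name
print(killed_by)
```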
msg363877 - Author: Kubilay Kocak (koobs) (Python triager) Date: 2020-03-11 02:12
Investigating
msg364627 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-19 18:46
The bug still occurs from time to time. AMD64 FreeBSD Non-Debug 3.x:
https://buildbot.python.org/all/#/builders/214/builds/475
msg365024 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-25 21:34
New failure: https://buildbot.python.org/all/#/builders/214/builds/512

test.pythoninfo says:

datetime.datetime.now: 2020-03-25 18:59:08.424147
socket.hostname: 121-RELEASE-p2-amd64

/var/log/messages says:

Mar 25 18:41:13 121-RELEASE-p2-amd64 kernel: pid 65447 (python), jid 0, uid 1002, was killed: out of swap space
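
Kill messages like the one above can be picked out of a log mechanically. A minimal sketch, assuming the exact message format quoted in this report (other FreeBSD versions may word it differently); the regular expression and the `parse_oom_kill` helper are hypothetical and only illustrate the idea:

```python
import re

# Pattern modelled on the single log line quoted above (an assumption).
OOM_RE = re.compile(
    r"kernel: pid (?P<pid>\d+) \((?P<comm>[^)]+)\), jid (?P<jid>\d+), "
    r"uid (?P<uid>\d+), was killed: (?P<reason>.+)$"
)

def parse_oom_kill(line):
    """Return a dict describing the kill, or None if the line does not match."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("pid", "jid", "uid"):
        info[key] = int(info[key])
    return info

line = ("Mar 25 18:41:13 121-RELEASE-p2-amd64 kernel: pid 65447 (python), "
        "jid 0, uid 1002, was killed: out of swap space")
print(parse_oom_kill(line))
```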

121-RELEASE-p2-amd64% sysctl hw | egrep 'hw.(phys|user|real)'
hw.physmem: 1033416704
hw.usermem: 745279488
hw.realmem: 1073676288

=> 985.5 MB of memory

121-RELEASE-p2-amd64% sysctl vm|grep swap      
vm.swap_enabled: 1
vm.domain.0.stats.unswappable: 0
vm.swap_idle_threshold2: 10
vm.swap_idle_threshold1: 2
vm.swap_idle_enabled: 0
vm.disable_swapspace_pageouts: 0
vm.stats.vm.v_swappgsout: 5793651
vm.stats.vm.v_swappgsin: 3322252
vm.stats.vm.v_swapout: 1390626
vm.stats.vm.v_swapin: 875591
vm.nswapdev: 1
vm.swap_fragmentation: 
vm.swap_async_max: 4
vm.swap_maxpages: 1964112
vm.swap_total: 4294864896
vm.swap_reserved: 7942307840

=> 4095.9 MB of swap (total)
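
The two "=>" figures above are byte counts divided by 1024 * 1024. A small sketch reproducing them from the sysctl values quoted in this message; the `to_mib` helper is hypothetical, and vm.swap_reserved is converted the same way only for comparison:

```python
MIB = 1024 * 1024

# Values copied from the sysctl output above.
physmem = 1033416704        # hw.physmem
swap_total = 4294864896     # vm.swap_total
swap_reserved = 7942307840  # vm.swap_reserved

def to_mib(nbytes):
    """Convert a byte count to MiB, rounded to one decimal place."""
    return round(nbytes / MIB, 1)

print(to_mib(physmem))        # 985.5
print(to_mib(swap_total))     # 4095.9
print(to_mib(swap_reserved))  # 7574.4
```

With these inputs vm.swap_reserved comes out larger than vm.swap_total; the report does not say how that figure should be interpreted.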

121-RELEASE-p2-amd64% swapinfo -h
Device          1K-blocks     Used    Avail Capacity
/dev/da0p3        4194204      70M     3.9G     2%

121-RELEASE-p2-amd64% swapinfo 
Device          1K-blocks     Used    Avail Capacity
/dev/da0p3        4194204    72164  4122040     2%
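
The "2%" capacity figure can be recomputed from the block counts. A minimal sketch, assuming the column layout shown above with sizes in 1 KiB blocks; `parse_swapinfo` is a hypothetical helper, not part of any FreeBSD or Python tool:

```python
# Output of `swapinfo` copied from the report above.
SWAPINFO = """\
Device          1K-blocks     Used    Avail Capacity
/dev/da0p3        4194204    72164  4122040     2%
"""

def parse_swapinfo(text):
    """Parse plain `swapinfo` output into a list of per-device dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        device, blocks, used, avail, _capacity = line.split()
        rows.append({
            "device": device,
            "total_kib": int(blocks),
            "used_kib": int(used),
            "avail_kib": int(avail),
        })
    return rows

for row in parse_swapinfo(SWAPINFO):
    used_pct = 100 * row["used_kib"] / row["total_kib"]
    print(f"{row['device']}: {row['used_kib'] / 1024:.0f} MiB used "
          f"({used_pct:.1f}% of {row['total_kib'] / 1024**2:.1f} GiB)")
```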
History
Date                 User      Action  Args
2020-03-25 21:34:50  vstinner  set     messages: + msg365024
                                       title: AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by SIGKILL (Signal 9) -> AMD64 FreeBSD Non-Debug 3.x: out of swap space (test process killed by signal 9)
2020-03-19 18:46:20  vstinner  set     messages: + msg364627
2020-03-11 02:12:24  koobs     set     messages: + msg363877
2020-03-09 14:20:01  vstinner  set     status: closed -> open
                                       resolution: fixed ->
                                       messages: + msg363733
2020-01-14 07:51:34  vstinner  set     status: open -> closed
                                       messages: + msg359953
                                       stage: resolved
2020-01-14 07:48:03  koobs     set     assignee: koobs
                                       resolution: fixed
                                       messages: + msg359950
2020-01-14 03:55:18  koobs     set     messages: + msg359944
2020-01-14 03:29:03  koobs     set     messages: + msg359943
2020-01-13 14:25:47  vstinner  set     nosy: + koobs
2020-01-13 14:25:42  vstinner  set     messages: + msg359908
2020-01-13 14:25:15  vstinner  set     messages: + msg359907
2020-01-13 14:20:32  vstinner  set     messages: + msg359905
                                       title: AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by Signal 9 -> AMD64 FreeBSD Non-Debug 3.x: main regrtest process killed by SIGKILL (Signal 9)
2020-01-13 14:18:07  vstinner  create