
classification
Title: SIGBUS in test_big_buffer() of test_zlib on Debian bigmem buildbot
Type: crash
Stage: resolved
Components:
Versions:

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: nadeem.vawda
Nosy List: loewis, nadeem.vawda, neologix, pitrou, python-dev, torsten, vstinner
Priority: normal
Keywords:

Created on 2012-01-26 13:16 by nadeem.vawda, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (12)
msg152006 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-01-26 13:16
http://www.python.org/dev/buildbot/all/builders/AMD64%20debian%20bigmem%203.x/builds/58/steps/test/logs/stdio
msg152047 - (view) Author: Torsten Landschoff (torsten) * Date: 2012-01-26 23:30
I tried to reproduce this crash on my desktop system.
The machine is AMD64 with only 8 GB of RAM, running Debian unstable as of today.
Testing the exact same Python version (hg update d2cf8a34ddf90fb1bc8938de0f736694e61f73fa) the test passes just fine here...
msg152074 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-01-27 09:19
I've also been unable to reproduce it on my own machine (AMD64; 8GB RAM).

I guess I'll have to do some trial-and-error debugging using the custom
builder to figure this out.
msg153989 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-02-22 21:29
The same test found a bug in the Mac OS X kernel: issue #11277.

I'm unable to reproduce the crash on Fedora 16 (with 12 GB of RAM). It may depend on zlib version or the kernel version. I'm running Linux 3.2.6-3.fc16.x86_64 with zlib 1.2.5-6.fc16.
msg153990 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-02-22 21:29
A recent crash:

[241/364/1] test_zlib
Fatal Python error: Bus error

Current thread 0x00002b8f2240d260:
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/test_zlib.py", line 96 in test_big_buffer
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 385 in _executeTestPart
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 440 in run
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/case.py", line 492 in __call__
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 105 in run
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 105 in run
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/unittest/runner.py", line 168 in run
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1369 in _run_suite
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/support.py", line 1403 in run_unittest
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/test_zlib.py", line 666 in test_main
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 1221 in runtest_inner
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 907 in runtest
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/regrtest.py", line 710 in main
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/__main__.py", line 13 in <module>
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/runpy.py", line 73 in _run_code
  File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/runpy.py", line 160 in _run_module_as_main
make: *** [buildbottest] Bus error

http://www.python.org/dev/buildbot/all/builders/AMD64%20debian%20bigmem%203.x/builds/136/steps/test/logs/stdio
msg154054 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-02-23 08:44
"""
File "/var/tmpfs/martin.vonloewis/3.x.loewis-parallel2/build/Lib/test/test_zlib.py", line 96 in test_big_buffer
"""

The SIGBUS could be due to the buildbot running out of tmpfs.
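
One way to check this hypothesis is to compare the free space reported for the filesystem with the size of the file about to be mapped. The following is only a minimal sketch, assuming Python 3.3+ for `shutil.disk_usage`; the helper name `has_room` and the 4 GiB figure are illustrative assumptions and are not taken from test_zlib:

```python
import shutil
import tempfile


def has_room(directory, needed_bytes):
    """Return True if the filesystem holding `directory` reports at least
    `needed_bytes` of free space.

    Hypothetical helper for illustration; not part of test_zlib.
    """
    return shutil.disk_usage(directory).free >= needed_bytes


if __name__ == "__main__":
    # 4 GiB is an illustrative size, not a value quoted from the test.
    print(has_room(tempfile.gettempdir(), 4 * 1024 ** 3))
```

If this prints False for the directory the test writes to, a sparse file of that size can be created there but cannot be fully backed by pages once it is read through a mapping.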
msg154060 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-23 11:36
New changeset f3f3bb45205b by Nadeem Vawda in branch 'default':
Issue #13873: Fix crash in test_zlib on bigmem buildbot.
http://hg.python.org/cpython/rev/f3f3bb45205b
msg154062 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-02-23 12:44
> The SIGBUS could be due to the buildbot running out of tmpfs.

I haven't been able to reproduce the crash by running the test on a tmpfs
on my own machine (Ubuntu AMD64; 8GB RAM; Linux 3.0.0-15-generic; zlib
1:1.2.3.4.dfsg-3ubuntu3), but maybe it's due to something specific about
the configuration of the buildbot machine?

I've temporarily changed the test to use a regular chunk of memory
instead of the mmap hack. If possible, I'd like to change back to the
old technique in the long run (since it allows the test to run on
machines with less RAM), but until we can figure out the problem, I'd
rather not have the test failing needlessly.
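
The two techniques mentioned here can be contrasted at a small scale. This is a rough sketch under stated assumptions, not the actual test code: the function names are hypothetical, and the 1 MiB size is chosen so that it runs anywhere, whereas the real test needs a far larger buffer.

```python
import mmap
import tempfile
import zlib

SIZE = 1024 * 1024  # 1 MiB for illustration only


def crc_of_plain_buffer(size):
    """Checksum a regular in-memory buffer of `size` zero bytes."""
    return zlib.crc32(bytes(size))


def crc_of_sparse_mmap(size):
    """Checksum `size` zero bytes exposed through an mmap of a file that
    was extended by seeking, so most of it is a hole where the
    filesystem supports sparse files."""
    with tempfile.TemporaryFile() as f:
        f.seek(size - 1)
        f.write(b"\0")
        f.flush()
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return zlib.crc32(m)


if __name__ == "__main__":
    # Both buffers hold the same bytes, so the checksums match.
    print(crc_of_plain_buffer(SIZE) == crc_of_sparse_mmap(SIZE))
```

The plain buffer needs the full amount of RAM up front. The mapped file defers that cost to the filesystem, which is why it can run on machines with less RAM and also why it depends on how the filesystem backs the hole.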
msg154080 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-02-23 19:17
> but maybe it's due to something specific about the configuration of the buildbot
> machine?

Maybe you didn't try with a large enough file. Here's a trial on my box:
"""
$ df -h /tmp/
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           253M   68K  253M   1% /tmp
$ cat /tmp/test.py
import mmap
import zlib

f = open('/tmp/foo', 'wb+')
f.seek(512 * 1024 * 1024)
f.write(b'x')
f.flush()
m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
zlib.crc32(m)
m.close()
$ python /tmp/test.py
Bus error (core dumped)
"""

> I've temporarily changed the test to use a regular chunk of memory
> instead of the mmap hack. If possible, I'd like to change back to the
> old technique in the long run (since it allows the test to run on
> machines with less RAM)

Yes, but this kind of test is only supposed to be run on machines
which have enough memory.
Also, if the filesystem doesn't support sparse files, this writes a
lot to the disk (and if it crashes, you end up with a huge file).
I'd say it's fine this way...
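
Whether a given filesystem stores such a file sparsely can be checked by comparing its apparent size with the space actually allocated for it. A minimal sketch, assuming a POSIX system where `st_blocks` is available and counts 512-byte units; the helper name `sparse_file_sizes` is hypothetical:

```python
import os
import tempfile


def sparse_file_sizes(directory, logical_size):
    """Create a file in `directory` with a single byte written at offset
    logical_size - 1, then return (apparent_size, allocated_bytes).

    Hypothetical helper for illustration; relies on st_blocks (POSIX).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.seek(logical_size - 1)
            f.write(b"x")
        st = os.stat(path)
        # st_blocks is the number of 512-byte blocks allocated to the file.
        return st.st_size, st.st_blocks * 512
    finally:
        # Always remove the file so that a failure does not leave it behind.
        os.unlink(path)


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(tempfile.gettempdir(), 16 * 1024 * 1024)
    print("apparent:", apparent, "allocated:", allocated)
```

An allocated size far below the apparent size suggests the hole was not written out. Comparable values suggest the filesystem wrote the whole range to disk.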
msg154081 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-02-23 19:59
Well, it turns out that when I tested it on my own machine, I actually
wasn't using a tmpfs - I misread the output of df and used /tmp¹ instead
of /run. Doing the test in /run does in fact give a bus error. Mea culpa.

¹ Apparently on my system /tmp isn't a tmpfs. Go figure.
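
A mix-up like this can be avoided by looking up the filesystem type in the mount table rather than reading df output by eye. The sketch below assumes a Linux-style table as found in /proc/mounts (device, mount point, type, ...); the function name `fs_type_from_table` and the sample table are made up for illustration, and mount points containing escaped spaces are not handled:

```python
def fs_type_from_table(path, mounts_text):
    """Return the filesystem type of the longest mount point that
    contains the absolute `path`, or None if nothing matches.

    Hypothetical helper; `mounts_text` is text in /proc/mounts format.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        # Compare on whole path components so /run does not match /running.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


# Sample table invented for this example.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
tmpfs /run tmpfs rw 0 0
"""

if __name__ == "__main__":
    print(fs_type_from_table("/tmp/foo", SAMPLE_MOUNTS))
    print(fs_type_from_table("/run/foo", SAMPLE_MOUNTS))
```

On a real Linux system the second argument could be the contents of /proc/mounts, with the path resolved by `os.path.realpath` first.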


> Also, if the filesystem doesn't support sparse files, this writes a
> lot to the disk (and if it crashes, you end up with a huge file).

You may be right; I hadn't thought about that possibility. My concern was
that the test suite isn't run with -M very often, so these sorts of tests
could often be broken for a while before someone found out. In the past,
none of the buildbots ran bigmem tests, so there was a real danger that
we wouldn't notice breakages. However, with the addition of the Debian
bigmem buildbot, that is no longer the case, so this isn't such an issue
any more.

I'm okay with leaving the tests as they are in 3.3. Any objections?
If not, I'll also backport the change to 3.2 and 2.7.
msg154087 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-02-23 20:23
> I'm okay with leaving the tests as they are in 3.3. Any objections?

Nope, it's fine.
msg154418 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-26 22:54
New changeset fc43b051ae1c by Nadeem Vawda in branch '3.2':
Issue #13873: Fix crash in test_zlib when running on a small (<4GB) tmpfs.
http://hg.python.org/cpython/rev/fc43b051ae1c
History
Date                 User          Action  Args
2022-04-11 14:57:26  admin         set     github: 58081
2012-02-27 11:09:58  nadeem.vawda  set     status: open -> closed; resolution: fixed; stage: needs patch -> resolved
2012-02-26 22:54:13  python-dev    set     messages: + msg154418
2012-02-23 20:23:31  vstinner      set     messages: + msg154087
2012-02-23 19:59:54  nadeem.vawda  set     messages: + msg154081
2012-02-23 19:17:08  neologix      set     messages: + msg154080
2012-02-23 12:44:05  nadeem.vawda  set     messages: + msg154062
2012-02-23 11:36:23  python-dev    set     nosy: + python-dev; messages: + msg154060
2012-02-23 08:44:21  neologix      set     nosy: + neologix; messages: + msg154054
2012-02-22 21:29:47  vstinner      set     title: SIGBUS in test_zlib on Debian bigmem buildbot -> SIGBUS in test_big_buffer() of test_zlib on Debian bigmem buildbot
2012-02-22 21:29:29  vstinner      set     messages: + msg153990
2012-02-22 21:29:02  vstinner      set     messages: + msg153989
2012-02-22 21:28:13  nadeem.vawda  set     nosy: + loewis, pitrou, vstinner
2012-02-22 21:27:39  nadeem.vawda  link    issue14090 superseder
2012-01-27 09:19:02  nadeem.vawda  set     messages: + msg152074
2012-01-26 23:30:47  torsten       set     nosy: + torsten; messages: + msg152047
2012-01-26 13:16:43  nadeem.vawda  create