Message313809
Environment: Python 3.6.4, macOS 10.12.6
Python 3's dbm appears to corrupt the key index on macOS if values larger than 4 KB are inserted.
Code:
<<<<<<<<<<<
import dbm
import contextlib

with contextlib.closing(dbm.open('test', 'n')) as db:
    for k in range(128):
        db[('%04d' % k).encode()] = b'\0' * (k * 128)

with contextlib.closing(dbm.open('test', 'r')) as db:
    print(len(db))
    print(len(list(db.keys())))
>>>>>>>>>>>
On my machine, I get the following:
<<<<<<<<<<<
94
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    print(len(list(db.keys())))
SystemError: Negative size passed to PyBytes_FromStringAndSize
>>>>>>>>>>>
(The error says PyString_FromStringAndSize on Python 2.x but is otherwise the same.) The expected output, which I see on Linux (using gdbm), is:
128
128
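Since the result differs by platform, it may help to confirm which backend the generic dbm module actually picked for a given file. A minimal sketch, assuming the database was created as 'test' as in the script above; `backend_of` is a hypothetical helper name, while `dbm.whichdb` is the standard-library call:

```python
import dbm


def backend_of(path):
    """Return the name of the dbm module that owns the database at `path`.

    dbm.whichdb returns a module name such as 'dbm.gnu', 'dbm.ndbm' or
    'dbm.dumb', an empty string if the format cannot be guessed, or None
    if the file does not exist or cannot be read.
    """
    return dbm.whichdb(path)


# Pass the same name that was given to dbm.open(); whichdb also probes
# the suffixed files (for example 'test.db') that some backends create.
print(backend_of('test'))
```

Comparing this value between the failing macOS machine and the working Linux machine would show whether the two runs are exercising different backends.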
I get this error with the following Pythons on my system:
/usr/bin/python2.6 - Apple-supplied Python 2.6.9
/usr/bin/python - Apple-supplied Python 2.7.13
/opt/local/bin/python2.7 - MacPorts Python 2.7.14
/usr/local/bin/python - Python.org Python 2.7.13
/usr/local/bin/python3.5 - Python.org Python 3.5.1
/usr/local/bin/python3.6 - Python.org Python 3.6.4
This seems like a very big problem - silent data corruption with no warning. It appears related to issue30388, but in that case they were seeing sporadic failures. The deterministic script above causes failures in every case.
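Because nothing is reported at write time, one way to detect the problem is to compare what was written against what can be read back. The following is a rough sketch of such a check, not a fix; `check_roundtrip` and its parameters are hypothetical names, and `opener` is assumed to be any function that accepts the same first two arguments as dbm.open:

```python
import contextlib
import dbm


def check_roundtrip(path, count=128, step=128, opener=dbm.open):
    """Write `count` records of growing size, then report what fails to read back.

    Returns a list of problem descriptions; an empty list means every key
    and value survived the round trip.
    """
    expected = {('%04d' % k).encode(): b'\0' * (k * step) for k in range(count)}

    with contextlib.closing(opener(path, 'n')) as db:
        for key, value in expected.items():
            db[key] = value

    problems = []
    with contextlib.closing(opener(path, 'r')) as db:
        try:
            found = set(db.keys())
        except SystemError as exc:
            # Same failure mode as the traceback in this report.
            return ['keys() failed: %s' % exc]
        missing = set(expected) - found
        if missing:
            problems.append('%d of %d keys missing' % (len(missing), count))
        for key in found & set(expected):
            if db[key] != expected[key]:
                problems.append('value mismatch for %r' % key)
    return problems
```

Calling `check_roundtrip('test')` uses whatever backend dbm.open selects, and a non-empty result would indicate the kind of loss described here.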
This was discovered after running some code which used shelve (which uses dbm under the hood) in Python 3, but the bug clearly applies to Python 2 as well.
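For shelve users, one possible way to avoid a suspect system backend is to hand shelve an explicitly chosen one instead of letting dbm.open decide. This is only a sketch of a workaround, assuming the slower pure-Python dbm.dumb backend is acceptable for the application; `open_dumb_shelf` is a hypothetical helper name:

```python
import dbm.dumb
import shelve


def open_dumb_shelf(path, flag='c'):
    """Open a shelf backed by dbm.dumb rather than the platform default."""
    # shelve.Shelf wraps any dict-like object with bytes keys and values.
    return shelve.Shelf(dbm.dumb.open(path, flag))
```

A shelf opened this way can be used as a context manager, for example `with open_dumb_shelf('test') as shelf: shelf['big'] = b'\0' * 16384`. Files written by dbm.dumb are not readable by the other backends, so existing databases would need to be recreated.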
Date | User | Action | Args
2018-03-14 06:34:12 | nneonneo | set | recipients: + nneonneo
2018-03-14 06:34:12 | nneonneo | set | messageid: <1521009252.74.0.467229070634.issue33074@psf.upfronthosting.co.za>
2018-03-14 06:34:12 | nneonneo | link | issue33074 messages
2018-03-14 06:34:12 | nneonneo | create |