classification
Title: dbmmodule.c:dbm_contains fails on 64bit big-endian (test_dbm.py fails) when built against gdbm (int vs Py_ssize_t)
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: PowerLinux dbm failure in 2.7
View: 17926
Assigned To: Nosy List: dmalcolm, donmez, pitrou
Priority: normal Keywords: patch

Created on 2010-08-25 20:04 by dmalcolm, last changed 2013-05-10 10:09 by pitrou. This issue is now closed.

Files
File name Uploaded Description
fix-dbm_contains-on-64bit-bigendian.patch dmalcolm, 2010-08-25 20:04 Patch against 2.7 branch
Messages (3)
msg114934 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-08-25 20:04
With a clean build of release27-maint (r84317), test_dbm.py fails on ppc64 with this error:
  File "test_dbm.py", line 24, in test_keys
    self.assert_(k in self.d)
AssertionError

I'm building against gdbm-1.8.0 (specifically, on a prerelease of RHEL6, with gdbm-devel-1.8.0-36.el6.ppc64).

All of the headers define datum as:
      typedef struct {
        char *dptr;
        int   dsize;
      } datum;

Note the use of "int" for dsize.

This fragment of code in python's Modules/dbmmodule.c:dbm_contains:
          if (PyString_AsStringAndSize(v, (char **)&key.dptr,
                                       (Py_ssize_t *)&key.dsize)) {
                  return -1;
          }
 appears to assume that
  sizeof(datum.dsize) == sizeof(Py_ssize_t)
which is not correct on these architectures:

(gdb) p sizeof(key.dsize)
$25 = 4
(gdb) p sizeof(Py_ssize_t)
$26 = 8

On ppc64, when PyString_AsStringAndSize writes the 0x0000000000000001 value for the ob_size of "a" to &key.dsize, I believe the 0x00000000 part is written to &key.dsize, and the 0x00000001 part is written to the 4 bytes following it, due to the incorrect cast from (int*) to (Py_ssize_t*).

Thankfully
(gdb) p sizeof(key)
$28 = 16
so it writes this value to padding within the "datum key", rather than corrupting the stack.

The dbm_fetch() invocation is thus passed a 0 dsize, and doesn't find the key, hence the test fails.
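The byte-order effect described above can be modelled without performing the invalid pointer cast itself. The following is only a sketch: the function names are hypothetical, and the 8-byte buffer stands in for the 4-byte dsize field plus the 4 bytes of padding after it. It lays out a 64-bit length the way a big-endian or little-endian CPU would store it, then reads back only the first 4 bytes, which is all that a 4-byte int at that address would see.

```c
#include <stdint.h>

/* Lay out a 64-bit value most significant byte first (big-endian store). */
static void store_be64(unsigned char out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(value >> (8 * (7 - i)));
    }
}

/* Lay out a 64-bit value least significant byte first (little-endian store). */
static void store_le64(unsigned char out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(value >> (8 * i));
    }
}

/* Interpret the first 4 bytes as a big-endian 32-bit integer. */
static uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Interpret the first 4 bytes as a little-endian 32-bit integer. */
static uint32_t load_le32(const unsigned char in[4])
{
    return ((uint32_t)in[3] << 24) | ((uint32_t)in[2] << 16) |
           ((uint32_t)in[1] << 8)  |  (uint32_t)in[0];
}

/* Value a 4-byte dsize would hold after an 8-byte store of `length`
   at its address on a big-endian machine (hypothetical helper). */
uint32_t dsize_seen_big_endian(uint64_t length)
{
    unsigned char slot[8];
    store_be64(slot, length);
    return load_be32(slot);
}

/* Same model for a little-endian machine (hypothetical helper). */
uint32_t dsize_seen_little_endian(uint64_t length)
{
    unsigned char slot[8];
    store_le64(slot, length);
    return load_le32(slot);
}
```

In this model a length of 1 is seen as 0 in the big-endian case and as 1 in the little-endian case, which is consistent with the failure appearing on ppc64 while going unnoticed on 64-bit little-endian builds.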

The various other uses within that source file appear correct:
(i) there are various PyArg_Parse* calls using s#, with int, which is correct, given the absence of the PY_SSIZE_T_CLEAN macro.
(ii) there are various calls of PyString_FromStringAndSize(, datum.dsize), which I believe is correct: I believe the compiler will coerce this int to the wider Py_ssize_t type.

I'm attaching a patch which (I hope) correctly coerces the size of the key from Py_ssize_t to "int" within dbm_contains.
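For illustration only, one way such a coercion could look is sketched below. This is not the attached patch, whose contents are not reproduced here. The idea is to receive the length in a pointer-sized local first and only then narrow it into dsize, rather than casting &key.dsize. The names ssize_sketch_t and set_datum_key are hypothetical, and ssize_sketch_t stands in for Py_ssize_t so that the sketch builds without Python.h.

```c
#include <limits.h>
#include <stddef.h>

/* Same layout as the datum definition quoted above. */
typedef struct {
    char *dptr;
    int   dsize;
} datum;

/* Hypothetical stand-in for Py_ssize_t (an assumption for this sketch). */
typedef ptrdiff_t ssize_sketch_t;

/* Hypothetical helper: store a pointer-sized length into datum.dsize.
   Returns 0 on success, or -1 without touching *key if the length is
   negative or does not fit in an int. */
int set_datum_key(datum *key, char *data, ssize_sketch_t size)
{
    if (size < 0 || size > INT_MAX) {
        return -1;
    }
    key->dptr = data;
    key->dsize = (int)size;  /* explicit narrowing after the range check */
    return 0;
}
```

In dbm_contains the equivalent step would presumably pass the address of a Py_ssize_t local to PyString_AsStringAndSize and then assign the checked value to key.dsize, so that no 8-byte store is ever aimed at a 4-byte field.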
msg114935 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-08-25 20:07
Note to self: I'm tracking this one in RH's downstream tracker as:
  https://bugzilla.redhat.com/show_bug.cgi?id=626756
msg188826 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-10 10:09
Should be fixed in issue17926. AFAICT the issue doesn't exist on 3.x.
History
Date User Action Args
2013-05-10 10:09:13 pitrou set status: open -> closed
    versions: - Python 3.1, Python 3.2
    superseder: PowerLinux dbm failure in 2.7
    messages: + msg188826
    resolution: duplicate
    stage: patch review -> resolved
2013-05-10 09:42:21 serhiy.storchaka set nosy: + pitrou
2012-07-20 11:09:46 donmez set nosy: + donmez
2010-08-25 20:07:48 dmalcolm set keywords: patch, patch
    messages: + msg114935
2010-08-25 20:04:57 dmalcolm create