
classification
Title: Integer overflow in _bsddb leads to heap corruption
Type: crash
Stage: resolved
Components: Extension Modules
Versions: Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Ned Williamson, ZackerySpytz, lemburg, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2015-12-25 03:53 by Ned Williamson, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name, Uploaded
bsddbpoc.py, Ned Williamson, 2015-12-25 03:53

Pull Requests
URL, Status, Linked
PR 8337, merged, ZackerySpytz, 2018-07-19 05:54
PR 8392, merged, ZackerySpytz, 2018-07-22 13:01

Messages (4)
msg256974 - Author: Ned Williamson (Ned Williamson) Date: 2015-12-25 03:53
In the function `_db_associateCallback` of the `_bsddb` module, associating two databases with a callback that returns a sufficiently large list leads to heap corruption due to an integer overflow on 32-bit Python.

From `_bsddb.c`:
```
    else if (PyList_Check(result))
    {
        char* data;
        Py_ssize_t size;
        int i, listlen;
        DBT* dbts;

        listlen = PyList_Size(result);

1.      dbts = (DBT *)malloc(sizeof(DBT) * listlen); ///sizeof(DBT) == 28 on my system, enough to overflow

2.      for (i=0; i<listlen; i++)
        {
            if (!PyBytes_Check(PyList_GetItem(result, i)))
            {
                PyErr_SetString(
                   PyExc_TypeError,
#if (PY_VERSION_HEX < 0x03000000)
"The list returned by DB->associate callback should be a list of strings.");
#else
"The list returned by DB->associate callback should be a list of bytes.");
#endif
                PyErr_Print();
            }

            PyBytes_AsStringAndSize(
                PyList_GetItem(result, i),
3.              &data, &size);

            CLEAR_DBT(dbts[i]);
4.          dbts[i].data = malloc(size);          /* TODO, check this */

            if (dbts[i].data)
            {
5.              memcpy(dbts[i].data, data, size);
                dbts[i].size = size;
                dbts[i].ulen = dbts[i].size;
                dbts[i].flags = DB_DBT_APPMALLOC;  /* DB will free */
            }
            else
            {
                PyErr_SetString(PyExc_MemoryError,
                    "malloc failed in _db_associateCallback (list)");
                PyErr_Print();
            }
        }

        CLEAR_DBT(*secKey);

        secKey->data = dbts;
        secKey->size = listlen;
        secKey->flags = DB_DBT_APPMALLOC | DB_DBT_MULTIPLE;
        retval = 0;
    }
```

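1. The multiplication `sizeof(DBT) * listlen` on this line can overflow, so `malloc` allocates an undersized `dbts` buffer.
2. The loop bound is the full `listlen`, which is unaffected by the overflow, so the loop writes `DBT` entries derived from user data past the end of the undersized buffer and corrupts the heap (see 3. and 5.).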
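
To make the arithmetic in point 1 concrete, here is a minimal sketch that models a 32-bit `size_t` with `uint32_t`. It assumes `sizeof(DBT) == 28`, as reported above for this system; the names `DBT_SIZE_32BIT`, `wrapped_alloc_size`, and `first_overflowing_len` are hypothetical and do not appear in `_bsddb.c`.

```c
#include <stdint.h>

/* Assumption: sizeof(DBT) == 28, the value reported for the 32-bit system above. */
#define DBT_SIZE_32BIT 28u

/* Hypothetical helper: the byte count a 32-bit size_t would hold after
 * computing sizeof(DBT) * listlen. The product is formed in 64 bits and
 * then truncated, which is the same as wrapping modulo 2^32. */
static uint32_t wrapped_alloc_size(uint32_t listlen)
{
    return (uint32_t)((uint64_t)DBT_SIZE_32BIT * listlen);
}

/* Hypothetical helper: the smallest list length whose true byte count
 * no longer fits in 32 bits. */
static uint32_t first_overflowing_len(void)
{
    return UINT32_MAX / DBT_SIZE_32BIT + 1u;
}
```

Under that assumption, `first_overflowing_len()` is 153391690, and `wrapped_alloc_size(153391690)` is 24, so a list of that length would request only 24 bytes for `dbts` while the loop still visits every element.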

This bug is present in Python 2.7.11.

See the result of running my attached POC script:
```
(gdb) r vuln.py
Starting program: /vagrant/Python-2.7.11/python.exe vuln.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
python.exe: malloc.c:2372: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.

Program received signal SIGABRT, Aborted.
0xb7fdd428 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fdd428 in __kernel_vsyscall ()
#1  0xb7de6607 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb7de9a33 in __GI_abort () at abort.c:89
#3  0xb7e2a9dd in __malloc_assert (
    assertion=assertion@entry=0xb7f1e3c0 "(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offs"...,
    file=file@entry=0xb7f19954 "malloc.c", line=line@entry=2372,
    function=function@entry=0xb7f19ce5 <__func__.10915> "sysmalloc") at malloc.c:293
#4  0xb7e2d5eb in sysmalloc (av=0xb7f62420 <main_arena>, nb=16) at malloc.c:2369
#5  _int_malloc (av=av@entry=0xb7f62420 <main_arena>, bytes=bytes@entry=1) at malloc.c:3800
#6  0xb7e2e708 in __GI___libc_malloc (bytes=1) at malloc.c:2891
#7  0xb7b006b2 in _db_associateCallback (db=0x82a7dd0, priKey=0xbffff228, priData=0xbffff034, secKey=0x8291a80)
    at /vagrant/Python-2.7.11/Modules/_bsddb.c:1531
...
```
We can see that the `malloc` call on the line marked (4.) fails due to corrupted heap structures.
Also, running the script outside of GDB leads to a different message because of differences in heap layout:
```
vagrant@vagrant-ubuntu-trusty-32:/vagrant/Python-2.7.11$ ./python.exe vuln.py
*** Error in `python': corrupted double-linked list: 0x099e9858 ***
Aborted (core dumped)
```

This vulnerability can be fixed by checking for the overflow before the call to `malloc`. Also, note that a failed `PyBytes_Check` does not exit the function, and `PyBytes_AsStringAndSize` is called immediately afterwards. I would recommend breaking or continuing if that check fails, although I do think `PyBytes_AsStringAndSize` performs this check as well.
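
One common way to perform such a check is to compare the element count against `SIZE_MAX / elem_size` before multiplying. The following is a generic sketch of that pattern, not the patch that was eventually merged; `checked_array_alloc` is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` elements of `elem_size`
 * bytes each. Returns NULL, without calling malloc, when the product
 * count * elem_size would not fit in size_t. */
static void *checked_array_alloc(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;  /* the multiplication would wrap around */
    }
    return malloc(count * elem_size);
}
```

A caller in the position of `_db_associateCallback` would treat a `NULL` result as an allocation failure and report an error instead of entering the loop.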
msg322087 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-07-21 08:27
New changeset 32522050773c257a5c3c0c8929ba5c64123b53ed by Serhiy Storchaka (Zackery Spytz) in branch '2.7':
bpo-25943: Fix potential heap corruption in bsddb's _db_associateCallback() (GH-8337)
https://github.com/python/cpython/commit/32522050773c257a5c3c0c8929ba5c64123b53ed
msg322143 - Author: Zackery Spytz (ZackerySpytz) * (Python triager) Date: 2018-07-22 12:59
Integer overflow can also occur in DB_join().
msg322154 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-07-22 16:53
New changeset 041a4ee9456d716dd449d38a5328b82e76f5dbc4 by Serhiy Storchaka (Zackery Spytz) in branch '2.7':
bpo-25943: Check for integer overflow in bsddb's DB_join(). (GH-8392)
https://github.com/python/cpython/commit/041a4ee9456d716dd449d38a5328b82e76f5dbc4
History
Date, User, Action, Args
2022-04-11 14:58:25, admin, set, github: 70131
2018-07-22 16:53:59, serhiy.storchaka, set, messages: + msg322154
2018-07-22 13:01:28, ZackerySpytz, set, pull_requests: + pull_request7920
2018-07-22 12:59:36, ZackerySpytz, set, messages: + msg322143
2018-07-21 13:41:04, serhiy.storchaka, set, status: open -> closed; resolution: fixed; stage: patch review -> resolved
2018-07-21 08:27:49, serhiy.storchaka, set, messages: + msg322087
2018-07-19 06:42:29, ZackerySpytz, set, nosy: + ZackerySpytz
2018-07-19 05:54:34, ZackerySpytz, set, keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request7871
2016-01-03 02:03:00, martin.panter, unlink, issue25944 superseder
2015-12-25 11:13:07, serhiy.storchaka, link, issue25944 superseder
2015-12-25 11:10:04, serhiy.storchaka, set, nosy: + lemburg, serhiy.storchaka; components: + Extension Modules, - Library (Lib); stage: needs patch
2015-12-25 03:53:35, Ned Williamson, create