classification
Title: ctypes arrays >=2GB in length causes exception
Type: behavior
Stage: resolved
Components: ctypes
Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: David Heffernan, amaury.forgeotdarc, belopolsky, coderforlife, eryksun, ezio.melotti, i3v, loewis, meador.inge, miss-islington, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2013-01-04 21:28 by coderforlife, last changed 2018-12-04 10:47 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked
PR 3006 merged Segev Finer, 2017-08-05 12:33
PR 6842 merged miss-islington, 2018-05-14 23:55
PR 6843 merged miss-islington, 2018-05-14 23:57
PR 7441 merged serhiy.storchaka, 2018-06-06 04:42
Messages (18)
msg179080 - (view) Author: Jeffrey Bush (coderforlife) Date: 2013-01-04 21:28
The environment is Windows 8 Pro 64-bit running 64-bit Python from the WinPython distribution. Python is v2.7.3, built on Apr 10 2012. I first found this with create_string_buffer, but it also happens with an even simpler example.

The following code throws an AttributeError: class must define a _length_ attribute, which must be a positive integer.

import ctypes
ctypes.c_char * int(2*1024*1024*1024) # 2GB, also fails with long() instead of int()

However, the following works:

import ctypes
ctypes.c_char * int(2*1024*1024*1024-1) # 1 byte less than 2GB


The same happens with the other c_ types. It is not a limit on memory size, since c_int * int(2*1024*1024*1024-1) works and would be nearly 4 times the size of the failing c_char array.
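A minimal sketch of the boundary described in the report, assuming a 64-bit Python build that includes the fix from this issue. Building an array type does not allocate any memory, so this only exercises the length check. The helper name `try_array_type` is illustrative and not part of ctypes.

```python
import ctypes

def try_array_type(elem_type, length):
    """Return the ctypes array type, or the exception raised while building it."""
    try:
        return elem_type * length
    except (OverflowError, AttributeError, ValueError) as exc:
        return exc

two_gb = 2 * 1024 * 1024 * 1024

# One element below the 2GB boundary, then exactly at it.
below = try_array_type(ctypes.c_char, two_gb - 1)
at_limit = try_array_type(ctypes.c_char, two_gb)

print("2GB - 1:", below)
print("2GB    :", at_limit)
```

On an affected build (64-bit Windows before the fix) the second call would be expected to return an exception object instead of an array type.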
msg179083 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2013-01-04 21:44
Would you like to investigate a patch?
msg179084 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-04 21:47
Note that adding support for >2GB arrays is a new feature and therefore can't go in 2.7 (but it would be OK for 3.4+).  The error message could be improved though.
msg179085 - (view) Author: Jeffrey Bush (coderforlife) Date: 2013-01-04 21:51
I have no idea where I would start and don't have much time...

I am not so sure it is a new feature. It seems that the ctypes system is internally using unsigned integers for length but should be using size_t (or at least ssize_t). Seems like a bug.
msg179086 - (view) Author: Jeffrey Bush (coderforlife) Date: 2013-01-04 21:52
I mean using signed integers currently.
msg179087 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-04 22:02
If it works elsewhere and/or it's documented to work, then it might indeed be considered a bug.  Maybe someone more familiar with ctypes can comment.
msg179089 - (view) Author: Jeffrey Bush (coderforlife) Date: 2013-01-04 22:24
Okay, so I tested on Linux (CentOS 6.3), which has Python 2.6.6 64-bit. It works. So the Windows 2.7.3 64-bit version is bugged. I was able to evaluate c_char * long(32*1024*1024*1024) [the highest value I tried] and it worked fine. The Linux machine I tested this on was limited in RAM, so I ran into memory issues, but I was able to allocate a 2GB buffer with create_string_buffer().

So all in all, it is a bug. Works with Linux Python v2.6.6 64-bit but not Windows Python v2.7.3.

The ctypes documentation does not mention an upper limit, so I would assume that it should be based on the maximum memory allocation of the underlying system (e.g. Windows 32-bit can't allocate more than 2GB, but on Windows 64-bit the limit should be very large).
msg179090 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2013-01-04 22:34
This case works fine on 64-bit Linux (Ubuntu) and OS X 10.7.5.  I suspect this is due to the fact that 64-bit Windows uses the LLP64 data model and we are using longs somewhere.  I am investigating further now.
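The data-model point can be checked from Python itself. This is a small sketch using only `ctypes.sizeof`; the function names are illustrative. Under LLP64 (64-bit Windows) `long` stays 4 bytes while `size_t` and pointers are 8 bytes, whereas under LP64 (64-bit Linux and OS X) all three are 8 bytes.

```python
import ctypes

def data_model_sizes():
    """Sizes in bytes of the C types relevant to this issue."""
    return {
        "long": ctypes.sizeof(ctypes.c_long),
        "size_t": ctypes.sizeof(ctypes.c_size_t),
        "void*": ctypes.sizeof(ctypes.c_void_p),
    }

def long_narrower_than_size_t():
    """True on platforms where a C long cannot hold every size_t value."""
    sizes = data_model_sizes()
    return sizes["long"] < sizes["size_t"]

print(data_model_sizes())
print("long narrower than size_t:", long_narrower_than_size_t())
```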
msg179537 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-01-10 12:14
In _ctypes.c there are (only!) two occurrences of the "long" type... both are related to ctypes arrays and look suspect.
msg293395 - (view) Author: David Heffernan (David Heffernan) Date: 2017-05-10 09:50
I just ran into this issue. I'm trying to write code like this:

(ctypes.c_char*bufferLen).from_buffer(buffer)

where buffer is a bytearray. When bufferLen is greater than 2GB I fall foul of this code in _ctypes.c:

    long length;
    ....
    length = PyLong_AsLongAndOverflow(length_attr, &overflow);
    if (overflow) {
        PyErr_SetString(PyExc_OverflowError,
                        "The '_length_' attribute is too large");
        Py_DECREF(length_attr);
        goto error;
    }

Surely this should not be forcing long on us. Can't it use PyLong_AsSsize_t or perhaps PyLong_AsLongLongAndOverflow?
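To illustrate why the check above rejects 2GB on Windows, here is a pure-Python model of the range test. It is only a sketch of the arithmetic, not the C implementation; `fits_signed` is a hypothetical helper, and the bit widths assume a 32-bit C long (LLP64) and a 64-bit Py_ssize_t.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1

length = 2 * 1024**3  # 2GB, i.e. 2**31

print(fits_signed(length - 1, 32))  # True: largest value a 32-bit long can hold
print(fits_signed(length, 32))      # False: overflows a 32-bit long
print(fits_signed(length, 64))      # True: fits in a 64-bit Py_ssize_t
```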
msg293403 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-05-10 11:51
In older versions of ctypes, before it was added to the standard library, the underlying length field was a C int, and CArrayType_new used the PyInt_AS_LONG macro. ctypes was added to the standard library in 2.5, by which time the length field was Py_ssize_t, but CArrayType_new still used the PyInt_AS_LONG macro. That's still the case in 2.7. Python 3 changed this to call PyLong_AsLongAndOverflow, but apparently Christian Heimes didn't consider fixing this properly to use Py_ssize_t:

https://hg.python.org/cpython/rev/612d8dea7f6c

David, maybe there's a workaround for your use case, if you can provide some more details.
msg293419 - (view) Author: David Heffernan (David Heffernan) Date: 2017-05-10 13:31
Eryk,

As you can no doubt guess, this is related to the questions I have been asking on SO that you have so expertly been answering. Thank you!

When I solved the latest problem, getting at the internal buffer of a bytearray, I used the code in my previous comment, which came from a comment (now deleted) of yours to the question.

Looking at this more closely I realise that, as you said in your answer, the array length is not important for my specific needs, so it is fine to use zero. That sidesteps this issue completely. Thanks again.
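A hedged sketch of what the zero-length workaround might look like. The deleted Stack Overflow comment is not reproduced in this thread, so the details here are an assumption: a zero-length array type is used only to obtain the address of the bytearray's internal buffer, which means no large _length_ is ever needed. The variable names are illustrative.

```python
import ctypes

buf = bytearray(b"hello")

# A zero-length array shares the bytearray's memory without declaring its size,
# so the 2GB length limit never comes into play.
anchor = (ctypes.c_char * 0).from_buffer(buf)
address = ctypes.addressof(anchor)

# Read a copy through the raw address, then write one byte back through it.
data = ctypes.string_at(address, len(buf))
ctypes.memmove(address, b"J", 1)

print(data)  # b'hello'
print(buf)   # bytearray(b'Jello')
```

While `anchor` is alive the bytearray is exported as a buffer and cannot be resized, which keeps `address` valid for the duration of its use.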
msg316604 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-14 23:54
New changeset 735abadd5bd91db4a9e6f4311969b0afacca0a1a by Serhiy Storchaka (Segev Finer) in branch 'master':
bpo-16865: Support arrays >=2GB in ctypes. (GH-3006)
https://github.com/python/cpython/commit/735abadd5bd91db4a9e6f4311969b0afacca0a1a
msg316623 - (view) Author: miss-islington (miss-islington) Date: 2018-05-15 05:40
New changeset 2ce72e243fbc0e4f07f1191b20be548bfa5cbe11 by Miss Islington (bot) in branch '3.7':
bpo-16865: Support arrays >=2GB in ctypes. (GH-3006)
https://github.com/python/cpython/commit/2ce72e243fbc0e4f07f1191b20be548bfa5cbe11
msg316624 - (view) Author: miss-islington (miss-islington) Date: 2018-05-15 05:55
New changeset 726894addc02effaa369fded3caaba94875c1f3d by Miss Islington (bot) in branch '3.6':
bpo-16865: Support arrays >=2GB in ctypes. (GH-3006)
https://github.com/python/cpython/commit/726894addc02effaa369fded3caaba94875c1f3d
msg331028 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-04 10:38
New changeset 93d7918f77278f973a4a106c1d01ad2d9805816d by Serhiy Storchaka in branch '2.7':
[2.7] bpo-16865: Support arrays >=2GB in ctypes. (GH-3006). (GH-7441)
https://github.com/python/cpython/commit/93d7918f77278f973a4a106c1d01ad2d9805816d
msg331029 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-04 10:39
> The environment is Windows 8 Pro 64-bit running Python 64-bit in the WinPython distribution.

This issue is specific to systems with 32-bit long but 64-bit void*, I guess? So only Windows is impacted?
msg331030 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-04 10:47
This issue is specific to systems with 32-bit long but 64-bit size_t. Yes, it seems the only supported system that is impacted is Windows.
History
Date | User | Action | Args
2018-12-04 10:47:28 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; messages: + msg331030; stage: patch review -> resolved
2018-12-04 10:39:50 | vstinner | set | nosy: + vstinner; messages: + msg331029
2018-12-04 10:38:10 | serhiy.storchaka | set | messages: + msg331028
2018-06-06 04:42:45 | serhiy.storchaka | set | pull_requests: + pull_request7068
2018-05-15 05:55:55 | miss-islington | set | messages: + msg316624
2018-05-15 05:40:29 | miss-islington | set | nosy: + miss-islington; messages: + msg316623
2018-05-14 23:57:39 | miss-islington | set | pull_requests: + pull_request6524
2018-05-14 23:55:38 | miss-islington | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request6523
2018-05-14 23:54:38 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg316604
2017-08-17 18:23:47 | i3v | set | nosy: + i3v
2017-08-05 12:33:06 | Segev Finer | set | pull_requests: + pull_request3040
2017-05-10 13:31:13 | David Heffernan | set | messages: + msg293419
2017-05-10 11:51:15 | eryksun | set | nosy: + eryksun; messages: + msg293403; versions: + Python 3.5, Python 3.6, Python 3.7, - Python 3.2, Python 3.3, Python 3.4
2017-05-10 09:50:46 | David Heffernan | set | nosy: + David Heffernan; messages: + msg293395
2013-01-10 12:14:15 | amaury.forgeotdarc | set | messages: + msg179537
2013-01-04 22:34:26 | meador.inge | set | stage: needs patch; messages: + msg179090; versions: + Python 3.2, Python 3.3, Python 3.4
2013-01-04 22:24:39 | coderforlife | set | messages: + msg179089
2013-01-04 22:02:49 | ezio.melotti | set | messages: + msg179087
2013-01-04 21:52:12 | coderforlife | set | messages: + msg179086
2013-01-04 21:51:29 | coderforlife | set | messages: + msg179085
2013-01-04 21:47:38 | ezio.melotti | set | nosy: + belopolsky, amaury.forgeotdarc, meador.inge, ezio.melotti; messages: + msg179084
2013-01-04 21:44:47 | loewis | set | nosy: + loewis; messages: + msg179083
2013-01-04 21:28:18 | coderforlife | create |