classification
Title: Remove IsBadStringPtr calls in ctypes
Type: enhancement Stage: resolved
Components: ctypes, Windows Versions:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: steve.dower Nosy List: eryksun, python-dev, steve.dower, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2015-03-24 18:39 by steve.dower, last changed 2015-04-03 15:20 by steve.dower. This issue is now closed.

Messages (9)
msg239167 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-03-24 18:39
Modules/_ctypes/cfield.c has this horror in it (twice):

    /* XXX What about invalid pointers ??? */
    if (*(void **)ptr) {
#if defined(MS_WIN32) && !defined(_WIN32_WCE)
        if (IsBadStringPtrA(*(char **)ptr, -1)) {
            PyErr_Format(PyExc_ValueError,
                         "invalid string pointer %p",
                         *(char **)ptr);
            return NULL;
        }
#endif
        return PyBytes_FromStringAndSize(*(char **)ptr,
                                         strlen(*(char **)ptr));

IsBadStringPtr should generally not be used, and the -1 parameter makes it even worse. See http://blogs.msdn.com/b/oldnewthing/archive/2006/09/27/773741.aspx for details, but the main reason is that if it is actually a bad pointer, we've just deferred the crash from the obvious location to somewhere that should "never" crash.

The strlen() call has exactly the same behaviour as IsBadStringPtrA except the crash will occur here.

A better alternative would be to use the safe strlen function to limit the maximum length of strings, but since we likely can't agree on a suitable maximum we should just stop trying to handle this case at all.
msg239176 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2015-03-24 20:28
Unless someone comes back who remembers what the ulterior motive was: I 
agree; remove the check and just the crash happen.
msg239221 - (view) Author: Roundup Robot (python-dev) Date: 2015-03-25 06:29
New changeset 585e555247ac by Steve Dower in branch 'default':
Issue #23765: Remove IsBadStringPtr calls in ctypes
https://hg.python.org/cpython/rev/585e555247ac
msg239239 - (view) Author: Eryk Sun (eryksun) * Date: 2015-03-25 09:24
c_char_p.__repr__ (defined in __init__.py) also checks IsBadStringPtrA via FFI. 

Defining the repr differently on Windows is a wart, IMO. The following repr should be used on all platforms for c_char_p:

    "%s(%s)" % (self.__class__.__name__, 
                c_void_p.from_buffer(self).value)

On a related note, wide-character c_wchar_p uses the default _SimpleCData.__repr__, which calls the getfunc. It should be overridden in the same manner as c_char_p because this can easily segfault Python on a non-Windows platform. Even on Windows, calling IsBadStringPtrW in Z_get can't make guarantees given multiple threads. The GIL helps, but with ctypes there's a good chance that many threads are running concurrently.
msg239250 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-03-25 13:02
Thanks, I forgot to scan .py files.
msg239284 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-03-25 22:47
So the repr that's there for c_char_p is currently::

    "%s(%s)" % (self.__class__.__name__, cast(self, c_void_p).value)

But if I remove the override then it renders the value as b'abc'. Basically, we can have one of:

>>> from ctypes import *
>>> cast(create_string_buffer(b'abc'), c_char_p)
c_char_p(b'abc')

or

>>> cast(create_string_buffer(b'abc'), c_char_p)
c_char_p(52808208)

I prefer the former (remove c_char_p.__repr__ completely), but the latter is clearly there for some reason. Any opinions?
msg239287 - (view) Author: Eryk Sun (eryksun) * Date: 2015-03-25 23:43
> So the repr that's there for c_char_p is currently::
>    "%s(%s)" % (self.__class__.__name__, cast(self, c_void_p).value)

I suggested switching to using from_buffer:

    c_void_p.from_buffer(self).value 

from_buffer is more efficient. cast() is imlemented as an FFI call. There's also another approach using the buffer protocol:

    int.from_bytes(self, sys.byteorder)

> I prefer the former (remove c_char_p.__repr__ completely), but 
> the latter is clearly there for some reason. Any opinions?

It's there for same reason I suggested overriding c_wchar_p.__repr__. Getting the repr of an object shouldn't segfault the interpreter. The existing check with IsBadStringPtrA is bogus in a multithreaded environment. The repr should just show the address. At least that won't crash.

A topic for another discussion is disabling the SEH handler because it's not used consistently anyway (e.g. to guard pointer and CData base pointer access). To compensate, the faulthandler could be enhanced. It could call the CRT's __pxcptinfoptrs (not documented, but it should be) to get the Windows EXCEPTION_POINTERS record. This would enable at least an improved error message, e.g. "Segmentation fault reading [addr]" and "Segmentation fault writing [addr]". The funny thing is the current faulthandler example uses ctypes.string_at, which, because it's implemented as an FFI call guarded by the SEH handler, just raises OSError on Windows.
msg239304 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-03-26 04:54
Ah okay, I understand better what your suggestion was trying to achieve now. I agree, we should just show the address in repr - I'll make that change to both c_char_p and c_wchar_p.
msg239305 - (view) Author: Roundup Robot (python-dev) Date: 2015-03-26 04:58
New changeset 339d90983e29 by Steve Dower in branch 'default':
Issue #23765: Removed IsBadStringPtr calls in ctypes
https://hg.python.org/cpython/rev/339d90983e29
History
Date User Action Args
2015-04-03 15:20:37steve.dowersetstatus: open -> closed
2015-03-26 04:58:55python-devsetmessages: + msg239305
2015-03-26 04:54:52steve.dowersetmessages: + msg239304
2015-03-25 23:43:26eryksunsetmessages: + msg239287
2015-03-25 22:47:09steve.dowersetmessages: + msg239284
2015-03-25 13:02:22steve.dowersetstatus: closed -> open

messages: + msg239250
2015-03-25 09:24:13eryksunsetnosy: + eryksun
messages: + msg239239
2015-03-25 06:29:53steve.dowersetstatus: open -> closed
resolution: fixed
stage: needs patch -> resolved
2015-03-25 06:29:15python-devsetnosy: + python-dev
messages: + msg239221
2015-03-24 20:28:27tim.goldensetmessages: + msg239176
2015-03-24 18:39:05steve.dowercreate