This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: ctypes callback with structure crashes in Python 3.8 on Windows x86
Type: crash Stage: test needed
Components: ctypes, Windows Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, belopolsky, eryksun, itsgk92, meador.inge, paul.moore, steve.dower, tim.golden, vinay.sajip, xs_iceman, zach.ware
Priority: normal Keywords:

Created on 2020-06-18 09:26 by itsgk92, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
callback.zip itsgk92, 2020-06-19 08:11 source zip
Messages (8)
msg371796 - (view) Author: Aravindhan (itsgk92) Date: 2020-06-18 09:26
Python Process crashes with unauthorised memory access when the C++ DLL callback has arguments as structures. Happens in python 3.8. python 3.7 works fine with same source code. Initial investigations revels that the structure is called back as a pointer instead of plain python object. Callbacks with Primitive types are successful. This might be something to do with vector calling in python 3.8?

>>>Replicate:

>>>A c++ DLL with "subscribe_cb" function and a callback "listen_cb"; where FILETIME is windows FILETIME structure (it can be any structure for that matter)

extern "C" CBFUNC(void, listen_cb)(int value, FILETIME ts );
extern "C" EXFUNC(void) subscribe_cb(listen_cb);

EXFUNC(void) subscribe_cb(listen_cb cb)
{
	
	int i = 0;
	while (true) {
		FILETIME systime;
		GetSystemTimeAsFileTime(&systime);
		i++;
		Sleep(1000);
		cb(i, systime);
   }
}


>>>Python client for the dll

class FT(Structure):
    _fields_ = [("dwLowDateTime", c_ulong, 32),
                ("dwHighDateTime", c_ulong, 32)]


@WINFUNCTYPE(c_void_p, c_int, FT)
def cb(val, ft):
    print(f"callback {val} {ft.dwLowDateTime} {ft.dwHighDateTime}")


lib = WinDLL(r"C:\Temp\CBsimulate\CbClient\CbClient\Debug\DummyCb.dll")

lib.subscribe_cb(cb)

while 1:
    sleep(5)
msg371844 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-06-19 00:28
Please provide complete code that can be compiled as is, and the required build environment. I wrote a similar example in C, compiled for x64 with MSVC, and it worked fine for me. Also, please provide more details about the error -- a traceback in a native debugger, exception analysis, or a dump file. 

One detail that stands out to me is that you set the ctypes callback return type to c_void_p -- instead of void (None). But that shouldn't cause a crash. Another concern is your use of WINFUNCTYPE (stdcall) and WinDLL instead of CFUNCTYPE (cdecl) and CDLL. In x64 they're the same, but it's still important to use the function's declared calling convention, for the sake of x86 compatibility. Make sure you're really using stdcall.

Here's the code that worked for me:

c/test.c:

    #include <windows.h>

    typedef void listen_fn(int, FILETIME);

    static void
    subscribe_thread(listen_fn *cb)
    {
        int i = 0;
        FILETIME systime;
        while (1) {
            GetSystemTimeAsFileTime(&systime);
            cb(i++, systime);
            Sleep(1000);
       }
    }

    void __declspec(dllexport)
    subscribe(listen_fn *cb)
    {
        CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)subscribe_thread, 
            cb, 0, NULL);
    }

test.py:

    import ctypes
    from ctypes import wintypes

    lib = ctypes.CDLL('c/test.dll')
    listen_fn = ctypes.CFUNCTYPE(None, ctypes.c_int, wintypes.FILETIME)

    @listen_fn
    def cb(val, ft):
        print(f"callback {val:03d} {ft.dwHighDateTime} {ft.dwLowDateTime}")

    print('press enter to quit')
    lib.subscribe(cb)
    input()
msg371851 - (view) Author: Aravindhan (itsgk92) Date: 2020-06-19 08:11
Hi, Thanks for response @eryksun

Sorry that I did not mention it is compiled for x86 and the crash happens in 32bit python. Yes, the dll is declared with stdcall and built for x86 with MSVC. It is tested with python 3.7.4 32bit which is successful and fails on python 3.8.2 32bit. I did not look into on x64 though. Below is the source code. I have also attached them as a zip.

>>> c/DummyCb.h

#pragma once
#include <Windows.h>
#include <iostream>

#define CALLBACK    __stdcall
#define WINAPI      __stdcall

#define EXFUNC(type) _declspec(dllexport) void WINAPI // export function
#define CBFUNC(type,func) typedef void (CALLBACK* func) // callback function

extern "C" CBFUNC(void, listen_cb)(int value, FILETIME ts );
extern "C" EXFUNC(void) subscribe_cb(listen_cb);


>>> c/DummyCb.cpp

#include "pch.h"
#include "DummyCb.h"
#include <stdio.h>

EXFUNC(void) subscribe_cb(listen_cb cb)
{
	
	int i = 0;
	while (true) {
		FILETIME systime;
		GetSystemTimeAsFileTime(&systime);
		i++;
		Sleep(1000);
		cb(i, systime);
   }
}


>>> c/CbClient.cpp

#include <iostream>
#include "DummyCb.h"
#include <stdio.h>
#include <Windows.h>

void __stdcall lis(int a, FILETIME ts) {

	printf("calling back %d, %d, %d \n", a,ts.dwHighDateTime, ts.dwLowDateTime);
}


int main()
{
	subscribe_cb(&lis);
}



>>> CbClient.py


from ctypes import *
from time import sleep


class FT(Structure):
    _fields_ = [("dwLowDateTime", c_ulong, 32),
                ("dwHighDateTime", c_ulong, 32)]


@WINFUNCTYPE(c_void_p, c_int, FT)
def cb(val, ft):
    print(f"callback {val} {ft.dwLowDateTime} {ft.dwHighDateTime}")


lib = WinDLL(r"c/DummyCb.dll")

lib.subscribe_cb(cb)

while 1:
    sleep(5)


>>> Traceback on python 

    lib.subscribe_cb(cb)
OSError: exception: access violation writing 0xCA35789B


>>> Native Debug traceback

Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention. 

>>> Further analysis

change the callback declaration to a pointer of structure and it has same crash on python 3.7 32 bit. 

@WINFUNCTYPE(c_void_p, c_int, POINTER(FT))
def cb(val, ft):
    print(f"callback {val} {ft.contents}")

Which makes me correlate is it something to do with vector calling on python 3.8 which makes callabck as *cb (a pointer) instead of a plain object call (just a thought, did not investigate change logs). Though its interesting to note that it did not fail on x64 with cdecl calling convention. That explains why our larger code base din't crash on Linux x86 linked as a .SO .


cheers.
msg371920 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2020-06-20 05:43
I can reproduce a crash with an x86 build that uses a stdcall callback with a 32-bit argument followed by any ffi_type_sint64 argument (FILETIME, long long, etc). Apparently this is a bug in libffi's ffi_prep_cif. Given the 2nd argument has a size of 8 bytes and an alignment of 8 bytes, it sets the CIF (call interface) to 16 bytes instead of the correct value of 12 bytes. 

source:

    import ctypes

    @ctypes.WINFUNCTYPE(None, ctypes.c_int, ctypes.c_int64)
    def test(i, j):
        print(f"callback {i} {j}")

debugger:

    0:000:x86> k 2
    ChildEBP RetAddr
    00bef6a0 6f1873bd _ctypes_d!_ctypes_alloc_callback+0x24c
    00bef6dc 6e7ad81b _ctypes_d!PyCFuncPtr_new+0x32d

The `bytes` field should be 12, not 16:

    0:000:x86> ?? p->cif
    struct ffi_cif
       +0x000 abi              : 2 ( FFI_STDCALL )
       +0x004 nargs            : 2
       +0x008 arg_types        : 0x02dc43bc  -> 0x00742e5c _ffi_type
       +0x00c rtype            : 0x6f1accec _ffi_type
       +0x010 bytes            : 0x10
       +0x014 flags            : 8

    0:000:x86> ?? p->cif.arg_types[0]
    struct _ffi_type * 0x00742e5c
       +0x000 size             : 4
       +0x004 alignment        : 4
       +0x006 type             : 0xa
       +0x008 elements         : (null)

    0:000:x86> ?? p->cif.arg_types[1]
    struct _ffi_type * 0x02dd635c
       +0x000 size             : 8
       +0x004 alignment        : 8
       +0x006 type             : 0xc
       +0x008 elements         : (null)

When the callback returns, it cleans 16 bytes from stack instead of the expected 12. With stack-allocated `systime` in the sample code, eventually GetSystemTimeAsFileTime(&systime) ends up overwriting the return address, and it tries to return to an arbitrary address. If you're lucky, this address isn't executable memory, and it immediately fails on an access violation.

The callback in Aravindhan's sample code works fine if I modify the CIF to use 12 bytes instead of 16:

    _ctypes_d!_ctypes_alloc_callback+0x24c:
    6e9f36ec 83c414          add     esp,14h
    0:000:x86> ?? p->cif.bytes = 12
    unsigned int 0xc

    0:000:x86> bd 0; g
    callback 1 477180670 30820036
    callback 2 487502671 30820036
    callback 3 497655140 30820036
    callback 4 507801994 30820036
    callback 5 517818512 30820036
    callback 6 527961670 30820036

I suppose this issue should be closed as third party, with a bug report pushed upstream -- unless for now it would be better if ctypes implements a workaround.
msg402695 - (view) Author: Andreas Skaar (xs_iceman) Date: 2021-09-27 08:24
What is the next steps on this bug?

Have you created a bug with libffi if this is not working correctly eryksun? Sounds like you understand what the bug report need to contain.

On implementing a workaround in ctypes. Is this possible?
msg403035 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-10-01 21:54
Judging from https://github.com/libffi/libffi, they could do with more contributors/support over there.

I see a bunch of reports along these lines from a few years back, apparently someone has a test suite that covers it, but hasn't merged it into the libffi test suite which is where it really belongs.

If it turns out to be a straightforward fix, we're able to patch our own build rather than waiting for a libffi release (and since we're unlikely to get new 3.3 releases for 3.10 and earlier, that's what we'll need to release it in updates). But we don't run their test suite on our build, so we'd want it to be upstream as well.
msg412729 - (view) Author: Andreas Skaar (xs_iceman) Date: 2022-02-07 10:20
Steve Dower do you have any links to the issues you mention? Are these now solved?

I am not able to see if the fix is straightforward, but from the information from Eryk Sun it sounds like it should be?
msg412771 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-02-07 17:22
Only by following the link I posted and searching for issues that sound like this one.

Which I just did for you: https://github.com/libffi/libffi/issues/367

There may be more, though. I just grabbed the first one that looked like a match.
History
Date User Action Args
2022-04-11 14:59:32adminsetgithub: 85193
2022-02-07 17:22:23steve.dowersetmessages: + msg412771
2022-02-07 10:20:19xs_icemansetmessages: + msg412729
2021-10-01 21:54:27steve.dowersetmessages: + msg403035
2021-09-27 08:24:47xs_icemansetnosy: + xs_iceman
messages: + msg402695
2020-06-23 23:20:37eryksunsetnosy: + vinay.sajip

versions: + Python 3.9, Python 3.10
2020-06-20 05:43:53eryksunsetmessages: + msg371920
2020-06-19 08:28:41itsgk92settitle: ctypes callback with structure crashes in Python 3.8 on Windows -> ctypes callback with structure crashes in Python 3.8 on Windows x86
2020-06-19 08:11:05itsgk92setfiles: + callback.zip

messages: + msg371851
2020-06-19 00:40:30eryksunsettitle: Ctype callback with Structures crashes on python 3.8 on windows. -> ctypes callback with structure crashes in Python 3.8 on Windows
stage: test needed
2020-06-19 00:28:48eryksunsetnosy: + paul.moore, tim.golden, eryksun, zach.ware, steve.dower
messages: + msg371844
components: + Windows
2020-06-18 09:26:16itsgk92create