Issue 1798: Add ctypes calling convention that allows safe access of errno

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/46131

classification

Title:	Add ctypes calling convention that allows safe access of errno
Type:	enhancement	Stage:
Components:	Extension Modules	Versions:	Python 2.6

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	theller	Nosy List:	amaury.forgeotdarc, arigo, fijal, loewis, theller
Priority:	normal	Keywords:	patch

Created on 2008-01-11 20:51 by theller, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
ctypes-errno.patch	theller, 2008-01-11 20:51	ctypes-errno.patch
ctypes-errno-3.patch	theller, 2008-05-30 20:40

Messages (27)
msg59748 - (view)	Author: Thomas Heller (theller) *	Date: 2008-01-11 20:51
This patch adds new calling conventions to ctypes foreign functions by passing 'errno=True' or 'GetLastError=True' to the CDLL or WinDLL constructor. If CDLL(..., errno=True) or WinDLL(..., errno=True) is used, the function objects available in the CDLL or WinDLL instance will set the C global errno to zero before the actual call, and attach the C global errno value after the call as 'errno' attribute to the function object. This attribute is stored in thread-local storage. Similarly, if CDLL(..., GetLastError=True) or WinDLL(..., GetLastError=True) is used, the win32 api function 'SetLastError(0)' is used to reset the windows last error code before the actual call, and the value returned by 'GetLastError()' is attached as 'LastError' attribute to the function object, in thread local storage. Of course this only occurs on Windows. The LastError and errno attributes are readonly from Python code, accessing them before a foreign function call has occurred in the current thread raises a ValueError.
msg59756 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-01-11 21:27
I like this way to tell that a function modifies errno or GetLastError. But this thread-local attribute on the function seems bizarre to me. I would prefer another way to get the errno. I can see two alternatives: - the function returns a tuple (normalresult, errno) on each call. - when errno is not zero, EnvironmentError (or WindowsError) is raised.
msg59835 - (view)	Author: Thomas Heller (theller) *	Date: 2008-01-12 19:56
> But this thread-local attribute on the function seems bizarre to me. > I would prefer another way to get the errno. I can see two alternatives: > - the function returns a tuple (normalresult, errno) on each call. > - when errno is not zero, EnvironmentError (or WindowsError) is raised. I'd stronly prefer NOT to add errno to the function return value. Raising an Exception when errno or LastError != zero is wrong. There are functions that set the errno or LastError value even if they actually succeed. The recommended way to check for errors that I had in mind is in the 'errcheck' result checker: func = CDLL(..., errno=True) func.argtypes = [...] func.restype = ... def errcheck(result, func, args): if result == -1: # function failed raise EnvironmentError(func.errno) func.errcheck = errcheck Of course, an alternative to a thread local storage attribute would be to pass the error value to the errcheck function. I just felt it would be better not to change the signature, but maybe I was wrong. Anyway, this patch should be extended so that it is also possible to create a foreign function using the descibed calling convention from a prototype created by CFUNCTYPE or WINFUNCTYPE.
msg67169 - (view)	Author: Maciek Fijalkowski (fijal)	Date: 2008-05-21 19:34
thread local storage sounds also a bit weird to me. Errcheck sounds way better, I would also like to have errcheck as a possible default on library, when calling CDLL constructor. This would be a convinient way of specifying errno handling for all functions from that library. Cheers, fijal
msg67206 - (view)	Author: Armin Rigo (arigo) *	Date: 2008-05-22 19:57
Alternatively, we can try to make ctypes "feel like" C itself: ctypes.set_errno(0) while True: dirent = linux_c_lib.readdir(byref(dir)) if not dirent: if ctypes.get_errno() == 0: break else: raise IOError("oups!") We are doing this kind of thing in PyPy with the "rffi" foreign function interface. To achieve this result, ctypes would have to maintain internally an internal, global (but thread-local) variable representing the current errno value. Just before doing a C library function call, ctypes would copy this variable into the real C-level errno; and immediately after the call it would read the C-level errno back into its internal variable. In this way, other calls to C functions unrelated to ctypes don't interfere. The get_errno() and set_errno() functions merely access the thread-local variable (not the real C-level errno).
msg67227 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 11:15
That's a great idea. > Just before doing a C library function call, > ctypes would copy this variable into the real C-level errno; and Is the ctypes.set_errno(...) function really needed? Wouldn't it be sufficient if errno is simply set to zero before each function call?
msg67233 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 12:41
I also think that there should be explicit set_errno/get_errno calls. If you are interfacing to C, things should be as they are in C, i.e. you have to explicitly set errno to zero if you want to read it afterwards in a reliable manner (unless you can find out whether an error occurred by different means, e.g. a -1 return value - which is fairly common).
msg67237 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 13:42
But would it hurt to set errno to zero before any function call? My experiments show that it is faster to clear errno always instead of trying to get a previously set value from tls storage in a ctypes-global object created by calling "threading.local()".
msg67238 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 14:01
People might get confused that errno is modified "without reason". I don't understand the need for the TLS, either - can't you just read and write the "true" errno (which will also be thread-local)?
msg67239 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 14:14
This does not work because Python can run arbitrary code, even in the same thread, between the call to a function in a shared library and the call to get_errno().
msg67243 - (view)	Author: Maciek Fijalkowski (fijal)	Date: 2008-05-23 14:52
On the one hand, no it does not hurt when we're resetting errno before any call. On the other hand if we try hard "to be like C", it's a bit strange that we always reset errno (which is on the other hand very good practice anyway). IMO it's fine to reset errno before any call, until someone can find example where it's not good. Cheers, fijal
msg67249 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 15:32
How can Python run arbitrary code between the return from a ctypes method and the next Python instruction? None of the code should have any effect on errno.
msg67250 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 15:37
> How can Python run arbitrary code between the return from a ctypes > method and the next Python instruction? None of the code should have any > effect on errno. By freeing objects because their refcount has reached zero?
msg67252 - (view)	Author: Armin Rigo (arigo) *	Date: 2008-05-23 15:57
I'm in favor of keeping set_errno(). Not only is it more like C, but it's also future-proof against a user that will end up really needing the feature. If we go for always setting errno to zero, we cannot change that later any more, because everybody's code will rely on it. (An example of possible usage of set_errno(): the user might need to save and restore the value around a subfunction call that may call more C function via ctypes.) I personally don't care if this means a few percents of performance loss; I don't see how ctypes can be performance-critical in the first place given its huge overhead compared to e.g. rewriting the code in C. I imagine that the same approach could work with GetLastError() on Windows. (But is there a SetLastError() on Windows?) Using the native errno instead of a custom TLS value is bad because a lot of things can occur; for example, adding debugging prints or a pdb breakpoint between the function return and the call to get_errno() is very likely to end up overwriting the C-level errno value. For all I know, on some platforms errno could even be overwritten by the calls to the unlock/lock operation that CPython does at regular intervals in order to let other threads run. (For example I can easily imagine that GetLastError() is overridden by the lock/unlock operations on Windows.)
msg67254 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 17:31
>> How can Python run arbitrary code between the return from a ctypes >> method and the next Python instruction? None of the code should have any >> effect on errno. > > By freeing objects because their refcount has reached zero? No, that should not set errno. Free cannot fail, and will not modify errno. Try running #include <errno.h> int main() { errno = 117; free(malloc(100)); printf("%d\n", errno); } malloc might set errno, but only if it fails (in which case you'll get a Python exception, anyway).
msg67255 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 17:34
[...] > > Using the native errno instead of a custom TLS value is bad because a > lot of things can occur So what's the semantics of set_errno then? Set the real errno? If so, what if it gets changed between the call to set_errno, and the actual invocation of API function? Set the TLS errno? If so, when does it get copied into the real errno?
msg67256 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 17:47
>> Using the native errno instead of a custom TLS value is bad because a >> lot of things can occur > > So what's the semantics of set_errno then? Set the real errno? If so, > what if it gets changed between the call to set_errno, and the actual > invocation of API function? Set the TLS errno? If so, when does it get > copied into the real errno? AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno. The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c), and the real errno is copied in to ctypes copy right after the call. Probably the real errno from before this action should be restored at the end. Code that I have in mind: __thread int _ctypes_errno; /* ctypes-private global error number in thread local storage / static int _call_function_pointer(...) { int old_errno; . . old_errno = errno; errno = _ctypes_errno; ffi_call(....); _ctypes_errno = errno; errno = old_errno; . . } static PyObject _ctypes_set_errno(PyObject self, Pyobject args) { . _ctypes_errno = parsed_argument; . } static PyObject * _ctypes_get_errno(PyObject self, Pyobject args) { return PyInt_FromLong(_ctypes_errno); }
msg67260 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 18:53
> AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno. > The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c), > and the real errno is copied in to ctypes copy right after the call. If you clear errno, anyway, you can also drop the set_errno call, and zero-initialize errno before each call. The point of set_errno would be that you have the choice of not calling it, i.e. passing into the function the errno value that was there before you made the API call. If you fill something else into errno always, the application has no way of not modifying errno before the API call.
msg67265 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-23 19:34
>> AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno. >> The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c), >> and the real errno is copied in to ctypes copy right after the call. > > If you clear errno, anyway, you can also drop the set_errno call, and My code sample did not intend to clear errno, it was intended to set errno to the last value that the ctypes-copy had before the call. > zero-initialize errno before each call. The point of set_errno would > be that you have the choice of not calling it, i.e. passing into > the function the errno value that was there before you made the API > call. If you fill something else into errno always, the application has > no way of not modifying errno before the API call. To make my point clear (in case is isn't already): In the code sample that Armin posted, it is impossible to guarantee that no stdlib function is called between the readdir() and the get_errno() call: dirent = linux_c_lib.readdir(byref(dir)) if not dirent: if ctypes.get_errno() == 0: Consider the garbage collector free some objects that release resources they have, in other words call functions different from free(3). Similarly for calling set_errno() or not calling it; other stdlib function calls in the meantime may have changed errno and that might not be what the Python programmer expects. Anyway, how can we proceed? Do you have a suggestion? Should set_errno() set a flag that is examined and consumed before call_function_pointer() is called?
msg67266 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2008-05-23 19:54
> My code sample did not intend to clear errno, it was intended to set errno > to the last value that the ctypes-copy had before the call. It's the same thing (to me): you change errno. > To make my point clear (in case is isn't already): In the code > sample that Armin posted, it is impossible to guarantee that no > stdlib function is called between the readdir() and the get_errno() call: > > dirent = linux_c_lib.readdir(byref(dir)) > if not dirent: > if ctypes.get_errno() == 0: > > Consider the garbage collector free some objects that release resources > they have, in other words call functions different from free(3). However, even the TLS copy of errno may change because of this, if the finalizer of some object invokes ctypes, right? > Anyway, how can we proceed? Do you have a suggestion? I'll leave it up to you to decide - you'll take both blame and praise, anyway, so I don't want to talk you into coming up with an API that you don't like.
msg67285 - (view)	Author: Armin Rigo (arigo) *	Date: 2008-05-24 07:58
> However, even the TLS copy of errno may change because of this, > if the finalizer of some object invokes ctypes, right? Yes, it's annoying, but at least the Python programmer has a way to fix this problem: he can save and restore the TLS copy of errno around the call to ctypes done in the finalizer, using get_errno()/set_errno(). This matches what a C programmer would have to do if he wanted to write, say, a debugging function that can be called from anywhere including between an OS call and the following check for errno. (Another note: the C-level errno and the TLS copy should also be synchronized when the C code invokes a Python callback.) In PyPy, when creating and using our own internal FFI, we started by having a pair of functions that could directly read or change the C-level errno. Then we ran into various troubles (e.g. just stepping through the Python code in a debugger messes things up because of the debugger's own input/output) and finally reached the design I propose here. (Admittedly PyPy's internal FFI is a bit more low-level than ctypes, so maybe it's more appropriate for ctypes to use a higher-level approach that a return value checker.) (A related issue that we may or may not care about: it's more than likely that various people have already come up with various workarounds to handle errno, and these workarounds will probably stop working after ctypes is changed...)
msg67508 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-29 18:56
Ok, here is the plan (basically Armin's proposal): ctypes maintains a gloabl, but thread-local, variable that contains an error number; called 'ctypes_errno' for this discussion. ctypes.set_errno(value) copies 'value' into ctypes_errno, and returns the old value. ctypes.get_errno() returns the current ctypes_errno value. Foreign functions created with CDLL(..., use_errno=True), when called, copy the ctypes_errno value into the real errno just before they are called, and copy the real errno value into ctypes_errno. The 'use_errno' parameter defaults to False, in this case ctypes_errno is not touched. The same scheme is implemented on Windows for GetLastError() and SetLastError(), the variable is named 'ctypes_LastError', and the CDLL and WinDLL optional parameter is named 'use_LastError', and also defaults to False. Armin Rigo schrieb: > (Another note: the C-level errno and the TLS copy should also be > synchronized when the C code invokes a Python callback.) Not sure what this is for, please proofread the patch when I attach one. Even better, someone could supply test-cases for all this, either as executable code or as prosa. > (A related issue that we may or may not care about: it's more than > likely that various people have already come up with various workarounds > to handle errno, and these workarounds will probably stop working after > ctypes is changed...) Since the new behaviour will only be invoked when use_errno or use_LastError is used, they should be able to still use their workarounds.
msg67550 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-30 20:40
Here is a patch implementing the plan. This text could serve as a start for the documentation, but it also describes the current implementation. Usage recipes should probably be added: /* ctypes maintains a module-global, but thread-local, variable that contains an error number; called 'ctypes_errno' for this discussion. This variable is a private copy of the stdlib 'errno' value; it is swapped with the 'errno' variable on several occasions. Foreign functions created with CDLL(..., use_errno=True), when called, swap the values just before they are actual function call, and swapped again afterwards. The 'use_errno' parameter defaults to False, in this case ctypes_errno is not touched. The values are also swapped immeditately before and after ctypes callback functions are called, if the callbacks are constructed using the new optional use_errno parameter for CFUNCTYPE(..., use_errno=TRUE) or WINFUNCTYPE(..., use_errno=True). Two new ctypes functions are provided to access the 'ctypes_errno' value from Python: - ctypes.set_errno(value) sets ctypes_errno to 'value', the previous ctypes_errno value is returned. - ctypes.get_errno() returns the current ctypes_errno value. The same scheme is implemented on Windows for GetLastError() and SetLastError(), and the CDLL and WinDLL optional parameter is named 'use_LastError', and also defaults to False. On Windows, TlsSetValue and TlsGetValue calls are used to provide thread local storage for the variables; ctypes compiled with __GNUC__ uses __thread variables. */
msg67552 - (view)	Author: Thomas Heller (theller) *	Date: 2008-05-30 20:46
Thomas Heller schrieb: > Here is a patch implementing the plan. ctypes-errno-3.patch, in case of doubt.
msg67756 - (view)	Author: Thomas Heller (theller) *	Date: 2008-06-06 08:37
A different patch but implementing the same functionality was now committed as trunk rev 63977.
msg67759 - (view)	Author: Armin Rigo (arigo) *	Date: 2008-06-06 11:07
> (Another note: the C-level errno and the TLS copy should also be > synchronized when the C code invokes a Python callback.) What I meant is what should occur when a pure Python function is used as a callback. At this point there is already some logic e.g. to re-acquire the GIL if necessary. Maybe it needs to grow logic to optionally copy the C-level errno into the TLS variable at the start, and at the end copy it back into the C-level errno at the end, for the cases where the C code expects the callback to be able to set errno.
msg67760 - (view)	Author: Thomas Heller (theller) *	Date: 2008-06-06 11:14
>> (Another note: the C-level errno and the TLS copy should also be >> synchronized when the C code invokes a Python callback.) > > What I meant is what should occur when a pure Python function is used > as a callback. At this point there is already some logic e.g. to > re-acquire the GIL if necessary. Maybe it needs to grow logic to > optionally copy the C-level errno into the TLS variable at the start, > and at the end copy it back into the C-level errno at the end, for the > cases where the C code expects the callback to be able to set errno. I figured that out in the meantime and implemented it in this way. See the code around line 295 in Modules/_ctypes/callbacks.c.

History
Date	User	Action	Args
2022-04-11 14:56:29	admin	set	github: 46131
2008-06-06 11:14:00	theller	set	messages: + msg67760
2008-06-06 11:07:05	arigo	set	messages: + msg67759
2008-06-06 08:37:42	theller	set	status: open -> closed resolution: fixed messages: + msg67756
2008-05-30 20:46:52	theller	set	messages: + msg67552
2008-05-30 20:40:25	theller	set	files: + ctypes-errno-3.patch messages: + msg67550
2008-05-29 18:56:59	theller	set	messages: + msg67508
2008-05-24 07:58:54	arigo	set	messages: + msg67285
2008-05-23 19:54:14	loewis	set	messages: + msg67266
2008-05-23 19:34:27	theller	set	messages: + msg67265
2008-05-23 18:53:32	loewis	set	messages: + msg67260
2008-05-23 17:47:20	theller	set	messages: + msg67256
2008-05-23 17:34:22	loewis	set	messages: + msg67255
2008-05-23 17:31:10	loewis	set	messages: + msg67254
2008-05-23 15:57:21	arigo	set	messages: + msg67252
2008-05-23 15:37:55	theller	set	messages: + msg67250
2008-05-23 15:32:16	loewis	set	messages: + msg67249
2008-05-23 14:52:49	fijal	set	messages: + msg67243
2008-05-23 14:14:12	theller	set	messages: + msg67239
2008-05-23 14:01:07	loewis	set	messages: + msg67238
2008-05-23 13:42:30	theller	set	messages: + msg67237
2008-05-23 12:41:16	loewis	set	nosy: + loewis messages: + msg67233
2008-05-23 11:15:50	theller	set	messages: + msg67227
2008-05-22 19:57:03	arigo	set	nosy: + arigo messages: + msg67206
2008-05-21 19:34:53	fijal	set	nosy: + fijal messages: + msg67169
2008-01-12 19:56:54	theller	set	messages: + msg59835
2008-01-11 21:27:51	amaury.forgeotdarc	set	nosy: + amaury.forgeotdarc messages: + msg59756
2008-01-11 20:51:36	theller	create