Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ctypes calling convention that allows safe access of errno #46131

Closed
theller opened this issue Jan 11, 2008 · 27 comments
Closed

Add ctypes calling convention that allows safe access of errno #46131

theller opened this issue Jan 11, 2008 · 27 comments
Assignees
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@theller
Copy link

theller commented Jan 11, 2008

BPO 1798
Nosy @loewis, @arigo, @theller, @amauryfa
Files
  • ctypes-errno.patch: ctypes-errno.patch
  • ctypes-errno-3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/theller'
    closed_at = <Date 2008-06-06.08:37:42.759>
    created_at = <Date 2008-01-11.20:51:36.537>
    labels = ['extension-modules', 'type-feature']
    title = 'Add ctypes calling convention that allows safe access of errno'
    updated_at = <Date 2008-06-06.11:14:00.893>
    user = 'https://github.com/theller'

    bugs.python.org fields:

    activity = <Date 2008-06-06.11:14:00.893>
    actor = 'theller'
    assignee = 'theller'
    closed = True
    closed_date = <Date 2008-06-06.08:37:42.759>
    closer = 'theller'
    components = ['Extension Modules']
    creation = <Date 2008-01-11.20:51:36.537>
    creator = 'theller'
    dependencies = []
    files = ['9130', '10480']
    hgrepos = []
    issue_num = 1798
    keywords = ['patch']
    message_count = 27.0
    messages = ['59748', '59756', '59835', '67169', '67206', '67227', '67233', '67237', '67238', '67239', '67243', '67249', '67250', '67252', '67254', '67255', '67256', '67260', '67265', '67266', '67285', '67508', '67550', '67552', '67756', '67759', '67760']
    nosy_count = 5.0
    nosy_names = ['loewis', 'arigo', 'theller', 'amaury.forgeotdarc', 'fijal']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1798'
    versions = ['Python 2.6']

    @theller
    Copy link
    Author

    theller commented Jan 11, 2008

    This patch adds new calling conventions to ctypes foreign functions by
    passing 'errno=True' or 'GetLastError=True' to the CDLL or WinDLL
    constructor.

    If CDLL(..., errno=True) or WinDLL(..., errno=True) is used, the
    function objects available in the CDLL or WinDLL instance will set the C
    global errno to zero before the actual call, and attach the C global
    errno value after the call as 'errno' attribute to the function object.
    This attribute is stored in thread-local storage.

    Similarly, if CDLL(..., GetLastError=True) or WinDLL(...,
    GetLastError=True) is used, the win32 api function 'SetLastError(0)' is
    used to reset the windows last error code before the actual call, and
    the value returned by 'GetLastError()' is attached as 'LastError'
    attribute to the function object, in thread local storage. Of course
    this only occurs on Windows.

    The LastError and errno attributes are readonly from Python code,
    accessing them before a foreign function call has occurred in the
    current thread raises a ValueError.

    @theller theller self-assigned this Jan 11, 2008
    @theller theller added extension-modules C modules in the Modules dir type-feature A feature request or enhancement labels Jan 11, 2008
    @amauryfa
    Copy link
    Member

    I like this way to tell that a function modifies errno or GetLastError.

    But this thread-local attribute on the function seems bizarre to me.
    I would prefer another way to get the errno. I can see two alternatives:

    • the function returns a tuple (normalresult, errno) on each call.
    • when errno is not zero, EnvironmentError (or WindowsError) is raised.

    @theller
    Copy link
    Author

    theller commented Jan 12, 2008

    But this thread-local attribute on the function seems bizarre to me.
    I would prefer another way to get the errno. I can see two alternatives:

    • the function returns a tuple (normalresult, errno) on each call.
    • when errno is not zero, EnvironmentError (or WindowsError) is raised.

    I'd stronly prefer NOT to add errno to the function return value.
    Raising an Exception when errno or LastError != zero is wrong.
    There are functions that set the errno or LastError value even
    if they actually succeed.

    The recommended way to check for errors that I had in mind is in
    the 'errcheck' result checker:

      func = CDLL(..., errno=True)
      func.argtypes = [...]
      func.restype = ...
      def errcheck(result, func, args):
          if result == -1: # function failed
              raise EnvironmentError(func.errno)
      func.errcheck = errcheck

    Of course, an alternative to a thread local storage
    attribute would be to pass the error value to the errcheck
    function. I just felt it would be better not to change
    the signature, but maybe I was wrong.

    Anyway, this patch should be extended so that it is also possible
    to create a foreign function using the descibed calling convention
    from a prototype created by CFUNCTYPE or WINFUNCTYPE.

    @fijal
    Copy link

    fijal commented May 21, 2008

    thread local storage sounds also a bit weird to me. Errcheck sounds way
    better, I would also like to have errcheck as a possible default on
    library, when calling CDLL constructor. This would be a convinient way
    of specifying errno handling for all functions from that library.

    Cheers,
    fijal

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented May 22, 2008

    Alternatively, we can try to make ctypes "feel like" C itself:

         ctypes.set_errno(0)
         while True:
             dirent = linux_c_lib.readdir(byref(dir))
             if not dirent:
                 if ctypes.get_errno() == 0:
                     break
                 else:
                     raise IOError("oups!")

    We are doing this kind of thing in PyPy with the "rffi" foreign function
    interface. To achieve this result, ctypes would have to maintain
    internally an internal, global (but thread-local) variable representing
    the current errno value. Just before doing a C library function call,
    ctypes would copy this variable into the real C-level errno; and
    immediately after the call it would read the C-level errno back into its
    internal variable. In this way, other calls to C functions unrelated to
    ctypes don't interfere. The get_errno() and set_errno() functions
    merely access the thread-local variable (not the real C-level errno).

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    That's a great idea.

    Just before doing a C library function call,
    ctypes would copy this variable into the real C-level errno; and

    Is the ctypes.set_errno(...) function really needed? Wouldn't it be sufficient
    if errno is simply set to zero before each function call?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    I also think that there should be explicit set_errno/get_errno calls. If
    you are interfacing to C, things should be as they are in C, i.e. you
    have to explicitly set errno to zero if you want to read it afterwards
    in a reliable manner (unless you can find out whether an error occurred
    by different means, e.g. a -1 return value - which is fairly common).

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    But would it hurt to set errno to zero before *any* function call?

    My experiments show that it is faster to clear errno always
    instead of trying to get a previously set value from tls storage in a
    ctypes-global object created by calling "threading.local()".

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    People might get confused that errno is modified "without reason".

    I don't understand the need for the TLS, either - can't you just read
    and write the "true" errno (which will also be thread-local)?

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    This does not work because Python can run arbitrary code,
    even in the same thread, between the call to a function in
    a shared library and the call to get_errno().

    @fijal
    Copy link

    fijal commented May 23, 2008

    On the one hand, no it does not hurt when we're resetting errno before
    any call. On the other hand if we try hard "to be like C", it's a bit
    strange that we always reset errno (which is on the other hand very good
    practice anyway).

    IMO it's fine to reset errno before any call, until someone can find
    example where it's not good.

    Cheers,
    fijal

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    How can Python run arbitrary code between the return from a ctypes
    method and the next Python instruction? None of the code should have any
    effect on errno.

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    How can Python run arbitrary code between the return from a ctypes
    method and the next Python instruction? None of the code should have any
    effect on errno.

    By freeing objects because their refcount has reached zero?

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented May 23, 2008

    I'm in favor of keeping set_errno(). Not only is it more like C, but
    it's also future-proof against a user that will end up really needing
    the feature. If we go for always setting errno to zero, we cannot
    change that later any more, because everybody's code will rely on it.
    (An example of possible usage of set_errno(): the user might need to
    save and restore the value around a subfunction call that may call more
    C function via ctypes.)

    I personally don't care if this means a few percents of performance
    loss; I don't see how ctypes can be performance-critical in the first
    place given its huge overhead compared to e.g. rewriting the code in C.

    I imagine that the same approach could work with GetLastError() on
    Windows. (But is there a SetLastError() on Windows?)

    Using the native errno instead of a custom TLS value is bad because a
    lot of things can occur; for example, adding debugging prints or a pdb
    breakpoint between the function return and the call to get_errno() is
    very likely to end up overwriting the C-level errno value. For all I
    know, on some platforms errno could even be overwritten by the calls to
    the unlock/lock operation that CPython does at regular intervals in
    order to let other threads run. (For example I can easily imagine that
    GetLastError() is overridden by the lock/unlock operations on Windows.)

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    > How can Python run arbitrary code between the return from a ctypes
    > method and the next Python instruction? None of the code should have any
    > effect on errno.

    By freeing objects because their refcount has reached zero?

    No, that should not set errno. Free cannot fail, and will not modify
    errno. Try running

    #include <errno.h>
    int main()
    {
        errno = 117;
        free(malloc(100));
        printf("%d\n", errno);
    }

    malloc might set errno, but only if it fails (in which case you'll get
    a Python exception, anyway).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    [...]

    Using the native errno instead of a custom TLS value is bad because a
    lot of things can occur

    So what's the semantics of set_errno then? Set the real errno? If so,
    what if it gets changed between the call to set_errno, and the actual
    invocation of API function? Set the TLS errno? If so, when does it get
    copied into the real errno?

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    > Using the native errno instead of a custom TLS value is bad because a
    > lot of things can occur

    So what's the semantics of set_errno then? Set the real errno? If so,
    what if it gets changed between the call to set_errno, and the actual
    invocation of API function? Set the TLS errno? If so, when does it get
    copied into the real errno?

    AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno.
    The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c),
    and the real errno is copied in to ctypes copy right after the call.

    Probably the real errno from before this action should be restored at the end.

    Code that I have in mind:

    __thread int _ctypes_errno; /* ctypes-private global error number in thread local storage */

    static int _call_function_pointer(...)
    {
       int old_errno;
       .
       .
       old_errno = errno;
       errno = _ctypes_errno;
       ffi_call(....);
       _ctypes_errno = errno;
       errno = old_errno;
       .
       .
    }
    
    static PyObject *
    _ctypes_set_errno(PyObject *self, Pyobject *args)
    {
        .
        _ctypes_errno = parsed_argument;
        .
    }
    
    static PyObject *
    _ctypes_get_errno(PyObject *self, Pyobject *args)
    {
        return PyInt_FromLong(_ctypes_errno);
    }

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno.
    The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c),
    and the real errno is copied in to ctypes copy right after the call.

    If you clear errno, anyway, you can also drop the set_errno call, and
    zero-initialize errno before each call. The point of set_errno would
    be that you have the choice of *not* calling it, i.e. passing into
    the function the errno value that was there before you made the API
    call. If you fill something else into errno always, the application has
    no way of not modifying errno before the API call.

    @theller
    Copy link
    Author

    theller commented May 23, 2008

    > AFAIU, set_errno/get_errno should provide a ctypes-private copy of the real errno.
    > The copy is copied into the 'real' errno just before ffi_call (in Modules/_ctypes/callproc.c),
    > and the real errno is copied in to ctypes copy right after the call.

    If you clear errno, anyway, you can also drop the set_errno call, and

    My code sample did not intend to clear errno, it was intended to set errno
    to the last value that the ctypes-copy had before the call.

    zero-initialize errno before each call. The point of set_errno would
    be that you have the choice of *not* calling it, i.e. passing into
    the function the errno value that was there before you made the API
    call. If you fill something else into errno always, the application has
    no way of not modifying errno before the API call.

    To make my point clear (in case is isn't already): In the code
    sample that Armin posted, it is impossible to guarantee that no
    stdlib function is called between the readdir() and the get_errno() call:

             dirent = linux_c_lib.readdir(byref(dir))
             if not dirent:
                 if ctypes.get_errno() == 0:

    Consider the garbage collector free some objects that release resources
    they have, in other words call functions different from free(3).

    Similarly for calling set_errno() or not calling it; other stdlib function
    calls in the meantime may have changed errno and that might not be what
    the Python programmer expects.

    Anyway, how can we proceed? Do you have a suggestion?

    Should set_errno() set a flag that is examined and consumed before
    call_function_pointer() is called?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 23, 2008

    My code sample did not intend to clear errno, it was intended to set errno
    to the last value that the ctypes-copy had before the call.

    It's the same thing (to me): you change errno.

    To make my point clear (in case is isn't already): In the code
    sample that Armin posted, it is impossible to guarantee that no
    stdlib function is called between the readdir() and the get_errno() call:

         dirent = linux_c_lib.readdir(byref(dir))
         if not dirent:
             if ctypes.get_errno() == 0:
    

    Consider the garbage collector free some objects that release resources
    they have, in other words call functions different from free(3).

    However, even the TLS copy of errno may change because of this,
    if the finalizer of some object invokes ctypes, right?

    Anyway, how can we proceed? Do you have a suggestion?

    I'll leave it up to you to decide - you'll take both blame and praise,
    anyway, so I don't want to talk you into coming up with an API that you
    don't like.

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented May 24, 2008

    However, even the TLS copy of errno may change because of this,
    if the finalizer of some object invokes ctypes, right?

    Yes, it's annoying, but at least the Python programmer has a way to fix
    this problem: he can save and restore the TLS copy of errno around the
    call to ctypes done in the finalizer, using get_errno()/set_errno().
    This matches what a C programmer would have to do if he wanted to write,
    say, a debugging function that can be called from anywhere including
    between an OS call and the following check for errno.

    (Another note: the C-level errno and the TLS copy should also be
    synchronized when the C code invokes a Python callback.)

    In PyPy, when creating and using our own internal FFI, we started by
    having a pair of functions that could directly read or change the
    C-level errno. Then we ran into various troubles (e.g. just stepping
    through the Python code in a debugger messes things up because of the
    debugger's own input/output) and finally reached the design I propose
    here. (Admittedly PyPy's internal FFI is a bit more low-level than
    ctypes, so maybe it's more appropriate for ctypes to use a higher-level
    approach that a return value checker.)

    (A related issue that we may or may not care about: it's more than
    likely that various people have already come up with various workarounds
    to handle errno, and these workarounds will probably stop working after
    ctypes is changed...)

    @theller
    Copy link
    Author

    theller commented May 29, 2008

    Ok, here is the plan (basically Armin's proposal):

    ctypes maintains a gloabl, but thread-local, variable that contains an error number;
    called 'ctypes_errno' for this discussion.

    ctypes.set_errno(value) copies 'value' into ctypes_errno, and returns the old value.
    ctypes.get_errno() returns the current ctypes_errno value.

    Foreign functions created with CDLL(..., use_errno=True), when called, copy
    the ctypes_errno value into the real errno just before they are called, and copy
    the real errno value into ctypes_errno. The 'use_errno' parameter defaults to False,
    in this case ctypes_errno is not touched.

    The same scheme is implemented on Windows for GetLastError() and SetLastError(),
    the variable is named 'ctypes_LastError', and the CDLL and WinDLL optional parameter
    is named 'use_LastError', and also defaults to False.

    Armin Rigo schrieb:

    (Another note: the C-level errno and the TLS copy should also be
    synchronized when the C code invokes a Python callback.)

    Not sure what this is for, please proofread the patch when I attach one.
    Even better, someone could supply test-cases for all this, either
    as executable code or as prosa.

    (A related issue that we may or may not care about: it's more than
    likely that various people have already come up with various workarounds
    to handle errno, and these workarounds will probably stop working after
    ctypes is changed...)

    Since the new behaviour will only be invoked when use_errno or use_LastError
    is used, they should be able to still use their workarounds.

    @theller
    Copy link
    Author

    theller commented May 30, 2008

    Here is a patch implementing the plan. This text could serve as a start for the
    documentation, but it also describes the current implementation. Usage recipes
    should probably be added:

    /*
    ctypes maintains a module-global, but thread-local, variable that contains
    an error number; called 'ctypes_errno' for this discussion. This variable
    is a private copy of the stdlib 'errno' value; it is swapped with the
    'errno' variable on several occasions.

    Foreign functions created with CDLL(..., use_errno=True), when called, swap
    the values just before they are actual function call, and swapped again
    afterwards. The 'use_errno' parameter defaults to False, in this case
    ctypes_errno is not touched.

    The values are also swapped immeditately before and after ctypes callback
    functions are called, if the callbacks are constructed using the new
    optional use_errno parameter for CFUNCTYPE(..., use_errno=TRUE) or
    WINFUNCTYPE(..., use_errno=True).

    Two new ctypes functions are provided to access the 'ctypes_errno' value
    from Python:

    • ctypes.set_errno(value) sets ctypes_errno to 'value', the previous
      ctypes_errno value is returned.

    • ctypes.get_errno() returns the current ctypes_errno value.

    The same scheme is implemented on Windows for GetLastError() and
    SetLastError(), and the CDLL and WinDLL optional parameter is named
    'use_LastError', and also defaults to False.

    On Windows, TlsSetValue and TlsGetValue calls are used to provide thread
    local storage for the variables; ctypes compiled with __GNUC__ uses __thread
    variables.
    */

    @theller
    Copy link
    Author

    theller commented May 30, 2008

    Thomas Heller schrieb:

    Here is a patch implementing the plan.

    ctypes-errno-3.patch, in case of doubt.

    @theller
    Copy link
    Author

    theller commented Jun 6, 2008

    A different patch but implementing the same functionality was now
    committed as trunk rev 63977.

    @theller theller closed this as completed Jun 6, 2008
    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Jun 6, 2008

    (Another note: the C-level errno and the TLS copy should also be
    synchronized when the C code invokes a Python callback.)

    What I meant is what should occur when a pure Python function is used
    as a callback. At this point there is already some logic e.g. to
    re-acquire the GIL if necessary. Maybe it needs to grow logic to
    optionally copy the C-level errno into the TLS variable at the start,
    and at the end copy it back into the C-level errno at the end, for the
    cases where the C code expects the callback to be able to set errno.

    @theller
    Copy link
    Author

    theller commented Jun 6, 2008

    > (Another note: the C-level errno and the TLS copy should also be
    > synchronized when the C code invokes a Python callback.)

    What I meant is what should occur when a pure Python function is used
    as a callback. At this point there is already some logic e.g. to
    re-acquire the GIL if necessary. Maybe it needs to grow logic to
    optionally copy the C-level errno into the TLS variable at the start,
    and at the end copy it back into the C-level errno at the end, for the
    cases where the C code expects the callback to be able to set errno.

    I figured that out in the meantime and implemented it in this way.
    See the code around line 295 in Modules/_ctypes/callbacks.c.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants