Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize out Py_UNREACHABLE in the release mode #82430

Closed
serhiy-storchaka opened this issue Sep 22, 2019 · 13 comments
Closed

Optimize out Py_UNREACHABLE in the release mode #82430

serhiy-storchaka opened this issue Sep 22, 2019 · 13 comments
Labels
3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@serhiy-storchaka
Copy link
Member

BPO 38249
Nosy @warsaw, @pitrou, @vstinner, @serhiy-storchaka
PRs
  • bpo-38249: Expand Py_UNREACHABLE() to __builtin_unreachable() in the release mode. #16329
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-03-09.18:50:16.942>
    created_at = <Date 2019-09-22.09:47:06.857>
    labels = ['interpreter-core', '3.9', 'performance']
    title = 'Optimize out Py_UNREACHABLE in the release mode'
    updated_at = <Date 2020-03-09.18:50:16.941>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2020-03-09.18:50:16.941>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-03-09.18:50:16.942>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2019-09-22.09:47:06.857>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38249
    keywords = ['patch']
    message_count = 13.0
    messages = ['352965', '353025', '353026', '353029', '353039', '353941', '353950', '353953', '363674', '363704', '363705', '363706', '363760']
    nosy_count = 4.0
    nosy_names = ['barry', 'pitrou', 'vstinner', 'serhiy.storchaka']
    pr_nums = ['16329']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue38249'
    versions = ['Python 3.9']

    @serhiy-storchaka
    Copy link
    Member Author

    Py_UNREACHABLE is used to indicate that a specific point in the program cannot be reached, even if the compiler might otherwise think it can. This is exact the case for __builtin_unreachable in GCC and Clang. I propose to extend Py_UNREACHABLE() to __builtin_unreachable() in the release mode. This will allow the compiler to generate more efficient code.

    If there are circumstances in which Py_UNREACHABLE() is reachable, then it is improper use of Py_UNREACHABLE(). It should be replaced with raising an appropriate exception (like TypeError, ValueError, RuntimeError or SystemError) or, in extreme cases, with explicit Py_FatalError()

    @serhiy-storchaka serhiy-storchaka added 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Sep 22, 2019
    @vstinner
    Copy link
    Member

    What does __builtin_unreachable()? Does it nothing? Call abort()?

    Py_UNREACHABLE is used to indicate that a specific point in the program cannot be reached, even if the compiler might otherwise think it can.

    I disagree here.
    https://docs.python.org/dev/c-api/intro.html#c.Py_UNREACHABLE

    For me, it's not an issue with the compiler, but more about handling corner cases. Py_UNREACHABLE() is used in cases which "should not occur" but can technically occur if you pass invalid data, or in "very unlikely case".

    I looked at the Python code base.

    tracemalloc_realloc():

            if (ADD_TRACE(ptr2, new_size) < 0) {
                /* Memory allocation failed. The error cannot be reported to
                   the caller, because realloc() may already have shrunk the
                   memory block and so removed bytes.
               This case is very unlikely: a hash entry has just been
               released, so the hash table should have at least one free entry.
    
                   The GIL and the table lock ensures that only one thread is
                   allocating memory. */
                Py_UNREACHABLE();
            }

    Technically, Py_UNREACHABLE() can be reached if there is very low available memory, but it "should not".

    dictobject.c:

    static Py_ssize_t
    lookdict_index(PyDictKeysObject *k, Py_hash_t hash, Py_ssize_t index)
    {
        size_t mask = DK_MASK(k);
        size_t perturb = (size_t)hash;
        size_t i = (size_t)hash & mask;
    
        for (;;) {
            Py_ssize_t ix = dictkeys_get_index(k, i);
            if (ix == index) {
                return i;
            }
            if (ix == DKIX_EMPTY) {
                return DKIX_EMPTY;
            }
            perturb >>= PERTURB_SHIFT;
            i = mask & (i*5 + perturb + 1);
        }
        Py_UNREACHABLE();
    }

    Here it's unclear to me if this case can be reached or not.

    _PyLong_Format() calls Py_UNREACHABLE() if base is not in (2, 8, 16).

    long_bitwise() calls Py_UNREACHABLE() if op is not in ('&', '^', '|'). Technically, this case cannot occur, since it's static function which has exactly 3 call sites all with valid 'op'. This *very specific* case could use __builtin_unreachable().

    pytime.c:

    _PyTime_t
    _PyTime_GetSystemClock(void)
    {
        _PyTime_t t;
        if (pygettimeofday(&t, NULL, 0) < 0) {
            /* should not happen, _PyTime_Init() checked the clock at startup */
            Py_UNREACHABLE();
        }
        return t;
    }
    
    _PyTime_t
    _PyTime_GetPerfCounter(void)
    {
        _PyTime_t t;
        if (_PyTime_GetPerfCounterWithInfo(&t, NULL)) {
            Py_UNREACHABLE();
        }
        return t;
    }

    Clocks should not fail, but if they fail I would prefer to call Py_FatalError() to collect the traceback where the bug occurred.

    This is exact the case for __builtin_unreachable in GCC and Clang. I propose to extend Py_UNREACHABLE() to __builtin_unreachable() in the release mode. This will allow the compiler to generate more efficient code.

    If there are circumstances in which Py_UNREACHABLE() is reachable, then it is improper use of Py_UNREACHABLE(). It should be replaced with raising an appropriate exception (like TypeError, ValueError, RuntimeError or SystemError) or, in extreme cases, with explicit Py_FatalError()

    If you consider that it's really worth it to optimize Py_UNREACHABLE(), I would prefer to have a new macro which is only used for compiler warnings: for cases which really "cannot happen".

    But it sounds hard to ensure that bugs cannot happen. I'm not sure that it's worth it to optimize Py_UNREACHABLE(). IMHO the current implementation is a good tradeoff between correctness and performance.

    @vstinner
    Copy link
    Member

    See also bpo-38147.

    @vstinner
    Copy link
    Member

    Ah, moreover, it's a public macro, so I would prefer to not change its behavior. I consider that modify the macro to call __builtin_unreachable() changes the behavior.

    @serhiy-storchaka
    Copy link
    Member Author

    What does __builtin_unreachable()? Does it nothing? Call abort()?

    It is more than does nothing. It gives a hint to the compiler, so it can optimize out checks that lead to this code. For example, in

        switch (k) {
        case 0: ...
        case 1: ...
        case 2: ...
        case 3: ...
        default: __builtin_unreachable();
        }

    the compiler can remove the check that k is in the range from 0 to 3 and use direct jump table. Or it can keep tests for k == 0, k == 1, k == 2 and remove the test for k == 3.

    Py_UNREACHABLE() replaces assert(0), but additionally silences compiler warning about unreachable code. It was the purpose of introducing Py_UNREACHABLE() (see bpo-14656).

    For me, it's not an issue with the compiler, but more about handling corner cases. Py_UNREACHABLE() is used in cases which "should not occur" but can technically occur if you pass invalid data, or in "very unlikely case".

    It is bad if Py_UNREACHABLE() is now used in both cases: for assert(0) and for abort(). Using Py_UNREACHABLE() for "very unlikely case" is technically incorrect. Raise an exception or call Py_FatalError() in that case.

    We need to fix incorrect uses of Py_UNREACHABLE() or add Py_TRUE_UNREACHABLE() for true unreachable code.

    Here it's unclear to me if this case can be reached or not.

    This is an infinite loop without "break". What can be the doubts?

    _PyLong_Format() calls Py_UNREACHABLE() if base is not in (2, 8, 16).

    There is assert(base == 2 || base == 8 || base == 16) above in the code. Technically it is a bug. _PyLong_Format() can be used from public PyNumber_ToBase(). Either PyNumber_ToBase() or _PyLong_Format() should check that the base is 2, 8, 10 or 16, and raise an exception otherwise, but not abort the program.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 4, 2019

    I agree with Serhiy. If you hit a rare situation that you don't want to handle, use Py_FatalError(), not Py_UNREACHABLE().

    Py_UNREACHABLE() is for situations which are *logically* impossible, not just uncommon.

    @vstinner
    Copy link
    Member

    vstinner commented Oct 4, 2019

    Py_UNREACHABLE() is for situations which are *logically* impossible, not just uncommon.

    I suggest to clarify/enhance Py_UNREACHABLE() documentation to make it more explicit in that case.

    @warsaw
    Copy link
    Member

    warsaw commented Oct 4, 2019

    +1 Serhiy.

    @serhiy-storchaka
    Copy link
    Member Author

    Updated the PR according to review comments.

    PyNumber_ToBase() now raises SystemError for unsupported base instead of crashing. unicode_eq() asserts that unicode strings are ready (it is only called for hashed strings and if we have a hash they should be ready). PyCode_Optimize() returns the malformed bytecode unmodified. Other potentially non-unreachable code call now Py_FatalError(). Feel free to propose better messages. All other cases look truly unreachable to me.

    Added support for the Intel and Microsoft compilers.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 9, 2020

    Antoine: "I agree with Serhiy. If you hit a rare situation that you don't want to handle, use Py_FatalError(), not Py_UNREACHABLE()."

    Py_UNREACHABLE() documentation still says "Use this when you have a code path that you do not expect to be reached." and "Use this in places where you might be tempted to put an assert(0) or abort() call."

    https://docs.python.org/dev/c-api/intro.html#c.Py_UNREACHABLE

    Can we clarify when Py_UNREACHABLE() must *not* be used?

    I understand that there are two cases:

    • Code path very unlikely to reach, but can be reached in some exceptional case. If it's reached, Python must report an error. Example: _PyTime_GetSystemClock().
    • Code path which cannot be technically reached: compiler complains but it's a compiler/linter false alarm. Example: lookdict_index().

    I propose to rephrase the Py_UNREACHABLE() macro documentation:

    "Use this when you have a code path that cannot be reached by design. For example, in the default: clause in a switch statement for which all possible values are covered in case statements. Use this in places where you might be tempted to put an assert(0) or abort() call.

    Don't use this macro for very unlikely code path if it can be reached under exceptional case. For example, under low memory condition or if a system call returns a value out of the expected range."

    I suggest to update the doc as part of PR 16329.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 9, 2020

    The GCC documentation contains a good example:
    https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

    """
    Another use for __builtin_unreachable is following a call a function that never returns but that is not declared __attribute__((noreturn)), as in this example:

    void function_that_never_returns (void);
    
    int g (int c)
    {
      if (c)
        {
          return 1;
        }
      else
        {
          function_that_never_returns ();
          __builtin_unreachable ();
        }
    }
    """

    This example is more explicit than a function which gets an "unexpected" value: such case should use Py_FatalError() if I understand correctly.

    By the way, Py_FatalError() should be avoided: it's always better to report the failure to the caller :-)

    @vstinner
    Copy link
    Member

    vstinner commented Mar 9, 2020

    """
    Use this when you have a code path that cannot be reached by design. For example, in the default: clause in a switch statement for which all possible values are covered in case statements. Use this in places where you might be tempted to put an assert(0) or abort() call.

    In release mode, the macro helps the compiler to optimize the code, and avoids a warning about unreachable code. For example, the macro is implemented with __builtin_unreachable() on GCC in release mode.

    An use for Py_UNREACHABLE() is following a call a function that never returns but that is not declared _Py_NO_RETURN.

    If a code path is very unlikely code but can be reached under exceptional case, this macro must not be used. For example, under low memory condition or if a system call returns a value out of the expected range. In this case, it's better to report the error to the caller. If the error cannot be reported to caller, :c:func:`Py_FatalError` can be used.
    """

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset eebaa9b by Serhiy Storchaka in branch 'master':
    bpo-38249: Expand Py_UNREACHABLE() to __builtin_unreachable() in the release mode. (GH-16329)
    eebaa9b

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants