This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Regression caused by CALL_FUNCTION specialization for C function calls (test_urllib fails when run multiple times)
Type: Stage: patch review
Components: Tests Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, brandtbucher, corona10, eric.snow, gvanrossum, kj, sobolevn, vstinner
Priority: release blocker Keywords: 3.11regression, patch

Created on 2022-01-22 01:30 by vstinner, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
bug.py vstinner, 2022-01-22 02:18
Pull Requests
URL Status Linked Edit
PR 30826 closed kj, 2022-01-23 13:02
PR 31404 sobolevn, 2022-02-18 11:00
Messages (14)
msg411217 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-22 01:30
test_unittest: TestBreakSignalDefault.testInstallHandler() fails if run after TestBreak:
---------
$ ./python -m test -v test_unittest -R 3:3 -m '*TestBreak.testHandlerReplacedButCalled' -m '*TestBreak.testInstallHandler' -m '*TestBreakSignalDefault.testInstallHandler'
(...)
beginning 6 repetitions
123456
testHandlerReplacedButCalled (unittest.test.test_break.TestBreak) ... ok
testInstallHandler (unittest.test.test_break.TestBreak) ... ok
testInstallHandler (unittest.test.test_break.TestBreakSignalDefault) ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.003s

OK
(...)
.testHandlerReplacedButCalled (unittest.test.test_break.TestBreak) ... ok
testInstallHandler (unittest.test.test_break.TestBreak) ... ok
testInstallHandler (unittest.test.test_break.TestBreakSignalDefault) ... FAIL

======================================================================
FAIL: testInstallHandler (unittest.test.test_break.TestBreakSignalDefault)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vstinner/python/main/Lib/unittest/test/test_break.py", line 38, in testInstallHandler
    self.assertTrue(unittest.signals._interrupt_handler.called)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: False is not true
(...)
---------

Or you can use a "bisect" file:
---
unittest.test.test_break.TestBreak.testHandlerReplacedButCalled
unittest.test.test_break.TestBreak.testInstallHandler
unittest.test.test_break.TestBreakSignalDefault.testInstallHandler
---

And the command:

./python -m test -v -F test_unittest --matchfile=bisect

Or the command:

./python -m test -v test_unittest -R 3:3 --matchfile=bisect

It fails starting at the 4th iteration.
msg411219 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-22 01:42
It's a regression introduced by the following change:

commit 3163e68c342434db37c69669017f96a4bb2d5f13
Author: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date:   Wed Oct 20 07:16:36 2021 +0800

    bpo-44525: Specialize ``CALL_FUNCTION`` for C function calls (GH-26934)
msg411220 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-22 02:18
Attached bug.py reproduces the bug without unittest, just run:

./python bug.py

On Python 3.10, it writes:
---
LOOP 0
LOOP 1
(...)
LOOP 14
LOOP 15
---

On the main branch, it fails with:
---
LOOP 0
LOOP 1
LOOP 2
LOOP 3
LOOP 4
LOOP 5
LOOP 6
LOOP 7
Traceback (most recent call last):
  File "/home/vstinner/bug.py", line 24, in <module>
    Bug().run()
    ^^^^^^^^^^^
  File "/home/vstinner/bug.py", line 19, in run
    raise Exception("bug")
    ^^^^^^^^^^^^^^^^^^^^^^
Exception: bug
---
msg411288 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-22 20:22
@kj, are you looking into this?
msg411348 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-23 05:50
The problem is that the optimization no longer checks for pending signals in TARGET(CALL_NO_KW_BUILTIN_FAST). The patch below fix my issue. I guess that other opcode needs an additional CHECK_EVAL_BREAKER().

diff --git a/Python/ceval.c b/Python/ceval.c
index 9aaddd99eda..7cc0f805366 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -4822,6 +4822,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, InterpreterFrame *frame, int thr
                 */
                 goto error;
             }
+            CHECK_EVAL_BREAKER();
             DISPATCH();
         }
msg411349 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-23 05:56
It seems like the following target miss CHECK_EVAL_BREAKER():

TARGET(CALL_NO_KW_METHOD_DESCRIPTOR_FAST) 
TARGET(CALL_NO_KW_METHOD_DESCRIPTOR_O)
TARGET(CALL_NO_KW_BUILTIN_FAST)
TARGET(CALL_NO_KW_BUILTIN_O)
TARGET(CALL_NO_KW_BUILTIN_CLASS_1)

CHECK_EVAL_BREAKER() matters for signals, but also multithreading (drop the GIL), asynchronous exception and pending calls.
msg411364 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2022-01-23 11:52
Thanks @Victor for catching this and figuring it out. I had a hard time narrowing this down since I'm on Windows which doesn't support signals :(.

I'll handle the required patch and tests.
msg411366 - (view) Author: Ken Jin (kj) * (Python committer) Date: 2022-01-23 13:24
A short summary (thanks to Victor's findings!):

- os.kill(pid, signal.SIGINT) is specialized to CALL_NO_KW_FAST
- CALL_NO_KW_FAST doesn't check eval breaker, so the signal is ignored
- signal handler wasn't called, test fails

I'd like to add tests for some of the other CALL_X in the future, but I'm a little short on time at the moment, sorry! For now, I've tried to turn some of the tests "hot" to see if I can get those to trigger.
msg411367 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-23 13:33
Is this a bug?

Signal handling in Python is asynchronous. https://docs.python.org/3/library/signal.html#execution-of-python-signal-handlers

The example code tests whether the interpreter responds synchronously and immediately.

If you add `for _in range(1): pass` or a call to any Python function in between the `os.kill(pid, SIGNUM)` and the `if not self.called:` then the test passes.
msg411369 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-23 13:46
> Is this a bug?

In Python 3.10, the code works. In Python 3.11, it fails.

It's a behavior change. IMO this change is unwanted. I expect that signals are handled "as soon as possible", *especially* if we receive it "during" an os.kill() call on the current process on purpose.
msg411370 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-23 13:49
It's also interesting to note that the implementation of os.kill() and signal.raise_signal() do *not* call PyErr_CheckSignal().

The following signal functions *do* call call PyErr_CheckSignal():

* signal.signal()
* signal.pause()
* signal.pthread_kill()
* signal.pthread_sigmask()

Some other signal functions call PyErr_CheckSignal() if a syscall fails with EINTR (PEP 475).
msg411371 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-23 13:52
> Signal handling in Python is asynchronous. https://docs.python.org/3/library/signal.html#execution-of-python-signal-handlers

See my previous comment: "CHECK_EVAL_BREAKER() matters for signals, but also multithreading (drop the GIL), asynchronous exception and pending calls."

If a thread executes a function which only uses opcodes which don't call CHECK_EVAL_BREAKER(), I understand that it can eat more than its slice of 5 ms, see sys.getswitchinterval():
https://docs.python.org/dev/library/sys.html#sys.getswitchinterval

So it's also a threads scheduling issue, no?
msg411461 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-24 10:23
> So it's also a threads scheduling issue, no?

No, it isn't. The interpreter checks the eval breaker frequently enough. It checks on all back edges and on all calls to Python functions.

The test probably needs to be fixed, or extended. It is signals sent from another process or thread that we should be testing for.

I'm happy to merge PR 30826, but if you are really concerned about prompt delivery of signals, then you should be worried about C extensions.

If you are worried about being able to interrupt programs, then you also
worry about poorly written Python code that uses `except:`.

Regarding C extensions, I think clear documentation that extension authors need to check for signals in any code that might run for a few hundred microseconds or longer is the best approach.

For `except:`, maybe we could issue a syntax warning, as `except:` is universally considered to be bad practice.
msg411652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-25 18:11
> Regarding C extensions, I think clear documentation that extension authors need to check for signals in any code that might run for a few hundred microseconds or longer is the best approach.

Python 3.10 doesn't require it. If you want to change Python 3.11 behavior, please announce the change on python-dev and capi-sig mailing lists, and document it carefully in What's New in Python 3.11.

I'm not talking about checking frequently interruptions while a function is running, I'm only talking about the "end" of a C function (exit). In Python 3.10, ceval.c does it automatically.

        case TARGET(CALL_FUNCTION): {
            PREDICTED(CALL_FUNCTION);
            PyObject **sp, *res;
            sp = stack_pointer;
            res = call_function(tstate, &trace_info, &sp, oparg, NULL);
            stack_pointer = sp;
            PUSH(res);
            if (res == NULL) {
                goto error;
            }
            CHECK_EVAL_BREAKER();  // <==== HERE
            DISPATCH();
        }
History
Date User Action Args
2022-04-11 14:59:55adminsetstatus: pending -> open
github: 90623
2022-02-18 11:00:24sobolevnsetstatus: open -> pending
nosy: + sobolevn
pull_requests: + pull_request29545
2022-02-11 13:20:31vstinnersettitle: Regression caused by CALL_FUNCTION specialization for C function calls -> Regression caused by CALL_FUNCTION specialization for C function calls (test_urllib fails when run multiple times)
2022-02-11 13:20:16vstinnerlinkissue46709 superseder
2022-01-26 17:40:02brandtbuchersetnosy: + brandtbucher
2022-01-25 18:11:09vstinnersetmessages: + msg411652
2022-01-24 10:23:11Mark.Shannonsetmessages: + msg411461
2022-01-23 13:52:56vstinnersetmessages: + msg411371
2022-01-23 13:49:53vstinnersetmessages: + msg411370
2022-01-23 13:46:13vstinnersetmessages: + msg411369
2022-01-23 13:33:58Mark.Shannonsetmessages: + msg411367
2022-01-23 13:24:27kjsetmessages: + msg411366
2022-01-23 13:02:36kjsetkeywords: + patch
stage: patch review
pull_requests: + pull_request29013
2022-01-23 11:52:40kjsetmessages: + msg411364
2022-01-23 05:56:27vstinnersetmessages: + msg411349
2022-01-23 05:50:39vstinnersetmessages: + msg411348
2022-01-23 03:45:57corona10setnosy: + corona10
2022-01-22 22:57:33vstinnersettitle: test_unittest: TestBreakSignalDefault.testInstallHandler() fails if run after TestBreak -> Regression caused by CALL_FUNCTION specialization for C function calls
2022-01-22 20:22:28gvanrossumsetmessages: + msg411288
2022-01-22 02:18:28vstinnersetfiles: + bug.py

messages: + msg411220
2022-01-22 01:42:23vstinnersetpriority: normal -> release blocker

nosy: + gvanrossum, Mark.Shannon, eric.snow, kj
messages: + msg411219

keywords: + 3.11regression
2022-01-22 01:30:09vstinnercreate