This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Tracing causes module globals to be mutated when calling functions from C
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, seberg
Priority: normal Keywords:

Created on 2022-01-21 04:54 by seberg, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg411082 - (view) Author: Sebastian Berg (seberg) * Date: 2022-01-21 04:54
Starting here, but there could be Cython interaction or something else in theory.  But, when running the following:

* Python 3.10.1 (not 3.9.9, debug version or not)
* Setting a tracing function  (not setting a trace-function will fix the issue)
* Running Cython (maybe also C code) calling back into Python (the best I can tell)

It can happen that module globals in the called funtions scope seem to be modified.  Oddly enough to a value used in the locals of that function?!

The pandas issue:

    https://github.com/pandas-dev/pandas/issues/41935

has a reproducer (sorry that it takes NumPy and pandas for now).  I will paste it at the end here also.

I can find that the value is modified by the time the `trace` function is called.  No other "trace" triggers are processed before in this example (Cython calls other functions in NumPy, but all of those are C implemented, so I guess that is why).
The function in question (unlike all?) should further be called with `__Pyx_PyFunction_FastCall`, so that is probably an important data point: Fastcall protocol seems involved.

(Reproducible using NumPy 1.21.5 and Pandas 1.3.5, but except maybe pandas due to the Cython version, I don't expect version dependency.)

The output of the script is very brief:

    Something happened here, `np.core.numeric.dtype IS np.dtype`
    <frame at 0x7ff8d84416c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 289, code full> call None


The full reproducer script is:


import sys
import numpy as np
import pandas as pd

from numpy.core import numeric

stop = False
def trace(frame, event, arg):
    global stop
    if stop:
        return None

    if np.core.numeric.dtype is not np.dtype:
        print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(frame, event, arg)
        stop = True
    else:
        print(frame, event, arg)

    return trace

sys.settrace(trace)

pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))


For completeness, the Cython code calling the NumPy function in question, is (likley, there is more, this is Cython, I just cut it out a bit :)):

  #if CYTHON_FAST_PYCALL
  if (PyFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  } else
  #endif
msg411084 - (view) Author: Sebastian Berg (seberg) * Date: 2022-01-21 05:05
Ahh, a further data-point.  The name from the module scope that is overwritten IS a parameter name used in the function locals.  Strangly, if I modify the tracing to print more:

stop = 0
def trace(frame, event, arg):
    global stop
    if stop > 10:
        return None

    if np.core.numeric.dtype is not np.dtype:
        #print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(np.core.numeric.dtype)
        print(frame, event, arg)
        stop += 1
    else:
        print(frame, event, arg)

    return trace

Then what I get is:

None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 289, code full> call None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 337, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 340, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 341, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 342, code full> line None
bool


So, upon entering the function, the value is (already) cleared/set to None (which is correct of course for `dtype=None`) and then while the function runs storing into the function locals _mutates_ the module global?

For the fact that it keeps changing during the function run, points very strongly at CPython?
msg412240 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-02-01 10:10
Can you reproduce this in pure Python?
If not, can you produce a minimal reproducer using just NumPy?

If you can't do either, I'm going to have to assume that this is a NumPy or Pandas bug.

Maybe NumPy or Pandas is accessing CPython internals, but not via the C-API, and those internals changed between 3.9 and 3.10?
msg412276 - (view) Author: Sebastian Berg (seberg) * Date: 2022-02-01 15:21
Thanks for having a look. I have confirmed this is related to Cython (no pandas/NumPy involved) – repro at https://github.com/seberg/bpo46451.  What happens under the hood in Cython is probably:
    https://github.com/cython/cython/blob/master/Cython/Utility/ObjectHandling.c#L2569-L2611

Which generates `PyEval_EvalCodeEx`, and I could not repro with just a `PyObject_FastCallDict`, so I assume Cython is doing something wrong and will open an issue there, but if you have a quick tip as to what might wrong, that could be nice :).

Otherwise, will just close this, and may reopen if Cython hits a wall.
msg412280 - (view) Author: Sebastian Berg (seberg) * Date: 2022-02-01 15:42
Not reopening for now, but I will note again that (AFAIK) Cython uses `PyEval_EvalCodeEx`, and the docs say that it is not used internally to CPython anymore.

So it seems pretty plausible that the bug is in `PyEval_EvalCodeEx` and not the generated Cython code?
msg412288 - (view) Author: Sebastian Berg (seberg) * Date: 2022-02-01 16:32
While I have a repro for Python, I think the pre release of cython already fixes it (and I just did not regenerated the C sources when trying, I guess.  A `git clean` to the rescue...).
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90609
2022-02-01 16:32:18sebergsetmessages: + msg412288
2022-02-01 15:42:17sebergsetmessages: + msg412280
2022-02-01 15:21:06sebergsetstatus: open -> closed
stage: resolved
2022-02-01 15:21:01sebergsetmessages: + msg412276
2022-02-01 10:10:23Mark.Shannonsetnosy: + Mark.Shannon
messages: + msg412240
2022-01-31 17:29:49sebergsettitle: Possibly bad interaction with tracing and cython? -> Tracing causes module globals to be mutated when calling functions from C
2022-01-21 05:05:15sebergsetmessages: + msg411084
2022-01-21 04:54:22sebergcreate