Title: UnicodeEncodeError on recursion limit if the script filename is undecodable
Type: Stage:
Components: Unicode Versions: Python 3.3
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: haypo
Priority: normal Keywords: patch

Created on 2011-02-10 12:21 by haypo, last changed 2011-02-21 21:07 by haypo. This issue is now closed.

File name Uploaded Description Edit
ceval_filename.patch haypo, 2011-02-10 12:21
Messages (2)
msg128285 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-02-10 12:21
If the script filename is not decodable from the filesystem encoding, Python fails with a UnicodeEncodeError when we reach the recursion limit. The problem doesn't come from the user script, but from Python internals. It is difficult to understand and the user expects a 

A longer explanation:

test_runpy fails with a pydebug build if the script filename is not encodable to UTF-8. In a pydebug build only, PyEval_EvalFrameEx() encodes the frame filename to UTF-8. If the filename contains a surrogate character (which only occurs on UNIX with an undecodable filename), the encoding function fails. PyEval_EvalFrameEx() ignores the error unless we hit the recursion limit (that is, unless the overflowed attribute of the thread state is set): in that case, the error is not ignored.
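The encoding failure described above can be reproduced in pure Python. This is only a sketch: it assumes a UTF-8 filesystem encoding and names the "utf-8" codec explicitly rather than calling os.fsdecode(), so that the result does not depend on the local locale. The helper strict_utf8_fails() is a hypothetical name introduced for illustration; it is not part of ceval.c or of the attached patch.

```python
# Bytes that are not valid UTF-8, as in the directory name used in this report.
raw_name = b"py3k\xe9\xff"

# Undecodable filesystem bytes are decoded with the "surrogateescape" error
# handler, which maps each invalid byte to a lone surrogate code point.
decoded = raw_name.decode("utf-8", "surrogateescape")


def strict_utf8_fails(text):
    """Return True if strict UTF-8 encoding of text raises UnicodeEncodeError."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


print(ascii(decoded))
print(strict_utf8_fails(decoded))

# The original bytes are only recovered with the same error handler.
print(decoded.encode("utf-8", "surrogateescape") == raw_name)
```

A strict UTF-8 encode rejects lone surrogates, which is the failure mode the debug-only filename variable runs into.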

To reproduce the problem, change the Python directory (your local repository) to an undecodable filename (eg. b'py3k\xe9\xff' with UTF-8 locale encoding) and run: ./python Lib/test/ test_py.

Solutions:

I propose to remove the filename variable from PyEval_EvalFrameEx() because it is only used by the old gdb macros: doesn't need it anymore.

=> see attached patch

Or if you really want to keep it, tstate->overflowed should be reinitialized. But on overflow, other variables are changed, like _Py_CheckRecursionLimit. I don't know this code enough to write a correct patch, but the minimal patch is:

--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -1234,7 +1234,7 @@ PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
         filename = _PyUnicode_AsString(co->co_filename);
         if (filename == NULL && tstate->overflowed) {
             /* maximum recursion depth exceeded */
-            goto exit_eval_frame;
+            tstate->overflowed = 0;
         PyErr_Restore(error_type, error_value, error_traceback);
msg128996 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-02-21 21:06
r88480 removes the filename variable: use or the faulthandler module to get a Python backtrace.
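For reference, a minimal sketch of getting a Python backtrace with the faulthandler module (in the standard library since Python 3.3). The helper dump_current_traceback() is a hypothetical name used only for this illustration, and writing to a temporary file is an assumption made to keep the example self-contained; by default faulthandler writes to sys.stderr.

```python
import faulthandler
import tempfile


def dump_current_traceback():
    """Dump the current thread's Python traceback and return it as text."""
    # faulthandler writes straight to a file descriptor, so it needs a real file.
    with tempfile.TemporaryFile() as f:
        faulthandler.dump_traceback(file=f, all_threads=False)
        f.seek(0)
        return f.read().decode("utf-8", "replace")


text = dump_current_traceback()
print(text)
```

To get a traceback on a crash instead, faulthandler.enable() can be called once at startup.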
Date User Action Args
2011-02-21 21:07:21 haypo set status: open -> closed
resolution: fixed
versions: - Python 3.2
2011-02-21 21:06:54 haypo set messages: + msg128996
2011-02-10 12:21:42 haypo create