This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: test_doctest fails with iso-8859-15 locale
Type: behavior
Stage: needs patch
Components: Library (Lib), Tests
Versions: Python 3.1, Python 3.2
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: brunogola, loewis, pitrou, tim.peters, vstinner
Priority: normal
Keywords: patch

Created on 2010-11-21 17:24 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
bdb.patch vstinner, 2011-01-06 00:23
Messages (7)
msg121954 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-11-21 17:24
$ LANG=ISO-8859-15 ./python -m test.regrtest test_doctest
[1/1] test_doctest
**********************************************************************
File "/home/antoine/py3k/__svn__/Lib/test/test_doctest.py", line 1676, in test.test_doctest.test_debug
Failed example:
    try: doctest.debug_src(s)
    finally: sys.stdin = real_stdin
Expected:
    > <string>(1)<module>()
    (Pdb) next
    12
    --Return--
    > <string>(1)<module>()->None
    (Pdb) print(x)
    12
    (Pdb) continue
Got:
    > /home/antoine/py3k/__svn__/Lib/encodings/iso8859_15.py(15)decode()
    -> return codecs.charmap_decode(input,errors,decoding_table)
    (Pdb) next
    --Return--
    > /home/antoine/py3k/__svn__/Lib/encodings/iso8859_15.py(15)decode()->('<string>', 8)
    -> return codecs.charmap_decode(input,errors,decoding_table)
    (Pdb) print(x)
    *** NameError: name 'x' is not defined
    (Pdb) continue
    12
**********************************************************************
1 items had failures:
   1 of   4 in test.test_doctest.test_debug
***Test Failed*** 1 failures.
test test_doctest failed -- 1 of 429 doctests failed
1 test failed:
    test_doctest


Also visible on the following buildbot:
http://www.python.org/dev/buildbot/all/builders/x86%20debian%20parallel%203.x/builds/934/steps/test/logs/stdio
msg121957 - Author: Bruno Gola (brunogola) Date: 2010-11-21 18:26
Tests are OK for me.

Running on Ubuntu 10.04 64-bit with py3k from the svn repository.
msg123771 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-11 03:38
You can reproduce the bug with:

$ LANG=fr_FR.iso885915@euro ./python -c 'import pdb; pdb.Pdb(nosigint=True).run("exec(%r)" % "x=12")'
> /home/haypo/prog/SVN/py3k/Lib/encodings/iso8859_15.py(15)decode()
-> return codecs.charmap_decode(input,errors,decoding_table)
(Pdb) quit

(it should print "x=12" in the backtrace, not ...iso8859_15.py...)

Simplified C backtrace: builtin_exec() -> PyRun_StringFlags() -> PyAST_CompileEx() -> makecode() -> PyUnicode_DecodeFSDefault().

The ISO-8859-15 codec is implemented in Python, whereas ASCII, ISO-8859-1 and UTF-8 are implemented in C. Pdb stops at the first Python instruction. The user expects the first instruction to be "x=12", but the real first Python instruction is the call to the ISO-8859-15 codec to decode the byte string "<string>" (the script filename).

I see two solutions:
 - set the trace function later. Eg. replace exec(cmd, ...) by code=compile(cmd, ...) + exec(code) and set the trace function after the call to compile. I don't know if both codes are equivalent.
 - reimplement ISO-8859-15 in C: it doesn't solve the issue, because other encodings are still implemented in Python
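
The first option can be sketched roughly as follows. This is only an illustration that uses sys.settrace directly, not the bdb code; run_traced and tracer are hypothetical names, and the sketch only handles a source string. The string is compiled before the trace function is installed, so nothing the compiler does can show up as the first traced instruction.

```python
import sys


def run_traced(cmd):
    """Compile cmd first, then trace only the execution of the resulting code.

    Hypothetical helper for illustration; returns (events, namespace).
    """
    events = []
    namespace = {}

    def tracer(frame, event, arg):
        # Record every trace event with the file it belongs to.
        events.append((event, frame.f_code.co_filename, frame.f_lineno))
        return tracer

    # Compilation happens while no trace function is installed.
    code = compile(cmd, "<string>", "exec")

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, namespace)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return events, namespace


if __name__ == "__main__":
    events, namespace = run_traced("x = 12")
    for event in events:
        print(event)
    print(namespace["x"])
```

With this ordering, the first event the tracer sees belongs to "<string>", which is what a Pdb user would expect.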
msg123774 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-11 04:00
See also #3080 for a more complex solution (don't decode the filename in the parser, keep Unicode strings).
msg125487 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-01-06 00:23
> set the trace function later. Eg. replace exec(cmd, ...)
> by code=compile(cmd, ...) + exec(code) and set the trace function 
> after the call to compile.

Implemented in the attached patch, bdb.patch: trace the execution of the code, not the compilation of the code.

> I don't know if both codes are equivalent.

I still don't know :-p
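
A quick check of the simple case, for what it is worth. This does not prove that the two forms are equivalent in general; it only compares the resulting namespace and the code object's filename for one small snippet. SRC, run_direct, run_precompiled and visible are hypothetical names.

```python
# Source that records a variable and the filename of the code being executed.
SRC = "import sys\nx = 12\nfilename = sys._getframe().f_code.co_filename\n"


def run_direct(src):
    """exec() the source string directly."""
    namespace = {}
    exec(src, namespace)
    return namespace


def run_precompiled(src):
    """compile() the source first, then exec() the code object."""
    namespace = {}
    code = compile(src, "<string>", "exec")
    exec(code, namespace)
    return namespace


def visible(namespace):
    """Drop entries that are not interesting for the comparison."""
    return {k: v for k, v in namespace.items() if k not in ("__builtins__", "sys")}


if __name__ == "__main__":
    print(visible(run_direct(SRC)))
    print(visible(run_precompiled(SRC)))
```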
msg125492 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-01-06 00:51
bdb.patch doesn't work if cmd is not a string (if cmd is a code object).

r87780 fixes this issue: bdb.Bdb.run() only traces the execution of the code, not the compilation (if the input is a string).

With this fix, the whole test suite passes on Linux with ISO-8859-1, ISO-8859-15 and UTF-8 locale encodings (I only tested with an ASCII path).
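
The resulting behavior can be observed with a small bdb.Bdb subclass. This is a sketch, assuming a current CPython where the fix is in place; RecordingBdb and first_stop are hypothetical names, and only bdb.Bdb, its run(), user_line() and set_continue() methods come from the standard library. It records where the debugger first stops, for both a source string and a code object.

```python
import bdb


class RecordingBdb(bdb.Bdb):
    """Hypothetical helper: record where the debugger stops, then continue."""

    def __init__(self):
        super().__init__()
        self.stops = []

    def user_line(self, frame):
        # Called by bdb when it stops on a line: note the location and resume.
        self.stops.append((frame.f_code.co_filename, frame.f_lineno))
        self.set_continue()


def first_stop(cmd):
    """Run cmd (source string or code object) under bdb.

    Returns the first recorded stop and the namespace the code ran in.
    """
    debugger = RecordingBdb()
    namespace = {}
    debugger.run(cmd, namespace)
    return debugger.stops[0], namespace


if __name__ == "__main__":
    print(first_stop("x = 12")[0])
    print(first_stop(compile("x = 12", "<string>", "exec"))[0])
```

In both cases the first stop should be reported in "<string>" rather than in a codec module.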
msg125496 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-01-06 01:11
"x86 debian parallel 3.x" buildbot is green again! :-)
History
Date User Action Args
2022-04-11 14:57:09 admin set github: 54701
2011-01-06 01:11:43 vstinner set nosy: tim.peters, loewis, pitrou, vstinner, brunogola; messages: + msg125496
2011-01-06 00:52:53 vstinner set status: open -> closed; nosy: tim.peters, loewis, pitrou, vstinner, brunogola; resolution: fixed
2011-01-06 00:51:20 vstinner set nosy: tim.peters, loewis, pitrou, vstinner, brunogola; messages: + msg125492
2011-01-06 00:23:28 vstinner set files: + bdb.patch; messages: + msg125487; keywords: + patch; nosy: tim.peters, loewis, pitrou, vstinner, brunogola
2010-12-11 04:00:04 vstinner set messages: + msg123774
2010-12-11 03:38:49 vstinner set messages: + msg123771
2010-11-21 18:26:59 brunogola set nosy: + brunogola; messages: + msg121957
2010-11-21 17:24:02 pitrou create