classification
Title: interactive interpreter, source encoding
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: amaury.forgeotdarc, ezio.melotti, jmfauth, r.david.murray, terry.reedy
Priority: normal
Keywords:

Created on 2009-03-24 19:32 by jmfauth, last changed 2010-08-27 21:01 by terry.reedy. This issue is now closed.

Messages (4)
msg84107 - Author: jmf (jmfauth) Date: 2009-03-24 19:31
A few hours ago I posted a comment on issue #4626 without noticing that
the issue was closed, so I am repeating it here. I'm interested in comments
because I have the feeling this is still an annoying open issue.

---

I'm glad to have discovered this topic. I bumped into something similar
when I toyed with an interactive interpreter.

from code import InteractiveInterpreter

ii = InteractiveInterpreter()
source = ...
ii.runsource(source)

What should the encoding and/or the type (str, bytes) of the "source"
string be? Taking into account the encoding of the script that contains
this code, I have the feeling something always goes wrong: either a
non-ASCII character in the source (encoded in UTF-8!) causes trouble, or
the interactive interpreter does not properly accept a byte string
holding UTF-8 encoded text.

IDLE does not suffer from this. Its interactive interpreter somehow
receives a "UCS-2 ready" string from tkinter.

I'm a little bit confused here (win2k, winXP sp2, Python 3.0.1).
msg109792 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) (Python committer) Date: 2010-07-09 20:12
I fail to see the issue. runsource() takes a (unicode) string because a Python script is text; you cannot pass a bytes object, it must be decoded first.
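
For illustration, a minimal sketch of the "decode first" advice above. It assumes the source arrives as UTF-8 encoded bytes (for example read from a file or a socket); the names `raw`, `source`, and `more`, and the sample statement itself, are hypothetical and not taken from the original report.

```python
from code import InteractiveInterpreter

# Assumption: the source arrives as UTF-8 encoded bytes.
raw = "name = 'caf\u00e9'\n".encode("utf-8")

# Decode to str before handing it to the interpreter.
source = raw.decode("utf-8")

ii = InteractiveInterpreter()

# runsource() returns True only when more input is needed to complete
# the statement; a complete statement is executed and False is returned.
more = ii.runsource(source)

# The executed statement stored its result in the interpreter's namespace.
print(more, ii.locals["name"])
```

The encoding of the script that contains this code does not matter here: once the bytes are decoded, `source` is ordinary text regardless of how it was stored.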
msg115020 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2010-08-26 18:40
Agreed. Although the docs do not explicitly say "you cannot use bytes as source", this is clearly implicit in the Python 3 bytes/string separation. The docs talk only about string inputs.
msg115132 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2010-08-27 21:01
Additional note: Reference Manual, section 2, Lexical analysis:
"Python reads program text as Unicode code points;"

The doc for runsource says "Compile and run some source in the interpreter. Arguments are the same as for compile_command()". The latter says "source is the source string".
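
As a hedged sketch of the compile_command() behavior that runsource() is documented to share, the following passes plain str sources to `codeop.compile_command`. The sample statements and the names `complete`, `incomplete`, and `namespace` are illustrative assumptions.

```python
from codeop import compile_command

# A complete statement, given as a str, compiles to a code object.
complete = compile_command("x = 1 + 1\n")

# An incomplete statement yields None, signalling that more input is needed.
incomplete = compile_command("if True:")

# The code object can be executed in a namespace of the caller's choosing.
namespace = {}
exec(complete, namespace)
print(namespace["x"], incomplete)
```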
History
Date                 User                Action  Args
2010-08-27 21:01:03  terry.reedy         set     nosy: + terry.reedy
                                                 messages: + msg115132
2010-08-26 18:40:22  r.david.murray      set     status: open -> closed
                                                 nosy: + r.david.murray
                                                 messages: + msg115020
                                                 resolution: not a bug
                                                 stage: test needed -> resolved
2010-07-09 20:15:28  ezio.melotti        set     nosy: + ezio.melotti
2010-07-09 20:12:03  amaury.forgeotdarc  set     messages: + msg109792
2010-07-09 17:33:54  BreamoreBoy         set     nosy: + amaury.forgeotdarc
                                                 stage: test needed
                                                 versions: + Python 3.1, Python 2.7, Python 3.2, - Python 3.0
2009-03-24 19:32:00  jmfauth             create