classification
Title: Allow more Unicode on sys.stdout
Type:
Stage:
Components: Interpreter Core
Versions:
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: loewis
Nosy List: lemburg, loewis, nobody
Priority: normal
Keywords: patch

Created on 2002-09-21 20:32 by loewis, last changed 2003-05-10 07:11 by loewis. This issue is now closed.

Files
File name Uploaded Description
stdout.txt loewis, 2002-09-21 20:32
stdout2.txt loewis, 2002-10-26 17:47
stdout3.txt loewis, 2003-03-29 14:40
Messages (11)
msg41194 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-09-21 20:32
This patch extends the set of Unicode strings that can
be printed to sys.stdout, to support all strings that
the terminal will likely support. It also adds an
encoding attribute to sys.std{in,out}.

To do that:
- it adds a .encoding attribute to all file objects,
which is normally None
- initializes the encoding of sys.stdin and sys.stdout
if either is a terminal.
- adds a wrapper object around sys.stdout in site.py
that encodes all Unicode objects according to the
detected encoding, if that encoding is known to Python
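
A minimal sketch of the wrapper idea from the last item, written in present-day Python 3 rather than the 2.3-era code of the patch. The class name EncodingStdoutWrapper and its constructor arguments are hypothetical and do not come from stdout.txt; the ASCII fallback is an assumption standing in for the interpreter's default encoding of the time.

```python
import codecs
import io


class EncodingStdoutWrapper:
    """Hypothetical stand-in for the site.py wrapper; not the patch's code."""

    def __init__(self, raw, encoding=None):
        self._raw = raw          # underlying byte stream
        self.encoding = None     # stays None unless Python knows the codec
        if encoding is not None:
            try:
                codecs.lookup(encoding)
                self.encoding = encoding
            except LookupError:
                pass             # unknown codec: behave like a plain file

    def write(self, text):
        # Encode with the detected encoding when it is known; otherwise
        # fall back to ASCII (assumed default), which rejects other text.
        self._raw.write(text.encode(self.encoding or "ascii"))

    def __getattr__(self, name):
        # Delegate everything else (flush, fileno, ...) to the real stream.
        return getattr(self._raw, name)


buf = io.BytesIO()
out = EncodingStdoutWrapper(buf, "utf-8")
out.write("L\u00f6wis\n")
```

With an encoding that Python does not know, the wrapper keeps .encoding as None and non-ASCII text still raises UnicodeEncodeError, which is the behavior the patch set out to relax for terminals.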

To find the encoding of the terminal, it
- uses GetConsoleCP and GetConsoleOutputCP on Windows,
- uses nl_langinfo(CODESET) on Unix, if available.
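
The detection steps above can be approximated from Python itself. This is only a sketch under assumptions: the function name detect_terminal_encoding is hypothetical, it covers the output side only (GetConsoleOutputCP, not GetConsoleCP), and it maps a Windows code page number to a "cpNNN" codec name, which may not match what the patch does in C.

```python
import codecs
import locale
import sys


def detect_terminal_encoding(stream):
    """Return a codec name for a terminal stream, or None if unknown."""
    isatty = getattr(stream, "isatty", None)
    if isatty is None or not isatty():
        return None                      # only terminals get an encoding
    name = None
    if sys.platform == "win32":
        import ctypes
        codepage = ctypes.windll.kernel32.GetConsoleOutputCP()
        if codepage:                     # 0 means no console is attached
            name = "cp%d" % codepage     # assumed naming convention
    elif hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        # Reflects the current LC_CTYPE locale; the caller is assumed to
        # have called locale.setlocale() if user settings should apply.
        name = locale.nl_langinfo(locale.CODESET)
    if not name:
        return None
    try:
        codecs.lookup(name)              # keep only codecs Python knows
    except LookupError:
        return None
    return name


encoding = detect_terminal_encoding(sys.stdout)
```

When output is redirected to a file or pipe, the function returns None, matching the description that the encoding is only initialized when the stream is a terminal.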

The primary rationale for this change is that people
should be able to print Unicode in an interactive
session. A parallel change needs to be added for IDLE,
so that it adds the .encoding attribute to the emulated
stdout (it already supports printing of Unicode on stdout).
msg41195 - Author: Nobody/Anonymous (nobody) Date: 2002-09-24 08:10
I like the .encoding concept. 

I don't really like the sys.stdout wrapper. Wouldn't it be
better to add the functionality to the file object .write() and
.writelines() methods and then only use the wrapper in case
sys.stdout is not a true file object?
msg41196 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-09-24 09:02
I have considered implementing it in the file object.
However, it becomes quite involved and needs heavy C code:
PyFile_WriteObject calls PyObject_Print. Since Unicode does
not implement a tp_print, this calls str/repr, which
converts using the default encoding.

It is not clear at which point the file encoding should be
taken into account.
msg41197 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2002-10-25 12:09
I think it could work by adding a special case to 
PyFile_WriteObject() instead of calling PyObject_Print().
You first encode the Unicode object and then let
PyFile_WriteString() take care of the writing to the
FILE* object.

I see no other way, since you can't place the .encoding 
information into the FILE* object.
msg41198 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-10-26 17:47
I've attached a revised version which implements your
proposal; this version works without modification of site.py.

In its current form, the file encoding is only applied in
print; for sys.stdout.write, it is ignored. For print, it is
applied independent of whether this is a script or
interactive mode.
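
A toy model of the behavior described here, in modern Python rather than the C of fileobject.c. The names ModelFile and write_object are hypothetical; write_object only imitates the role of PyFile_WriteObject with the proposed special case, and the ASCII default is an assumption standing in for the default encoding.

```python
import io


class ModelFile:
    """Toy model of a file object carrying the new .encoding attribute."""

    def __init__(self, raw, encoding=None):
        self.raw = raw
        self.encoding = encoding         # None for ordinary files

    def write(self, data):
        # Plain write() ignores .encoding: text uses the default codec
        # (assumed to be ASCII here).
        if isinstance(data, str):
            data = data.encode("ascii")
        self.raw.write(data)


def write_object(obj, f):
    """Model of the print path: special-case Unicode before writing."""
    if isinstance(obj, str) and f.encoding is not None:
        f.write(obj.encode(f.encoding))  # encode first, then write bytes
    else:
        f.write(str(obj))                # unchanged path for everything else


buf = io.BytesIO()
f = ModelFile(buf, "utf-8")
write_object("\u00fc", f)                # print path: encoded as UTF-8
```

Calling f.write("\u00fc") directly on the same object raises UnicodeEncodeError in this model, which mirrors the statement that the encoding is ignored for sys.stdout.write.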
msg41199 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-03-23 11:59
Is the patch now acceptable?
msg41200 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2003-03-28 08:44
Looks ok except for the direct hacking
of f_encoding in the sys module. Please add
either a macro or a new API to make changing
the encoding from C possible without tapping
directly into the implementation.
msg41201 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-03-29 14:40
In stdout3.txt, PyFile_SetEncoding has been added, wrapping
the creation and assignment of the string object f_encoding.
msg41202 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-09 09:08
Any chance that this can go into 2.3b2?
msg41203 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2003-05-09 09:14
Sorry for not getting back to you on this earlier.

stdout3.txt looks OK. Please check it in.

Thanks!
msg41204 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-05-10 07:11
Committed as 

concrete.tex 1.25
libstdtypes.tex 1.124
fileobject.h 2.32
fileobject.c 2.178
sysmodule.c 2.119
NEWS 1.763
History
Date User Action Args
2002-09-21 20:32:23 loewis create