Author vstinner
Recipients amaury.forgeotdarc, ezio.melotti, lemburg, loewis, vstinner
Date 2010-07-08.22:04:41
Message-id <1278626687.5.0.550059895432.issue9198@psf.upfronthosting.co.za>
In-reply-to
Content
amaury> Should repr() print unicode characters outside the BMP?

Yes. I don't understand why characters outside the BMP should be treated differently from other characters. Is it a workaround for bogus operating systems? My Linux terminal (Konsole on KDE) is able to display Ugaritic characters (for example U+10383).
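To make the point concrete, here is a small sketch of what repr() does with such a character. It assumes Python 3.3 or later, where str has a single internal representation regardless of build; the variable names are mine, and a narrow build of the 3.1/3.2 era may behave differently.

```python
# U+10383 lies outside the BMP (code point above U+FFFF).
ugaritic = "\U00010383"

is_outside_bmp = ord(ugaritic) > 0xFFFF
is_printable = ugaritic.isprintable()

# repr() keeps printable characters as-is; ascii() always escapes non-ASCII.
shown_by_repr = repr(ugaritic)
shown_by_ascii = ascii(ugaritic)

# In UTF-16 the character needs a surrogate pair (two 16-bit code units).
utf16_units = len(ugaritic.encode("utf-16-le")) // 2

print(is_outside_bmp, is_printable, shown_by_ascii, utf16_units)
```

ascii() remains available when an escaped, pure ASCII form is wanted no matter what the terminal can display.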

amaury> it may be better to define "printable" based on the
amaury> sys.stdout/sys.stderr encoding

You cannot do that: the stdout and stderr encodings might be different, their error handlers are different, and repr() output is not always written to stdout or stderr. If you write repr() output to a file, you have to know the encoding of the file. How can I get the encoding? If sys.stdout is replaced by an io.StringIO() object (e.g. doctests), you don't have any "encoding" (StringIO only manipulates unicode objects, not bytes objects).
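A rough illustration of why there is no single encoding repr() could consult. The helper describe_stream is hypothetical, written only for this example, and the values printed for sys.stdout and sys.stderr depend on the platform and on how the interpreter was started.

```python
import io
import sys

def describe_stream(stream):
    """Return (encoding, errors) for a text stream, None when absent."""
    return getattr(stream, "encoding", None), getattr(stream, "errors", None)

# An in-memory text stream only stores str objects: nothing is encoded.
memory_info = describe_stream(io.StringIO())

# A text layer over bytes carries its own encoding and error handler.
wrapped = io.TextIOWrapper(io.BytesIO(), encoding="ascii", errors="strict")
wrapped_info = describe_stream(wrapped)

print("StringIO:     ", memory_info)
print("TextIOWrapper:", wrapped_info)
print("stdout:       ", describe_stream(sys.stdout))
print("stderr:       ", describe_stream(sys.stderr))
```

Each destination answers differently, and the StringIO case gives no answer at all.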

ezio> I want to change only the behavior of the interactive interpreter

This idea was rejected by the PEP.

I agree that it "may add confusion of the kind 'it works in interactive mode but not when redirecting to a file'".

I have already run into such a problem: the interactive interpreter adds '' to sys.path, and so import behaves differently in the interpreter than in a script. It's annoying because it took me hours to understand why it was different.

ezio> and only when the string sent to stdout is not encodable

Which means setting sys.stdout.errors to something other than strict. I prefer to detect unicode problems earlier.

E.g. if you set errors to 'replace', write will never fail. If the output is used as input for another program (UNIX pipe), you will send "?" to the reader process. I'm not sure that it is the expected behaviour. And if stdout is not a TTY, stdout uses ASCII encoding (which will raise a unicode error at the first non-ASCII character!).
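A minimal sketch of the difference between the two error handlers, using an in-memory byte buffer in place of a real pipe. The helper write_text is hypothetical and exists only for this example.

```python
import io

def write_text(text, encoding, errors):
    """Write text through a TextIOWrapper and return the bytes produced."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the buffer open when the wrapper goes away
    return data

# 'replace' never fails: the reader silently receives "?" instead of U+00E9.
replaced = write_text("caf\u00e9", "ascii", "replace")

# 'strict' reports the problem at the first non-ASCII character.
try:
    write_text("caf\u00e9", "ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(replaced, strict_failed)
```

With 'replace' the data loss only shows up in the consumer of the output, which is why I prefer the early failure.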
History
Date User Action Args
2010-07-08 22:04:47 vstinner set recipients: + vstinner, lemburg, loewis, amaury.forgeotdarc, ezio.melotti
2010-07-08 22:04:47 vstinner set messageid: <1278626687.5.0.550059895432.issue9198@psf.upfronthosting.co.za>
2010-07-08 22:04:42 vstinner link issue9198 messages
2010-07-08 22:04:41 vstinner create