Author georg.brandl
Date 2007-02-25.23:27:35
Content
> >>> sys.getfilesystemencoding()
> 'UTF-8'
>
> so python is really dumb if print does not know my filesystemencoding, but
> knows my terminal encoding.

the file system encoding is the encoding of file names, not of file content.

> I thought breaking the least surprising behaviour was not considered
> pythonic, and now you tell me that having a program running on console but
> issuing an exception when redirected is intended. I would prefer an
> exception in both cases. Or, even better, using
> sys.getfilesystemencoding(), or allowing me to set defaultencoding()

I agree that using the terminal encoding is perhaps a bit too DWIMish, but you
can always get consistent results if you *do not write Unicode strings anywhere*.

> Do you mean that I need to say print unicode(whatever).encode('utf8'),
> like:
> 
> >>> a = unicode('\xc3\xa1','utf8') # instead of 'á', easy to read and
> understand, even in files encoded as utf8. Assume this is a literal or
> input

No. You can directly put Unicode literals in your files, with u'...'.
For that to work, you need to tell Python the encoding your file has,
using the coding cookie (see the docs).
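
As an illustration of the coding cookie, here is a minimal sketch that assumes the source file is saved as UTF-8. The cookie has to appear on the first or second line of the file; the variable name is only an example.

```python
# -*- coding: utf-8 -*-
# The cookie above tells Python how the bytes of this source file are encoded,
# so the literal below can be typed directly instead of as escaped bytes.
a = u'á'

# The literal is a single Unicode character (U+00E1), not two UTF-8 bytes.
encoded = a.encode('utf-8')
```

Written this way, the literal stays readable in the source and no explicit unicode(..., 'utf8') call is needed to build it.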

> ...
> >>> print unicode(a).encode('utf8') # because a could be a number, or a
> different object
> 
> every time, instead of "a='á'; print a"

> Cool, I'm starting to really love it. Concise and pythonic

> Are you seriously meaning that there is no way to tell print to use a
> default encoding, and it will magically try to find it and fail for
> everything not being a terminal?

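This is not magic. "print" looks for an "encoding" attribute on the file
it is printing to. For sys.stdout this is the terminal encoding when stdout is
attached to a terminal; for other files, including a redirected stdout, it is
None, and the default encoding (ASCII) is used instead.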
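
The lookup can be imitated roughly as follows. This is only a sketch of the idea, not the interpreter's actual code: encode_for, FakeTerminal and FakeRedirect are hypothetical names, and the ASCII fallback stands in for the default encoding.

```python
def encode_for(stream, text, fallback="ascii"):
    """Hypothetical helper: pick an encoding roughly the way print does."""
    # Use the stream's "encoding" attribute if it is set, else the fallback.
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding)


class FakeTerminal(object):
    # Stands in for sys.stdout attached to a UTF-8 terminal.
    encoding = "utf-8"


class FakeRedirect(object):
    # Stands in for a redirected stdout or an ordinary file.
    encoding = None


terminal_bytes = encode_for(FakeTerminal(), u"\xe1")

try:
    encode_for(FakeRedirect(), u"\xe1")
    redirect_failed = False
except UnicodeEncodeError:
    # Non-ASCII text cannot be encoded with the ASCII fallback.
    redirect_failed = True
```

The same Unicode string encodes fine for the "terminal" and raises for the "redirect", which is the difference reported above.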

> Are you seriously telling me that this is not a bug? Even worse, that it
> is "intended behaviour". BTW, jython acts differently about this, in all
> the versions I tried.

It is *not* a bug. This was implemented as a simplification for terminal output.

> And with -S I am allowed to change the encoding, which is crippled in site
> for no known good reason. 

> python -S -c "import sys; sys.setdefaultencoding('utf8'); print
> unicode('\xc3\xa1','utf8')" >test
> (works, test contains an accented a as intended)

It is removed in site.py because setdefaultencoding() affects *every* implicit
conversion from unicode to string and from string to unicode, which can be very
confusing if you have to handle different encodings.


>>use Unicode everywhere inside the
>>program, and byte strings for input and output.

> Have you ever wondered that to use unicode everywhere inside the program,
> one needs to decode literals (or input) to unicode (the next sentence you
> complain about)?

Yes, you have to decode input (for files, you can do this automatically if you
use codecs.open() instead of the builtin open()). No, you don't have to decode
literals, since Unicode literals exist.
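
A small sketch of decoding on input with codecs.open(), assuming a UTF-8 encoded file; the temporary file and its name are made up for the example.

```python
import codecs
import os
import tempfile

# Create a sample file containing the UTF-8 bytes for U+00E1 and a newline.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(b"\xc3\xa1\n")

# codecs.open() decodes while reading, so the program only sees Unicode.
with codecs.open(path, "r", encoding="utf-8") as f:
    text = f.read()
```

After the read, text is a Unicode string and no explicit decode call is needed anywhere else in the program.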

> I follow this principle in my programming since about 6 years ago, so I'm
> not a novice. I'm playing by the rules:
> a) "decodes it to unicode" is the first step to get it into processing.
> This is just a test case, so processing is zero.
> b) I refuse to believe that the only way to ensure something to be printed
> right is wrapping every item into unicode(var).encode('utf8') [The
> redundant unicode call is because the var could be a number, or a different
> object]

No, that is of course not the only way. An alternative is to use an encoded file,
as the codecs module offers.

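For example, if you set

sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

you can print Unicode strings to stdout, and they will automatically be encoded
as utf-8, whether stdout is a terminal or redirected. This is clear and explicit.
(codecs.EncodedFile is not the right tool here: it recodes byte strings from one
encoding to another and does not accept non-ASCII Unicode strings.)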
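
A wrapper of this kind can be sketched with codecs.getwriter(). In the sketch below an in-memory byte buffer stands in for a redirected stdout, which is an assumption made so that the result can be inspected.

```python
import codecs
import io

# A byte buffer standing in for a redirected stdout (an assumption for the demo).
buf = io.BytesIO()

# Wrap the byte stream so that Unicode written to it is encoded as UTF-8.
writer = codecs.getwriter("utf-8")(buf)
writer.write(u"\xe1\n")

# The underlying stream received the UTF-8 bytes.
written = buf.getvalue()
```

The encoding is chosen once, where the stream is set up, rather than at every print call.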

> c) or making my code non portable by patching site.py to get a real
> encoding instead of ascii.

If you still cannot live without setdefaultencoding(), you can do reload(sys) to get
a sys module with this method.

Closing again.
History
Date User Action Args
2007-08-23 14:52:07 admin link issue1668295 messages
2007-08-23 14:52:07 admin create