Message 31339 - Python tracker


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author	sgala
Date	2007-02-25.11:10:53

Content

I know that Python is quirky with respect to Unicode processing, but this defies everything I know.

I use the es_ES.UTF-8 locale on Linux. The script:

python -c "print unicode('á %s' % 'éí','utf8') "

works, i.e., it prints á éí on the next line.

However, if I pipe the output to less or redirect it to a file, it fails:

python -c "print unicode('á %s' % 'éí','utf8') " >test
Traceback (most recent call last):
  File "<string>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 0: ordinal not in range(128)


Why is the behaviour different when stdout is redirected? How can I get it to do "the right thing" in both cases?
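
A likely explanation, offered as a sketch rather than a confirmed diagnosis: in Python 2, print encodes unicode objects using sys.stdout.encoding, which is taken from the locale when stdout is a terminal but is left unset when stdout is a pipe or a file, so the ASCII default codec is used instead. The example below reproduces the same UnicodeEncodeError by encoding with ASCII explicitly, then shows one way around it: wrapping a byte stream in a UTF-8 writer from the codecs module. The io.BytesIO buffer is a stand-in for a redirected stdout and is an assumption made for illustration.

```python
# -*- coding: utf-8 -*-
import codecs
import io

# The same text the one-liner builds, as a unicode string.
text = u"á %s" % u"éí"

# Encoding with ASCII fails the same way the redirected print does.
error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = str(exc)

# Stand-in for a redirected stdout: a plain byte stream with no encoding.
buffer = io.BytesIO()

# Wrap the byte stream so that every write is encoded as UTF-8,
# regardless of whether the real destination is a terminal or a file.
writer = codecs.getwriter("utf-8")(buffer)
writer.write(text + u"\n")

data = buffer.getvalue()
```

In a Python 2 script, the same wrapper is commonly applied to the real stream with sys.stdout = codecs.getwriter("utf-8")(sys.stdout), which makes the output encoding independent of redirection. This hard-codes UTF-8 instead of following the locale, so it is a design choice rather than a universal fix.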

History
Date	User	Action	Args
2007-08-23 14:52:06	admin	link	issue1668295 messages
2007-08-23 14:52:06	admin	create