classification
Title: raw_input() displays wrong unicode prompt
Type:
Stage:
Components: Interpreter Core
Versions: Python 2.4
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: georg.brandl, prikryl
Priority: high
Keywords:

Created on 2005-01-10 10:33 by prikryl, last changed 2006-06-14 16:49 by georg.brandl. This issue is now closed.

Files
File name Uploaded Description
pythonBug20050110.zip prikryl, 2005-01-10 10:33
Messages (8)
msg23912 - Author: Petr Prikryl (prikryl) Date: 2005-01-10 10:33
I have observed a problem when running
Python 2.4, Windows version (python-2.4.msi),
and using raw_input() with a unicode prompt
string in a console program (run in a DOS window).

I use the following sitecustomize.py file to set
the default encoding on an English Windows 2000 Server:

sitecustomize.py
=================================
import sys
sys.setdefaultencoding('cp1250')
=================================


test.py
=================================
# -*- coding: cp1250 -*-
s = u'string with accented letters (different than this)'
print s                    # OK
val = raw_input(s)    # s displayed differently (wrong)
=================================

See the test.png
(captured from screen) and the test.py for the
used string -- inside the attached zip file. 

The "type test.py" command (result visible in the
captured screen) also displays the string
definition wrongly, because the DOS window
uses a different encoding than cp1250. The print
statement prints the string correctly, converting
the internal unicode string to the encoding that
is defined by the output environment. However,
raw_input() probably converts the unicode
string to cp1250 and does not do the same
(more clever) thing that print does.

I did not use unicode in the older Python (2.3.4),
so I do not know what the behaviour was earlier.

Could you confirm the bug? Sorry if the bug
is well known.

Petr
msg23913 - Author: Petr Prikryl (prikryl) Date: 2005-04-14 14:26
New observation: sys.stdout.write(s) behaves visually on the 
screen exactly as the raw_input(s) does. So, print does 
something more when displaying on the screen...

Petr
msg23914 - Author: Petr Prikryl (prikryl) Date: 2005-04-14 14:34
Python 2.4.1 for Windows behaves the same way.

Petr
msg23915 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-06-26 20:34
Actually, your sys.stdout.encoding is set to something
different than cp1250, which is why the result of DOS type
looks the same as the one of print.

This is because print observes sys.stdout.encoding, while
sys.stdout.write uses the system default encoding, which is,
as you set it, cp1250 and is displayed wrong on the console.

Closing this bug, as it is currently expected behaviour (but
will perhaps change when patch #1214889 is accepted).
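
The effect described in this message can be reproduced without a console. The following is a minimal sketch, not taken from the attached test.py: the sample word is an assumption chosen for illustration, and cp852 is assumed as the console encoding (the value reported for the DOS window later in this thread). It shows that the same unicode string yields different bytes under the two encodings, and that cp1250 bytes interpreted as cp852 no longer match the original text.

```python
# -*- coding: ascii -*-
# Sample text with accented letters, written with escapes so that the
# source file encoding does not matter (illustrative string only).
s = u'\u0159e\u0161en\xed'

# Bytes produced with the default encoding set in sitecustomize.py.
as_default = s.encode('cp1250')

# Bytes the DOS console expects (assumed console code page: cp852).
as_console = s.encode('cp852')

# The two byte strings differ, so only one of them can look right
# in a window that interprets bytes as cp852.
print(repr(as_default))
print(repr(as_console))

# What a cp852 console makes of the cp1250 bytes: different characters.
shown = as_default.decode('cp852', 'replace')
print(repr(shown))
```

Under these assumptions, the path that encodes with sys.stdout.encoding corresponds to as_console, while the path that encodes with the default encoding corresponds to as_default.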
msg23916 - Author: Petr Prikryl (prikryl) Date: 2005-06-28 05:56
Should I understand that there is no bug, and that I am using it 
incorrectly? I cannot agree that this is expected behaviour. (I 
am not the only one who found this strange.) 

Of course, the sys.stdout.encoding is different for a DOS 
window (cp852) than the default encoding (cp1250). Windows 
simply behaves this way when working with DOS window 
(because of legacy DOS applications).

I am not complaining about the behaviour of sys.stdout.write() 
but about the behaviour of raw_input(). The raw_input() prompt 
should be displayed the same way as print displays its 
results to the user. raw_input() is used for building a user 
interface. Its prompt should not be displayed differently in 
windows that use different encodings (i.e. DOS console vs. 
say IDLE console).

In other words, how should I use raw_input() to make it 
work correctly?
msg23917 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-06-28 06:40
You'll have to explicitly encode the unicode string using
raw_input(s.encode(sys.stdout.encoding)).

As said, this behaviour will change if the patch mentioned
is accepted.
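
A minimal sketch of this workaround follows. The helper name encode_prompt, the 'ascii' fallback, and the 'replace' error handling are assumptions added for illustration; they are not part of the suggestion above. The fallback covers the case where sys.stdout.encoding is not set, for example when output is redirected.

```python
import sys


def encode_prompt(prompt, encoding=None):
    """Encode a unicode prompt to bytes for the console.

    Hypothetical helper: uses sys.stdout.encoding when no encoding is
    given, and falls back to 'ascii' if the stream reports none.
    """
    if encoding is None:
        encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'
    # 'replace' avoids UnicodeEncodeError for characters the console
    # code page cannot represent (an assumption, not required).
    return prompt.encode(encoding, 'replace')


# Python 2 usage, following the suggestion in this message:
#     val = raw_input(encode_prompt(s))
```

The encoding is looked up at call time rather than once at import, so the helper follows sys.stdout if it is replaced later.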
msg23918 - Author: Petr Prikryl (prikryl) Date: 2005-08-08 08:37
As the patch #1214889 that would have solved the problem 
on lower levels was rejected, the problem should be reopened 
and the raw_input() internals should be implemented similarly 
to print. 

Thanks, 
  Petr
msg23919 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-06-14 16:49
Per python-dev discussion, this is still expected behavior
(and a wart).
History
Date User Action Args
2005-01-10 10:33:24 prikryl create