Classification
Title: getpass.getpass() fails with non-ASCII characters in prompt
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.5, Python 3.4, Python 2.7

Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, kushal.das, loewis, python-dev, r.david.murray, serhiy.storchaka
Priority: normal
Keywords: easy, patch

Created on 2014-04-07 11:23 by Arfrever, last changed 2014-04-14 14:32 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
getpass_test_python2 Arfrever, 2014-04-07 11:24
getpass_test_python3 Arfrever, 2014-04-07 11:24
issue21169.patch kushal.das, 2014-04-08 22:20 Stops the breakage of non-ASCII chars in non-UTF-8 environment
issue21169_v2.patch kushal.das, 2014-04-08 22:46 Version 2 of the patch with a test update.
issue21169_v3.patch kushal.das, 2014-04-09 17:24 Updated patch with discovering of current locale and corresponding test case.
issue21169_v4.patch kushal.das, 2014-04-09 17:26 New patch with actual test case :)
issue21169_v5.patch kushal.das, 2014-04-14 00:58 Newer version of the patch with stream.encoding
issue21169_v6.patch kushal.das, 2014-04-14 01:33 New patchset.
issue21169_v7.patch kushal.das, 2014-04-14 13:49 Another patch with docs update and one line code comment.
Messages (19)
msg215697 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * Date: 2014-04-07 11:23
getpass.getpass() fails with non-ASCII characters in prompt.

The attached example scripts (for Python 2 and 3) contain a non-ASCII Unicode prompt (Polish "hasło" == English "password") written in UTF-8.
The Python 2 version always fails. The Python 3 version fails in a non-UTF-8 locale.

$ ./getpass_test_python2
Traceback (most recent call last):
  File "./getpass_test_python2", line 5, in <module>
    getpass.getpass(u"Hasło: ")
  File "/usr/lib64/python2.7/getpass.py", line 71, in unix_getpass
    passwd = _raw_input(prompt, stream, input=input)
  File "/usr/lib64/python2.7/getpass.py", line 128, in _raw_input
    prompt = str(prompt)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0142' in position 3: ordinal not in range(128)
$ LC_ALL="en_US.UTF-8" ./getpass_test_python3
Hasło: 
$ LC_ALL="C" ./getpass_test_python3
Traceback (most recent call last):
  File "./getpass_test_python3", line 5, in <module>
    getpass.getpass("Has\u0142o: ")
  File "/usr/lib64/python3.4/getpass.py", line 78, in unix_getpass
    passwd = _raw_input(prompt, stream, input=input)
  File "/usr/lib64/python3.4/getpass.py", line 138, in _raw_input
    stream.write(prompt)
UnicodeEncodeError: 'ascii' codec can't encode character '\u0142' in position 3: ordinal not in range(128)
$
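
The Python 3 failure above can be reproduced without changing the locale. The following is a minimal sketch that assumes an in-memory ASCII-encoded `io.TextIOWrapper` behaves like the stream used under `LC_ALL="C"`; the variable names are illustrative and not taken from the attached scripts.

```python
import io

# Stand-in for a terminal stream whose encoding is ASCII (assumption:
# this mirrors what the stream looks like in the C locale on Python 3.4).
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")

prompt = "Has\u0142o: "

try:
    ascii_stream.write(prompt)
    failed = False
    details = None
except UnicodeEncodeError as exc:
    failed = True
    # Record which codec failed, at which index, and on which character.
    details = (exc.encoding, exc.start, exc.object[exc.start])
```

The recorded details correspond to the codec name, position, and character that appear in the traceback message.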
msg215778 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-08 22:20
Here is a patch which stops the breakage in getpass for Python 3.
msg215779 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-08 22:46
Version 2 of the patch with a test update.
msg215828 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-09 17:24
Updated patch that discovers the current locale, with a corresponding test case.
msg215829 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-09 17:26
New patch with actual test case :)
msg216002 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-04-13 08:51
I don't think this is a bug. Any text output operation can fail when it outputs an unencodable string. You should use a stream with a proper encoding and/or error handler.
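
One way to read the suggestion about using a stream with an error handler is sketched below. It assumes the caller controls the stream; the helper name `tolerant_stream` and the choice of `backslashreplace` are illustrative assumptions, not something proposed in this issue.

```python
import io

def tolerant_stream(raw, encoding, errors="backslashreplace"):
    """Wrap a binary stream so unencodable characters are escaped
    instead of raising UnicodeEncodeError (hypothetical helper)."""
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                            write_through=True)

raw = io.BytesIO()
stream = tolerant_stream(raw, "ascii")
stream.write("Has\u0142o: ")
stream.flush()
written = raw.getvalue()
```

A stream built this way could in principle be passed through the `stream` argument of `getpass.getpass()`, although how that argument is used depends on the platform.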
msg216006 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-04-13 09:14
I agree that it is not a bug if the device where the prompt is shown simply does not support the characters; on Unix, this includes cases where the locale does not support the characters.

Arfrever: when you say that it fails in Python 3 in a non-UTF-8 locale, which specific locale did it fail in?
msg216009 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-04-13 10:10
$ LC_ALL=en_US.iso88591 ./python -c "print('\u20ac')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac' in position 0: ordinal not in range(256)

$ LC_ALL=en_US.iso88591 ./python -c "input('\u20ac')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac' in position 0: ordinal not in range(256)

$ LC_ALL=en_US.iso88591 ./python -c "import getpass; getpass.getpass('\u20ac')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/getpass.py", line 78, in unix_getpass
    passwd = _raw_input(prompt, stream, input=input)
  File "/home/serhiy/py/cpython/Lib/getpass.py", line 138, in _raw_input
    stream.write(prompt)
UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac' in position 0: ordinal not in range(256)
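
The three commands above fail for the same reason: the character has no representation in the stream's encoding. A small sketch of how one might check this in advance follows; `unencodable_chars` is a hypothetical helper and not part of getpass.

```python
def unencodable_chars(text, encoding):
    """Return the characters of text that the given codec cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad

euro_in_latin1 = unencodable_chars("\u20ac", "latin-1")
haslo_in_latin1 = unencodable_chars("Has\u0142o: ", "latin-1")
haslo_in_latin2 = unencodable_chars("Has\u0142o: ", "iso8859-2")
haslo_in_utf8 = unencodable_chars("Has\u0142o: ", "utf-8")
```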
msg216013 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * Date: 2014-04-13 10:42
Martin v. Löwis: In this case, the device supports non-ASCII characters, but Python's getpass module forgets to properly encode the string. Message 215697 contains an example with the C locale.
msg216041 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-14 00:58
Here is a new patch which uses stream.encoding instead of getting the encoding from the locale, as suggested by David. It also contains the new test.
msg216045 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-14 01:33
New patchset with an updated test, which now passes an ASCII stream into the call as an argument.
msg216050 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-04-14 01:58
Arfrever: If you set the locale to C, the device does *not* (anymore) support the character. The terminal application you are using may, but the "system" does not. It only supports the characters available in the locale, which your character is not. There simply is no way in which Python *could* encode the character.
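
To see which encodings Python has derived from the environment, something like the following sketch may help. The function name `report_encodings` is an assumption for illustration, and the values it returns depend entirely on the locale and on how the interpreter was started.

```python
import locale
import sys

def report_encodings():
    """Collect the encodings Python associates with the locale and the
    standard streams (hypothetical helper; results vary by environment)."""
    return {
        "locale": locale.getpreferredencoding(False),
        # The standard streams may be replaced or missing, so look up
        # the attribute defensively.
        "stderr": getattr(sys.stderr, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
    }

info = report_encodings()
```

Running this under different `LC_ALL` settings shows how the locale decides which characters the standard streams can encode.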
msg216051 - (view) Author: Roundup Robot (python-dev) Date: 2014-04-14 02:09
New changeset f430fdd1628e by R David Murray in branch '3.4':
#21169: fix getpass to use replace error handler on UnicodeEncodeError.
http://hg.python.org/cpython/rev/f430fdd1628e

New changeset 461f5863f2aa by R David Murray in branch 'default':
Merge #21169: fix getpass to use replace error handler on UnicodeEncodeError.
http://hg.python.org/cpython/rev/461f5863f2aa
msg216052 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-04-14 02:14
Since we don't want the prompting for the password to fail, what we do in the patch is use the replace error handler so that you get as much of the prompt as could be encoded. (Note: this approach was reviewed by both Toshio and Marc Andre.)

Thanks for the patch, Kushal.
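
The fallback described in this message can be sketched roughly as follows. This is an illustration under assumptions, not the committed changeset: the helper name `write_prompt` is hypothetical, and the stream is assumed to expose an `encoding` attribute, as `io.TextIOWrapper` does.

```python
import io

def write_prompt(stream, prompt):
    """Write prompt to stream, degrading unencodable characters
    rather than failing (hypothetical helper)."""
    try:
        stream.write(prompt)
    except UnicodeEncodeError:
        # Round-trip through the stream's own encoding with the 'replace'
        # handler, so each unencodable character becomes '?'.
        degraded = prompt.encode(stream.encoding, "replace")
        stream.write(degraded.decode(stream.encoding))
    stream.flush()

raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")
write_prompt(ascii_stream, "Has\u0142o: ")
shown = raw.getvalue()
```

With an ASCII stream, the user sees a degraded but still recognizable prompt and can go on to enter the password.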
msg216054 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-04-14 02:19
Ok. I wish the patch had a comment saying that, or, even better, a documentation change pointing out that feature.
msg216077 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-04-14 12:21
Ok, I'll reopen the issue to do that.
msg216078 - (view) Author: Kushal Das (kushal.das) * (Python committer) Date: 2014-04-14 13:49
Another patch with a docs update and a one-line code comment.
msg216081 - (view) Author: Roundup Robot (python-dev) Date: 2014-04-14 14:31
New changeset bdde36cd9048 by R David Murray in branch '3.4':
#21169: add comment and doc update for getpass change.
http://hg.python.org/cpython/rev/bdde36cd9048

New changeset fe532dccf8f6 by R David Murray in branch 'default':
Merge: #21169: add comment and doc update for getpass change.
http://hg.python.org/cpython/rev/fe532dccf8f6
msg216082 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-04-14 14:32
I decided to tweak the language slightly, Kushal.  If this isn't what you were looking for, Martin, let me know.
History
Date User Action Args
2014-04-14 14:32:58 r.david.murray set status: open -> closed; messages: + msg216082
2014-04-14 14:31:08 python-dev set messages: + msg216081
2014-04-14 13:49:16 kushal.das set files: + issue21169_v7.patch; messages: + msg216078
2014-04-14 12:21:04 r.david.murray set status: closed -> open; messages: + msg216077
2014-04-14 02:19:23 loewis set messages: + msg216054
2014-04-14 02:14:14 r.david.murray set status: open -> closed; type: behavior; resolution: fixed; messages: + msg216052
2014-04-14 02:09:45 python-dev set nosy: + python-dev; messages: + msg216051
2014-04-14 01:58:41 loewis set messages: + msg216050
2014-04-14 01:33:15 kushal.das set files: + issue21169_v6.patch; messages: + msg216045
2014-04-14 00:58:37 kushal.das set files: + issue21169_v5.patch; messages: + msg216041
2014-04-13 10:42:24 Arfrever set resolution: not a bug -> (no value); messages: + msg216013
2014-04-13 10:10:10 serhiy.storchaka set messages: + msg216009
2014-04-13 09:14:28 loewis set status: pending -> open; nosy: + loewis; messages: + msg216006
2014-04-13 08:51:20 serhiy.storchaka set status: open -> pending; nosy: + serhiy.storchaka; messages: + msg216002; resolution: not a bug; stage: needs patch -> resolved
2014-04-12 21:09:55 r.david.murray set nosy: + r.david.murray
2014-04-09 17:26:44 kushal.das set files: + issue21169_v4.patch; messages: + msg215829
2014-04-09 17:24:16 kushal.das set files: + issue21169_v3.patch; messages: + msg215828
2014-04-08 22:46:53 kushal.das set files: + issue21169_v2.patch; messages: + msg215779
2014-04-08 22:20:41 kushal.das set files: + issue21169.patch; nosy: + kushal.das; messages: + msg215778; keywords: + patch
2014-04-07 11:24:18 Arfrever set files: + getpass_test_python3
2014-04-07 11:24:09 Arfrever set files: + getpass_test_python2
2014-04-07 11:23:55 Arfrever create