This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author stefanholek
Recipients ezio.melotti, stefanholek
Date 2011-11-04.14:37:13
Message-id <1320417434.92.0.472252310596.issue13342@psf.upfronthosting.co.za>
In-reply-to
Content
The input() builtin always uses "strict" error handling when decoding what it reads. This means that when I enter a Latin-1 string in a UTF-8 environment, input() breaks with a UnicodeDecodeError. Now don't tell me not to do that; I have a valid use-case. ;-)
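
To make the failure concrete, here is a minimal sketch of what happens to the bytes themselves. It does not call input(); it only decodes the Latin-1 bytes for "café" as UTF-8 under three standard error handlers. The helper name try_decode is made up for this illustration.

```python
# Bytes that a Latin-1 terminal would send for "café".
raw = "café".encode("latin-1")  # b'caf\xe9'


def try_decode(data, errors):
    """Decode data as UTF-8 with the given error handler.

    Returns the decoded text, or the exception class name on failure.
    (Illustrative helper, not part of any library.)
    """
    try:
        return data.decode("utf-8", errors)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


strict_result = try_decode(raw, "strict")        # decoding fails
replace_result = try_decode(raw, "replace")      # bad byte becomes U+FFFD
escaped = try_decode(raw, "surrogateescape")     # bad byte becomes a lone surrogate

# surrogateescape is lossless: encoding again restores the original bytes.
round_trip = escaped.encode("utf-8", "surrogateescape")
```

With "strict" the stray 0xE9 byte cannot be decoded at all, while "surrogateescape" keeps it recoverable, which is presumably why a caller might want to choose the handler.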

While "strict" may be a good default choice, it is clearly not sufficient. I would like to propose an optional 'errors' argument to input, similar to the 'errors' argument the decode and encode methods have.
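
As a rough illustration of the proposed interface, the following pure-Python sketch reads one line of raw bytes and decodes it with a caller-supplied handler. The name input_with_errors, the stream and encoding parameters, and the decision to bypass readline are all assumptions made for this example; it is not the linked C implementation and not an existing Python API.

```python
import io
import sys


def input_with_errors(prompt="", errors="strict", stream=None, encoding="utf-8"):
    """Hypothetical input() variant that accepts an 'errors' argument.

    Reads one line of bytes from `stream` (default: sys.stdin.buffer)
    and decodes it with the requested error handler.
    """
    if stream is None:
        stream = sys.stdin.buffer
    if prompt:
        sys.stdout.write(prompt)
        sys.stdout.flush()
    line = stream.readline()
    if not line:
        # Mirror input(): no data at all means end of file.
        raise EOFError("EOF when reading a line")
    # Strip a single trailing newline, as input() does.
    if line.endswith(b"\n"):
        line = line[:-1]
    return line.decode(encoding, errors)


# Simulate a Latin-1 line arriving in a UTF-8 environment.
text = input_with_errors(stream=io.BytesIO(b"caf\xe9\n"), errors="surrogateescape")
```

The default stays "strict", so existing callers would see no change in behaviour; only callers that opt in get a different handler.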

I have in fact implemented such an input function for my own use:
https://github.com/stefanholek/rl/blob/surrogate-input/rl/input.c

While this solves my immediate needs, the fact that my implementation is basically just a copy of bltinmodule.input with one additional argument makes me think that this could be fixed in Python proper.

There cannot be a reason input() should be confined to "strict", or can there? ;-)
History
Date                 User         Action  Args
2011-11-04 14:37:14  stefanholek  set     recipients: + stefanholek, ezio.melotti
2011-11-04 14:37:14  stefanholek  set     messageid: <1320417434.92.0.472252310596.issue13342@psf.upfronthosting.co.za>
2011-11-04 14:37:14  stefanholek  link    issue13342 messages
2011-11-04 14:37:13  stefanholek  create