Message 147005 - Python tracker


Author	stefanholek
Recipients	ezio.melotti, stefanholek
Date	2011-11-04.14:37:13
Message-id	<1320417434.92.0.472252310596.issue13342@psf.upfronthosting.co.za>

Content
The input builtin always uses "strict" error handling for Unicode conversions. This means that when I enter a latin-1 string in a utf-8 environment, input breaks with a UnicodeDecodeError. Now don't tell me not to do that, I have a valid use-case. ;-)
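
To illustrate the failure described above, here is a minimal sketch of what happens when Latin-1 bytes are decoded as UTF-8 under different error handlers. The helper name `try_decode` and the sample word are assumptions made for this example; only `bytes.decode` and its standard `errors` values come from Python itself.

```python
# Bytes as a Latin-1 terminal would send them for "café".
raw = "café".encode("latin-1")  # the final byte is 0xE9, not valid UTF-8 here


def try_decode(data, errors="strict"):
    """Decode data as UTF-8; return the text, or the exception on failure."""
    try:
        return data.decode("utf-8", errors)
    except UnicodeDecodeError as exc:
        return exc


# "strict" fails, which is the behaviour the report is about.
strict_result = try_decode(raw)

# "surrogateescape" smuggles the undecodable byte through as a lone surrogate,
# so the original bytes can be recovered later.
escaped = try_decode(raw, "surrogateescape")
round_tripped = escaped.encode("utf-8", "surrogateescape")

# "replace" substitutes U+FFFD and loses the original byte.
replaced = try_decode(raw, "replace")
```

With `surrogateescape` the input survives a round trip back to bytes, which is probably the most useful handler when the real encoding is unknown.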

While "strict" may be a good default choice, it is clearly not sufficient. I would like to propose an optional 'errors' argument to input, similar to the 'errors' argument the decode and encode methods have.
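
As a rough sketch of what such an argument could look like, the pure-Python function below reads a raw line and decodes it with a caller-chosen error handler. It is not the linked C implementation and does not use readline; the name `input_with_errors` and the `stream` and `encoding` parameters are hypothetical and exist only for this example.

```python
import io
import sys


def input_with_errors(prompt="", errors="strict", stream=None, encoding="utf-8"):
    """Read one line of bytes and decode it with the given error handler.

    Hypothetical helper: 'stream' defaults to the binary stdin buffer and
    'encoding' to UTF-8; neither parameter is part of the real input().
    """
    if stream is None:
        stream = sys.stdin.buffer
    if prompt:
        sys.stdout.write(prompt)
        sys.stdout.flush()
    line = stream.readline()
    if not line:
        # Mirror input(): end of file with no data raises EOFError.
        raise EOFError("EOF when reading a line")
    if line.endswith(b"\n"):
        line = line[:-1]  # drop the trailing newline, as input() does
    return line.decode(encoding, errors)


# Simulate a user typing Latin-1 "café" into a UTF-8 environment.
fake_stdin = io.BytesIO("café\n".encode("latin-1"))
text = input_with_errors(errors="surrogateescape", stream=fake_stdin)
```

Keeping "strict" as the default means existing callers would see no change in behaviour; only code that opts in gets the more lenient decoding.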

I have in fact implemented such an input method for my own use:
https://github.com/stefanholek/rl/blob/surrogate-input/rl/input.c

While this solves my immediate needs, the fact that my implementation is basically just a copy of bltinmodule.input with one additional argument makes me think that this could be fixed in Python proper.

There cannot be a reason input() should be confined to "strict", or can there? ;-)

History
Date	User	Action	Args
2011-11-04 14:37:14	stefanholek	set	recipients: + stefanholek, ezio.melotti
2011-11-04 14:37:14	stefanholek	set	messageid: <1320417434.92.0.472252310596.issue13342@psf.upfronthosting.co.za>
2011-11-04 14:37:14	stefanholek	link	issue13342 messages
2011-11-04 14:37:13	stefanholek	create