
Author serhiy.storchaka
Recipients benspiller, docs@python, ezio.melotti, serhiy.storchaka, steven.daprano, terry.reedy
Date 2016-05-19.17:54:55
Message-id <1463680495.68.0.412131044276.issue26369@psf.upfronthosting.co.za>
Content
> btw If anyone can find the place in the code (sorry I tried and failed!) where str.encode('utf-8', error=X) is resulting in an implicit call to the equivalent of decode(defaultencoding, errors=strict) (as suggested by the exception message) I think it'll be easier to discuss the details of fixing.

There is no single place. Search for the lines "str = PyUnicode_FromObject(str);" in Modules/_codecsmodule.c.

> But that's not what happens - it *silently works* (is a no-op) as long as you happen to be using ASCII characters so this so-called 'programming bug' will go unnoticed by most programmers (and authors of third party library code you might be relying on!)... but the moment a non-ascii character get introduced suddenly you'll get an exception, maybe in some library code you rely on but can't fix.

The problem is that encoding an ASCII str to UTF-8 is a legal operation in some circumstances and a programming bug in others. There is no way to distinguish these two cases automatically.
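
The behavior under discussion can be sketched roughly as follows. This is only an emulation, written in Python 3 so that it runs on a current interpreter; `py2_str_encode` and its `default_encoding` parameter are hypothetical names for illustration, not a real API, and the sketch assumes the implicit step is a strict decode with the default encoding (ASCII), as the exception message suggests.

```python
def py2_str_encode(data, encoding="utf-8", errors="strict",
                   default_encoding="ascii"):
    """Emulate Python 2's str.encode() on a byte string (hypothetical helper)."""
    # Step 1: implicit decode with the default encoding. It is always
    # strict; the caller's errors argument is not used here.
    text = data.decode(default_encoding, "strict")
    # Step 2: the encode the caller actually asked for.
    return text.encode(encoding, errors)


# Pure-ASCII input silently round-trips, so the call looks like a no-op.
print(py2_str_encode(b"abc"))

# Non-ASCII input fails in the implicit decode step, even though the
# caller asked for errors="replace".
try:
    py2_str_encode(b"caf\xc3\xa9", errors="replace")
except UnicodeDecodeError as exc:
    print("implicit decode failed:", exc)
```

Both calls go through the same code path, which is why the two cases cannot be told apart automatically: only the data differs.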

As a non-English speaker I am familiar with the problems you described. This is a bug in the design of Python 2, and the only solution is to use Python 3.
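
For comparison, a small Python 3 sketch of the same situation. The sample bytes are an arbitrary illustration; the point is that `bytes` has no `encode` method and `str` has no `decode` method, so the mistake surfaces immediately instead of only when non-ASCII data appears.

```python
data = b"caf\xc3\xa9"  # UTF-8 encoded bytes (illustrative sample)

# bytes objects cannot be encoded again; the method does not exist.
print(hasattr(data, "encode"))

# The conversion has to be spelled out explicitly in each direction.
text = data.decode("utf-8")
round_tripped = text.encode("utf-8")
print(round_tripped == data)

# Likewise, text strings cannot be decoded.
print(hasattr(text, "decode"))
```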

You can experiment with your idea, but I'm afraid the patch will be more difficult than you expect and will break tests. I should warn you that even if your experiment is quite successful, there is not much chance that it will be accepted into 2.7. This is more like a new feature than a bug fix: programs that depend on it would be incompatible with previous bugfix releases. It is unlikely to help migration to Python 3; rather, it would encourage writing code that is incompatible with Python 3.