Author terry.reedy
Recipients Arfrever, ezio.melotti, gvanrossum, jkloth, lemburg, mrabarnett, pitrou, r.david.murray, tchrist, terry.reedy, v+python, vstinner
Date 2011-09-02.19:24:21
Message-id <1314991463.17.0.368545554887.issue12729@psf.upfronthosting.co.za>
In-reply-to
Content
Ezio, that is a lot of nice work to track down those pieces of the standard. I think the operative phrase in many of those quotes is 'open interchange'. Codecs are also used for private storage. If I use the unassigned or private-use code points in a private project, I would use utf-8 to save the work between active sessions. That is quite fine under the standard. But I should not put files with such codings on the net for 'open interchange'. And if I receive them, the one thing I should not do is interpret them as meaningful abstract characters.
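
A minimal sketch of the private-storage case, assuming Python 3 and a hypothetical helper named `save_and_reload`: text containing a private-use code point (U+E000, chosen only for illustration) survives a round trip through the utf-8 codec.

```python
import unicodedata

# U+E000 is the first code point of the BMP Private Use Area.
PRIVATE = "\ue000"
text = "note " + PRIVATE


def save_and_reload(s: str) -> str:
    """Round-trip through UTF-8 bytes, as when saving work between sessions."""
    return s.encode("utf-8").decode("utf-8")


restored = save_and_reload(text)
print(restored == text)               # the private-use code point is preserved
print(unicodedata.category(PRIVATE))  # general category of U+E000
```

The codec only stores and restores the code point; what it means is an agreement private to the project.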

So the codec should allow for both public and private use. I have the impression that it does so now. A Python programmer should know whether the codec is being used for private (or local group) files or random stuff from the net, and therefore what the appropriate error handling is. If they do not know, the docs could suggest that public text should normally be decoded with 'strict' or 'replace' and that 'ignore' should normally be reserved for local text that is known to intentionally have 'errors'.
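
To illustrate the three error handlers, here is a small sketch assuming Python 3. The names `RAW`, `decode_public`, and `decode_local` are hypothetical, and the stray `0xFF` byte merely stands in for whatever the codec treats as an error.

```python
# 0xFF can never appear in well-formed UTF-8, so it triggers the error handler.
RAW = b"caf\xc3\xa9 \xff ok"


def decode_public(raw: bytes) -> str:
    """For text from the net: keep going, but mark each bad sequence with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def decode_local(raw: bytes) -> str:
    """For local text known to contain intentional 'errors': drop them silently."""
    return raw.decode("utf-8", errors="ignore")


try:
    RAW.decode("utf-8", errors="strict")  # 'strict' is also the default
except UnicodeDecodeError as exc:
    print("strict rejected the input:", exc.reason)

print(decode_public(RAW))
print(decode_local(RAW))
```

With 'replace' the reader can still see that something was wrong with the input; with 'ignore' that evidence is gone, which is why it suits only text whose 'errors' are already understood.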

I am pretty sure that the intent of prohibiting non-standard interpretation of code points as abstract characters is to prevent 'organic' evolution of the mapping from code points to abstract characters, in which anyone (or any company) who wants to do so creates a new pairing and promotes its wide recognition around the world. Conforming implementations are strict in both what they produce (publicly) *and* in what they accept (from public sources). Many now think that liberal acceptance of sloppy HTML promoted sloppy production of HTML.
History
Date User Action Args
2011-09-02 19:24:23 terry.reedy set recipients: + terry.reedy, lemburg, gvanrossum, pitrou, vstinner, jkloth, ezio.melotti, mrabarnett, Arfrever, v+python, r.david.murray, tchrist
2011-09-02 19:24:23 terry.reedy set messageid: <1314991463.17.0.368545554887.issue12729@psf.upfronthosting.co.za>
2011-09-02 19:24:22 terry.reedy link issue12729 messages
2011-09-02 19:24:21 terry.reedy create