Message 283409 - Python tracker


Author	vstinner
Recipients	Jan Niklas Hasse, abarry, ezio.melotti, lemburg, methane, ncoghlan, r.david.murray, vstinner
Date	2016-12-16.15:15:34
Message-id	<1481901334.84.0.637678025286.issue28180@psf.upfronthosting.co.za>

Content
> I believe Victor put quite a bit of time into trying to get more selective approaches to work reliably and eventually gave up.

Yeah, it just doesn't work to use more than one encoding per process. You should use the same encoding for the whole lifetime of a process.

If you decode data early using encoding A and later encode it back using encoding B, you get mojibake. The problem is that simple.
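A minimal sketch of that failure, assuming for illustration that encoding A is Latin-1 and encoding B is UTF-8 (the message does not name specific encodings):

```python
# Bytes as they arrive from outside the process: "é" encoded as UTF-8.
original = "\u00e9".encode("utf-8")

# Early in the process, the data is decoded with encoding A (assumed: Latin-1).
# Latin-1 maps every byte to a code point, so this never fails; it just
# silently produces the wrong text.
text = original.decode("latin-1")

# Later, the text is encoded back with encoding B (assumed: UTF-8).
roundtrip = text.encode("utf-8")

print(repr(text))       # two characters instead of one
print(original, roundtrip)
```

The bytes written back differ from the bytes read in, and nothing raised an error along the way.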

Using more than one encoding per process means starting to make assumptions about how data is used. For example, suppose environment variables use encoding A, but filenames should use encoding B. But what if an environment variable contains a filename? Similar issues arise for command line arguments, subprocess pipes, standard streams (sys.std*), etc.
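The environment variable case can be simulated without touching the real environment or filesystem. In this sketch, `ENV_ENCODING`, `FILENAME_ENCODING`, `read_env` and `filename_to_bytes` are hypothetical names made up for the illustration, not Python APIs, and the choice of Latin-1 and UTF-8 is an assumption:

```python
# Hypothetical per-source encodings inside one process.
ENV_ENCODING = "latin-1"       # assumed encoding A, for environment variables
FILENAME_ENCODING = "utf-8"    # assumed encoding B, for filenames


def read_env(raw: bytes) -> str:
    """Decode the raw bytes of an environment variable (hypothetical helper)."""
    return raw.decode(ENV_ENCODING)


def filename_to_bytes(name: str) -> bytes:
    """Encode a filename before handing it to the OS (hypothetical helper)."""
    return name.encode(FILENAME_ENCODING)


# The name of a file as the filesystem stores it: "café.txt" in UTF-8.
on_disk = "caf\u00e9.txt".encode("utf-8")

# A parent process passes that same filename through an environment variable.
env_value = on_disk

# Mixed encodings: decoded as an env var, then encoded as a filename.
requested = filename_to_bytes(read_env(env_value))

# Single encoding for both steps: the bytes survive the round trip.
same_encoding = env_value.decode(FILENAME_ENCODING).encode(FILENAME_ENCODING)

print(requested == on_disk)       # False: the process asks for a different name
print(same_encoding == on_disk)   # True
```

With two encodings, the process ends up requesting a file whose name does not match what is on disk; with one encoding used throughout, the name is preserved.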
