Author vstinner
Recipients sdaoden, vstinner
Date 2011-01-27.11:19:27
Message-id <1296127168.49.0.954723791274.issue11022@psf.upfronthosting.co.za>
In-reply-to
Content
> - Using locale.setlocale(..., ...)
> - Re-open causes same error, I/O layer codec has not been changed!

Yes, this is the expected behaviour with the current code.

TextIOWrapper indirectly uses locale.getpreferredencoding() to choose the file encoding. If the locale module has the CODESET constant, this function temporarily sets LC_CTYPE to "" and uses nl_langinfo(CODESET) to get the locale encoding.

locale.getpreferredencoding() has an option to leave LC_CTYPE unchanged instead of setting it to "": locale.getpreferredencoding(False).
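
As a rough illustration of the two modes, here is a sketch of the logic described above. It is not the actual stdlib source: the name preferred_encoding_sketch is hypothetical, and it assumes a POSIX platform where locale.CODESET and locale.nl_langinfo() exist.

```python
import locale


def preferred_encoding_sketch(do_setlocale=True):
    """Hypothetical re-implementation of the behaviour described above.

    Assumes a platform that provides locale.CODESET / locale.nl_langinfo.
    """
    if not hasattr(locale, "CODESET"):
        # No nl_langinfo(CODESET) available: defer to the real function.
        return locale.getpreferredencoding(do_setlocale)

    if not do_setlocale:
        # Use whatever LC_CTYPE the program has currently set.
        return locale.nl_langinfo(locale.CODESET)

    # Remember the current LC_CTYPE so it can be restored afterwards.
    old_ctype = locale.setlocale(locale.LC_CTYPE)
    try:
        try:
            # "" means: take LC_CTYPE from the environment variables.
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            # The environment names a locale that is not installed.
            pass
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, old_ctype)
```

With do_setlocale=True the result depends on the environment variables only, whatever setlocale() call the program made earlier; with do_setlocale=False it follows the current locale.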

Example:
---------------------------
$ python3.1
Type "help", "copyright", "credits" or "license" for more information.
>>> from locale import getpreferredencoding, setlocale, LC_CTYPE
>>> from locale import nl_langinfo, CODESET

>>> setlocale(LC_CTYPE, None)
'fr_FR.utf8'
>>> getpreferredencoding()
'UTF-8'
>>> getpreferredencoding(False)
'UTF-8'

>>> setlocale(LC_CTYPE, 'fr_FR.iso88591')
'fr_FR.iso88591'
>>> nl_langinfo(CODESET)
'ISO-8859-1'
>>> getpreferredencoding()
'UTF-8'
>>> getpreferredencoding(False)
'ISO-8859-1'
---------------------------

Setting LC_CTYPE directly changes the result of nl_langinfo(CODESET), but not the result of getpreferredencoding(), because getpreferredencoding() ignores the current locale: it uses its own LC_CTYPE value ("").

getpreferredencoding(False) uses the current locale and gives the expected result.

> - Using os.environ["LC_ALL"] = ...
> - Re-open works properly, I/O layer codec has been changed.

Setting LC_ALL works because getpreferredencoding() sets LC_CTYPE to "", which makes the C library read the current values of the "LC_ALL" and "LC_CTYPE" environment variables.
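
To check this on a given system, a small helper like the one below can be used. It is only a sketch: encodings_with_lc_all is a hypothetical name, and the values returned depend on which locales are installed and on the Python version, so no output is shown.

```python
import locale
import os


def encodings_with_lc_all(value):
    """Temporarily set the LC_ALL environment variable and report both
    getpreferredencoding() results (hypothetical helper)."""
    saved = os.environ.get("LC_ALL")
    os.environ["LC_ALL"] = value
    try:
        # Re-reads the environment variables via setlocale(LC_CTYPE, "").
        from_environment = locale.getpreferredencoding()
        # Looks only at the locale the program has currently set.
        from_current_locale = locale.getpreferredencoding(False)
    finally:
        # Restore the environment exactly as it was.
        if saved is None:
            del os.environ["LC_ALL"]
        else:
            os.environ["LC_ALL"] = saved
    return from_environment, from_current_locale
```

For example, encodings_with_lc_all("fr_FR.iso88591") returns the pair for that locale name, assuming the locale is installed.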

--

Actually, TextIOWrapper doesn't use the current locale; it only uses (indirectly) the environment variables. I don't know which behaviour is better.

If you would like TextIOWrapper to use your current locale, use: open(filename, encoding=locale.getpreferredencoding(False)).

Anyway, I don't understand why you change your locale, since you know that your file encoding is Latin1. Why don't you directly use: open(filename, encoding='latin1')?
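
A minimal sketch of both options follows. The helpers read_text and demo are hypothetical names, and the temporary file only exists to make the example self-contained.

```python
import locale
import os
import tempfile


def read_text(path, encoding=None):
    """Read a text file with an explicit encoding.

    If no encoding is given, fall back to the encoding of the *current*
    locale rather than the one derived from the environment variables.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    with open(path, encoding=encoding) as f:
        return f.read()


def demo():
    """Write a Latin-1 encoded file, then read it back as Latin-1."""
    text = "caf\u00e9"  # 'é' is the single byte 0xE9 in Latin-1
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(text.encode("latin1"))
        # The encoding is known, so no locale change is needed.
        return read_text(path, encoding="latin1")
    finally:
        os.remove(path)
```

Passing encoding='latin1' makes the result independent of both the locale and the environment.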
History
Date                 User      Action  Args
2011-01-27 11:19:28  vstinner  set     recipients: + vstinner, sdaoden
2011-01-27 11:19:28  vstinner  set     messageid: <1296127168.49.0.954723791274.issue11022@psf.upfronthosting.co.za>
2011-01-27 11:19:27  vstinner  link    issue11022 messages
2011-01-27 11:19:27  vstinner  create