Title: IDLE uses the locale encoding for config files
Type: enhancement Stage: test needed
Components: IDLE Versions: Python 3.10
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: serhiy.storchaka, taleinat, terry.reedy
Priority: normal Keywords:

Created on 2020-06-28 15:26 by serhiy.storchaka, last changed 2020-06-28 20:29 by terry.reedy.

Messages (3)
msg372520 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-06-28 15:26
IDLE uses the locale encoding for reading and writing config files. Default config files are ASCII-only, but if user config files contain non-ASCII data, this makes them non-portable and dependent on the environment in which IDLE runs.

Could they contain file paths? If yes, then not all file paths can be saved.

In any case it is better to use a fixed encoding for config files (ASCII or UTF-8).
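
A minimal sketch of what a fixed encoding could look like, using only the standard configparser API. The section name, option name, file name, and path below are made up for illustration; this is not IDLE's actual code. The last part shows why a legacy 8-bit codec such as cp1252 cannot store every file path.

```python
import configparser
import os
import tempfile

# Hypothetical section, option, and path, for illustration only.
sample_path = "/home/użytkownik/файл.py"
parser = configparser.ConfigParser()
parser["RecentFiles"] = {"last": sample_path}

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "config-example.cfg")
    # Explicit encoding: the bytes on disk no longer depend on the locale.
    with open(cfg_path, "w", encoding="utf-8") as f:
        parser.write(f)

    loaded = configparser.ConfigParser()
    loaded.read(cfg_path, encoding="utf-8")
    roundtrip = loaded["RecentFiles"]["last"]

# A legacy 8-bit codec cannot represent every path.
try:
    sample_path.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False
```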
msg372529 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-06-28 19:15
I opened #41152 to revise the iomenu code that sets encoding and errors. The encoding is always 'utf-8' when testing and running on Windows and, I believe, macOS.  So both that issue and this one are about finding and using the locale on *nix, and would require testing on *nix.

Serhiy, I notice that you have been fixing similar issues for other modules.  I have been meaning to ask you to help us review the multiple uses of iomenu.encoding throughout IDLE, so thanks for opening this ;-).

It seems to me that all encoded text within IDLE and through any inter-process streams should be utf-8.  In particular, when you and I last revised the interprocess file classes, in, we left the encoding and errors parameters to be set from the iomenu values.  (The 'utf-8' and 'strict' defaults are never used.)  Since user code can print any unicode, I think the encoding should just be set to utf-8 to transparently pass on and possibly display anything the user sends.  Such a change should have no back-compatibility issues.  Agreed?

All filenames entered and displayed within IDLE are unicode and hence utf8-encodable.  And they *can* appear in any of the .idlerc config-xyz.cfg files and in breakpoints.lst and recent-files.lst.  So they *should* be saved in utf-8, though in the absence of visible issues, doing so seems like a low priority.

Problem 1 is that users can and do edit these files, and may even add one, and they might not save in utf8.  We could add a comment "# If this file has non-ascii, save in utf-8." (and try reading with the locale encoding if reading with utf-8 fails).
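
The fallback idea can be sketched roughly as follows. The function name read_config and its fallback parameter are hypothetical, not part of idlelib; the sketch only assumes the standard ConfigParser.read(filenames, encoding=...) call, which lets a UnicodeDecodeError propagate.

```python
import configparser
import locale
import os
import tempfile


def read_config(path, fallback=None):
    """Parse path as UTF-8; on a decode error, retry with a fallback encoding.

    fallback defaults to the locale encoding. Name and signature are
    hypothetical, not part of idlelib.
    """
    if fallback is None:
        fallback = locale.getpreferredencoding(False)
    parser = configparser.ConfigParser()
    try:
        parser.read(path, encoding="utf-8")
    except UnicodeDecodeError:
        # Start from a fresh parser so nothing half-parsed is kept.
        parser = configparser.ConfigParser()
        parser.read(path, encoding=fallback)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # A file a user saved in a legacy encoding.
    legacy = os.path.join(tmp, "legacy.cfg")
    with open(legacy, "w", encoding="cp1252") as f:
        f.write("[main]\nname = café\n")
    legacy_name = read_config(legacy, fallback="cp1252")["main"]["name"]

    # A file saved in UTF-8 as the comment requests.
    modern = os.path.join(tmp, "modern.cfg")
    with open(modern, "w", encoding="utf-8") as f:
        f.write("[main]\nname = café\n")
    modern_name = read_config(modern)["main"]["name"]
```

One caveat with this approach: some byte sequences written in a legacy encoding happen to be valid UTF-8, so the fallback catches most but not all mis-saved files.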

Problem 2 is that these files are shared across releases.  If one currently has non-ascii written in non-utf8 by existing IDLE, and new IDLE rewrites in utf8, then existing IDLE will not be able to read the file correctly.  The best solution I can think of is to break user-config compatibility with the past, such as by switching to '.idle3rc'.  If and when we do this, I would want to change other things, such as a few defaults.  So not in 3.9, maybe in 3.10.  There would need to be enough payoff for the pain.
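
To illustrate the cross-release mismatch in both directions, here is a small sketch at the bytes level. The option names and path are invented, and cp1252 stands in for whatever locale encoding the older IDLE happens to use.

```python
# What a UTF-8-writing IDLE would store on disk.
text = "recent = /home/user/файл.py\n"
utf8_bytes = text.encode("utf-8")

# An older reader decoding with cp1252 either fails outright or
# silently gets different characters (mojibake).
try:
    old_view = utf8_bytes.decode("cp1252")
except UnicodeDecodeError:
    old_view = None
roundtrip_ok = (old_view == text)

# The reverse direction: a cp1252 file read strictly as UTF-8.
legacy_bytes = "name = café\n".encode("cp1252")
try:
    legacy_bytes.decode("utf-8")
    new_reads_legacy = True
except UnicodeDecodeError:
    new_reads_legacy = False
```

Either way the two releases disagree about the same file, which is the motivation for a separate config directory.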

I believe all other text files, which users do not edit, are already ascii or utf-8, and any new files we add that could have non-ascii should be read as utf-8.
msg372533 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-06-28 20:29
After writing the above, I discovered that IDLE *itself* does not use any particular encoding for config files.  The load function of its configparser.ConfigParser subclass calls .read(filename), which in turn calls open(filename, encoding=None), which calls locale.getpreferredencoding().  For me, this is cp1252.  (Changing the console codepage with chcp or the process locale with setlocale does not change this.  Something else might.)  So this issue currently has nothing to do with iomenu.encoding.  Rather it is a proposal that IDLE explicitly pass an encoding, in particular 'utf-8'.  This has the same issues given above.
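
A quick way to check which encoding open() picks when none is passed, as a sketch. The file name is arbitrary, and the values printed depend on the platform, the Python version, and whether UTF-8 mode is enabled, so no expected output is shown.

```python
import locale
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "probe.cfg")
    # No encoding argument: Python falls back to its locale-derived default.
    with open(path, "w") as f:
        default_encoding = f.encoding
    # Explicit encoding: the same on every platform.
    with open(path, "w", encoding="utf-8") as f:
        explicit_encoding = f.encoding

print("open() default:          ", default_encoding)
print("getpreferredencoding():  ", locale.getpreferredencoding(False))
print("open(encoding='utf-8'):  ", explicit_encoding)
```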
Date User Action Args
2020-06-28 20:29:10 terry.reedy set messages: + msg372533
2020-06-28 19:15:08 terry.reedy set type: enhancement; stage: test needed; messages: + msg372529; versions: + Python 3.10
2020-06-28 15:26:32 serhiy.storchaka create