Author terry.reedy
Recipients serhiy.storchaka, taleinat, terry.reedy
Date 2020-06-28.19:15:07
Message-id <1593371708.36.0.660098958981.issue41148@roundup.psfhosted.org>
In-reply-to
Content
I opened #41152 to revise the iomenu code that sets encoding and errors. The encoding is always 'utf-8' when testing and running on Windows and, I believe, macOS.  So that issue and this one are about finding and using the locale on *nix, and would require testing on *nix.

Serhiy, I notice that you have been fixing similar issues for other modules.  I have been meaning to ask you to help us review the multiple uses of iomenu.encoding throughout IDLE, so thanks for opening this ;-).

It seems to me that all encoded text within IDLE and through any inter-process streams should be utf-8.  In particular, when you and I last revised the interprocess file classes in run.py, we left the encoding and errors parameters to be set from the iomenu values.  (The 'utf-8' and 'strict' defaults are never used.)  Since user code can print any unicode, I think the encoding should just be set to utf-8 to transparently pass on and possibly display anything the user sends.  Such a change should have no back-compatibility issues.  Agreed?
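
To make the proposal concrete, here is a minimal sketch of a stream proxy whose encoding and errors are fixed instead of taken from iomenu.  The class name Utf8OutputProxy and the sink callable are made up for illustration; this is not the code in run.py, and it assumes text is handed on as str rather than as bytes.

```python
import io


class Utf8OutputProxy(io.TextIOBase):
    """Hypothetical stand-in for an inter-process output stream.

    The advertised encoding is always utf-8, so any str that user
    code prints can be passed along unchanged.
    """

    def __init__(self, sink):
        # sink: any callable accepting a str (an assumption of this sketch).
        self._sink = sink

    @property
    def encoding(self):
        return 'utf-8'

    @property
    def errors(self):
        return 'strict'

    def writable(self):
        return True

    def write(self, s):
        if not isinstance(s, str):
            raise TypeError('write() argument must be str, not '
                            + type(s).__name__)
        # Pass the text on unchanged and report characters written.
        self._sink(s)
        return len(s)
```

User code that does something like text.encode(sys.stdout.encoding) would then succeed for non-ascii text regardless of the locale on *nix.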

All filenames entered and displayed within IDLE are unicode and hence utf8-encodable.  And they *can* appear in any of the .idlerc config-xyz.cfg files and in breakpoints.lst and recent-files.lst.  So they *should* be saved in utf-8, though in the absence of visible issues, doing so seems like a low priority.

Problem 1 is that users can and do edit these files, and may even add one, and they might not save in utf8.  We could add a comment "# If this file has non-ascii, save in utf-8." (and try reading with the locale encoding if reading with utf-8 fails).
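
A rough sketch of the "try utf-8, then the locale encoding" reading described above.  The helper name read_config_text and its fallback parameter are hypothetical, not existing idlelib API; the final errors='replace' is my assumption so that a badly encoded file never prevents startup.

```python
import locale


def read_config_text(path, fallback=None):
    """Return the text of a user config file, preferring utf-8.

    Hypothetical helper.  If the bytes are not valid utf-8, decode
    them with `fallback`, which defaults to the locale encoding.
    """
    with open(path, 'rb') as f:
        data = f.read()
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        if fallback is None:
            fallback = locale.getpreferredencoding(False)
        # Assumption: replace undecodable bytes rather than fail.
        return data.decode(fallback, errors='replace')
```

Pure ascii files decode the same way under either path, so only files with non-ascii content are affected by the order of the attempts.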

Problem 2 is that these files are shared across releases.  If one currently has non-ascii written in non-utf8 by existing IDLE, and new IDLE rewrites it in utf8, then existing IDLE will not be able to read the file.  The best solution I can think of is to break user-config compatibility with the past, such as by switching to '.idle3rc'.  If and when we do this, I would want to change other things, such as a few defaults.  So not in 3.9, maybe in 3.10.  There would need to be enough payoff for the pain.

I believe all other text files, which users do not edit, are already ascii or utf-8, and any new files we add that could have non-ascii should be read as utf-8.
History
Date User Action Args
2020-06-28 19:15:08 terry.reedy set recipients: + terry.reedy, taleinat, serhiy.storchaka
2020-06-28 19:15:08 terry.reedy set messageid: <1593371708.36.0.660098958981.issue41148@roundup.psfhosted.org>
2020-06-28 19:15:08 terry.reedy link issue41148 messages
2020-06-28 19:15:07 terry.reedy create