classification
Title: IDLE Internationalization
Type: enhancement Stage: test needed
Components: IDLE Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: Al.Sweigart, Todd.Rovito, ezio.melotti, mariedam, ned.deily, olberger, pitrou, roger.serwy, terry.reedy
Priority: normal Keywords: patch

Created on 2013-04-17 14:13 by mariedam, last changed 2017-10-14 21:33 by terry.reedy.

Files
File name Uploaded Description Edit
patch.diff mariedam, 2013-04-17 14:13 review
patch_2.tar.gz mariedam, 2013-04-21 09:54
Messages (14)
msg187165 - (view) Author: Damien Marié (mariedam) Date: 2013-04-17 14:13
Following the issue 17760

Internationalization should be implemented.
I propose to implement it as an optional setting first, using the gettext library.

I'm not experienced with the idlelib module, but here is a first patch; don't hesitate to comment on it. It just adds i18n to the menus for now.
msg187176 - (view) Author: Olivier Berger (olberger) Date: 2013-04-17 16:03
Excellent. I've started playing with pygettext and msgfmt and it looks like this works, from the initial tests I've made
msg187219 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-04-18 02:46
Whether and how much to internationalize Idle is being discussed on idle-dev thread "I18n of IDLE's interface ?". I was going to suggest there that the menu system would be the first place to start. A prime concern for me is that we not break anything (hence some of the questions below), and the menu labels seem relatively safe (compared to format strings -- see post on thread).

While locale is used to format dates and times, I believe this would be the first use of gettext within the stdlib itself. To me it is plausible to do because the idlelib modules are used to write and run code rather than being imported into code. I still have this concern: if a beginner can manage to handle 'English' keywords, builtin names, and exception names and messages*, does having translated menu labels give enough benefit to be worth the bother? I am open to the answer being yes, but before I were to commit this patch to the CPython repository, I would like to see a working example translation and a report of a field test with real students.

(* I am not including the stdlib because many beginners' programs make little use of imports.)

As for the patch: It looks good as far as it goes, but I have little knowledge of locale and gettext beyond the bare bones and no experience with either. The gettext doc is not all that clear to me, and it seems exclusively unix-focused, whereas I am on Windows. My questions:

1. Does it actually work on Windows (and Mac), without bugs?
(I could sometime look at test_gettext and try it on Windows, but not on the machine I am on at the moment.)

2. +gettext.bindtextdomain('idlelib')
What does this actually do? Where do .mo files go on the various OSes? This sort of doc has to be part of a patch.

3. Does the gettext machinery look at an environmental locale variable behind the scenes? Is that how it decides on the translation language, if any?

4. +_ = gettext.gettext
Leave aside the issue of doing the binding in builtins versus each module, as in the patch (this seems safer). The doc is skimpier than I would like: "Return the localized translation of message, based on the current global domain, language, and locale directory." The latter part is part of my question above.

As for the first part: is the default behavior to simply echo the text passed in? If Idle executes in a non-default environment/locale, but there is no translation file, does it echo the original string (English) or raise an exception? Same question if there is an appropriate translation file but no entry for the particular string? Overall, does gettext *ever* raise an exception, or does it *always* return a string of the correct type, or might it return bytes when unicode is expected (in 3.x)? In other words, can replacing a string literal with a gettext call cause Idle to crash?
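For questions 3 and 4, a minimal sketch of the documented fallback behavior in Python 3. It assumes a freshly created, empty directory stands in for a locale directory with no idlelib translation, and the variable names are illustrative only. When no languages are given explicitly, gettext consults the LANGUAGE, LC_ALL, LC_MESSAGES, and LANG environment variables, in that order.

```python
import gettext
import tempfile

with tempfile.TemporaryDirectory() as empty_localedir:
    # bindtextdomain() records where .mo files for a domain are looked up
    # and returns the directory now bound to that domain.
    bound_dir = gettext.bindtextdomain('idlelib', empty_localedir)

    # With no catalog available, the function-style API echoes the message.
    echoed = gettext.dgettext('idlelib', 'New File')

    # The class-style API raises OSError (FileNotFoundError) by default...
    try:
        gettext.translation('idlelib', empty_localedir)
        raised = False
    except OSError:
        raised = True

    # ...unless fallback=True, which returns a NullTranslations object
    # that also returns the message unchanged.
    null_catalog = gettext.translation('idlelib', empty_localedir,
                                       fallback=True)
    echoed_by_fallback = null_catalog.gettext('New File')

print(bound_dir == empty_localedir, echoed, raised, echoed_by_fallback)
```

So, under these assumptions, replacing a string literal with a gettext.gettext call should not raise merely because a catalog or an entry is missing; only the class-style translation() call without fallback=True does.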

5. Focusing only on the menus, do you expect anything more applied to the repository than a patch like this and a doc patch? Who do you see as running Tools/i18n/pygettext.py: a core dev, one other person, or each translator? What do you see happening with the .pot file for each release? Include it with the release? Distribute it on PyPI? Or have each translator regenerate it? And what about .mo files? It would seem silly to have multiple French .mo files, although I can imagine that different teachers might disagree on the best translation for their students ;-).

Let me put it another way. The patch by itself is useless. In fact, even if it is completely transparent to users, it will *not* be transparent to Idle developers working on the code. It will actually be a detriment* unless there is additional work done. Who do you two, or any other advocates of IdleI18n, envision as doing the various tasks needed to make it useful?

Perhaps there should be an IdleI18n project on PyPI. In fact, such a project could be done without 'official' cooperation. If indeed there is no such project, I would wonder whether such absence indicates an absence of need. Or is it an absence of knowledge of how? Testing something as a 3rd party distribution and getting community acceptance is one normal way for things to get added to the stdlib.

(6. Suppose English-speaking teachers or users might want to customize the menu labels. Can that also be done with gettext?)

* Besides uglifying the code a bit, this patch will break any existing patches on the tracker that target the same lines and make tracing the history of any patched line through hg annotate harder.

Thought also needs to be given to the extension mechanism. As I understand it, pygettext.py will not pick up menu entries dynamically added by extensions. Roger, the extension expert, might comment on this.
msg187248 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2013-04-18 13:53
Extensions would need to be modified to use the gettext module.
msg187266 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2013-04-18 16:38
Also, IDLE makes use of features provided by Tk and those vary by platform.  In some cases, IDLE uses some Tk-supplied default menus and menu items.  So internationalization of IDLE would need to investigate and make use of Tk i18n features on all supported platforms.
msg187268 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-18 16:48
+1 for internationalizing IDLE. Any decent GUI app is internationalized these days, even developer tools.

Re: gettext, its semantics and API are a bit of a PITA, but it's basically the standard for internationalization of FLOSS projects. There are all kinds of tools to help edit gettext localization files.

Re: how to organize translation work, Olivier Berger already answered on idle-dev, I believe.
msg187498 - (view) Author: Damien Marié (mariedam) Date: 2013-04-21 09:54
Here is a new patch featuring:
_ a setting to disable idle i18n
_ documentation

Things needed:
_ taking into account Windows (where IDLE is mainly used)
_ a much more in-depth translation of the interface: context menus, dialogs, ...
_ unit-testing it

To test it yourself without touching your /usr/share/locale, you can change the bindtextdomain() call (in i18n.py) to point to another dir:
Like " gettext.bindtextdomain('idlelib',"/home/you/your_trans_dir/") "

And put the "en" dir from the tar file in this dir.
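
The directory layout that gettext looks for under such a directory can be sketched as follows. This assumes the "en" dir in the archive follows the standard <localedir>/<language>/LC_MESSAGES/<domain>.mo layout. The empty placeholder file only stands in for a real compiled catalog, and a temporary directory replaces /home/you/your_trans_dir/.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as trans_dir:
    catalog_dir = os.path.join(trans_dir, 'en', 'LC_MESSAGES')
    os.makedirs(catalog_dir)
    expected = os.path.join(catalog_dir, 'idlelib.mo')
    # Placeholder only: a real .mo file is compiled from a .po file.
    open(expected, 'wb').close()

    # find() only checks that the file exists; it does not parse it.
    found = gettext.find('idlelib', trans_dir, languages=['en'])
    # No 'fr' directory was created, so nothing is found for French.
    missing = gettext.find('idlelib', trans_dir, languages=['fr'])

print(found == expected, missing)
```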

The .mo generation is explained in the module documentation.

So, here is a tar archive with:
_ a screenshot of the patch in action
_ the patch
_ the trans dir to try it by yourself
_ the .po file (thanks to Olivier Berger)
msg188345 - (view) Author: Damien Marié (mariedam) Date: 2013-05-04 09:26
A side note to justify the localization of IDLE:

I think the internationalization part is important. It's a nearly invisible overhead for the code, but it will be helpful. For example:

- In France, most high school students have to learn Python, so they learn the keywords and logic, but most of them have not mastered English. The same applies in prep school, as stated in issue 17760
- Making the beginners more confident with the tool (as it's mostly used by beginners)
- Most IDEs are localized and the OS is localized, so it's a matter of consistency to localize the IDLE UI

But I'm against localizing exception messages and anything built into Python.

I hope I have responded to some concerns.
msg234473 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-01-22 05:20
I answered my Q1 in msg187219: test.test_gettext is currently passing, with no skips, on 2.7 and 3.4 on Win 7.

patch.diff: I would rather add the 4 lines of the proposed idle_i18n.py to an existing module, perhaps Bindings.py itself, since that is the first place _ will be used.  I think +-60 modules is already too many.

The binding of '_' to gettext.gettext conflicts with the somewhat common use of '_' as a dummy identifier.  I do not know of any such uses in idlelib, but there might be.  There are about 4500 lines in idlelib with '_'; too many to review.  Someone should do a more refined search with an re that excludes '_' preceded or followed by an identifier char, to skip '__xyz__' or '_x' or 'y_'.

If '_' is used for gettext, a new rule to not otherwise bind '_' should be added to the currently non-existent Idle maintainer guide.

patch_2.tar.gz is not readable by Rietveld, Firefox, IE, or Windows.  Patches should be uploaded as plain text .diff or .patch files.

Damien: Contributors must submit a signed Contributor Agreement. See https://www.python.org/psf/contrib/ and https://www.python.org/psf/contrib/contrib-form/ (the online form).  Please do this even before re-uploading patch2.  Receipt and acceptance of a form is acknowledged by addition of an * after "Author: nick(real name)".
msg240539 - (view) Author: Al Sweigart (Al.Sweigart) * Date: 2015-04-12 04:34
> Someone should do a more refined search with an re that excludes '_' preceded or followed by an identifier char, to skip '__xyz__' or '_x' or 'y_'.

I've run this regex over all the .py and .pyw files in idlelib: [^_'"a-zA-Z0-9]_[^a-zA-Z0-9_]
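
A minimal sketch of how such a search might be scripted, using the regex quoted above on a few made-up sample lines (the helper name and the sample lines are illustrative, not taken from idlelib). One limitation worth noting: because the pattern requires a character before the underscore, it cannot match a '_' in the first column of a line.

```python
import re

# The pattern quoted above: a lone '_' that is neither preceded nor
# followed by an identifier character (or preceded by a quote).
STANDALONE_UNDERSCORE = re.compile(r"""[^_'"a-zA-Z0-9]_[^a-zA-Z0-9_]""")


def find_standalone_underscores(lines):
    """Return (line_number, line) pairs where the pattern matches."""
    return [(number, line)
            for number, line in enumerate(lines, 1)
            if STANDALONE_UNDERSCORE.search(line)]


# Hypothetical sample lines, for illustration only.
sample = [
    "for _ in range(3):",       # dummy identifier: matched
    "def __init__(self):",      # dunder name: skipped
    "_private = 1",             # leading-underscore name: skipped
    "builtins._ = None",        # attribute named '_': matched
    "label = _('New File')",    # gettext-style call: matched
    "_ = gettext.gettext",      # '_' in column 0: missed by this pattern
]

for number, line in find_standalone_underscores(sample):
    print(number, line)
```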

The only possible conflict I've found is in rpc.py's displayhook() function, which sets builtins._ to the argument passed to displayhook().

There is a cryptic comment: # Set '_' to None to avoid recursion

I'm not sure what the reasoning behind this code is. This was written by Andrew Svetlov for issue 14200
msg248008 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-08-04 23:26
I presume setting builtins._ is part of imitating the shell. It could be replaced with setattr.
msg248245 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-08-08 02:51
I just noticed https://pypi.python.org/pypi/idle-lif/1.0 Python IDLE Language Pack.  Have not looked at it.

If someone decides to work on this, I have ideas on how i18n could be done with minimal impact on the code, partly based on https://pythonhosted.org/flufl.i18n/
msg304403 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-10-14 21:26
In the last few months, configdialog has been refactored to have a class for each settings tab.  This makes it easier to revise and add new options, such as 'menu language'.

With this done, the IDLE features implemented as (optional) extensions were turned into normal features.  Non-key option settings were moved to the General tab. The menu items were moved to mainmenu.py.  Having everything in one place should make translation easier.  When making further menu system changes, I will keep I18n in mind.

AFAIK, there is no runtime need to have 60+ translation calls in the invisible mainmenu.menudefs structure, which would make it harder to read.  I would rather change the 'label=name' option in menu item insertion calls to 'label=_(name)' or 'label=gettext(name)' or possibly 'label=trandict[name]'.  In macosx the explicit "label='window'" would need the same change.  So would the scattered context menu insertion calls.

Just for translating the menu, gettext seems possibly like overkill.  I remain reluctant to use it without answers to my questions above.  Another question: can the automatic selection mechanism be overridden by the user?
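
On the override question: as far as the documented API goes, the environment variables are only consulted when no explicit languages list is passed, so a user setting could take precedence. A minimal sketch, assuming empty placeholder files stand in for real German and French catalogs and that a hypothetical user preference of 'fr' comes from an IDLE option (the helper name is made up for this example):

```python
import gettext
import os
import tempfile
from unittest import mock


def make_placeholder_catalog(localedir, language, domain='idlelib'):
    """Create an empty stand-in for <localedir>/<language>/LC_MESSAGES/<domain>.mo."""
    catalog_dir = os.path.join(localedir, language, 'LC_MESSAGES')
    os.makedirs(catalog_dir)
    path = os.path.join(catalog_dir, domain + '.mo')
    open(path, 'wb').close()
    return path


with tempfile.TemporaryDirectory() as localedir:
    de_path = make_placeholder_catalog(localedir, 'de')
    fr_path = make_placeholder_catalog(localedir, 'fr')

    # Pretend the environment asks for German.
    with mock.patch.dict(os.environ, {'LANGUAGE': 'de'}):
        # No languages argument: the environment decides.
        from_environment = gettext.find('idlelib', localedir)
        # Explicit languages argument: the environment is not consulted.
        from_user_setting = gettext.find('idlelib', localedir,
                                         languages=['fr'])

print(from_environment == de_path, from_user_setting == fr_path)
```

The same languages argument is accepted by gettext.translation(), so a 'menu language' option could be passed straight through.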

All that is needed is a simple template file that creates a translation dictionary.  I think translation would be easiest if the template preserved both the order and hierarchical structure.  It would be trivial to write a function to create a full version of the following.

trandict = {
    'file': {
        '_New File': ...,
        # ...
    },
    # ...
}
[Note: top level names like 'file' are lowercased in menudefs but then capitalized in the menu.  Maybe they should be uppercased to start with.]
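
A minimal sketch of the template idea, assuming menudefs is a list of (menu name, items) pairs in which each item is either a (label, event) tuple or None for a separator. The abbreviated sample data and the helper names make_template and translate_label are hypothetical, not existing idlelib code.

```python
# Abbreviated, hypothetical stand-in for mainmenu.menudefs.
sample_menudefs = [
    ('file', [
        ('_New File', '<<open-new-window>>'),
        ('_Open...', '<<open-window-from-file>>'),
        None,  # separator
        ('_Save', '<<save-window>>'),
    ]),
    ('edit', [
        ('_Undo', '<<undo>>'),
    ]),
]


def make_template(menudefs):
    """Build an empty translation template that keeps order and hierarchy."""
    template = {}
    for menu_name, items in menudefs:
        # Skip separators; each label starts with an empty translation.
        template[menu_name] = {item[0]: '' for item in items
                               if item is not None}
    return template


def translate_label(trandict, menu_name, label):
    """Return the translated label, or the original if none is available."""
    return trandict.get(menu_name, {}).get(label) or label


template = make_template(sample_menudefs)

# A translator fills in entries; untranslated ones fall back to English.
trandict = make_template(sample_menudefs)
trandict['file']['_New File'] = '_Nouveau fichier'
print(translate_label(trandict, 'file', '_New File'))
print(translate_label(trandict, 'file', '_Save'))
```

Falling back to the original label means a partial or missing translation cannot raise at menu-insertion time, which speaks to the earlier concern about not breaking anything.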

The output instead could match whatever the gettext machinery requires.  I don't know if the gettext extractor preserves the order, but I am sure it will not preserve the structure.

Recently, translations were added to the official doc site.  A translation of the IDLE section should have a good-enough translation of the menu that can be extracted into a dict for IDLE.  The only problem is that they will not have the underscores for hot keys.  I don't know if these are universally used.

The Japanese translation includes the IDLE section.  The top level names (File, Edit, ... Help) are not translated.  Perhaps the team felt that such terms should be familiar enough to Japanese users to not need translation.  The dropdown menu labels *are* translated, but keep the English, as in 'New File [新規ファイル]'.  In any case, this is enough for experiments.
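
For such experiments, labels in that 'English [translation]' form could be split with something like the sketch below. The helper name and the assumption that every translated label follows exactly this bracket pattern are mine, and the hot-key underscores would still have to be added by hand.

```python
import re

# English text, optional whitespace, then the translation in square brackets.
LABEL_RE = re.compile(r'^(?P<english>.+?)\s*\[(?P<translated>.+)\]$')


def split_label(text):
    """Return (english, translated); translated is None if no brackets."""
    text = text.strip()
    match = LABEL_RE.match(text)
    if match is None:
        return text, None
    return match.group('english'), match.group('translated')


english, translated = split_label('New File [新規ファイル]')
print(english, translated)
```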
msg304404 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-10-14 21:33
Ned> IDLE uses some Tk-supplied default menus and menu items

On Windows and *nix, the visible menu labels all come from IDLE.  I don't yet understand what is going on with Mac menus.  Some menu items call the tk file dialog, and the config highlight tab calls the color dialog, and these are localized however they are.  But that does not affect the menus.

Ned> So internationalization of IDLE would need to investigate and make use of Tk i18n features on all supported platforms.

Other than the dialogs mentioned above, (and the font dialog that is not used) I don't know what features you mean.
History
Date User Action Args
2017-10-14 21:33:05 terry.reedy set messages: + msg304404
2017-10-14 21:26:12 terry.reedy set assignee: terry.reedy
    messages: + msg304403
    versions: + Python 3.7, - Python 2.7, Python 3.4, Python 3.5
2015-08-08 02:51:40 terry.reedy set messages: + msg248245
2015-08-04 23:26:51 terry.reedy set messages: + msg248008
2015-04-12 04:34:33 Al.Sweigart set messages: + msg240539
2015-01-22 05:20:03 terry.reedy set messages: + msg234473
    versions: + Python 3.5, - Python 3.3
2015-01-06 22:30:06 Al.Sweigart set nosy: + Al.Sweigart
2013-05-04 09:26:18 mariedam set messages: + msg188345
2013-04-21 09:54:32 mariedam set files: + patch_2.tar.gz
    messages: + msg187498
2013-04-18 16:48:23 pitrou set nosy: + pitrou
    messages: + msg187268
2013-04-18 16:38:10 ned.deily set nosy: + ned.deily
    messages: + msg187266
2013-04-18 13:53:31 roger.serwy set messages: + msg187248
2013-04-18 12:52:57 r.david.murray link issue17760 superseder
2013-04-18 03:31:06 Todd.Rovito set nosy: + Todd.Rovito
2013-04-18 02:46:29 terry.reedy set messages: + msg187219
    stage: patch review -> test needed
2013-04-17 16:14:39 ezio.melotti set nosy: + terry.reedy, ezio.melotti, roger.serwy
    stage: patch review
    versions: + Python 2.7, Python 3.3, Python 3.4
2013-04-17 16:03:36 olberger set nosy: + olberger
    messages: + msg187176
2013-04-17 14:13:03 mariedam create