Author reingart
Recipients Ramchandra Apte, ezio.melotti, georg.brandl, neologix, pitrou, r.david.murray, reingart, terry.reedy
Date 2012-10-30.23:39:10
Message-id <1351640352.62.0.184930926213.issue16344@psf.upfronthosting.co.za>
In-reply-to
Content
Sorry for taking so long to reply, and for this long follow-up...

> Antoine Pitrou added the comment:

> I think the PEP should be proposed on python-dev or python-ideas.
> Also, it's probably better if the PEP is encoded in utf-8, not 
> latin-1.

OK, I'll update, polish, encode in utf-8 and send it to python-dev.
It was already discussed on python-ideas (maybe not in detail), but it seems that no one has more to add there, or they are busy with the Async API :-).


> Terry J. Reedy added the comment:

> I am sympathetic with non-English speakers wanting a native-language
> translation. 

Is "sympathetic" a kind of compassion?
That may be the correct meaning here.
I just read that almost all of you object that you would hesitate because you could receive a message in a foreign language that you could not understand.
Well, that is what is already happening to non-English speakers of the Python language, IMHO.
And it is not just frustrating; sometimes it is also a waste of time because of the distractions and delays it produces.

> But I think the interpreter should *always* emit the standard message
> and that any translation should be an addition, not a replacement.
> This would maintain discoverability and help people learn the English
> version, not hinder it.

I'll explore the alternatives for showing both messages (original and translated), but I think that would be more confusing.
I do not think that translation hinders the meaning, it just translates it, and I haven't seen any other language / tool that shows both messages, but I'll investigate more (maybe the exception name, which is untranslated, plus an error code like in PostgreSQL would be more helpful for discoverability).
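
To make the two display options concrete, here is a minimal sketch (not part of any existing implementation) of how a message could be rendered with the untranslated exception name plus an error code, optionally followed by the original English text. The function name format_error, the code "PY0001" and the Spanish wording are all hypothetical, invented only for illustration.

```python
def format_error(exc, translated, code=None, show_original=False):
    """Render a translated message while keeping the exception name as-is.

    `code` is a hypothetical, language-neutral identifier (in the spirit of
    PostgreSQL error codes); `show_original` appends the English message.
    """
    name = type(exc).__name__  # the exception name is never translated
    prefix = "{} [{}]".format(name, code) if code else name
    lines = ["{}: {}".format(prefix, translated)]
    if show_original:
        lines.append("{}: {}".format(name, exc))
    return "\n".join(lines)


error = ValueError("too many values to unpack (expected 2)")
spanish = "demasiados valores para desempaquetar (se esperaban 2)"

# Translated message only, searchable through the name and the code.
print(format_error(error, spanish, code="PY0001"))

# Both messages, as suggested in the quoted comment.
print(format_error(error, spanish, show_original=True))
```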

Learning English by showing both messages may be an interesting experiment but, for me, it's like traditional education that focuses on "memorizing" things instead of understanding them, and depending on the context, it can lead to good results or to misleading repetition.

> The real question to me is how deep in the interpreter such support
> should go. Third party shells can (and sometimes do) intercept
> tracebacks and reformat (and translate) as they wish. But there would
> be advantages and disadvantages in adding the translation sooner.

About the except hook approach: it doesn't work very reliably because you don't have the original unformatted message, so you have to reverse the interpolation in the formatted result to find the correct translation.
Besides being slower and potentially error-prone, the main problem is not technical but "social", as it could lead to duplication of translation effort, segregation and proliferation of custom tools, with the aggravating factor that in some scenarios the except hook is not honoured:

http://bugs.python.org/issue12643 (just an example)

You can take a look at one of my attempts to translate using interpolation (my algorithm is a kind of brute-force "guessing" using regular expressions, just to test the idea):

http://code.google.com/p/pydiversity/source/browse/__diversity__/__init__.py
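
As a rough illustration of why this is fragile (a simplified sketch, not the code from the link above): the hook only sees the already formatted English text, so each message needs a regular expression that recovers the interpolated values. The PATTERNS table, the function names and the Spanish wording are assumptions made up for this example.

```python
import re
import sys
import traceback

# Hypothetical catalog: regex over the formatted English text -> template.
PATTERNS = [
    (re.compile(r"^too many values to unpack \(expected (\d+)\)$"),
     "demasiados valores para desempaquetar (se esperaban {0})"),
    (re.compile(r"^name '(.+)' is not defined$"),
     "el nombre '{0}' no está definido"),
]


def translate_message(message):
    """Guess the translation by matching the formatted message."""
    for pattern, template in PATTERNS:
        match = pattern.match(message)
        if match:
            return template.format(*match.groups())
    # Unknown or reworded messages fall through untranslated.
    return message


def translating_excepthook(exc_type, exc_value, exc_tb):
    """Print the traceback, then the exception with a guessed translation."""
    traceback.print_tb(exc_tb, file=sys.stderr)
    text = translate_message(str(exc_value))
    print("{}: {}".format(exc_type.__name__, text), file=sys.stderr)


# It could be installed with: sys.excepthook = translating_excepthook
try:
    a, b = 1, 2, 3
except ValueError:
    translating_excepthook(*sys.exc_info())
```

Any change in the wording of a message between Python versions makes the pattern miss, and the message silently stays in English, which is the kind of unreliability described above.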

I think that the approach of leaving it "in the wild" (and/or "do it yourself") is not only more complex, it could also be more dangerous than having unified translation resources, where all messages are listed, a common infrastructure is used and general rules are agreed.


> Ezio Melotti added the comment:

> There are two solutions to this problem:
> 1) adapt the language to the users;
> 2) teach the users English;
>
> While the first (i.e. what you are proposing) works as a short term 
> solution, I believe the second is a much better long term solution, 
> because IMHO users will anyway have to learn English sooner or later.

Teaching the users English may be an altruistic goal in the long term, but for many teachers (as in my case) it is a barrier right now that can tip the balance towards other "more friendly languages".

Anyway, and don't get me wrong, but forcing novice users to learn a second language, aside from being likely impractical, may sound at least rude, ethnocentric or like neocolonialism in some contexts (if we want to go further...).
Education takes a lot of resources, and I don't think it would happen just by showing some English messages (BTW, English may be one of the most difficult languages to learn as a second tongue, depending on the part of the world you live in... at least in my country you can only achieve an acceptable skill after 6 to 9 years, depending on your age and other socio-economic factors).

IMHO, a message like this would be more encouraging: "we can help you in your first programming steps with Python localized for your language, but please consider learning English to better communicate in the international community".

> I've seen buildbots reporting unintelligible error messages in 
> German, and just a few days ago I even came across a mercurial 
> version in Russian.

Well, I think this only reinforces my point.
Ignoring that many people out there are localizing their products/projects will not solve that problem either.

Being more aware of internationalization tools could do the trick, not only for Python tracebacks but also for third-party modules/libraries that are currently translating their messages.

> It makes somewhat sense to translate OS error messages, because they 
> are read by regular users that have a localized OS and expect
> localized messages.  The same could be said for bash, even if the 
> distinction between "regular users" and "developers" starts to fade a
> bit here.

Exactly.
Please consider that 11-year-old pupils learning to program a robot, or 16-year-old students trying to understand a simple algorithm, are users too.

Many may continue with an IT / CS career, if we do good work ;-)
Those who continue working in programming will surely be exposed sooner or later to a formal technical English course at university or similar.
But if they don't continue their studies, or choose a different career, maybe their English skills will never be enough.

> For example the other day I saw a student confused by this error message:
> >>> a, b = 1, 2, 3
> ValueError: too many values to unpack (expected 2)
>
> The offender here is most likely the word "unpack". "Unpack" is 
> closely related to the concept of tuple unpacking, so if the student
> is aware of what tuple unpacking is, he might fail to associate the
> problem with it if the error uses another word.  In addition, I can
> not think of any word that might be a suitable translation for 
> "unpack" in my native language.  In Spanish "desempaquetar" could 
> maybe be used, but I'm not sure how well it works.

Enhancing some cryptic exception messages can be a parallel job, and it could benefit from the different points of view that internationalization into different languages opens up.
Here, as you point out, translation brings a new perspective; why take that as a threat instead of an opportunity to bring better messages?
 
I don't agree that this is a beginner-only problem.
I remember an occasion where, at the end of a coding dojo, a syntax error was raised and the attendee could not complete their work.
We (a University Teacher, a Teaching Assistant and a Core Developer!) spent a lot of time discovering that it was caused by changing 07 to 08.
Of course, the "SyntaxError: invalid token" was not helpful because the nice editor didn't print the traceback in a proper monospaced font (BTW, it took a while to understand why the ^ was pointing at the 8)...
Anyway, a better error like "SyntaxError: invalid token for octal representation" would be more helpful, even in English ;-)
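
For anyone who wants to reproduce the 08 case, here is a small sketch that compiles the offending literal and reports what the parser says. The helper name describe_syntax_error is made up for this example, and the exact wording of the message depends on the Python version, so no particular text is assumed.

```python
def describe_syntax_error(source):
    """Return (message, line, column) for a SyntaxError, or None if valid."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        # exc.offset is the 1-based column that the ^ marker is derived from.
        return exc.msg, exc.lineno, exc.offset
    return None


# 8 is not an octal digit, so a literal with a leading zero is rejected.
print(describe_syntax_error("x = 08"))

# An explicit octal literal, valid in Python 2.6+ and Python 3.
print(describe_syntax_error("x = 0o7"))
```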

So the argument that "the exception must be in English" to be able to google it may be weak at least (apart from being incomplete, an exception message depends on its context and is just a part of the whole traceback, messages change over time on some occasions, and they may be difficult to copy & paste correctly...).

On the other hand, localized error messages could eventually produce better search results for non-English speakers if there is enough material written in their language.

You can take a look at this and other nice examples I've collected so far (incomplete, of course):

http://python.org.ar/pyar/MensajesExcepcionales

>> The mechanism to restore the language is the common one ...
>
> It's not difficult to change, but you would have to remember how to
> do it and what LC_* variable you should change.  Assuming this gets
> implemented it would most likely require a command line parameter
> and an envvar too.

First: this proposal doesn't enable translation by default.

Second: if this proposal gets implemented, you don't even need to install the translation files! (so translation cannot be turned on by accident)

Third: isn't this a small price to pay for advanced users (just changing a locale setting, if they are ever required to do so), compared with the fact that it could open up Python to new languages and users?
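
The second point can be checked with the standard gettext module alone. This is only a sketch of the fallback behaviour, not the implementation in the proposal; the domain name "python_messages" and the helper load_catalog are hypothetical.

```python
import gettext
import tempfile


def load_catalog(localedir, languages):
    """Look up a message catalog, falling back to the untranslated text."""
    # "python_messages" is a made-up domain name for this sketch.
    return gettext.translation(
        "python_messages",
        localedir=localedir,
        languages=languages,
        fallback=True,  # no .mo file found -> NullTranslations, no error
    )


# An empty directory stands in for a system without translation files.
with tempfile.TemporaryDirectory() as empty_dir:
    catalog = load_catalog(empty_dir, ["es"])

message = "too many values to unpack (expected %d)"
translated = catalog.gettext(message) % 2
print(translated)
```

With no catalog installed, gettext() returns the message unchanged, so the user still sees the standard English text.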


>> If PostgreSQL and other tools could do that, why Python could not?
>
> Does any other popular programming language do it?  And if so, how?

Does any other popular language use indentation instead of brackets?

I doubt topics like this can be compared directly (due to some differences in communities, goals, etc.), but I did a quick search and found this:


.NET supports internationalization with a special mechanism called Culture (similar to locale/gettext).
AFAIK it is done by default and embedded in the platform and system libraries.
They even have an "exception message design guideline" where they say "Localizing the error message helps non-English speakers feel more comfortable on our platform."
Some argue that it is not easy to disable this feature to get English-only messages. Another pitfall is that if the resource file is missing, it can produce incorrect error messages instead of showing the English one (neither should happen here, as gettext is somewhat more manageable).
Other benefits/objections presented are similar to the ones discussed here, for example:

http://blogs.msdn.com/b/brada/archive/2004/01/28/64255.aspx


Java has Throwable.getLocalizedMessage(), hence the dual approach, but AFAIK it is mainly unimplemented (at least for system libraries and internal exceptions).
It is not an automatic approach (it depends on ResourceBundle, MessageFormat, etc.), so it is not easy to implement anyway.


> Ramchandra Apte added the comment:

> Unless Python's grammar is translated into other languages I'm -1 
> on this.

I think Python grammar is not comparable to English grammar.
I've already pointed, for example, to a similar approach in PostgreSQL, where SQL statements were not translated.

> I don't see any use of this. You anyway have to know English 
> to understand the docs and Python's grammar is English.

Keywords are similar to English words, but that is all (some are not even English words, like __rmul__).
You don't need to know how to write correct English sentences to write Python code, or am I missing something?
Punctuation is different too.

That most of the documentation is not translated is not an excuse to me.
At least some part should be translated too; for example, the Python Tutorial was translated into Spanish by the local community:

http://docs.python.org.ar/tutorial/contenido.html

> Terry J. Reedy added the comment:
>
>>It'll get tricky if in a couple months, we start getting bug reports 
>>with traceback in Finnish or French...
>
> That is another reason to *always* output the standard English message
> first. I think this was discussed a couple of years ago on PyDev, or
> maybe another issue. I think the proposal to *replace* English
> messages should be rejected.

I think it is unlikely that we would receive a bug report from a person who turns on translation because he/she doesn't understand English.
How could he/she even write the email/issue in the first place?

Again, this can be seen as an opportunity to foster local/regional user groups, not as a disadvantage.
It will be easier for a non-English speaker to communicate in their own tongue with regional advanced users, who can then find a way to help them submit the bug report in English, if appropriate.

IMHO this can improve the language and its international community via cooperation and diversity.