classification
Title: WindowsError messages are not properly encoded
Type: behavior
Stage:
Components: Windows
Versions: Python 2.7
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: amaury.forgeotdarc
Nosy List: amaury.forgeotdarc, christian.heimes, eckhardt, gvanrossum, loewis, methane, r37c, terry.reedy
Priority: normal
Keywords:

Created on 2008-01-07 11:24 by r37c, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
windowserror.patch amaury.forgeotdarc, 2008-01-08 00:42
Pull Requests
URL Status Linked
PR 2413 open alberfontan1, 2017-06-27 13:29
Messages (16)
msg59441 - (view) Author: Rômulo (r37c) Date: 2008-01-07 11:24
The message for WindowsError is taken from the Windows API's
FormatMessage() function, following the OS language. Currently Python
does no conversion for those messages, so non-ASCII characters end up
improperly encoded in the console. For example:

  >>> import os
  >>> os.rmdir('E:\\temp')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  WindowsError: [Error 41] A pasta nÒo estß vazia: 'E:\\temp'

Should be: "A pasta não está vazia" [Folder is not empty].

Python could check the code page of the current output device and
convert the message accordingly.
msg59448 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-07 15:24
Crys, can you confirm this?

It would seem we'll need to fix this twice -- once for 2.x, once for 3.0.
msg59452 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-01-07 15:32
Oh nice ... 

Amaury probably knows more about the wide-char Windows API than I do. The
PyErr_SetExcFromWindows*() functions in Python/errors.c need to be modified.
msg59455 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-01-07 15:37
I confirm the problem (with French accents) on Python 2.5.
Python 3.0 already fixes it by using FormatMessageW(), the
Unicode version of the API.

We could do the same for python 2.5, but the error message must be
converted to str early (i.e when building the Exception). What is the
correct encoding to use?
msg59462 - (view) Author: Rômulo (r37c) Date: 2008-01-07 16:21
"... but the error message must be converted to str early (i.e when
building the Exception)."

Wouldn't that create more problems? What if somebody wants to intercept
the exception and do something with it, like, say, redirect it to a log
file? The programmer must, then, be aware of the different encoding. I
thought about keeping the exception message in Unicode and converting it
just before printing. Is that possible for Python 2.x?
msg59472 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-01-07 18:18
I think this is not possible if we want to preserve compatibility; at
least, str(e.strerror) must not fail.

I can see different solutions:
1) Don't fix, and upgrade to python 3.0
2) Store an additional e.unicodeerror member, use it in a new
EnvironmentError.__unicode__ method, and call this from PyErr_Display.
3) Force FormatMessage to return US-English messages.

My preference is 1): Python 2.5 is mostly encoding-naive, Python 3 is
Unicode-aware, and I am not sure we want Python 2.6 to contain both code
paths. Other opinions?
msg59474 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-07 18:38
3.0 will be a long way away for many users.  Perhaps forcing English
isn't so bad, as Python's own error messages aren't translated anyway?
msg59490 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-07 21:00
I would claim that this is not a bug. Sure, the message doesn't come out
correctly, but only because you run it in a cmd.exe window, not in (say)
IDLE.

IIUC, the problem is that Python computes the message in CP_ACP (i.e.
the ANSI code page), whereas the terminal interprets it in CP_OEMCP
(i.e. the OEM code page).
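
The garbled text in the original report is consistent with this explanation. A minimal sketch, assuming a Western European Windows setup where the ANSI code page is cp1252 and the console's OEM code page is cp850 (the report names neither):

```python
# Assumption: CP_ACP is cp1252 and CP_OEMCP is cp850, as is typical for a
# Western European (e.g. Portuguese) Windows install.
ANSI_CODEC = "cp1252"
OEM_CODEC = "cp850"

message = "A pasta não está vazia"

# Python 2 keeps the message as bytes in the ANSI code page...
raw = message.encode(ANSI_CODEC)

# ...and the console interprets those same bytes in the OEM code page.
shown = raw.decode(OEM_CODEC)

print(shown)
```

Under those assumptions the bytes for "ã" and "á" are displayed as "Ò" and "ß", which matches the traceback quoted in the report.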

If we declare that all strings are considered as CP_ACP in the
exception, then the only way to fix it would be to convert it from
CP_ACP to CP_OEMCP (or, more generally, sys.stderr.encoding) on
printing. Such conversion should be implemented in an unfailing way,
either using replacement characters or falling back to no conversion.
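
One way to read "unfailing" is sketched below in Python rather than in the C code that would actually be patched. The helper name `recode_for_display` and the default code pages are assumptions made for illustration, not part of any patch attached to this issue:

```python
def recode_for_display(raw, source="cp1252", target="cp850"):
    """Re-encode raw bytes from source to target without ever raising.

    Hypothetical helper: unmappable characters become '?', and an unknown
    codec name falls back to returning the bytes unconverted.
    """
    try:
        text = raw.decode(source, errors="replace")
        return text.encode(target, errors="replace")
    except LookupError:
        # Unknown codec name: fall back to no conversion.
        return raw


ansi_bytes = "A pasta não está vazia".encode("cp1252")
oem_bytes = recode_for_display(ansi_bytes)

# Interpreted in the OEM code page, the message is readable again.
print(oem_bytes.decode("cp850"))
```

Both fallbacks mentioned above are covered: replacement characters for text the target code page cannot represent, and no conversion at all when the target encoding is not usable.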

Forcing English messages would certainly reduce the problems, but it
still might be that the file name in the error message does not come out
correctly.
msg59495 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-01-07 21:27
> Forcing English messages would certainly reduce the problems
And it does not even work: my French Windows XP does not contain the
English error messages :-(

> If we declare that all strings are considered as CP_ACP in the
> exception, then the only way to fix it would be to convert it from
> CP_ACP to CP_OEMCP (or, more generally, sys.stderr.encoding) on
> printing. Such conversion should be implemented in an unfailing way,
> either using replacement characters or falling back to no conversion.

If this is chosen, I propose to use CharToOem as the "unfailing"
conversion function. I will try to come up with a patch following this idea.
msg59498 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-01-07 21:36
> If this is chosen, I propose to use CharToOem as the "unfailing"
> conversion function. I will try to come up with a patch following this idea.

Sounds fine to me.
msg59512 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-01-08 00:42
Here is a patch. Now I feel it is a hack, but it is the only place I
found where I can access both the exception object and the encoding...
msg97794 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2010-01-14 22:56
I think WindowsError's message should be English like other errors.
FormatMessageW() function can take dwLanguageId parameter.
So I think Python should pass `MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US)` to the parameter.
msg105120 - (view) Author: Rômulo (r37c) Date: 2010-05-06 01:51
> I think WindowsError's message should be English like other errors.
> FormatMessageW() function can take dwLanguageId parameter.
> So I think Python should pass `MAKELANGID(LANG_ENGLISH, 
> SUBLANG_ENGLISH_US)` to the parameter.

On a non-English system, FormatMessageW fails with ERROR_RESOURCE_LANG_NOT_FOUND ("The specified resource language ID cannot be found in the image file") when called with that parameter.
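
A rough ctypes sketch of the idea and of the failure mode described here: request the US English message first and fall back to the system default when FormatMessageW returns nothing. The helper names `makelangid`, `format_message` and `english_or_default_message` are hypothetical, and the Windows call is only exercised when running on Windows:

```python
import ctypes
import sys

LANG_ENGLISH = 0x09
SUBLANG_ENGLISH_US = 0x01
FORMAT_MESSAGE_FROM_SYSTEM = 0x1000
FORMAT_MESSAGE_IGNORE_INSERTS = 0x0200


def makelangid(primary, sub):
    """Python equivalent of the Windows MAKELANGID macro."""
    return (sub << 10) | primary


def format_message(code, lang_id=0):
    """Return the system message for an error code, or None on failure.

    Windows only. lang_id=0 lets the system choose the language.
    """
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.FormatMessageW.argtypes = [
        wintypes.DWORD, wintypes.LPCVOID, wintypes.DWORD, wintypes.DWORD,
        wintypes.LPWSTR, wintypes.DWORD, ctypes.c_void_p,
    ]
    kernel32.FormatMessageW.restype = wintypes.DWORD
    buf = ctypes.create_unicode_buffer(1024)
    length = kernel32.FormatMessageW(
        FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
        None, code, lang_id, buf, len(buf), None,
    )
    # A return value of 0 means failure, e.g. the language is not installed.
    return buf.value.rstrip() if length else None


def english_or_default_message(code):
    """Prefer the US English text, falling back to the system language."""
    english = makelangid(LANG_ENGLISH, SUBLANG_ENGLISH_US)
    return format_message(code, english) or format_message(code)


if sys.platform == "win32":
    print(english_or_default_message(2))
```

With a fallback like this the message language would depend on which resources are installed, so it would not give uniform English output across systems.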
msg112899 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-04 21:28
Should we close this?

There was some opinion that this is not a bug. 

The earlier argument against closing this, "3.0 will be a long way away for many users," is obsolete now that 3.1.2 is out and 3.2 is due in less than 6 months.

Or, Amaury, do you have any serious prospect of applying the patch to 2.7?
msg112902 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-08-04 21:34
Somebody should investigate the status of this on 3.x. If the message comes out as a nice Unicode string, I'd close it as fixed. If the message comes out as a byte string, it definitely needs fixing.

For 2.x, the issue is out of date.
msg112934 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-05 00:36
The message is definitely a str (Unicode) string. Tested on WinXP with 3.1.2:

import os
try: os.rmdir('nonexist')
except Exception as e:
    print(repr(e.args[1]), '\n', repr(e.strerror), '\n', e.filename)
os.rmdir('nonexist')

# prints
'The system cannot find the file specified' 
 'The system cannot find the file specified' 
 nonexist
...
WindowsError: [Error 2] The system cannot find the file specified: 'nonexist'
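
The same check can be written so that it also runs outside Windows. This is a sketch for Python 3.3 and later, where WindowsError is an alias of OSError on Windows; the temporary directory is only there to guarantee that the path does not exist:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "nonexist")
    try:
        os.rmdir(missing)
    except OSError as e:
        caught = e  # keep a reference; "e" is cleared after the except block
        # strerror is a text (str) object, not a byte string.
        print(type(e.strerror).__name__, repr(e.filename))
```

The wording of `strerror` still follows the operating system's language, so the sketch inspects only its type.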
History
Date User Action Args
2022-04-11 14:56:29 admin set github: 46094
2017-06-27 13:29:46 alberfontan1 set pull_requests: + pull_request2490
2010-08-05 00:36:50 terry.reedy set status: open -> closed; resolution: out of date; messages: + msg112934
2010-08-04 21:34:56 loewis set messages: + msg112902
2010-08-04 21:28:58 terry.reedy set nosy: + terry.reedy; messages: + msg112899; versions: + Python 2.7, - Python 2.6, Python 2.5, Python 3.0
2010-05-06 01:51:56 r37c set messages: + msg105120
2010-01-14 22:56:39 methane set nosy: + methane; messages: + msg97794
2009-01-27 18:03:29 eckhardt set nosy: + eckhardt
2008-01-08 00:42:24 amaury.forgeotdarc set files: + windowserror.patch; messages: + msg59512
2008-01-07 21:36:56 loewis set messages: + msg59498
2008-01-07 21:27:37 amaury.forgeotdarc set messages: + msg59495
2008-01-07 21:00:18 loewis set nosy: + loewis; messages: + msg59490
2008-01-07 18:38:31 gvanrossum set messages: + msg59474
2008-01-07 18:18:53 amaury.forgeotdarc set messages: + msg59472
2008-01-07 16:21:44 r37c set messages: + msg59462
2008-01-07 15:37:16 amaury.forgeotdarc set messages: + msg59455
2008-01-07 15:32:29 christian.heimes set priority: normal; assignee: amaury.forgeotdarc; versions: + Python 2.6, Python 3.0; messages: + msg59452; nosy: + amaury.forgeotdarc
2008-01-07 15:24:02 gvanrossum set nosy: + gvanrossum, christian.heimes; messages: + msg59448
2008-01-07 11:24:25 r37c create