classification
Title: Fatal error on startup with invalid PYTHONIOENCODING
Type: crash
Stage:
Components: Interpreter Core, Windows
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Daniel.Goertzen, amaury.forgeotdarc, atuining, belopolsky, eric.araujo, flox, grahamd, haypo, pitrou, python-dev
Priority: normal
Keywords: patch

Created on 2009-07-17 10:21 by grahamd, last changed 2014-02-13 10:35 by haypo. This issue is now closed.

Files
File name                              Uploaded                 Description
issue6501_PYTHONIOENCODING_crash.diff  flox, 2009-12-20 14:37   Patch, apply to py3k
device_encoding.patch                  haypo, 2011-05-01 23:46
Messages (19)
msg90616 - Author: Graham Dumpleton (grahamd) Date: 2009-07-17 10:21
When using Python 3.1 for Apache/mod_wsgi (3.0c4) on Windows, Apache will 
crash on startup because Python is forcing the process to exit with:

Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding: cp0

I first mentioned this on issue6393, but have now created it as a separate
issue as it appears to be distinct from the issue on MacOS X, although
possibly related.

In the Windows case there is actually an encoding, that of 'cp0', whereas on
MacOS X the encoding name was empty.

The same mod_wsgi code works fine under Python 3.1 on MacOS X.
msg90618 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-07-17 11:20
- Apache is not a Console application, so the Windows GetConsoleCP()
function returns zero (and os.device_encoding(1) returns 'cp0').
- pythonw.exe has no console either; but in pythonrun.c, the test
(fileno(stdin) < 0) is true, and the standard streams are all set to None.
- It is probable that Apache has redefined stdin & co, so the previous
test does not work there.

As a workaround, I suggest setting the environment variable
PYTHONIOENCODING before starting Apache, or before the call to
Py_Initialize.
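
The failing step can be reproduced without Apache: startup fails because the reported name does not resolve to a registered codec. A minimal sketch, assuming a current CPython; the helper name codec_exists is hypothetical and not part of the report, and 'xyz' is used as a name that is certainly unknown:

```python
import codecs


def codec_exists(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# A name that no codec is registered under.
print("xyz:", codec_exists("xyz"))
# cp1252 is a real Windows code page that ships with Python.
print("cp1252:", codec_exists("cp1252"))
# 'cp0' is the name from the report; whether it resolves may depend on
# the platform and the Python version, so no result is claimed here.
print("cp0:", codec_exists("cp0"))
```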
msg90620 - Author: Graham Dumpleton (grahamd) Date: 2009-07-17 11:32
Yes, Apache remaps stdout and stderr to the Apache error log so that it still
captures anything that errant modules don't log via the Apache error log
functions. mod_wsgi replaces sys.stdout and sys.stderr with special
file-like objects that redirect via the Apache error logging functions. This
obviously happens only after Python first initialises sys.stdout and
sys.stderr, though.

What would be an appropriate value to set PYTHONIOENCODING to on Windows 
as a workaround?
msg90640 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-07-17 20:50
On a Western Windows, I suggest
    PYTHONIOENCODING=cp1252:backslashreplace

But 
    PYTHONIOENCODING=mbcs
is also OK, except that characters outside the Windows code page will be 
replaced with '?'
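
The difference between the two error handlers can be seen by encoding a string directly. A minimal sketch; cp1252 stands in for mbcs here because mbcs is only available on Windows, and the sample text is made up for illustration:

```python
# The euro sign exists in cp1252; the snowman does not.
text = "price: 5\u20ac, snowman: \u2603"

# backslashreplace keeps unencodable characters as escape sequences.
escaped = text.encode("cp1252", errors="backslashreplace")

# replace substitutes '?' for unencodable characters, losing information.
replaced = text.encode("cp1252", errors="replace")

print(escaped)
print(replaced)
```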
msg90813 - Author: Graham Dumpleton (grahamd) Date: 2009-07-22 12:15
The workaround of using:

#if defined(WIN32) && PY_MAJOR_VERSION >= 3
        _wputenv(L"PYTHONIOENCODING=cp1252:backslashreplace");
#endif

        Py_Initialize();

gets around the crash on startup.

I haven't done sufficient testing to know if this may introduce any other
problems, given that one is overriding the default I/O encoding for the
whole process.
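
For a parent process that launches Python as a separate program rather than embedding it, the same override could be applied through the child's environment only. A minimal sketch, assuming the child is started with the subprocess module; the probe one-liner is made up for illustration:

```python
import os
import subprocess
import sys

# Copy the parent's environment and add the override for the child only.
child_env = dict(os.environ, PYTHONIOENCODING="cp1252:backslashreplace")

# The child reports the encoding and error handler of its own sys.stdout.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdout.encoding, sys.stdout.errors)"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
reported = result.stdout.split()
print(reported)
```

Setting the variable this way leaves the parent's own streams untouched, which limits the process-wide effect mentioned above to the child.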
msg91311 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-05 11:01
Graham, is the workaround ok for you or do you think this is something
Python itself should handle?
msg91317 - Author: Graham Dumpleton (grahamd) Date: 2009-08-05 12:08
Python should be as robust as possible, so this should be fixed. I am happy
using the workaround, though, so if this is more a question of what priority
to assign, I wouldn't see it as urgent.
msg96685 - Author: Florent Xicluna (flox) * (Python committer) Date: 2009-12-20 14:37
Patch to prevent crash when PYTHONIOENCODING is invalid:

~ $ PYTHONIOENCODING= ./python 
Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding: 
Abandon
msg96703 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-12-20 17:47
Well, this might prevent the crash but how does the system behave
afterwards? Do the standard streams use utf-8 by default?
At the minimum, we should still output a warning on stderr.
msg127762 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-02-02 21:23
The issue is not Windows-specific, so I am changing the title to reflect that. On OSX, for example, I get

$ PYTHONIOENCODING=xyz ./python.exe
Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding: xyz
Abort trap

I agree that abort() is too drastic for a typo in an environment variable setting, but ignoring it silently is not a good option either. Someone setting PYTHONIOENCODING most likely does it for a reason, and giving him or her some sort of default behavior for a mistyped encoding is not helpful. (Note that this is how many C libraries treat the TZ environment variable, and it is often very frustrating.)

I think errors in environment variables that can be detected on startup should be treated the same way as the command line typos: a descriptive message on C stderr and exit(1).
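
The proposed behaviour can be sketched in Python. This is only an illustration of the idea, not the interpreter's startup code: the names check_io_encoding and main are hypothetical, and the value format (an encoding name, optionally followed by a colon and an error handler name, with either part allowed to be empty) is the documented PYTHONIOENCODING format.

```python
import codecs
import sys


def check_io_encoding(value):
    """Validate a PYTHONIOENCODING-style value such as 'cp1252:replace'.

    Returns (encoding, errors), with None for a part that was left empty.
    Raises ValueError with a descriptive message for unknown names.
    """
    encoding, _, errors = value.partition(":")
    if encoding:
        try:
            codecs.lookup(encoding)
        except LookupError:
            raise ValueError(f"unknown encoding: {encoding!r}") from None
    if errors:
        try:
            codecs.lookup_error(errors)
        except LookupError:
            raise ValueError(f"unknown error handler: {errors!r}") from None
    return (encoding or None, errors or None)


def main(value):
    """Hypothetical launcher step: report the typo on stderr and exit with status 1."""
    try:
        return check_io_encoding(value)
    except ValueError as exc:
        print(f"invalid PYTHONIOENCODING: {exc}", file=sys.stderr)
        sys.exit(1)
```

Calling main("xyz") prints a one-line message and exits with status 1 instead of aborting, which matches how a mistyped command-line option is usually handled.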

Currently different environment variables are treated differently.  For example, mistakes in PYTHONHOME and PYTHONIOENCODING cause a fatal error while an error in PYTHONSTARTUP is reported but does not terminate python:

$ PYTHONSTARTUP=xyz.py ./python.exe
Python 3.2rc2+ (py3k:88320, Feb  2 2011, 14:07:18) 
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Could not open PYTHONSTARTUP
IOError: [Errno 2] No such file or directory: 'xyz.py'
>>>
msg132743 - Author: Daniel Goertzen (Daniel.Goertzen) Date: 2011-04-01 15:01
I run into this problem when I start a Python app as a subprocess from Erlang (open_port() function).  The PYTHONIOENCODING fix works when I launch my py app via pythonw.exe, but it does *not* work when I use the cx-freeze version of the app.

I am using the Win32GUI base for cx-freeze which appears to be a thin WinMain() wrapper around Py_Initialize().  I am going to continue investigating the cx-freeze related problems.

I am using Python 3.2 under Windows.  The failure is basically silent under Windows (generic MSVC runtime error), so I wasted a lot of time figuring out what the problem actually was.  Python 2 doesn't seem to have this problem.
msg132831 - Author: Daniel Goertzen (Daniel.Goertzen) Date: 2011-04-03 03:32
It turns out that cx-freeze deliberately sets Py_IgnoreEnvironmentFlag to ensure that the final executable is really an isolated, standalone executable (i.e., it can't be subverted by setting PYTHONPATH).  Therefore the PYTHONIOENCODING workaround does not work in this situation.
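
The effect of ignoring the environment can be observed from Python with the -E command-line option, which sets the same flag and is visible as sys.flags.ignore_environment. A minimal sketch, assuming -E as a stand-in for what the frozen executable does in C; the probe string and the run_child helper are hypothetical:

```python
import os
import subprocess
import sys

child_env = dict(os.environ, PYTHONIOENCODING="cp1252:backslashreplace")
probe = "import sys; print(sys.flags.ignore_environment, sys.stdout.errors)"


def run_child(*options):
    """Run the probe in a child interpreter and return its output as a list of words."""
    result = subprocess.run(
        [sys.executable, *options, "-c", probe],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


honoured = run_child()      # the environment variable is read
ignored = run_child("-E")   # the environment variable is skipped
print(honoured, ignored)
```

In the second run the error handler falls back to whatever the interpreter chooses by default for a piped stdout, so the override has no effect.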

I am currently using a cx-freeze work-around from the author to enable the PYTHONIOENCODING work-around.  Altogether not that pleasant.

Could Python 3 just default to some reasonable encoding and keep on chugging?
msg134947 - Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-01 23:46
> Fatal Python error: Py_Initialize: can't initialize sys standard streams
> LookupError: unknown encoding: cp0

That's a bug in os.device_encoding(): os.device_encoding(sys.stdout.fileno()) should return None if the application has no console (if sys.stdout is not a Windows console stream).

Attached device_encoding.patch should fix this issue. (I didn't test the patch yet.)
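
Once os.device_encoding() returns None for a descriptor without a console, callers can fall back to another encoding. A minimal sketch of that fallback, assuming the locale's preferred encoding as the default; stream_encoding is a hypothetical helper, not the interpreter's actual startup logic:

```python
import locale
import os


def stream_encoding(fd):
    """Pick an encoding for a file descriptor, falling back when there is no terminal."""
    # os.device_encoding() returns None when fd is not connected to a terminal.
    return os.device_encoding(fd) or locale.getpreferredencoding(False)


# A pipe is never a terminal, so this exercises the fallback path.
read_fd, write_fd = os.pipe()
try:
    pipe_device = os.device_encoding(write_fd)
    chosen = stream_encoding(write_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(pipe_device, chosen)
```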
msg134948 - Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-01 23:52
> On a Western Windows, I suggest
>    PYTHONIOENCODING=cp1252:backslashreplace

Why use this very small charset when a web server can use UTF-8?

I don't think that using backslashreplace on stdout is a good idea.

> But 
>    PYTHONIOENCODING=mbcs
> is also OK, except that characters outside the Windows code
> page will be replaced with '?'

Starting with Python 3.2, you should use mbcs:replace to replace unencodable characters with '?'. For mbcs, the strict error handler is now really strict: it raises a UnicodeEncodeError if a character cannot be encoded to mbcs.

Note: mbcs is the ANSI code page.
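
The same distinction applies to a text stream. A minimal sketch, again using cp1252 in place of mbcs (which exists only on Windows) and an in-memory buffer in place of the real stdout; write_with is a hypothetical helper:

```python
import io


def write_with(errors):
    """Write a character that cp1252 cannot encode, using the given error handler."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="cp1252", errors=errors)
    stream.write("snowman: \u2603")
    stream.flush()
    return raw.getvalue()


# With 'replace' the unencodable character is written as '?'.
print(write_with("replace"))

# With 'strict' the write fails instead of silently losing data.
strict_failure = None
try:
    write_with("strict")
except UnicodeEncodeError as exc:
    strict_failure = exc
    print("strict raised:", exc.reason)
```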

--

Using device_encoding.patch, I suppose that sys.std* streams will use the ANSI code page (mbcs, which is code page 1252 on a Western Windows setup) in grahamd's use case (a Python program running in Apache).
msg136667 - Author: Roundup Robot (python-dev) Date: 2011-05-23 16:13
New changeset 5783a55a2418 by Victor Stinner in branch 'default':
Issue #6501: os.device_encoding() returns None on Windows if the application
http://hg.python.org/cpython/rev/5783a55a2418
msg136670 - Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-23 16:17
@grahamd: Can you try the development version of Python 3.3, or try to patch your version using device_encoding.patch? You will not get cp0 encoding anymore.

If the patch fixes your issue, I will backport it. I don't see anything interesting to do for this issue.
msg147666 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-15 13:06
> Currently different environment variables are treated differently.  For example,
> mistakes in PYTHONHOME and PYTHONIOENCODING cause a fatal error while an error in
> PYTHONSTARTUP is reported but does not terminate python:

If PYTHONSTARTUP is the only envvar with non-fatal errors, I think it’s okay.  PYTHONHOME contains vital information, PYTHONIOENCODING is set by the programmer/admin and their code probably depends on it, but PYTHONSTARTUP is just niceties for the interactive interpreter, so non-vital IMO.
msg205396 - Author: Mark Lawrence (BreamoreBoy) Date: 2013-12-06 19:28
Is there anything to backport as referred to in msg136670 or can this be closed?
msg211140 - Author: STINNER Victor (haypo) * (Python committer) Date: 2014-02-13 10:35
> Is there anything to backport as referred to in msg136670 or can this be closed?

According to the changelog, the fix has been present since Python 3.3.0. Python 3.2 doesn't accept bugfixes anymore.

Python 2.7 doesn't have the function os.device_encoding(), so I don't think that it is affected.

There is nothing more to do, I'm closing the bug. Thanks for the report.
History
Date User Action Args
2014-02-13 10:35:21  haypo  set  status: open -> closed; resolution: fixed; messages: + msg211140; versions: - Python 2.7, Python 3.2
2014-02-03 15:43:50  BreamoreBoy  set  nosy: - BreamoreBoy
2013-12-06 19:28:41  BreamoreBoy  set  nosy: + BreamoreBoy; messages: + msg205396
2011-11-15 13:06:56  eric.araujo  set  nosy: + eric.araujo; messages: + msg147666; versions: + Python 2.7, Python 3.3, - Python 3.1
2011-05-23 16:17:20  haypo  set  messages: + msg136670
2011-05-23 16:13:18  python-dev  set  nosy: + python-dev; messages: + msg136667
2011-05-01 23:52:19  haypo  set  messages: + msg134948
2011-05-01 23:46:17  haypo  set  files: + device_encoding.patch; messages: + msg134947
2011-04-03 03:32:38  Daniel.Goertzen  set  messages: + msg132831
2011-04-01 18:08:26  atuining  set  nosy: + atuining
2011-04-01 15:01:10  Daniel.Goertzen  set  nosy: + Daniel.Goertzen; messages: + msg132743
2011-03-21 02:07:38  eric.araujo  set  nosy: + haypo
2011-02-02 21:23:01  belopolsky  set  nosy: + belopolsky; messages: + msg127762; title: Fatal LookupError: unknown encoding: cp0 on Windows embedded startup. -> Fatal error on startup with invalid PYTHONIOENCODING
2009-12-20 17:47:09  pitrou  set  messages: + msg96703
2009-12-20 14:37:34  flox  set  files: + issue6501_PYTHONIOENCODING_crash.diff; versions: + Python 3.2; nosy: + flox; messages: + msg96685; keywords: + patch
2009-12-20 11:47:47  flox  link  issue7441 superseder
2009-08-05 12:08:58  grahamd  set  messages: + msg91317
2009-08-05 11:01:37  pitrou  set  nosy: + pitrou; messages: + msg91311
2009-07-22 12:15:32  grahamd  set  messages: + msg90813
2009-07-17 20:50:37  amaury.forgeotdarc  set  messages: + msg90640
2009-07-17 11:32:26  grahamd  set  messages: + msg90620
2009-07-17 11:20:10  amaury.forgeotdarc  set  nosy: + amaury.forgeotdarc; messages: + msg90618
2009-07-17 10:21:43  grahamd  create