classification
Title: No good way to set 'PYTHONIOENCODING' when embedding python.
Type: enhancement Stage: needs patch
Components: Windows Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Arfrever, asvetlov, christian.heimes, haypo, ideasman42, lemburg, loewis, ncoghlan
Priority: normal Keywords: patch

Created on 2012-10-04 12:58 by ideasman42, last changed 2012-11-28 06:42 by ncoghlan.

Files
File name Uploaded Description
pyos_putenv.diff ideasman42, 2012-11-28 06:13 patch on python 3.4 hg - 80610:618ea5612e83
Messages (7)
msg171938 Author: Campbell Barton (ideasman42) Date: 2012-10-04 12:58
Note: I was asked to report this issue, which I first posted to the python-dev mailing list. See http://code.activestate.com/lists/python-dev/118015/

---

We recently ran into an issue with blender3d on MS Windows, where we
want to force the embedded Python interpreter to use UTF-8 as its
I/O encoding (the encoding defaults to cp437).

I naively thought that setting the environment variable before calling
Py_Initialize() would work, e.g.
_putenv("PYTHONIOENCODING=utf-8:surrogateescape");
but because of the way the Python DLL is loaded, it gets its own copy
of the environment variables, which the host application can't modify
directly [1].

We have had bug reports from Windows users who cannot export files,
because writing to stdout raises an error when a printed path contains
characters that the default encoding cannot represent. [2],[3]
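
The failure described above can be reproduced without a Windows console. The following is a minimal sketch, assuming that an in-memory stream encoded as cp437 with the default strict error handler is a fair stand-in for the real stdout. The helper name `try_print` and the example path are made up for illustration.

```python
import io


def try_print(text, encoding, errors="strict"):
    """Write text to an in-memory text stream using the given encoding.

    Returns the encoded bytes, or None if the text cannot be encoded.
    """
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors=errors, newline="\n")
    try:
        stream.write(text + "\n")
        stream.flush()
    except UnicodeEncodeError:
        # This is the kind of error a print() of such a path would raise.
        return None
    return buffer.getvalue()
```

With a path such as `"C:\\Users\\Иван\\scene.blend"`, the call fails for `"cp437"` (which has no Cyrillic letters) and succeeds for `"utf-8"`.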

---

Of course we could distribute blender with a .bat file launcher that
sets the environment variables, or ask users to set the variable
themselves, but I don't think either is really a good option.
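
For comparison, the launcher approach mentioned above amounts to setting the variable in the environment of a new process before Python starts. A minimal sketch, assuming a standalone Python executable is being launched rather than an embedded interpreter; the function names are hypothetical:

```python
import codecs
import os
import subprocess
import sys


def run_with_io_encoding(args, io_encoding="utf-8:surrogateescape"):
    """Run a command with PYTHONIOENCODING set in the child's environment."""
    env = os.environ.copy()  # keep PATH and other variables the child needs
    env["PYTHONIOENCODING"] = io_encoding
    return subprocess.run(args, env=env, stdout=subprocess.PIPE, check=True)


def child_stdout_settings(io_encoding="utf-8:surrogateescape"):
    """Return (normalized encoding name, error handler) seen by a child Python."""
    code = "import sys; print(sys.stdout.encoding, sys.stdout.errors)"
    result = run_with_io_encoding([sys.executable, "-c", code], io_encoding)
    name, errors = result.stdout.decode("ascii").split()
    return codecs.lookup(name).name, errors
```

This works because the variable is in place before the child interpreter initializes, which is exactly what the embedding application cannot easily arrange for a DLL it loads itself.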

I tried overriding stderr and stdout, but this caused another bug on exit: an assertion failure in MSVCR90.DLL's write.c (called from python32_d.dll):
_VALIDATE_CLEAR_OSSERR_RETURN((_osfile(fh) & FOPEN), EBADF, -1);

The override I used was:

import sys, io
sys.__stdout__ = sys.stdout = io.TextIOWrapper(
    io.open(sys.stdout.fileno(), "wb", -1),
    encoding='utf-8', errors='surrogateescape', newline="\n",
    line_buffering=True)
sys.__stderr__ = sys.stderr = io.TextIOWrapper(
    io.open(sys.stderr.fileno(), "wb", -1),
    encoding='utf-8', errors='surrogateescape', newline="\n",
    line_buffering=True)
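
A possible variation on the override above, offered as a sketch rather than a confirmed fix: open the descriptor with `closefd=False`, so that the new wrapper does not take ownership of it and closing the wrapper leaves the descriptor open. Whether shared ownership of the descriptor is what triggers the assertion is an assumption; the report does not say. The helper name `utf8_wrapper` is made up for illustration.

```python
import io


def utf8_wrapper(fd):
    """Wrap an existing file descriptor in a UTF-8 text stream.

    closefd=False means closing the returned stream flushes it but
    leaves the underlying descriptor open for its original owner.
    """
    raw = io.open(fd, "wb", closefd=False)
    return io.TextIOWrapper(
        raw,
        encoding="utf-8",
        errors="surrogateescape",
        newline="\n",
        line_buffering=True,
    )
```

Usage would look like `sys.stdout = utf8_wrapper(sys.stdout.fileno())`. On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8", errors="surrogateescape")` changes these settings in place without creating a second stream object.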



IMHO either of these solutions would be fine.

* have a PyOS_PutEnv() function, gettext has gettext_putenv() to
workaround this problem.

* manage this the same as Py_GetPythonHome(), which can be defined by
the embedding application to override the default.


[1] http://stackoverflow.com/questions/5153547/environment-variables-are-different-for-dll-than-exe
[2] http://projects.blender.org/tracker/index.php?func=detail&aid=32750
[3] http://projects.blender.org/tracker/index.php?func=detail&aid=31555
msg171941 Author: STINNER Victor (haypo) * (Python committer) Date: 2012-10-04 13:26
See also issue #15216.
msg172054 Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2012-10-05 07:02
> IMHO either of these solutions would be fine.
> 
> * have a PyOS_PutEnv() function, gettext has gettext_putenv() to
> workaround this problem.

This solution would help in many other cases as well, so adding
such an API would certainly help more than specialized interfaces.

> * manage this the same as Py_GetPythonHome(), which can be defined by
> the embedding application to override the default.

I think you meant Py_SetPythonHome().

Given that the IO encoding is very important for Python 3.x, a special
API just for setting the encoding may be useful to have as well.

Care must be taken, though, that the encoding cannot be set after
Py_Initialize() has been called.

It may overall be easier to go with the PyOS_PutEnv() solution to
not run into the problems with having to check for an initialized
interpreter first.

msg172073 Author: Campbell Barton (ideasman42) Date: 2012-10-05 11:17
Agreed, PyOS_PutEnv would be good, since it's not restricted to string encoding and it resolves the general problem of not being able to control environment variables for an embedded interpreter.

Having ways to change the encoding is good too, but it is a bit outside the scope of this report and possibly not the best solution either, since it's possible (though unlikely) that you need to set the encoding at the very start of Python initialization, before the site and builtin modules load.
msg176514 Author: Campbell Barton (ideasman42) Date: 2012-11-28 06:13
Patch attached; it simply wraps putenv().
msg176515 Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2012-11-28 06:30
At first glance your proposed fix looks like an easy hack to get around the issue. However it's not going to work properly. Embedded Python interpreters should isolate themselves from the user's environment. When `Py_IgnoreEnvironmentFlag` is enabled, Py_GETENV() always returns NULL.
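
The effect described above can be observed from the command line, where the `-E` option sets `Py_IgnoreEnvironmentFlag`. A minimal sketch, assuming a standalone interpreter behaves like an embedded one in this respect; the helper name and the `latin-1:replace` value are chosen only for illustration:

```python
import os
import subprocess
import sys


def child_stdout_errors(ignore_environment):
    """Return sys.stdout.errors as seen by a child Python process.

    PYTHONIOENCODING is always set in the child's environment; with
    ignore_environment=True the child is started with -E and ignores it.
    """
    env = os.environ.copy()
    env["PYTHONIOENCODING"] = "latin-1:replace"
    args = [sys.executable]
    if ignore_environment:
        args.append("-E")  # ignore all PYTHON* environment variables
    args += ["-c", "import sys; print(sys.stdout.errors)"]
    result = subprocess.run(args, env=env, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("ascii").strip()
```

Without `-E` the child reports the `replace` handler requested by the variable; with `-E` it falls back to its default, so an environment-based API such as the proposed `PyOS_PutEnv()` would have no effect in that mode.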

IMHO we can't get around Py_GetIOEncoding(), Py_SetIOEncoding() and Py_IOEncoding.
msg176516 Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-11-28 06:42
Claiming this one, mainly because I want people to largely leave the already hairy initialisation process alone until we get a chance to discuss it at the language summit next year.

I plan to write up a comprehensive overview of the initialisation sequence before then, because we need to be figuring out how to *delete* code here, instead of adding even more.
History
Date User Action Args
2012-11-28 06:42:07 ncoghlan set assignee: ncoghlan; messages: + msg176516; nosy: + ncoghlan
2012-11-28 06:30:27 christian.heimes set nosy: + christian.heimes; messages: + msg176515
2012-11-28 06:13:42 ideasman42 set files: + pyos_putenv.diff; keywords: + patch; messages: + msg176514
2012-10-08 01:30:38 Arfrever set nosy: + Arfrever
2012-10-07 19:38:03 asvetlov set nosy: + asvetlov; stage: needs patch
2012-10-05 18:34:36 pitrou set type: enhancement
2012-10-05 11:17:06 ideasman42 set messages: + msg172073
2012-10-05 07:02:42 lemburg set messages: + msg172054
2012-10-04 22:50:24 pitrou set nosy: + lemburg, loewis
2012-10-04 13:26:27 haypo set nosy: + haypo; messages: + msg171941
2012-10-04 12:58:14 ideasman42 create