classification
Title: No good way to set 'PYTHONIOENCODING' when embedding python.
Type: enhancement Stage: resolved
Components: Windows Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Arfrever, Brecht.Van.Lommel, asvetlov, christian.heimes, haypo, ideasman42, larry, lemburg, loewis, mont29, ncoghlan, python-dev, skrah
Priority: normal Keywords: patch

Created on 2012-10-04 12:58 by ideasman42, last changed 2014-02-25 21:03 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
pyos_putenv.diff ideasman42, 2012-11-28 06:13 patch on python 3.4 hg - 80610:618ea5612e83 review
setstdio.diff mont29, 2013-09-18 14:58 patch... review
setstdio.diff mont29, 2013-10-07 18:43 review
Messages (39)
msg171938 - (view) Author: Campbell Barton (ideasman42) Date: 2012-10-04 12:58
note, I was asked to report this issue, posted on the py dev mailing list: see - http://code.activestate.com/lists/python-dev/118015/

---

We've run into an issue recently with blender3d on ms-windows where we
want to enforce the encoding is UTF-8 with the embedded python
interpreter.
(the encoding defaults to cp437).

I naively thought setting the environment variable before calling
Py_Initialize() would work, but the way python DLL loads, it gets its
own environment variables that cant be modified directly [1].
eg, _putenv("PYTHONIOENCODING=utf-8:surrogateescape");

We had bug reports by windows users not able to export files because
the stdout errors on printing paths with unsupported encoding. [2],[3]

---

Of course we could distribute blender with a bat file launcher that
sets env variables, or ask the user to set their env variable - but I
dont think this is really a good option.

I tried overriding the stderr & stdout, but this caused another bug on exiting, giving an assert in MSVCR90.DLL's write.c (called from python32_d.dll):
_VALIDATE_CLEAR_OSSERR_RETURN((_osfile(fh) & FOPEN), EBADF, -1);

import sys, io
sys.__stdout__ = sys.stdout =
io.TextIOWrapper(io.open(sys.stdout.fileno(), "wb", -1),
encoding='utf-8', errors='surrogateescape', newline="\n",
line_buffering=True)
sys.__stderr__ = sys.stderr =
io.TextIOWrapper(io.open(sys.stderr.fileno(), "wb", -1),
encoding='utf-8', errors='surrogateescape', newline="\n",
line_buffering=True)



IMHO either of these solutions would be fine.

* have a PyOS_PutEnv() function, gettext has gettext_putenv() to
workaround this problem.

* manage this the same as Py_GetPythonHome(), which can be defined by
the embedding application to override the default.


[1] http://stackoverflow.com/questions/5153547/environment-variables-are-different-for-dll-than-exe
[2] http://projects.blender.org/tracker/index.php?func=detail&aid=32750
[3] http://projects.blender.org/tracker/index.php?func=detail&aid=31555
msg171941 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2012-10-04 13:26
See also issue #15216.
msg172054 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2012-10-05 07:02
> IMHO either of these solutions would be fine.
> 
> * have a PyOS_PutEnv() function, gettext has gettext_putenv() to
> workaround this problem.

This solution would help in many other cases as well, so adding
such an API would certainly help more than specialized interfaces.

> * manage this the same as Py_GetPythonHome(), which can be defined by
> the embedding application to override the default.

I think you meant Py_SetPythonHome().

Given that the IO encoding is very important for Python 3.x, a special
API just for setting the encoding may be useful to have as well.

Care must be taken, though, that the encoding cannot be set after
Py_Initialize() has been called.

It may overall be easier to go with the PyOS_PutEnv() solution to
not run into the problems with having to check for an initialized
interpreter first.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 05 2012)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-09-27: Released eGenix PyRun 1.1.0 ...       http://egenix.com/go35
2012-09-26: Released mxODBC.Connect 2.0.1 ...     http://egenix.com/go34
2012-09-25: Released mxODBC 3.2.1 ...             http://egenix.com/go33
2012-10-23: Python Meeting Duesseldorf ...                 18 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
msg172073 - (view) Author: Campbell Barton (ideasman42) Date: 2012-10-05 11:17
Agree PyOS_PutEnv would be good since its not restricted to string encoding and resolves the problem of not being able to control env vars for an embedded interpreter in general.

Having ways to change encoding is good too but a bit outside the scope of this report and possibly not the best solution either since its possible (through unlikely), that you need to set the encoding at the very start of python initialization- rather than site & builtin modules loads.
msg176514 - (view) Author: Campbell Barton (ideasman42) Date: 2012-11-28 06:13
patch attached, simply wraps putenv()
msg176515 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2012-11-28 06:30
At first glance your proposed fix looks like an easy hack to get around the issue. However it's not going to work properly. Embedded Python interpreters should isolate themselves from the user's environment. When `Py_IgnoreEnvironmentFlag` is enabled, Py_GETENV() always returns NULL.

IMHO we can't get around Py_GetIOEncoding(), Py_SetIOEncoding() and Py_IOEncoding.
msg176516 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-11-28 06:42
Claiming this one, mainly because I want people to largely leave the already hairy initialisation process alone until we get a chance to discuss it at the language summit next year.

I plan to write up a comprehensive overview of the initialisation sequence before then, because we need to be figuring out how to *delete* code here, instead of adding even more.
msg197894 - (view) Author: Brecht Van Lommel (Brecht.Van.Lommel) Date: 2013-09-16 11:58
With the language summit passed some time ago, is there now a more clear picture of how this would fit in the initialisation process?

Any idea of what a proper implementation would look like, or is the message still to wait?
msg197903 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-09-16 14:48
The gory details of the current startup situation are in PEP 432. However, the comprehensive solution described in that PEP isn't going to make it into Python 3.4, so a simpler interim fix would be worthwhile.

Since Blender is designed to support building against the system Python, the trick of forcing Python to link to an alternate implementation of a function won't work.

Inspired by http://docs.python.org/3/c-api/init.html#Py_SetPath, I suggest offering an API like:

int Py_SetStandardStreamEncoding(char *encoding, char *errors)
{
    if (Py_IsInitialized()) {
        return -1;
    }
    Py_StandardStreamEncoding = _PyMem_Strdup(encoding);
    Py_StandardStreamErrors = _PyMem_Strdup(errors);
}

The initstdio function in pythonrun.c would then be updated to use the specified Py_StandardStreamEncoding and Py_StandardStreamErrors if they weren't NULL (since that would indicate an embedding application had called SetStandardStreamEncoding).
msg198023 - (view) Author: Bastien Montagne (mont29) Date: 2013-09-18 14:58
Hi,

Decided to give it a try and implement suggested Py_SetStandardStreamEncoding, see attached patch.

A quick test with Blender under Linux (with an ASCII console) seems to work OK, unfortunatly I do not have Windows to test it where it really matters...

That's my first piece of code ever written for Python, I hope it does not contain too much obvious errors!
msg198864 - (view) Author: Bastien Montagne (mont29) Date: 2013-10-02 19:24
No one to check the patch? It’s rather small, should not take much time… And we really need a solution for this issue, so better to try to get this settled before 3.4 freeze! ;)
msg199153 - (view) Author: Bastien Montagne (mont29) Date: 2013-10-07 18:43
Updated patch (mostly from Brecht's remarks, removed an obvious bug...)
msg199255 - (view) Author: Brecht Van Lommel (Brecht.Van.Lommel) Date: 2013-10-08 21:17
I tested the patch on Windows and can confirm it solves the problem for Blender.
msg200120 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 12:38
I added tests to the patch and tweaked a few details. Once the local test run gives it a clean bill of health, I'll commit it so it will be included in 3.4a4 this weekend and you can check it works with Blender.
msg200122 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-17 13:22
New changeset 9cd88b39ef62 by Nick Coghlan in branch 'default':
Issue #16129: Add `Py_SetStandardStreamEncoding`
http://hg.python.org/cpython/rev/9cd88b39ef62
msg200123 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 13:23
Putting this to pending for the moment - please try it out with Blender once 3.4a4 is released, and then we'll close it.
msg200124 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-17 13:27
New changeset 537e13ca7683 by Nick Coghlan in branch 'default':
Issue #16129: Move Py_SetStandardStreamEncoding declaration
http://hg.python.org/cpython/rev/537e13ca7683
msg200125 - (view) Author: Bastien Montagne (mont29) Date: 2013-10-17 13:30
Thanks Nick, we’ll do it for sure. :)
msg200137 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-17 15:46
New changeset 51480f6428a5 by Nick Coghlan in branch 'default':
Skip #16129 test until I debug cross-platform issues
http://hg.python.org/cpython/rev/51480f6428a5
msg200138 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 15:49
Buildbots weren't happy with the test case. I've done two extra commits (one to turn off the diff limit, then a second one to skip it entirely), so will hopefully be able to debug it and keep this in for the alpha.

(Larry - FYI, I'll have this test fixed or the patch reverted before the alpha on the weekend)
msg200139 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 15:52
According to http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/2716/steps/test/logs/stdio, output in the subprocess was fine, but for some reason sys.stdout.encoding and sys.stderr.encoding were None in the test itself o.O
msg200140 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 15:57
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.0%203.x/builds/579/steps/test/logs/stdio shows the same symptoms
msg200141 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 16:04
Antoine pointed out this also needs a :versionadded: tag in the docs.
msg200142 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 16:13
So reusing the run command from the buildbot still passed on my system. 

I thought using sys.stdout.encoding (etc) would reliably get me the default IO encoding, but it appears not :(
msg200145 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-17 16:41
You don't want to mention the return value (-1) on error in the doc?
msg200146 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-17 17:27
Unfortunately the tests pass when I run them manually. On my FreeBSD
bot thsi may be relevant:

    o LC_* is "C".

    o The filesystem encoding is ascii.
msg200177 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-17 23:55
On 18 Oct 2013 02:41, "STINNER Victor" <report@bugs.python.org> wrote:
>
>
> STINNER Victor added the comment:
>
> You don't want to mention the return value (-1) on error in the doc?

I guess it should explicitly say any non-zero return value indicates an
error rather than leaving that implied by the fact that zero indicates
success.

Something else occurred to me, though - this runs before Py_Initialize, and
now I'm not sure all the API calls are valid in that context.

>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue16129>
> _______________________________________
msg200249 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 13:13
New changeset 12d7b6171b28 by Nick Coghlan in branch 'default':
Issue #16129: Py_SetStandardStreamEncoding cleanups
http://hg.python.org/cpython/rev/12d7b6171b28
msg200250 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-18 13:17
I dealt with the comments folks made here and on python-dev regarding the docs and the limited API, as well as the fact it isn't safe to call PyErr_NoMemory() when this function fails.

However, I still haven't been able to reproduce the failure seen on the buildbots. It isn't a simple cross-test interference problem, as I tried the randomization seed from the Ubuntu log and that still passed for me. I'm still trying the seed from Stefan's FreeBSD system, but I expect that will be clean here as well.
msg200251 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-18 13:36
(As expected, the original FreeBSD randomization seed passed here without any problems)

Before I disabled the test completely, I did a run with full error diffs enabled.

Here's the FreeBSD log with full diffs:
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.0%203.x/builds/580/steps/test/logs/stdio

This indicates both "{0.stdout.encoding}" and "{0.stdout.encoding}" produced the string "None" in the formatting operation that creates the expected output.

The Ubuntu run exhibits the same behaviour:
http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/2717/steps/test/logs/stdio

Note that on both symptoms, the later run passes without a problem, and (aside from the full diff being enabled the second time), the symptoms were identical (further pushing against a cross-test compatibility problem).

And the full gory list of failed logs with this error and the full diffs:
http://buildbot.python.org/all/builders/AMD64%20OpenIndiana%203.x/builds/6502/steps/test/logs/stdio
http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/170/steps/test/logs/stdio
http://buildbot.python.org/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/5175/steps/test/logs/stdio
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.0%203.x/builds/5508/steps/test/logs/stdio
http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/8767/steps/test/logs/stdio

I couldn't find a full diff failure for this one, but this log has the truncated diff:
http://buildbot.python.org/all/builders/PPC64%20PowerLinux%203.x/builds/861/steps/test/logs/stdio
msg200252 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013-10-18 13:44
A test order problem is indeed unlikely:  I ran the tests as the buildbot user
with the same random seed and they passed. Let's blame the buildbot software. ;)
msg200254 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-18 13:47
Larry - I'm kinda stumped on this one. I'm not sure how to debug the failure on the buildbots, because, as Victor said, sys.stdout/err having an encoding attribute of None *isn't* supposed to happen, so the test should be fine as it stands. And nobody has been able to reproduce the problem on a direct run.

It's probably OK to leave the test disabled for the alpha (since even the failures show that *this* feature was working correctly, it was the test case itself that was breaking).

While I just noticed my RHEL buildbot hit the failure as well, Stefan's inability to reproduce the problem on his FreeBSD system suggests that isn't likely to help: http://buildbot.python.org/all/builders/x86%20RHEL%206%203.x/builds/2871/steps/test/logs/stdio

(And /facepalm moment, I just realised all my testing tonight has been pointless, because I forgot to remove the skip decorator from the offending test...)
msg200255 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-18 13:57
Ah, that got it:

LC_ALL=C ./python -m test -W test_capi

=> boom :)

And, of course it's a StringIO object at that point...
msg200257 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 14:01
New changeset 4af20969c592 by Nick Coghlan in branch 'default':
Issue #16129: this should appease the buildbots
http://hg.python.org/cpython/rev/4af20969c592
msg200259 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-10-18 14:03
So, does that mean "asked the release manager for a ruling" is the buildbot debugging equivalent of "threatened it with tech support?" :)
msg202002 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-03 07:05
Bastien, did you get a chance to try embedding Python 3.4a4 in Blender yet? If that works for you, we can mark this one as closed.
msg202007 - (view) Author: Bastien Montagne (mont29) Date: 2013-11-03 09:53
Wow… Good thing you remind me that. Just tested it here (linux with ASCII terminal), works perfectly. Thanks again for all the integration work, Nick!
msg202017 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-11-03 12:23
Excellent!

Zachary Ware got the embedding tests running and passing on Windows in issue 19439 (previously they were only executed on *nix systems), so Python 3.4 should resolve this problem on all platforms.
msg212210 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-25 21:03
New changeset 4cd620d8c3f6 by R David Murray in branch 'default':
whatsnew: DynanicClassAttribute (#19030), Py_SetStandardStreamEncoding (#16129)
http://hg.python.org/cpython/rev/4cd620d8c3f6
History
Date User Action Args
2014-02-25 21:03:43python-devsetmessages: + msg212210
2013-11-03 12:23:10ncoghlansetstatus: open -> closed

messages: + msg202017
2013-11-03 09:53:54mont29setmessages: + msg202007
2013-11-03 07:05:02ncoghlansetstatus: pending -> open

messages: + msg202002
2013-10-18 14:03:05ncoghlansetstatus: open -> pending
priority: release blocker -> normal
messages: + msg200259

resolution: fixed
stage: needs patch -> resolved
2013-10-18 14:01:09python-devsetmessages: + msg200257
2013-10-18 13:57:40ncoghlansetmessages: + msg200255
2013-10-18 13:49:15ncoghlansetpriority: normal -> release blocker
nosy: + larry
2013-10-18 13:47:56ncoghlansetmessages: + msg200254
2013-10-18 13:44:25skrahsetmessages: + msg200252
2013-10-18 13:36:34ncoghlansetmessages: + msg200251
2013-10-18 13:17:20ncoghlansetmessages: + msg200250
2013-10-18 13:13:18python-devsetmessages: + msg200249
2013-10-17 23:55:24ncoghlansetmessages: + msg200177
2013-10-17 17:27:30skrahsetnosy: + skrah
messages: + msg200146
2013-10-17 16:41:36hayposetmessages: + msg200145
2013-10-17 16:13:19ncoghlansetmessages: + msg200142
2013-10-17 16:04:52ncoghlansetmessages: + msg200141
2013-10-17 15:57:14ncoghlansetmessages: + msg200140
2013-10-17 15:52:11ncoghlansetmessages: + msg200139
2013-10-17 15:49:08ncoghlansetresolution: fixed -> (no value)
messages: + msg200138
stage: resolved -> needs patch
2013-10-17 15:46:52python-devsetmessages: + msg200137
2013-10-17 13:30:15mont29setmessages: + msg200125
2013-10-17 13:27:52python-devsetstatus: pending -> open

messages: + msg200124
2013-10-17 13:23:01ncoghlansetstatus: open -> pending
resolution: fixed
messages: + msg200123

stage: needs patch -> resolved
2013-10-17 13:22:05python-devsetnosy: + python-dev
messages: + msg200122
2013-10-17 12:38:53ncoghlansetassignee: ncoghlan
messages: + msg200120
2013-10-08 21:17:25Brecht.Van.Lommelsetmessages: + msg199255
2013-10-07 18:43:05mont29setfiles: + setstdio.diff

messages: + msg199153
2013-10-02 19:24:08mont29setmessages: + msg198864
2013-09-18 14:58:58mont29setfiles: + setstdio.diff
nosy: + mont29
messages: + msg198023

2013-09-16 14:48:26ncoghlansetassignee: ncoghlan -> (no value)
messages: + msg197903
2013-09-16 11:58:44Brecht.Van.Lommelsetnosy: + Brecht.Van.Lommel
messages: + msg197894
2012-11-28 06:42:07ncoghlansetassignee: ncoghlan

messages: + msg176516
nosy: + ncoghlan
2012-11-28 06:30:27christian.heimessetnosy: + christian.heimes
messages: + msg176515
2012-11-28 06:13:42ideasman42setfiles: + pyos_putenv.diff
keywords: + patch
messages: + msg176514
2012-10-08 01:30:38Arfreversetnosy: + Arfrever
2012-10-07 19:38:03asvetlovsetnosy: + asvetlov

stage: needs patch
2012-10-05 18:34:36pitrousettype: enhancement
2012-10-05 11:17:06ideasman42setmessages: + msg172073
2012-10-05 07:02:42lemburgsetmessages: + msg172054
2012-10-04 22:50:24pitrousetnosy: + lemburg, loewis
2012-10-04 13:26:27hayposetnosy: + haypo
messages: + msg171941
2012-10-04 12:58:14ideasman42create