Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No good way to set 'PYTHONIOENCODING' when embedding python. #60333

Closed
ideasman42 mannequin opened this issue Oct 4, 2012 · 39 comments
Closed

No good way to set 'PYTHONIOENCODING' when embedding python. #60333

ideasman42 mannequin opened this issue Oct 4, 2012 · 39 comments
Assignees
Labels
OS-windows type-feature A feature request or enhancement

Comments

@ideasman42
Copy link
Mannequin

ideasman42 mannequin commented Oct 4, 2012

BPO 16129
Nosy @malemburg, @loewis, @ncoghlan, @vstinner, @larryhastings, @tiran, @asvetlov, @ideasman42, @skrah
Files
  • pyos_putenv.diff: patch on python 3.4 hg - 80610:618ea5612e83
  • setstdio.diff: patch...
  • setstdio.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2013-11-03.12:23:10.879>
    created_at = <Date 2012-10-04.12:58:14.980>
    labels = ['type-feature', 'OS-windows']
    title = "No good way to set 'PYTHONIOENCODING' when embedding python."
    updated_at = <Date 2014-02-25.21:03:43.056>
    user = 'https://github.com/ideasman42'

    bugs.python.org fields:

    activity = <Date 2014-02-25.21:03:43.056>
    actor = 'python-dev'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2013-11-03.12:23:10.879>
    closer = 'ncoghlan'
    components = ['Windows']
    creation = <Date 2012-10-04.12:58:14.980>
    creator = 'ideasman42'
    dependencies = []
    files = ['28144', '31808', '31985']
    hgrepos = []
    issue_num = 16129
    keywords = ['patch']
    message_count = 39.0
    messages = ['171938', '171941', '172054', '172073', '176514', '176515', '176516', '197894', '197903', '198023', '198864', '199153', '199255', '200120', '200122', '200123', '200124', '200125', '200137', '200138', '200139', '200140', '200141', '200142', '200145', '200146', '200177', '200249', '200250', '200251', '200252', '200254', '200255', '200257', '200259', '202002', '202007', '202017', '212210']
    nosy_count = 13.0
    nosy_names = ['lemburg', 'loewis', 'ncoghlan', 'vstinner', 'larry', 'christian.heimes', 'Arfrever', 'asvetlov', 'ideasman42', 'skrah', 'python-dev', 'Brecht.Van.Lommel', 'mont29']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue16129'
    versions = ['Python 3.4']

    @ideasman42
    Copy link
    Mannequin Author

    ideasman42 mannequin commented Oct 4, 2012

    note, I was asked to report this issue, posted on the py dev mailing list: see - http://code.activestate.com/lists/python-dev/118015/

    ---

    We've run into an issue recently with blender3d on ms-windows where we
    want to enforce the encoding is UTF-8 with the embedded python
    interpreter.
    (the encoding defaults to cp437).

    I naively thought setting the environment variable before calling
    Py_Initialize() would work, but the way python DLL loads, it gets its
    own environment variables that cant be modified directly [1].
    eg, _putenv("PYTHONIOENCODING=utf-8:surrogateescape");

    We had bug reports by windows users not able to export files because
    the stdout errors on printing paths with unsupported encoding. [2],[3]

    ---

    Of course we could distribute blender with a bat file launcher that
    sets env variables, or ask the user to set their env variable - but I
    dont think this is really a good option.

    I tried overriding the stderr & stdout, but this caused another bug on exiting, giving an assert in MSVCR90.DLL's write.c (called from python32_d.dll):
    _VALIDATE_CLEAR_OSSERR_RETURN((_osfile(fh) & FOPEN), EBADF, -1);

    import sys, io
    sys.__stdout__ = sys.stdout =
    io.TextIOWrapper(io.open(sys.stdout.fileno(), "wb", -1),
    encoding='utf-8', errors='surrogateescape', newline="\n",
    line_buffering=True)
    sys.__stderr__ = sys.stderr =
    io.TextIOWrapper(io.open(sys.stderr.fileno(), "wb", -1),
    encoding='utf-8', errors='surrogateescape', newline="\n",
    line_buffering=True)

    IMHO either of these solutions would be fine.

    • have a PyOS_PutEnv() function, gettext has gettext_putenv() to
      workaround this problem.

    • manage this the same as Py_GetPythonHome(), which can be defined by
      the embedding application to override the default.

    [1] http://stackoverflow.com/questions/5153547/environment-variables-are-different-for-dll-than-exe
    [2] http://projects.blender.org/tracker/index.php?func=detail&aid=32750
    [3] http://projects.blender.org/tracker/index.php?func=detail&aid=31555

    @ideasman42 ideasman42 mannequin added the OS-windows label Oct 4, 2012
    @vstinner
    Copy link
    Member

    vstinner commented Oct 4, 2012

    See also issue bpo-15216.

    @malemburg
    Copy link
    Member

    IMHO either of these solutions would be fine.

    • have a PyOS_PutEnv() function, gettext has gettext_putenv() to
      workaround this problem.

    This solution would help in many other cases as well, so adding
    such an API would certainly help more than specialized interfaces.

    • manage this the same as Py_GetPythonHome(), which can be defined by
      the embedding application to override the default.

    I think you meant Py_SetPythonHome().

    Given that the IO encoding is very important for Python 3.x, a special
    API just for setting the encoding may be useful to have as well.

    Care must be taken, though, that the encoding cannot be set after
    Py_Initialize() has been called.

    It may overall be easier to go with the PyOS_PutEnv() solution to
    not run into the problems with having to check for an initialized
    interpreter first.

    --
    Marc-Andre Lemburg
    eGenix.com

    Professional Python Services directly from the Source  (#1, Oct 05 2012)
    >>> Python Projects, Consulting and Support ...   http://www.egenix.com/
    >>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
    >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
    ________________________________________________________________________
    2012-09-27: Released eGenix PyRun 1.1.0 ...       http://egenix.com/go35
    2012-09-26: Released mxODBC.Connect 2.0.1 ...     http://egenix.com/go34
    2012-09-25: Released mxODBC 3.2.1 ...             http://egenix.com/go33
    2012-10-23: Python Meeting Duesseldorf ...                 18 days to go

    eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
    Registered at Amtsgericht Duesseldorf: HRB 46611
    http://www.egenix.com/company/contact/

    @ideasman42
    Copy link
    Mannequin Author

    ideasman42 mannequin commented Oct 5, 2012

    Agree PyOS_PutEnv would be good since its not restricted to string encoding and resolves the problem of not being able to control env vars for an embedded interpreter in general.

    Having ways to change encoding is good too but a bit outside the scope of this report and possibly not the best solution either since its possible (through unlikely), that you need to set the encoding at the very start of python initialization- rather than site & builtin modules loads.

    @pitrou pitrou added the type-feature A feature request or enhancement label Oct 5, 2012
    @ideasman42
    Copy link
    Mannequin Author

    ideasman42 mannequin commented Nov 28, 2012

    patch attached, simply wraps putenv()

    @tiran
    Copy link
    Member

    tiran commented Nov 28, 2012

    At first glance your proposed fix looks like an easy hack to get around the issue. However it's not going to work properly. Embedded Python interpreters should isolate themselves from the user's environment. When Py_IgnoreEnvironmentFlag is enabled, Py_GETENV() always returns NULL.

    IMHO we can't get around Py_GetIOEncoding(), Py_SetIOEncoding() and Py_IOEncoding.

    @ncoghlan
    Copy link
    Contributor

    Claiming this one, mainly because I want people to largely leave the already hairy initialisation process alone until we get a chance to discuss it at the language summit next year.

    I plan to write up a comprehensive overview of the initialisation sequence before then, because we need to be figuring out how to *delete* code here, instead of adding even more.

    @ncoghlan ncoghlan self-assigned this Nov 28, 2012
    @BrechtVanLommel
    Copy link
    Mannequin

    BrechtVanLommel mannequin commented Sep 16, 2013

    With the language summit passed some time ago, is there now a more clear picture of how this would fit in the initialisation process?

    Any idea of what a proper implementation would look like, or is the message still to wait?

    @ncoghlan
    Copy link
    Contributor

    The gory details of the current startup situation are in PEP-432. However, the comprehensive solution described in that PEP isn't going to make it into Python 3.4, so a simpler interim fix would be worthwhile.

    Since Blender is designed to support building against the system Python, the trick of forcing Python to link to an alternate implementation of a function won't work.

    Inspired by http://docs.python.org/3/c-api/init.html#Py_SetPath, I suggest offering an API like:

    int Py_SetStandardStreamEncoding(char *encoding, char *errors)
    {
        if (Py_IsInitialized()) {
            return -1;
        }
        Py_StandardStreamEncoding = _PyMem_Strdup(encoding);
        Py_StandardStreamErrors = _PyMem_Strdup(errors);
    }

    The initstdio function in pythonrun.c would then be updated to use the specified Py_StandardStreamEncoding and Py_StandardStreamErrors if they weren't NULL (since that would indicate an embedding application had called SetStandardStreamEncoding).

    @ncoghlan ncoghlan removed their assignment Sep 16, 2013
    @mont29
    Copy link
    Mannequin

    mont29 mannequin commented Sep 18, 2013

    Hi,

    Decided to give it a try and implement suggested Py_SetStandardStreamEncoding, see attached patch.

    A quick test with Blender under Linux (with an ASCII console) seems to work OK, unfortunatly I do not have Windows to test it where it really matters...

    That's my first piece of code ever written for Python, I hope it does not contain too much obvious errors!

    @mont29
    Copy link
    Mannequin

    mont29 mannequin commented Oct 2, 2013

    No one to check the patch? It’s rather small, should not take much time… And we really need a solution for this issue, so better to try to get this settled before 3.4 freeze! ;)

    @mont29
    Copy link
    Mannequin

    mont29 mannequin commented Oct 7, 2013

    Updated patch (mostly from Brecht's remarks, removed an obvious bug...)

    @BrechtVanLommel
    Copy link
    Mannequin

    BrechtVanLommel mannequin commented Oct 8, 2013

    I tested the patch on Windows and can confirm it solves the problem for Blender.

    @ncoghlan
    Copy link
    Contributor

    I added tests to the patch and tweaked a few details. Once the local test run gives it a clean bill of health, I'll commit it so it will be included in 3.4a4 this weekend and you can check it works with Blender.

    @ncoghlan ncoghlan self-assigned this Oct 17, 2013
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 17, 2013

    New changeset 9cd88b39ef62 by Nick Coghlan in branch 'default':
    Issue bpo-16129: Add Py_SetStandardStreamEncoding
    http://hg.python.org/cpython/rev/9cd88b39ef62

    @ncoghlan
    Copy link
    Contributor

    Putting this to pending for the moment - please try it out with Blender once 3.4a4 is released, and then we'll close it.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 17, 2013

    New changeset 537e13ca7683 by Nick Coghlan in branch 'default':
    Issue bpo-16129: Move Py_SetStandardStreamEncoding declaration
    http://hg.python.org/cpython/rev/537e13ca7683

    @mont29
    Copy link
    Mannequin

    mont29 mannequin commented Oct 17, 2013

    Thanks Nick, we’ll do it for sure. :)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 17, 2013

    New changeset 51480f6428a5 by Nick Coghlan in branch 'default':
    Skip bpo-16129 test until I debug cross-platform issues
    http://hg.python.org/cpython/rev/51480f6428a5

    @ncoghlan
    Copy link
    Contributor

    Buildbots weren't happy with the test case. I've done two extra commits (one to turn off the diff limit, then a second one to skip it entirely), so will hopefully be able to debug it and keep this in for the alpha.

    (Larry - FYI, I'll have this test fixed or the patch reverted before the alpha on the weekend)

    @ncoghlan
    Copy link
    Contributor

    According to http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/2716/steps/test/logs/stdio, output in the subprocess was fine, but for some reason sys.stdout.encoding and sys.stderr.encoding were None in the test itself o.O

    @ncoghlan
    Copy link
    Contributor

    @ncoghlan
    Copy link
    Contributor

    Antoine pointed out this also needs a :versionadded: tag in the docs.

    @ncoghlan
    Copy link
    Contributor

    So reusing the run command from the buildbot still passed on my system.

    I thought using sys.stdout.encoding (etc) would reliably get me the default IO encoding, but it appears not :(

    @vstinner
    Copy link
    Member

    You don't want to mention the return value (-1) on error in the doc?

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 17, 2013

    Unfortunately the tests pass when I run them manually. On my FreeBSD
    bot thsi may be relevant:

    o LC_* is "C".
    
    o The filesystem encoding is ascii.
    

    @ncoghlan
    Copy link
    Contributor

    On 18 Oct 2013 02:41, "STINNER Victor" <report@bugs.python.org> wrote:

    STINNER Victor added the comment:

    You don't want to mention the return value (-1) on error in the doc?

    I guess it should explicitly say any non-zero return value indicates an
    error rather than leaving that implied by the fact that zero indicates
    success.

    Something else occurred to me, though - this runs before Py_Initialize, and
    now I'm not sure all the API calls are valid in that context.

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue16129\>


    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 18, 2013

    New changeset 12d7b6171b28 by Nick Coghlan in branch 'default':
    Issue bpo-16129: Py_SetStandardStreamEncoding cleanups
    http://hg.python.org/cpython/rev/12d7b6171b28

    @ncoghlan
    Copy link
    Contributor

    I dealt with the comments folks made here and on python-dev regarding the docs and the limited API, as well as the fact it isn't safe to call PyErr_NoMemory() when this function fails.

    However, I still haven't been able to reproduce the failure seen on the buildbots. It isn't a simple cross-test interference problem, as I tried the randomization seed from the Ubuntu log and that still passed for me. I'm still trying the seed from Stefan's FreeBSD system, but I expect that will be clean here as well.

    @ncoghlan
    Copy link
    Contributor

    (As expected, the original FreeBSD randomization seed passed here without any problems)

    Before I disabled the test completely, I did a run with full error diffs enabled.

    Here's the FreeBSD log with full diffs:
    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.0%203.x/builds/580/steps/test/logs/stdio

    This indicates both "{0.stdout.encoding}" and "{0.stdout.encoding}" produced the string "None" in the formatting operation that creates the expected output.

    The Ubuntu run exhibits the same behaviour:
    http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/2717/steps/test/logs/stdio

    Note that on both symptoms, the later run passes without a problem, and (aside from the full diff being enabled the second time), the symptoms were identical (further pushing against a cross-test compatibility problem).

    And the full gory list of failed logs with this error and the full diffs:
    http://buildbot.python.org/all/builders/AMD64%20OpenIndiana%203.x/builds/6502/steps/test/logs/stdio
    http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/170/steps/test/logs/stdio
    http://buildbot.python.org/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/5175/steps/test/logs/stdio
    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.0%203.x/builds/5508/steps/test/logs/stdio
    http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/8767/steps/test/logs/stdio

    I couldn't find a full diff failure for this one, but this log has the truncated diff:
    http://buildbot.python.org/all/builders/PPC64%20PowerLinux%203.x/builds/861/steps/test/logs/stdio

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 18, 2013

    A test order problem is indeed unlikely: I ran the tests as the buildbot user
    with the same random seed and they passed. Let's blame the buildbot software. ;)

    @ncoghlan
    Copy link
    Contributor

    Larry - I'm kinda stumped on this one. I'm not sure how to debug the failure on the buildbots, because, as Victor said, sys.stdout/err having an encoding attribute of None *isn't* supposed to happen, so the test should be fine as it stands. And nobody has been able to reproduce the problem on a direct run.

    It's probably OK to leave the test disabled for the alpha (since even the failures show that *this* feature was working correctly, it was the test case itself that was breaking).

    While I just noticed my RHEL buildbot hit the failure as well, Stefan's inability to reproduce the problem on his FreeBSD system suggests that isn't likely to help: http://buildbot.python.org/all/builders/x86%20RHEL%206%203.x/builds/2871/steps/test/logs/stdio

    (And /facepalm moment, I just realised all my testing tonight has been pointless, because I forgot to remove the skip decorator from the offending test...)

    @ncoghlan
    Copy link
    Contributor

    Ah, that got it:

    LC_ALL=C ./python -m test -W test_capi

    => boom :)

    And, of course it's a StringIO object at that point...

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 18, 2013

    New changeset 4af20969c592 by Nick Coghlan in branch 'default':
    Issue bpo-16129: this should appease the buildbots
    http://hg.python.org/cpython/rev/4af20969c592

    @ncoghlan
    Copy link
    Contributor

    So, does that mean "asked the release manager for a ruling" is the buildbot debugging equivalent of "threatened it with tech support?" :)

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Nov 3, 2013

    Bastien, did you get a chance to try embedding Python 3.4a4 in Blender yet? If that works for you, we can mark this one as closed.

    @mont29
    Copy link
    Mannequin

    mont29 mannequin commented Nov 3, 2013

    Wow… Good thing you remind me that. Just tested it here (linux with ASCII terminal), works perfectly. Thanks again for all the integration work, Nick!

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Nov 3, 2013

    Excellent!

    Zachary Ware got the embedding tests running and passing on Windows in bpo-19439 (previously they were only executed on *nix systems), so Python 3.4 should resolve this problem on all platforms.

    @ncoghlan ncoghlan closed this as completed Nov 3, 2013
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 25, 2014

    New changeset 4cd620d8c3f6 by R David Murray in branch 'default':
    whatsnew: DynanicClassAttribute (bpo-19030), Py_SetStandardStreamEncoding (bpo-16129)
    http://hg.python.org/cpython/rev/4cd620d8c3f6

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    OS-windows type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants