classification
Title: Add _PyPreConfig.configure_locale: allow to leave LC_CTYPE unchanged when embedding Python
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ncoghlan, steve.dower, twouters, vstinner
Priority: normal Keywords: patch

Created on 2019-05-16 23:16 by vstinner, last changed 2019-05-17 21:14 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 13368 merged vstinner, 2019-05-16 23:18
Messages (11)
msg342677 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-16 23:16
Py_Initiliaze() always call setlocale(LC_CTYPE, "") on all platforms to set the LC_CTYPE locale to the user preferred locale.

That's fine when Py_Main() is used: when Python is the only "owner" of a process.

I'm not sure that it's fine when Python is embedded into an application.

I propose to add _PyPreConfig.configure_locale: when set to 0, it leaves the LC_CTYPE locale unchanged. The caller is responsible to choose its favorite locale.

I'm not sure if it's real issue in practice. Maybe users learnt how to workaround this limitation.

By the way, I modified Python to always call setlocale(LC_CTYPE, "") on Windows, bpo-34485:

commit 177d921c8c03d30daa32994362023f777624b10d
Author: Victor Stinner <vstinner@redhat.com>
Date:   Wed Aug 29 11:25:15 2018 +0200

    bpo-34485, Windows: LC_CTYPE set to user preference (GH-8988)
    
    On Windows, the LC_CTYPE is now set to the user preferred locale at
    startup: _Py_SetLocaleFromEnv(LC_CTYPE) is now called during the
    Python initialization. Previously, the LC_CTYPE locale was "C" at
    startup, but changed when calling setlocale(LC_CTYPE, "") or
    setlocale(LC_ALL, "").
    
    pymain_read_conf() now also calls _Py_SetLocaleFromEnv(LC_CTYPE) to
    behave as _Py_InitializeCore(). Moreover, it doesn't save/restore the
    LC_ALL anymore.
    
    On Windows, standard streams like sys.stdout now always use
    surrogateescape error handler by default (ignore the locale).

This change caused a performance regression: bpo-35195 "[Windows] Python 3.7 initializes LC_CTYPE locale at startup, causing performance issue on msvcrt isdigit()".

Attached PR implements proposed change.
msg342701 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 09:51
> I'm not sure if it's real issue in practice. Maybe users learnt how to workaround this limitation.

I sent an email to capi-sig to get a feedback on this question :-)

https://mail.python.org/archives/list/capi-sig@python.org/thread/KP2QNE4YITGJDZMZW5Y4EGIPZR2L6HRD/
msg342703 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 09:58
> Add _PyPreConfig.configure_locale: allow to leave LC_CTYPE unchanged when embedding Python

This parameter is similar the _PyCoreConfig.configure_c_stdio parameter that I just added. It allows to behave more as a "regular Python" or to better "isolate Python".
msg342732 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-05-17 15:45
I think this should be opt-in, not opt-out.

Imagine you're an existing application and you want to embed Python. Why would you ever want it to suddenly change your global settings like this?

As a general rule, an embedded Python runtime should deal with all the settings it's been provided and not forcibly change *any* of them (though maybe the embedding application will discover bugs and have to update their entire application to deal with it - or switch to a language that isn't so demanding!).

For this specific case, it seems just as easy to opt-in by calling setlocale(LC_CTYPE, "") before initializing Python. We can recommend this in docs and do it ourselves in Py_Main, but I don't see why we'd add a specific option for configuring the user's C runtime.
msg342735 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 16:48
Steve:
> Imagine you're an existing application and you want to embed Python. Why would you ever want it to suddenly change your global settings like this?

I have a very good news for you :-) Slowly but steadily, we converge on a agreement ;-)

The more time I spend on PEP 587, the more I think that the following use cases are too different and "incompatible":

* customized Python interpreter: behaves as the "regular Python"
* Embedded Python

I drafted "Isolate Python" in my PEP 587. With subtle options like configure_c_stdio, parse_argv and now configure_locale, it seems like we really need two *separated* these two configurations.

I implemented exactly that in PR 13388, I added 2 functions:

* _PyCoreConfig_InitPythonConfig(): provide a default configuration which parses command line arguments, enable UTF-8 Mode depending on the locale, read global configuration variables, read environment variables, etc.

* _PyCoreConfig_InitIsolateConfig(): isolated from the system, ignore command line arguments, ignore global configuration variables, leave the LC_CTYPE locale and C streams unchanged, etc.

I'm working on an update of my PEP to better describe that.

--

So I understand that this option is needed and you should prefer to turn it off by default in _PyCoreConfig_InitIsolateConfig.
msg342736 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-05-17 17:00
> I have a very good news for you :-) Slowly but steadily, we converge on a agreement ;-)

Yay! :)

> I added 2 functions …

Nice, these will be helpful. Probably a good thing to try out with some embedders too, to see whether they're intuitive enough.

(I promise I will get time to work on some code here soon, but right now I'm trying to get PEP 578 merged.)
msg342737 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-05-17 17:01
> * customized Python interpreter: behaves as the "regular Python"

For this, why wouldn't we say "start by copying all the code in Programs/python.c"? Is there any reason why that file needs to only be one single call into Py_Main? Maybe there's some reason I'm not aware of...
msg342742 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 17:12
> Nice, these will be helpful. Probably a good thing to try out with some embedders too, to see whether they're intuitive enough.

I suggest to move this discussion on the WIP PR of 4th version of my PEP:
https://github.com/python/peps/pull/1056

I will reply you there :-)
msg342744 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 17:44
I asked on Twitter: "If you embed Python in your application, would you prefer Python to leave your locale unchanged?"

Mai Giménez replied: "I would like to."
https://twitter.com/maidotgimenez/status/1129357352774393856

Steve Dower wrote:
"I think this should be opt-in, not opt-out."

I count at least two users who would like to get such option, so my idea wasn't stupid :-)

I updated my PR and I plan to merge it soon.
msg342748 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-05-17 18:13
> I suggest to move this discussion on the WIP PR of 4th version of my PEP

I prefer to keep it on the issue tracker where it doesn't disappear when merged and I get proper notifications. (I want to make the most of getting good notifications of new comments before they get taken away :( )
msg342762 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-17 20:44
New changeset bcfbbd704646622e919c1306a91fba61d603483d by Victor Stinner in branch 'master':
bpo-36945: Add _PyPreConfig.configure_locale (GH-13368)
https://github.com/python/cpython/commit/bcfbbd704646622e919c1306a91fba61d603483d
History
Date User Action Args
2019-05-17 21:14:26vstinnersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-05-17 20:44:27vstinnersetmessages: + msg342762
2019-05-17 18:13:40steve.dowersetmessages: + msg342748
2019-05-17 17:44:45vstinnersetmessages: + msg342744
2019-05-17 17:12:09vstinnersetmessages: + msg342742
2019-05-17 17:01:08steve.dowersetmessages: + msg342737
2019-05-17 17:00:08steve.dowersetmessages: + msg342736
2019-05-17 16:48:49vstinnersetmessages: + msg342735
2019-05-17 15:45:56steve.dowersetmessages: + msg342732
2019-05-17 09:58:48vstinnersetmessages: + msg342703
2019-05-17 09:51:24vstinnersetmessages: + msg342701
2019-05-16 23:18:16vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request13278
2019-05-16 23:16:46vstinnercreate