PYTHONCOERCECLOCALE is ignored when using -E or -I option #78820

Closed
vstinner opened this issue Sep 11, 2018 · 5 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Member

BPO 34639
Nosy @ncoghlan, @vstinner
Superseder
  • bpo-34589: Py_Initialize() and Py_Main() should not enable C locale coercion
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-09-19.23:05:10.755>
    created_at = <Date 2018-09-11.23:47:42.928>
    labels = ['interpreter-core', '3.7', '3.8']
    title = 'PYTHONCOERCECLOCALE is ignored when using -E or -I option'
    updated_at = <Date 2018-09-19.23:05:10.754>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-09-19.23:05:10.754>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-09-19.23:05:10.755>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2018-09-11.23:47:42.928>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 34639
    keywords = []
    message_count = 5.0
    messages = ['325096', '325240', '325244', '325276', '325816']
    nosy_count = 2.0
    nosy_names = ['ncoghlan', 'vstinner']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '34589'
    type = None
    url = 'https://bugs.python.org/issue34639'
    versions = ['Python 3.7', 'Python 3.8']

    @vstinner
    Member Author

    I modified Py_Main() to ignore the PYTHONCOERCECLOCALE environment variable if the -E or -I command line option is used. However, Nick asked that PYTHONCOERCECLOCALE always be read.

    We should either update the PEP or change the code.

    I am not sure why PYTHONCOERCECLOCALE should be handled differently from other PYTHON* variables such as PYTHONWARNINGS or PYTHONUTF8. Is it because it impacts the encodings? Is it because there was a chicken-and-egg issue before I reworked the Py_Main() code? (The PYTHONCOERCECLOCALE env var was read before the command line arguments were parsed.)

    --

    Copy of Nick Coghlan's msg325009:

    (The one exception to "nothing gets decoded incorrectly" is that PYTHONCOERCECLOCALE itself is always interpreted as an ASCII field: the values that it checks for are actually ASCII byte sequences, not Unicode code points.

    The documentation could definitely be much clearer on that point though, as even in the PEP it's only implied by the final paragraph in https://www.python.org/dev/peps/pep-0538/#legacy-c-locale-coercion-in-the-standalone-python-interpreter-binary which is mostly talking about the fact that this particular environment variable is still checked, even if you pass the -I or -E command line options.)
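
As a rough illustration of the byte-level check described above, the following Python sketch models how the raw value of the variable might be classified. It is not CPython's actual C code: the function name and the returned labels are hypothetical, and it only assumes the two special values described in PEP 538 (`0` and `warn`), with every other value treated like the default.

```python
from typing import Optional


def classify_coerce_setting(raw: Optional[bytes]) -> str:
    """Hypothetical model of how PYTHONCOERCECLOCALE is interpreted.

    The comparison is done on raw ASCII bytes, so no locale-dependent
    decoding is needed before the value can be checked.
    """
    if raw == b"0":
        # Coercion of the legacy C locale is switched off.
        return "disabled"
    if raw == b"warn":
        # Coercion stays on and a warning is requested when it happens.
        return "enabled-with-warning"
    # Unset (None) or any other byte string: default behaviour.
    return "enabled"


if __name__ == "__main__":
    for value in (None, b"0", b"warn", b"1"):
        print(value, "->", classify_coerce_setting(value))
```

Because the match is exact, a value such as `b" 0"` (with a leading space) falls through to the default in this model.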

    @vstinner vstinner added 3.7 (EOL) end of life 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Sep 11, 2018
    @ncoghlan
    Contributor

    Respecting -E and -I isn't a problem per se - the problem is moving the _Py_CoerceLegacyLocale call to a point that's incredibly late in the startup process *just* to get it to respect those flags.

    I don't actually mind if you reinstate the extra pass through the command line arguments just to check for -E and -I early enough for it to affect a properly located call to _Py_CoerceLegacyLocale, I just don't think it's necessary to do so.

    @ncoghlan
    Contributor

    For the PYTHONCOERCECLOCALE=warn case, it turns out that my preferred approach to implementing bpo-34589 also naturally ends up respecting -I and -E for that (i.e. supplying -I or -E will suppress the warning).

    However, my upcoming PR for that also reinstates and expands on my original comment that explained why it was valuable to support "PYTHONCOERCECLOCALE=0 python3 -E ..." and "PYTHONCOERCECLOCALE=0 python3 -I ...": so you can readily reproduce the way that locale coercion behaves on a platform *without* a suitable target locale (e.g. CentOS 7), even if your current platform actually does have such a locale available (e.g. Fedora).
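
One way to observe the behaviour discussed in this thread is to launch child interpreters from the C locale and inspect the `LC_CTYPE` environment variable they end up with. The sketch below is only an assumption about how such a check could be scripted: the helper names are hypothetical, and the values reported for the `-E` and `-I` cases depend on the Python version and on whether the platform provides a suitable target locale.

```python
import os
import subprocess
import sys

# The child only reports the LC_CTYPE environment variable it sees at runtime.
CHILD_CODE = "import os; print(os.environ.get('LC_CTYPE', '<unset>'))"


def child_lc_ctype(coerce_value, extra_flags=()):
    """Run a child interpreter in the C locale and return its LC_CTYPE env var.

    coerce_value: value for PYTHONCOERCECLOCALE, or None to leave it unset.
    extra_flags: interpreter options such as ("-E",) or ("-I",).
    """
    env = dict(os.environ)
    env.pop("LC_ALL", None)            # LC_ALL would override LC_CTYPE
    env["LC_CTYPE"] = "C"              # start from the legacy C locale
    env.pop("PYTHONCOERCECLOCALE", None)
    if coerce_value is not None:
        env["PYTHONCOERCECLOCALE"] = coerce_value
    cmd = [sys.executable, *extra_flags, "-c", CHILD_CODE]
    result = subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def survey():
    """Collect the child's LC_CTYPE for each flag/value combination."""
    results = {}
    for flags in ((), ("-E",), ("-I",)):
        for value in (None, "0"):
            results[(flags, value)] = child_lc_ctype(value, flags)
    return results


if __name__ == "__main__":
    for (flags, value), lc_ctype in survey().items():
        print(f"flags={flags!r} PYTHONCOERCECLOCALE={value!r} -> {lc_ctype}")
```

If `PYTHONCOERCECLOCALE=0` is honoured, the child keeps `LC_CTYPE=C`; if coercion runs and a target locale exists, the child would typically report a UTF-8 based locale instead.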

    @vstinner
    Member Author

    > Respecting -E and -I isn't a problem per se - the problem is moving the _Py_CoerceLegacyLocale call to a point that's incredibly late in the startup process *just* to get it to respect those flags.

    Would you mind elaborating on why this is an issue? LC_CTYPE is coerced *before* the configuration is read a second time. If there is an issue, could you show an example where something is decoded using the wrong encoding?

    @vstinner
    Member Author

    The discussion moved back to bpo-34589.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022