PEP 538: silence locale coercion and compatibility warnings by default? #74750

Closed
ncoghlan opened this issue Jun 4, 2017 · 13 comments
Labels
3.7 (EOL) end of life type-feature A feature request or enhancement

Comments

@ncoghlan
Contributor

ncoghlan commented Jun 4, 2017

BPO 30565
Nosy @warsaw, @ncoghlan, @vstinner, @methane
PRs
  • bpo-30565: Remove PEP 538 warnings #2186
  • bpo-30565: Add PYTHONCOERCECLOCALE=warn runtime flag #2260
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2017-06-18.02:32:29.007>
    created_at = <Date 2017-06-04.09:57:45.118>
    labels = ['type-feature', '3.7']
    title = 'PEP 538: silence locale coercion and compatibility warnings by default?'
    updated_at = <Date 2017-06-18.02:32:29.005>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2017-06-18.02:32:29.005>
    actor = 'ncoghlan'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2017-06-18.02:32:29.007>
    closer = 'ncoghlan'
    components = []
    creation = <Date 2017-06-04.09:57:45.118>
    creator = 'ncoghlan'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30565
    keywords = []
    message_count = 13.0
    messages = ['295120', '295881', '295882', '296006', '296007', '296057', '296078', '296238', '296240', '296246', '296249', '296252', '296253']
    nosy_count = 4.0
    nosy_names = ['barry', 'ncoghlan', 'vstinner', 'methane']
    pr_nums = ['2186', '2260']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30565'
    versions = ['Python 3.7']

    @ncoghlan
    Contributor Author

    ncoghlan commented Jun 4, 2017

    This is a follow-up to PEP-538 that reflects the uncertainty over whether or not the warning on *successful* implicit locale coercion is a good idea.

    The argument for this warning is that it alerts redistributors, system integrators, and application developers to the fact that LC_CTYPE may not be what they expect it to be.

    The argument against it is that in many, and potentially even most, cases where the warning is emitted, there won't be any subsequent problems, so the warning qualifies as a false alarm (especially for end users of applications that just happen to be written in Python). The PEP-538 section in the 3.7 What's New, together with the fact that "LC_CTYPE=C.UTF-8" (or similar) appears in the process environment, can be considered sufficient notice of the change for debugging purposes.

    The initial PEP-538 implementation at #659 includes the warning; this issue reflects the possibility that we may decide to reverse that decision prior to the release of Python 3.7.0.

    @ncoghlan ncoghlan added the 3.7 (EOL) end of life label Jun 4, 2017
    @ncoghlan ncoghlan self-assigned this Jun 4, 2017
    @ncoghlan ncoghlan added the type-feature A feature request or enhancement label Jun 4, 2017
    @ncoghlan
    Contributor Author

    My latest proposal on python-dev:

    • no warning by default on successful coercion
    • set "PYTHONCOERCECLOCALE=warn" to enable the warning when it's considered a configuration error in the runtime environment
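
A minimal sketch of how the proposed opt-in could be exercised from a test harness, assuming a Python 3.7+ interpreter in which `PYTHONCOERCECLOCALE=warn` behaves as proposed here. The helper name `run_with_coercion_warning` is hypothetical, and whether any warning text actually appears on stderr depends on the platform and its available locales.

```python
import os
import subprocess
import sys


def run_with_coercion_warning(
        code="import locale; print(locale.getpreferredencoding())"):
    """Run a child interpreter in the C locale with the opt-in warning enabled.

    Hypothetical helper for illustration. Returns (stdout, stderr) as text.
    """
    # Drop inherited locale settings so only the values set below apply.
    env = {k: v for k, v in os.environ.items()
           if not k.startswith("LC_") and k not in ("LANG", "LANGUAGE")}
    env["LC_CTYPE"] = "C"
    env["PYTHONCOERCECLOCALE"] = "warn"
    proc = subprocess.run([sys.executable, "-c", code], env=env,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.stdout, proc.stderr


if __name__ == "__main__":
    out, err = run_with_coercion_warning()
    print("encoding:", out.strip())
    print("stderr:", err.strip() or "(empty)")
```

Leaving `PYTHONCOERCECLOCALE` unset in the same helper would show the proposed default: the same encoding result, with nothing written to stderr.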

    @ncoghlan
    Contributor Author

    Some relevant mailing list threads on the usability problems posed by issuing the warning by default:

    python-dev (impact on developer experience): https://mail.python.org/pipermail/python-dev/2017-June/148323.html

    Fedora's python-devel (build environment warnings due to F26 backport of PEP-538): https://lists.fedorahosted.org/archives/list/python-devel@lists.fedoraproject.org/thread/VSEGOF76XMBJOAO4C2MORNK3I2VIPOTU/

    @vstinner
    Member

    The warnings caused around 27 test failures on the "AMD64 FreeBSD CURRENT Non-Debug 3.x" buildbot:

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/438/steps/test/logs/stdio

    27 tests failed again:
    test_asyncio test_base64 test_c_locale_coercion test_capi
    test_cmd_line test_cmd_line_script test_compileall
    test_concurrent_futures test_doctest test_faulthandler
    test_file_eintr test_inspect test_io test_json test_keyword
    test_module test_multiprocessing_fork
    test_multiprocessing_forkserver test_multiprocessing_main_handling
    test_subprocess test_symbol test_sys test_threading test_tools
    test_traceback test_venv test_warnings

    @vstinner
    Member

    I wrote a PR to remove PEP-538 warnings:
    #2186

    Reference (unpatched):

    haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
    Python runtime initialized with LC_CTYPE=C (a locale with default ASCII encoding), which may cause Unicode compatibility problems. Using C.UTF-8, C.utf8, or UTF-8 (if available) as alternative Unicode-compatible locales is recommended.
    ANSI_X3.4-1968

    haypo@selma$ env -i LC_CTYPE=C ./python -c "import locale; print(locale.getpreferredencoding())"
    Python detected LC_CTYPE=C: LC_CTYPE coerced to C.UTF-8 (set another locale or PYTHONCOERCECLOCALE=0 to disable this locale coercion behavior).
    UTF-8

    haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
    Python runtime initialized with LC_CTYPE=C (a locale with default ASCII encoding), which may cause Unicode compatibility problems. Using C.UTF-8, C.utf8, or UTF-8 (if available) as alternative Unicode-compatible locales is recommended.
    ANSI_X3.4-1968

    With my change:

    haypo@selma$ env -i ./python -c "import locale; print(locale.getpreferredencoding())"
    UTF-8

    haypo@selma$ env -i LC_CTYPE=C ./python -c "import locale; print(locale.getpreferredencoding())"
    UTF-8

    haypo@selma$ env -i LC_ALL=C ./python -c "import locale; print(locale.getpreferredencoding())"
    ANSI_X3.4-1968
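
The comparison above can be scripted so it is easy to rerun on other platforms. This is only a sketch: the helper name `preferred_encoding_under` is hypothetical, and a few variables (`PATH`, `SYSTEMROOT`, `LD_LIBRARY_PATH`) are kept so the child interpreter can still start, which only approximates `env -i`. The reported encodings vary by platform and Python version, so none are asserted here.

```python
import os
import subprocess
import sys

# Variables kept so the child interpreter can still start; everything else
# is dropped, approximating `env -i`.
_KEEP = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


def preferred_encoding_under(**locale_vars):
    """Return the preferred encoding a child interpreter reports.

    locale_vars are the only locale-related variables the child sees,
    e.g. preferred_encoding_under(LC_CTYPE="C").
    """
    env = {k: os.environ[k] for k in _KEEP if k in os.environ}
    env.update(locale_vars)
    proc = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding())"],
        env=env, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
        universal_newlines=True)
    return proc.stdout.strip()


if __name__ == "__main__":
    # The same three environments as in the transcript above.
    for settings in ({}, {"LC_CTYPE": "C"}, {"LC_ALL": "C"}):
        print(settings, "->", preferred_encoding_under(**settings))
```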

    @ncoghlan
    Contributor Author

    As Victor notes above, for systems where no suitable coercion target locale is available, even the "unsupported locale" warning is an issue, since it's only full Unicode text handling that's unsupported in such locales - you can still process ASCII-only and non-text data just fine in such environments (and even Unicode data if you're sufficiently careful).

    As a matter of future-proofing CPython, we also need to account for the fact that some C implementations (potentially including future versions of glibc) may start using UTF-8 as the default text encoding in the C locale itself (requiring the use of variants like C.ASCII or C.latin-1 to opt in to the use of legacy text encodings).

    I still think the warnings are potentially valuable for system integrators and operating system developers trying to ensure that C.UTF-8 is genuinely pervasive as their default locale, so my counter-proposal to Victor's PR is:

    • remove the build-time warning flag
    • gate the warnings for both successful and unsuccessful locale coercion behind a runtime check for PYTHONCOERCECLOCALE=warn

    That means:

    • we provide the best possible developer experience we can by default (i.e. no visible warnings, we assume UTF-8 where possible)
    • redistributors have a relatively simple patch to always emit the warnings if they want to do so (i.e. remove the conditional checks)
    • system integrators and folks debugging application misbehaviour can readily enable the warnings at runtime as part of fault investigation activities
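
The gating described in this counter-proposal can be modelled in a few lines. This is a simplified Python model, not the C implementation in CPython: the function name `report_locale_state` and its parameters are hypothetical, and the message texts are shortened versions of the warnings quoted earlier in this thread.

```python
import io
import sys


def report_locale_state(environ, coerced_to=None, legacy_locale_active=False,
                        stream=sys.stderr):
    """Write a locale warning only when PYTHONCOERCECLOCALE=warn is set.

    coerced_to: name of the target locale if coercion succeeded, else None.
    legacy_locale_active: True if the C locale is still in effect at startup.
    Returns True if a warning was written.
    """
    if environ.get("PYTHONCOERCECLOCALE") != "warn":
        return False  # default: stay silent in both cases
    if coerced_to is not None:
        stream.write("Python detected LC_CTYPE=C: LC_CTYPE coerced to %s\n"
                     % coerced_to)
        return True
    if legacy_locale_active:
        stream.write("Python runtime initialized with LC_CTYPE=C\n")
        return True
    return False


if __name__ == "__main__":
    buf = io.StringIO()
    report_locale_state({"PYTHONCOERCECLOCALE": "warn"},
                        coerced_to="C.UTF-8", stream=buf)
    print(buf.getvalue(), end="")
```

A redistributor who wants the warnings unconditionally would, in this model, simply drop the first `if` check, which is the "relatively simple patch" mentioned above.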

    @ncoghlan
    Contributor Author

    Updated issue title to reflect the fact we're now considering just silencing *all* the warnings by default.

    @ncoghlan ncoghlan changed the title PEP 538: default to skipping warning for implicit locale coercion? PEP 538: silence locale coercion and compatibility warnings by default? Jun 15, 2017
    @methane
    Member

    methane commented Jun 17, 2017

    If these warnings are disabled by default, who will enable them?
    How about just removing them?

    I'm OK with removing them all.
    It's not ideal, but nothing gets worse than in Python 3.6.

    Additionally, if PEP-540 is accepted, we can use UTF-8 for
    stdio and the filesystem encoding even when there is no UTF-8 locale.

    @ncoghlan
    Contributor Author

    By having the warnings always compiled in, but off by default, "PYTHONCOERCECLOCALE=warn" becomes a debugging tool for integration failures that we (or end users) suspect might be due to the locale coercion behaviour. It's essentially an even more esoteric variant of tools like "PYTHONINSPECT=1", "python -X faulthandler", "python -v", etc.

    I already learned something myself based on updating the test cases to cover the opt-in warning model: I initially thought that when we set "LC_ALL=C" we'd get *both* warnings, but we don't (at least with glibc).

    My working theory is that setting 'LC_ALL=C' causes 'setlocale(LC_CTYPE, "C.UTF-8")' to fail, even if there's otherwise a C.UTF-8 locale available. (I haven't conclusively proven that theory, but it's the most likely point for the locale coercion to be implicitly bypassed)
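
One way to probe this theory from Python is to call `locale.setlocale` in a child process started with `LC_ALL=C`. This is only a sketch: `probe_setlocale_with_lc_all_c` is a hypothetical helper, it exercises the `locale` module rather than the interpreter's own startup code path, and the result depends on which locales the platform provides, so no outcome is asserted here.

```python
import os
import subprocess
import sys

# Code run in the child: report whether the explicit setlocale() call works.
PROBE = r'''
import locale
try:
    print("ok:", locale.setlocale(locale.LC_CTYPE, "C.UTF-8"))
except locale.Error as exc:
    print("failed:", exc)
'''


def probe_setlocale_with_lc_all_c():
    """Return "ok: ..." or "failed: ..." from a child run with LC_ALL=C."""
    # PYTHONCOERCECLOCALE=0 keeps startup coercion out of the picture.
    env = dict(os.environ, LC_ALL="C", PYTHONCOERCECLOCALE="0")
    proc = subprocess.run([sys.executable, "-c", PROBE], env=env,
                          stdout=subprocess.PIPE, universal_newlines=True)
    return proc.stdout.strip()


if __name__ == "__main__":
    print(probe_setlocale_with_lc_all_c())
```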

    @vstinner
    Member

    As I wrote on python-dev, I would prefer no warning and no option to enable warnings. But it's not my PEP; I would prefer that Nick makes a choice here. Right now, my main worry is that my little (FreeBSD) buildbots are still sick.

    @ncoghlan
    Contributor Author

    OK, based on the latest round of custom buildbot results (e.g. http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Debug%20custom/builds/12/steps/test/logs/stdio ), it looks like the main remaining problems are those covered by bpo-30672, where *BSD platforms handle some locales differently from the way Linux handles them (and Mac OS X then inherits those differences), and bpo-30647 (where attempting to use the UTF-8 locale can cause nl_langinfo to fail).

    So for the latest iteration on *this* change, I'm going to do the following:

    1. Disable "UTF-8" as a candidate target locale
    2. Adjust the test suite's expectations accordingly
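
A rough sketch of what trying a reduced candidate list looks like, written against Python's `locale` module rather than the C code that actually performs the coercion. The names `CANDIDATE_TARGETS` and `first_available_target` are hypothetical; the two locale names are the remaining targets mentioned in the warning text quoted earlier in this thread.

```python
import locale

# "UTF-8" is deliberately left out, mirroring step 1 above.
CANDIDATE_TARGETS = ("C.UTF-8", "C.utf8")


def first_available_target(candidates=CANDIDATE_TARGETS):
    """Return the first candidate LC_CTYPE locale that can be set, or None.

    The process locale is restored before returning.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                continue  # not available on this platform, try the next one
            return name
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    print(first_available_target())
```

On a platform where neither name is available the function returns `None`, which corresponds to coercion being effectively disabled there.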

    @ncoghlan
    Contributor Author

    New changeset eb81795 by Nick Coghlan in branch 'master':
    bpo-30565: Add PYTHONCOERCECLOCALE=warn runtime flag (GH-2260)
    eb81795

    @ncoghlan
    Contributor Author

    OK, PEP-538 is effectively disabled on FreeBSD and Mac OS X now, and the locale coercion and compatibility warnings are off by default everywhere.

    PYTHONCOERCECLOCALE=warn is explicitly documented as a debugging tool for use when folks already suspect that locale coercion or legacy locale incompatibility might be to blame for particular problems that they're seeing.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022