Handle "POSIX" in the legacy locale detection #76419

Open
ncoghlan opened this issue Dec 7, 2017 · 5 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) OS-mac topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@ncoghlan
Contributor

ncoghlan commented Dec 7, 2017

BPO 32238
Nosy @ronaldoussoren, @ncoghlan, @vstinner, @jwilk, @ned-deily, @ezio-melotti, @koobs
Dependencies
  • bpo-32002: test_c_locale_coercion fails when the default LC_CTYPE != "C"
Superseder
  • bpo-30672: PEP 538: Unexpected locale behaviour on *BSD (including Mac OS X)

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-12-07.00:28:35.555>
    labels = ['OS-mac', 'interpreter-core', 'type-bug', '3.8', '3.7', 'expert-unicode']
    title = 'Handle "POSIX" in the legacy locale detection'
    updated_at = <Date 2019-10-11.21:32:55.917>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2019-10-11.21:32:55.917>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core', 'macOS', 'Unicode', 'FreeBSD']
    creation = <Date 2017-12-07.00:28:35.555>
    creator = 'ncoghlan'
    dependencies = ['32002']
    files = []
    hgrepos = []
    issue_num = 32238
    keywords = []
    message_count = 4.0
    messages = ['307781', '307782', '307783', '354502']
    nosy_count = 7.0
    nosy_names = ['ronaldoussoren', 'ncoghlan', 'vstinner', 'jwilk', 'ned.deily', 'ezio.melotti', 'koobs']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'test needed'
    status = 'open'
    superseder = '30672'
    type = 'behavior'
    url = 'https://bugs.python.org/issue32238'
    versions = ['Python 3.7', 'Python 3.8']

    @ncoghlan
    Contributor Author

    ncoghlan commented Dec 7, 2017

    Right now, the legacy locale detection introduced in PEP-538 doesn't trigger for "LANG=POSIX" and "LC_CTYPE=POSIX" on macOS and other *BSD systems.

    This is because we're looking specifically for "C" as the response from "setlocale(LC_CTYPE, NULL)", which works on Linux (where glibc reports "C" if you configured "POSIX"), but not on *BSD systems (where POSIX and C behave the same way, but are still reported as distinct locales).

    As per Jakub Wilk's comments at https://mail.python.org/pipermail/python-dev/2017-December/151105.html, this isn't right: we should allow either string to be returned from setlocale, and consider both of them as indicating a legacy locale to be coerced to an explicitly UTF-8 based one if possible.
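
A minimal sketch of the proposed check, written in Python for illustration rather than as the actual C code in CPython: query the current LC_CTYPE locale without changing it, and treat both "C" and "POSIX" as legacy locales. The names `LEGACY_LOCALES` and `is_legacy_locale` are hypothetical and do not appear in CPython.

```python
import locale

# Locale names that should both count as the legacy locale.
# Assumption: exact, case-sensitive matches against what setlocale reports.
LEGACY_LOCALES = frozenset({"C", "POSIX"})


def is_legacy_locale(name):
    """Return True if the reported LC_CTYPE locale name is a legacy locale."""
    return name in LEGACY_LOCALES


# Passing None queries the current setting without modifying it,
# like setlocale(LC_CTYPE, NULL) in C.
current = locale.setlocale(locale.LC_CTYPE, None)
print(current, is_legacy_locale(current))
```

With this shape, a platform that reports "POSIX" as a distinct name is handled the same way as one that normalizes it to "C".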

    @ncoghlan ncoghlan added 3.7 (EOL) end of life 3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) OS-mac topic-unicode type-bug An unexpected behavior, bug, or error labels Dec 7, 2017
    @ncoghlan
    Contributor Author

    ncoghlan commented Dec 7, 2017

    Added a dependency on https://bugs.python.org/issue32002, as we should finish the test case refactoring proposed there before adjusting the POSIX locale handling on macOS and other *BSD systems.

    @ncoghlan
    Contributor Author

    ncoghlan commented Dec 7, 2017

    Oops, I forgot I already had an open issue for this discrepancy - I just hadn't decided how to resolve it yet.

    Marking as a duplicate of https://bugs.python.org/issue30672

    @vstinner
    Member

    In Python 3.8, if the LC_CTYPE locale is "POSIX", the default stdio error handler is now "surrogateescape" instead of "strict", and the UTF-8 Mode is now enabled. In short, LC_CTYPE="POSIX" now behaves like LC_CTYPE="C".

    This change impacts at least FreeBSD. If I understand correctly, if none of the LC_ALL, LC_CTYPE or LANG environment variables is set on FreeBSD, the LC_CTYPE locale is "POSIX".

    See bpo-34485, bpo-19977 and the "POSIX locale on FreeBSD" section of my article:
    https://vstinner.github.io/python3-locales-encodings.html
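
One way to observe the behaviour described above on a given platform is to print the relevant settings from inside the interpreter. This is only a sketch: the function name `stdio_report` and the dictionary keys are hypothetical, and the values it reports depend on the Python version, the platform and the environment.

```python
import sys


def stdio_report():
    """Collect the stdio and UTF-8 Mode settings of the running interpreter."""
    return {
        # Error handler of stdout, for example "strict" or "surrogateescape".
        "stdout_errors": getattr(sys.stdout, "errors", None),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # 1 when UTF-8 Mode is enabled (sys.flags.utf8_mode, Python 3.7+).
        "utf8_mode": sys.flags.utf8_mode,
    }


print(stdio_report())
```

Saved as a script (say, a hypothetical `report.py`), running it once with `LC_CTYPE=POSIX` and once with `LC_CTYPE=C` in the environment should show whether the two locales are treated alike on that system.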

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @ronaldoussoren
    Contributor

    For macOS the default LC_CTYPE is UTF-8, that is:

    • in a terminal session I see LC_CTYPE=UTF-8 in the output of env (and in os.environ in a Python session)
    • when I clear the variable (unset LC_CTYPE) and start Python I still see LC_CTYPE=UTF-8 in the environment
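
The observation above can be reproduced with a short check that compares the environment variable with the locale the C library actually reports. This is a sketch under assumptions: the function name `lc_ctype_snapshot` and its keys are hypothetical, and the output will differ between terminals and platforms.

```python
import locale
import os


def lc_ctype_snapshot():
    """Return the LC_CTYPE environment variable and the active LC_CTYPE locale."""
    return {
        # os.environ is a mapping; .get() returns None when the variable is unset.
        "env": os.environ.get("LC_CTYPE"),
        # Passing None queries the active locale without changing it.
        "active": locale.setlocale(locale.LC_CTYPE, None),
    }


print(lc_ctype_snapshot())
```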

    Is there anything left to do for this issue? In particular for macOS, as I don't have BSD VMs at hand.
