sys.stdout.errors is set to "surrogateescape" #69526

Closed
serhiy-storchaka opened this issue Oct 8, 2015 · 12 comments
Assignees
Labels
topic-IO type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Member

BPO 25339
Nosy @ncoghlan, @vstinner, @serhiy-storchaka
Files
  • default_io_error_handler.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-04-10.11:50:50.914>
    created_at = <Date 2015-10-08.06:51:27.937>
    labels = ['type-bug', 'expert-IO']
    title = 'sys.stdout.errors is set to "surrogateescape"'
    updated_at = <Date 2016-04-10.11:50:50.913>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-04-10.11:50:50.913>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-04-10.11:50:50.914>
    closer = 'serhiy.storchaka'
    components = ['IO']
    creation = <Date 2015-10-08.06:51:27.937>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['41004']
    hgrepos = []
    issue_num = 25339
    keywords = ['patch']
    message_count = 12.0
    messages = ['252515', '253009', '253011', '253018', '254463', '262991', '263012', '263015', '263090', '263091', '263130', '263133']
    nosy_count = 4.0
    nosy_names = ['ncoghlan', 'vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25339'
    versions = ['Python 3.5', 'Python 3.6']

    @serhiy-storchaka
    Member Author

    The error handler of sys.stdout and sys.stdin is set to "surrogateescape" even for a non-ASCII encoding.

    $ LANG= PYTHONIOENCODING=UTF-8 ./python -c 'import sys; print(sys.stdout.encoding, sys.stdout.errors)'
    UTF-8 surrogateescape
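The same check can be scripted, which makes it easier to compare interpreters. The following is a minimal sketch: the helper name `stdio_settings` is hypothetical, and clearing `LC_ALL` and `LC_CTYPE` is an added assumption so that the empty `LANG` really selects the POSIX locale in the child process. The reported handler depends on the Python version that runs it.

```python
import os
import subprocess
import sys


def stdio_settings(pythonioencoding):
    """Return (encoding, errors) of sys.stdout in a child interpreter."""
    env = dict(os.environ)
    # Assumption: drop the variables that outrank LANG so the child starts
    # in the POSIX ("C") locale, as "LANG=" does in the report above.
    for name in ("LC_ALL", "LC_CTYPE"):
        env.pop(name, None)
    env["LANG"] = ""
    env["PYTHONIOENCODING"] = pythonioencoding
    code = "import sys; print(sys.stdout.encoding, sys.stdout.errors)"
    out = subprocess.run(
        [sys.executable, "-c", code],
        env=env, stdout=subprocess.PIPE, check=True,
    ).stdout.decode("ascii")
    encoding, errors = out.split()
    return encoding, errors


if __name__ == "__main__":
    print(stdio_settings("UTF-8"))
```

Passing a value with an explicit handler part, such as `"UTF-8:strict"`, shows whether that handler is honoured by the interpreter under test.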

    @serhiy-storchaka serhiy-storchaka added topic-IO type-bug An unexpected behavior, bug, or error labels Oct 8, 2015
    @vstinner
    Member

    Sorry, I don't understand the issue. Do you consider that using surrogateescape is a bug?

    Which behaviour do you expect?

    Python 3.5 now uses surrogateescape by default for stdin and stdout when the locale is POSIX. I guess that you got the POSIX locale by using "LANG=".

    @serhiy-storchaka
    Member Author

    I'm not sure this is a bug, but it looks at least unexpected, that surrogateescape is used with non-ASCII encoding. For example, my last test for bpo-19058 fails in the POSIX locale on 3.5+, and it is not so easy to make it work.

    Maybe change the error handler to surrogateescape only if PYTHONIOENCODING is not specified?

    @vstinner
    Member

    "it looks at least unexpected, that surrogateescape is used with non-ASCII encoding"

    What do you mean by non-ASCII encoding? surrogateescape is used by all encodings for all OS operations on Python 3, like os.listdir(), even for UTF-8.
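To illustrate how surrogateescape behaves with UTF-8, here is a small sketch using only `bytes.decode` and `str.encode`. The sample bytes are made up for the example.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 is not valid UTF-8 here

# With surrogateescape the undecodable byte 0xE9 becomes the lone
# surrogate U+DCE9 instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", "surrogateescape")
print(ascii(name))  # 'caf\udce9.txt'

# Encoding with the same handler restores the original bytes exactly.
assert name.encode("utf-8", "surrogateescape") == raw

# With the strict handler the lone surrogate cannot be encoded.
try:
    name.encode("utf-8", "strict")
except UnicodeEncodeError as exc:
    print("strict failed:", exc.reason)
```

This round trip is why the handler suits OS-level data such as file names, and also why it can surprise when applied to a stream whose encoding was chosen explicitly.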

    @serhiy-storchaka
    Member Author

    The default encoding of sys.stdin and sys.stdout is determined by (in order of increasing precedence):

    1. locale
    2. PYTHONIOENCODING
    3. Py_SetStandardStreamEncoding()

    The default error handler before 3.5 was determined by:

    1. 'strict'
    2. PYTHONIOENCODING
    3. Py_SetStandardStreamEncoding()

    The default error handler since 3.5 (bpo-19977) is determined by:

    1. PYTHONIOENCODING
    2. locale
    3. Py_SetStandardStreamEncoding()

    Even if you explicitly specify the error handler via PYTHONIOENCODING, it has no effect in the POSIX locale. This doesn't look right to me. I think the order should be the same as for the encoding.

    The proposed patch makes PYTHONIOENCODING override the locale default for the error handler.
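The two orderings can be modelled in a few lines. This is only a sketch of the precedence lists in this comment, not CPython's startup code; the function and parameter names are hypothetical, and each later assignment stands for a higher-precedence source.

```python
def pick_errors_3_5_0(posix_locale, env_errors=None, embed_errors=None):
    """Order described for 3.5: the locale default outranks PYTHONIOENCODING."""
    errors = "strict"
    if env_errors:        # 1. PYTHONIOENCODING
        errors = env_errors
    if posix_locale:      # 2. locale
        errors = "surrogateescape"
    if embed_errors:      # 3. Py_SetStandardStreamEncoding()
        errors = embed_errors
    return errors


def pick_errors_proposed(posix_locale, env_errors=None, embed_errors=None):
    """Proposed order: same precedence as for the encoding."""
    errors = "surrogateescape" if posix_locale else "strict"  # 1. locale
    if env_errors:        # 2. PYTHONIOENCODING
        errors = env_errors
    if embed_errors:      # 3. Py_SetStandardStreamEncoding()
        errors = embed_errors
    return errors


# An explicit handler from PYTHONIOENCODING under the POSIX locale:
print(pick_errors_3_5_0(True, env_errors="strict"))     # surrogateescape
print(pick_errors_proposed(True, env_errors="strict"))  # strict
```

Under the proposed order the POSIX locale still yields surrogateescape when PYTHONIOENCODING gives no handler.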

    @serhiy-storchaka
    Member Author

    What do you think about this, Victor?

    @ncoghlan
    Contributor

    ncoghlan commented Apr 8, 2016

    I believe the problem may be that we can't readily tell the difference between "PYTHONIOENCODING=ascii" and "PYTHONIOENCODING=ascii:strict", and in the former case we'd ideally still end up using "surrogateescape" by default.
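For reference, the two values differ only in whether the optional handler part of the `encodingname:errorhandler` syntax is present. The sketch below shows that distinction at the string level; the helper `parse_pythonioencoding` is hypothetical and is not the C startup code this comment refers to.

```python
def parse_pythonioencoding(value):
    """Split "encodingname:errorhandler"; either part may be omitted."""
    encoding, _, errors = value.partition(":")
    # An omitted part is reported as None so callers can apply a default.
    return (encoding or None, errors or None)


print(parse_pythonioencoding("ascii"))         # ('ascii', None)
print(parse_pythonioencoding("ascii:strict"))  # ('ascii', 'strict')
print(parse_pythonioencoding(":replace"))      # (None, 'replace')
```

Keeping the omitted handler as `None` rather than substituting "strict" early is what would let a later step choose surrogateescape for the bare `ascii` case.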

    That said, the real intent of the change was "If the detected encoding is ASCII, enable surrogateescape automatically", and detecting the POSIX locale was a proxy for that. We didn't account for PYTHONIOENCODING being used to select a more sensible encoding.

    @serhiy-storchaka
    Member Author

    Making "PYTHONIOENCODING=ascii" mean "PYTHONIOENCODING=ascii:surrogateescape" is a different (and maybe more complex) issue. What error handler should open(name, encoding='ascii') use? And open(name) in the POSIX locale?

    This issue is about PYTHONIOENCODING not working correctly in the POSIX locale.

    @vstinner
    Member

    vstinner commented Apr 9, 2016

    Ok, I now understand the issue. Your change looks good to me.

    I agree that the strict error handler is a good choice for PYTHONIOENCODING=ascii.

    @vstinner
    Member

    vstinner commented Apr 9, 2016

    The patch looks good to me.

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 10, 2016

    New changeset 56eca1c08738 by Serhiy Storchaka in branch '3.5':
    Issue bpo-25339: PYTHONIOENCODING now has priority over locale in setting the
    https://hg.python.org/cpython/rev/56eca1c08738

    New changeset 9c6623099da1 by Serhiy Storchaka in branch 'default':
    Issue bpo-25339: PYTHONIOENCODING now has priority over locale in setting the
    https://hg.python.org/cpython/rev/9c6623099da1

    @serhiy-storchaka
    Member Author

    Thank you for your review, Victor. I have added one more minor change to the tests because -I doesn't suppress PYTHONIOENCODING.

    @serhiy-storchaka serhiy-storchaka self-assigned this Apr 10, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022