test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding #63258

Open
pitrou opened this issue Sep 20, 2013 · 13 comments
Labels
tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@pitrou
Member

pitrou commented Sep 20, 2013

BPO 19058
Nosy @pitrou, @vstinner, @bitdancer, @serhiy-storchaka
Files
  • sys_test_ioencoding_locale.patch
  • sys_test_ioencoding.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2013-09-20.21:29:35.651>
    labels = ['type-bug', 'tests']
    title = 'test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding'
    updated_at = <Date 2015-10-14.16:33:57.297>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2015-10-14.16:33:57.297>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2013-09-20.21:29:35.651>
    creator = 'pitrou'
    dependencies = []
    files = ['31865', '31899']
    hgrepos = []
    issue_num = 19058
    keywords = ['patch']
    message_count = 13.0
    messages = ['198174', '198175', '198181', '198369', '198370', '198373', '198379', '198544', '198552', '198553', '198554', '252514', '253008']
    nosy_count = 4.0
    nosy_names = ['pitrou', 'vstinner', 'r.david.murray', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue19058'
    versions = ['Python 3.4']

    @pitrou
    Member Author

    pitrou commented Sep 20, 2013

    The test added in bpo-18818 fails on the new OS X buildbot:

    ======================================================================
    FAIL: test_ioencoding_nonascii (test.test_sys.SysModuleTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_sys.py", line 581, in test_ioencoding_nonascii
        self.assertEqual(out, os.fsencode(test.support.FS_NONASCII))
    AssertionError: b'' != b'\xc3\xa6'

    http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/4

    @pitrou pitrou added tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error labels Sep 20, 2013
    @vstinner
    Member

    The test fails with an ASCII locale encoding (e.g. LANG= on Linux).

    The test should not try to display a non-ASCII character; it should check the encoding (sys.stdout.encoding) instead. It should ensure that sys.stdout.encoding is the same when PYTHONIOENCODING is unset (Python started with the -E option and the current environment) and when the variable is set to an empty value.
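
A minimal sketch of that comparison, not the actual test code. The helper name `child_stdout_encoding` is hypothetical, and instead of passing -E the sketch removes PYTHONIOENCODING from the child's environment, so that any other PYTHON* variables stay identical between the two runs:

```python
import os
import subprocess
import sys


def child_stdout_encoding(pythonioencoding=None):
    """Return sys.stdout.encoding as reported by a child interpreter.

    pythonioencoding=None removes the variable from the child's environment;
    any string (including "") sets the variable to that value.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)
    if pythonioencoding is not None:
        env["PYTHONIOENCODING"] = pythonioencoding
    code = "import sys; print(sys.stdout.encoding)"
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    # Encoding names are plain ASCII, so decoding the child's output is safe.
    return out.decode("ascii").strip()


encoding_when_unset = child_stdout_encoding(None)
encoding_when_empty = child_stdout_encoding("")
print(encoding_when_unset, encoding_when_empty)
```

No non-ASCII character is ever written to the pipe, so the check does not depend on what the locale encoding can represent.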

    @bitdancer
    Member

    I set LC_CTYPE to en_US.utf-8 on the buildbot, which I think is the better setting for that buildbot, so the test doesn't fail there anymore. However, the test should still be fixed (and maybe we should have a buildbot running with no language set at all).

    @vstinner vstinner changed the title from "test_ioencoding_nonascii (test_sys) fails on Snow Leopard" to "test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding" on Sep 24, 2013
    @serhiy-storchaka
    Member

    Shouldn't FS_NONASCII be None with ASCII locale encoding?

    @vstinner
    Member

    Shouldn't FS_NONASCII be None with ASCII locale encoding?

    See the description of the variable in test.support:

    # FS_NONASCII: non-ASCII character encodable by os.fsencode(),
    # or None if there is no such character.

    The file system encoding and the locale encoding can be different... especially when PYTHONIOENCODING is used.

    The test should not use FS_NONASCII.
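
To illustrate the mismatch, here is a small sketch (the helper `is_encodable` is hypothetical and not part of test.support) that reports both encodings and whether each can encode the character from the failing assertion:

```python
import locale
import sys


def is_encodable(text, encoding):
    """Return True if text can be encoded with the given codec (strict errors)."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


char = "\u00e6"  # the character whose UTF-8 form is b'\xc3\xa6' in the traceback
fs_encoding = sys.getfilesystemencoding()
locale_encoding = locale.getpreferredencoding(False)

print("filesystem:", fs_encoding, is_encodable(char, fs_encoding))
print("locale:    ", locale_encoding, is_encodable(char, locale_encoding))
```

On a system where the two lines disagree, a character chosen for the file system encoding is not guaranteed to be writable to a stream that uses the locale encoding.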

    @bitdancer
    Member

    Also note that on OS X I believe the fsencoding is always utf-8, but the locale can of course be something else.

    @serhiy-storchaka
    Member

    Indeed.

    Here is a patch. It uses the same algorithm as for FS_NONASCII to obtain an encodable non-ASCII string, but with the locale encoding. It also adds new tests and simplifies existing tests.
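
The attached patch is not reproduced here. As a rough sketch of the general idea only, one can try a few candidate characters and keep the first one that round-trips through a given encoding. The function name and the candidate list below are assumptions made for illustration, not the contents of the patch:

```python
# Hypothetical candidate list, chosen only for this illustration.
CANDIDATES = ("\u00e6", "\u0141", "\u03c6", "\u041a", "\u20ac")


def find_encodable_nonascii(encoding, candidates=CANDIDATES):
    """Return the first candidate that round-trips through encoding, or None."""
    for char in candidates:
        try:
            if char.encode(encoding).decode(encoding) == char:
                return char
        except UnicodeError:
            # Not representable in this encoding; try the next candidate.
            continue
    return None
```

Passing `locale.getpreferredencoding(False)` as the encoding would give the locale-based variant; with an ASCII locale the function returns None, which is the case where such a test would have to be skipped.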

    @vstinner
    Member

    Here is a patch. It uses the same algorithm as for FS_NONASCII to
    obtain an encodable non-ASCII string, but with the locale encoding.
    It also adds new tests and simplifies existing tests.

    I don't like your patch. The purpose of PYTHONIOENCODING is to set the encodings of sys.stdin, sys.stdout and sys.stderr. Your patch does not check sys.stdout.encoding; it checks the codec directly. Two codecs may encode the same character as the same byte sequence.

    Your test is skipped if the locale encoding is ASCII, whereas the purpose of PYTHONIOENCODING is to write non-ASCII characters without having to care about the locale encoding.

    I would really prefer to simply check the sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding attributes.

    If you really want to check the codec itself, you should use a known sequence, e.g. 'héllo€'.encode('cp1252') gives b'h\xe9llo\x80'.
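
A short sketch of the known-sequence check, which also shows why a single character is a weak probe. The variable names are only illustrative:

```python
text = "h\u00e9llo\u20ac"  # 'héllo€'

as_cp1252 = text.encode("cp1252")
as_utf8 = text.encode("utf-8")
print(as_cp1252)
print(as_utf8)

# 'é' alone cannot tell cp1252 and latin-1 apart: both encode it as b'\xe9'.
same_bytes = "\u00e9".encode("cp1252") == "\u00e9".encode("latin-1")
print(same_bytes)
```

The euro sign is what makes the sequence discriminating: cp1252 maps it to a single byte, UTF-8 to three bytes, and latin-1 cannot encode it at all.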

    @serhiy-storchaka
    Member

    Here is a patch which directly checks sys.std* attributes.

    @serhiy-storchaka
    Member

    Your test is skipped if the locale encoding is ASCII, whereas the purpose of PYTHONIOENCODING is to write non-ASCII characters without having to care about the locale encoding.

    This case is covered by the previous test.

    If you really want to check the codec itself, you should use a known sequence, e.g. 'héllo€'.encode('cp1252') gives b'h\xe9llo\x80'.

    We can't be sure that the OS supports a cp1252 (or any other non-default) locale.

    @serhiy-storchaka
    Member

    Your patch does not check sys.stdout.encoding; it checks the codec directly. Two codecs may encode the same character as the same byte sequence.

    Checking the encoding name is too rigid. The Python interpreter can normalize the encoding name before assigning it to the standard streams. This is an implementation detail.

    @serhiy-storchaka
    Member

    What do you think of the recent patch, Victor?

    @vstinner
    Member

    What do you think of the recent patch, Victor?

    I'm not sure that it works in all cases. io.TextIOWrapper does not normalize the encoding name. You should use something like:

       encoding = codecs.lookup(encoding).name

    Otherwise, the test can fail if you use one of the various aliases of an encoding. Example: "UTF-8" vs "utf8" vs "utf-8".
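
A minimal sketch of an alias-insensitive comparison built on that call. The helper name `same_codec` is hypothetical:

```python
import codecs


def same_codec(name_a, name_b):
    """Return True if both names resolve to the same registered codec."""
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


print(codecs.lookup("UTF-8").name)
print(same_codec("UTF-8", "utf8"))
print(same_codec("utf-8", "ascii"))
```

A test could then compare `sys.stdout.encoding` with the expected name through such a helper instead of comparing the raw strings.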

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022