test_sys.test_ioencoding_nonascii() fails with ASCII locale encoding #63258
The test added in bpo-18818 fails on the new OS X buildbot:
======================================================================
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.murray-snowleopard/build/Lib/test/test_sys.py", line 581, in test_ioencoding_nonascii
    self.assertEqual(out, os.fsencode(test.support.FS_NONASCII))
AssertionError: b'' != b'\xc3\xa6'
http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/4 |
The test fails with an ASCII locale encoding (e.g. LANG= on Linux). The test should not try to display a non-ASCII character; it should check the encoding (sys.stdout.encoding) instead. The test should ensure that sys.stdout.encoding is the same with PYTHONIOENCODING unset (python started with the -E option and the current environment) and with the variable set to an empty value. |
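The check proposed above could look roughly like the sketch below. It is not the patch from this issue: the helper name `stdout_encoding` is made up for illustration, and it assumes a Python recent enough to have `subprocess.run(..., capture_output=True)`.

```python
import os
import subprocess
import sys


def stdout_encoding(env_overrides, *flags):
    """Return sys.stdout.encoding as reported by a child interpreter.

    Hypothetical helper: the child starts from the current environment
    with PYTHONIOENCODING removed, then env_overrides is applied.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)
    env.update(env_overrides)
    cmd = [sys.executable, *flags, "-c",
           "import sys; print(sys.stdout.encoding)"]
    result = subprocess.run(cmd, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


# Encoding with the variable unset, and with it set to an empty value.
unset = stdout_encoding({})
empty = stdout_encoding({"PYTHONIOENCODING": ""})
print(unset, empty)
```

Comparing `unset` with `empty` does not depend on the locale encoding of the machine, so the comparison is meant to behave the same under `LANG=` and under a UTF-8 locale.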
I set LC_CTYPE to en_US.utf-8 on the buildbot, which I think is the better setting for that buildbot, so the test doesn't fail there anymore. However, the test should still be fixed (and maybe we should have a buildbot running with no language set at all). |
Shouldn't FS_NONASCII be None with ASCII locale encoding? |
"Shouldn't FS_NONASCII be None with ASCII locale encoding?" See the description of the variable in test.support: # FS_NONASCII: non-ASCII character encodable by os.fsencode(), The file system encoding and the locale encoding can be different... especially when PYTHONIOENCODING is used. The test should not use FS_NONASCII. |
Also note that on OS X I believe the fsencoding is always utf-8, but the locale can of course be something else. |
Indeed. Here is a patch. It uses the same algorithm as for FS_NONASCII to obtain an encodable non-ASCII string, but with the locale encoding. It also adds new tests and simplifies the existing tests. |
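A minimal sketch of that idea, assuming the algorithm amounts to "try candidate characters until one round-trips through the encoding". The function name `find_nonascii` and the short candidate list are hypothetical, not taken from the patch or from test.support.

```python
import locale

# Hypothetical candidate list; the first entry is the character whose
# UTF-8 form (b'\xc3\xa6') appears in the failing assertion above.
CANDIDATES = ("\u00e6", "\u0130", "\u20ac")


def find_nonascii(encoding, candidates=CANDIDATES):
    """Return the first candidate that round-trips through encoding.

    Returns None when no candidate is encodable, as with ASCII.
    """
    for ch in candidates:
        try:
            if ch.encode(encoding).decode(encoding) == ch:
                return ch
        except UnicodeError:
            continue
    return None


# Apply it to the locale encoding rather than the file system encoding.
locale_char = find_nonascii(locale.getpreferredencoding(False))
print(repr(locale_char))
```

With an ASCII locale encoding this returns None, so a test built on it would have to be skipped in that case.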
I don't like your patch. The purpose of PYTHONIOENCODING is to set the sys.stdin/stdout/stderr encodings. Your patch does not check sys.stdout.encoding, but checks the codec directly. Two codecs may encode the same character as the same byte sequence. Your test is skipped if the locale encoding is ASCII, whereas the purpose of PYTHONIOENCODING is to write non-ASCII characters without having to care about the locale encoding. I would really prefer to simply check the sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding attributes. If you really want to check the codec itself, you should use a known sequence, e.g. 'héllo€'.encode('cp1252') gives b'h\xe9llo\x80'. |
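Both points about codecs can be illustrated with a short sketch. The choice of latin-1 as the second codec is only an example and is not mentioned in the comment above.

```python
# 'héllo€', written with escapes so the source file encoding is irrelevant.
text = "h\u00e9llo\u20ac"

# The known sequence from the comment above.
encoded = text.encode("cp1252")
print(encoded)

# Two codecs can produce the same bytes for the same character, so a
# single character cannot tell cp1252 and latin-1 apart.
same_bytes = "\u00e9".encode("cp1252") == "\u00e9".encode("latin-1")

# The euro sign does distinguish them: latin-1 cannot encode it.
try:
    "\u20ac".encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
```

This is why a sequence that mixes several characters is a stronger codec check than a single non-ASCII character.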
Here is a patch which directly checks sys.std* attributes. |
This case was tested in the previous test.
We can't be sure that the OS supports a cp1252 (or any other non-default) locale. |
Checking the encoding name is too rigid. The Python interpreter can normalize the encoding name before assigning it to the standard streams. This is an implementation detail. |
Victor, what do you think of the latest patch? |
I'm not sure that it works in all cases. io.TextIOWrapper doesn't bother to normalize the encoding name. You should use something like: encoding = codecs.lookup(encoding).name Otherwise, the test can fail if you use one of the various aliases of an encoding. Example: "UTF-8" vs "utf8" vs "utf-8". |
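A small sketch of the normalization suggested above. `codecs.lookup()` is the standard library call; the wrapper name `same_encoding` is made up for illustration.

```python
import codecs


def same_encoding(a, b):
    """Compare two encoding names after resolving aliases."""
    return codecs.lookup(a).name == codecs.lookup(b).name


# All three spellings from the comment resolve to the same codec name.
names = {codecs.lookup(alias).name for alias in ("UTF-8", "utf8", "utf-8")}
print(names)
```

A test could then compare `codecs.lookup(sys.stdout.encoding).name` with the normalized expected name instead of comparing the raw strings.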