Issue37584
Created on 2019-07-13 10:13 by dimitern, last changed 2022-04-11 14:59 by admin.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description |
cpython_test_output.log | dimitern, 2019-07-13 10:13 | Tests output |
Messages (5) | |||
---|---|---|---|
msg347794 - (view) | Author: Dimiter Naydenov (dimitern) * | Date: 2019-07-13 10:13 | |
I'm running Ubuntu 19.04 on a ZFS mirrored pool, where my home partition is configured with the 'utf8only=on' attribute. I've cloned cpython and, after running the tests as described in devguide.python.org, I have 11 test failures:

    == Tests result: FAILURE ==
    389 tests OK.
    11 tests failed:
        test_cmd_line_script test_httpservers test_imp test_import
        test_ntpath test_os test_posixpath test_socket test_unicode_file
        test_unicode_file_functions test_zipimport

I've been looking for similar or matching reported issues, but could not find one. I'm at the EuroPython 2019 CPython sprint and we'll be looking into this with the help of some of the core devs. |
|||
msg347801 - (view) | Author: Dimiter Naydenov (dimitern) * | Date: 2019-07-13 10:35 | |
Here's some additional information I found for that specific attribute, from the documentation at http://dlc.sun.com/osol/docs/content/ZFSADMIN/gazss.html (link is dead, but here's where I found the section below: https://zfs-discuss.opensolaris.narkive.com/3NqQVG0H/utf8only-and-normalization-properties#post1):

utf8only (type: Boolean, default: Off)

This property indicates whether a file system should reject file names that include characters that are not present in the UTF-8 character code set. If this property is explicitly set to off, the normalization property must either not be explicitly set or be set to none. The default value for the utf8only property is off. This property cannot be changed after the file system is created. |
|||
msg347998 - (view) | Author: Ezio Melotti (ezio.melotti) * | Date: 2019-07-16 01:05 | |
I think Dimiter was able to fix most of the failures, except test_unicode_file_functions. Yesterday during the sprints we were looking at it, and we did some tests using the following snippet:

    import os
    import unicodedata

    upsilon_diaeresis_and_hook = "ϔ"

    for form in ["NFC", "NFD", "NFKC", "NFKD"]:
        unicode_filename = unicodedata.normalize(form, upsilon_diaeresis_and_hook)
        with open(unicode_filename, "w") as f:
            f.write(form)
        print("N:", ascii(unicode_filename))

    print([ascii(filename) for filename in os.listdir('.')])

On ext4 this creates 4 different files: ['\u03d4', '\u03d2\u0308', '\u03ab', '\u03a5\u0308']

On ZFS with utf8only=true (and I believe normalization=formD), only 2 files are created, but each of the 4 filenames can be used to access either of the 2 files. This is also the default behavior on Mac.

The test is already skipped on darwin (Lib/test/test_unicode_file_functions.py:120), and should be skipped for ZFS too (this might depend on the exact flags used); however, we weren't able to find a portable way to determine the filesystem and flags. An alternative is to try creating the 4 files and skip the test if only 2 get created and all the names can be used to open these two files; however, this might mask other failures. Unless someone can come up with a better way to do this, I think this is the only option. In addition, different filesystems that don't exhibit this behavior can be used on Mac, so the test shouldn't be skipped in those cases. |
|||
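As an illustration of the counts reported above, the four normalization forms can be compared without touching the filesystem. This is only a sketch: it assumes the filesystem compares names by their canonical decomposition (NFD), which is a guess based on the "normalization=formD" remark, not something confirmed in this thread.

```python
import unicodedata

# U+03D4 GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL, as in the snippet above.
char = "\u03d4"

forms = ("NFC", "NFD", "NFKC", "NFKD")
names = [unicodedata.normalize(form, char) for form in forms]

# Four distinct strings: what a non-normalizing filesystem (e.g. ext4) stores.
for form, name in zip(forms, names):
    print(form, ascii(name))

# Assumption: a formD filesystem treats names with the same NFD form as equal.
collapsed = {unicodedata.normalize("NFD", name) for name in names}
print(len(set(names)), "distinct names,", len(collapsed), "after NFD folding")
```

Under that assumption the four names fold into two, which matches the number of files seen on ZFS.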
msg348006 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-07-16 08:46 | |
""" On ext4 this creates 4 different files: ['\u03d4', '\u03d2\u0308', '\u03ab', '\u03a5\u0308'] On ZFS with utf8only=true (and I believe normalization=formD), only 2 files are created but each of the 4 filenames can be used to access either of the 2 files. This is also the default behavior on Mac. The test is already skipped on darwin (Lib/test/test_unicode_file_functions.py:120), and should be skipped for ZFS too (might depend on the exact flags used), however we weren't able to find a portable way to determine the filesystem and flags. """ I suggest creating a temporary directory, creating the 4 files in it, and seeing how many files os.listdir() returns. If you get 4, the FS doesn't normalize anything. If you get fewer, it's likely that the FS normalizes names. |
|||
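A minimal sketch of the probe suggested above, assuming it is acceptable to create a few files in a temporary directory. The helper name `fs_normalizes_unicode_names` and its `directory` parameter are hypothetical; nothing with this name exists in CPython's test support code.

```python
import os
import tempfile
import unicodedata


def fs_normalizes_unicode_names(directory=None):
    """Hypothetical helper: guess whether the filesystem folds Unicode names.

    Creates one file per normalization form of U+03D4 in a temporary
    directory (under `directory`, or the default temp location) and
    counts how many directory entries result.
    """
    char = "\u03d4"
    names = {unicodedata.normalize(form, char)
             for form in ("NFC", "NFD", "NFKC", "NFKD")}
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        for name in names:
            with open(os.path.join(tmpdir, name), "w", encoding="utf-8") as f:
                f.write(name)
        created = os.listdir(tmpdir)
    # Fewer entries than names means some names were treated as equal.
    return len(created) < len(names)


print(fs_normalizes_unicode_names())
```

The probe only describes the filesystem that holds the temporary directory, so a test would probably need to pass the directory it actually writes to.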
msg408420 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2021-12-13 02:23 | |
Confirmed. Repro: do an Ubuntu 20.04 install and choose "experimental zfs" support during install (https://ubuntu.com/blog/zfs-focus-on-ubuntu-20-04-lts-whats-new). On such a zfs filesystem, the following tests from a ./python -m test.regrtest run fail in 3.10:

    11 tests failed:
        test_cmd_line_script test_httpservers test_imp test_import
        test_ntpath test_os test_posixpath test_socket test_unicode_file
        test_unicode_file_functions test_zipimport

Move over to a tmpfs and all but test_httpservers now pass. test_httpservers tries to create such a path (one with a non-UTF-8 file name) on /tmp:

    ======================================================================
    ERROR: test_undecodable_filename (test.test_httpservers.SimpleHTTPServerTestCase)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/greg/test/cpython/Lib/test/test_httpservers.py", line 400, in test_undecodable_filename
        with open(os.path.join(self.tempdir, filename), 'wb') as f:
    OSError: [Errno 84] Invalid or incomplete multibyte or wide character: '/tmp/tmpnt9ch98x/@test_124227_tmp\udce7w\udcf0.txt'

I expect any filesystem mounted to reject non-UTF8 pathnames to cause similar failures. Our test suite needs to detect this environment and skip these tests there. |
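One possible way to detect such an environment is to try creating a file whose name is not valid UTF-8 and see whether the filesystem refuses it. The sketch below assumes a POSIX system where file names are arbitrary bytes; the helper name `fs_rejects_non_utf8_names` is hypothetical and not an existing test.support function.

```python
import os
import tempfile


def fs_rejects_non_utf8_names(directory=None):
    """Hypothetical helper: True if creating a non-UTF-8 file name fails.

    Assumes POSIX semantics (byte-string file names). Any OSError is
    treated as a rejection, which is deliberately broad and could also
    hide unrelated errors such as permission problems.
    """
    # Same undecodable bytes as in the failing test name: 0xE7, 'w', 0xF0.
    bad_name = b"probe_\xe7w\xf0.txt"
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        path = os.path.join(os.fsencode(tmpdir), bad_name)
        try:
            with open(path, "wb"):
                pass
        except OSError:
            return True
    return False


print(fs_rejects_non_utf8_names())
```

A test could then be guarded with something like `unittest.skipIf(fs_rejects_non_utf8_names(), "filesystem rejects non-UTF-8 names")`, again probing the directory the test really uses.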
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:18 | admin | set | github: 81765 |
2021-12-13 02:23:38 | gregory.p.smith | set | nosy: + gregory.p.smith; messages: + msg408420; versions: + Python 3.10, Python 3.11, - Python 3.7, Python 3.8 |
2019-07-16 08:46:21 | vstinner | set | messages: + msg348006 |
2019-07-16 01:05:20 | ezio.melotti | set | nosy: + serhiy.storchaka; messages: + msg347998; stage: test needed |
2019-07-13 10:35:22 | dimitern | set | messages: + msg347801 |
2019-07-13 10:13:57 | dimitern | create |