Message347998
I think Dimiter was able to fix most of the failures, except test_unicode_file_functions.
Yesterday during the sprints we were looking at it, and we did some tests using the following snippet:
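import os
import unicodedata

# U+03D4 GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL
upsilon_diaeresis_and_hook = "ϔ"

# Create one file per normalization form of the same character.
for form in ["NFC", "NFD", "NFKC", "NFKD"]:
    unicode_filename = unicodedata.normalize(form, upsilon_diaeresis_and_hook)
    with open(unicode_filename, "w") as f:
        f.write(form)
    print("N:", ascii(unicode_filename))

# List what actually ended up in the current directory.
print([ascii(filename) for filename in os.listdir('.')])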
On ext4 this creates 4 different files: ['\u03d4', '\u03d2\u0308', '\u03ab', '\u03a5\u0308']
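To see why four files appear, it may help to look at the normalized strings on their own, with no filesystem involved. The following is only an illustrative sketch (the `forms` and `distinct` names are mine, not from the test suite):

```python
import unicodedata

char = "\u03d4"  # GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL

# Map each normalization form to the string it produces for this character.
forms = {
    form: unicodedata.normalize(form, char)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

# Show the code points behind each form.
for form, text in forms.items():
    print(form, [f"U+{ord(c):04X}" for c in text])

# Number of different code point sequences among the four forms.
distinct = len(set(forms.values()))
print("distinct names:", distinct)
```

A filesystem that stores names as given keeps these as separate entries, while one that normalizes on lookup can map several of them to the same file.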
On ZFS with utf8only=on (and I believe normalization=formD), only 2 files are created, but each of the 4 filenames can be used to access one of the 2 files.
This is also the default behavior on Mac.
The test is already skipped on darwin (Lib/test/test_unicode_file_functions.py:120), and should be skipped for ZFS too (this might depend on the exact flags used). However, we weren't able to find a portable way to determine the filesystem and its flags.
An alternative is to try creating the 4 files and skip the test if only 2 get created and all the names can be used to open these two files. However, this might mask other failures. Unless someone can come up with a better way to do this, I think this is the only option.
In addition, different filesystems that don't exhibit this behavior can be used on Mac, so the test shouldn't be skipped in those cases.
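A rough sketch of that alternative, assuming the probe runs in a scratch directory the test can write to. The helper names `probe_normalization` and `filesystem_normalizes` are hypothetical and are not part of test.support:

```python
import os
import tempfile
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def probe_normalization(directory, char="\u03d4"):
    """Write one file per normalization form of `char` into `directory`.

    Returns (created, all_names_resolve): the number of directory entries
    present afterwards, and whether every one of the names can be found.
    Assumes `directory` starts out empty.
    """
    names = [unicodedata.normalize(form, char) for form in FORMS]
    for form, name in zip(FORMS, names):
        with open(os.path.join(directory, name), "w", encoding="utf-8") as f:
            f.write(form)
    created = len(os.listdir(directory))
    all_names_resolve = all(
        os.path.exists(os.path.join(directory, name)) for name in names
    )
    return created, all_names_resolve


def filesystem_normalizes(directory):
    """Guess whether the filesystem under `directory` folds normalized names."""
    created, all_names_resolve = probe_normalization(directory)
    # Fewer entries than names, yet every name still resolves: the
    # filesystem is treating some of the forms as the same file.
    return created < len(FORMS) and all_names_resolve


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("normalizing filesystem:", filesystem_normalizes(tmp))
```

A test could call something like this in its setup and use `self.skipTest(...)` when it returns true. The same caveat applies as above: a probe like this cannot tell intended normalization apart from a real bug that happens to merge names.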
Date | User | Action | Args
2019-07-16 01:05:20 | ezio.melotti | set | recipients: + ezio.melotti, vstinner, benjamin.peterson, serhiy.storchaka, dimitern
2019-07-16 01:05:20 | ezio.melotti | set | messageid: <1563239120.13.0.336643650174.issue37584@roundup.psfhosted.org>
2019-07-16 01:05:20 | ezio.melotti | link | issue37584 messages
2019-07-16 01:05:19 | ezio.melotti | create |