Author tchrist
Recipients docs@python, eric.araujo, georg.brandl, jh45, tchrist, vstinner
Date 2011-08-12.02:36:30
Message-id <1313116591.64.0.0484807939011.issue11230@psf.upfronthosting.co.za>
In-reply-to
Content
How does this work for modules whose filesystem names differ from the name used for import? The issue I'm thinking about is that the Mac HFS+ filesystem keeps its Unicode filenames in (close to) NFD form. That means that a module named "caf\N{LATIN SMALL LETTER E WITH ACUTE}", with 4 graphemes and 4 code points in its name, winds up in the filesystem as "cafe\N{COMBINING ACUTE ACCENT}", still with 4 graphemes but now with 5 code points.
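
For illustration, here is a minimal sketch of the code-point counts described above. It assumes that standard NFD from the unicodedata module is a fair stand-in for the "(close to) NFD" form that HFS+ uses; the variable names are made up for the example.

```python
import unicodedata

# The name as a user would type it: precomposed e-acute, 4 code points.
composed = "caf\N{LATIN SMALL LETTER E WITH ACUTE}"

# Assumption: standard NFD approximates what HFS+ stores on disk.
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed))    # 4
print(len(decomposed))  # 5
print(decomposed == "cafe\N{COMBINING ACUTE ACCENT}")  # True

# The two spellings are different strings unless both are normalized first.
print(composed == decomposed)  # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Comparing names only after normalizing both sides to the same form is one way to sidestep the mismatch.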

I believe (well, suspect; I have empirical evidence, not proof) that Python stores its own identifiers in NFD, so this may not be quite as much of a problem as it might otherwise be. Nonetheless, I have had users complain about what HFS+ does with such filenames, although I am not quite sure why. I think it’s because they access a file with 4 chars but need a 5-char fileglob to wildcard it: touch "caf\N{LATIN SMALL LETTER E WITH ACUTE}" and then you need a wildcard of "?????", with an extra ?, to find it. Kinda weird.
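
The wildcard surprise can be sketched with Python's fnmatch module, where "?" matches exactly one code point. This is only an approximation: a real shell's globbing may behave differently depending on locale, and standard NFD is assumed here to stand in for the form HFS+ stores. The names typed_name and stored_name are hypothetical.

```python
import fnmatch
import unicodedata

# What the user typed when creating the file: 4 code points.
typed_name = "caf\N{LATIN SMALL LETTER E WITH ACUTE}"

# Assumption: the filesystem hands back a decomposed (NFD-like) name.
stored_name = unicodedata.normalize("NFD", typed_name)

# "?" matches one code point, so the 4-char pattern misses the stored name.
matches_four = fnmatch.fnmatchcase(stored_name, "????")
matches_five = fnmatch.fnmatchcase(stored_name, "?????")

print(matches_four)  # False
print(matches_five)  # True
```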