classification
Title: test_imp fails on OS X; filename normalization issue.
Type: behavior Stage: resolved
Components: macOS, Unicode Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: flox Nosy List: benjamin.peterson, brett.cannon, ezio.melotti, flox, mark.dickinson, ned.deily
Priority: release blocker Keywords: patch

Created on 2010-03-13 20:02 by mark.dickinson, last changed 2010-03-23 11:50 by flox. This issue is now closed.

Files
File name Uploaded Description
issue8133_test_imp.diff flox, 2010-03-20 17:27 Patch, apply to 3.x
Messages (11)
msg101018 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-13 20:02
test_issue5604 from test_imp is currently failing on OS X 10.6 (py3k branch), with the following output:

======================================================================
ERROR: test_issue5604 (__main__.ImportTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_imp.py", line 121, in test_issue5604
    file, filename, info = imp.find_module(temp_mod_name)
ImportError: No module named test_imp_helper_ä

----------------------------------------------------------------------


I think this has something to do with the platform automatically
using NFD normalization for filenames.  Here's an interactive session: 

Python 3.2a0 (py3k:78936, Mar 13 2010, 19:42:52) 
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import imp, unicodedata
>>> fname = 'test' + b'\xc3\xa4'.decode('utf-8')
>>> with open(fname+'.py', 'w') as file: file.write('a = 1\n')
... 
6
>>> imp.find_module(fname)   # expected this to succeed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named testä
>>> imp.find_module(unicodedata.normalize('NFD', fname))
(<_io.TextIOWrapper name=4 encoding='utf-8'>, 'testä.py', ('.py', 'U', 1))


In contrast, a simple 'open' doesn't seem to care about normalization:

>>> open(fname+'.py')
<_io.TextIOWrapper name='testä.py' encoding='UTF-8'>
[50305 refs]
>>> open(unicodedata.normalize('NFD', fname)+'.py')
<_io.TextIOWrapper name='testä.py' encoding='UTF-8'>
[50305 refs]
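
The difference between the two forms used in the session above can be seen without touching the filesystem. The following is a minimal sketch, not part of the original report; the variable names `nfc` and `nfd` are chosen here for illustration. It shows that the precomposed (NFC) and decomposed (NFD) spellings of the same name are different Python strings with different UTF-8 encodings, which is why a string comparison against a directory entry can fail even though both spellings denote the same text.

```python
import unicodedata

# Precomposed form: 'ä' is the single code point U+00E4.
nfc = unicodedata.normalize('NFC', 'test\u00e4')
# Decomposed form: 'a' followed by the combining diaeresis U+0308.
nfd = unicodedata.normalize('NFD', nfc)

print(len(nfc), len(nfd))      # 5 6
print(nfc == nfd)              # False
print(nfc.encode('utf-8'))     # b'test\xc3\xa4'
print(nfd.encode('utf-8'))     # b'testa\xcc\x88'

# Normalizing both sides to one form makes the comparison succeed.
print(unicodedata.normalize('NFC', nfd) == nfc)   # True
```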
msg101019 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-13 20:08
Also affects 3.1.
msg101180 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-16 19:25
Brett:  any thoughts on this?  Should imp.find_module automatically apply NFD normalization to the given string on OS X?

It seems to me that doing this properly is a bit nasty, since the correct condition isn't that the OS is OS X, but that the relevant filesystem is HFS+;  presumably a single call to imp.find_module could end up checking directories with differing filesystems.
msg101186 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-03-16 20:21
Trying to get this right is nasty, as mixed-filesystem stuff is always tricky, especially since NFD text and NFC text are both valid UTF-8, so sys.getdefaultencoding() doesn't help.

Without some way to get that extra bit of info about what form of UTF-8 encoding is being used for the filesystem, I think the test should be modified to use os.listdir() to find the name as encoded by the filesystem and use that as the argument to imp.find_module() instead of assuming the filesystem didn't tweak what it was given.
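
The os.listdir() approach suggested here could look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of the attached patch: the helper name `name_as_stored` is hypothetical, and matching entries by comparing NFC-normalized names is one possible way to recognize the file after the filesystem has rewritten its name.

```python
import os
import tempfile
import unicodedata


def name_as_stored(directory, wanted):
    """Return the entry in `directory` canonically equivalent to `wanted`.

    Hypothetical helper: both names are normalized to NFC before being
    compared, so the entry is found whichever form the filesystem kept.
    Returns None if no entry matches.
    """
    target = unicodedata.normalize('NFC', wanted)
    for entry in os.listdir(directory):
        if unicodedata.normalize('NFC', entry) == target:
            return entry
    return None


with tempfile.TemporaryDirectory() as tmp:
    requested = 'test_imp_helper_\u00e4.py'
    with open(os.path.join(tmp, requested), 'w', encoding='utf-8') as f:
        f.write('a = 1\n')

    # Ask the filesystem which spelling it actually recorded.
    stored = name_as_stored(tmp, requested)
    # Module name in the filesystem's own form, suitable for a lookup
    # that compares names exactly.
    module_name = os.path.splitext(stored)[0]
```

On a filesystem that stores names as given, `stored` equals `requested`; on one that decomposes names, `stored` comes back in the decomposed form, and passing `module_name` to the import lookup avoids the mismatch described in this issue.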
msg101253 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2010-03-18 09:09
Test failing on 3.1.2rc1.  Should this be considered a release blocker?  Perhaps just disable temporarily?
msg101344 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2010-03-19 21:40
(BTW, the problem exists on other versions of OS X, not just 10.6.)
msg101378 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-03-20 16:50
Note: issue #8180 is related to the same NFC/NFD issue.
http://developer.apple.com/mac/library/qa/qa2001/qa1173.html
msg101382 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-03-20 17:27
Could you tell me whether the patch fixes the issue?
msg101383 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-20 17:31
That patch works for me.

(You should probably commit the comment fix in the patch separately though, rather than mixing it up with this issue.)
msg101392 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-03-20 20:04
Patch works for me as well. Go ahead and commit it, Florent, with the comment fix as a separate commit as Mark suggested.
msg101396 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-03-20 20:40
Fixed with r79144 on 3.x and r79146 on 3.1.
History
Date User Action Args
2010-03-23 11:50:09 flox set status: pending -> closed
2010-03-20 20:40:10 flox set status: open -> pending
resolution: accepted -> fixed
messages: + msg101396
stage: commit review -> resolved
2010-03-20 20:04:44 brett.cannon set assignee: brett.cannon -> flox
messages: + msg101392
stage: patch review -> commit review
2010-03-20 17:31:31 mark.dickinson set messages: + msg101383
2010-03-20 17:27:20 flox set files: + issue8133_test_imp.diff
keywords: + patch
messages: + msg101382
stage: patch review
2010-03-20 16:50:22 flox set nosy: + flox
messages: + msg101378
components: + macOS, Unicode
resolution: accepted
2010-03-20 16:42:06 benjamin.peterson link issue8182 superseder
2010-03-19 21:40:05 ned.deily set messages: + msg101344
title: test_imp fails on OS X 10.6; filename normalization issue. -> test_imp fails on OS X; filename normalization issue.
2010-03-19 20:57:08 benjamin.peterson set priority: release blocker
assignee: brett.cannon
2010-03-18 09:09:36 ned.deily set nosy: + ned.deily, benjamin.peterson
messages: + msg101253
2010-03-16 20:21:52 brett.cannon set messages: + msg101186
2010-03-16 19:25:35 mark.dickinson set nosy: + brett.cannon
messages: + msg101180
2010-03-13 20:08:10 mark.dickinson set messages: + msg101019
versions: + Python 3.1
2010-03-13 20:02:24 mark.dickinson create