classification
Title: os.path.normpath doesn't preserve unicode
Type: behavior Stage: resolved
Components: Library (Lib), Unicode Versions: Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: ezio.melotti, kcwu, loewis, mgiuca, sandberg
Priority: normal Keywords: patch

Created on 2009-04-24 04:23 by mgiuca, last changed 2010-01-12 04:07 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description
normpath.patch mgiuca, 2009-04-24 04:23 (Obsolete) Fix for posixpath.normpath and ntpath.normpath.
normpath.2.patch mgiuca, 2009-11-19 00:39 Fix for posixpath.normpath and ntpath.normpath.
Messages (7)
msg86395 - Author: Matt Giuca (mgiuca) Date: 2009-04-24 04:23
In the Python 2.x branch, os.path.normpath will sometimes return a str
even when given a unicode string. This is not an issue in the Python 3.0
branch.

This happens specifically when it discards the input string entirely and
constructs the result from its own literal:

>>> os.path.normpath(u'')
'.'
>>> os.path.normpath(u'.')
'.'
>>> os.path.normpath(u'/')
'/'

This is a problem when working with code that expects all strings to be
unicode strings (some functions raise an exception if given a str when
they expect a unicode).

I have attached patches (with test cases) for posixpath and ntpath which
correctly preserve the unicode-ness of the input string, such that the
new behaviour is:

>>> os.path.normpath(u'')
u'.'
>>> os.path.normpath(u'.')
u'.'
>>> os.path.normpath(u'/')
u'/'

I tried it on os2emxpath and plat-riscos/riscospath (the other two
OS-specific path modules), and it already worked fine for them.
Therefore, this patch fixes all necessary OS-specific versions of os.path.
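
One way to preserve the input type, which may or may not match the attached patch in detail, is to choose the separator and dot literals according to the type of the argument. The following is a simplified sketch under stated assumptions: `normpath_sketch` and `_literals_for` are hypothetical names, only POSIX-style paths are handled, and the leading double-slash special case is ignored. `text_type = type(u'')` is used so that the sketch runs both on Python 2 (where it is `unicode`) and on recent Python 3 (where the corresponding split is `str` versus `bytes`).

```python
# Illustrative sketch only; not the code from normpath.patch.
text_type = type(u'')  # unicode on Python 2, str on Python 3


def _literals_for(path):
    """Return (slash, dot) literals of the same string type as path."""
    if isinstance(path, text_type):
        return u'/', u'.'
    return b'/', b'.'


def normpath_sketch(path):
    """Simplified POSIX-style normpath that keeps the input's string type."""
    slash, dot = _literals_for(path)
    empty, dotdot = slash[:0], dot * 2
    if path == empty:
        return dot
    is_abs = path.startswith(slash)
    kept = []
    for comp in path.split(slash):
        if comp in (empty, dot):
            continue  # drop empty components and '.'
        if comp == dotdot and kept and kept[-1] != dotdot:
            kept.pop()  # 'a/..' cancels out
        elif comp == dotdot and is_abs:
            continue  # '/..' is the same as '/'
        else:
            kept.append(comp)
    result = slash.join(kept)
    if is_abs:
        result = slash + result
    # Every literal used above has the input's type, so the result does too.
    return result or dot
```

With this approach the three cases from the report (`u''`, `u'.'` and `u'/'`) all come back as text strings, because no hard-coded byte-string literal is ever returned for text input.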
msg95223 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2009-11-14 01:10
Thanks for the patch, I tried it on Linux and it seems to solve the problem.

A few comments about it:
1) I'd change all the self.assertEqual(type(posixpath.normpath(u"")), unicode)
calls to self.assertTrue(isinstance(posixpath.normpath(u""), unicode));
2) a test for normpath(u'.') should probably be added;
3) in ntpath.py the 'slash' is actually a backslash, so the variable
should be renamed.
msg95459 - Author: Matt Giuca (mgiuca) Date: 2009-11-19 00:39
Thanks Ezio.

I've updated the patch to incorporate your suggestions.

Note that I too have only tested it on Linux, but I tested both
posixpath and ntpath (and there is no OS-specific code, except for the
filenames themselves).

I'm not sure if using assertTrue(isinstance ...) is better than
assertEqual(type ...), because the type equality checking produces this
error:
AssertionError: <type 'str'> != <type 'unicode'>
while isinstance produces this unhelpful error:
AssertionError: False is not True

But oh well, I made the change anyway as most test cases use isinstance.
msg95490 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2009-11-19 16:42
assertTrue() also accepts a 'msg' argument, which can be used to explain
what went wrong in case of failure [1].

[1]:
http://docs.python.org/library/unittest.html#unittest.TestCase.assertTrue
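
As a hedged illustration of the 'msg' suggestion, the test below passes an explicit message to assertTrue() so that a failure reports the offending type instead of "False is not True". The class and method names are hypothetical and this is not the test from the patch. `text_type = type(u'')` stands in for `unicode` so the example also runs on Python 3.

```python
import os.path
import unittest

text_type = type(u'')  # unicode on Python 2, str on Python 3


class NormpathTypeTest(unittest.TestCase):
    # Hypothetical test case, written only to show the 'msg' argument.
    def test_text_input_gives_text_output(self):
        for path in (u'', u'.', u'/'):
            result = os.path.normpath(path)
            self.assertTrue(
                isinstance(result, text_type),
                msg='normpath(%r) returned %r, expected %r'
                    % (path, type(result), text_type),
            )
```

Python 2.7 and 3.2 later added assertIsInstance(), which produces a descriptive failure message without a hand-written 'msg'.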
msg95493 - Author: Erik Carstensen (sandberg) Date: 2009-11-19 17:01
Also, assertTrue has an alias failUnless which I personally find more
descriptive (I don't know if either form is preferred for inclusion in
Python though).
msg95495 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2009-11-19 17:06
failUnless is deprecated as of Python 3.1 [1]. The assert* methods are
preferred over the fail* ones, which are now deprecated.

[1]:
http://docs.python.org/3.1/library/unittest.html#unittest.TestCase.failUnless
msg97621 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2010-01-12 04:07
Fixed in r77442 (trunk) and r77443 (release26-maint), thanks!
History
Date User Action Args
2010-01-12 04:07:34 ezio.melotti set status: open -> closed; resolution: fixed; messages: + msg97621; stage: patch review -> resolved
2010-01-07 23:52:48 ezio.melotti set assignee: ezio.melotti
2009-11-19 17:06:52 ezio.melotti set messages: + msg95495
2009-11-19 17:01:42 sandberg set messages: + msg95493
2009-11-19 16:42:47 ezio.melotti set messages: + msg95490
2009-11-19 00:39:29 mgiuca set files: + normpath.2.patch; messages: + msg95459
2009-11-14 01:10:35 ezio.melotti set priority: normal; nosy: + sandberg, loewis, kcwu, ezio.melotti; messages: + msg95223; stage: patch review
2009-11-14 00:38:19 ezio.melotti link issue6450 superseder
2009-04-24 04:24:22 mgiuca set type: behavior
2009-04-24 04:23:57 mgiuca create