Title: urllib.urlopen("///C|/foo/bar/") IOError: [Errno 22]
Type: behavior Stage: test needed
Components: Library (Lib), Windows Versions: Python 2.6
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: cgohlke, eric.araujo, orsenthil
Priority: normal Keywords: needs review, patch

Created on 2010-01-21 22:55 by cgohlke, last changed 2022-04-11 14:56 by admin. This issue is now closed.

url2pathname.patch cgohlke, 2010-01-21 22:55 patch
test_nturl2path.diff cgohlke, 2010-02-16 07:53
Messages (7)
Author: Christoph Gohlke (cgohlke) Date: 2010-01-21 22:55
On Windows 7, Python 2.6 raises an IOError when opening a valid file URL with urllib.urlopen(). A patch to the nturl2path.url2pathname function is attached. It replaces '%7C' with '|' in the URL at the top of the url2pathname function.

Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, urllib
>>> fname = sys.executable
>>> fname
>>> fname = "file:///" + fname.replace('\\', '/').replace(':', '|')
>>> fname
>>> urllib.urlopen(fname)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "x:\python26\lib\", line 87, in urlopen
  File "x:\python26\lib\", line 206, in open
    return getattr(self, name)(url)
  File "x:\python26\lib\", line 468, in open_file
    return self.open_local_file(url)
  File "x:\python26\lib\", line 482, in open_local_file
    raise IOError(e.errno, e.strerror, e.filename)
IOError: [Errno 22] The filename, directory name, or volume label syntax is incorrect: '\\x|\\python26\\python.exe'
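
The idea behind the attached patch can be sketched roughly as follows. This is only an illustration, not the patch itself: `restore_drive_pipe` is a hypothetical helper name, and the input string is simply the title URL with '|' already percent-encoded as '%7C'.

```python
def restore_drive_pipe(url):
    """Hypothetical helper: undo the percent-encoding of '|' in a file URL.

    Mirrors the approach described above (replace '%7C' with '|' before the
    URL is converted to a Windows path). Only the upper-case form '%7C' is
    handled, which is an assumption made for this sketch.
    """
    return url.replace('%7C', '|')


# The URL as it looks after '|' has been percent-encoded.
quoted = "///C%7C/foo/bar/"
restored = restore_drive_pipe(quoted)
print(restored)
```

With the '|' restored, the drive-letter form `C|` is visible again to whatever converts the URL into a local path.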
Author: Éric Araujo (eric.araujo) Date: 2010-02-16 05:13

I’m no expert on Python bugs, but I believe a test for the fixed behavior would improve your patch. Maybe a link to some documentation in a comment (or in an entry in Misc/NEWS) too.

Author: Christoph Gohlke (cgohlke) Date: 2010-02-16 07:53
A testcase is attached.

Information about the file URI scheme can be found at:
Author: Senthil Kumaran (orsenthil) Date: 2010-02-20 22:07
cgohlke, thanks for the patches, and sorry for the delay. The fix, however, is not to replace the %HH encoding of '|' with '|' in nturl2path, but to keep '|' as a safe character in urllib.urlopen.

-        fullurl = quote(fullurl, safe="%/:=&?~#+!$,;'@()*[]")
+        fullurl = quote(fullurl, safe="%/:=&?~#+!$,;'@()*[]|")

Fixed in revision 78268, with tests added.
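
The effect of the one-character change can be checked in isolation with `quote` alone. The snippet below is a minimal sketch rather than the committed test: the two safe strings are copied from the diff above, the names `OLD_SAFE` and `NEW_SAFE` are made up for the illustration, and the import is written so that it also runs on Python 3, where `quote` lives in `urllib.parse`.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

# Safe-character sets taken from the diff: before and after the fix.
OLD_SAFE = "%/:=&?~#+!$,;'@()*[]"
NEW_SAFE = OLD_SAFE + "|"

url = "file:///C|/foo/bar/"

before = quote(url, safe=OLD_SAFE)  # '|' is not safe, so it gets encoded
after = quote(url, safe=NEW_SAFE)   # '|' is safe, so it is left alone

print(before)  # file:///C%7C/foo/bar/
print(after)   # file:///C|/foo/bar/
```

Nothing here is specific to Windows; the difference lies only in whether `quote` leaves '|' untouched.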
Author: Éric Araujo (eric.araujo) Date: 2010-02-20 23:49

I’m no Windows expert but from what I know I’m puzzled by the test and the fix that have been committed. I thought the path C:\something would correspond to the URI file:///C|/something. Could you enlighten me?

Author: Senthil Kumaran (orsenthil) Date: 2010-02-21 04:00
The problem showed up on Windows because that is where '|' normally appears in URLs. Because '|' was not a safe character (that is, one that may appear literally in the URL), the open method translated it to %7C. Christoph's patch converted it back to '|' later.
I thought about it a bit and added '|' to the safe characters instead.
Also, the tests are not specific to Windows; they check that the open method leaves the safe characters unquoted.
Author: Éric Araujo (eric.araujo) Date: 2010-02-21 10:35
Ok, thanks for clarifying :)

