Title: urlparse normalize URL path
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.5
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: facundobatista, monk.e.boy, orsenthil
Priority: normal Keywords:

Created on 2008-04-08 13:56 by monk.e.boy, last changed 2008-05-21 00:25 by facundobatista. This issue is now closed.

Messages (4)
msg65162 - (view) Author: monk.e.boy (monk.e.boy) Date: 2008-04-08 13:56
  This is my first problem with anything Python :-) and my first issue.

  Doing the following:

  urlparse.urljoin( '', '../../../../path/' )

  urlparse.urljoin( '', '/path/../path/.././path/./' )

These URLs are normalized to in both Firefox and
Google (the google spider would follow these OK)

  I think the documentation could be improved to point at the normpath function and how it solves the above. I blogged a
how to:
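
For illustration, the normpath idea mentioned above might look roughly like the sketch below. It is not part of urlparse: the helper name `normalize_url_path` and the host `example.com` are hypothetical, and it assumes `posixpath.normpath` is an acceptable way to collapse `.` and `..` segments in the path component only.

```python
import posixpath

try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3 name
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2 name


def normalize_url_path(url):
    """Hypothetical helper: collapse '.' and '..' in the path of a URL."""
    parts = urlsplit(url)
    path = parts.path
    if not path:
        return url
    normalized = posixpath.normpath(path)
    # normpath drops a trailing slash; put it back if the input had one.
    if path.endswith('/') and not normalized.endswith('/'):
        normalized += '/'
    # normpath keeps a leading '//' (a POSIX rule); collapse it to one '/'.
    if normalized.startswith('//'):
        normalized = '/' + normalized.lstrip('/')
    return urlunsplit(
        (parts.scheme, parts.netloc, normalized, parts.query, parts.fragment)
    )


# example.com stands in for a real host (an assumption for this sketch).
print(normalize_url_path('http://example.com/../../../../path/'))
print(normalize_url_path('http://example.com/path/../path/.././path/./'))
```

Both calls should print `http://example.com/path/`. The query string and fragment are passed through untouched.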

I hope my bug report is OK. Thanks for all the code :-)
msg66890 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2008-05-16 03:48
Just try it this way.
>>> print urlparse.urljoin('', 'path/../path/.././path/./')

The difference is the initial '/' in the second argument.
Human interpretation is:
Go to and 1) go to the path directory, 2) go one level
up (/../), which results in again, 3) go to the path directory, 4)
go one level up (..) (results ), 5) stay in the same
directory (.), 6) go to path, 7) stay there (.).
Final result is

When you start the path with a '/'
>>> print urlparse.urljoin('', '/path/../path/.././path/./')

The RFC (1808) suggests the following.
urlparse.urljoin('http://a/b/c/d','/./g') = <URL:http://a/./g>
The argument is taken as a complete path for the server.
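
A minimal sketch of that distinction, using a hypothetical base URL on `example.com` (not the URL from the report). Dot segments are deliberately left out, since as far as I know their handling in urljoin has differed between Python versions.

```python
try:
    from urllib.parse import urljoin  # Python 3 name
except ImportError:
    from urlparse import urljoin  # Python 2 name

# Hypothetical base URL, chosen only for illustration.
base = 'http://example.com/a/b/'

# No leading '/': the second argument is resolved relative to the base path.
relative = urljoin(base, 'path/')

# Leading '/': the second argument replaces the whole path on the server.
absolute = urljoin(base, '/path/')

print(relative)  # http://example.com/a/b/path/
print(absolute)  # http://example.com/path/
```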

The way to use this would be:

>>> print urlparse.urljoin('', 'path/../path/.././path/./')

This is not a bug and can be closed.
msg66892 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2008-05-16 03:51
Btw, thank you for the exciting report, monk.e.boy. :-)
There are many hidden bugs in urlparse and urllib*. I hope you will have a fun time
finding them (and fixing them too :)

And one general comment. If the bug is valid, the official Python
documentation cannot be made to reference a blog site. Instead, a patch
to fix the Python docs would itself be welcome.
msg67141 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-05-21 00:25
Not a bug...
Date                 User            Action  Args
2008-05-21 00:25:35  facundobatista  set     status: open -> closed; resolution: not a bug; messages: + msg67141; nosy: + facundobatista
2008-05-16 03:51:06  orsenthil       set     messages: + msg66892
2008-05-16 03:48:38  orsenthil       set     nosy: + orsenthil; messages: + msg66890
2008-04-08 13:56:58  monk.e.boy      create