urlparse normalize URL path #46835

monkeboy · 2008-04-08T13:56:58Z

BPO	2583
Nosy	@facundobatista, @orsenthil

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2008-05-21.00:25:35.668>
created_at = <Date 2008-04-08.13:56:58.156>
labels = ['invalid', 'type-bug', 'library']
title = 'urlparse normalize URL path'
updated_at = <Date 2008-05-21.00:25:35.506>
user = 'https://bugs.python.org/monkeboy'

bugs.python.org fields:

activity = <Date 2008-05-21.00:25:35.506>
actor = 'facundobatista'
assignee = 'none'
closed = True
closed_date = <Date 2008-05-21.00:25:35.668>
closer = 'facundobatista'
components = ['Library (Lib)']
creation = <Date 2008-04-08.13:56:58.156>
creator = 'monk.e.boy'
dependencies = []
files = []
hgrepos = []
issue_num = 2583
keywords = []
message_count = 4.0
messages = ['65162', '66890', '66892', '67141']
nosy_count = 3.0
nosy_names = ['facundobatista', 'orsenthil', 'monk.e.boy']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue2583'
versions = ['Python 2.5']

monkeboy · 2008-04-08T13:56:57Z

Hi,
This is my first problem with anything Python :-) and my first issue.

Doing the following:

  urlparse.urljoin( 'http://site.com/', '../../../../path/' )
  'http://site.com/../../../../path/'

  urlparse.urljoin( 'http://site.com/', '/path/../path/.././path/./' )
  'http://site.com/path/../path/.././path/./'

Both Firefox and Google normalize these URLs to http://site.com/path/
(the Google spider would follow them OK).

I think the documentation could be improved to point at the normpath
function in posixpath.py and explain how it solves the above. I blogged
a how-to:

http://teethgrinder.co.uk/blog/Normalize-URL-path-python/
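
For reference, here is a minimal sketch of the posixpath.normpath approach mentioned above. It assumes Python 3, where the urlparse module is named urllib.parse. The helper name normalize_url_path is made up for illustration, and this is not necessarily what the blog post does. Note that normpath also collapses repeated slashes, which may not be appropriate for every URL.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_url_path(url):
    """Collapse '.' and '..' segments in the path of an absolute URL.

    Hypothetical helper, not part of the standard library.
    """
    parts = urlsplit(url)
    if not parts.path:
        return url  # nothing to normalize
    normalized = posixpath.normpath(parts.path)
    # normpath strips a trailing slash; restore it, since it is
    # significant in URLs.
    if parts.path.endswith('/') and not normalized.endswith('/'):
        normalized += '/'
    return urlunsplit(
        (parts.scheme, parts.netloc, normalized, parts.query, parts.fragment)
    )


print(normalize_url_path('http://site.com/../../../../path/'))
print(normalize_url_path('http://site.com/path/../path/.././path/./'))
```

Both calls should print http://site.com/path/, because normpath drops '..' segments that would climb above the root of an absolute path.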

I hope my bug report is OK. Thanks for all the code :-)

johng@neutralize.com

orsenthil · 2008-05-16T03:48:35Z

Just try it this way.
>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

The difference is the initial '/' in the second argument.
Human interpretation is:
Go to http://site.com/ and
1) go to the path directory
2) go one level up (..), which results in site.com again
3) go to the path directory
4) go one level up (..), which results in site.com again
5) stay in the same directory (.)
6) go to path
7) stay there (.)
The final result is http://site.com/path/
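
The walk described above can be sketched as a small stack-based loop. This is only an illustration of the reasoning, written for Python 3; it is not the code urlparse uses, and the name walk_relative_path is hypothetical.

```python
def walk_relative_path(relative, start=()):
    """Return the directory reached after each segment of `relative`.

    `start` holds the segments of the starting directory
    (empty means the site root).
    """
    stack = list(start)
    trace = []
    for segment in relative.split('/'):
        if segment == '':
            continue  # empty piece from a leading, trailing or doubled '/'
        if segment == '.':
            pass  # stay in the same directory
        elif segment == '..':
            if stack:
                stack.pop()  # go one level up, but never above the root
        else:
            stack.append(segment)  # descend into the named directory
        trace.append('/' + '/'.join(stack))
    return trace


for step, location in enumerate(
        walk_relative_path('path/../path/.././path/./'), start=1):
    print(step, location)
```

The seven printed locations are /path, /, /path, /, /, /path and /path, matching the seven steps listed above.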

When you start the path with a '/'
>>> print urlparse.urljoin('http://site.com/', '/path/../path/.././path/./')
http://site.com/path/../path/.././path/./

The RFC (1808) suggests the following:
urlparse.urljoin('http://a/b/c/d', '/./g') = <URL:http://a/./g>
An argument that starts with '/' is taken as a complete path on the server.
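
To make the distinction concrete, here is a small sketch that labels a reference by its form before joining it. It assumes Python 3 (urllib.parse); classify_reference is a hypothetical helper and the labels borrow RFC 3986 terminology. No expected output is shown for the reference with a leading '/', because it may differ from the Python 2.5 output quoted in this thread: later versions of urljoin follow RFC 3986 and remove dot segments in that case as well.

```python
from urllib.parse import urljoin, urlsplit


def classify_reference(ref):
    """Label a URL reference by its form (hypothetical helper)."""
    parts = urlsplit(ref)
    if parts.scheme:
        return 'absolute URL'
    if parts.netloc:
        return 'network-path reference'
    if parts.path.startswith('/'):
        return 'absolute-path reference'
    return 'relative-path reference'


base = 'http://site.com/'
for ref in ('path/../path/.././path/./', '/path/../path/.././path/./'):
    # The joined result for the second reference depends on the Python version.
    print(classify_reference(ref), '->', urljoin(base, ref))
```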

The way to use this would be:

>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

This is not a bug and can be closed.

orsenthil · 2008-05-16T03:51:06Z

Btw, thank you for the exciting report, monk.e.boy. :-)
There are many hidden bugs in urlparse and urllib*. I hope you will have
fun finding them (and fixing them too :)

And one general comment: even if the bug were valid, the official Python
documentation cannot reference a blog site. Instead, a patch to fix the
Python docs would be welcome.

facundobatista · 2008-05-21T00:25:20Z

Not a bug...

monkeboy mannequin added the stdlib and type-bug labels Apr 8, 2008

facundobatista closed this as completed May 21, 2008

facundobatista added the invalid label May 21, 2008

ezio-melotti transferred this issue from another repository Apr 10, 2022
