urlparse normalize URL path #46835

Closed
monkeboy mannequin opened this issue Apr 8, 2008 · 4 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@monkeboy
Mannequin

monkeboy mannequin commented Apr 8, 2008

BPO 2583
Nosy @facundobatista, @orsenthil

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2008-05-21.00:25:35.668>
created_at = <Date 2008-04-08.13:56:58.156>
labels = ['invalid', 'type-bug', 'library']
title = 'urlparse normalize URL path'
updated_at = <Date 2008-05-21.00:25:35.506>
user = 'https://bugs.python.org/monkeboy'

bugs.python.org fields:

activity = <Date 2008-05-21.00:25:35.506>
actor = 'facundobatista'
assignee = 'none'
closed = True
closed_date = <Date 2008-05-21.00:25:35.668>
closer = 'facundobatista'
components = ['Library (Lib)']
creation = <Date 2008-04-08.13:56:58.156>
creator = 'monk.e.boy'
dependencies = []
files = []
hgrepos = []
issue_num = 2583
keywords = []
message_count = 4.0
messages = ['65162', '66890', '66892', '67141']
nosy_count = 3.0
nosy_names = ['facundobatista', 'orsenthil', 'monk.e.boy']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue2583'
versions = ['Python 2.5']

@monkeboy
Mannequin Author

monkeboy mannequin commented Apr 8, 2008

Hi,
This is my first problem with anything Python :-) and my first issue.

Doing the following:

  urlparse.urljoin( 'http://site.com/', '../../../../path/' )
  'http://site.com/../../../../path/'

  urlparse.urljoin( 'http://site.com/', '/path/../path/.././path/./' )
  'http://site.com/path/../path/.././path/./'

These URLs are normalized to http://site.com/path/ in both Firefox and
Google (the google spider would follow these OK)

I think the documentation could be improved to point at the
posixpath.py normpath function and how it solves the above. I blogged a
how to:

http://teethgrinder.co.uk/blog/Normalize-URL-path-python/

I hope my bug report is OK. Thanks for all the code :-)

johng@neutralize.com
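
For illustration, a minimal sketch of the normpath approach mentioned above. It is written for Python 3, where the `urlparse` module lives on as `urllib.parse`. The helper name `normalize_url_path` is hypothetical and not part of the standard library. The sketch assumes POSIX-style normalization is acceptable for the URL path, and it restores the trailing slash that `posixpath.normpath` strips.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_url_path(url):
    """Collapse '.' and '..' segments in the path of an absolute URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    path = parts.path
    # An empty path is treated as the site root.
    normalized = posixpath.normpath(path) if path else "/"
    # normpath drops a trailing slash, which is significant in URLs.
    if path.endswith("/") and not normalized.endswith("/"):
        normalized += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, normalized, parts.query, parts.fragment)
    )


print(normalize_url_path("http://site.com/../../../../path/"))
# http://site.com/path/
print(normalize_url_path("http://site.com/path/../path/.././path/./"))
# http://site.com/path/
```

Applying the helper to the result of `urljoin` keeps the joining and the normalizing steps separate, so the sketch does not depend on how a particular Python version resolves dot segments.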

@monkeboy monkeboy mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Apr 8, 2008
@orsenthil
Member

Just try it this way.
>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

The difference is the initial '/' in the second argument.
Human interpretation is:
Go to http://site.com/ and 1) go to the path directory 2) go one level
up (..), which results in site.com again 3) go to the path directory 4)
go one level up (..), which again results in site.com 5) stay in the same
directory (.) 6) go to path 7) stay there (.)
Final result is http://site.com/path/
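
The walk described above can be sketched with a simple stack. This is only an illustration of the human interpretation, not the code `urljoin` uses. The function name `walk_segments` is hypothetical, and the sketch assumes the base URL is the site root, as in http://site.com/.

```python
def walk_segments(reference):
    """Resolve a relative reference against the site root, step by step.

    Hypothetical helper for illustration only.
    """
    stack = []
    segments = reference.split("/")
    for segment in segments:
        if segment in ("", "."):
            continue  # stay in the same directory
        if segment == "..":
            if stack:
                stack.pop()  # go one level up, never above the site root
        else:
            stack.append(segment)  # go into the named directory
    path = "/" + "/".join(stack)
    # Keep the trailing slash when the reference ends in a directory.
    if stack and segments[-1] in ("", ".", ".."):
        path += "/"
    return path


print(walk_segments("path/../path/.././path/./"))
# /path/
```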

When you start the path with a '/'
>>> print urlparse.urljoin('http://site.com/', '/path/../path/.././path/./')
http://site.com/path/../path/.././path/./

The RFC (1808) suggests the following.
urlparse.urljoin('http://a/b/c/d','/./g') = <URL:http://a/./g>
The argument is taken as a complete path for the server.

The way to use this would be:

>>> print urlparse.urljoin('http://site.com/', 'path/../path/.././path/./')
http://site.com/path/
>>>

This is not a bug and can be closed.

@orsenthil
Member

Btw, thank you for the exciting report, monk.e.boy. :-)
There are many hidden bugs in urlparse, urllib*. I hope you will have a fun
time finding them (and fixing them too :)

And one general comment. Even if the bug is valid, the official Python
documentation cannot reference a blog site. Instead, a patch
to fix the Python docs would itself be welcome.

@facundobatista
Member

Not a bug...

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022