msg153512 - (view) |
Author: zulla (zulla) |
Date: 2012-02-16 23:40 |
The "port" component of a URL is not properly be sanitized or validated. This may lead to the evasion of netloc/hostname based filters or exceptions.
|
msg153522 - (view) |
Author: zulla (zulla) |
Date: 2012-02-17 01:27 |
The "port" and "netloc" component of a ParsedResult-object is not properly sanitized or validated. This may lead to bypass-able hostname-based filters. Remote Crash vulnerabilities be be also possible.
|
msg153523 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2012-02-17 01:45 |
Did you upload urlparse.py to the issue by accident?
Can you please provide some examples of where you think the current code is producing incorrect results?
|
msg153524 - (view) |
Author: zulla (zulla) |
Date: 2012-02-17 01:47 |
Hi. No, it's a patched version. It won't crash under circumstances like that [1] and won't succeed with invalid input:
>>> import urlparse
>>> urlparse.urlparse("http://www.google.com:foo")
ParseResult(scheme='http', netloc='www.google.com:foo', path='', params='', query='', fragment='')
>>> urlparse.urlparse("http://www.google.com:foo").port
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 105, in port
port = int(netloc.split(':')[1], 10)
ValueError: invalid literal for int() with base 10: 'foo'
>>>
|
msg153525 - (view) |
Author: zulla (zulla) |
Date: 2012-02-17 01:54 |
Whops. I forgot an int() :-)
Here's the right patch.
|
msg153527 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2012-02-17 02:05 |
It's not a patch if it is the whole file. A diff would be much more useful, since then we could see the changes easily.
This kind of change would require a bit of discussion. I'm doubtful that it would be applied as a bug fix, and we might even want the validation to be optional and not the default. Part of the issue is that urlparse was originally based on the older standards, as you can see from the docstring.
You may find others to agree with you, but personally it doesn't look to me like this rises to the level of a security issue.
|
msg153528 - (view) |
Author: zulla (zulla) |
Date: 2012-02-17 02:18 |
I understand your point of view, but I disagree.
Various libraries and projects rely on urlparse.urlparse and urllib.parse.urlparse.
This bug just blew up in my face. I'm working with Cython and PyQt4.
When a developer relies on ParseResult().netloc being a valid netloc, and .port being None [bool(False)] or a integer between 1-65535 really bad things can happen in a environment that has 0-tolerance for security issues (like C/C++ mixed in python).
I agree that the
if self.scheme == "http":
return 80
elif self.scheme == "https":
[...]
part of my patch is debetable, but we should _at least_ ensure that IF there is a ParseResult().port, the developer can be sure that it is a valid port between 1-65545.
i apologize for upload the whole file; i attached the diff now.
regards,
dan
|
msg153545 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2012-02-17 13:58 |
Could you provide some failing examples?
The suggestion also seems to run slightly at odds with itself - in one part, silently replacing an invalid port specification with a different value, in another adding additional validation checks.
Also, rather than hardcoding default port numbers for selected protocols, it would make more sense to just look them up via socket.getservbyname(scheme) (and still return None if the scheme isn't recognised). However, I'll need to think about that one for a while - urlparse is designed to be almost purely a string *parsing* library. Looking up default port numbers if one isn't specified really isn't its job. (For one thing, you'd break the ability to exactly recreate the original URL text from the parsed version).
There should definitely by a try/except around that conversion to int(), though - it's just that I'm not yet sure what an appropriate return value is at that point. The raw port string? None? Should there be a separate "raw_port" descriptor that always returns some kind of string, with the port descriptor promising to always return a valid 16-bit port number or None?
|
msg154833 - (view) |
Author: zulla (zulla) |
Date: 2012-03-03 12:38 |
>>> u("http://www.google.com:99999999999999999999999999").port
99999999999999999999999999L
|
msg155204 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-03-09 03:07 |
Couple of points:
1. On your last example, which webserver treats 'L' as part of port number? I can't of anything.
2. Can you write a "real application" which is listening to beyond 65535? Which platform would it be?
Current way of handling invalid port like, int('foo') by raising ValueError seems to be a better than returning a None. A better error message could be desirable, but that does not make it a security issue.
Additionally, for the example of int changing long integer to 'L' appended one would a 2.x case as it is no longer the behavior in 3.x
Also, I would advice to look at getPort function in a C library or a Java library and see what it does. The only difference I see is, they return -1 where Python returns None.
I am changing the request type to an enhancement, because there is not a valid argument to support that it is a security issue.
|
msg161275 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-05-21 14:51 |
I am not sure if anything should be done to this request. Saying that
int("99999999999999999999999999",10) is converting to 99999999999999999999999999L in Python2.7 it is a bug/security issue is hypothetical. Practically, such high port numbers cannot exist.
I suggest, we close this issue as won't fix.
Ezio, I noticed that you changed from pending to open. If you had any ideas for this issue, please share, otherwise you can consider closing this too.
Thanks!
|
msg161276 - (view) |
Author: zulla (zulla) |
Date: 2012-05-21 15:06 |
Your comment is completely senseless, sorry.
Of course such high port numbers do not exist.
An attacker is counting on that. Imagine something like that
pass_to_cython(urlparse("http://google.de:999999**999999[to be calculated]").port)
|
msg161277 - (view) |
Author: zulla (zulla) |
Date: 2012-05-21 15:09 |
we should at least check if the .port attribute is an intereger >= 1 and <= 65535. _because_ this is the only valid port range. otherwise, it is no valid port. but it may be a integer overflow attack attempt
when a developer uses .port, he is counting on the result being valid
|
msg161278 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-05-21 15:23 |
pass_to_cython(urlparse("http://google.de:999999**999999[to be calculated]").port)
is no different than sending
pass_to_cython(999999**999999[to be calculated])
In that case, would the former make a security loop hole in urlparse? Looks pretty contrived to me as an example for .port bug.
However, I agree with one point in your assertion, namely that port be checked that it is within the range integer >= 1 and <= 65535. If it is not, return None as a response in port.
|
msg161279 - (view) |
Author: Ezio Melotti (ezio.melotti) * |
Date: 2012-05-21 15:32 |
> Ezio, I noticed that you changed from pending to open.
That was an accident, I just meant to add my self to the nosy.
I didn't have time yet to read all the messages and comment on the issue.
|
msg161506 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-05-24 13:57 |
New changeset 988903cf24c5 by Senthil Kumaran in branch '2.7':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/988903cf24c5
New changeset d769e64aed79 by Senthil Kumaran in branch '3.2':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/d769e64aed79
New changeset b4d257c64db7 by Senthil Kumaran in branch 'default':
Issue #14036: return None when port in urlparse cross 65535
http://hg.python.org/cpython/rev/b4d257c64db7
|
msg161507 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-05-24 13:59 |
This is taken care. I was not really convinced on the need as likely seemed a non issue from "urlparse" standpoint, But still there is no harm in returning valid port as semantically the attribute stands for a port.
Thanks!
|
msg235661 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-02-10 03:16 |
See Issue 20059 proposing to change this to raise ValueError
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:26 | admin | set | github: 58244 |
2015-02-10 03:16:40 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg235661
|
2012-05-24 13:59:22 | orsenthil | set | status: open -> closed type: enhancement -> behavior messages:
+ msg161507
assignee: orsenthil resolution: fixed stage: resolved |
2012-05-24 13:57:55 | python-dev | set | nosy:
+ python-dev messages:
+ msg161506
|
2012-05-21 15:32:01 | ezio.melotti | set | messages:
+ msg161279 |
2012-05-21 15:23:16 | orsenthil | set | messages:
+ msg161278 |
2012-05-21 15:09:21 | zulla | set | messages:
+ msg161277 |
2012-05-21 15:06:40 | zulla | set | messages:
+ msg161276 |
2012-05-21 14:51:03 | orsenthil | set | messages:
+ msg161275 |
2012-05-19 14:17:33 | ezio.melotti | set | status: pending -> open nosy:
+ ezio.melotti
|
2012-03-09 03:07:01 | orsenthil | set | status: open -> pending type: security -> enhancement messages:
+ msg155204
|
2012-03-03 12:38:43 | zulla | set | messages:
+ msg154833 |
2012-02-17 13:58:32 | ncoghlan | set | nosy:
+ ncoghlan
messages:
+ msg153545 versions:
- Python 3.1, Python 3.4 |
2012-02-17 02:18:54 | zulla | set | files:
+ urlparse.diff keywords:
+ patch messages:
+ msg153528
|
2012-02-17 02:05:52 | r.david.murray | set | nosy:
+ orsenthil messages:
+ msg153527
|
2012-02-17 01:54:33 | zulla | set | files:
- urlparse.py |
2012-02-17 01:54:09 | zulla | set | files:
+ urlparse.py
messages:
+ msg153525 |
2012-02-17 01:47:03 | zulla | set | messages:
+ msg153524 |
2012-02-17 01:45:08 | r.david.murray | set | nosy:
+ r.david.murray messages:
+ msg153523
|
2012-02-17 01:27:49 | zulla | set | messages:
+ msg153522 |
2012-02-16 23:40:38 | zulla | create | |