Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

urlparse insufficient port property validation #58244

Closed
zulla mannequin opened this issue Feb 16, 2012 · 18 comments
Closed

urlparse insufficient port property validation #58244

zulla mannequin opened this issue Feb 16, 2012 · 18 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@zulla
Copy link
Mannequin

zulla mannequin commented Feb 16, 2012

BPO 14036
Nosy @ncoghlan, @orsenthil, @ezio-melotti, @bitdancer, @vadmium
Files
  • urlparse.py
  • urlparse.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = <Date 2012-05-24.13:59:22.613>
    created_at = <Date 2012-02-16.23:40:38.832>
    labels = ['type-bug', 'library']
    title = 'urlparse insufficient port property validation'
    updated_at = <Date 2015-02-10.03:16:40.936>
    user = 'https://bugs.python.org/zulla'

    bugs.python.org fields:

    activity = <Date 2015-02-10.03:16:40.936>
    actor = 'martin.panter'
    assignee = 'orsenthil'
    closed = True
    closed_date = <Date 2012-05-24.13:59:22.613>
    closer = 'orsenthil'
    components = ['Library (Lib)']
    creation = <Date 2012-02-16.23:40:38.832>
    creator = 'zulla'
    dependencies = []
    files = ['24540', '24541']
    hgrepos = []
    issue_num = 14036
    keywords = ['patch']
    message_count = 18.0
    messages = ['153512', '153522', '153523', '153524', '153525', '153527', '153528', '153545', '154833', '155204', '161275', '161276', '161277', '161278', '161279', '161506', '161507', '235661']
    nosy_count = 7.0
    nosy_names = ['ncoghlan', 'orsenthil', 'ezio.melotti', 'r.david.murray', 'python-dev', 'martin.panter', 'zulla']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14036'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Feb 16, 2012

    The "port" component of a URL is not properly be sanitized or validated. This may lead to the evasion of netloc/hostname based filters or exceptions.

    @zulla zulla mannequin added type-security A security issue stdlib Python modules in the Lib dir labels Feb 16, 2012
    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Feb 17, 2012

    The "port" and "netloc" component of a ParsedResult-object is not properly sanitized or validated. This may lead to bypass-able hostname-based filters. Remote Crash vulnerabilities be be also possible.

    @bitdancer
    Copy link
    Member

    Did you upload urlparse.py to the issue by accident?

    Can you please provide some examples of where you think the current code is producing incorrect results?

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Feb 17, 2012

    Hi. No, it's a patched version. It won't crash under circumstances like that [1] and won't succeed with invalid input:

    >>> import urlparse
    >>> urlparse.urlparse("http://www.google.com:foo")
    ParseResult(scheme='http', netloc='www.google.com:foo', path='', params='', query='', fragment='')
    >>> urlparse.urlparse("http://www.google.com:foo").port
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 105, in port
        port = int(netloc.split(':')[1], 10)
    ValueError: invalid literal for int() with base 10: 'foo'
    >>>

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Feb 17, 2012

    Whops. I forgot an int() :-)

    Here's the right patch.

    @bitdancer
    Copy link
    Member

    It's not a patch if it is the whole file. A diff would be much more useful, since then we could see the changes easily.

    This kind of change would require a bit of discussion. I'm doubtful that it would be applied as a bug fix, and we might even want the validation to be optional and not the default. Part of the issue is that urlparse was originally based on the older standards, as you can see from the docstring.

    You may find others to agree with you, but personally it doesn't look to me like this rises to the level of a security issue.

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Feb 17, 2012

    I understand your point of view, but I disagree.

    Various libraries and projects rely on urlparse.urlparse and urllib.parse.urlparse.

    This bug just blew up in my face. I'm working with Cython and PyQt4.

    When a developer relies on ParseResult().netloc being a valid netloc, and .port being None [bool(False)] or a integer between 1-65535 really bad things can happen in a environment that has 0-tolerance for security issues (like C/C++ mixed in python).

    I agree that the

    if self.scheme == "http":
        return 80
    elif self.scheme == "https":
        [...]

    part of my patch is debetable, but we should _at least_ ensure that IF there is a ParseResult().port, the developer can be sure that it is a valid port between 1-65545.

    i apologize for upload the whole file; i attached the diff now.

    regards,
    dan

    @ncoghlan
    Copy link
    Contributor

    Could you provide some failing examples?

    The suggestion also seems to run slightly at odds with itself - in one part, silently replacing an invalid port specification with a different value, in another adding additional validation checks.

    Also, rather than hardcoding default port numbers for selected protocols, it would make more sense to just look them up via socket.getservbyname(scheme) (and still return None if the scheme isn't recognised). However, I'll need to think about that one for a while - urlparse is designed to be almost purely a string *parsing* library. Looking up default port numbers if one isn't specified really isn't its job. (For one thing, you'd break the ability to exactly recreate the original URL text from the parsed version).

    There should definitely by a try/except around that conversion to int(), though - it's just that I'm not yet sure what an appropriate return value is at that point. The raw port string? None? Should there be a separate "raw_port" descriptor that always returns some kind of string, with the port descriptor promising to always return a valid 16-bit port number or None?

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented Mar 3, 2012

    >>> u("http://www.google.com:99999999999999999999999999").port
    99999999999999999999999999L

    @orsenthil
    Copy link
    Member

    Couple of points:

    1. On your last example, which webserver treats 'L' as part of port number? I can't of anything.

    2. Can you write a "real application" which is listening to beyond 65535? Which platform would it be?

    Current way of handling invalid port like, int('foo') by raising ValueError seems to be a better than returning a None. A better error message could be desirable, but that does not make it a security issue.

    Additionally, for the example of int changing long integer to 'L' appended one would a 2.x case as it is no longer the behavior in 3.x

    Also, I would advice to look at getPort function in a C library or a Java library and see what it does. The only difference I see is, they return -1 where Python returns None.

    I am changing the request type to an enhancement, because there is not a valid argument to support that it is a security issue.

    @orsenthil orsenthil added type-feature A feature request or enhancement and removed type-security A security issue labels Mar 9, 2012
    @orsenthil
    Copy link
    Member

    I am not sure if anything should be done to this request. Saying that
    int("99999999999999999999999999",10) is converting to 99999999999999999999999999L in Python2.7 it is a bug/security issue is hypothetical. Practically, such high port numbers cannot exist.

    I suggest, we close this issue as won't fix.

    Ezio, I noticed that you changed from pending to open. If you had any ideas for this issue, please share, otherwise you can consider closing this too.

    Thanks!

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented May 21, 2012

    Your comment is completely senseless, sorry.
    Of course such high port numbers do not exist.

    An attacker is counting on that. Imagine something like that

    pass_to_cython(urlparse("http://google.de:999999\*\*999999[to be calculated]").port)

    @zulla
    Copy link
    Mannequin Author

    zulla mannequin commented May 21, 2012

    we should at least check if the .port attribute is an intereger >= 1 and <= 65535. _because_ this is the only valid port range. otherwise, it is no valid port. but it may be a integer overflow attack attempt

    when a developer uses .port, he is counting on the result being valid

    @orsenthil
    Copy link
    Member

    pass_to_cython(urlparse("http://google.de:999999\*\*999999[to be calculated]").port)

    is no different than sending

    pass_to_cython(999999**999999[to be calculated])

    In that case, would the former make a security loop hole in urlparse? Looks pretty contrived to me as an example for .port bug.

    However, I agree with one point in your assertion, namely that port be checked that it is within the range integer >= 1 and <= 65535. If it is not, return None as a response in port.

    @ezio-melotti
    Copy link
    Member

    Ezio, I noticed that you changed from pending to open.

    That was an accident, I just meant to add my self to the nosy.
    I didn't have time yet to read all the messages and comment on the issue.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 24, 2012

    New changeset 988903cf24c5 by Senthil Kumaran in branch '2.7':
    Issue bpo-14036: return None when port in urlparse cross 65535
    http://hg.python.org/cpython/rev/988903cf24c5

    New changeset d769e64aed79 by Senthil Kumaran in branch '3.2':
    Issue bpo-14036: return None when port in urlparse cross 65535
    http://hg.python.org/cpython/rev/d769e64aed79

    New changeset b4d257c64db7 by Senthil Kumaran in branch 'default':
    Issue bpo-14036: return None when port in urlparse cross 65535
    http://hg.python.org/cpython/rev/b4d257c64db7

    @orsenthil
    Copy link
    Member

    This is taken care. I was not really convinced on the need as likely seemed a non issue from "urlparse" standpoint, But still there is no harm in returning valid port as semantically the attribute stands for a port.

    Thanks!

    @orsenthil orsenthil self-assigned this May 24, 2012
    @orsenthil orsenthil added type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels May 24, 2012
    @vadmium
    Copy link
    Member

    vadmium commented Feb 10, 2015

    See bpo-20059 proposing to change this to raise ValueError

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants