Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

urllib.parse.urljoin does not handle WebSocket URLs #70083

Closed
imrehg mannequin opened this issue Dec 17, 2015 · 17 comments
Closed

urllib.parse.urljoin does not handle WebSocket URLs #70083

imrehg mannequin opened this issue Dec 17, 2015 · 17 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@imrehg
Copy link
Mannequin

imrehg mannequin commented Dec 17, 2015

BPO 25895
Nosy @rbtcollins, @bitdancer, @berkerpeksag, @vadmium, @MarkusH
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • 0001-urllib.parse-enable-WebSocket-URL-schemes-to-be-corr.patch
  • 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch
  • 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch
  • 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-16.11:46:51.293>
    created_at = <Date 2015-12-17.05:58:53.616>
    labels = ['3.7', 'type-bug', 'library']
    title = 'urllib.parse.urljoin does not handle WebSocket URLs'
    updated_at = <Date 2017-03-31.16:36:12.867>
    user = 'https://bugs.python.org/imrehg'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:12.867>
    actor = 'dstufft'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-16.11:46:51.293>
    closer = 'berker.peksag'
    components = ['Library (Lib)']
    creation = <Date 2015-12-17.05:58:53.616>
    creator = 'imrehg'
    dependencies = []
    files = ['41334', '44612', '44615', '44649']
    hgrepos = []
    issue_num = 25895
    keywords = ['patch']
    message_count = 17.0
    messages = ['256581', '256582', '256583', '256587', '256628', '271722', '276126', '276130', '276142', '276152', '276155', '276160', '276163', '276292', '276379', '276701', '276702']
    nosy_count = 7.0
    nosy_names = ['rbcollins', 'r.david.murray', 'python-dev', 'berker.peksag', 'martin.panter', 'MarkusH', 'imrehg']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25895'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @imrehg
    Copy link
    Mannequin Author

    imrehg mannequin commented Dec 17, 2015

    Handling the "ws" and "wss" schemes of the WebSocket protocol in urllib.parse.urlparse was discussed in a previous issue http://bugs.python.org/issue13244 Parsing them seems to work correctly:

    >>> urllib.parse.urlparse("wss://domain/endpoint")
    ParseResult(scheme='wss', netloc='domain', path='/endpoint', params='', query='', fragment='')

    On the other hand, urllib.parse.urljoin does not know about these schemes and thus cannot be used to construct WebSocket URLs (as, for example some streaming data APIs now seem to use it, such as https://starfighter.readme.io/docs/quotes-ticker-tape-websocket).

    Using the example above I get:

    >>> urllib.parse.urljoin("wss://domain/","/endpoint")
    '/endpoint'

    This is not the expected result, 'wss://example.com/endpoint', because these WebSocket schemes are not included in the uses_relative and uses_netloc lists in the library settings. For clarity, based on the previous issue's discussion and RFC6455 Section 11.1 https://tools.ietf.org/html/rfc6455#section-11.1 , these are the only two portions that apply, and should not be listed in uses_params.

    As a workaround, similarly to the previous issue's recommendation, one can use:

    > import urlparse.parse
    > wsschemes = ["ws", "wss"]
    > urlparse.parse.uses_relative.extend(wsschemes)
    > urlparse.parse.uses_netloc.extend(wsschemes)

    Attached patch to this effect. Also added tests to cover these schemes, though comments are definitely welcome on all of this!

    @imrehg imrehg mannequin added the type-bug An unexpected behavior, bug, or error label Dec 17, 2015
    @1st1
    Copy link
    Member

    1st1 commented Dec 17, 2015

    Gergely, the patch looks alright. Please sign the contributor agreement and we'll have it merged.

    @imrehg
    Copy link
    Mannequin Author

    imrehg mannequin commented Dec 17, 2015

    Just signed it, thanks!

    @vadmium
    Copy link
    Member

    vadmium commented Dec 17, 2015

    The documentation should probably also be updated. It has a list of supported schemes. I imagine this would have to be taken as a new feature for 3.6+, rather than a bug fix.

    An alternative would be to enable joining for arbitrary schemes; see bpo-18828.

    @1st1
    Copy link
    Member

    1st1 commented Dec 17, 2015

    The documentation should probably also be updated. It has a list of supported schemes.

    I agree. Gergely, do you want to update your patch?

    I imagine this would have to be taken as a new feature for 3.6+, rather than a bug fix.

    Since urlparse can parse wss:// addresses, *but* urljoin can't work with them, I think we can classify this as a bug fix patch.

    @vadmium
    Copy link
    Member

    vadmium commented Jul 31, 2016

    My view is that because the schemes are not in the documented list of supported schemes, this is a new feature, and any documentation update should include a “versionadded” or similar notice.

    @vadmium vadmium added the stdlib Python modules in the Lib dir label Jul 31, 2016
    @rbtcollins
    Copy link
    Member

    It is a bugfix, but we should document when it started working using the version added segment IMO. I think addding this to 3.6 now would be ok.

    @MarkusH
    Copy link
    Mannequin

    MarkusH mannequin commented Sep 12, 2016

    As discussed with rbcollins during the KiwiPyCon sprints, I'll add patches for 3.5, 3.6 and master with docs mentioning the addition of ws and wss as of 3.5.3

    @MarkusH
    Copy link
    Mannequin

    MarkusH mannequin commented Sep 13, 2016

    Since the patch applies cleanly to the 3.5, 3.6 and master branch I only attached one updated version of it.

    @berkerpeksag
    Copy link
    Member

    I think a Misc/NEWS entry is enough. We didn't use versionadded/versionchanged directives for similar changes in the past (mimetypes.types_map for example) Even if we decide to add an annotation it should be versionchanged, not versionadded (the API is not new.)

    The rest of the patch looks good to me.

    @berkerpeksag berkerpeksag added the 3.7 (EOL) end of life label Sep 13, 2016
    @MarkusH
    Copy link
    Mannequin

    MarkusH mannequin commented Sep 13, 2016

    Thanks for your input, Berker. Updated as suggested. I still include the versionchanged annotation as I suspect more people to look at the docs than the entire changelog

    @rbtcollins
    Copy link
    Member

    I find details like this extremely useful in the main docs, so please do add there.

    @vadmium
    Copy link
    Member

    vadmium commented Sep 13, 2016

    IMO if a versionadded/versionchanged notice is relevant, that is a good sign it is a feature rather than bug fix. If the 3.6.1 documentation says “added in 3.5.3”, how do you know if it is in 3.6.0?

    I looked at the history of other schemes being added to try and understand if it is a bug fix or not. I guess it is not that clear. In Python 2 the sftp and sips schemes are documented as a new feature in 2.5. Is that why you aren’t proposing any change to 2.7?

    imap (bpo-618024), mms (7ec771893410): 2.3 features
    rsync (bpo-981299): 2.3 bugfix (3a7e34dc6ae2)
    svn: 2.4 bugfix (349a0fd598e0)
    sftp (bpo-1407902): 2.5 feature (r42108); 2.4 bugfix reverted (r42123); documented (r42125) with version (r42158)
    sips (r43520): 2.5 feature; documented with version
    nfs (bpo-4962): 2.7 (r70757), 3.1 (r70760) features; 2.6 bugfix (r81131); undocumented
    git (bpo-8657): 2.6, 2.7, 3.1 bugfixes; undocumented
    tel (bpo-16713): 2.7, 3.2, 3.3 bugfixes; undocumented

    @bitdancer
    Copy link
    Member

    I believe our stance on these has changed over time from "feature" to "bugfix", especially when there is a standards document to back it (but sometimes when there is not...we gave up a while ago on the MIME-type standards ever getting finalized and just bow to the reality of something being a defacto standard).

    Given that, there should be no versionchanged directive. I agree that if you want a versionchanged, then it is an enhancement and can only go into 3.7.

    I'm a bit conflicted about whether these are bug fixes or features...I used to be on the feature side, but it is really annoying to have these
    "standard" things not work, so I'm more on the bugfix side now.

    @MarkusH
    Copy link
    Mannequin

    MarkusH mannequin commented Sep 14, 2016

    Thanks for your input. I remove the versionchanged block.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 16, 2016

    New changeset a5e8fe666c6b by Berker Peksag in branch '3.5':
    Issue bpo-25895: Enable WebSocket URL schemes in urllib.parse.urljoin
    https://hg.python.org/cpython/rev/a5e8fe666c6b

    New changeset 9bf370a33938 by Berker Peksag in branch '3.6':
    Issue bpo-25895: Merge from 3.5
    https://hg.python.org/cpython/rev/9bf370a33938

    New changeset 2fc89d6eff20 by Berker Peksag in branch 'default':
    Issue bpo-25895: Merge from 3.6
    https://hg.python.org/cpython/rev/2fc89d6eff20

    @berkerpeksag
    Copy link
    Member

    Thank you Gergely and Markus!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants