classification
Title: urllib.parse.urljoin does not handle WebSocket URLs
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: MarkusH, berker.peksag, imrehg, martin.panter, python-dev, r.david.murray, rbcollins
Priority: normal Keywords: patch

Created on 2015-12-17 05:58 by imrehg, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description Edit
0001-urllib.parse-enable-WebSocket-URL-schemes-to-be-corr.patch imrehg, 2015-12-17 05:58 review
0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch MarkusH, 2016-09-13 00:23
0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch MarkusH, 2016-09-13 04:04
0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch MarkusH, 2016-09-14 05:15 review
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017-03-31 16:36
Messages (17)
msg256581 - (view) Author: Gergely Imreh (imrehg) * Date: 2015-12-17 05:58
Handling the "ws" and "wss" schemes of the WebSocket protocol in urllib.parse.urlparse was discussed in a previous issue http://bugs.python.org/issue13244 Parsing them seems to work correctly:

>>> urllib.parse.urlparse("wss://domain/endpoint")
ParseResult(scheme='wss', netloc='domain', path='/endpoint', params='', query='', fragment='')

On the other hand, urllib.parse.urljoin does not know about these schemes and thus cannot be used to construct WebSocket URLs (as, for example some streaming data APIs now seem to use it, such as https://starfighter.readme.io/docs/quotes-ticker-tape-websocket).

Using the example above I get:

>>> urllib.parse.urljoin("wss://domain/","/endpoint")
'/endpoint'

This is not the expected result, 'wss://example.com/endpoint', because these WebSocket schemes are not included in the uses_relative and uses_netloc lists in the library settings. For clarity, based on the previous issue's discussion and RFC6455 Section 11.1 https://tools.ietf.org/html/rfc6455#section-11.1 , these are the only two portions that apply, and should not be listed in uses_params.

As a workaround, similarly to the previous issue's recommendation, one can use:

>> import urlparse.parse
>> wsschemes = ["ws", "wss"]
>> urlparse.parse.uses_relative.extend(wsschemes)
>> urlparse.parse.uses_netloc.extend(wsschemes)

Attached patch to this effect. Also added tests to cover these schemes, though comments are definitely welcome on all of this!
msg256582 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-12-17 06:01
Gergely, the patch looks alright. Please sign the contributor agreement and we'll have it merged.
msg256583 - (view) Author: Gergely Imreh (imrehg) * Date: 2015-12-17 06:13
Just signed it, thanks!
msg256587 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-12-17 07:22
The documentation should probably also be updated. It has a list of supported schemes. I imagine this would have to be taken as a new feature for 3.6+, rather than a bug fix.

An alternative would be to enable joining for arbitrary schemes; see Issue 18828.
msg256628 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-12-17 23:35
> The documentation should probably also be updated. It has a list of supported schemes. 

I agree. Gergely, do you want to update your patch?

> I imagine this would have to be taken as a new feature for 3.6+, rather than a bug fix.

Since urlparse can parse wss:// addresses, *but* urljoin can't work with them, I think we can classify this as a bug fix patch.
msg271722 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-31 03:10
My view is that because the schemes are not in the documented list of supported schemes, this is a new feature, and any documentation update should include a “versionadded” or similar notice.
msg276126 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2016-09-12 22:15
It is a bugfix, but we should document when it started working using the version added segment IMO. I think addding this to 3.6 now would be ok.
msg276130 - (view) Author: Markus Holtermann (MarkusH) * Date: 2016-09-12 22:25
As discussed with rbcollins during the KiwiPyCon sprints, I'll add patches for 3.5, 3.6 and master with docs mentioning the addition of `ws` and `wss`  as of 3.5.3
msg276142 - (view) Author: Markus Holtermann (MarkusH) * Date: 2016-09-13 00:23
Since the patch applies cleanly to the 3.5, 3.6 and master branch I only attached one updated version of it.
msg276152 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-13 02:35
I think a Misc/NEWS entry is enough. We didn't use versionadded/versionchanged directives for similar changes in the past (mimetypes.types_map for example) Even if we decide to add an annotation it should be versionchanged, not versionadded (the API is not new.)

The rest of the patch looks good to me.
msg276155 - (view) Author: Markus Holtermann (MarkusH) * Date: 2016-09-13 04:04
Thanks for your input, Berker. Updated as suggested. I still include the versionchanged annotation as I suspect more people to look at the docs than the entire changelog
msg276160 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2016-09-13 04:15
I find details like this extremely useful in the main docs, so please do add there.
msg276163 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-13 04:30
IMO if a versionadded/versionchanged notice is relevant, that is a good sign it is a feature rather than bug fix. If the 3.6.1 documentation says “added in 3.5.3”, how do you know if it is in 3.6.0?

I looked at the history of other schemes being added to try and understand if it is a bug fix or not. I guess it is not that clear. In Python 2 the sftp and sips schemes are documented as a new feature in 2.5. Is that why you aren’t proposing any change to 2.7?

imap (Issue 618024), mms (7ec771893410): 2.3 features
rsync (Issue 981299): 2.3 bugfix (3a7e34dc6ae2)
svn: 2.4 bugfix (349a0fd598e0)
sftp (Issue 1407902): 2.5 feature (r42108); 2.4 bugfix reverted (r42123); documented (r42125) with version (r42158)
sips (r43520): 2.5 feature; documented with version
nfs (Issue 4962): 2.7 (r70757), 3.1 (r70760) features; 2.6 bugfix (r81131); undocumented
git (Issue 8657): 2.6, 2.7, 3.1 bugfixes; undocumented
tel (Issue 16713): 2.7, 3.2, 3.3 bugfixes; undocumented
msg276292 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-09-13 15:35
I believe our stance on these has changed over time from "feature" to "bugfix", especially when there is a standards document to back it (but sometimes when there is not...we gave up a while ago on the MIME-type standards ever getting finalized and just bow to the reality of something being a defacto standard).

Given that, there should be no versionchanged directive.  I agree that if you want a versionchanged, then it is an enhancement and can only go into 3.7.

I'm a bit conflicted about whether these are bug fixes or features...I used to be on the feature side, but it is really annoying to have these 
"standard" things not work, so I'm more on the bugfix side now.
msg276379 - (view) Author: Markus Holtermann (MarkusH) * Date: 2016-09-14 05:15
Thanks for your input. I remove the versionchanged block.
msg276701 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-16 11:45
New changeset a5e8fe666c6b by Berker Peksag in branch '3.5':
Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin
https://hg.python.org/cpython/rev/a5e8fe666c6b

New changeset 9bf370a33938 by Berker Peksag in branch '3.6':
Issue #25895: Merge from 3.5
https://hg.python.org/cpython/rev/9bf370a33938

New changeset 2fc89d6eff20 by Berker Peksag in branch 'default':
Issue #25895: Merge from 3.6
https://hg.python.org/cpython/rev/2fc89d6eff20
msg276702 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-09-16 11:46
Thank you Gergely and Markus!
History
Date User Action Args
2017-03-31 16:36:12dstufftsetpull_requests: + pull_request883
2016-09-16 11:46:51berker.peksagsetstatus: open -> closed
resolution: fixed
messages: + msg276702

stage: patch review -> resolved
2016-09-16 11:45:08python-devsetnosy: + python-dev
messages: + msg276701
2016-09-14 05:15:21MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch

messages: + msg276379
2016-09-13 15:35:09r.david.murraysetnosy: + r.david.murray
messages: + msg276292
2016-09-13 04:30:52martin.pantersetmessages: + msg276163
2016-09-13 04:22:49martin.panterlinkissue18828 dependencies
2016-09-13 04:15:12rbcollinssetmessages: + msg276160
2016-09-13 04:04:29MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch

messages: + msg276155
2016-09-13 02:35:24berker.peksagsetnosy: + berker.peksag

messages: + msg276152
versions: + Python 3.6, Python 3.7, - Python 3.4
2016-09-13 00:25:51MarkusHsetfiles: - 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.master.patch
2016-09-13 00:25:48MarkusHsetfiles: - 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.3.6.patch
2016-09-13 00:25:16yselivanovsetnosy: - yselivanov
2016-09-13 00:23:53MarkusHsetfiles: - 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.3.5.patch
2016-09-13 00:23:32MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.patch

messages: + msg276142
2016-09-12 23:04:40MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.master.patch
2016-09-12 23:04:21MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.3.6.patch
2016-09-12 23:04:05MarkusHsetfiles: + 0001-Enable-WebSocket-URL-schemes-in-urllib.parse.urljoin.3.5.patch
2016-09-12 22:25:16MarkusHsetnosy: + MarkusH
messages: + msg276130
2016-09-12 22:15:21rbcollinssetnosy: + rbcollins
messages: + msg276126
2016-07-31 03:10:19martin.pantersetmessages: + msg271722
components: + Library (Lib)
stage: patch review
2015-12-17 23:35:52yselivanovsetmessages: + msg256628
2015-12-17 07:22:08martin.pantersetnosy: + martin.panter
messages: + msg256587
2015-12-17 06:13:00imrehgsetmessages: + msg256583
2015-12-17 06:01:40yselivanovsetnosy: + yselivanov
messages: + msg256582
2015-12-17 05:58:53imrehgcreate