classification
Title: Add support for RTMP schemes to urlparse
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jorge.Gomes, eric.smith, gregory.p.smith, martin.panter, orsenthil
Priority: normal Keywords:

Created on 2012-10-04 18:39 by Jorge.Gomes, last changed 2013-11-24 02:51 by martin.panter.

Messages (6)
msg171984 - (view) Author: Jorge Gomes (Jorge.Gomes) Date: 2012-10-04 18:39
Please add support in urlparse for the family of RTMP schemes:
rtmp
rtmpe
rtmps
rtmpt

I believe these schemes should be added to the following module variables:
uses_relative
uses_netloc
uses_params
uses_query
[essentially, the ones where rtsp already appears]

The RTMP spec is hosted at http://www.adobe.com/devnet/rtmp.html which describes the format as "protocol://servername:port/"
The example provided there is rtmp://localhost:1935/test

An example YouTube RTMP *service* URL looks like:
rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah

Please let me know if further information is required.

Thanks!

========================================
Footnote:

A full YouTube RTMP stream URL may look like this:

rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35

i.e. it is the stream service url suffixed with '/' + the_stream_name. 

When one uses urlparse (extended with the 'rtmp' scheme), the stream name part gets lumped in with the last query value.
I think it's reasonable to expect the user of the urlparse library to strip the stream name off, leaving just the service URL, which can be parsed normally. However, if urlparse could handle this sort of use case generically, that would be great.
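
A minimal sketch of the "strip the stream name first" approach described above, written against Python 3's urllib.parse. The helper name split_stream_url is hypothetical, and the sketch assumes the stream name is everything after the last '/' and contains no '/' itself:

```python
from urllib.parse import urlparse, parse_qs


def split_stream_url(stream_url):
    """Split a full stream URL into (service_url, stream_name).

    Assumption: the stream name is the text after the last '/'.
    """
    service_url, _, stream_name = stream_url.rpartition("/")
    return service_url, stream_name


full_url = (
    "rtmp://a.rtmp.youtube.com/videolive"
    "?ns=yt-live&id=123456&itag=35&signature=blahblahblah"
    "/yt-live.123456.35"
)

service_url, stream_name = split_stream_url(full_url)
parts = urlparse(service_url)   # parse only the service URL
query = parse_qs(parts.query)   # query values no longer carry the stream name
```

With the stream name removed up front, the signature value in the query stays clean and the service URL parses like any other URL with a query string.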
msg171985 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-10-04 18:43
As this is a feature request, it can only be applied to 3.4. I've modified the versions.
msg172448 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-09 05:30
Personally, I want to do away with all of that scheme-specific stuff, if we can. I have tried previously, but failed due to some backwards incompatibility. 3.4 gives us a good chance, and enough time, to get rid of the scheme-specific stuff (again).

So, instead of adding the rtmp* schemes to the various categories, I would like to see if we can find a way out.


issue9374 - another related one which.

Also, Jorge Gomes: If you care about the 2.7 version only, then the way I have seen this issue handled in production is to extend the relevant module-level list (uses_netloc, for example) with the protocols that you want to support.
Like this:

>>> from urlparse import uses_netloc
>>> uses_netloc.extend(['rtmp','rtmpe'])

2.7.x is in bugfix mode, and this change may not be considered a bug fix that would find its place in 2.7.x.
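
The same workaround can be sketched for Python 3, where the lists live in urllib.parse. This is only an illustration: the helper name register_schemes is hypothetical, and the code mutates process-wide module state, so it affects every caller of urllib.parse in the interpreter.

```python
import urllib.parse as up

# The scheme family named in the original request.
RTMP_SCHEMES = ["rtmp", "rtmpe", "rtmps", "rtmpt"]


def register_schemes(schemes):
    """Append each scheme to the module-level lists, skipping duplicates."""
    for registry in (up.uses_relative, up.uses_netloc, up.uses_params):
        for scheme in schemes:
            if scheme not in registry:
                registry.append(scheme)


register_schemes(RTMP_SCHEMES)

# With the schemes registered, relative references resolve against the base.
joined = up.urljoin("rtmp://host/app/", "stream")
```

Registering in uses_relative alone would not be enough for urljoin() to rebuild the "//host" part, which is why the sketch also touches uses_netloc.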
msg172449 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-09 05:33
issue9374 - another related one which should be taken care of. The fix is simply reverting this: http://hg.python.org/cpython/diff/a0b3cb52816e/Lib/urllib/parse.py and noting in the docs that those globals are not available anymore. (But this should also be discussed on python-dev before making the change.)
msg204079 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-11-23 19:40
I'd like to see a proposed change against the 3.4 standard library for this, with tests.
msg204170 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-11-24 02:51
Looks like Issue 9374 already covers most of this, with fixes in 2.7, 3.2 and 3.3.

$ python3.3
Python 3.3.2 (default, May 16 2013, 23:40:52) 
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse("protocol://servername:port/")
ParseResult(scheme='protocol', netloc='servername:port', path='/', params='', query='', fragment='')
>>> urlparse("rtmp://a.rtmp.youtube.com/videolive?ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35")
ParseResult(scheme='rtmp', netloc='a.rtmp.youtube.com', path='/videolive', params='', query='ns=yt-live&id=123456&itag=35&signature=blahblahblah/yt-live.123456.35', fragment='')

Now there are only the three unresolved aspects listed below, as I see it. Personally I think the first, for urljoin(), should be fixed (hopefully in a generic way without whitelists). I mentioned this in Issue 18828. I wonder if the last two really matter?

* uses_relative: would allow urljoin() to work. Compare urljoin("rtmp://host/", "path") and urljoin("rtsp://host/", "path").
* uses_netloc: would affect urlunsplit(("rtmp", "", "/path", "", ""))
* uses_params: would affect urlparse("rtmp://host/;a=b")
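
The three differences above can be compared side by side with a short script. This is a sketch that assumes a fresh interpreter in which the module-level lists have not been modified; the helper name probe is hypothetical.

```python
from urllib.parse import urljoin, urlparse, urlunsplit


def probe(scheme):
    """Collect the three behaviours discussed above for one scheme."""
    base = scheme + "://host/"
    parsed = urlparse(base + ";a=b")
    return {
        "urljoin": urljoin(base, "path"),                          # uses_relative
        "urlunsplit": urlunsplit((scheme, "", "/path", "", "")),  # uses_netloc
        "path": parsed.path,                                       # uses_params
        "params": parsed.params,
    }


for scheme in ("rtmp", "rtsp"):
    print(scheme, probe(scheme))
```

Where rtmp is absent from the lists, its results should differ from the rtsp ones in all three respects: the relative reference is returned unjoined, no "//" is emitted for the empty netloc, and ";a=b" stays in the path.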
History
Date                 User             Action  Args
2016-09-13 04:22:49  martin.panter    link    issue18828 dependencies
2013-11-24 02:51:13  martin.panter    set     nosy: + martin.panter; messages: + msg204170
2013-11-23 19:40:54  gregory.p.smith  set     assignee: gregory.p.smith -> ; messages: + msg204079
2012-10-09 05:33:08  orsenthil        set     messages: + msg172449
2012-10-09 05:30:38  orsenthil        set     messages: + msg172448
2012-10-05 21:27:53  gregory.p.smith  set     assignee: gregory.p.smith; nosy: + gregory.p.smith
2012-10-05 12:48:29  pitrou           set     nosy: + orsenthil
2012-10-04 18:43:06  eric.smith       set     versions: - Python 2.6, Python 3.1, Python 2.7, Python 3.2, Python 3.3, Python 3.5; nosy: + eric.smith; messages: + msg171985; components: + Library (Lib)
2012-10-04 18:39:21  Jorge.Gomes      create