classification
Title: urllib.splitport -- is it official or not?
Type: Stage: resolved
Components: Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, cheryl.sabella, jaraco, lukasz.langa, martin.panter, orsenthil, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-07-11 18:02 by gvanrossum, last changed 2019-02-03 15:12 by jaraco. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 2205 merged cheryl.sabella, 2017-06-14 23:25
PR 7070 merged cheryl.sabella, 2018-05-23 12:08
Messages (18)
msg270193 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-07-11 18:02
I've seen and written some code that uses urllib.splitport() [1], but it's not in the export list, nor in the docs. However I see no easy other way to perform the same function. Should we make it official, or get rid of it? It's used internally in urllib/request.py [2]. There's a test for it in test_urlparse.py [3], but another test [4] also acknowledges that it's "undocumented" (which suggests that the author of that test didn't know what to do with it either).

Same question for the others in that list [4]:
            'splitattr', 'splithost', 'splitnport', 'splitpasswd',
            'splitport', 'splitquery', 'splittag', 'splittype', 'splituser',
            'splitvalue',
            'Quoter', 'ResultBase', 'clear_cache', 'to_bytes', 'unwrap',

References:
[1] https://hg.python.org/cpython/file/tip/Lib/urllib/parse.py#l956
[2] https://hg.python.org/cpython/file/tip/Lib/urllib/request.py#l106
[3] https://hg.python.org/cpython/file/tip/Lib/test/test_urlparse.py#l1015
[4] https://hg.python.org/cpython/file/tip/Lib/test/test_urlparse.py#l946
msg270200 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-11 19:08
splitport() doesn't work with IPv6 ("[::1]", see issue18191), nor with authority ("user:password@example.com"). Note that there is an almost duplicate function, splitnport(). The existence of two similar functions that behave differently in corner cases looks confusing. And it seems splitport() and splitnport() are not always used correctly internally (see issue20271).
msg270215 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-11 23:33
Previous discussion: Issue 1722, Issue 11009.

In Python 2, most of the split- functions _have_ been in urllib.__all__ since revision 5d68afc5227c (2.1). Also, since revision c3656dca65e7 (Issue 1722, 2.7.4), the RST documentation does mention that at least some of them are deprecated in favour of the “urlparse” module. However there are no index entries, and splitport() is not mentioned by name.

In Python 3, these functions wandered into urllib.parse. There is no RST documentation, and the functions are not in __all__ (which was added for Issue 13287 in 3.3).

I think you can use the documented urllib.parse API instead of splitport(), but it is borderline unwieldy:

>>> import urllib.parse
>>> from urllib.parse import urlsplit, SplitResult
>>> netloc = "[::1]:80"
>>> urllib.parse.splitport(netloc)  # [Brackets] kept!
('[::1]', '80')
>>> split = urlsplit("//" + netloc); (split.hostname, split.port)
('::1', 80)
>>> split = SplitResult("", netloc, path="", query="", fragment=""); (split.hostname, split.port)
('::1', 80)

I opened Issue 23416 with a suggestion that would make SplitResult a bit simpler to use here. But maybe it makes the implementation too complicated.

I don’t think the non-split-names (Quoter, etc) are in much doubt. They were never in __all__.
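
The urlsplit() workaround above can be wrapped in a small helper. This is only a sketch: split_host_port is a hypothetical name, not part of urllib.parse, and it assumes the input is a bare network location with no path, query or fragment characters in it.

```python
from urllib.parse import urlsplit


def split_host_port(netloc):
    """Return (hostname, port) for a bare network location string.

    Hypothetical helper, not a stdlib function. The hostname comes back
    without IPv6 brackets or userinfo; the port is an int, or None when
    the netloc does not carry one.
    """
    # Prefixing "//" makes urlsplit() treat the string as the netloc part.
    parts = urlsplit("//" + netloc)
    return parts.hostname, parts.port


print(split_host_port("[::1]:80"))
print(split_host_port("user:password@example.com:8080"))
print(split_host_port("example.com"))
```

Unlike splitport(), this returns the port as an int and drops the brackets, so it is not a drop-in replacement. Accessing .port raises ValueError for a non-numeric port.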
msg270218 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-07-12 00:51
Aha. I see you are referring to this note in the 2.7 docs for urllib:

    urllib also exposes certain utility functions like splittype, splithost and
    others parsing URL into various components. But it is recommended to use
    :mod:`urlparse` for parsing URLs rather than using these functions directly.
    Python 3 does not expose these helper functions from :mod:`urllib.parse`
    module.

This is somewhat ironic because those functions still exist in urllib.parse.

I've rewritten my code using your suggestions of using urllib.parse.urlparse().

Shall we just close this issue or is there still an action item? (Maybe actually delete those functions whose deletion has been promised so long ago, or at least rename them to _splitport() etc.?)
msg270253 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-07-12 16:03
Probably a rename is good. The question then becomes whether the old names should raise a DeprecationWarning for a release.
msg270536 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-07-16 04:05
I think that we should encourage everyone to use the higher-level functions like urlparse() or urlsplit() and then get the .port from the named tuple result.

Two things to do.

1. Update the note in the documentation, which falsely states that those helper functions are not exposed in urllib.parse

2. Raise a DeprecationWarning and rename them to _splitport()
msg295962 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2017-06-13 22:03
Would it be OK for me to work on this?
msg295965 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-06-13 22:22
Go for it, Cheryl!
msg295966 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-06-13 22:26
Skimming the issue I can't even figure out what the task is -- Cheryl, I suppose you have, could you post a brief summary of your plan here?
msg295967 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2017-06-13 22:59
Thank you.

From my understanding, urllib didn't officially support the split* functions (splittype, splithost, splitport, splitnport, splituser, splitpasswd, splitattr, splitquery, splitvalue, splittag) even though they were migrated to urllib.parse.  The 2.7 documentation recommended using urlparse and stated that these split* functions would not be exposed in Python 3, but they are.

The proposal would be as Senthil suggested - to raise a DeprecationWarning if the current names are used and to rename them with a single underscore (to _split*).

However, I did have some questions.  
1. With the DeprecationWarning for the current function names, should the return value be a call to the _split* function or should it call urlparse/urlsplit?
2. Some of the return values from these split* functions can be different than urlsplit, as Martin showed in msg270215.  It seems that the return values should remain the same for now, but would the differences need to be documented? 
3. These functions are used in urllib/request.py.  Should they be changed to use the _split* functions or changed to use urlparse?
msg295983 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-06-14 04:48
I don't think it is worth changing the implementations to be in terms of urlsplit or urlparse. This is proposed for splithost in <https://github.com/python/cpython/pull/1849>, but I suspect it would change the behaviour in some corner cases. See Issue 22852 for some deficiencies with urlsplit.

3. Change existing usage to the internal _split functions, unless there is a reason (bug or security problem) to make further changes.
msg296049 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2017-06-14 23:34
Martin, thank you for the information and for pointing out those other related issues.  It makes sense to separate the security or bug issues from this change.
msg315765 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2018-04-25 23:51
New changeset 0250de48199552cdaed5a4fe44b3f9cdb5325363 by Łukasz Langa (Cheryl Sabella) in branch 'master':
bpo-27485: Rename and deprecate undocumented functions in urllib.parse (GH-2205)
https://github.com/python/cpython/commit/0250de48199552cdaed5a4fe44b3f9cdb5325363
msg315769 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2018-04-26 02:10
Thanks! This is now fixed for Python 3.8 \o/
msg317354 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-23 04:53
This change makes test_urlparse fail when run with -We, or just produce a lot of warnings in the default mode.

======================================================================
ERROR: test_splitattr (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1113, in test_splitattr
    self.assertEqual(splitattr('/path;attr1=value1;attr2=value2'),
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1103, in splitattr
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitattr() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splithost (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1006, in test_splithost
    self.assertEqual(splithost('//www.example.org:80/foo/bar/baz.html'),
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 977, in splithost
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splithost() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splitnport (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1077, in test_splitnport
    self.assertEqual(splitnport('parrot:88'), ('parrot', 88))
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1049, in splitnport
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitnport() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splitpasswd (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1050, in test_splitpasswd
    self.assertEqual(splitpasswd('user:ab'), ('user', 'ab'))
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1013, in splitpasswd
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitpasswd() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splitport (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1066, in test_splitport
    self.assertEqual(splitport('parrot:88'), ('parrot', '88'))
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1026, in splitport
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitport() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splitquery (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1091, in test_splitquery
    self.assertEqual(splitquery('http://python.org/fake?foo=bar'),
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1073, in splitquery
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitquery() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splittag (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1101, in test_splittag
    self.assertEqual(splittag('http://example.com?foo=bar#baz'),
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1088, in splittag
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splittag() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splittype (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 998, in test_splittype
    self.assertEqual(splittype('type:opaquestring'), ('type', 'opaquestring'))
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 956, in splittype
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splittype() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splituser (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1035, in test_splituser
    self.assertEqual(splituser('User:Pass@www.python.org:080'),
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1000, in splituser
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splituser() is deprecated as of 3.8, use urllib.parse.urlparse() instead

======================================================================
ERROR: test_splitvalue (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1124, in test_splitvalue
    self.assertEqual(splitvalue('foo=bar'), ('foo', 'bar'))
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1117, in splitvalue
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitvalue() is deprecated as of 3.8, use urllib.parse.parse_qsl() instead

======================================================================
ERROR: test_to_bytes (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1131, in test_to_bytes
    result = urllib.parse.to_bytes('http://www.python.org')
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 920, in to_bytes
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.to_bytes() is deprecated as of 3.8

======================================================================
ERROR: test_unwrap (test.test_urlparse.Utility_Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1137, in test_unwrap
    url = urllib.parse.unwrap('<URL:type://host/path>')
  File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 940, in unwrap
    DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.unwrap() is deprecated as of 3.8

----------------------------------------------------------------------
msg317391 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2018-05-23 12:11
Serhiy,

Thanks for finding this.  I've submitted a PR to fix the tests.
msg318554 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-03 14:31
New changeset 867b825830b9b0baff791c9bcda57bba3809722a by Serhiy Storchaka (Cheryl Sabella) in branch 'master':
bpo-27485: Change urlparse tests to use private methods. (GH-7070)
https://github.com/python/cpython/commit/867b825830b9b0baff791c9bcda57bba3809722a
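
For reference, there are two ways a test can stay green under -We once a public name warns: test the private helper directly (the approach named in the commit title above), or assert that the public name warns. The sketch below uses hypothetical stand-in functions rather than the real urllib.parse ones, and passes warnings="error" to the runner to approximate -We.

```python
import unittest
import warnings


def _splitvalue(attr):
    """Hypothetical private helper: split 'name=value' into (name, value)."""
    name, sep, value = attr.partition("=")
    return name, (value if sep else None)


def splitvalue(attr):
    """Hypothetical deprecated public wrapper around _splitvalue()."""
    warnings.warn("splitvalue() is deprecated",
                  DeprecationWarning, stacklevel=2)
    return _splitvalue(attr)


class UtilityTests(unittest.TestCase):
    def test_private_helper(self):
        # No warning is involved, so this passes even when warnings are errors.
        self.assertEqual(_splitvalue("foo=bar"), ("foo", "bar"))

    def test_public_name_warns(self):
        # assertWarns captures the warning, so it is not raised as an error.
        with self.assertWarns(DeprecationWarning):
            result = splitvalue("foo=bar")
        self.assertEqual(result, ("foo", "bar"))


def run_tests():
    """Run the suite with all warnings escalated to exceptions."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UtilityTests)
    runner = unittest.TextTestRunner(verbosity=0, warnings="error")
    return runner.run(suite)
```

Calling run_tests() should report both tests passing; a test that called splitvalue() outside assertWarns would instead error out with DeprecationWarning, as in the tracebacks above.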
msg334794 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-02-03 15:12
Please refer to issue35891 for a description of an important use-case broken by the planned removal of splituser.
History
Date User Action Args
2019-02-03 15:12:17 jaraco set nosy: + jaraco
    messages: + msg334794
2018-06-03 14:47:49 serhiy.storchaka set status: open -> closed
    stage: patch review -> resolved
2018-06-03 14:31:34 serhiy.storchaka set messages: + msg318554
2018-05-23 14:34:53 gvanrossum set nosy: - gvanrossum
2018-05-23 12:11:26 cheryl.sabella set messages: + msg317391
2018-05-23 12:08:06 cheryl.sabella set keywords: + patch
    stage: resolved -> patch review
    pull_requests: + pull_request6699
2018-05-23 04:53:17 serhiy.storchaka set status: closed -> open
    messages: + msg317354
2018-04-26 02:10:18 lukasz.langa set status: open -> closed
    versions: + Python 3.8, - Python 2.7, Python 3.5, Python 3.6
    messages: + msg315769
    resolution: fixed
    stage: resolved
2018-04-25 23:51:57 lukasz.langa set nosy: + lukasz.langa
    messages: + msg315765
2017-06-14 23:34:10 cheryl.sabella set messages: + msg296049
2017-06-14 23:25:20 cheryl.sabella set pull_requests: + pull_request2249
2017-06-14 04:48:57 martin.panter set messages: + msg295983
2017-06-13 22:59:11 cheryl.sabella set messages: + msg295967
2017-06-13 22:26:06 gvanrossum set messages: + msg295966
2017-06-13 22:22:59 brett.cannon set messages: + msg295965
2017-06-13 22:03:49 cheryl.sabella set nosy: + cheryl.sabella
    messages: + msg295962
2016-07-16 04:05:27 orsenthil set nosy: + orsenthil
    messages: + msg270536
    versions: - Python 3.2, Python 3.3, Python 3.4
2016-07-12 16:03:48 brett.cannon set nosy: + brett.cannon
    messages: + msg270253
2016-07-12 00:51:34 gvanrossum set messages: + msg270218
2016-07-11 23:33:39 martin.panter set nosy: + martin.panter
    messages: + msg270215
2016-07-11 19:08:01 serhiy.storchaka set nosy: + serhiy.storchaka
    messages: + msg270200
2016-07-11 18:02:03 gvanrossum create