
classification
Title: urlparse("host:123", "http") inconsistent between 3.8 and 3.9
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: arcivanov, martin.panter
Priority: normal
Keywords:

Created on 2021-05-02 03:29 by arcivanov, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg392654 - Author: Arcadiy Ivanov (arcivanov) Date: 2021-05-02 03:29
$ ~/.pyenv/versions/3.8.6/bin/python3.8
Python 3.8.6 (default, Oct  8 2020, 13:32:06) 
[GCC 10.2.1 20200723 (Red Hat 10.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse("host:123", "http")
ParseResult(scheme='http', netloc='', path='host:123', params='', query='', fragment='')
>>> 

$ ~/.pyenv/versions/3.8.9/bin/python3.8
Python 3.8.9 (default, May  1 2021, 23:27:11) 
[GCC 11.1.1 20210428 (Red Hat 11.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse("host:123", "http")
ParseResult(scheme='http', netloc='', path='host:123', params='', query='', fragment='')
>>> 

$ ~/.pyenv/versions/3.9.4/bin/python3.9
Python 3.9.4 (default, Apr  8 2021, 17:27:49) 
[GCC 10.2.1 20201125 (Red Hat 10.2.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> urlparse("host:123", "http")
ParseResult(scheme='host', netloc='', path='123', params='', query='', fragment='')
>>> 


I'm not certain, but it seems to me that 3.9 is wrong here: the default scheme is passed as the second argument to urlparse, so "host:123" should be treated as "http://host:123", as in 3.8.

We also relied on the 3.8 parser behavior, so for us this is a regression.
msg392657 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2021-05-02 03:49
I suspect this comes from Issue 27657. Consider how similar URLs like tel:123 or javascript:123 should be parsed.
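
To illustrate the comparison, here is a minimal sketch, assuming Python 3.9 or later, where the reported behavior is in effect. A string of the form name:digits is split at the first colon, so "tel:123", "javascript:123" and "host:123" are all handled the same way, and the default scheme argument is only used when the string carries no scheme of its own. The 3.8 releases shown above instead keep "host:123" whole in path.

```python
from urllib.parse import urlparse

# Assumes Python 3.9+: everything before the first colon is taken as the
# scheme, so the "http" default passed as second argument is not applied.
for url in ("tel:123", "javascript:123", "host:123"):
    result = urlparse(url, "http")
    print(url, "->", repr(result.scheme), repr(result.path))
```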
msg392661 - Author: Arcadiy Ivanov (arcivanov) Date: 2021-05-02 04:34
I guess I'll work around this, thanks.
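
One possible workaround, sketched here under assumptions rather than taken from the thread: if the input is known to be either a bare "host:port" string or a full URL containing "://", prefixing the bare form with "//" makes urlparse treat it as a network location, so the default scheme applies. The helper name parse_with_default_scheme is hypothetical, and the "://" check is a simplification that would misclassify inputs such as "tel:123".

```python
from urllib.parse import urlparse


def parse_with_default_scheme(value, default_scheme="http"):
    """Hypothetical helper: parse a bare 'host:port' string or a full URL.

    Assumption: anything without '://' is a bare network location.
    """
    if "://" not in value:
        # A leading '//' marks what follows as the netloc, so the colon
        # is read as the port separator, not the scheme separator.
        value = "//" + value
    return urlparse(value, default_scheme)


result = parse_with_default_scheme("host:123")
print(result.scheme, result.hostname, result.port)
```

Because the colon ends up inside the netloc, this should give the same result on both 3.8 and 3.9, which avoids depending on how either version resolves the scheme ambiguity.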
History
Date                 User           Action  Args
2022-04-11 14:59:45  admin          set     github: 88173
2021-05-02 04:34:50  arcivanov      set     status: open -> closed; messages: + msg392661; stage: resolved
2021-05-02 03:49:59  martin.panter  set     nosy: + martin.panter; messages: + msg392657
2021-05-02 03:30:06  arcivanov      set     type: behavior
2021-05-02 03:29:37  arcivanov      create