Title: urllib.parse.urlparse is not parsing the url properly
Type: Stage:
Components: Versions: Python 3.9
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Gnanesh, neethun
Priority: normal Keywords:

Created on 2021-06-10 11:29 by neethun, last changed 2021-06-10 11:52 by Gnanesh.

Messages (2)
msg395518 - Author: Neethu (neethun) Date: 2021-06-10 11:29
urllib.parse.urlparse does not correctly parse URLs that have a port number but no scheme.

from urllib.parse import urlparse

ParseResult(scheme='', netloc='', path='80', params='', query='', fragment='')

Python version : 3.9.5
msg395522 - Author: Gnanesh (Gnanesh) Date: 2021-06-10 11:52
Hey neethu,

For a URL without a scheme, the URL must start with "//" so that urlparse recognizes the netloc.

> urlparse('//')

ParseResult(scheme='', netloc='', path='', params='', query='', fragment='')

Here's a comment from the docs:
> Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
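
To make the quoted rule concrete, here is a minimal sketch. The host "example.com" and port 80 are placeholder values, and parse_host_port is a hypothetical helper written for this illustration, not part of urllib. The sketch only inspects netloc, hostname and port, because how the scheme and path are split for input without "//" has varied between Python versions.

```python
from urllib.parse import urlparse

# Placeholder input: a host and port with no scheme.
raw = "example.com:80"

# Without a leading "//", urlparse treats the input as a relative URL,
# so no netloc is recognized.
without_slashes = urlparse(raw)
print(repr(without_slashes.netloc))  # prints ''

# With a leading "//", the host and port are recognized as the netloc.
with_slashes = urlparse("//" + raw)
print(with_slashes.netloc, with_slashes.hostname, with_slashes.port)
# prints: example.com:80 example.com 80


def parse_host_port(url):
    """Hypothetical helper: prepend "//" when the input has neither
    a "scheme://" part nor a leading "//", then parse as usual."""
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlparse(url)
```

With this helper, parse_host_port("example.com:80").port would be 80, while input that already carries a scheme, such as "http://example.com:80", is passed through unchanged. The "://" check is a simplification and may not suit inputs where that sequence appears inside a query string.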
Date                 User     Action  Args
2021-06-10 11:52:24  Gnanesh  set     nosy: + Gnanesh; messages: + msg395522
2021-06-10 11:29:34  neethun  create