Title: urllib.parse.urlparse is not parsing the url properly
Created on 2021-06-10 11:29 by neethun, last changed 2021-06-10 11:52 by Gnanesh.

msg395518 - Author: Neethu (neethun) Date: 2021-06-10 11:29
urllib.parse.urlparse does not correctly parse URLs that have no scheme but include a port number.

from urllib.parse import urlparse

ParseResult(scheme='', netloc='', path='80', params='', query='', fragment='')

Python version: 3.9.5
msg395522 - Author: Gnanesh (Gnanesh) Date: 2021-06-10 11:52
Hey Neethu,

When a URL has no scheme, it needs a "//" prefix for urlparse to recognize the netloc and parse it correctly.

> urlparse('//')

ParseResult(scheme='', netloc='', path='', params='', query='', fragment='')

Here's a comment from the docs:
> Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
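
To make the rule concrete, here is a minimal sketch of the difference. The host "example.com:80" is a hypothetical value picked only for illustration and does not come from the original report. The sketch only checks netloc for the unprefixed form, because how the rest is split between scheme and path can differ between Python versions.

```python
from urllib.parse import urlparse

# "example.com:80" is a hypothetical host:port, used only for illustration.
without_prefix = urlparse("example.com:80")
with_prefix = urlparse("//example.com:80")

# Without a leading "//", urlparse does not recognize a netloc.
print(repr(without_prefix.netloc))  # ''

# With the leading "//", the host and port end up in the netloc.
print(repr(with_prefix.netloc))     # 'example.com:80'
print(with_prefix.hostname)         # example.com
print(with_prefix.port)             # 80
```

If the input may arrive without a scheme, one option is to prepend "//" before calling urlparse, then read hostname and port from the result.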
