Title: host and port attributes not documented well in function urllib.parse.urlparse and urlsplit
msg123898 - (view) Author: JamesThomasMoon1979 (JTMoon79) Date: 2010-12-13 20:17
Copy of issue 10696
This issue is exactly the same as issue 10696 except it affects a different function, urllib.parse.urlparse (instead of urllib.parse.urlsplit).

The urlparse function in urllib.parse does not return the port field.
>>> import urllib
>>> import urllib.parse
>>> urllib.parse.urlparse(r'')
ParseResult(scheme='http', netloc='', path='/blarg', params='', query='a=1&b=2', fragment='')
ParseResult(scheme='http', netloc='', path='/blarg', port='80', params='', query='a=1&b=2', fragment='')

The documentation at shows this as expected.  What is the purpose of a possible port parameter if that port parameter is not set?

According to RFC 1808 the syntactic components are 
However, according to referenced RFC 1738 (referenced by RFC 1808)
the <net_loc> can be further separated to <host> and <port>.

I guess a bigger, more general complaint about this is: why not make urlparse more useful by separating <host> and <port>?
I imagine this is a common need of users.  I like standards.  And doing a little extra to work with standards makes those standards even more useful.
msg123901 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-13 20:30
The repr gives the primary components defined by the URL.  The subfields are provided as attributes of the result.  This is documented in the example at the top of the chapter, but it is not, IMO, well documented in the rest of the chapter.
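
As a rough illustration of the point above, the following sketch shows that the result tuple holds only the six primary components, while hostname and port are available as attributes. The URL is a hypothetical stand-in chosen for the example, not the one from the original report.

```python
from urllib.parse import urlparse

# Hypothetical URL for illustration only; not the URL from the report.
result = urlparse("http://www.example.com:80/blarg?a=1&b=2")

# The tuple (and therefore the repr) contains only the six primary components.
print(result._fields)
print(len(result))       # 6

# netloc is one of those six fields and still contains the port text.
print(result.netloc)     # www.example.com:80

# hostname and port are derived from netloc and exposed as attributes.
print(result.hostname)   # www.example.com
print(result.port)       # 80 (an int, not the string '80')
```

Because port is an attribute rather than a tuple field, it never appears in the repr, which seems to be the source of the confusion in the report.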

I'm not sure when this feature was introduced, so I'm leaving 3.1 in the versions for now.
msg123902 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-12-13 20:43
These attributes were added in Python 2.5.

Documentation improvements should be backported to 2.7 and 3.1.
msg123903 - (view) Author: JamesThomasMoon1979 (JTMoon79) Date: 2010-12-13 20:48
Doh!  I feel a bit silly.
I didn't notice 'hostname' and 'port' in 
>>> dir(urllib.parse.urlparse(r''))
[... 'count', 'fragment', 'geturl', 'hostname', 'index'
, 'netloc', 'params', 'password', 'path', 'port', 'query', 'scheme', 'username']

I agree, some clarity in the documentation for these overlapping fields (<net_loc>,<port>,<hostname>) would help.
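
A small sketch of how the overlapping fields relate, assuming a made-up URL with credentials and an explicit port (all values here are hypothetical). It uses urlsplit, which exposes the same attributes as urlparse.

```python
from urllib.parse import urlsplit

# Hypothetical URL for illustration only.
parts = urlsplit("http://User:secret@Example.COM:8080/path")

# netloc is the raw network-location string, kept exactly as written.
print(parts.netloc)     # User:secret@Example.COM:8080

# The attributes below are parsed out of netloc.
print(parts.username)   # User
print(parts.password)   # secret
print(parts.hostname)   # example.com (hostname is lowercased)
print(parts.port)       # 8080

# When the URL carries no explicit port, port is None; the scheme's
# default (such as 80 for http) is not filled in.
no_port = urlsplit("http://example.com/path")
print(no_port.port)     # None
```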

msg235583 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-09 03:27
I don’t understand where the work needs to be done for this one. Even in the 3.1 and 2.7 documentation, the urlparse() and urlsplit() entries both list “port” as one of the returned attributes, and urlparse() has example code for it.
msg270372 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-07-14 05:54
I am unsure of the change too. I am willing to close this report, as the .port attribute is already documented.
