classification
Title: httplib does not check if port is valid (easy to fix?)
Type: Stage:
Components: Library (Lib) Versions:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jhylton Nosy List: dealfaro, gvanrossum, jhylton, martinthomas, skip.montanaro
Priority: normal Keywords:

Created on 2000-12-14 04:45 by dealfaro, last changed 2022-04-10 16:03 by admin. This issue is now closed.

Files
File name Uploaded Description
httplib.diff skip.montanaro, 2002-03-09 14:30
Messages (7)
msg2663 - (view) Author: Luca de Alfaro (dealfaro) Date: 2000-12-14 04:45
In httplib.py, line 336, the following code appears: 

    def _set_hostport(self, host, port):
        if port is None:
            i = string.find(host, ':')
            if i >= 0:
                port = int(host[i+1:])
                host = host[:i]
            else:
                port = self.default_port
        self.host = host
        self.port = port

This code breaks if the host string ends with ":", so that
int("") is called.  In the old (1.5.2) version of this 
module, the corresponding conversion used to be 
enclosed in a try/except pair: 

                try: port = string.atoi(port)
                except string.atoi_error:
                    raise socket.error, "nonnumeric port"

and this fixed the problem.  
Note BTW that now the error reported by int is 
"ValueError: invalid literal for int():"
rather than the above string.atoi_error. 

I found this problem while downloading web pages, 
but unfortunately I cannot pinpoint which page 
caused the problem. 

Luca de Alfaro
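
The failure reported above can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and using `str.find` in place of `string.find`; `split_hostport` and `DEFAULT_PORT` are hypothetical names that stand in for the method and `self.default_port`, and are not part of httplib.

```python
DEFAULT_PORT = 80  # assumption: stands in for self.default_port


def split_hostport(host, port=None):
    """Free-function version of the quoted _set_hostport logic (hypothetical)."""
    if port is None:
        i = host.find(':')
        if i >= 0:
            # When host ends with ':', host[i + 1:] is '' and int('') raises ValueError.
            port = int(host[i + 1:])
            host = host[:i]
        else:
            port = DEFAULT_PORT
    return host, port
```

With this sketch, `split_hostport("example.com:8080")` and `split_hostport("example.com")` both return a host and port pair, while `split_hostport("example.com:")` raises ValueError from the unguarded `int()` call.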
msg2664 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2000-12-14 14:37
The only effect is that it raises ValueError instead of socket.error.
Where is this a problem?

(Note that string.atoi_error is an alias for ValueError.)
msg2665 - (view) Author: Luca de Alfaro (dealfaro) Date: 2000-12-18 22:25
There are three (minor?) problems with raising
ValueError. 

1) Compatibility.  I had some code for 1.5.2 that
was trying to load web pages checking for various
errors, and it was expecting this error to cause
a socket error, not a value error. 

2) Accuracy.  ValueError can be caused by
anything.  The 'non-numeric port' error is much 
more informative.  I don't want to catch
ValueError, because it can be caused in too 
many situations.  I also cannot check 
myself that the port is fine, because the 
port and the URL are often given by a  
redirect (errors 301 and 302, if I remember
correctly).  This in fact was the situation 
that caused the problem. 
Hence, my only real solution was to patch my version of httplib. 

3) Style.  I am somewhat new to Python, but I was
under the impression that, stylistically, 
a ValueError was used to convey a situation that
was the fault of the programmer, while other 
more specific errors were used for unexpected 
situations (communication, etc).  Since the 
socket is the result of a URL redirection 
(errors 301 or 302), the programmer is not in 
a position to prevent this error by "better 
checking".  Hence, I would consider a
network-related exception to be more appropriate 
here. 

But who am I to argue with the creator of Python? 
;-)

Luca
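
The compatibility point can be illustrated with a small sketch of the 1.5.2-style behaviour, assuming Python 3, where `socket.error` is an alias of `OSError`. `parse_port` is a hypothetical helper, not a function in httplib; it only shows how a caller that handles socket errors alone would still see a bad port.

```python
import socket


def parse_port(text):
    """Convert a port string to int, reporting failures as socket.error.

    Hypothetical helper mirroring the old try/except around the conversion.
    """
    try:
        return int(text)
    except ValueError:
        # Re-raise as a network-level error so existing handlers catch it.
        raise socket.error("nonnumeric port") from None
```

A caller that wraps its download loop in `except socket.error:` would then handle an empty or non-numeric port from a redirect without also catching every unrelated ValueError.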
msg2666 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2000-12-18 22:38
Thanks for explaining this more.

I am surprised that a 301 redirect would give an invalid port -- but surely webmasters aren't perfect. :-)

The argument that urllib.URLopener.open() checks for socket.error but not for other errors is a good one.

However I don't see the httplib.py code raising socket.error elsewhere.  I'll ask Jeremy.  The rest of the module seems to be using a totally different set of exceptions.  On the other hand, it *can* raise socket.error, implicitly (when various socket calls are being made).

msg2667 - (view) Author: MartinThomas (martinthomas) Date: 2001-01-10 21:19
I have been trying to pin down a problem in Red Hat's
Update Agent, which is written (mostly) in Python. The
problem occurs when a proxy is specified. 

In RH7.0, they are still using Python 1.5.2, and the message
'nonnumeric port' is raised when a proxy is specified
in the following form:
http://proxy.yourdomain.com:80
but the following form works:
proxy.yourdomain.com:80
Looking at the code, it seems to expect that the only
colon would be near the end of the URL, and it makes no
allowance for 'http:' or 'https:'.

Regards / Martin
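
The behaviour described here is what one would expect if the port is taken from the text after the first colon. The sketch below assumes Python 3; `naive_port` and `scheme_aware_hostport` are hypothetical names, and `urllib.parse.urlsplit` is shown only as one possible way to separate the scheme, not as what the Update Agent or the httplib of that era did.

```python
from urllib.parse import urlsplit


def naive_port(hostport):
    """Treat everything after the first colon as the port."""
    i = hostport.find(':')
    # For 'http://host:80' this slice is '//host:80', which int() rejects.
    return int(hostport[i + 1:])


def scheme_aware_hostport(url):
    """Read host and port whether or not a scheme is present."""
    # urlsplit only recognises a network location after '//', so add one
    # when the input is a bare 'host:port' string.
    parts = urlsplit(url if '//' in url else '//' + url)
    return parts.hostname, parts.port
```

Here `naive_port("proxy.yourdomain.com:80")` succeeds, `naive_port("http://proxy.yourdomain.com:80")` raises ValueError, and `scheme_aware_hostport` returns the same host and port for both forms.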
msg2668 - (view) Author: Skip Montanaro (skip.montanaro) * (Python triager) Date: 2002-03-09 14:30
Here's a suggested patch that matches the exception
structure used by httplib.  It adds a subclass of 
HTTPException called InvalidURL and raises that when
int(port) gets incorrect input.
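
A patch along these lines might look roughly as follows. This is a sketch in Python 3 under stated assumptions, not the contents of httplib.diff: the `InvalidURL` class is defined locally for illustration, `split_hostport` is a hypothetical free-function version of `_set_hostport`, and the message text is borrowed from the old 'nonnumeric port' error.

```python
from http.client import HTTPException


class InvalidURL(HTTPException):
    """Illustrative stand-in for the exception the patch describes."""


def split_hostport(host, port=None, default_port=80):
    """Split 'host:port', raising InvalidURL on a bad port (hypothetical helper)."""
    if port is None:
        i = host.find(':')
        if i >= 0:
            try:
                port = int(host[i + 1:])
            except ValueError:
                # Report the bad port through the module's own exception family.
                raise InvalidURL("nonnumeric port: %r" % host[i + 1:]) from None
            host = host[:i]
        else:
            port = default_port
    return host, port
```

Because the new exception derives from HTTPException, a caller can write `except HTTPException:` and handle a malformed port together with other HTTP-level failures, without a blanket `except ValueError:`.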
msg2669 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2002-07-02 20:50
Skip's change was checked in a while ago.
History
Date User Action Args
2022-04-10 16:03:33 admin set github: 33589
2000-12-14 04:45:35 dealfaro create