When reusing a handler, urllib(2)'s digest authentication fails after multiple negative replies
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
Nosy List: Erick.Jones, Luci.Stanescu, kiilerix, orsenthil, spaceone
Created on 2010-08-27 06:59 by Luci.Stanescu, last changed 2019-03-15 22:16 by BreamoreBoy.

Author: Luci Stanescu (Luci.Stanescu) Date: 2010-08-27 06:59

The HTTPDigestAuthHandler's code looks like this:

    def http_error_401(self, req, fp, code, msg, headers):
        host = urlparse(req.full_url)[1]
        retry = self.http_error_auth_reqed('www-authenticate',
                                           host, req, headers)
        self.reset_retry_count()
        return retry

After successful authentication, the HTTP server might still return an error code, such as 404 or 304. In that case, self.http_error_auth_reqed raises the appropriate HTTPError and self.reset_retry_count is not called. I think that the code should be something along the lines of:

    try:
        retry = self.http_error_auth_reqed('www-authenticate', host, req, headers)
    except HTTPError as e:
        if e.code != 401:  # authentication itself succeeded
            self.reset_retry_count()
        raise
    self.reset_retry_count()
    return retry

Ways to reproduce the problem: try to access a resource for which an HTTP server requires authentication but returns a negative reply after successful authentication. I've attached an example script to demonstrate it (for Python 2.x; the bug is also present in 3.x, just replace import urllib2 with from urllib import request as urllib2 ;-) ).
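
The attached script is not reproduced here. As a rough illustration of the failure mode only, the following toy model mimics the counter logic without any network access. It is not urllib's code: ToyHTTPError, ToyDigestHandler, the `fixed` flag, the assumed limit of 5 and the not_found callback are all made up for this sketch.

```python
class ToyHTTPError(Exception):
    """Stand-in for HTTPError; carries only a status code."""

    def __init__(self, code):
        super().__init__("HTTP error %d" % code)
        self.code = code


class ToyDigestHandler:
    """Models only the retry counter, not real digest authentication."""

    MAX_RETRIES = 5  # assumed limit, for illustration

    def __init__(self, fixed=False):
        self.retried = 0
        self.fixed = fixed  # True: apply the reset suggested in this report

    def reset_retry_count(self):
        self.retried = 0

    def http_error_auth_reqed(self, resend):
        if self.retried > self.MAX_RETRIES:
            raise ToyHTTPError(401)
        self.retried += 1
        # Resend the request with credentials; this may raise, e.g. a 404.
        return resend()

    def http_error_401(self, resend):
        try:
            retry = self.http_error_auth_reqed(resend)
        except ToyHTTPError as e:
            # Without the fix, the counter is left untouched on any error.
            if self.fixed and e.code != 401:
                self.reset_retry_count()
            raise
        self.reset_retry_count()
        return retry


def not_found():
    """Pretend the server accepted the credentials but answered 404."""
    raise ToyHTTPError(404)


def codes_seen(handler, attempts):
    """Reuse one handler for several requests and record each error code."""
    codes = []
    for _ in range(attempts):
        try:
            handler.http_error_401(not_found)
        except ToyHTTPError as e:
            codes.append(e.code)
    return codes
```

In this model, reusing the handler without the reset reports the first six requests as 404 and every later one as 401, although the pretend server would still have answered 404. With the reset in place, every request is reported as 404.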

The same problem applies to ProxyDigestAuthHandler.

Author: Mark Lawrence (BreamoreBoy) Date: 2014-06-17 18:14

Could we have a response to this, please? A way to reproduce the problem is given in the attached script, and a suggested solution is inline.

Author: Erick Jones (Erick.Jones) Date: 2015-02-12 16:42

This ended up biting me also. I had a list of URLs to fetch with authentication. One of the URLs was bad (returning 401 even with authentication), and that was causing all of the subsequent URLs to fail as well, since the retry count wasn't getting reset.
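
For sequential fetches like this, one possible workaround (a sketch, not a fix for the library) is to reset the handler's counter by hand before every request. It is written for Python 3's urllib.request; fetch_all and build_digest_opener are hypothetical helper names, and the URL, user and password are placeholders supplied by the caller.

```python
import urllib.request
from urllib.error import HTTPError


def build_digest_opener(top_level_url, user, password):
    """Return an opener and the digest handler it uses (hypothetical helper)."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_level_url, user, password)
    auth_handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(auth_handler), auth_handler


def fetch_all(urls, opener, auth_handler):
    """Fetch each URL, clearing the shared retry counter before every request.

    Returns a dict mapping each URL to its body (bytes) or to the HTTPError
    that was raised for it.
    """
    results = {}
    for url in urls:
        # Drop any count left over from a previous failed request.
        auth_handler.reset_retry_count()
        try:
            with opener.open(url) as response:
                results[url] = response.read()
        except HTTPError as exc:
            results[url] = exc
    return results
```

This only helps when the requests run one after another. The counter is still shared by everything that uses the handler, so it does nothing for concurrent fetches.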

I also don't like that the retry count is stored in the handler -- that's mutable global state, which wreaks havoc if I use this with Eventlet coroutines for concurrent page fetches.  (If I just add the authentication headers myself, then urllib2 works just fine under Eventlet.)

Couldn't the retry count be stored in the request object itself?

And why do we even need a retry "count"?  If it fails without authentication, then try it with authentication.  If it fails again, just return to the application.  It makes no sense to retry four more times.
