
Author EvensF
Recipients EvensF, docs@python, martin.panter, orsenthil, r.david.murray
Date 2014-09-04.22:25:40
Message-id <1409869542.46.0.355938912343.issue21228@psf.upfronthosting.co.za>
In-reply-to
Content
Hi,

From my limited experience reporting documentation issues, I see that it's better to submit a patch than to only report an issue. So, I've tried to look into the source code to figure out what was going on. I have attached a patch that I'm submitting to you for review, hoping I'm doing everything right.

What was reported as ambiguous in this issue is the description of the return value of the function urllib.request.urlopen() for http and https URLs. It was mentioned that it should be an http.client.HTTPResponse object, but the wording implied that something might be different about this object.

To understand why I may now be able to assert what's being said in that patch, follow me in the source code. It's based on revision c499cc2c4a06. If you don't care about the whole walkthrough, you can skip to step 9.

   1. We want to describe the return value of the urllib.request.urlopen() for http and https URLs
   2. The urlopen() function is defined in Lib/urllib/request.py starting on line 138. Its return value is the return value of the opener.open() method (line 153)
      * This opener object is defined in one of these locations:
         * On line 150 as the return value of the module function build_opener() (this return value is cached in the _opener module variable)
         * On line 152 as the value cached in the _opener module variable
         * On line 148, still as the return value of the module function build_opener(), in the case where you want to access an HTTPS URL and you provide a cafile, capath or cadefault argument
      * So either way, the opener object comes from build_opener(), directly or indirectly.
   3. The build_opener() function is defined starting on line 505. Its return value (line 539) is an instance of the OpenerDirector class (line 514). The OpenerDirector class is defined starting on line 363.
      a. Before returning its return value, after some checks (lines 522-530, 535-536), build_opener() calls OpenerDirector().add_handler() with an instance of some of the classes defined in the default_classes list (lines 515-518). What matters to us is the HTTPHandler class and the HTTPSHandler class (line 520).
      b. The OpenerDirector().add_handler() method (line 375) takes the HTTPHandler instance (the class is defined on line 1196) and:
         * Inserts it, because of its http_open() method, in the list stored as the value of OpenerDirector().handle_open['http'].
         * Inserts it, because of its http_request() method, in the list stored as the value of OpenerDirector().process_request['http'].
      c. For HTTPSHandler (line 1203) it is the same thing, but with:
         * HTTPSHandler.https_open() for OpenerDirector().handle_open['https']
         * HTTPSHandler.https_request() for OpenerDirector().process_request['https']
   4. I remind you that we are looking for the return value of the method open() of an instance of the OpenerDirector class (see point number 2). This method is defined starting on line 437.
   5. The OpenerDirector.open() method's return value is the response variable (line 463)
   6. This variable is defined on lines 461 and 455.
      a. The loop on lines 458-461 looks through its handlers (the OpenerDirector().process_response dictionary) for a response processor (an XXX.http_response() method). Neither HTTPHandler nor HTTPSHandler defines one. (An http_response() method is defined in HTTPErrorProcessor [line 564] and in HTTPCookieProcessor [line 1231], but in each of these cases the class doesn't modify the response value.)
      b. So the response variable's value is the return value of OpenerDirector()._open(req, data) on line 455.
         * The req argument is a Request instance (line 440) or something that has the same interface, I guess (line 442). The Request class is defined on line 253.
         * The data argument is included in the constructor of the Request instance (line 440 and then on line 262) or added to the object provided (line 444). Afterwards, it won't be used directly (OpenerDirector()._open() receives it as an argument but won't use it in its body)
   7. OpenerDirector()._open() is defined on line 465. It will call OpenerDirector()._call_chain() up to three times depending on whether a result has been found (lines 468-469, 474-475). 
      * OpenerDirector()._call_chain() is defined on line 426. All it does is call the handlers registered in the dictionary provided (the chain argument) until one returns something other than None, and then return that value.
      * In our case (retrieving http and https resources):
         a. The first call (line 466) will return None since neither HTTPHandler nor HTTPSHandler has a default_open() method (in fact, no handler defined in this file has a default_open() method)
         b. The second call will work since HTTPHandler.http_open() (line 1198) and HTTPSHandler.https_open() (line 1212) exist. Their return values will eventually be what we are looking for.
   8. HTTPHandler.http_open() and HTTPSHandler.https_open() return the return value of the do_open() method defined (on line 1134) in their common superclass AbstractHTTPHandler (line 1086). They call it with http.client.HTTPConnection and req in the case of HTTPHandler, and with http.client.HTTPSConnection, req and a few other arguments in the case of HTTPSHandler.
   9. AbstractHTTPHandler.do_open() creates a connection object of the class it was given (http.client.HTTPConnection or http.client.HTTPSConnection, line 1144), calls its request() method (line 1173) and, if that works, calls its getresponse() method (line 1178). The return value of getresponse() is the HTTPResponse object we are looking for.
   10. Finally we get our answer: 
      * On line 1186, an url attribute is added to this HTTPResponse object
      * On line 1192, the msg attribute is overwritten with the value of the reason attribute
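
For anyone who wants to check the walkthrough without reading the source, here is a minimal sketch. It is only an illustration under assumptions that are mine, not part of the patch: HelloHandler and walk_through() are hypothetical names, the server is a throwaway one bound to a free loopback port, and an empty ProxyHandler is passed to build_opener() so that proxy environment variables don't interfere. It builds an opener, looks at what is registered for 'http', fetches one URL and inspects the object that comes back.

```python
import http.client
import http.server
import threading
import urllib.request


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical demo handler: answers every GET with 200 and a tiny body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the per-request log line


def walk_through():
    """Serve one request on a free loopback port and return what we observed."""
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # Steps 2-3: build_opener() returns an OpenerDirector with the
        # default handlers registered (proxies disabled for this demo).
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        http_openers = list(opener.handle_open["http"])
        # Steps 4-9: open() ends up returning what getresponse() returned.
        response = opener.open(url, timeout=5)
        body = response.read()
        response.close()
        return url, opener, http_openers, response, body
    finally:
        server.shutdown()
        server.server_close()


url, opener, http_openers, response, body = walk_through()
print(type(opener).__name__)                      # class of the opener
print([type(h).__name__ for h in http_openers])   # handlers registered for 'http'
print(isinstance(response, http.client.HTTPResponse))
print(response.url)                               # step 10: added url attribute
print(response.msg, response.reason)              # step 10: msg holds the reason
```

If the walkthrough is right, the response should be a plain http.client.HTTPResponse whose url attribute is the requested URL and whose msg attribute equals reason.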

I hope this is what was needed to close this issue. Otherwise, just tell me what is missing.

Oh, and it seems that there are many things that could be refactored. Can I do it and open issues about them?
History
Date                 User    Action  Args
2014-09-04 22:25:42  EvensF  set     recipients: + EvensF, orsenthil, r.david.murray, docs@python, martin.panter
2014-09-04 22:25:42  EvensF  set     messageid: <1409869542.46.0.355938912343.issue21228@psf.upfronthosting.co.za>
2014-09-04 22:25:42  EvensF  link    issue21228 messages
2014-09-04 22:25:40  EvensF  create