Message 268894 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	Rotkraut
Recipients	Rotkraut, demian.brecht, harobed, martin.panter, orsenthil, petri.lehtinen, piotr.dobrogost, pitrou, whitemice
Date	2016-06-20.11:48:32
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1466423313.28.0.768781122323.issue12319@psf.upfronthosting.co.za>
In-reply-to

Content
Martin, thank you for your review! I have to work through the list of comments and get back to you when I'm done. Please note that I didn't got your earlier reviews, so I don't know which comments are new and which are left over from earlier reviews. Of course, I could also have called http.client._get_content_length(). But I believe it would be bad coding style to call an internal helper function from outside of the module. Furthermore, get_content_length() is closely related to HTTPConnection.send(), because it's vital that both methods share the same interpretation of the various body representations. That is why I felt that this should rather be a method of HTTPConnection. But I do not have a strong opinion about this. I would be happy to change it back if you prefer so. Concerning the new encode_chunked=True flag in HTTPConnection.request(), I guess there is no ideal solution. I can see the following alternatives, all of them having advantages and problems: 1. Leave it to http.client. Do not add the Content-Length or Transfer-Encoding header in the Request object in urllib.request. Pro: simple & clean, no extra flag. Con: changes current behavior, possibly breaks existing code. 2. Add Content-Length, but not Transfer-Encoding in urllib.request. Pro: no extra flag. Con: inconsistent, Content-Length and Transfer-Encoding are closely related and serve the same purpose, so it's unlogic to treat them differently in urllib. 3. Care about both headers in urllib.request, adding either Content-Length or Transfer-Encoding. Always do the chunk encoding in http.client, also if the Transfer-Encoding header is already set. Pro: the most consistent solution, no extra flag. Con: changes current behavior, possibly breaks existing code. 4. Care about both headers in urllib.request, add either Content-Length or Transfer-Encoding. Add the extra flag to HTTPConnection.request(). Do the chunk encoding in http.client if either Transfer-Encoding header is not set or if the extra flag is set. Pro: consistent & compatible. Con: extra flag. In this situation, my patch implements option 4, also because I believe, it's most important to keep the high level API clean and consistent. But I would also be happy to switch to option 2. And yes, the encode_chunked flag in HTTPConnection.request() is only relevant if the Transfer-Encoding header is set. If Content-Length is set, chunked encoding is illegal anyway. If neither Content-Length nor Transfer-Encoding is set, http.client must decide on its own to set either the one or the other. If it takes this decision, it must also implement it. In either case, the question whether http.client should do the chunk encoding or not, does not arise. Concerning the variety body types, it is essentially just bytes like, file like, and iterables of bytes like. All three have important use cases, so we need to support them. By text, I guess you mean str? Yes, I agree, disallowing str would make things much cleaner, so I would be happy to drop them. But they are currently explicitly supported in http.client, so I guess, we can't get rid of them.

Martin, thank you for your review! I have to work through the list of comments and get back to you when I'm done. Please note that I didn't got your earlier reviews, so I don't know which comments are new and which are left over from earlier reviews.

Of course, I could also have called http.client._get_content_length(). But I believe it would be bad coding style to call an internal helper function from outside of the module. Furthermore, get_content_length() is closely related to HTTPConnection.send(), because it's vital that both methods share the same interpretation of the various body representations. That is why I felt that this should rather be a method of HTTPConnection. But I do not have a strong opinion about this. I would be happy to change it back if you prefer so.

Concerning the new encode_chunked=True flag in HTTPConnection.request(), I guess there is no ideal solution. I can see the following alternatives, all of them having advantages and problems:

1. Leave it to http.client. Do not add the Content-Length or Transfer-Encoding header in the Request object in urllib.request. Pro: simple & clean, no extra flag. Con: changes current behavior, possibly breaks existing code.

2. Add Content-Length, but not Transfer-Encoding in urllib.request. Pro: no extra flag. Con: inconsistent, Content-Length and Transfer-Encoding are closely related and serve the same purpose, so it's unlogic to treat them differently in urllib.

3. Care about both headers in urllib.request, adding either Content-Length or Transfer-Encoding. Always do the chunk encoding in http.client, also if the Transfer-Encoding header is already set. Pro: the most consistent solution, no extra flag. Con: changes current behavior, possibly breaks existing code.

4. Care about both headers in urllib.request, add either Content-Length or Transfer-Encoding. Add the extra flag to HTTPConnection.request(). Do the chunk encoding in http.client if either Transfer-Encoding header is not set or if the extra flag is set. Pro: consistent & compatible. Con: extra flag.

In this situation, my patch implements option 4, also because I believe, it's most important to keep the high level API clean and consistent. But I would also be happy to switch to option 2.

And yes, the encode_chunked flag in HTTPConnection.request() is only relevant if the Transfer-Encoding header is set. If Content-Length is set, chunked encoding is illegal anyway. If neither Content-Length nor Transfer-Encoding is set, http.client must decide on its own to set either the one or the other. If it takes this decision, it must also implement it. In either case, the question whether http.client should do the chunk encoding or not, does not arise.

Concerning the variety body types, it is essentially just bytes like, file like, and iterables of bytes like. All three have important use cases, so we need to support them. By text, I guess you mean str? Yes, I agree, disallowing str would make things much cleaner, so I would be happy to drop them. But they are currently explicitly supported in http.client, so I guess, we can't get rid of them.

History
Date	User	Action	Args
2016-06-20 11:48:33	Rotkraut	set	recipients: + Rotkraut, orsenthil, pitrou, harobed, petri.lehtinen, martin.panter, piotr.dobrogost, demian.brecht, whitemice
2016-06-20 11:48:33	Rotkraut	set	messageid: <1466423313.28.0.768781122323.issue12319@psf.upfronthosting.co.za>
2016-06-20 11:48:33	Rotkraut	link	issue12319 messages
2016-06-20 11:48:32	Rotkraut	create