classification
Title: http POST request with python 3.3 through web proxy
Type: behavior Stage: resolved
Components: Windows Versions: Python 3.4, Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: AlexMJ, demian.brecht
Priority: normal Keywords: patch

Created on 2014-07-22 20:19 by AlexMJ, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded
issue22041.patch demian.brecht, 2014-07-24 16:21
issue22041_1.patch demian.brecht, 2014-07-24 23:36
Messages (11)
msg223688 - (view) Author: Alejandro MJ (AlexMJ) Date: 2014-07-22 20:19
I'm trying to make a POST request in Python from a specific source IP address, using this code:

import http.client, urllib.parse
data = urllib.parse.urlencode({'QLastname': 'DIAZ HERNANDEZ', 'QFirstname': 'JAIME'})
headers = {"Content-type": "application/x-www-form-urlencoded","Accept": "text/plain"}
conn = http.client.HTTPConnection("www.infobel.com",80, source_address=("16.19.109.51", 0))
conn.request("POST", "/es/spain/people.aspx", data, headers)
response = conn.getresponse()
print(response.status, response.reason)
data = response.read()
conn.close()

It works perfectly when I test it without a proxy, but when I try it through a proxy connection, I receive this error:

>>> conn.request("POST", "/es/spain/people.aspx", data, headers)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\http\client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1128, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1086, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 924, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 859, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 836, in connect
    self.timeout, self.source_address)
  File "C:\Python34\lib\socket.py", line 509, in create_connection
    raise err
  File "C:\Python34\lib\socket.py", line 500, in create_connection
    sock.connect(sa)

TimeoutError: [WinError 10060] An error occurred during the attempt...

How can I make this script use the proxy configuration?

This is the code I wrote to test with the proxy (following the Python documentation):

import http.client, urllib.parse
data = urllib.parse.urlencode({'QLastname': 'DIAZ HERNANDEZ', 'QFirstname': 'JAIME'})
headers={"Content-Type":"application/x-www-form-urlencoded","Accept":"text/plain"}
conn = http.client.HTTPConnection(proxy_url,8080, source_address=(ipAddress, 0))
conn.set_tunnel("www.infobel.com")
conn.request("POST", "/es/spain/people.aspx", data, headers)
response = conn.getresponse()
print(response.status, response.reason)
data = response.read()
conn.close()

I couldn't make it work on this OS:
SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 2 

The proxy logs these messages:

2014-07-22 08:31:49 87 16.19.109.51 23.2.2.22 - - - PROXIED "none" -  200 TCP_ACCELERATED CONNECT - tcp www.infobel.com 80 / - - - 23.2.2.22 39 39 - 
2014-07-22 08:31:49 1  16.19.109.51 23.2.2.22 - - dns_unresolved_hostname PROXIED "none" -  404 TCP_ERR_MISS POST - http cachebdvg1.igrupobbva 8080 /es/spain/people.aspx - aspx - 23.2.2.22 815 230 - 

So I tried testing it on another OS, Windows, on a different computer. Curiously, it didn't work with Python 3.4.1 on Windows, but when I tested with Python 3.3.5 it worked!

Thanks for your help.
msg223804 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-24 02:59
Hi Alejandro,

I've spent a little time looking into this. I haven't been able to reproduce what you're seeing on Windows exactly, but I've encountered other issues along the same path using a local squid instance (localhost:4242):


from http.client import OK, HTTPConnection
import unittest

class TestProxy(unittest.TestCase):
    def test_proxy_tunnel_success(self):
        con = HTTPConnection('localhost', 4242)
        con.set_tunnel('www.example.com')
        con.request('GET', 'http://www.example.com')
        resp = con.getresponse()
        self.assertEqual(resp.code, 200)
        data = resp.read()
        con.close()

    def test_proxy_tunnel_failure(self):
        con = HTTPConnection('localhost', 4242)
        con.set_tunnel('www.example.com')
        con.request('GET', '/')
        resp = con.getresponse()
        self.assertEqual(resp.code, 200) # FAILS
        con.close()

if __name__ == '__main__':
    unittest.main()


As you can see with the test above, if I use the full URI, the request succeeds, but the relative path (as in your example) fails. My assumption is that these issues may be related to proxy server implementations, but I'd have to do some further investigation before being able to go on more than a hunch (and I don't have time to do that tonight).

As a first step, could you please try using a full URI in your request and see if that produces the desired result?
msg223807 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-24 05:00
Ignore my previous note. Digging into this a little more, I think I've possibly found the underlying issue:

If the port is not specified in set_tunnel (as in your example), the buffer sent over the wire looks like

"send: b'POST [PATH] HTTP/1.1\r\nHost: [HOST]:None\r\nAccept-Encoding: identity\r\nContent-Length: 41\r\nAccept: text/plain\r\nContent-type: application/x-www-form-urlencoded\r\n\r\n[FORM_DATA]'"

Note the "None" as the port. However, if the port is explicitly set, then the resulting buffer looks like:

"send: b'POST [PATH] HTTP/1.1\r\nHost: [HOST]:[PORT]\r\nAccept-Encoding: identity\r\nContent-Length: 41\r\nAccept: text/plain\r\nContent-type: application/x-www-form-urlencoded\r\n\r\n[FORM_DATA]'"


Can you retry your example, but specify the port and let me know if that fixes your problem? Either way, this is a bug that I'll submit a patch (and test) for, but I'd like to know that it solves the issue as written.
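
The difference between the two buffers above can be reproduced with plain string formatting. This is only an illustrative sketch, not the http.client source: naive_host_header and safe_host_header are hypothetical helpers, and the default port of 80 is an assumption for plain HTTP.

```python
def naive_host_header(host, port):
    """Format the header without checking the port (hypothetical helper)."""
    return "Host: {}:{}".format(host, port)


def safe_host_header(host, port=None, default_port=80):
    """Format the header, falling back to default_port when port is None."""
    if port is None:
        port = default_port
    # A Host header may omit the port when it is the default for the scheme.
    if port == default_port:
        return "Host: {}".format(host)
    return "Host: {}:{}".format(host, port)


# An unset port leaks into the header as the literal text "None".
print(naive_host_header("www.infobel.com", None))
# With a fallback, the header stays well formed.
print(safe_host_header("www.infobel.com"))
print(safe_host_header("www.infobel.com", 8080))
```

Until a fix is available, passing the port explicitly to set_tunnel should sidestep the unset value.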
msg223854 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-24 16:21
I've attached a patch that solves the issue I encountered. It would be great if you could confirm whether or not it also resolves the issue as reported.
msg223910 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-24 23:36
Attached a new patch with a simple test.
msg223935 - (view) Author: Alejandro MJ (AlexMJ) Date: 2014-07-25 11:02
Thanks a lot for your help!

I've tested it on Linux with Python 3.3.5, and the response I get is: [404 Not Found]. This is the script (with the ip_address and proxy_url values changed, of course):

import http.client, urllib.parse
data = urllib.parse.urlencode({'nombre': 'HERVAS INFANTE ALBERTO'})
headers = {"Content-type": "application/x-www-form-urlencoded"}
conn = http.client.HTTPConnection(proxy_url,8080, source_address=(ip_address, 0))
conn.set_tunnel("www.telexplorer.es",port=80)  
conn.request("POST", "/?zone=namwp",data,headers)
response = conn.getresponse()
print("Test 1: TE - ", response.status, response.reason)
data = response.read()
conn.close()


How can I test the attached patch? I suppose I have to install something on SUSE? From what I read in some forums, I should run this command:

patch -p1 --dry-run < issue22041_1.patch

Could you please help me with these points? Thanks!
msg223938 - (view) Author: Alejandro MJ (AlexMJ) Date: 2014-07-25 11:57
I ran these commands on my SUSE machine, where Python is installed at /usr/local/pr/python:

computer002:/usr/local/pr/python # patch -p1 --dry-run <issue22041_1.patch
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff -r 5d70ac83d104 Lib/http/client.py
|--- a/Lib/http/client.py       Thu Jul 24 12:44:07 2014 +0200
|+++ b/Lib/http/client.py       Thu Jul 24 16:34:46 2014 -0700
--------------------------
File to patch: /usr/local/pr/python/lib/python3.3/http/client.py
patching file /usr/local/pr/python/lib/python3.3/http/client.py
Hunk #1 FAILED at 835.
1 out of 1 hunk FAILED -- saving rejects to file /usr/local/pr/python/lib/python3.3/http/client.py.rej
can't find file to patch at input line 17
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff -r 5d70ac83d104 Lib/test/test_httplib.py
|--- a/Lib/test/test_httplib.py Thu Jul 24 12:44:07 2014 +0200
|+++ b/Lib/test/test_httplib.py Thu Jul 24 16:34:46 2014 -0700
--------------------------
File to patch: /usr/local/pr/python/lib/python3.3/test/test_httplib.py
patching file /usr/local/pr/python/lib/python3.3/test/test_httplib.py
Hunk #1 FAILED at 1235.
1 out of 1 hunk FAILED -- saving rejects to file /usr/local/pr/python/lib/python3.3/test/test_httplib.py.rej
computer002:/usr/local/pr/python #
 
What am I doing wrong?
msg223954 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-25 14:58
Sorry Alejandro, I should have clarified: The attached patch is for dev, so the failure you're seeing when attempting to apply the patch against 3.3 is expected. It effectively does the same thing as explicitly setting the port as you have already attempted.
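
One part of the output above is separate from the version mismatch: the "can't find file to patch" prompts. With -p1, patch drops the first path component of each file name in the diff and looks for the result relative to the current directory. A minimal sketch of that stripping, where strip_components is a hypothetical helper written for illustration and not part of any tool:

```python
from pathlib import PurePosixPath


def strip_components(patch_path, count):
    """Drop the first `count` path components, as patch -pN does."""
    parts = PurePosixPath(patch_path).parts
    return str(PurePosixPath(*parts[count:]))


# File names taken from the diff headers in the output above.
for name in ("a/Lib/http/client.py", "a/Lib/test/test_httplib.py"):
    print(strip_components(name, 1))
```

Run from /usr/local/pr/python, the stripped name Lib/http/client.py does not match the installed lib/python3.3/http/client.py, so patch asks for the file. From the root of a source checkout with a Lib/ directory, the same name would be expected to resolve.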

At this point, I'm relatively confident that the issue is due to the proxy server in use. Using your latest code but a local squid proxy, I'm able to get 200 responses with the latest releases of 3.3 and 3.4 as well as against dev.

Do you absolutely need to tunnel? The most common use case for tunnelling is to use SSL, which doesn't seem to be the case here. Does the following code work for you? It still uses the proxy server, but without CONNECT.

import http.client, urllib.parse
data = urllib.parse.urlencode({'nombre': 'HERVAS INFANTE ALBERTO'})
headers = {"Content-type": "application/x-www-form-urlencoded"}
conn = http.client.HTTPConnection(proxy_url,8080, source_address=(ip_address, 0))
conn.request("POST", "http://www.telexplorer.es/?zone=namwp",data,headers)
response = conn.getresponse()
print("Test 1: TE - ", response.status, response.reason)
data = response.read()
conn.close()


If the above code doesn't fulfill your requirements, do you know the vendor/version of the proxy that you're using?


Note to self: 3.3 doesn't respect _tunnel_port when setting the host header for requests (this was added in 3.4), so CONNECT and subsequent host headers will appear to be correct as long as the ports match up. The problem in 3.4+ is that, rather than ensuring a non-None value in set_tunnel, this is done in _tunnel as a step just before CONNECT. That step is not replicated when the host header is set in putrequest, which leads to the value "None" being sent for the port when the port is not explicitly set in set_tunnel. To me, it makes the most sense to use _set_hostport in set_tunnel, as in the attached patch, so that any other use of _tunnel_port can be done without special handling of None.
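
The design choice in the note above, resolving the port once when the tunnel is configured so that every later consumer sees a usable value, might be sketched as follows. TunnelConfig is a hypothetical stand-in written for illustration; it is not HTTPConnection, and the host:port parsing and the default of 80 are simplifying assumptions (IPv6 literals are not handled).

```python
class TunnelConfig:
    """Hypothetical stand-in that resolves the tunnel port at set time."""

    DEFAULT_PORT = 80  # assumption: plain HTTP

    def __init__(self):
        self.tunnel_host = None
        self.tunnel_port = None

    def set_tunnel(self, host, port=None):
        # Accept "host:port" as well as a separate port argument.
        if port is None and ":" in host:
            host, _, port_text = host.rpartition(":")
            port = int(port_text)
        # Fall back to the default here, not later at connect time.
        if port is None:
            port = self.DEFAULT_PORT
        self.tunnel_host = host
        self.tunnel_port = port

    def connect_line(self):
        # Reads the already-normalized port.
        return "CONNECT {}:{} HTTP/1.0".format(self.tunnel_host, self.tunnel_port)

    def host_header(self):
        # Reads the same normalized port, so "None" cannot appear.
        return "Host: {}:{}".format(self.tunnel_host, self.tunnel_port)


cfg = TunnelConfig()
cfg.set_tunnel("www.infobel.com")
print(cfg.connect_line())
print(cfg.host_header())
```

Because the fallback lives in one place, neither consumer needs its own None check.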
msg223956 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-25 15:07
To add a little more detail: from what I gather, CONNECT may be unsupported or limited (i.e. only allowed for SSL connections) on various proxy servers. If the code snippet in my previous post solves your issue, then I would assume that to be the case with the proxy you're using.
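
To make the distinction concrete, the sketch below builds the first lines a client would send in each mode. It is a simplified illustration under stated assumptions: the helper names are hypothetical, headers and body are omitted, and the HTTP versions shown are only examples.

```python
from urllib.parse import urlsplit


def tunnelled_request_lines(url, method="POST"):
    """Lines sent when tunnelling: CONNECT first, then a path-only request."""
    parts = urlsplit(url)
    port = parts.port or 80  # assumption: plain HTTP default
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return [
        "CONNECT {}:{} HTTP/1.1".format(parts.hostname, port),
        "{} {} HTTP/1.1".format(method, target),
    ]


def forwarded_request_line(url, method="POST"):
    """Line sent without CONNECT: the proxy receives the absolute URI."""
    return "{} {} HTTP/1.1".format(method, url)


url = "http://www.telexplorer.es/?zone=namwp"
for line in tunnelled_request_lines(url):
    print(line)
print(forwarded_request_line(url))
```

A proxy that refuses or restricts CONNECT can still serve the second form, since it only has to forward an ordinary request to the host named in the URI.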
msg224156 - (view) Author: Alejandro MJ (AlexMJ) Date: 2014-07-28 08:11
Thanks a lot for your help. As you suggested, the problem was caused by the set_tunnel method. I've tested the code you posted and it now works perfectly.

I'll keep this in mind for future work. We can conclude that it's not really a Python bug, but a problem related to the proxy configuration. However, I'll take into account which version of Python to use if I have to do a similar job.

Regards,
Alejandro
msg224179 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-07-28 19:07
No problem, happy you were able to get things sorted. Feel free to close this issue as I've opened #22095 to address the host port header issue.
History
Date User Action Args
2022-04-11 14:58:06 admin set github: 66240
2014-07-28 19:51:31 ned.deily set status: open -> closed
stage: resolved
2014-07-28 19:07:14 demian.brecht set messages: + msg224179
2014-07-28 08:11:55 AlexMJ set resolution: not a bug
messages: + msg224156
2014-07-25 15:07:09 demian.brecht set messages: + msg223956
2014-07-25 14:58:45 demian.brecht set messages: + msg223954
2014-07-25 11:57:44 AlexMJ set messages: + msg223938
2014-07-25 11:03:00 AlexMJ set nosy: + AlexMJ
messages: + msg223935
2014-07-24 23:36:14 demian.brecht set files: + issue22041_1.patch
messages: + msg223910
2014-07-24 16:21:00 demian.brecht set files: + issue22041.patch
keywords: + patch
messages: + msg223854
2014-07-24 05:00:31 demian.brecht set messages: + msg223807
2014-07-24 02:59:21 demian.brecht set messages: + msg223804
2014-07-23 13:52:01 demian.brecht set nosy: + demian.brecht
2014-07-23 12:23:16 Alejandro.Mj set nosy: - AlexMJ
-> (no value)
2014-07-22 20:19:20 AlexMJ create