classification
Title: urllib.request no_proxy check differs from curl
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Daniel Morrison, martin.panter, python-dev, xiang.zhang
Priority: normal Keywords: patch

Created on 2016-04-26 14:43 by Daniel Morrison, last changed 2016-04-30 12:47 by martin.panter. This issue is now closed.

Files
File name Uploaded Description Edit
issue26864.patch xiang.zhang, 2016-04-27 02:23 review
issue26864_v2.patch xiang.zhang, 2016-04-27 03:02 review
issue26864_v3.patch xiang.zhang, 2016-04-27 06:14 review
issue26864_v4.patch xiang.zhang, 2016-04-27 06:22 review
issue26864_v5.patch xiang.zhang, 2016-04-27 09:09 review
Messages (13)
msg264297 - (view) Author: Daniel Morrison (Daniel Morrison) Date: 2016-04-26 14:43
The no_proxy environment variable works in python as a case
sensitive suffix check.

Curl handles this variable as a case insensitive hostname check.

Case sensitivity appears to be in conflict with the DNS Case
Insensitivity RFC (https://tools.ietf.org/html/rfc4343).

While the suffix check is documented
(https://docs.python.org/3/library/urllib.request.html), this
seems to be problematic and inconsistent with other tools on the
system.

I believe the ideal solution would be to have proxy_bypass be a
method of ProxyHandler so that it can be overridden without
dependence on undocumented methods. This would also allow
for the requested behavior to be added without breaking backwards
compatibility.
msg264311 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-26 15:41
No, urllib.request does not handle the no_proxy environment in case-sensitive way. It first checks if any environment_variable.lower() ends with "_proxy", and then checks if there is any environment_variable ends with "_proxy". So it works as a case insensitive check, but the lowercase one will override others, e.g, no_proxy overrides no_PROXY. But the document is misleading, it should be clarified.
msg264313 - (view) Author: Daniel Morrison (Daniel Morrison) Date: 2016-04-26 15:55
I believe there was a misunderstanding.

While the environment variable name is handled in a case
insensitive way, the contents of the environment variable is not.

Please see some examples below:

    >>> os.environ['no_proxy'] = 'example.com'
    >>> urllib.request.proxy_bypass('EXAMPLE.com')
    0
    >>> urllib.request.proxy_bypass('example.com')
    1

Also to clarify the meaning of suffix check:

    >>> os.environ['no_proxy'] = 'example.com'
    >>> urllib.request.proxy_bypass('myexample.com')
    1

My apologies for my lack of clarity.
msg264318 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-26 16:59
Ohh, it's my fault to misunderstand. I think your thinking is reasonable. Curl handles the suffix check insensitively.
msg264339 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 02:23
I write a patch to fix this.

It seems on Windows and MacOS the behaviour is right. On MacOS it leaves the matching to fnmatch. On Windows it uses case insensitive regular matching.
msg264340 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 03:02
The code has changed recently. I update the patch to reveal the change.
msg264345 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-27 03:53
I think this patch looks okay. It fixes the case insensitivity problem, but there is also the string suffix problem (see the myexample.com demo). I think it is reasonable for example.com to match my.example.com, but not myexample.com.
msg264348 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 04:08
Yes, you are right. I don't notice that. I will try to fix it.
msg264354 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 06:14
I update the patch to use regular expression to handle both case and suffix. This regular expression has the function of curl's check_noproxy. And I think a separate test case is better here.
msg264356 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 06:22
Update to improve the test case.
msg264364 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-27 09:09
Thanks for your comment martin. Using re.escape is really a good advise.
msg264372 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-27 10:53
This version looks okay to me, thanks.
msg264540 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-04-30 03:46
New changeset ca882ee68d46 by Martin Panter in branch '3.5':
Issue #26864: Fix case insensitivity and suffix comparison with no_proxy
https://hg.python.org/cpython/rev/ca882ee68d46

New changeset 1ceb91974dc4 by Martin Panter in branch 'default':
Issue #26864: Merge no_proxy fixes from 3.5
https://hg.python.org/cpython/rev/1ceb91974dc4

New changeset a1aad42f1195 by Martin Panter in branch '2.7':
Issue #26864: Fix case insensitivity and suffix comparison with no_proxy
https://hg.python.org/cpython/rev/a1aad42f1195
History
Date User Action Args
2016-04-30 12:47:56martin.pantersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-04-30 03:46:30python-devsetnosy: + python-dev
messages: + msg264540
2016-04-27 10:53:17martin.pantersetmessages: + msg264372
2016-04-27 09:09:28xiang.zhangsetfiles: + issue26864_v5.patch

messages: + msg264364
2016-04-27 06:22:13xiang.zhangsetfiles: + issue26864_v4.patch

messages: + msg264356
2016-04-27 06:14:28xiang.zhangsetfiles: + issue26864_v3.patch

messages: + msg264354
2016-04-27 04:08:47xiang.zhangsetmessages: + msg264348
2016-04-27 03:53:48martin.pantersetstage: patch review
messages: + msg264345
versions: - Python 3.2, Python 3.3, Python 3.4
2016-04-27 03:02:59xiang.zhangsetfiles: + issue26864_v2.patch

messages: + msg264340
2016-04-27 02:23:03xiang.zhangsetfiles: + issue26864.patch
keywords: + patch
messages: + msg264339
2016-04-26 16:59:06xiang.zhangsetnosy: - docs@python
messages: + msg264318

components: + Library (Lib), - Documentation
type: enhancement -> behavior
2016-04-26 15:55:08Daniel Morrisonsetmessages: + msg264313
2016-04-26 15:41:10xiang.zhangsetnosy: + docs@python, xiang.zhang, martin.panter
messages: + msg264311

assignee: docs@python
components: + Documentation, - Library (Lib)
2016-04-26 14:43:47Daniel Morrisoncreate