classification
Title: test_fileno of test_urllibnet intermittently fails
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, berker.peksag, dfarrell07, haypo, martin.panter, ned.deily, orsenthil, python-dev, r.david.murray
Priority: high Keywords: buildbot, patch

Created on 2014-03-26 08:10 by dfarrell07, last changed 2016-04-09 22:27 by martin.panter. This issue is now closed.

Files
File name Uploaded Description Edit
test_failures.txt dfarrell07, 2014-03-26 08:10 Output of `make test`, shows test failures
issue21069.patch dfarrell07, 2014-03-27 00:36 Patch to change example URL, fixes this issue
test_success_after_patch.txt dfarrell07, 2014-03-27 00:50 Successful test run after applying two relevant patches to latest code
rewrite-fileno.patch martin.panter, 2016-04-02 02:00 Test HTTPResponse, use CONNECT review
Messages (24)
msg214882 - (view) Author: Daniel Farrell (dfarrell07) * Date: 2014-03-26 08:10
I'm seeing quite a few unit test failures in urllib ("urlopen error unknown url type: https"). From what I've gathered on IRC, this seems to be because I'm missing the ssl module. I'm seeing these errors in 3.4.0, after a simple `./configure; make; make test`. Test results: http://pastebin.com/5LBAjkVW

I'm not seeing them after `hg clone http://hg.python.org/cpython; ./configure --with-pydebug && make -j2; ./python -m test -j3`, via http://docs.python.org/devguide/#quick-start.

Issue 20939 is the closest thing I've found to this, but it seems to be more narrowly scoped (http -> https change at python.org). Nice people on IRC suggest that this is a different enough problem to warrant a new issue.

I'm on an up-to-date Fedora 20 install.
msg214897 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-26 13:02
What makes you think this is a different issue?  It sounds like a duplicate to me.
msg214898 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-26 13:40
Ah, in case I wasn't clear: what unittest failures that you are seeing are not covered by issue 20939?  In order to diagnose this issue, we'll need to know that :)
msg214900 - (view) Author: Daniel Farrell (dfarrell07) * Date: 2014-03-26 16:08
> Ah, in case I wasn't clear: what unittest failures that you are seeing are not covered by issue 20939?

The unit test failures I'm seeing are different in at least two ways: 1) A larger set of tests failing than mentioned in Issue 20939; 2) Their failure was an assertion FAIL, mine are ERRORS related to https being an unknown URL type.

Here are the unit tests that are failing, extracted from that massive dump of info. 

The following block are all failing with "urllib.error.URLError: <urlopen error unknown url type: https>":

test_license_exists_at_url (test.test_site.ImportSideEffectTests) ... ERROR
<...>
test_issue16464 (test.test_urllib2.MiscTests) ... ERROR
<...>
test_close (test.test_urllib2net.CloseSocketTest) ... ERRO
<...>
test_http_basic (test.test_urllib2net.TimeoutTest) ... ERROR
test_http_default_timeout (test.test_urllib2net.TimeoutTest) ... ERROR
test_http_no_timeout (test.test_urllib2net.TimeoutTest) ... ERROR
test_http_timeout (test.test_urllib2net.TimeoutTest) ... ERROR
<...>
testURLread (test.test_urllibnet.URLTimeoutTest) ... ERROR
test_basic (test.test_urllibnet.urlopenNetworkTests) ... ERROR
test_fileno (test.test_urllibnet.urlopenNetworkTests) ... ERROR
<...>
test_geturl (test.test_urllibnet.urlopenNetworkTests) ... ERROR
test_info (test.test_urllibnet.urlopenNetworkTests) ... ERROR
test_readlines (test.test_urllibnet.urlopenNetworkTests) ... ERROR
test_basic (test.test_urllibnet.urlretrieveNetworkTests) ... ERROR
test_data_header (test.test_urllibnet.urlretrieveNetworkTests) ... ERROR
test_header (test.test_urllibnet.urlretrieveNetworkTests) ... ERROR
test_reporthook (test.test_urllibnet.urlretrieveNetworkTests) ... ERROR
test_specified_path (test.test_urllibnet.urlretrieveNetworkTests) ... ERROR

This one is failing with the very similar "OSError: [Errno url error] unknown url type: 'https":

test_getcode (test.test_urllibnet.urlopenNetworkTests) ... ERROR
msg214901 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-26 16:43
Well, I just looked at one of those tests at random, and it is using an http://www.python.org url, so it would be covered by 20939.  Can you re-run your tests after applying the patches from that issue?
msg214918 - (view) Author: Daniel Farrell (dfarrell07) * Date: 2014-03-26 22:43
Interestingly, using the configure/make/test options given in the devguide doesn't show these failures, but running on the exact same codebase (`hg checkout v3.4.0`) with the configure/make/test options in the README consistently shows them. That seems...not good. What are those flags doing that could suppress errors?

> Can you re-run your tests after applying the patches from that issue?

Initial results are that I'm seeing some failures, but not as many and somewhat different ones. Digging into the details now.
msg214919 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-03-26 22:48
Network tests are only run if the 'network' resource is enabled.  The devguide does not suggest setting network, but make test does do so.  (I personally always use -uall when running the tests).
msg214924 - (view) Author: Daniel Farrell (dfarrell07) * Date: 2014-03-27 00:36
Running the unit tests against the latest code (`hg clone <url>; ./configure; make; make test`) shows the same errors I described earlier. Patching the code with issue20939_all_3x.patch fixes all related errors except one.

=======================================================================
FAIL: test_fileno (test.test_urllibnet.urlopenNetworkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniel/cpython/Lib/test/test_urllibnet.py", line 107, in test_fileno
    self.assertTrue(f.read(), "reading from file created using fd "
AssertionError: b'' is not true : reading from file created using fd returned by fileno failed

----------------------------------------------------------------------

Something funky is going on with example.com for this test. I imagine one of you awesome people will understand this failure quickly. Until then, using a different non-HTTPS URL seems to work (see attached patch).
msg214925 - (view) Author: Daniel Farrell (dfarrell07) * Date: 2014-03-27 00:50
Note that issue21069.patch depends on issue20939_all_3x.patch (if that wasn't clear).

[~/cpython]$ hg import issue20939_all_3x.patch
applying issue20939_all_3x.patch
patching file Lib/test/test_urllib2.py
Hunk #1 succeeded at 1447 with fuzz 1 (offset 6 lines).
[~/cpython]$ hg import issue21069.patch
applying issue21069.patch
[~/cpython]$ ./configure
<snip>
[~/cpython]$ make
<snip>
[~/cpython]$ make test &> ~/test_success_after_patch.txt
<snip>

See attached output.
msg214942 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-27 08:40
New changeset adf1e04478b4 by Ned Deily in branch '3.4':
Issue #21069: Temporarily use www.google.com while investigating
http://hg.python.org/cpython/rev/adf1e04478b4

New changeset 8d4cace71113 by Ned Deily in branch 'default':
Issue 21069: merge from 3.4
http://hg.python.org/cpython/rev/8d4cace71113
msg214945 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2014-03-27 09:13
After pushing the changes for Issue20939, many of the buildbots started experiencing the test_fileno failure using www.example.com.  The interesting thing is that not all of them do, including my primary development system (OS X 10.9) which is why I didn't see a problem during the initial testing.  On another dev system (Debian Linux on a VM), the test fails intermittently. So, yes, something funky *is* going on.  One obvious difference is that the read from www.python.org returns 44000+ byes while www.example.com returns only 162 bytes.  There could be a race condition with the TCP connection close.  I've temporarily changed the test to use www.google.com pending resolution. Thanks for the report and investigation, Daniel.
msg214947 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2014-03-27 10:48
After looking at why the 2.7 version of the test does not fail, the problem became apparent.  In 2.7, test_errno tests urlopen() of the original deprecated urllib module.  In 3.x, the test was ported over but now uses urlopen() of urllib.request which is based on urllib2() of 2.x.

2.7:
>>> x = urllib.urlopen("http://www.example.com")
[79234 refs]
>>> x
<addinfourl at 3068742324L whose fp = <socket._fileobject object at 0xb6e7eea4>>
[79234 refs]
>>> os.fdopen(x.fileno()).read()
'<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset="utf-8" />\n    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />\n    <meta name="viewport" content="width=device-width, initial-scale=1" />\n    <style type="text/css">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 50px;\n        background-color: #fff;\n        border-radius: 1em;\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        body {\n            background-color: #fff;\n        }\n        div {\n            width: auto;\n            margin: 0 auto;\n            border-radius: 0;\n            padding: 1em;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission.</p>\n    <p><a href="http://www.iana.org/domains/example">More information...</a></p>\n</div>\n</body>\n</html>\n'
[79234 refs]

3.4 (when the read doesn't fail):
>>> x = urllib.request.urlopen("http://www.example.com")
>>> x
<http.client.HTTPResponse object at 0xb6bc7114>
>>> os.fdopen(x.fileno()).read()
__main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper name=4 mode='r' encoding='UTF-8'>
' without prior coordination or asking for permission.</p>\n    <p><a href="http://www.iana.org/domains/example">More information...</a></p>\n</div>\n</body>\n</html>\n'

In the 3.x case (and the 2.7 urllib2 case), the read from the file descriptor starts at mid-response or at the end (returning an empty byte string).  In the past, the test passed because of the amount of data returned by the previous test URL.  Now, with the short response from www.example.com, it's clear that the file descriptor read is not returning the whole response.  I don't know whether the file descriptor read is expected to be meaningful for urllib2/urllib.request.

Senthil, what do you think?
msg216400 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-04-15 22:10
I am getting to this late.

>> I don't know whether the file descriptor read is expected to be meaningful for urllib2/urllib.request.
>> Senthil, what do you think?

It should be meaningful no matter what the length is. I am looking further into this now.
msg216416 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-04-16 01:32
This is the best way I found to reproduce the failure.

I changed the resource to www.example.com and then ran this.

$ ./python.exe -m test -m "*fileno*" -u all -v -F test_urllibnet
msg216429 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-04-16 03:53
This is turning out be trickier than I ever thought.

fileno() returning b'' is just random error. The underlying issue is, *directly* reading from the fp of the socket() is returning a incomplete output at all times. The correct way to read the output is by using HTTPResponse read() method only it seems.

Here is a small snippet that will fail for every url.

from urllib.request import urlopen

import os

web = "http://www.example.org"

open_url = urlopen(web)
fd = open_url.fileno()
with os.fdopen(fd, 'rb') as f:
    file_read_len = len(f.read())

req = urlopen(web)
res_len = len(req.read())

assert file_read_len == res_len
msg216430 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2014-04-16 04:00
Ned Deily had done the correct analysis in the msg214947 and has this question - 

> I don't know whether the file descriptor read is expected to be meaningful for urllib2/urllib.request.

I can see that this test case was for the old behavior where we created a file like object to read using addinfourl and fileno /fp was explicitly set.  We have to determine if this expectation that we can access the socket's fp using HTTPResponse object is a right one.
msg216438 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2014-04-16 06:19
Senthil, thanks for looking into this.  Since it is turning out to be more of a urllib design issue, I'm going to deassign myself from it.
msg242576 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-05-04 19:58
I tried reproducing this on Windows 8.1 with the code snippet from msg216429 but got

Traceback (most recent call last):
  File "C:\Users\Mark\Documents\MyPython\mytest.py", line 9, in <module>
    with os.fdopen(fd, 'rb') as f:
  File "c:\python34\lib\os.py", line 980, in fdopen
    return io.open(fd, *args, **kwargs)
OSError: [Errno 9] Bad file descriptor

Is there something else that I can try?  Is this still an issue that needs looking into?
msg262733 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-01 09:50
Mark: My understanding is on Windows, winsock file descriptors and C library file descriptors are different beasts; see <https://docs.python.org/3/library/socket.html#socket.socket.fileno>. Perhaps the test should call socket functions like socket.recv() on the FD rather than C library functions via os.fdopen().

Victor: The test in this bug has started failing again, very likely due to your revision 7bd4736195ce enabling a timeout on the HTTP request. I guess this causes the socket to be in non-blocking mode, and read() to return None. This is what Issue 10119 tried to fix. Example:

http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/3404/steps/test/logs/stdio
======================================================================
FAIL: test_fileno (test.test_urllibnet.urlopenNetworkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/test_urllibnet.py", line 110, in test_fileno
    self.assertTrue(f.read(), "reading from file created using fd "
AssertionError: None is not true : reading from file created using fd returned by fileno failed

A less serious and long-standing problem with the test is that it attempts to close the socket twice. We are just lucky that socket.close() is called second, which does not raise any errors: <https://hg.python.org/cpython/annotate/8938b3132caa/Modules/socketmodule.c#l259>.

Regarding the purpose and use cases of fileno(), I agree with Senthil that using it to read the HTTP response behind the HTTPResponse object’s back in Python 3 is a bad idea, and I don’t think it is practical to make this work without losing the benefits of buffering. But there are probably other valid use cases such as calling getsockname() on the socket, or sending and receiving non-HTTP data after setting up a CONNECT tunnel.

Proposals:

1. Change the test to do use socket(fileno=...), rather than os.fdopen(...), so that it will be usable on Windows.

2. Ensure that the secondary socket object is not closed; use socket.detach()

3. Rewrite the test to test http.client directly, rather than indirectly through urlopen(). As far as I can see the purpose is only to test HTTPResponse.fileno(), not urlopen().

4. Rewrite the test to test a local server run in a background thread, rather than relying external web sites (currently Google, previously IETF, and Python). This would eliminate the need for setting a timeout.

5. Rewrite to the test for a more realistic use case that does not depend on specific internal HTTPResponse buffering and the HTTP protocol. I suggest mocking a CONNECT request, and uploading some non-HTTP data through the proxy.
msg262785 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-02 02:00
Here is a patch implementing my ideas. Let me know what you think of the ideas and/or the patch :)

If people think this change is too much for 3.5, I could try making a more minimal patch that calls something like socket.getpeername() rather than read(). That should also avoid the non-blocking problem.
msg262816 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-04-03 07:15
I added a timeout because test_urllibnet fails with a timeout after 15 minutes on the ARM buildbot and other tests of the same file also use a timeout.

I tried but failed to reproduce the timeout.

I ran the test after my change and it passes so I considered that it was ok. I understand your rationale for timeout, non blocking socket and read() returning None.

Can we modify the test to use a selector to wait until the socket is ready and only read available bytes, rather than reading all data?

Using a local server would also avoid blocking too long (take less than 15 min).
msg262822 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-03 10:34
Yes using select() would be another way to fix the immediate problem of read() returning None. In the long term I was hoping to use something like my patch to avoid the problems with reading the HTTP response (already discussed in this report) at the same time.
msg263074 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-04-09 08:17
I saw test_fileno failure again on the Gentoo buildbot: http://buildbot.python.org/all/builders/x86%20Gentoo%20Installed%20with%20X%203.x/builds/459/steps/test/logs/stdio

rewrite-fileno.patch looks good to me.
msg263099 - (view) Author: Roundup Robot (python-dev) Date: 2016-04-09 14:12
New changeset 00240ddce1d0 by Martin Panter in branch '3.5':
Issue #21069: Move test_fileno() from test_urllibnet and rewrite it
https://hg.python.org/cpython/rev/00240ddce1d0

New changeset 4c19396bd4a0 by Martin Panter in branch 'default':
Issue #21069: Merge test_fileno() from 3.5
https://hg.python.org/cpython/rev/4c19396bd4a0
History
Date User Action Args
2016-04-09 22:27:15martin.pantersetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2016-04-09 14:12:54python-devsetmessages: + msg263099
2016-04-09 08:17:23berker.peksagsetnosy: + berker.peksag

messages: + msg263074
stage: patch review -> commit review
2016-04-03 10:34:04martin.pantersetmessages: + msg262822
2016-04-03 07:15:23hayposetmessages: + msg262816
2016-04-02 02:00:48martin.pantersetfiles: + rewrite-fileno.patch
keywords: + patch
messages: + msg262785

stage: needs patch -> patch review
2016-04-01 11:17:49martin.panterlinkissue26685 dependencies
2016-04-01 10:52:23BreamoreBoysetnosy: - BreamoreBoy
2016-04-01 09:50:29martin.pantersettype: behavior
title: test_fileno of test_urllibnet intermittently fails when using www.example.com -> test_fileno of test_urllibnet intermittently fails
components: + Tests

keywords: + buildbot, - patch
nosy: + martin.panter, haypo
versions: + Python 3.6, - Python 3.4
messages: + msg262733
stage: test needed -> needs patch
2015-05-04 19:58:58BreamoreBoysetnosy: + BreamoreBoy
messages: + msg242576
2014-04-16 06:19:19ned.deilysetassignee: ned.deily ->
messages: + msg216438
2014-04-16 04:00:14orsenthilsetmessages: + msg216430
2014-04-16 03:53:02orsenthilsetmessages: + msg216429
2014-04-16 01:32:41orsenthilsetmessages: + msg216416
2014-04-15 22:10:42orsenthilsetmessages: + msg216400
2014-03-27 12:07:28Arfreversetnosy: + Arfrever
2014-03-27 10:48:40ned.deilysetnosy: + orsenthil
messages: + msg214947
2014-03-27 09:13:32ned.deilysetpriority: normal -> high

title: urllib unit tests failing without ssl module -> test_fileno of test_urllibnet intermittently fails when using www.example.com
messages: + msg214945
stage: needs patch -> test needed
2014-03-27 08:40:39python-devsetnosy: + python-dev
messages: + msg214942
2014-03-27 07:16:32ned.deilysetassignee: ned.deily
stage: needs patch

nosy: + ned.deily
versions: + Python 3.5
2014-03-27 00:50:35dfarrell07setfiles: + test_success_after_patch.txt

messages: + msg214925
2014-03-27 00:36:00dfarrell07setfiles: + issue21069.patch
keywords: + patch
messages: + msg214924
2014-03-26 22:48:32r.david.murraysetmessages: + msg214919
2014-03-26 22:43:04dfarrell07setmessages: + msg214918
2014-03-26 16:43:52r.david.murraysetmessages: + msg214901
2014-03-26 16:08:32dfarrell07setmessages: + msg214900
2014-03-26 13:40:33r.david.murraysetmessages: + msg214898
2014-03-26 13:02:05r.david.murraysetnosy: + r.david.murray
messages: + msg214897
2014-03-26 08:10:37dfarrell07create