This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: urllib.request.urlopen does not return an iterable object
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Arfrever, ajaksu2, bbrazil, facundobatista, flox, jhylton, jwilk, marduk, orsenthil, pl, python-dev, rhettinger, santoso.wijaya, sschwarzer, zanella
Priority: high Keywords: easy, patch

Created on 2008-12-09 12:13 by jwilk, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
issue4608_py31.diff orsenthil, 2008-12-21 05:47
issue4608_py31-v2.diff orsenthil, 2009-01-02 21:30
tests_issue4608_py31.diff ajaksu2, 2009-02-08 23:38 Test addinfourl iterability
tests-iter-urllib-py3k.patch bbrazil, 2010-08-08 13:23 Fix short write and add tests.
issue4608.diff sschwarzer, 2011-06-25 16:37 Fix allowing iteration over urlopen results for "ftp://" and "file://" URLs.
Messages (19)
msg77405 - (view) Author: Jakub Wilk (jwilk) Date: 2008-12-09 12:13
$ cat urltest2.5
#!/usr/bin/python2.5
from urllib2 import urlopen
for line in urlopen('http://python.org/'):
        print line
        break

$ ./urltest2.5
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


$ cat urltest3.0
#!/usr/bin/python3.0
from urllib.request import urlopen
for line in urlopen('http://python.org/'):
        print(line)
        break

$ ./urltest3.0
Traceback (most recent call last):
  File "./urltest3.0", line 3, in <module>
    for line in urlopen('http://python.org/'):
TypeError: 'addinfourl' object is not iterable
msg77409 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2008-12-09 12:50
I verified this bug in Py3.0 and Py3.1. I will come up with a patch
for it.
msg77415 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2008-12-09 13:48
Oops.  I didn't think to translate the code in addinfobase to the new
style of iterators.

Jeremy

msg78139 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2008-12-21 05:47
Here is a patch to fix the issue.
Jeremy, is this approach okay? Or do you have any other suggestions?
msg78416 - (view) Author: Jakub Wilk (jwilk) Date: 2008-12-28 16:03
Regarding Senthil's patch:
__next__() method seems superfluous to me (and the implementation is buggy).
msg78880 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-01-02 21:30
Jakub,

I have attached a revision to the patch.
You are right: when __iter__ returns self.fp (as in the previous patch),
__next__ is superfluous.
However, I was thinking of __iter__ returning an instance of addbase
instead of self.fp, in which case __next__ is required. I see now
that I had not changed self.fp to self.

This is implemented along the lines of the IOBase class in io.py.
With regard to your other comment, why do you think the __next__
implementation is incorrect?

Thanks,
Senthil
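
For readers following the two designs discussed above, here is a minimal sketch of the difference. The class names `DelegatingWrapper` and `SelfIteratingWrapper` are hypothetical stand-ins, not the classes from the attached patches: the first returns the wrapped file object from `__iter__` and so needs no `__next__`, while the second returns `self` and therefore has to supply `__next__` itself.

```python
import io


class DelegatingWrapper:
    """Hypothetical wrapper: __iter__ hands back the wrapped file object."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        # The file object is its own iterator, so no __next__ is needed here.
        return self.fp


class SelfIteratingWrapper:
    """Hypothetical wrapper: __iter__ returns self, so __next__ is required."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        return self

    def __next__(self):
        line = self.fp.readline()
        if not line:
            # An empty read means end of data.
            raise StopIteration
        return line


delegating_lines = list(DelegatingWrapper(io.BytesIO(b"one\ntwo\n")))
self_iterating_lines = list(SelfIteratingWrapper(io.BytesIO(b"one\ntwo\n")))
```

Both variants yield the same lines; the choice only affects which object the `for` loop ends up calling `__next__` on.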
msg78944 - (view) Author: Jakub Wilk (jwilk) Date: 2009-01-03 10:52
Oops, __next__ is OK. Sorry for the confusion.
msg79575 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2009-01-10 21:00
Senthil, do you think you could provide a test case for this?

Thank you!
msg81426 - (view) Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-08 23:38
Test cases attached.

The second one highlights a bug in the current patch, as it fails to
return a line longer than 65475 chars. This behavior doesn't match trunk's.
msg86365 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-04-23 12:32
This issue was already fixed by Jeremy in revision 70815, wherein "The
response from an HTTP request is now an HTTPResponse instance instead of
an addinfourl() wrapper instance."

So the issue is no longer present in the py3k code (confirmed).

However, the test added by Daniel, which tests urlopen() for a
request which is b"verylong" * 8192, still fails.

It is not just iteration that is affected; test_200 will fail too if the
request is a large chunk. This happens only in the py3k branch; the test
passes in the trunk code. I am investigating further.
msg113245 - (view) Author: Brian Brazil (bbrazil) * Date: 2010-08-08 09:43
This looks as though it's a short write:

[pid 28343] recvfrom(5, "GET / HTTP/1.1\r\nAccept-Encoding:"..., 8192, 0, NULL, NULL) = 118
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "HTTP/1.0 200 OK\r\n", 17, 0, NULL, 0) = 17
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "Server: TestHTTP/ Python/3.2a1+\r"..., 33, 0, NULL, 0) = 33
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "Date: Sun, 08 Aug 2010 09:41:08 "..., 37, 0, NULL, 0) = 37
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "Content-type: text/plain\r\n", 26, 0, NULL, 0) = 26
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "\r\n", 2, 0, NULL, 0) = 2
[pid 28343] poll([{fd=5, events=POLLOUT, revents=POLLOUT}], 1, 10000) = 1
[pid 28343] sendto(5, "verylongverylongverylongverylong"..., 56001, 0, NULL, 0) = 49054
[pid 28343] shutdown(5, 1 /* send */)   = 0
msg113260 - (view) Author: Brian Brazil (bbrazil) * Date: 2010-08-08 13:23
The attached patch handles short writes, and adds ajaksu2's tests.
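
For context, in the trace in msg113245 the last sendto was asked to send 56001 bytes but reported only 49054 as sent, and the socket was shut down without the remainder going out. The following is a minimal sketch of the usual remedy, which is to loop until every byte has been accepted. It is not taken from the attached patch; `send_all` and `ShortWriteSocket` are hypothetical names used only for illustration.

```python
def send_all(sock, data):
    """Call sock.send() repeatedly until all of data has been accepted."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # send() may accept fewer bytes than offered; it returns the count.
        sent = sock.send(view[total:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total


class ShortWriteSocket:
    """Hypothetical fake socket accepting at most `limit` bytes per send()."""

    def __init__(self, limit):
        self.limit = limit
        self.received = bytearray()
        self.calls = 0

    def send(self, data):
        self.calls += 1
        chunk = bytes(data[:self.limit])
        self.received += chunk
        return len(chunk)


payload = b"verylong" * 7000
fake = ShortWriteSocket(limit=49054)
sent_total = send_all(fake, payload)
```

With real sockets, the standard library's `socket.sendall()` performs the same kind of retry loop.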
msg113280 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-08-08 16:27
Thanks, Brian.
Pushed with revision 83833.
msg134129 - (view) Author: Albert Hopkins (marduk) Date: 2011-04-20 07:54
This issue appears to persist when the protocol used is FTP:


root@tp-db $ cat test.py
from urllib.request import urlopen
for line in urlopen('ftp://gentoo.osuosl.org/pub/gentoo/releases/'):
        print(line)
        break

root@tp-db $ python3.2 test.py
Traceback (most recent call last):
  File "test.py", line 2, in <module>
    for line in urlopen('ftp://gentoo.osuosl.org/pub/gentoo/releases/'):
TypeError: 'addinfourl' object is not iterable
msg134130 - (view) Author: Albert Hopkins (marduk) Date: 2011-04-20 07:56
Oops, the previous example was a directory, but it's the same if the URL points to an FTP file.
msg134319 - (view) Author: Rafael Zanella (zanella) Date: 2011-04-24 00:30
The patch that makes addinfourl() iterable was not committed because of the change to HTTP requests; see msg86365 (http://bugs.python.org/issue4608#msg86365).

Since urllib is protocol agnostic, it should behave the same with FTP, right?

So, where should this be fixed? Change addinfourl() to become iterable, or change what FTPHandler returns?
msg139100 - (view) Author: Stefan Schwarzer (sschwarzer) Date: 2011-06-25 16:37
It turned out that although the addinfourl instance had the `__iter__` attribute in `addbase.__init__` correctly assigned, `__iter__` wasn't found by the `iter` builtin. It seems that `iter` always tries to use the `__iter__` method of the _class_ and doesn't look at the instance.

Riccardo Attilio Galli and I made the attached patch. The patch also fixes a corresponding `TypeError` for "file://" URLs, not just "ftp://" URLs.
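
The lookup behavior described above can be reproduced in isolation. This is a minimal sketch with hypothetical class names (`InstanceOnly`, `ClassLevel`), not the code from the patch: one class assigns `__iter__` on the instance in `__init__`, the other defines it on the class.

```python
import io


class InstanceOnly:
    """Hypothetical: __iter__ is set on the instance, which iter() ignores."""

    def __init__(self, fp):
        self.fp = fp
        self.__iter__ = fp.__iter__  # instance attribute only


class ClassLevel:
    """Hypothetical: __iter__ is defined on the class, where iter() looks."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        return iter(self.fp)


def is_iterable(obj):
    """Return True if the iter() builtin accepts obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


instance_only = InstanceOnly(io.BytesIO(b"a\nb\n"))
class_level = ClassLevel(io.BytesIO(b"a\nb\n"))

print(hasattr(instance_only, "__iter__"))  # the attribute is there
print(is_iterable(instance_only))          # but iter() does not use it
print(is_iterable(class_level))
```

Special methods such as `__iter__` are looked up on the object's type when invoked implicitly, which is why the instance attribute is present yet has no effect on `iter()` or `for` loops.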
msg139170 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-06-26 12:30
New changeset c0a68b948f5d by Raymond Hettinger in branch '3.2':
Issue #4608: urllib.request.urlopen does not return an iterable object
http://hg.python.org/cpython/rev/c0a68b948f5d

New changeset d4aeeddf72e3 by Raymond Hettinger in branch 'default':
Issue #4608: urllib.request.urlopen does not return an iterable object
http://hg.python.org/cpython/rev/d4aeeddf72e3
msg139171 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-06-26 12:31
Thanks for the patch.
History
Date User Action Args
2022-04-11 14:56:42 admin set github: 48858
2011-06-26 12:31:11 rhettinger set status: open -> closed; messages: + msg139171
2011-06-26 12:30:41 python-dev set nosy: + python-dev; messages: + msg139170
2011-06-25 16:44:24 rhettinger set assignee: flox -> rhettinger; nosy: + rhettinger
2011-06-25 16:37:52 sschwarzer set files: + issue4608.diff; nosy: + sschwarzer; messages: + msg139100
2011-04-24 00:30:56 zanella set nosy: + zanella; messages: + msg134319
2011-04-20 16:03:49 santoso.wijaya set nosy: + santoso.wijaya
2011-04-20 15:24:45 Arfrever set nosy: + Arfrever
2011-04-20 08:29:53 orsenthil set status: closed -> open
2011-04-20 07:56:49 marduk set messages: + msg134130
2011-04-20 07:54:24 marduk set nosy: + marduk; messages: + msg134129
2010-08-08 16:27:42 flox set status: open -> closed; assignee: facundobatista -> flox; nosy: + flox; messages: + msg113280; resolution: fixed; stage: patch review -> resolved
2010-08-08 13:23:07 bbrazil set files: + tests-iter-urllib-py3k.patch; messages: + msg113260
2010-08-08 09:43:22 bbrazil set nosy: + bbrazil; messages: + msg113245
2010-05-11 20:52:40 terry.reedy set versions: + Python 3.2, - Python 3.0
2009-04-23 12:32:00 orsenthil set messages: + msg86365
2009-04-22 18:50:23 ajaksu2 set priority: high; keywords: + easy; stage: patch review
2009-02-08 23:38:48 ajaksu2 set files: + tests_issue4608_py31.diff; nosy: + ajaksu2; messages: + msg81426
2009-01-10 21:00:05 facundobatista set assignee: facundobatista; messages: + msg79575; nosy: + facundobatista
2009-01-03 10:52:57 jwilk set messages: + msg78944
2009-01-02 21:30:42 orsenthil set files: + issue4608_py31-v2.diff; messages: + msg78880
2008-12-28 16:03:18 jwilk set messages: + msg78416
2008-12-21 05:47:52 orsenthil set versions: + Python 3.1
2008-12-21 05:47:19 orsenthil set files: + issue4608_py31.diff; keywords: + patch; messages: + msg78139
2008-12-09 13:48:16 jhylton set nosy: + jhylton; messages: + msg77415
2008-12-09 12:50:59 orsenthil set nosy: + orsenthil; messages: + msg77409
2008-12-09 12:13:42 jwilk create