Message 261287 - Python tracker


Author	maubp
Recipients	maubp
Date	2016-03-07.10:10:52
Message-id	<1457345453.59.0.0327827539319.issue26499@psf.upfronthosting.co.za>

Content

This is a regression in Python 3.5, reproduced under Linux and Mac OS X. It was spotted through a failing test in Biopython (https://github.com/biopython/biopython/issues/773) that parses a file fetched from the internet. The trigger is reading part of the network handle line by line (e.g. until an end-of-record marker is found) and then calling handle.read() to fetch any remaining data. Self-contained examples below.

Note that a partial read with a fixed byte count, followed by handle.read(), still works:


$ python3.5
Python 3.5.0 (default, Sep 14 2015, 12:13:24) 
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> chunk = handle.read(50)
>>> rest = handle.read()
>>> handle.close()


However, the following variants read a few lines and then call handle.read(), which fails. The URL is not important, as long as the response has more than four lines.

Using readline,


>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> for i in range(4):
...     line = handle.readline()
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)


Using line iteration via next,


>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> for i in range(4):
...     line = next(handle)
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)


Using line iteration directly,


>>> from urllib.request import urlopen
>>> count = 0
>>> handle = urlopen("http://www.python.org")
>>> for line in handle:
...     count += 1
...     if count == 4:
...         break
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)



These examples all worked on Python 3.3 and 3.4, so this is a regression in 3.5.
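
Since the URL is not important, the same access pattern can also be checked without internet access. The following is a minimal sketch, not part of the original report: it serves a small fixed body from a local http.server instance and runs the three line-reading variants against it. The names BODY, FixedBodyHandler, via_readline, via_next, via_for_loop, lines_then_rest and check_all_variants are hypothetical, and an opener with an empty ProxyHandler is used instead of urlopen only so that proxy environment variables cannot interfere with the local request. Whether a payload this small triggers the failure on an affected 3.5 build has not been verified.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

# Hypothetical payload: ten short lines, so four can be consumed first.
BODY = b"".join(b"line %d\n" % i for i in range(10))


class FixedBodyHandler(BaseHTTPRequestHandler):
    """Serve BODY with an explicit Content-Length header."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def via_readline(handle, n):
    """Consume n lines with handle.readline()."""
    for _ in range(n):
        handle.readline()


def via_next(handle, n):
    """Consume n lines with next(handle)."""
    for _ in range(n):
        next(handle)


def via_for_loop(handle, n):
    """Consume n lines by iterating over the handle and breaking out."""
    count = 0
    for _line in handle:
        count += 1
        if count == n:
            break


def lines_then_rest(url, consume, n_lines=4):
    """Consume n_lines with the given strategy, then return handle.read()."""
    # Empty ProxyHandler: ignore any proxy settings for the local request.
    handle = build_opener(ProxyHandler({})).open(url)
    try:
        consume(handle, n_lines)
        return handle.read()
    finally:
        handle.close()


def check_all_variants():
    """Run each variant against a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), FixedBodyHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        variants = (via_readline, via_next, via_for_loop)
        return {f.__name__: lines_then_rest(url, f) for f in variants}
    finally:
        server.shutdown()
        server.server_close()
```

Calling check_all_variants() returns a dict mapping each variant name to the bytes that handle.read() returned after four lines were consumed. On an interpreter without the regression, each value should be the last six lines of BODY. On an affected build, the tracebacks above suggest that handle.read() would raise http.client.IncompleteRead instead.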
