Title: Python3 regression for urllib(2).urlopen(...).fp for chunked http responses
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.4, Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: tkruse, xtreak
Priority: normal Keywords:

Created on 2018-09-11 16:36 by tkruse, last changed 2018-09-11 17:52 by xtreak.

File name Uploaded Description Edit tkruse, 2018-09-11 16:36
Messages (1)
msg325025 - Author: Thibault Kruse (tkruse) Date: 2018-09-11 16:36
We ran into a problem with code that downloads files from GitHub when porting from Python 2.7 to Python 3.[3-7]. We are not sure whether this is a bug.

With the given code, in Python 3 a file downloaded with chunked transfer encoding also contains the chunk sizes when it is read through the undocumented fp attribute of the urlopen(...) response. In Python 2, only the chunk payload made it into the file.
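To illustrate why the extra bytes appear, here is a minimal sketch of HTTP/1.1 chunked framing that runs without any network access. The helper names encode_chunked and decode_chunked are hypothetical and not part of urllib or http.client, and the decoder ignores trailers and chunk extensions. Reading the raw socket stream yields the framed bytes, while a decoding reader yields only the payload.

```python
def encode_chunked(payload: bytes, chunk_size: int) -> bytes:
    """Frame payload the way a chunked HTTP/1.1 body appears on the wire."""
    out = bytearray()
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        out += b"%x\r\n" % len(chunk)   # chunk size in hex, then CRLF
        out += chunk + b"\r\n"          # chunk data, then CRLF
    out += b"0\r\n\r\n"                 # zero-size chunk terminates the body
    return bytes(out)


def decode_chunked(wire: bytes) -> bytes:
    """Strip the chunk framing and return only the payload.

    Simplified: assumes no trailers and no chunk extensions.
    """
    payload = bytearray()
    pos = 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol], 16)
        if size == 0:
            break
        start = eol + 2
        payload += wire[start:start + size]
        pos = start + size + 2          # skip the CRLF after the chunk data
    return bytes(payload)


content = b"x" * 10
wire = encode_chunked(content, chunk_size=4)
```

In this sketch, wire is longer than content because of the size lines and CRLF separators, which is the same kind of difference the report describes between the raw stream and the decoded response.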

We assume we can fix this by using the urlopen response directly (without '.fp'), but thought it might still be useful to report the difference.

Short code:
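import os, shutil, tempfile
from urllib.request import urlopen   # Python 2: from urllib2 import urlopen
fdesc, path = tempfile.mkstemp()     # assumption: fdesc comes from mkstemp()
resp = urlopen('http://someurl')
fhand = os.fdopen(fdesc, "wb")
shutil.copyfileobj(resp.fp, fhand)   # using .fp here is the dodgy part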
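As a rough way to reproduce the difference locally on Python 3, the sketch below serves a chunked response from a throwaway loopback server and reads it once through the response object and once through its fp attribute. This is not the attached script: ChunkedHandler, fetch, FILE_CONTENT and CHUNK are hypothetical names and values chosen for illustration, and the opener with an empty ProxyHandler is only there so that a configured proxy does not interfere with the loopback request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

FILE_CONTENT = b"a" * 10000   # hypothetical payload
CHUNK = 1024                  # hypothetical chunk size


class ChunkedHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # chunked encoding requires HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for i in range(0, len(FILE_CONTENT), CHUNK):
            chunk = FILE_CONTENT[i:i + CHUNK]
            self.wfile.write(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
        self.wfile.write(b"0\r\n\r\n")
        self.close_connection = True   # make sure the raw reader sees EOF

    def log_message(self, *args):
        pass   # keep the demo quiet


def fetch(use_fp):
    """Return the body read via resp.fp (raw) or via resp (decoded)."""
    server = HTTPServer(("127.0.0.1", 0), ChunkedHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        opener = build_opener(ProxyHandler({}))   # bypass any proxy settings
        url = "http://127.0.0.1:%d/" % server.server_port
        with opener.open(url) as resp:
            source = resp.fp if use_fp else resp
            return source.read()
    finally:
        server.shutdown()
        server.server_close()
```

Under these assumptions, fetch(use_fp=False) should return exactly FILE_CONTENT, while fetch(use_fp=True) returns a longer byte string that still carries the chunk framing. The same reasoning suggests that shutil.copyfileobj(resp, fhand) writes only the payload.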

The attached script demonstrates the difference:

$ python --version
Python 2.7.15rc1
$ python - - [12/Sep/2018 01:27:28] "GET /downloads/1.0.tar.gz HTTP/1.1" 200 -

$ python3 --version
Python 3.6.5
$ python3 - - [12/Sep/2018 01:27:37] "GET /downloads/1.0.tar.gz HTTP/1.1" 200 -
Traceback (most recent call last):
  File "", line 87, in <module>
    assert data == FILE_CONTENT, '%s, %s'%(len(FILE_CONTENT), len(data))
AssertionError: 100000, 100493
!!! BASH reports ERROR: shell returned 1
Date                 User    Action  Args
2018-09-11 17:52:22  xtreak  set     nosy: + xtreak
2018-09-11 16:36:09  tkruse  create