This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: urllib2 request does not update content length after new add_data
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.6, Python 2.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: barry, fdrake, nirai, orsenthil, pablomouzo, r.david.murray, till
Priority: normal Keywords: patch

Created on 2009-12-18 15:24 by till, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
bad_fix.diff pablomouzo, 2009-12-27 17:18 This fix breaks the tests
Messages (16)
msg96567 - (view) Author: Till Maas (till) Date: 2009-12-18 15:23
When I try to reuse a urllib2.Request object with different post data,
the data itself is updated, but the content length is not. Here is a
simple script to reproduce it on Python 2.5 on Fedora 10 and 2.6 on Arch
Linux:

#!/usr/bin/python
# vim: fileencoding=utf8 
# test with:  echo | socat - tcp4-listen:1111,fork
# Demonstrates bad content length of second request, which should be 2
import urllib2
req = urllib2.Request('http://localhost:1111')
req.add_data("1")
urllib2.urlopen(req)
req.add_data("10")
urllib2.urlopen(req)
msg96915 - (view) Author: Pablo Mouzo (pablomouzo) Date: 2009-12-27 17:18
The problem here is that the headers are not updated if they already 
exists. The solution is quite simple but breaks the tests because it 
"clobbers the existing headers".

You can do this:
...
req.add_data(some_data)
req.add_unredirected_header('Content-Length', len(some_data))
urllib2.urlopen(req)
...

But is risky because all the other headers are still outdated.

Is there any reason why you need to reuse the request object?
msg96920 - (view) Author: Till Maas (till) Date: 2009-12-27 20:27
I do not need to reuse a request object, but I did in a script when only
the data was different for each request. If this is not meant to be
done, then any not meant to be done modification should somehow create
an error, when it is done, instead of silently performing broken HTTP
requests.
msg100067 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-02-24 20:56
Reusing a request object and doing add_data is a bad idea. 
Made changes to raise TypeError when you try to that. Fix committed in revision 78431.
msg100068 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-02-24 21:05
merged into other branches r78432, r78433, r78434.
msg100353 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-03 22:03
This change breaks existing uses of Python 2.6.4.

The mechanize library frequently re-initializes the data in the request
without re-using the request.  Applications (including tests) that use
mechanize now break with this TypeError.

The proper response to this issue for Python 2.6 is to make no code
changes (though a documentation enhancement may be in order).

This change should be reverted from all branches.

Python 2.6.5 should be blocked until this is done.
msg100354 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-03-03 22:08
Since this is a regression from 2.6.4, it's a release blocker for 2.6.5 and probably will require an rc2.
msg100356 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-03 22:10
To clarify: Multiple calls to add_data on a urllib2 request, when the request isn't being reused, are in no way invalidated by the problem initially reported.
msg100374 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-04 04:03
fdrake and barry:
Shall I revert the changes made in the release26-maint branches then? I documentation fix in those branches may be sufficient with warning in the documentation.
msg100375 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-04 06:18
I'd like to hear at least one other say-so (which can be yours if you
agree this is the right thing), so there's more recognized consensus
on the matter.  We also need an explicit go-ahead from Barry as the
release manager.

At this point, the community sees only my assertion that this is a
regression from 2.6.4.

I'd be fine with a documentation improvement going in as well (as a
separate commit), but that's not a requirement to deal with the
release block.  Before the release, it would require a separate
go-ahead from the release manager.
msg100385 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-03-04 12:30
Fred, I think your report of observed breakage is sufficient to warrant rolling back the change, especially since this isn't an actual bug fix.  That is, no code that was failing to work before would be enabled to work by this fix.  (Rather, it aims to prevent broken code from being written...which it could be argued is inappropriate for a backport anyway, though you could argue it either way.)
msg100387 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2010-03-04 14:14
Please roll this back for 2.6.5rc2.  Thanks.
msg100389 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-04 15:20
Reverted the changes in the revision 78651
No longer a release blocker for rc2.
msg100408 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-04 19:14
Reopening; per discussion on IRC, the change needs to be reverted on the other three branches to which it was applied.

If code changes are needed to make unsupported usage fail early, they need to be considered carefully and only applied as a new feature.
msg100561 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-07 04:39
Change is reverted from other branches too. Discussed in IRC, that changes made to an existing API like add_data,may break some existing applications, even if its done to prevent unsupported usage. 

Specifically, in mechanize's case, it was not reusing req object, but using add_data method more than once before req was getting submitted. This is under scenarios wherein code as in zope.testbrowser) is centered on the browser controls; any twiddle to a control could update the req object before its submitted. This could be changed in mechanize, but in general agreed to fact that changes to existing API should not result in breaking of apps.

We can address the non-reusablity of request by adding a boolean that signifies if a request has been used and throw an exception if one is used a second time ( benji's suggestion). And of course clarify the supported and intended scenarios via documentation.
msg100785 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010-03-10 13:56
The reverts in SVN look good to me; re-closing this issue as no further action is required.

If there's a proposal for specific changes to urllib2 to improve diagnostics in unsupported cases, that should be detailed in a separate issue.

Marking this issue "invalid" since the original reported symptom was determined to be caused by an unsupported use (re-use of the request).
History
Date User Action Args
2022-04-11 14:56:55adminsetgithub: 51789
2010-04-27 20:33:34loewissetpriority: normal
2010-03-10 13:56:24fdrakesetstatus: open -> closed
resolution: accepted -> not a bug
messages: + msg100785
2010-03-07 04:39:12orsenthilsetresolution: fixed -> accepted
messages: + msg100561
2010-03-04 19:14:45fdrakesetstatus: closed -> open

messages: + msg100408
2010-03-04 15:20:48orsenthilsetstatus: open -> closed
priority: release blocker -> (no value)
resolution: accepted -> fixed
messages: + msg100389
2010-03-04 14:14:55barrysetresolution: fixed -> accepted
messages: + msg100387
2010-03-04 12:30:41r.david.murraysetnosy: + r.david.murray
messages: + msg100385
2010-03-04 06:18:07fdrakesetmessages: + msg100375
2010-03-04 04:03:29orsenthilsetmessages: + msg100374
2010-03-03 22:10:56fdrakesetmessages: + msg100356
2010-03-03 22:08:50barrysetpriority: release blocker
nosy: + barry
messages: + msg100354

2010-03-03 22:03:48fdrakesetstatus: closed -> open
nosy: + fdrake
messages: + msg100353

2010-02-24 21:05:45orsenthilsetstatus: open -> closed

messages: + msg100068
2010-02-24 20:56:54orsenthilsetresolution: fixed
messages: + msg100067
2009-12-27 20:27:30tillsetmessages: + msg96920
2009-12-27 17:18:18pablomouzosetfiles: + bad_fix.diff

nosy: + pablomouzo
messages: + msg96915

keywords: + patch
2009-12-25 07:22:44niraisetnosy: + nirai
2009-12-21 03:07:15orsenthilsetassignee: orsenthil

nosy: + orsenthil
2009-12-18 15:24:00tillcreate