Issue1625
Created on 2007-12-14 09:20 by therve, last changed 2009-10-26 18:53 by pitrou.
| File name | Uploaded | Description |
| bz2_patch.tar.bz2 | dbonner, 2009-09-29 20:07 | issue1625 - bz2 multiple stream patch v1 |
| py3k_bz2.patch | dbonner, 2009-10-01 14:21 | issue1625 - bz2 multiple stream patch v2 |
| py3k_bz2.patch | dbonner, 2009-10-07 20:36 | issue1625 - bz2 multiple stream patch v3 |
| py3k_bz2.patch | dbonner, 2009-10-21 20:09 | issue1625 - bz2 multiple stream patch v4 |
msg58619 - (view) |
Author: Thomas Herve (therve) |
Date: 2007-12-14 09:20 |
|
The BZ2File class only supports one stream per file. It is possible to have
multiple streams concatenated in one file, and the resulting data should
be the concatenation of all the streams. That's what the bunzip2 program
produces, for example. It's also supported by the gzip module.
Once this is done, it would add the ability to open a file for appending,
by adding another stream to the file.
I'll probably try to do this, but the fact that it's done in C (unlike gzip)
makes it harder, so if someone beats me to it, etc.
|
|
msg59897 - (view) |
Author: Thomas Lee (thomas.lee) |
Date: 2008-01-14 13:31 |
|
If you're referring to an 'append' mode for bz2file objects, it may be a
limitation of the underlying library: my version of bzlib.h only
provides BZ2_bzWriteOpen and BZ2_bzReadOpen - it's not immediately clear
how you would open a BZ2File in append mode looking at this API.
It may be possible to implement r/w/a using the lower-level
bzCompress/bzDecompress functions, but I doubt that's going to happen
unless somebody (such as yourself? :)) cares deeply about this.
|
|
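For illustration only, the lower-level route mentioned above can be sketched in Python rather than C. This is not the patch discussed in this issue. It assumes that `bz2.BZ2Compressor`, the module's incremental compressor, is a reasonable stand-in for the C-level compress calls, and that a decompressor with the `unused_data` attribute is available. The helper names `append_stream` and `demo_append` are hypothetical.

```python
import bz2
import os
import tempfile


def append_stream(path, payload):
    """Append `payload` to `path` as a new, self-contained bz2 stream.

    Hypothetical helper: the raw file is opened in append mode and one
    complete stream (header, data, end-of-stream marker) is written.
    """
    compressor = bz2.BZ2Compressor()
    chunk = compressor.compress(payload) + compressor.flush()
    with open(path, "ab") as raw:
        raw.write(chunk)
    return len(chunk)


def demo_append():
    """Write two streams to a temporary file and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        append_stream(path, b"first\n")
        append_stream(path, b"second\n")
        with open(path, "rb") as raw:
            return raw.read()
    finally:
        os.remove(path)
```

The resulting file is simply two streams back to back, so no special append support is needed on the writing side. The open question in this issue is whether the reader continues past the first end-of-stream marker.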
msg60236 - (view) |
Author: A.M. Kuchling (akuchling) |
Date: 2008-01-19 22:00 |
|
Like gzip, you can concatenate two bzip2 files:
bzip2 -c /etc/passwd >/tmp/pass.bz2
bzip2 -c /etc/passwd >>/tmp/pass.bz2
bunzip2 will output both parts, generating two copies of the file.
So nothing needs to be done on compression, but uncompression needs to
look for another chunk of compressed data after finishing one chunk.
|
|
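The "look for another chunk after finishing one chunk" step can be sketched in Python as a loop over decompressor objects. This is a minimal sketch, not the C implementation under review. It assumes a Python version where `bz2.BZ2Decompressor` exposes both `unused_data` and `eof` (the latter exists in Python 3.3 and later). The function name `decompress_all` is hypothetical.

```python
import bz2


def decompress_all(data):
    """Decompress every bz2 stream concatenated in `data`.

    Hypothetical helper: each stream gets a fresh decompressor, and
    whatever follows the end-of-stream marker is fed to the next one.
    """
    parts = []
    while data:
        decompressor = bz2.BZ2Decompressor()
        parts.append(decompressor.decompress(data))
        if not decompressor.eof:
            # The input ended before the stream's end marker was seen.
            raise ValueError("truncated bz2 stream")
        data = decompressor.unused_data
    return b"".join(parts)
```

A usage example: `decompress_all(bz2.compress(b"a") + bz2.compress(b"b"))` yields the two payloads joined together, which matches what bunzip2 does with the concatenated file above.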
msg60268 - (view) |
Author: Thomas Herve (therve) |
Date: 2008-01-20 09:12 |
|
The gzip module supports reopening an existing file to add another
stream. I think the bz2 module should do the same.
|
|
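The gzip behaviour referred to here can be shown with a short sketch. It uses only `gzip.open` with the `"ab"` mode. The function name `gzip_append_roundtrip` and the temporary file are hypothetical.

```python
import gzip
import os
import tempfile


def gzip_append_roundtrip():
    """Write one gzip member, append a second, and read both back."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wb") as f:
            f.write(b"first\n")
        # Reopening in append mode adds a second member to the same file.
        with gzip.open(path, "ab") as f:
            f.write(b"second\n")
        # Reading returns the concatenation of all members.
        with gzip.open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```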
msg93323 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-09-29 19:36 |
|
I've got a patch that fixes this. It allows BZ2File to read
multi-stream files as generated by pbzip2, allows BZ2File to open files
in append mode, and also updates bz2.decompress to allow it to handle
multi-stream chunks of data.
We originally wrote it against 2.5, but I've updated the patch to py3k
trunk, and attached it here. If there's interest in a patch against 2.7
trunk, please let me know.
|
|
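The three behaviours listed in this message (multi-stream reads, append mode, and multi-stream `bz2.decompress`) can be exercised with the sketch below. It does not use the attached patch. It assumes Python 3.3 or later, where the standard library documents all three. Whether that later implementation derives from this patch is not established in this thread. The function name `bz2_append_roundtrip` is hypothetical.

```python
import bz2
import os
import tempfile


def bz2_append_roundtrip():
    """Write a stream, append another, and read back both ways."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        with bz2.BZ2File(path, "w") as f:
            f.write(b"first\n")
        # Append mode adds a second stream (assumes Python >= 3.3).
        with bz2.BZ2File(path, "a") as f:
            f.write(b"second\n")
        # Multi-stream read through the file object.
        with bz2.BZ2File(path, "r") as f:
            from_file = f.read()
        # Multi-stream read through the one-shot function.
        with open(path, "rb") as raw:
            from_bytes = bz2.decompress(raw.read())
        return from_file, from_bytes
    finally:
        os.remove(path)
```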
msg93326 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-09-29 20:07 |
|
sorry, the previous patch was from an old version. attaching the
correct version now. apologies for the noise.
|
|
msg93405 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-10-01 13:42 |
|
Some notes about posting patches:
- you should post the patch alone, not in an archive
- generally you should post patches against the 2.7 trunk, we take care
of merging them to py3k ourselves (but in this case the difference
should be minimal anyway)
- I'm not sure it's ok to add legal boilerplate at the top of files, we
never do that usually (and if everyone did it would become unreadable).
Does your company require you to do so?
I'll look at the patch itself another day, I don't have the time right
now. But thanks for posting it!
|
|
msg93407 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-10-01 14:21 |
|
Thanks for the reply.
My company's legal dept. told me that we needed to put the boilerplate
into the files as part of releasing it under the apache license. I used
a tarball because they also recommended including a full copy of the
license with the patch.
I'm reattaching just the patch to the bug now. I'll check with legal
and see if they'd have a problem with removing the boilerplate.
|
|
msg93408 - (view) |
Author: R. David Murray (r.david.murray) |
Date: 2009-10-01 14:29 |
|
If the patch is substantial enough that legal boilerplate is even an
issue, then I'm pretty sure a contributor agreement will be required for
patch acceptance, at which point I think the boilerplate won't be
needed. The Apache license is certainly acceptable. I'm obviously not
the authority on this, though. That would be Van Lindberg.
|
|
msg93721 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-10-07 20:36 |
|
I can remove the boilerplate from the code as long as I add the
following to the submittal:
VMware, Inc. is providing this bz2 module patch to you under the terms
of the Apache License 2.0 with the understanding that you plan to
re-license this under the terms and conditions of the Python License.
This patch is provided as is, with no warranties or support. VMware
disclaims all liability in connection with the use/inability to use this
patch. Any use of the attached is considered acceptance of the above.
|
|
msg93841 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-10-10 20:11 |
|
As far as I can tell, the patch looks mostly good.
I just wonder, in Util_HandleBZStreamEnd(), why you don't set self->mode
to MODE_CLOSED if BZ2_bzReadOpen() fails.
As a sidenote, the bz2 module implementation seems to have changed quite
a bit between trunk and py3k, so if you want it to be backported to
trunk (2.7), you'll have to provide a separate patch.
|
|
msg94316 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-10-21 18:02 |
|
Hrm...yeah, I should probably be setting it to closed as soon as
BZ2_bzReadClose() returns, and then back to open once BZ2_bzReadOpen
succeeds. Wasn't intentional...thanks for the catch. You guys need a
new patch with that change in it?
I'll try and get a 2.7 patch done and uploaded in a day or two.
|
|
msg94318 - (view) |
Author: R. David Murray (r.david.murray) |
Date: 2009-10-21 18:28 |
|
A new patch will make it more likely that it will actually get applied :)
Thanks for your work on this.
|
|
msg94321 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-10-21 20:09 |
|
Understandable. New patch attached.
|
|
msg94434 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-10-24 18:44 |
|
I'm not comfortable with the following change (which appears twice in
the patch):
- BZ2_bzReadClose(&bzerror, self->fp);
+ if (self->fp)
+ BZ2_bzReadClose(&bzerror, self->fp);
break;
case MODE_WRITE:
- BZ2_bzWriteClose(&bzerror, self->fp,
- 0, NULL, NULL);
+ if (self->fp)
+ BZ2_bzWriteClose(&bzerror, self->fp,
+ 0, NULL, NULL);
If you need to test for the file pointer, perhaps there's a logic flaw
in your patch. Also, it might be dangerous in write mode: could it occur
that the file isn't closed but the problem isn't reported?
|
|
msg94446 - (view) |
Author: David Bonner (dbonner) |
Date: 2009-10-25 03:37 |
|
That was mostly just out of paranoia, since the comments mentioned
multiple calls to close being legal. Looking at it again, that particular
case isn't an issue, since we don't hit that call when the mode is
MODE_CLOSED. The testsuite runs happily with those changes reverted.
Should I upload a new patch?
|
|
msg94499 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-10-26 18:53 |
|
> That was mostly just out of paranoia, since the comments mentioned
> multiple calls to close being legal. Looking at it again, that particular
> case isn't an issue, since we don't hit that call when the mode is
> MODE_CLOSED. The testsuite runs happily with those changes reverted.
> Should I upload a new patch?
You don't need to, but on the other hand I forgot to ask you to update
the documentation :-) (see Doc/library/bz2.rst)
|
|
| Date | User | Action | Args |
| 2009-10-26 18:53:02 | pitrou | set | messages: + msg94499 |
| 2009-10-25 03:37:07 | dbonner | set | messages: + msg94446 |
| 2009-10-24 18:44:19 | pitrou | set | messages: + msg94434 |
| 2009-10-21 20:09:10 | dbonner | set | files: + py3k_bz2.patch; messages: + msg94321 |
| 2009-10-21 18:28:25 | r.david.murray | set | messages: + msg94318 |
| 2009-10-21 18:02:40 | dbonner | set | messages: + msg94316 |
| 2009-10-10 20:11:34 | pitrou | set | messages: + msg93841 |
| 2009-10-07 20:36:51 | dbonner | set | files: + py3k_bz2.patch; messages: + msg93721 |
| 2009-10-01 14:29:38 | r.david.murray | set | nosy: + r.david.murray; messages: + msg93408 |
| 2009-10-01 14:21:18 | dbonner | set | files: + py3k_bz2.patch; keywords: + patch; messages: + msg93407 |
| 2009-10-01 13:42:41 | pitrou | set | nosy: + pitrou; messages: + msg93405; versions: + Python 2.7, - Python 2.6, Python 2.5 |
| 2009-09-29 20:07:06 | dbonner | set | files: + bz2_patch.tar.bz2; messages: + msg93326 |
| 2009-09-29 20:06:34 | dbonner | set | files: - bz2_patch.tar.bz2 |
| 2009-09-29 19:36:03 | dbonner | set | files: + bz2_patch.tar.bz2; versions: + Python 2.5, Python 3.2; nosy: + dbonner; messages: + msg93323 |
| 2008-03-18 16:55:02 | jafo | set | priority: normal; assignee: niemeyer; nosy: + niemeyer |
| 2008-01-20 09:12:38 | therve | set | messages: + msg60268 |
| 2008-01-19 22:00:08 | akuchling | set | nosy: + akuchling; messages: + msg60236 |
| 2008-01-14 13:31:59 | thomas.lee | set | nosy: + thomas.lee; messages: + msg59897 |
| 2007-12-14 09:20:30 | therve | create | |