
classification
Title: bz2/lzma: Compressor/decompressor crash if __init__ is not called
Type: crash
Stage: resolved
Components: Extension Modules
Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7

process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: bz2/lzma: Compressor/Decompressor objects are only initialized in __init__ (issue 23224)
Assigned To:
Nosy List: berker.peksag, izbyshev, serhiy.storchaka
Priority: normal
Keywords:

Created on 2018-09-18 22:49 by izbyshev, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg325691 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-09-18 22:49
The compressor/decompressor classes from bz2 and lzma modules rely on __init__() for initialization, but it is not guaranteed to be called. Method calls on an uninitialized object crash:

>>> from bz2 import BZ2Compressor as C
>>> c = C.__new__(C)
>>> c.compress(b'')
Segmentation fault (core dumped)

I see two ways to fix this:

1) Move some initialization (notably, for "lock" field) to __new__() and add initialization checks to other methods. This should be backwards-compatible.

2) Move all initialization to __new__(). Since compressor/decompressor classes are not subclassable, it'll break only code that repeatedly calls __init__() on the same object. The simplicity of the fix might outweigh the necessity to support such code.
(However, in 2.7, classes in bz2 *are* subclassable; lzma is not present in 2.7).
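
The two options can be illustrated with a pure-Python analogue. This is only a sketch of the idea, not the actual C implementation in bz2 or lzma: the class names `CheckedCompressor` and `EagerCompressor`, the `_initialized` flag, the `level` parameter, and the pass-through `compress()` body are all hypothetical.

```python
import threading


class CheckedCompressor:
    """Option 1 analogue: __new__ creates the lock, methods check for __init__."""

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self._lock = threading.Lock()  # valid even if __init__ never runs
        self._initialized = False
        return self

    def __init__(self, level=9):
        with self._lock:
            self._level = level
            self._initialized = True

    def compress(self, data):
        with self._lock:
            if not self._initialized:
                # Raise instead of touching uninitialized state.
                raise ValueError("compressor object is not initialized")
            return bytes(data)  # placeholder for the real compression work


class EagerCompressor:
    """Option 2 analogue: all state is set up in __new__; there is no __init__."""

    def __new__(cls, level=9):
        self = super().__new__(cls)
        self._lock = threading.Lock()
        self._level = level
        return self

    def compress(self, data):
        with self._lock:
            return bytes(data)  # placeholder for the real compression work
```

With option 1, `CheckedCompressor.__new__(CheckedCompressor).compress(b'')` raises an exception rather than crashing, and calling `__init__()` again still works. With option 2, an object obtained from `__new__()` alone is already usable, but a repeated `__init__()` call no longer reinitializes anything.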

Which way is preferable?
msg325693 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-18 23:09
I think we usually went with option 1 when we fixed similar issues in the past.

See also issue 23224 for the same problem in *Decompressor classes of lzma and bz2 modules. It looks like the attached PR to that issue went with option 2: PR 7822.

Perhaps we can combine this and issue 23224.
msg325694 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-09-18 23:29
I somehow failed to notice #23224 when I searched the tracker. You're right, it's the same, and, moreover, PR 7822 fixes the problem with both compressors and decompressors (though it includes tests only for the latter for some reason).

I think that this report should be closed as duplicate, but should we also change the title of #23224 to be more general?
msg325695 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-09-18 23:30
Reclosing (browser cache problem).
msg325696 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-18 23:35
We can definitely make the title of that issue more descriptive. Feel free to change it -- IIRC, you don't need additional permissions to change the title of an issue. Thanks!
History
Date                 User           Action  Args
2022-04-11 14:59:06  admin          set     github: 78910
2018-09-18 23:35:38  berker.peksag  set     messages: + msg325696
2018-09-18 23:30:54  izbyshev       set     status: open -> closed; resolution: duplicate; messages: + msg325695; stage: needs patch -> resolved
2018-09-18 23:29:21  izbyshev       set     resolution: duplicate -> (no value); messages: + msg325694
2018-09-18 23:09:12  berker.peksag  set     superseder: bz2/lzma: Compressor/Decompressor objects are only initialized in __init__; resolution: duplicate; messages: + msg325693; stage: needs patch
2018-09-18 22:49:05  izbyshev       create