Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bz2/lzma: Compressor/Decompressor objects are only initialized in __init__ #67413

Closed
vadmium opened this issue Jan 12, 2015 · 9 comments
Closed
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@vadmium
Copy link
Member

vadmium commented Jan 12, 2015

BPO 23224
Nosy @vstinner, @vadmium, @serhiy-storchaka, @animalize, @orenmn, @izbyshev, @corona10, @ZackerySpytz
PRs
  • gh-67413: Fix segfaults and multiple leaks in the lzma and bz2 modules #7822
  • Files
  • fix-lzma-bz2-segfaults.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-01-12.00:58:28.305>
    labels = ['extension-modules', '3.10', '3.9', 'type-crash', '3.11']
    title = 'bz2/lzma: Compressor/Decompressor objects are only initialized in __init__'
    updated_at = <Date 2021-12-20.03:23:36.888>
    user = 'https://github.com/vadmium'

    bugs.python.org fields:

    activity = <Date 2021-12-20.03:23:36.888>
    actor = 'malin'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2015-01-12.00:58:28.305>
    creator = 'martin.panter'
    dependencies = []
    files = ['45730']
    hgrepos = []
    issue_num = 23224
    keywords = ['patch']
    message_count = 9.0
    messages = ['233866', '233900', '236355', '282216', '282254', '320146', '325698', '365920', '408942']
    nosy_count = 10.0
    nosy_names = ['vstinner', 'nadeem.vawda', 'martin.panter', 'serhiy.storchaka', 'malin', 'Aaron1011', 'Oren Milman', 'izbyshev', 'corona10', 'ZackerySpytz']
    pr_nums = ['7822']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue23224'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @vadmium
    Copy link
    Member Author

    vadmium commented Jan 12, 2015

    Noticed in a patch review around LZMModules/lzmamodule.c:1055 that the C-level LZMADecompressor object is being initialized in an __init_() method. It crashes if you create the object with __new__() and never call __init__():

    >>> from lzma import LZMADecompressor
    >>> LZMADecompressor.__new__(LZMADecompressor).decompress(bytes())
    Segmentation fault (core dumped)
    [Exit 139]

    @vadmium vadmium added extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 12, 2015
    @vadmium
    Copy link
    Member Author

    vadmium commented Jan 13, 2015

    A patch for this might conflict with the LZMA patch for bpo-15955, so it would be simplest to wait for that issue to be resolved first

    @vadmium
    Copy link
    Member Author

    vadmium commented Feb 21, 2015

    Similar situation in the bzip module:

    >>> BZ2Decompressor.__new__(BZ2Decompressor).decompress(bytes())
    Segmentation fault (core dumped)

    @Aaron1011
    Copy link
    Mannequin

    Aaron1011 mannequin commented Dec 2, 2016

    I've upload a patch which should address the issue in both the lzma and bz2 modules.

    @serhiy-storchaka
    Copy link
    Member

    This is not the only issue. Calling __init__ multiple times causes leaking locks, bz2/lzma internal buffers, etc.

    It looks to me that the most straightforward way is using __new__ instead of __init__.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Dec 2, 2016
    @ZackerySpytz ZackerySpytz mannequin added the 3.8 only security fixes label Jun 20, 2018
    @vstinner
    Copy link
    Member

    Calling directly __new__() is very unusual, so I'm not surprised that you found bugs. Calling __init__() twice is also an unusal practice. I concur with Serhiy: implementing new instead of init should prevent such bug. I recall that some objects decided instead to raise an exception if they are not fully initialized (if init has not been called), but using new instead of init seems to be safer approach if we can implement it.

    In the meanwhile, I blocked by the memory leak (handle leak) in bz2 and lzma: bpo-33916. I proposed a very simple fix: PR 7843. I propose to apply this one since it's easy to backport it to all branches, whereas replacing init with new would be more difficult and risky to backport: I would suggest to only make this change in the master.

    What do you think?

    @izbyshev
    Copy link
    Mannequin

    izbyshev mannequin commented Sep 18, 2018

    bz2 in 2.7 is also affected.

    Victor, do we want to fix the crash at all in stable branches? If yes, IMHO taking the slight risk of __init__ -> __new__ change is preferable to taking the trouble to implement the alternative backwards-compatible fix (init checks in all methods) for stable branches.

    @izbyshev izbyshev mannequin changed the title LZMADecompressor object is only initialized in __init__ bz2/lzma: Compressor/Decompressor objects are only initialized in __init__ Sep 18, 2018
    @corona10 corona10 added the 3.9 only security fixes label Apr 7, 2020
    @corona10
    Copy link
    Member

    corona10 commented Apr 7, 2020

    The issue is not solved yet.
    @ZackerySpytz would you like to finalize this issue?

    @devdanzin devdanzin mannequin added 3.10 only security fixes 3.11 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes labels Dec 19, 2021
    @animalize
    Copy link
    Mannequin

    animalize mannequin commented Dec 20, 2021

    These can be done in .__new__() method:

    • create thread lock
    • create (de)?compression context
    • initialize (de)?compressor states

    In .__init__() method, only set (de)?compression parameters. And prevent .__init__() method from being called multiple times.

    This mode works fine in my pyzstd module (A Python bindings to zstd library).
    But I think very few people will encounter this problem, we can leave it.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 only security fixes extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    Status: Done
    Development

    No branches or pull requests

    5 participants