Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release GIL periodically in _pickle module #78309

Open
MartinBammer mannequin opened this issue Jul 16, 2018 · 12 comments
Open

Release GIL periodically in _pickle module #78309

MartinBammer mannequin opened this issue Jul 16, 2018 · 12 comments
Labels
3.8 only security fixes extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@MartinBammer
Copy link
Mannequin

MartinBammer mannequin commented Jul 16, 2018

BPO 34128
Nosy @pitrou, @serhiy-storchaka, @pierreglaser
Files
  • pickle_gil.patch
  • pickle_gil.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-07-16.18:55:37.836>
    labels = ['interpreter-core', '3.8', 'type-feature', 'library']
    title = 'Release GIL periodically in _pickle module'
    updated_at = <Date 2019-05-10.17:46:26.367>
    user = 'https://bugs.python.org/MartinBammer'

    bugs.python.org fields:

    activity = <Date 2019-05-10.17:46:26.367>
    actor = 'pitrou'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core', 'Library (Lib)']
    creation = <Date 2018-07-16.18:55:37.836>
    creator = 'Martin Bammer'
    dependencies = []
    files = ['47700', '47701']
    hgrepos = []
    issue_num = 34128
    keywords = ['patch']
    message_count = 12.0
    messages = ['321755', '321764', '321805', '321806', '321821', '321823', '321826', '321827', '321830', '321835', '321846', '321847']
    nosy_count = 4.0
    nosy_names = ['pitrou', 'serhiy.storchaka', 'Martin Bammer', 'pierreglaser']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue34128'
    versions = ['Python 3.8']

    @MartinBammer
    Copy link
    Mannequin Author

    MartinBammer mannequin commented Jul 16, 2018

    Hi,

    the old and slow python implementation of pickle didn't block background
    thread.
    But the newer C-implementation blocks other threads while dump/load is
    running.
    Wouldn't it be possible to allow other threads during this time?
    Especially could load/loads release the GIL, because Python objects are not available to the Python code until these functions have finished?

    Regards,
    Martin

    @MartinBammer MartinBammer mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jul 16, 2018
    @pppery
    Copy link
    Mannequin

    pppery mannequin commented Jul 16, 2018

    um, something doesn't make sense about this. the python implementation of pickle never released the GIL (it can't, by definition -- it's written in python). The C implementation releasing the GIL wouldn't make sense, as the pickle api involves calls into python everywhere (for example, __reduce__)

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    This is about releasing the GIL periodically to allow other threads to run, as Python already does in its main interpreter loop.

    @pitrou pitrou added the 3.8 only security fixes label Jul 17, 2018
    @serhiy-storchaka
    Copy link
    Member

    A workaround is writing Python wrappers for IO:

    def Writer:
    def __init__(self, file):
    self.file = file
    def write(self, data):
    return self.file.write(data)

    def Reader:
    def __init__(self, file):
    self.file = file
    def read(self, size=-1):
    return self.file.read(size)
    def readline(self, size=-1):
    return self.file.readline(size)
    def peek(self, size=-1):
    return self.file.peek(size)

    def mydump(obj, file, *args, **kwargs):
        return pickle.dump(obj, Writer(file), *args, **kwargs)
    
    def myload(file, *args, **kwargs):
        return pickle.load(Reader(file), *args, **kwargs)

    @pppery pppery mannequin added the stdlib Python modules in the Lib dir label Jul 17, 2018
    @pppery pppery mannequin changed the title Do not block threads when pickle/unpickle Release GIL periodically in _pickle module Jul 17, 2018
    @MartinBammer
    Copy link
    Mannequin Author

    MartinBammer mannequin commented Jul 17, 2018

    Maybe an optional parameter with the desired interval would be good idea. So that the coder can decide if he wants/needs that feature and which interval he needs for his application.
    Otherwise it is hard to define a specific interval which fits for everyone.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    The right way to do this is not to pass a timeout parameter but to check for GIL interrupts as done in the main bytecode evaluation loop.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    Attaching proof-of-concept patch.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    Attaching demonstration script.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    (as the demo script shows, there is no detectable slowdown)

    @serhiy-storchaka
    Copy link
    Member

    The demo script shows around 8% slowdown to me for

        data = list(map(float, range(N)))

    @pitrou
    Copy link
    Member

    pitrou commented Jul 17, 2018

    Interesting, which kind of computer / system / compiler are you on?

    @serhiy-storchaka
    Copy link
    Member

    CPU = Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
    Ubuntu 18.04
    Linux 4.15.0 x86_64
    gcc 7.3.0

    Performing the check in save() can have not insignificant overhead (especially after implementing the bpo-34141 optimization). It can be reduced if perform it when flush a frame (in protocol 4) or buffer to the file, or after writing significant amount of bytes into buffer.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added extension-modules C modules in the Modules dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 2, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    Status: No status
    Development

    No branches or pull requests

    3 participants