Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PATCH]Add FastDbfilenameShelf: shelf nerver sync cache even when writeback=True #49733

Closed
zhigang mannequin opened this issue Mar 13, 2009 · 6 comments
Closed

[PATCH]Add FastDbfilenameShelf: shelf nerver sync cache even when writeback=True #49733

zhigang mannequin opened this issue Mar 13, 2009 · 6 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@zhigang
Copy link
Mannequin

zhigang mannequin commented Mar 13, 2009

BPO 5483
Nosy @loewis, @aleaxit, @bitdancer
Files
  • fast_shelf.patch: Patch to add FastDbfilenameShelf
  • fast_shelf-v2.patch: Errated patch to also add the new class to all
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-02-11.03:32:38.841>
    created_at = <Date 2009-03-13.13:12:03.868>
    labels = ['type-feature', 'library']
    title = '[PATCH]Add FastDbfilenameShelf: shelf nerver sync cache even when writeback=True'
    updated_at = <Date 2010-02-11.03:32:38.840>
    user = 'https://bugs.python.org/zhigang'

    bugs.python.org fields:

    activity = <Date 2010-02-11.03:32:38.840>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-02-11.03:32:38.841>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2009-03-13.13:12:03.868>
    creator = 'zhigang'
    dependencies = []
    files = ['13317', '13318']
    hgrepos = []
    issue_num = 5483
    keywords = ['patch']
    message_count = 6.0
    messages = ['83516', '83517', '92703', '92740', '92774', '99195']
    nosy_count = 5.0
    nosy_names = ['loewis', 'aleax', 'lehmannro', 'r.david.murray', 'zhigang']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue5483'
    versions = ['Python 3.1']

    @zhigang
    Copy link
    Mannequin Author

    zhigang mannequin commented Mar 13, 2009

    shelf without writeback=True is too slow, while shelves with
    writeback=True takes too much time to close. And even worse, these codes
    can not run:

    $ cat test_shelve.py
    #!/usr/bin/env python
    import shelve
    
    store = shelve.open("/tmp/shelve.db", writeback=True)
    
    class Test(object):
        pass
    
    def main():
        store["a"] = Test()
    
    if __name__ == '__main__':
        main()
    $ python test_shelve.py 
    Exception cPickle.PicklingError: "Can't pickle <class '__main__.Test'>:
    it's not the same object as __main__.Test" in  ignored

    With this module, we can make it to run.

    I think it's worth add this function to shelve. We can achieve great
    improvement with some avoidable limitations.

    The other approach to add this function is to add a extra option the
    shelve.open(). We can discuss for which is better.

    @zhigang zhigang mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Mar 13, 2009
    @zhigang
    Copy link
    Mannequin Author

    zhigang mannequin commented Mar 13, 2009

    Add some errata of the patch: add the new class to __all__.

    @lehmannro
    Copy link
    Mannequin

    lehmannro mannequin commented Sep 16, 2009

    If I understand you correctly, your proposal is the following: use
    Shelf.cache to cache *all* objects instead of only keeping live
    references. Your patch retains the cache forever instead of purging it
    on sync. (All these changes only apply with writeback=True, which you
    enabled by default; nothing changes with writeback=False.)
    This speeds up *repeated* reads/writes as they are kept in-memory
    instead of querying the-- probably slow --database every time.

    I do not think this is a feasible solution for two reasons:
    (1) writeback was never intended to do such thing. It was introduced as
    a solution to "make shelve less surprising." If you remove its
    sync-on-close characteristics, shelve is as surprising as before. See
    bpo-553171.
    (2) If you intend to implement caching on *any* database I'd suggest not
    using shelve for that matter. Using it for serialization is all okay but
    I'd rather add an additional layer of indirection to implement caching
    strategies if you want to do that The Right Way. (Shelf.cache is really
    only a map of objects that were touched during runtime.)

    I'm -1 on this patch but generally +1 on a generic caching wrapper.

    The error you describe later on appears because Python is already
    tearing down when gc'ing the Shelf (calling __del__ -> close -> sync).
    With writeback=True this currently tries to pickle the cache again which
    emits the error you observed. This should be handled in a separate issue.

    @zhigang
    Copy link
    Mannequin Author

    zhigang mannequin commented Sep 17, 2009

    Thanks Robert for pointing out bpo-553171.

    After read that issue, I still think we paid too much to make shelf less
    surprising.

    We should at lease provide a option for the *smart* programmers to get
    better speed and less exceptions.

    The write-back-all-cache-on-close feature is easy to document, but hard
    to accept when you have cached too much data.

    CCing Alex and Martin for comments.

    @lehmannro
    Copy link
    Mannequin

    lehmannro mannequin commented Sep 17, 2009

    I addressed the other bug you were experiencing in bpo-6932.

    @bitdancer
    Copy link
    Member

    I agree that caching for speed is something that should be implemented in another layer. It certainly is orthogonal to the writeback issue. The best caching strategy is going to depend on the application, so I don't think caching for speed belongs in Shelve itself. Closing as rejected.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant