Improve dbm modules #53732

Open
ysjray mannequin opened this issue Aug 5, 2010 · 24 comments
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@ysjray
Mannequin

ysjray mannequin commented Aug 5, 2010

BPO 9523
Nosy @birkenfeld, @rhettinger, @terryjreedy, @jcea, @ncoghlan, @merwok, @bitdancer
Files
  • issue9523.diff: patch against py3k
  • issue_9523_3.diff
  • issue_9523_3.2_doc_patch.diff: patch against 3.2
  • issue_9523_4.diff
  • issue_9523_5.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2010-08-05.15:23:53.923>
    labels = ['extension-modules', 'type-feature']
    title = 'Improve dbm modules'
    updated_at = <Date 2011-11-09.22:35:50.913>
    user = 'https://bugs.python.org/ysjray'

    bugs.python.org fields:

    activity = <Date 2011-11-09.22:35:50.913>
    actor = 'jcea'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2010-08-05.15:23:53.923>
    creator = 'ysj.ray'
    dependencies = []
    files = ['20832', '21355', '21369', '21738', '21769']
    hgrepos = []
    issue_num = 9523
    keywords = ['patch']
    message_count = 24.0
    messages = ['112989', '114026', '123355', '123550', '123558', '123681', '127844', '127846', '128277', '128345', '128447', '128464', '128466', '128673', '128679', '128780', '129037', '131569', '131651', '131697', '131865', '131969', '134127', '134371']
    nosy_count = 10.0
    nosy_names = ['georg.brandl', 'rhettinger', 'terry.reedy', 'jcea', 'ncoghlan', 'stutzbach', 'eric.araujo', 'r.david.murray', 'Kain94', 'ysj.ray']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue9523'
    versions = ['Python 3.3']


    While working on the patch for bpo-8634, I found several problems with the dbm.gnu and dbm.ndbm modules. I think it is better to address them in a separate issue.

    1. As bpo-8634 says, dbm.gnu, dbm.ndbm and dbm.dumb should have similar interfaces, since they are meant to be used through the general dbm module as a common interface. If some methods appear in only one or two of them, like the "get()" method, that is inconvenient for users. Making all three follow the collections.MutableMapping interface could be a better choice. Since dbm.dumb is already a collections.MutableMapping, I implemented the missing methods of collections.MutableMapping in dbm.gnu and dbm.ndbm, and registered the dbm object types in dbm.gnu and dbm.ndbm with the ABC. The missing methods mainly include get(), values(), items(), pop(), popitem() and clear().

    2. I fixed the dbm_contains() function, which implements the "in" operator, to accept all buffer objects, just like the dbm_subscript() function, which implements the "[]" subscript operator. It seems weird that "dbm['a']" yields the expected result while "'a' in dbm" raises TypeError.

    3. The dbm object type in dbm.gnu is not iterable, and there is no way to iterate over it efficiently, because the only option is to get all the keys first using keys() and then iterate over the key list. So I implemented an iterator type for dbm.gnu. The dbm object in dbm.ndbm is also not iterable. There I implemented tp_iter by getting an iterator from its key list instead of implementing a "real" iterator, since the ndbm C API for getting the keys one by one is thread specific, and I could not find a way to implement a real iterator that avoids the problem that occurs when we get two iterators from one db object in the same thread and iterate over both at the same time.
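
The interface goals in points 1 and 2 can be sketched in pure Python. This is only an illustration, not the patch's C code: `BytesStore` is a hypothetical in-memory stand-in for a dbm object, and the key normalization rule (str or any buffer object) is an assumption. Defining the five abstract methods of `MutableMapping` (spelled `collections.abc.MutableMapping` in current Python) is enough to inherit get(), pop(), popitem(), clear(), update(), values() and items(), and sharing one normalization helper keeps "in" and "[]" consistent.

```python
from collections.abc import MutableMapping


class BytesStore(MutableMapping):
    """Hypothetical in-memory stand-in for a dbm object (not the real C type)."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _normalize(key):
        # Assumption: accept str or any object supporting the buffer protocol.
        if isinstance(key, str):
            return key.encode("utf-8")
        return memoryview(key).tobytes()  # raises TypeError for non-buffers

    def __getitem__(self, key):
        return self._data[self._normalize(key)]

    def __setitem__(self, key, value):
        self._data[self._normalize(key)] = self._normalize(value)

    def __delitem__(self, key):
        del self._data[self._normalize(key)]

    def __contains__(self, key):
        # Same normalization as __getitem__, so "in" and "[]" agree.
        return self._normalize(key) in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


db = BytesStore()
db["a"] = "1"
db.update({"b": "2"})  # inherited from MutableMapping
```

With this in place, `"a" in db` and `db["a"]` accept the same key types, and the mixin methods need no further code.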

    The patch also contains unit tests and docs.

    @ysjray ysjray mannequin added extension-modules C modules in the Modules dir type-feature A feature request or enhancement labels Aug 6, 2010
    @terryjreedy
    Member

    Upgrading to match the MutableMapping interface seems reasonable.

    @merwok merwok changed the title Improve dbm module Improve dbm modules Nov 12, 2010
    @merwok
    Member

    merwok commented Dec 4, 2010

    In 3.2, objects returned by dbm.dumb.open implement MutableMapping with incorrect semantics: keys returns a list, iterkeys exists, etc.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Dec 7, 2010

    Oh, yes. I noticed that PEP-3119 defines the return value of MutableMapping.keys() as a Set, as well as that of items(). So the implementations of dumb.keys() and dumb.items() are not correct, since they both return lists while the class inherits from MutableMapping.

    The implementations in my patch should also be corrected since I made the same mistake. Besides, since bpo-6045 has already added get(), I need to update my patch. I will do it later.

    And who can tell me the specification of MutableMapping.update()? PEP-3119 lacks it. Should I follow the implementation in the ABC class MutableMapping?

    @bitdancer
    Member

    I believe that in the absence of other documentation the ABC is considered authoritative.
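
For reference, the behaviour of the ABC's update() can be observed on any minimal subclass. The sketch below uses a hypothetical `Store` class backed by a plain dict; update() itself is inherited unchanged from `MutableMapping`, so the calls show which argument forms the ABC accepts.

```python
from collections.abc import MutableMapping


class Store(MutableMapping):
    """Hypothetical minimal mapping; update() comes from the ABC."""

    def __init__(self):
        self._d = {}

    def __getitem__(self, key):
        return self._d[key]

    def __setitem__(self, key, value):
        self._d[key] = value

    def __delitem__(self, key):
        del self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)


s = Store()
s.update({"a": 1})               # a mapping
s.update([("b", 2), ("c", 3)])   # an iterable of (key, value) pairs
s.update(d=4)                    # keyword arguments
s.update({"a": 10}, e=5)         # a positional argument and keywords together
```

Later values win, as with dict.update(): after the last call `s["a"]` is 10.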

    @ysjray
    Mannequin Author

    ysjray mannequin commented Dec 9, 2010

    Here is the updated patch, which fixes:

    1. Remove the get() method of gdbm, since bpo-6045 has already added it.

    2. The keys() and items() methods of dbm objects return a set instead of a list, since PEP-3119 says keys() and items() should return a collections.Set, and set is a collections.Set.

    3. Add an update() method to dbm objects, which follows the implementation in collections.MutableMapping.update().

    @merwok
    Member

    merwok commented Feb 4, 2011

    Thank you for working on a patch, especially with such comprehensive tests.

    The object returned by :func:`.open` supports the same basic functionality as
    -dictionaries
    +:mod:`collection`.MutableMapping

    The previous text was okay, I wouldn’t have changed it.

    def items(self):
        return set([(key, self[key]) for key in self._index.keys()])

    I don’t know if you should use a plain set or a collections.ItemsView here. In dict objects, KeysView and ValuesView are set-like objects with added behavior, for example they yield their elements in the same order. Raymond, can you comment?

    Style remarks: you can iterate without calling _index.keys(); you can avoid the intermediate list (IOW, remove the enclosing [ and ]).
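
The two style remarks can be shown side by side. This is a sketch only: `Index` is a hypothetical stand-in that keeps values directly in `_index`, whereas the real dbm.dumb object reads values from its data file.

```python
class Index:
    """Hypothetical stand-in for the dbm.dumb object; only _index matters here."""

    def __init__(self, index):
        self._index = dict(index)

    def __getitem__(self, key):
        return self._index[key]

    def items_original(self):
        # As quoted in the review: explicit .keys() call and a temporary list.
        return set([(key, self[key]) for key in self._index.keys()])

    def items_simplified(self):
        # Iterate over the dict directly and feed set() a generator expression.
        return set((key, self[key]) for key in self._index)


idx = Index({b"a": b"1", b"b": b"2"})
same = idx.items_original() == idx.items_simplified()
```

Both methods build the same set; the second one just skips the intermediate list and the redundant keys() call.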

    In the tests, you can use specialized methods like assertIn and assertIsNone, they remove some boilerplate and can give better error output.
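
As an illustration of the specialized assertions, here is a small self-contained test case. The `db` attribute is a plain dict used as a hypothetical stand-in for an opened dbm object; the test names are made up for this sketch.

```python
import unittest


class MappingTests(unittest.TestCase):
    def setUp(self):
        # Hypothetical stand-in for an opened dbm object.
        self.db = {b"a": b"1"}

    def test_membership(self):
        # Instead of assertTrue(b"a" in self.db)
        self.assertIn(b"a", self.db)
        self.assertNotIn(b"z", self.db)

    def test_get_missing(self):
        # Instead of assertEqual(self.db.get(b"z"), None)
        self.assertIsNone(self.db.get(b"z"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MappingTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

On failure, assertIn reports both the member and the container, which is usually more helpful than a bare "False is not true".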

    I can’t review the C code. :)

    @merwok
    Member

    merwok commented Feb 4, 2011

    See bpo-5736 for a patch adding iteration support. If the patch attached to this report supersedes the other one, I’ll close the other bug as duplicate.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Feb 10, 2011

    Thanks Eric for reviewing my patch! And thanks for your suggestions. I'm following them.

    > I don’t know if you should use a plain set or a collections.ItemsView here. In dict objects, KeysView and ValuesView are set-like objects with added behavior, for example they yield their elements in the same order.

    Yes you are right. I think returning a view object is better than returning a set.

    Here is the updated patch. It updates:

    1. Make the keys(), values() and items() methods return view objects for ndbm, gdbm and dumb objects. I followed the code in dictobject.c. The keysview object supports len(), the "in" operator, and iteration, while the valuesview and itemsview objects only support len() and iteration.
    2. Remove the doc change:
      The object returned by :func:`.open` supports the same basic functionality as
      -dictionaries
      +:mod:`collection`.MutableMapping
      which is mentioned in Eric's comment.
    3. Remove dumb's keys() method, which calls self._index.keys(), since it is unnecessary.
    4. Use more specialized assertXxx methods in test cases.
    5. Remove "the values() and items() method are not supported" from Doc/library/dbm.rst.

    > See bpo-5736 for a patch adding iteration support. If the patch attached to this report supersedes the other one, I’ll close the other bug as duplicate.

    bpo-5736's patch for adding iteration to the ndbm and gdbm modules simply calls PyObject_GetIter(dbm_keys(dbm, NULL)) for both gdbm and ndbm, but I feel it's better to create a separate iterator object for gdbm objects.
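
The difference between the two approaches can be sketched in Python. dbm.gnu objects expose firstkey() and nextkey(key); a separate iterator can walk the database with them instead of materializing the whole key list first. The generator below is an illustration, not the patch's C iterator, and `FakeGdbm` is a hypothetical stand-in so the sketch runs without gdbm installed.

```python
def iter_gdbm_keys(db):
    """Yield keys one at a time using the firstkey()/nextkey() protocol."""
    key = db.firstkey()
    while key is not None:
        yield key
        key = db.nextkey(key)


class FakeGdbm:
    """Hypothetical stand-in exposing only firstkey() and nextkey()."""

    def __init__(self, keys):
        self._keys = list(keys)

    def firstkey(self):
        return self._keys[0] if self._keys else None

    def nextkey(self, key):
        position = self._keys.index(key) + 1
        return self._keys[position] if position < len(self._keys) else None


db = FakeGdbm([b"a", b"b", b"c"])
first = iter_gdbm_keys(db)
second = iter_gdbm_keys(db)
interleaved = [next(first), next(second), next(first), next(second)]
```

Because each generator keeps its own current key, two iterators over the same object do not disturb each other in this sketch. Mutating the database during traversal is a separate concern that this example does not address.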

    @merwok
    Member

    merwok commented Feb 10, 2011

    > 1. Make the keys(), values() and items() methods return view objects for
    > ndbm, gdbm and dumb objects. I followed the code in dictobject.c.

    Did you have to copy the code? Isn’t it possible to somehow reuse it?

    > The keysview object supports len(), the "in" operator, and iteration,
    > while the valuesview and itemsview objects only support len() and iteration.

    That does not seem to comply with the definition of dict views. Do the views yield elements in the same order? (In a dict, iteration order is undefined but consistent between the various views, IIUC.)

    > 3. Remove dumb's keys() method, which calls self._index.keys(), since it is unnecessary.

    Does dumb have no keys method then?

    > 4. Use more specialized assertXxx methods in test cases.

    See my comments on http://codereview.appspot.com/4185044/

    > bpo-5736's patch for adding iteration to the ndbm and gdbm modules simply
    > calls PyObject_GetIter(dbm_keys(dbm, NULL)) for both gdbm and ndbm,
    > but I feel it's better to create a separate iterator object for gdbm objects.

    Okay, so I shall close the other bug report, indicating it’s superseded by your patch.

    I can’t judge the C code; maybe Raymond or Daniel will. They’re also experts about collections and ABCs, so they’ll be able to confirm or refute what I’ve said about dict views and the registration stuff (on codereview).
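
The dict view behaviour asked about in this comment can be checked directly on a built-in dict. The snippet is only a reference point for what dbm views would need to match; it says nothing about how the patch implements them.

```python
d = {"x": 1, "y": 2, "z": 3}

# keys() and values() iterate in matching order, so zipping them
# reproduces items().
items_list = list(d.items())
pairs_from_views = list(zip(d.keys(), d.values()))

# Key views are set-like: they support set operators and isdisjoint().
keys = d.keys()
common = keys & {"x", "q"}
disjoint = keys.isdisjoint({"q"})

# Views are live: they reflect later changes to the mapping.
d["w"] = 4
size_after_insert = len(keys)
```

A view class that only offers len(), "in" and iteration would pass the first check but not the set-operation ones.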

    @ysjray
    Mannequin Author

    ysjray mannequin commented Feb 12, 2011

    > 1. Make the keys(), values() and items() methods return view objects for ndbm, gdbm and dumb objects. I followed the code in dictobject.c.
    > Did you have to copy the code? Isn’t it possible to somehow reuse it?

    I feel it is not so easy to reuse the code; there could be several differences in the C code. Reusing the code may make it more complicated. But if someone could find a better way to reuse the code, that would be nice.

    > The keysview object supports len(), the "in" operator, and iteration, while the valuesview and itemsview objects only support len() and iteration.
    > That does not seem to comply with the definition of dict views.

    Oh, yes, I missed the rich comparison functions and the isdisjoint() method of view objects.

    > Do the views yield elements in the same order? (In a dict, iteration order is undefined but consistent between the various views, IIUC.)

    gdbm and dumb views yield elements in the same order; ndbm views don't. I missed that.

    > 3. Remove dumb's keys() method, which calls self._index.keys(), since it is unnecessary.
    > Does dumb have no keys method then?

    Yes, it does. Its keys() method is already provided by the Mapping ABC.

    Here is the updated patch:

    1. Add rich comparison functions and the isdisjoint() method to dbm view objects to make them MappingView objects, and add ABC registration for them.
    2. Make ndbm view objects yield elements in the same order.
    3. Other changes resulting from the code review: http://codereview.appspot.com/4185044/

    @merwok
    Member

    merwok commented Feb 12, 2011

    > Add rich comparison functions and the isdisjoint() method to dbm view objects
    > to make them MappingView objects, and add ABC registration for them.

    I’d prefer you not to register them, but test isinstance(keys(), KeysView), so that we’re sure no method is missing. (Does not validate behavior, but it’s a starting point.)

    Other comments on Rietveld (the code review site).

    BTW, if you’re posting updated patches on Rietveld you can remove patches from this bug report, so that people don’t review old patches. It would also be simpler to keep all discussion there :)

    @merwok
    Member

    merwok commented Feb 12, 2011

    Closed bpo-5736 as superseded. Please make sure the comments there about the peculiar API of gdbm don’t come up or are addressed in your patch.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Feb 16, 2011

    Thanks!

    Here is my updated patch:
    1. Now the dbm view objects behave the same as dict view objects, which conform to collections.KeysView, ValuesView and ItemsView.
    2. I register all these ABCs explicitly because they have no __subclasshook__() method, so they can't check API conformance (at least the existence of methods) through isinstance(). I can't ensure API conformance except by explicitly testing each method I find in the ABC. And my test_abc() just tests the registration.
    3. Other updates from the comments I newly wrote on codereview: http://codereview.appspot.com/4185044/

    @merwok
    Member

    merwok commented Feb 16, 2011

    > Here is my updated patch:

    You don’t have to attach a file here, just update the codereview page instead. Or maybe you can’t because I created the page?

    > 1. Now the dbm view objects behave the same as dict view objects, which
    > conform to collections.KeysView, ValuesView and ItemsView.

    Great! Just to make sure: How do you know that the view objects are compliant? Do you test all the methods documented for the ABCs?

    > 2. I register all these ABCs explicitly because they have no
    > __subclasshook__() method, so they can't check API conformance (at
    > least the existence of methods) through isinstance(). I can't ensure API
    > conformance except by explicitly testing each method I find in the ABC. And
    > my test_abc() just tests the registration.

    Thank you for repeating that many times and politely: I was indeed wrong. I went back to PEP-3119 to read again about __instancecheck__ and __subclasscheck__, then experimented in a shell and was surprised. I have been misunderstanding one thing: issubclass(cls, abc) does not return true automatically if cls provides the methods, it’s entirely up to the ABC to check methods or do something else in its __subclasscheck__ or __instancecheck__ methods. (I should have known better, I was in a discussion about adding that very feature on python-ideas and bpo-9731!) After another bit of experimentation with dict views and collections ABCs, I finally understand that you have to register your view classes. Thanks for letting me correct my misunderstanding.
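
The point about registration can be reproduced in a few lines. `DuckKeys` is a hypothetical class invented for this sketch: it has the methods of a read-only key view but does not inherit from any ABC. Simple ABCs such as `Sized` recognize it through their `__subclasshook__`, while `KeysView` does not until the class is registered explicitly.

```python
from collections.abc import KeysView, Sized


class DuckKeys:
    """Hypothetical view-like class: has the methods, but no ABC base."""

    def __init__(self, keys):
        self._keys = list(keys)

    def __len__(self):
        return len(self._keys)

    def __iter__(self):
        return iter(self._keys)

    def __contains__(self, key):
        return key in self._keys


view = DuckKeys([b"a"])

# Sized checks for __len__ in its __subclasshook__, so this is True.
sized_before = isinstance(view, Sized)

# KeysView performs no structural check, so this is False.
keysview_before = isinstance(view, KeysView)

# Explicit registration makes the isinstance() check succeed.
KeysView.register(DuckKeys)
keysview_after = isinstance(view, KeysView)
```

Registration is purely declarative: it does not verify that `DuckKeys` actually implements the set operations of `KeysView`, which is why each method still needs its own test.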

    @ysjray
    Mannequin Author

    ysjray mannequin commented Feb 18, 2011

    An updated patch, based on the latest reviews on http://codereview.appspot.com/4185044/

    Updates:
    1. Refactor the common tests shared by test_dbm_gnu and test_dbm_ndbm.
    2. Move the ABC registration into Lib/dbm/ndbm.py and Lib/dbm/gnu.py.
    3. Other changes pointed out in review comments.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Feb 22, 2011

    Since r88451 removed unittest's assertSameElements() method, I needed to update my patch accordingly. Here it is.
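
The thread does not say which assertion the updated patch switched to. As a hedged illustration, two standard ways to compare collections without regard to order are shown below; the test class and data are made up for the sketch.

```python
import unittest


class KeysTests(unittest.TestCase):
    def test_keys_unordered(self):
        expected = [b"a", b"b", b"c"]
        actual = [b"c", b"a", b"b"]  # hypothetical result of list(db.keys())
        # Same elements with the same counts, in any order (Python 3.2+).
        self.assertCountEqual(actual, expected)
        # Alternative when duplicates do not matter: compare as sets.
        self.assertEqual(set(actual), set(expected))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(KeysTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

For dbm keys, which are unique, both checks are equivalent.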

    @merwok
    Member

    merwok commented Mar 20, 2011

    I think the patch will not be suitable for 3.1 and 3.2, so there should be a doc patch to mention the limitations of the dbm API (keys() returning a list and all that).

    @ysjray
    Mannequin Author

    ysjray mannequin commented Mar 21, 2011

    > I think the patch will not be suitable for 3.1 and 3.2

    Yes, it changes some APIs (e.g. keys()), which may introduce compatibility issues.

    > so there should be a doc patch to mention the limitations of the dbm API (keys() returning a list and all that).

    Do you mean a doc patch for 3.2 that mentions the missing or imperfect methods of collections.MutableMapping, e.g. keys() returns a list instead of a KeysView, and update() is missing?

    @merwok
    Member

    merwok commented Mar 21, 2011

    Yes, I mean exactly that.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Mar 23, 2011

    Updated patch:

    1. Changes following review comments: http://codereview.appspot.com/4185044/. Thanks Eric!

    2. Turn Objects/dictobject.c:all_contained_in() into a common utility API, Objects/abstract.c:_PyObject_AllContainedIn(), so that it can be reused in Modules/_gdbmmodule.c and Modules/_dbmmodule.c. I'm not sure if this is proper. I will ask somebody with C knowledge to review the C code.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Mar 24, 2011

    I tried to work out a doc patch for 3.2 to mention the API limitations: the methods missing compared with dict, and the imperfect collections.MutableMapping methods (keys(), items()). Here it is.

    @ysjray
    Mannequin Author

    ysjray mannequin commented Apr 20, 2011

    Updated patch following C code review comments from http://bugs.python.org/review/9523/show

    Two big changes:

    1. Add a check each time a Py_ssize_t variable is assigned to datum.dsize (an int); if the value does not fit, raise a ValueError (following PEP-353).

    2. Simplify dbm.update() so that it behaves more like dict.update().
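
The range check in point 1 lives in C in the patch. Its logic can be sketched in Python as follows, where `checked_dsize` and the error message are hypothetical names chosen for this illustration and `INT_MAX` is derived from the platform's C int size via ctypes.

```python
import ctypes

# Largest value a signed C int can hold on this platform.
INT_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_int) - 1) - 1


def checked_dsize(size):
    """Return size if it fits in a C int, else raise ValueError.

    Hypothetical helper mirroring a check done before storing a
    Py_ssize_t length into datum.dsize.
    """
    if size < 0 or size > INT_MAX:
        raise ValueError("size does not fit in an int")
    return size


small = checked_dsize(10)
try:
    checked_dsize(INT_MAX + 1)
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
```

Without such a check, a length above INT_MAX would be silently truncated when narrowed to int.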

    @ysjray
    Mannequin Author

    ysjray mannequin commented Apr 25, 2011

    Sorry, the previous patch (issue_9523_4.diff) missed a file (Lib/test/dbm_tests.py).
    Here is a complete one.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022