classification
Title: Improve dbm modules
Type: enhancement Stage: patch review
Components: Extension Modules Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Kain94, eric.araujo, georg.brandl, jcea, ncoghlan, r.david.murray, rhettinger, stutzbach, terry.reedy, ysj.ray
Priority: normal Keywords: patch

Created on 2010-08-05 15:23 by ysj.ray, last changed 2011-11-09 22:35 by jcea.

Files
File name Uploaded Description Edit
issue9523.diff ysj.ray, 2011-02-22 06:29 patch again py3k
issue_9523_3.diff ysj.ray, 2011-03-23 13:36 review
issue_9523_3.2_doc_patch.diff ysj.ray, 2011-03-24 11:46 patch against 3.2 review
issue_9523_4.diff ysj.ray, 2011-04-20 07:42 review
issue_9523_5.diff ysj.ray, 2011-04-25 06:34 review
Messages (24)
msg112989 - (view) Author: ysj.ray (ysj.ray) Date: 2010-08-05 15:23
During the patching work of issue8634, I found several problems about the module dbm.gnu and dbm.ndbm, I feel it's better to address them on a separate issue.

1. As issue8634 said, the dbm.gnu, dbm.ndbm and dbm.dumb should have the similar interface, since they are intended to use the general dbm module as a common interface. If some methods only appears in one or two of them, like the "get()" method, it will be not convenient for users. Making all the three ones follow the collections.MutableMapping interface could be a better choice. Since the dbm.dumb is already collections.MutableMapping, I implemented missing methods of collections.MutableMapping in dbm.gnu and dbm.ndbm, and register the dbm object type in dbm.gnu and dbm.ndbm to the ABC. The missing methods mainly include get(), values(), items(), pop(), popitem(), clear()

2. I fix the dbm_contains() function which implement the "in" operator to accept all buffer object, just like the dbm_subscript() fuction which implment the "[]" slice operator. I feel it's wearied that if "dbm['a']" yields the expected result but at the same time "'a' in dbm" raises TypeError. 

3. The type of dbm object in dbm.gnu is not iterable, and there is no way to iterate on it sufficiently because the only way we can iterate over it is to get all the keys first using keys() and then iter on the keys list. So I implemented a iterator type for dbm.gnu. Besides the dbm object in dbm.ndbm is also not iterable, and I implemented its tp_iter by get an iterator from its keys list instead of implementing a "real" iterator since the ndbm c api for get all the keys one by one is thread specific and I could not found a way to implement a real iterator to avoid the problem which occurred in the case we get tow iterators from one db object in the same thread and iterate over both at the same time.

The patch contains also unittest and doc.
msg114026 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-15 23:29
Upgrading to match the MutableMapping interface seems reasonable.
msg123355 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-12-04 15:08
In 3.2, objects return by dbm.dumb.open implement MutableMapping with incorrect semantics: keys return a list, iterkeys exist, etc.
msg123550 - (view) Author: ysj.ray (ysj.ray) Date: 2010-12-07 14:02
Oh, yes. I noticed that the pep3119 defines return value of method MutableMapping.keys() as Set, as well as method items(). So the implementation of dumb.keys() and dump.items() are not correct since they all return lists while the class inherits MutableMapping. 

The implementations in my patch should also be corrected since I made the same mistake. Besides, since issue6045 has already added get(), I need to update my patch. I will do it later.

And who can tell the specification of MutableMapping.update()? The pep3119 lacks of it. Should I follow the implementation in the ABC class MutableMapping?
msg123558 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-07 15:56
I believe that in the absence of other documentation the ABC is considered authoritative.
msg123681 - (view) Author: ysj.ray (ysj.ray) Date: 2010-12-09 13:56
Here is the updated patch, which fixed:

1. remove get() method of gdbm since issue6045 has already add it.

2. method keys() and items() of dbm object return set instead of list. Since pep3119 said keys() and items() should return collections.Set and set is a collections.Set.

3. add update() method to dbm object, which follows implementation in collections.MutableMapping.update().
msg127844 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-04 00:52
Thank you for working on a patch, especially with such comprehensive tests.

> The object returned by :func:`.open` supports the same basic functionality as
> -dictionaries
> +:mod:`collection`.MutableMapping

The previous text was okay, I wouldn’t have changed it.

>     def items(self):
>+        return set([(key, self[key]) for key in self._index.keys()])
I don’t know if you should use a plain set or a collections.ItemsView here.  In dict objects, KeysView and ValuesView are set-like objects with added behavior, for example they yield their elements in the same order.  Raymond, can you comment?

Style remarks: you can iter without calling _index.keys(); you can avoid the intermediary list (IOW, remove enclosing [ and ]).

In the tests, you can use specialized methods like assertIn and assertIsNone, they remove some boilerplate and can give better error output.

I can’t review the C code. :)
msg127846 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-04 00:58
See #5736 for a patch adding iteration support.  If the patch attached to his report supersedes the other one, I’ll close the other bug as duplicate.
msg128277 - (view) Author: ysj.ray (ysj.ray) Date: 2011-02-10 10:27
Thanks eric for reviewing my patch! And thanks for you suggestions. I'm  following them.

> I don’t know if you should use a plain set or a collections.ItemsView here.  In dict objects, KeysView and ValuesView are set-like objects with added behavior, for example they yield their elements in the same order.

Yes you are right. I think returning a view object is better than returning a set.

Here is the updated patch. It updates:
1. Make keys(), values(), items() methods return view object for ndbm, gdbm and dumb objects. I following the codes in dictobject.c. The keysview object support len(), "in" operator, and iteratable, while valuesview and itemsview object only support len() and iteratable.
2. Removing doc changes: 
  The object returned by :func:`.open` supports the same basic functionality as
  -dictionaries
  +:mod:`collection`.MutableMapping
    which is mentioned in eric's comment.
3. Remove dumb's keys() method which calls self._index.keys() since it is unnecessary.
4. Using more specialized assertXxx methods in test cases.
5. Remove "the values() and items() method are not supported" in Doc/library/dbm.rst.


> See #5736 for a patch adding iteration support.  If the patch attached to his report supersedes the other one, I’ll close the other bug as duplicate.

#5736 's patch for adding iteration to ndbm and gdbm modules simple calling PyObject_GetIter(dbm_keys(dbm, NULL)) for both gdbm and ndbm, but I feel it's better to create a seperate iterator object for gdbm objects.
msg128345 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-10 22:09
> 1. Make keys(), values(), items() methods return view object for
> ndbm, gdbm and dumb objects. I following the codes in dictobject.c.
Did you have to copy the code?  Isn’t it possible to somehow reuse it?

> The keysview object support len(), "in" operator, and iteratable,
> while valuesview and itemsview object only support len() and iteratable.
That does not seem to comply with the definition of dict views.  Do the views yield elements in the same order?  (In a dict, iteration order is undefined but consistent between the various views, IIUC.)

> 3. Remove dumb's keys() method which calls self._index.keys() since it is unnecessary.
Does dumb have no keys method then?

> 4. Using more specialized assertXxx methods in test cases.
See my comments on http://codereview.appspot.com/4185044/

> #5736 's patch for adding iteration to ndbm and gdbm modules simple
> calling PyObject_GetIter(dbm_keys(dbm, NULL)) for both gdbm and ndbm,
> but I feel it's better to create a seperate iterator object for gdbm objects.
Okay, so I shall close the other bug report, indicating it’s superseded by your patch.

I can’t judge the C code; maybe Raymond or Daniel will.  They’re also experts about collections and ABCs, so they’ll be able to confirm or infirm what I’ve said about dict views and the registration stuff (on codereview).
msg128447 - (view) Author: ysj.ray (ysj.ray) Date: 2011-02-12 16:49
> > 1. Make keys(), values(), items() methods return view object for ndbm, gdbm and dumb objects. I following the codes in dictobject.c.
> Did you have to copy the code?  Isn’t it possible to somehow reuse it?

I feel not so easy to reuse the code, there could be several differences in the c code. Resuing the code may make the code more complecated. But if someone could find a better way to reuse the code, it would be nice.


> > The keysview object support len(), "in" operator, and iteratable, while valuesview and itemsview object only support len() and iteratable.
> That does not seem to comply with the definition of dict views. 

Oh, yes, I missed the rich compare functions and isdisjoint() method of view objects.


> Do the views yield elements in the same order?  (In a dict, iteration order is undefined but consistent between the various views, IIUC.)
gdbm and dumb views yield elements in the same order, ndbm views doesn't. I missed it.


> > 3. Remove dumb's keys() method which calls self._index.keys() since it is unnecessary.
> Does dumb have no keys method then?
Yes, it does. Its keys() method is provided by Mapping abc already.


Here is the updated patch:
1. Add rich compare functions and disjoint() method to dbm view objects to make them as MappingView objects, and add abc registration for them.
2. Make ndbm view objects yield elements in the same order.
3. Other changes during to the codeview: http://codereview.appspot.com/4185044/
msg128464 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-12 20:15
> Add rich compare functions and disjoint() method to dbm view objects
> to make them as MappingView objects, and add abc registration for them.
I’d prefer you not to register them, but test isinstance(keys(), KeysView), so that we’re sure no method is missing.  (Does not validate behavior, but it’s a starting point.)

Other comments on Rietveld (the code review site).

BTW, if you’re posting updated patches on Rietveld you can remove patches from this bug report, so that people don’t review old patches.  It would also be simpler to keep all discussion there :)
msg128466 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-12 20:18
Closed #5736 as superseded.  Please make sure the comments there about the peculiar API of gdbm don’t come up or are addressed in your patch.
msg128673 - (view) Author: ysj.ray (ysj.ray) Date: 2011-02-16 15:21
Thanks!

Here is my updated patch:
1, Now the dbm view objects are the same as dict view objects, which are in conformity with collections.KeysView, ValuesView and ItemsView.
2, I register all these abcs explicitly because these abcs have not __subclasshook__() method so they can't check api conformance(at lease exist) through isinstance(). I could not make sure api conformance except testing each method I find in abc explicitly. And my test_abc() is just to test the registering.
3, Other updates which are from comments I wrote newly on codereview: http://codereview.appspot.com/4185044/
msg128679 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-16 16:35
> Here is my updated patch:
You don’t have to attach a file here, just update the codereview page instead.  Or maybe you can’t because I created the page?

> 1, Now the dbm view objects are the same as dict view objects, which
> are in conformity with collections.KeysView, ValuesView and ItemsView.
Great!  Just to make sure: How do you know that the view objects are compliant?  Do you test all the methods documented for the ABCs?

> 2, I register all these abcs explicitly because these abcs have not
> __subclasshook__() method so they can't check api conformance(at
> lease exist) through isinstance(). I could not make sure api
> conformance except testing each method I find in abc explicitly. And
> my test_abc() is just to test the registering.

Thank you for repeating that many times and politely: I was indeed wrong.  I went back to  PEP 3119 to read again about __instancecheck__ and __subclasscheck__, then experimented in a shell and was surprised.  I have been misunderstanding one thing: issubclass(cls, abc) does not return true automatically if cls provides the methods, it’s entirely up to the ABC to check methods or do something else in its __subclasscheck__ or __instancecheck__ methods.  (I should have known better, I was in a discussion about adding that very feature on python-ideas and #9731!)  After another bit of experimentation with dict views and collections ABCs, I finally understand that you have to register your view classes.  Thanks for letting me correct my misunderstanding.
msg128780 - (view) Author: ysj.ray (ysj.ray) Date: 2011-02-18 13:13
An updated patch, based on latest several reviews on http://codereview.appspot.com/4185044/

update:
1, Refactoring the common tests between test_dbm_gnu and test_dbm_ndbm.
2, Move the abc registering in Lib/dbm/ndbm.py and Lib/dbm/gnu.py.
3, Other changes pointed out in review comments.
msg129037 - (view) Author: ysj.ray (ysj.ray) Date: 2011-02-22 06:29
Sine r88451 removed unittest's assertSameElements() method, I need to updated my patch to fit it. So here it is.
msg131569 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-20 23:29
I think the patch will not be suitable for 3.1 and 3.2, so there should be a doc patch to mention the limitations of the dbm API (keys() returning a list and all that).
msg131651 - (view) Author: ysj.ray (ysj.ray) Date: 2011-03-21 13:44
> I think the patch will not be suitable for 3.1 and 3.2

Yes, it changes some api(e.g keys()), which may introduces compatibility issues.

> so there should be a doc patch to mention the limitations of the dbm API (keys() > returning a list and all that).

Do you mean a doc patch for 3.2 which mentions the missing or imperfect methods of collections.MutableSequence()? e.g keys() not returns a KeysView but a list instead, update() is missing
msg131697 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-03-21 22:52
Yes, I mean exactly that.
msg131865 - (view) Author: ysj.ray (ysj.ray) Date: 2011-03-23 13:36
Updated patch:

1, Changes follows review comments: http://codereview.appspot.com/4185044/. Thanks eric!

2, Make Objects/dictobject.c:all_contained_in() a common useful limited api Object/abstract.c:_PyObject_AllContainedIn() for the purpose of re-usage in Modules/_gdbmmodule.c and Modules/_dbmmodule.c. Not sure if this is proper. I will ask somebody with C knowledge to do a review on the C code.
msg131969 - (view) Author: ysj.ray (ysj.ray) Date: 2011-03-24 11:46
I tried to work out a doc patch for 3.2 to mention the limitation api: the missing methods compared with dict and the imperfect methods(keys(), items()) of collections.MutableMapping. Here is it.
msg134127 - (view) Author: ysj.ray (ysj.ray) Date: 2011-04-20 07:42
Updated patch following C code review comments from http://bugs.python.org/review/9523/show

Two big changes:

1. Add check for each time assign a Py_ssize_t variable to datum.dsize(int), if value not fit, raise a ValueError(following PEP 353)

2. Simplify dbm.update(), behave more like dict.update().
msg134371 - (view) Author: ysj.ray (ysj.ray) Date: 2011-04-25 06:34
Sorry, previous patch(issue_9523_4.diff) missed a file(Lib/test/dbm_tests.py)
Here is an intact one.
History
Date User Action Args
2011-11-09 22:35:50jceasetnosy: + jcea
2011-04-25 06:34:38ysj.raysetfiles: + issue_9523_5.diff

messages: + msg134371
2011-04-20 07:42:44ysj.raysetfiles: + issue_9523_4.diff

messages: + msg134127
2011-03-24 11:46:04ysj.raysetfiles: + issue_9523_3.2_doc_patch.diff

messages: + msg131969
2011-03-23 14:52:11ncoghlansetnosy: + ncoghlan
2011-03-23 13:36:58ysj.raysetfiles: + issue_9523_3.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg131865
2011-03-21 22:52:15eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg131697
2011-03-21 13:44:46ysj.raysetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg131651
2011-03-20 23:29:24eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg131569
2011-02-22 06:30:42ysj.raysetfiles: - issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-22 06:30:01ysj.raysetfiles: + issue9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg129037
2011-02-18 13:13:39ysj.raysetfiles: - issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-18 13:13:27ysj.raysetfiles: + issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128780
2011-02-16 16:35:01eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128679
2011-02-16 15:21:32ysj.raysetfiles: - issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-16 15:21:15ysj.raysetfiles: + issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128673
2011-02-14 02:22:50ysj.raysetfiles: - issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-14 02:22:45ysj.raysetfiles: - issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-14 02:22:41ysj.raysetfiles: - issue8634.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-12 20:18:33eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128466
2011-02-12 20:17:43eric.araujolinkissue5736 superseder
2011-02-12 20:15:38eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128464
2011-02-12 16:49:50ysj.raysetfiles: + issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
2011-02-12 16:49:28ysj.raysetnosy: georg.brandl, rhettinger, terry.reedy, stutzbach, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128447
2011-02-10 22:09:48eric.araujosetnosy: + stutzbach
messages: + msg128345
2011-02-10 10:27:05ysj.raysetfiles: + issue_9523.diff
nosy: georg.brandl, rhettinger, terry.reedy, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg128277
2011-02-04 00:58:07eric.araujosetnosy: georg.brandl, rhettinger, terry.reedy, eric.araujo, r.david.murray, Kain94, ysj.ray
messages: + msg127846
2011-02-04 00:56:40eric.araujolinkissue6045 superseder
2011-02-04 00:52:19eric.araujosetnosy: + rhettinger

messages: + msg127844
versions: + Python 3.3, - Python 3.2
2010-12-09 13:56:44ysj.raysetfiles: + issue_9523.diff

messages: + msg123681
2010-12-07 15:56:10r.david.murraysetnosy: + r.david.murray
messages: + msg123558
2010-12-07 14:02:02ysj.raysetmessages: + msg123550
2010-12-04 15:08:44eric.araujosetmessages: + msg123355
2010-11-12 01:15:08eric.araujosetnosy: + georg.brandl, Kain94
2010-11-12 01:14:51eric.araujolinkissue8634 superseder
2010-11-12 01:13:22eric.araujosetnosy: + eric.araujo

title: Improve dbm module -> Improve dbm modules
2010-08-15 23:29:02terry.reedysetnosy: + terry.reedy

messages: + msg114026
stage: patch review
2010-08-06 01:59:46ysj.raysettype: enhancement
components: + Extension Modules
versions: + Python 3.2
2010-08-05 15:23:53ysj.raycreate