msg85856 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-11 13:47 |
In Python 2.6, dbm modules other than bsddb don't support the iterator
protocol.
>>> import dbm
>>> d = dbm.open('spam.dbm', 'c')
>>> for k in range(5): d["key%d" % k] = "value%d" % k
...
>>> for k in d: print k, d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dbm.dbm' object is not iterable
Adding iterator support would make dbm modules more convenient and
easier to use.
|
msg85859 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-11 14:11 |
Attached is a patch that adds the iterator protocol.
Now it can be iterated over like this:
>>> for k in d: print k, d[k]
...
key1 value1
key3 value3
key0 value0
key2 value2
key4 value4
The problem is that there is no way to move the internal pointer back to
the start, so once iteration reaches the end, you are done.
>>> for k in d: print k, d[k]
...
The solution to this would be:
- Add a method to get the pointer back to the start
(with {first,next}key API)
- Add a method that returns a generator
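
The two options above can be combined. The following is a minimal sketch, not the attached patch: `FakeGdbm` is a hypothetical in-memory stand-in that mimics the `firstkey()` / `nextkey(key)` methods of the gdbm module, and `iterkeys` is an illustrative helper name. Because the generator always starts from `firstkey()`, each new loop begins at the start again.

```python
class FakeGdbm(object):
    """Hypothetical in-memory stand-in with a gdbm-style firstkey/nextkey API."""

    def __init__(self, items):
        self._data = dict(items)
        self._order = list(self._data)  # arbitrary but fixed traversal order

    def __getitem__(self, key):
        return self._data[key]

    def firstkey(self):
        # First key of the traversal, or None if the database is empty.
        return self._order[0] if self._order else None

    def nextkey(self, key):
        # Key that follows `key` in the traversal, or None at the end.
        i = self._order.index(key) + 1
        return self._order[i] if i < len(self._order) else None


def iterkeys(db):
    """Yield every key of db; each call restarts from firstkey()."""
    key = db.firstkey()
    while key is not None:
        yield key
        key = db.nextkey(key)


d = FakeGdbm(("key%d" % k, "value%d" % k) for k in range(5))
first_pass = sorted(iterkeys(d))
second_pass = sorted(iterkeys(d))  # not empty: the generator starts over
```

With this shape, a second `for k in iterkeys(d)` loop is not left empty, since no position is stored on the database object itself.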
|
msg85867 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-11 16:14 |
Revised patch adds firstkey and nextkey to dbm.
Now the internal pointer can be reset with firstkey.
|
msg85878 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2009-04-11 22:03 |
Would you like to fix gdbm as well?
|
msg85888 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-12 09:47 |
Here's another patch, which adds iter to dbm and gdbm.
Note that the dbm and gdbm C APIs are a little different:
gdbm_nextkey requires a key as its argument, while dbm_nextkey doesn't.
So for gdbm I had to use a static variable that points to the current
position.
As a result, the iterators in gdbm and dbm now behave differently.
>>> import dbm
>>> d = dbm.open('foo', 'n')
>>> d['k1'] = 'v1';d['k2'] = 'v2';
>>> for i in d: print i; break
...
k1
>>> for i in d: print i
...
k2
>>> for i in d: print i
...
>>> import gdbm
>>> gd = gdbm.open('foo.gdbm', 'n')
>>> gd['k1'] = 'v1';gd['k2'] = 'v2';
>>> for i in gd: print i; break
...
k2
>>> for i in gd: print i
...
k1
>>> for i in gd: print i
...
k2
k1
|
msg85889 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-12 10:11 |
Of course iter should work in the same way in all dbm modules.
iter in dbm/gdbm should work like dumbdbm's iter.
>>> import dumbdbm
>>> dumb = dumbdbm.open('foo', 'n')
>>> dumb['k1'] = 'v1';dumb['k2'] = 'v2';
>>> for i in dumb: print i; break
...
k2
>>> for i in dumb: print i
...
k2
k1
>>> for i in dumb: print i
...
k2
k1
|
msg85928 - (view) |
Author: Skip Montanaro (skip.montanaro) * |
Date: 2009-04-12 23:45 |
Akira> Note that dbm and gdbm C API is a little different. gdbm_nextkey
Akira> requires key for its argument, dbm_nextkey don't. So I had to
Akira> use for gdbm an static variable that points to the current
Akira> position.
I don't think this is going to fly. A static variable is not thread-safe.
What's worse, even in a non-threaded environment you might want to iterate
over the gdbm file simultaneously from two different places.
|
msg85931 - (view) |
Author: Skip Montanaro (skip.montanaro) * |
Date: 2009-04-13 00:19 |
skip> What's worse, even in a non-threaded environment you might want to
skip> iterate over the gdbm file simultaneously from two different
skip> places.
Or iterate over two different gdbm files simultaneously.
|
msg85944 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2009-04-13 12:56 |
I agree with Skip that using a static variable is not appropriate. The
proper solution probably would be to define a separate gdbm_iter object
which always preserves the last key returned.
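
As a rough illustration of that idea in Python rather than C: the sketch below keeps the cursor on a separate iterator object instead of in a shared static variable. `TinyDb` and `KeyIterator` are hypothetical names, and `TinyDb` only imitates the `firstkey()` / `nextkey(key)` calls; this is not the code from any attached patch.

```python
class TinyDb(object):
    """Hypothetical stand-in for a gdbm handle (firstkey/nextkey only)."""

    def __init__(self, keys):
        self._keys = list(keys)

    def firstkey(self):
        return self._keys[0] if self._keys else None

    def nextkey(self, key):
        i = self._keys.index(key) + 1
        return self._keys[i] if i < len(self._keys) else None


class KeyIterator(object):
    """Iterator that remembers the last key it returned (per instance)."""

    def __init__(self, db):
        self._db = db
        self._last = None      # last key handed out by this iterator
        self._started = False
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        if not self._started:
            key = self._db.firstkey()
            self._started = True
        else:
            key = self._db.nextkey(self._last)
        if key is None:
            self._done = True
            raise StopIteration
        self._last = key
        return key

    next = __next__  # Python 2 spelling of the same method


db = TinyDb(["k1", "k2"])
a = KeyIterator(db)
b = KeyIterator(db)
first_a = next(a)   # advance only iterator a
all_b = list(b)     # b is unaffected by a's position
rest_a = list(a)    # a resumes after the key it returned last
```

Since each iterator carries its own last key, two loops over the same database (or over two different databases) do not disturb each other, which addresses both of Skip's objections.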
|
msg85946 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2009-04-13 13:43 |
Yes, using a static variable there is wrong, and I'm actually now
working on a "dbm_iterobject", just as Martin suggested.
The dbm iterator should behave just like the one in dict.
I think I can use Objects/dictobject.c as a good example for this.
Attached are minimal tests for the dbm iterator.
|
msg91339 - (view) |
Author: Christopher Lee (foobaron) |
Date: 2009-08-06 00:32 |
Another reason this issue is really important is that the lack of a
consistent iter() interface for dbm.* makes shelve iteration not
scalable; i.e. trying to iterate on a Shelf will run self.dict.keys() to
load the entire index into memory. This seems contrary to a primary
purpose of shelve, namely to store the index on-disk so as to avoid
having to keep the whole index in memory.
I suspect that for most users, shelve is the main way they will access
the dbm.* interfaces. Therefore, fixing the dbm.* interfaces so that
shelve is scalable seems like an important need.
Once dbm and gdbm support the iterator protocol, it will be trivial to
add an __iter__() method to shelve.Shelf that simply returns
iter(self.dict).
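
A minimal sketch of what that could look like, assuming the backing object supports iteration: `LazyShelf` is a hypothetical, stripped-down wrapper written for illustration (it is not `shelve.Shelf`), and a plain dict stands in for an iterable dbm object.

```python
import pickle


class LazyShelf(object):
    """Hypothetical Shelf-like wrapper around an iterable mapping."""

    def __init__(self, backing):
        self.dict = backing  # same attribute name that shelve.Shelf uses

    def __setitem__(self, key, value):
        self.dict[key] = pickle.dumps(value)

    def __getitem__(self, key):
        return pickle.loads(self.dict[key])

    def __iter__(self):
        # Delegate to the backing store's iterator instead of
        # materializing self.dict.keys() as a full list first.
        return iter(self.dict)


shelf = LazyShelf({})  # assumption: a dict stands in for a dbm object
shelf["a"] = [1, 2]
shelf["b"] = {"x": 1}
keys = sorted(shelf)
```

The memory benefit only appears when the backing store's own iterator is lazy; with a dbm object that yields one key at a time, the Shelf would no longer need the whole index in memory.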
|
msg118874 - (view) |
Author: Akira Kitada (akitada) * |
Date: 2010-10-16 15:46 |
This patch just uses PyObject_GetIter to get an iter.
(I just copied the idea from issue9523)
|
msg123358 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2010-12-04 15:20 |
This may be superseded by #9523. There are comments and patches in both issues, so I’m not closing either as duplicate of the other.
|
msg128465 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2011-02-12 20:17 |
#9523 has a more comprehensive patch in progress, adding __iter__ and other mapping methods, so I’m closing this one.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:47 | admin | set | github: 49986 |
2011-02-12 20:17:43 | eric.araujo | set | status: open -> closed
superseder: Improve dbm modules versions:
- Python 3.2 nosy:
loewis, rhettinger, eric.araujo, akitada, foobaron, ysj.ray messages:
+ msg128465 resolution: duplicate stage: resolved |
2010-12-04 15:20:59 | eric.araujo | set | nosy:
+ eric.araujo messages:
+ msg123358
|
2010-10-18 11:41:18 | pitrou | set | nosy:
+ rhettinger
|
2010-10-16 15:47:01 | akitada | set | files:
+ issue5736.diff versions:
+ Python 3.2, - Python 2.7 nosy:
+ ysj.ray
messages:
+ msg118874
|
2010-10-16 15:32:10 | akitada | set | files:
- issue5736.diff |
2010-10-16 15:32:06 | akitada | set | files:
- test_issue5736.diff |
2010-10-16 15:32:02 | akitada | set | files:
- issue5736.diff |
2010-10-16 15:31:52 | akitada | set | files:
- issue5736.diff |
2010-05-20 20:31:03 | skip.montanaro | set | nosy:
- skip.montanaro
|
2009-08-06 00:32:36 | foobaron | set | nosy:
+ foobaron messages:
+ msg91339
|
2009-04-13 13:43:40 | akitada | set | files:
+ test_issue5736.diff
messages:
+ msg85946 |
2009-04-13 12:56:08 | loewis | set | messages:
+ msg85944 |
2009-04-13 00:19:22 | skip.montanaro | set | messages:
+ msg85931 |
2009-04-12 23:45:07 | skip.montanaro | set | nosy:
+ skip.montanaro messages:
+ msg85928
|
2009-04-12 10:11:41 | akitada | set | messages:
+ msg85889 |
2009-04-12 09:47:38 | akitada | set | files:
+ issue5736.diff
messages:
+ msg85888 |
2009-04-11 22:03:38 | loewis | set | nosy:
+ loewis messages:
+ msg85878
|
2009-04-11 16:14:07 | akitada | set | files:
+ issue5736.diff
messages:
+ msg85867 |
2009-04-11 14:11:27 | akitada | set | files:
+ issue5736.diff keywords:
+ patch messages:
+ msg85859
|
2009-04-11 13:47:52 | akitada | create | |