Issue553108
Created on 2002-05-07 03:46 by gtk, last changed 2022-04-10 16:05 by admin. This issue is now closed.
Files
File name | Uploaded | Description
---|---|---
deprecate_bsddb.diff | gtk, 2002-05-07 03:46 | patch to deprecate bsddb and remove from anydbm's list of candidates
bsddb.diff | skip.montanaro, 2002-06-14 03:33 |
Messages (13) | |||
---|---|---|---|
msg39908 - (view) | Author: Garth T Kidd (gtk) | Date: 2002-05-07 03:46 | |
Large numbers of inserts break bsddb, as first discovered in Python 1.5 (bug 408271). According to Barry Warsaw, "trying to get the bsddb module that comes with Python to work is a hopeless cause." If it's broken, let's discourage people from using it. In particular, let's ensure that people importing shelve or anydbm don't end up using it by default. The submitted patch adds a DeprecationWarning to the bsddb module and removes bsddb from the list of db module candidates in anydbm. |
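The two changes described above (a warning on import, and a candidate list that no longer offers the broken backend) could look roughly like the sketch below. This is not the submitted diff: the names `warn_deprecated`, `pick_backend`, and `CANDIDATES` are hypothetical, and the code targets current Python rather than 2.2, so the candidate module names are the modern `dbm.*` ones.

```python
import importlib
import warnings

# Hypothetical candidate list; the point is only that the deprecated
# backend is no longer among the modules tried by default.
CANDIDATES = ("dbm.gnu", "dbm.ndbm", "dbm.dumb")


def warn_deprecated(module_name):
    """Emit the kind of warning a deprecated module would raise on import."""
    warnings.warn(
        "the %s module is deprecated" % module_name,
        DeprecationWarning,
        stacklevel=2,
    )


def pick_backend(candidates=CANDIDATES):
    """Return the first importable module from candidates, tried in order."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # backend not built on this system, try the next one
    raise ImportError("no db module available from %r" % (candidates,))
```

Callers that go through `pick_backend()` never see the deprecated module unless they import it explicitly, which is the behaviour the report asks for from shelve and anydbm.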
|||
msg39909 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-05-08 09:01 | |
Logged In: YES user_id=21627 I'm in favour of this change, but I'd like to simultaneously incorporate bsddb3. |
|||
msg39910 - (view) | Author: Garth T Kidd (gtk) | Date: 2002-05-09 03:12 | |
Logged In: YES user_id=59803 Let's not turn a simple patch into something requiring a PEP, compulsory thrashing on comp.lang.python, SleepyCat being willing to change their distribution model, lawyers (to make sure the licences are compatible), and so on. I'd hate it if other people spent the kind of time I did trying to get shelve to work only to find that a known-broken bsddb was causing all the problems, and that a patch was there to gently guide them to gdbm, but it got jammed because of scope-creep. Let's get this one, very simple and necessary (bsddb IS broken) change out of the way, and THEN start negotiating, thrashing, and integrating. :) I firmly believe bsddb3 should be one of the included batteries. Let's do it, but let's guide people away from broken code first. |
|||
msg39911 - (view) | Author: Martin D Katz, Ph.D. (drbits) | Date: 2002-05-16 23:10 | |
Logged In: YES user_id=276840 I am not sure there is a reason to deprecate bsddb. The btopen format appears to be stable enough for normal work. Maybe 2.3 should change dbhash to use btopen? |
|||
msg39912 - (view) | Author: Martin D Katz, Ph.D. (drbits) | Date: 2002-05-20 18:14 | |
Logged In: YES user_id=276840

    #!/bin/python
    # Test for Python bug report 553108
    # This program shows that bsddb seems to work reliably with
    # the btopen database format.
    # This is based on the test program
    # in the discussion of bug report 445862
    # This has been enhanced to perform read, modify,
    # write operations in random order.
    # This is only one of several tests I performed.
    # This included 4,000,000 read, modify, write operations to 90,909 records
    # (an average of 44,000 writes for each record).
    # Note: This program took approximately 50 hours to run
    # on my 930MHz Pentium 3 under Windows 2000 with
    # ActiveState Python version 2.1.1 build 212
    import unittest, sys, os, math, time

    LIMIT=4000000
    DISPLAY_AT_END=1
    USE_RANDOM=100 # If set, number of keys is approximately LIMIT/USE_RANDOM
    AUTO_RANDOM=1
    if USE_RANDOM and AUTO_RANDOM:
        USE_RANDOM=int(math.sqrt(math.sqrt(LIMIT)))
        if USE_RANDOM < 2:
            USE_RANDOM = 2

    ## The format of the value string is
    ## count|hash|hash...|b
    ## Where
    ## count is an 8 byte hexadecimal count of the number of times
    ## this record has been written.
    ## hash is the md5 hash of the random value that created this record.
    ## It is the key for this record. It is appended once for each
    ## time the record is written (that is, it occurs count times).
    ## b is 129 '!'
    ## if USE_RANDOM is set, its value should be >= 2

    class BreakDB(unittest.TestCase):
        def runTest(self):
            import md5, bsddb, os
            if USE_RANDOM:
                import random
                random.seed()
                max_key=int(LIMIT / USE_RANDOM)
            m = md5.new()
            b = "!" * 129 # small string to write
            db = bsddb.btopen(self.dbname, 'c')
            try:
                self.db = db
                for count in xrange(1, LIMIT+1):
                    if count % 100==0:
                        print >> sys.stderr, " %10d\r" % (count),
                    if USE_RANDOM:
                        r = random.randrange(0, max_key)
                        m = md5.new(str(r))
                        key = m.hexdigest()
                        if db.has_key(key):
                            rec = db[key]
                            old_count = int(rec[0:8], 16)
                            should_be = '%08X|%s%s'% (old_count,
                                ((key+'|') *old_count), b)
                            if rec != should_be:
                                self.fail("Mismatched data: db ["+repr(key)+"]="+
                                    repr(db[key])+". Should be "+repr(should_be))
                                return 1
                        else: # New record
                            rec = '00000000|'+b
                            old_count = 0
                        new_count = old_count+1
                        new_rec = '%08X|%s%s'% (new_count, key, rec[8:], )
                        db[key] = new_rec
                    else:
                        m.update(str(count))
                        db[m.digest()] = b
                try:
                    db.sync()
                except:
                    pass
                if DISPLAY_AT_END:
                    rec = db.first()
                    count = 0
                    while 1:
                        print >> sys.stderr, " count = %6i db[%s]=%s" % (
                            count, rec[0], rec[1], )
                        count += 1
                        try:
                            rec = db.next()
                        except KeyError:
                            break
            finally:
                db.close()

        def unlinkDB(self):
            import os
            if os.path.exists(self.dbname):
                os.unlink(self.dbname)

        def setUp(self):
            self.dbname = 'test.db'
            self.unlinkDB()

        def tearDown(self):
            self.db.close()
            self.unlinkDB()

    if __name__ == '__main__':
        runner = unittest.TextTestRunner()
        runner.run(unittest.TestSuite([BreakDB()]))

|
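For readers who want to try the same read, modify, write check without a Python 2 interpreter or the bsddb module, here is a scaled-down sketch of the technique for current Python. It is an adaptation, not the program from the message: it uses `dbm.dumb` as a stand-in backend, and the helper names `expected` and `stress` are hypothetical. The record layout (8 hex digits, the key repeated once per write, then 129 `!` characters) follows the comments in the original.

```python
import dbm.dumb
import hashlib
import os
import random
import tempfile

PAD = b"!" * 129  # same filler string the original program writes


def expected(key, count):
    """Record layout: 8 hex digits, then 'key|' repeated count times, then PAD."""
    return b"%08X|" % count + (key + b"|") * count + PAD


def stress(path, limit, num_keys, seed=0):
    """Run `limit` random read-modify-write cycles; return distinct keys written."""
    rng = random.Random(seed)
    seen = set()
    db = dbm.dumb.open(path, "c")
    try:
        for _ in range(limit):
            r = rng.randrange(num_keys)
            key = hashlib.md5(str(r).encode()).hexdigest().encode()
            if key in db:
                rec = db[key]
                old_count = int(rec[:8], 16)  # write count stored in the record
                if rec != expected(key, old_count):
                    raise AssertionError("mismatched data for %r" % key)
            else:
                old_count = 0  # new record
            db[key] = expected(key, old_count + 1)
            seen.add(key)
    finally:
        db.close()
    return len(seen)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        keys = stress(os.path.join(tmp, "test"), limit=2000, num_keys=40)
        print(keys, "distinct keys written")
```

Because every record carries its own write count, corruption shows up as a mismatch on the next read of that key rather than only at the end of the run.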
|||
msg39913 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-06-11 16:09 | |
Logged In: YES user_id=44345 I think deprecating bsddb is too drastic. In the first place, the problems you refer to are in the underlying Berkeley DB library, not in the bsddb code itself. In the second place, later versions of the library fix the problem.

The attached patch attempts to modify setup.py and configure.in to solve the problem. It does a couple things differently than the current CVS version:

1. It only searches for versions 2 and 3 of the Berkeley DB library by default. People who know what they are doing can uncomment the information relevant to version 1.
2. It moves all the checking code into setup.py. The header file checks in configure.in were deleted.
3. The ndbm lookalike stuff for the dbm module is done differently.

This has not really been tested yet. I anticipate further changes will be necessary with this code. I'm sure it's not perfect. Please give it a try and let me know how it works for you.

All that said, I think a better migration path is to replace the current module with the bsddb3/pybsddb stuff. I think that would effectively restrict you to versions 3 or 4 of the underlying Berkeley DB library, so it probably couldn't be done with impunity.

Skip |
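The patch itself is not reproduced in this thread, so the following is only a guess at the shape of a version-restricted search like the one in item 1. It assumes the library version can be read from a `DB_VERSION_MAJOR` macro in `db.h`; the function names and the `accepted` parameter are hypothetical. A header with no such macro is treated as an unknown version and skipped.

```python
import os
import re

# Matches lines such as "#define DB_VERSION_MAJOR 3" (spaces or tabs).
_MAJOR = re.compile(r"^\s*#\s*define\s+DB_VERSION_MAJOR\s+(\d+)", re.M)


def db_major_version(header_path):
    """Return the major version declared in a db.h file, or None if absent."""
    with open(header_path, encoding="latin-1") as f:
        match = _MAJOR.search(f.read())
    return int(match.group(1)) if match else None


def find_berkeley_db(include_dirs, accepted=(2, 3)):
    """Return (directory, major) for the first acceptable db.h, else None."""
    for directory in include_dirs:
        header = os.path.join(directory, "db.h")
        if not os.path.isfile(header):
            continue
        major = db_major_version(header)
        if major in accepted:
            return directory, major
    return None
```

Widening `accepted` is the equivalent of "uncommenting" support for another version: the search order stays the same and only the filter changes.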
|||
msg39914 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-06-13 07:35 | |
Logged In: YES user_id=44345 Here's an updated patch. It's different in a couple ways:

* Support for Berkeley DB 4.x was added. You will need to configure Berkeley DB with the 1.85 compatibility stuff.
* I cleaned up the dbm build code a bit.
* I added a diff for the configure file for people who don't have autoconf handy.

Skip |
|||
msg39915 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-06-14 03:33 | |
Logged In: YES user_id=44345 a couple more tweaks... I forgot to include dbmmodule.c in previous patches. This version of the patch also includes a modified README file that adds a section about building the bsddb and dbm modules. |
|||
msg39916 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-06-14 07:16 | |
Logged In: YES user_id=21627 The patch looks good, please apply it. |
|||
msg39917 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-06-14 20:32 | |
Logged In: YES user_id=44345 Implemented in:

* setup.py 1.93
* README 1.147
* configure 1.315
* configure.in 1.325
* pyconfig.h.in 1.42
* Modules/dbmmodule 2.30 |
|||
msg39918 - (view) | Author: Jack Jansen (jackjansen) * | Date: 2002-07-02 21:52 | |
Logged In: YES user_id=45365 Skip, I'm reopening this bug report: the fix breaks builds on Mac OS X, and I haven't a clue as to how to fix this so I hope you can help. MacOSX has /usr/include/ndbm.h (implemented with Berkeley DB, I think) but it doesn't have any of the libraries (I assume everything needed is in libc). Everything worked fine until last week, when configure still took care of defining HAVE_NDBM_H. |
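The situation described here (a header is present, but there is no separate library to link) can be handled by deciding the macro and the library list independently. The sketch below is one way to express that, not the fix that was eventually applied: `ndbm_build_config` is a hypothetical helper, the lookup function is passed in so the logic can be exercised without touching the system, and the assumption that `dbm_open` lives in libc when no `ndbm` library is found is taken from the message.

```python
import os


def ndbm_build_config(include_dirs, find_library):
    """Return (define_macros, libraries) for a dbm build, or None without ndbm.h.

    find_library is any callable that maps a library name to a path or None,
    for example ctypes.util.find_library.
    """
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, "ndbm.h")):
            break
    else:
        return None  # header not found anywhere: skip building the module

    # Define the macro from the build script, since configure no longer does.
    macros = [("HAVE_NDBM_H", None)]
    # No separate libndbm: assume dbm_open is in libc and link nothing extra.
    libraries = ["ndbm"] if find_library("ndbm") else []
    return macros, libraries
```

A build script could call it as `ndbm_build_config(["/usr/include"], ctypes.util.find_library)` and pass the two results on as an extension's `define_macros` and `libraries`.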
|||
msg39919 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-07-02 22:17 | |
Logged In: YES user_id=44345 Jack, Sorry to hear you're having trouble. Alas, my MacOS X system is with my wife at the moment, so I can't dig into the problem much. Can you provide me with some background info? If you can send me your copy of ndbm.h (I doubt it's using Berkeley DB) and figure out which library dbm_open resides in, that would be great. Also, can you provide me with the output of the build process so I can see just what errors are being generated? Skip |
|||
msg39920 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2002-08-06 17:43 | |
Logged In: YES user_id=44345 Closing this again. I think Jack's running okay on MacOSX once again. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:05:18 | admin | set | github: 36567 |
2002-05-07 03:46:04 | gtk | create |