Issue3885
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2008-09-16 21:08 by vstinner, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
_bsddb.patch | vstinner, 2008-09-16 21:08 | Patch fixing described bugs | ||
_bsddb_test.patch | vstinner, 2008-09-17 00:20 | Test to reproduce the crash | ||
bsddb.patch | jcea, 2008-09-18 15:20 | Patch for python 2.6. |
Messages (19) | |||
---|---|---|---|
msg73312 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-16 21:08 | |
I found two differents bugs using Fusil the fuzzer. (1) On db_env_create() error in newDBEnvObject(), self->db_env is not set and so use of this pointer may crashs. (2) DBEnv_dealloc() may raise an exception (DBEnv_close_internal() calls makeDBError()) but the garbage collector dislike exceptions! Example of (2): ---- $ python >>> import gc, _bsddb; env=_bsddb.DBEnv(3); del env >>> gc.collect(); gc.collect() Exception bsddb.db.DBNoServerError: (-30992, 'DB_NOSERVER: Fatal error, no RPC server -- No Berkeley DB RPC server environment') in 'garbage collection' ignored Fatal Python error: unexpected exception during garbage collection ---- |
|||
msg73313 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-16 21:14 | |
About the bug (1): it also occurs in DBEnv_dealloc() but DBEnv_dealloc() is directly called from newDBEnvObject() with Py_DECREF(self);. The two bugs can be reproduces with dummy DBenv() arguments, eg. "DBEnv(92)". Backtrace using gdb: ----- $ gdb ./python ... >>> import _bsddb; _bsddb.DBenv(92) ... Program received signal SIGSEGV, Segmentation fault. 0xb7cc8f5e in DBEnv_close_internal (self=0x83385f0, flags=0) at ../Modules/_bsddb.c:3989 3989 err = self->db_env->close(self->db_env, flags); (gdb) print self->db_env $1 = (DB_ENV *) 0xfffffffe ----- There db_env value is not set. |
|||
msg73319 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-16 22:48 | |
Good catch. I've patched my private branch. Following 2.6 patches, if anybody can provide a review, since we are in RC status. I was wondering if we must protect other objects from raising exceptions in GC. For example, DB objects. What do you think?. (I have broken my left wrist ten days ago. If you can write a testcase showing the issue and a patch for it, would be very nice). |
|||
msg73320 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-17 00:20 | |
Here is a test to reproduce the crash. I don't know db_env_create() function, so I don't know what are "invalid arguments". So I used ~DB_RPCCLIENT. "gc.collect()" is to detect the bug (2). About other objects dealloc method... DBEnv_dealloc() `-> DBEnv_close_internal(): RETURN_IF_ERR() DB_dealloc() `-> DB_close_internal(): RETURN_IF_ERR() DBC_dealloc() `-> DBC_close_internal(): RETURN_IF_ERR() DBSequence_dealloc() `-> DBSequence_close_internal(): RETURN_IF_ERR() DBTxn_dealloc() `-> DBTxn_abort_discard_internal(): RETURN_IF_ERR() DBLock_dealloc() looks ok. |
|||
msg73327 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * ![]() |
Date: 2008-09-17 08:53 | |
I thought first that the problem was during the execution of gc.collect(), but its not: when configured --with-pydebug, the exception is printed before: >>> import gc, _bsddb; env=_bsddb.DBEnv(3); del env XXX undetected error Traceback (most recent call last): File "<stdin>", line 1, in <module> bsddb.db.DBNoServerError: (-30992, 'DB_NOSERVER: Fatal error, no RPC server -- No Berkeley DB RPC server environment') gc.collect() is just a rude way to display this "XXX undetected error". (Victor: does Fusil check for this? gc.collect() will not fail if there is another exception in-between, or in debug mode) Now, to the _bsddb module: in general, the following pattern is wrong: dummy = someFunction(); Py_XDECREF(dummy); because it does nothing about an eventual exception set. If the exception can be discarded, PyErr_Clear() must be called. I think there is an invariant to keep in each function: return NULL if and only if an exception is set. Many places in _bsddb do not respect this. |
|||
msg73328 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-17 09:38 | |
> gc.collect() is just a rude way to display this "XXX undetected error". > (Victor: does Fusil check for this? gc.collect() will not fail if there > is another exception in-between, or in debug mode) I stopped to fuzz Python using --pydebug because a critical design error in CPython reference counting: http://bugs.python.org/issue3299 (some modules use PyObject_DEL() instead of Py_DECREF() whereas PyObject_DEL() creates inconsistent objects in the double linked object list). Since nobody cares about this issue, I don't use pydebug anymore. > Now, to the _bsddb module: in general, the following pattern is wrong: > dummy = someFunction(); > Py_XDECREF(dummy); > because it does nothing about an eventual exception set. If the > exception can be discarded, PyErr_Clear() must be called. Well, we have to choices: don't raise an error, or clear the exception if an exception is raised. > I think there is an invariant to keep in each function: return NULL if > and only if an exception is set. Many places in _bsddb do not respect this. Where? dealloc() callback prototype is "void tp_dealloc(...)": no result! It's not possible to tell Python that an error occured. |
|||
msg73329 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-17 10:22 | |
So, the right thing to do seems to drop the "check_error" flag and just do a "PyErr_Clear()" if necessary in the dealloc code. This must be done in all dealloc code: DBEnv, DB, DBSequence and DBCursor. Do you agree?. |
|||
msg73330 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * ![]() |
Date: 2008-09-17 11:25 | |
Not only in these functions, but anywhere "dummy=" is followed by a Py_XDECREF(dummy); And code like this must also be changed (in DB_open): if (makeDBError(err)) { PyObject *dummy; dummy=DB_close_internal(self,0); Py_XDECREF(dummy); return NULL; } if DB_close_internal() raises an exception it will replace the original error set by makeDBError(). I'm not sure this is desirable. |
|||
msg73334 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * ![]() |
Date: 2008-09-17 13:17 | |
> I stopped to fuzz Python using --pydebug because a critical design > error in CPython reference counting I won't comment on this, but to get the extra checks you could arrange that ceval.c is compiled with CHECKEXC defined. "./configure BASECFLAGS=-DCHECKEXC" did the trick for me. > Where? dealloc() callback prototype is "void tp_dealloc(...)": > no result! It's not possible to tell Python that an error occured. Exactly. You cannot return NULL (or -1 for other kind of functions), so no exception may be raised; PyErr_Clear() calls are necessary. |
|||
msg73387 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-18 15:20 | |
> Not only in these functions, but anywhere "dummy=" is followed by a > Py_XDECREF(dummy); Please, clarify this. I don't see your point. > And code like this must also be changed (in DB_open): > if (makeDBError(err)) { > PyObject *dummy; > > dummy=DB_close_internal(self,0); > Py_XDECREF(dummy); > return NULL; > } > if DB_close_internal() raises an exception it will replace the > original error set by makeDBError(). I'm not sure this is desirable. I agree this could be an issue that need to be studied. Please, post a new bug report for that (to be addressed in 2.6.1, I think, since this issue doesn't crash the interpreter). I post a patch for the bugs originally indicated in this bug report, an another one (db.verify() regression). *Please Review*. I will update python 2.6 svn when the review is done. Thanks. |
|||
msg73388 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * ![]() |
Date: 2008-09-18 16:10 | |
> > Not only in these functions, but anywhere "dummy=" is followed by a > > Py_XDECREF(dummy); > > Please, clarify this. I don't see your point. If dummy is NULL, a pending exception has been set (with PyErr_SetString or another similar function). If you simply ignore this, your function will return a valid object, but PyErr_Occurred() returns True. This is not correct and will fail in subtle ways (like with gc.collect()); in debug mode, a message "XXX undetected error" is printed. You must either clear the exception, or return NULL from your function. |
|||
msg73395 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-18 18:29 | |
Very good point, Amaury. I would open a new bug to track this (probably for 2.6.1 time). Can you?. In the meanwhile, I need review for the inmediate patch, since the crash is 100% reproductible. |
|||
msg73594 - (view) | Author: Alyssa Coghlan (ncoghlan) * ![]() |
Date: 2008-09-22 21:02 | |
Patch comments: - the test suite section of the diff appears to have a number of changes that are unrelated to this issue - the purpose of the new do_not_close flag (i.e. avoiding the crash) could use a comment at the point where it is referenced in the code - aren't all those dummy=DB_close_internal(*) + Py_XDECREF calls also in need of the PyErr_Clear() treatment? - globally clearing *any* error unconditionally is typically a bad idea. If there are certain 'expected' errors (such as the No Server error Victor was triggering or other BSDDB errors) then those can be cleared silently, but for other errors something should at least be written to stderr to indicate that an error was ignored because it couldn't be propagated back to the eval loop (and possibly even for the BSDDB errors). |
|||
msg73602 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-22 22:20 | |
Nick: 1. Yes, the code actually patches an unrelated regression too (DB.verify() crashes). I added the testcase, since the testsuite didn't exercise "DB.verify()" and so the bug was lurking there for months. It is solved now, and the testcase is there to protect us in the future. I can split the patch, if demanded, nevertheless. 2. About commenting "do_not_close" use, I agree. Done. Now the code says: """ /* ** "do_not_close" is used to dispose all related objects in the ** tree, without actually releasing the "root" object. ** This is done, for example, because function calls like ** "DB.verify()" implicitly close the underlying handle. So ** the handle doesn't need to be closed, but related objects ** must be cleaned up. */ """ Similarly in "DBSequence.remove()". I'm actually thinking about splitting "*_close_internal()" functions in "release related objects"+"object close". I don't like the "do_not_close" flag. But that would be a post-2.6.0 change that I'm still considering. 3. About "dummy=DB_close_internal(*) + Py_XDECREF", the crashes are related to raising exceptions *while* deallocating objects. These are solved with this patch. Other "dummy=DB_close_internal(*) + Py_XDECREF" uses need be improved, but they do not crash the library, and the changes are a bit extensive for a post RC code. They are in my TO DO list, but I think they can be left out of Python 2.6.0. See my previous post. 4. I agree with you. The problem is related to cleaning a partially/incorrectly initialized object. Ideally the error should be raised in the object constructor. Nevertheless in this case the RPC initialization involves several functions that must be called in sequence: object created as "RPC enabled"+specifying the details of remote RPC server. If the handle is disposed before the second step, it can be safely deallocated without showing any spurious error. In this concrete example, I don't think the user needs to know about the "phantom" error. If you don't agree, please tell to me. Would be a few "fprintf()" enough?. Thanks for your time, Nick. |
|||
msg73628 - (view) | Author: Alyssa Coghlan (ncoghlan) * ![]() |
Date: 2008-09-23 12:06 | |
All of those explanations sound fair to me - with those questions answered, the patch looks good to me. |
|||
msg73657 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-23 19:27 | |
Thanks. Committed as r66568. |
|||
msg73670 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-23 21:49 | |
Ok, thanks. My initial bug is closed and my fuzzer is unable to find new ones ;-) Great job. |
|||
msg73672 - (view) | Author: Jesús Cea Avión (jcea) * ![]() |
Date: 2008-09-23 22:24 | |
Would be nice to use that fuzzer myself. Details, please :). |
|||
msg73676 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2008-09-23 22:47 | |
> Would be nice to use that fuzzer myself. Details, please :). It's not the best place to explain it, so I will try explain shortly: * install fusil 1.0: use mandriva/debian packages, or use sources * (create fusil user group) * run "sudo fusil-python --module=_bsddb" The best way is to use the trunk version (without installation)! ---- cd <your fusil home> svn co http://python-ptrace.hachoir.org/svn/trunk python-ptrace svn co http://fusil.hachoir.org/svn/trunk fusil export PYTHONPATH=$PWD/python-ptrace/ptrace:$PWD/fusil:$PYTHONPATH sudo <your python interpreter> ./fusil/fuzzer/fusil-python --module=_bsddb ---- (refer to python-ptrace/INSTALL and fusil/INSTALL) You have to run the fuzzer as root to be able to run child processes as the fusil user. Never run fusil as root or it will kill you(r computer)! See also http://fusil.hachoir.org/trac/wiki/Contact |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:39 | admin | set | github: 48135 |
2008-09-23 22:47:23 | vstinner | set | messages: + msg73676 |
2008-09-23 22:24:13 | jcea | set | messages: + msg73672 |
2008-09-23 21:49:05 | vstinner | set | messages: + msg73670 |
2008-09-23 21:29:51 | jcea | set | status: open -> closed type: crash resolution: fixed |
2008-09-23 19:27:51 | jcea | set | messages: + msg73657 |
2008-09-23 12:06:02 | ncoghlan | set | messages: + msg73628 |
2008-09-22 22:20:57 | jcea | set | messages: + msg73602 |
2008-09-22 21:02:04 | ncoghlan | set | nosy:
+ ncoghlan messages: + msg73594 |
2008-09-18 18:29:09 | jcea | set | messages: + msg73395 |
2008-09-18 16:10:21 | amaury.forgeotdarc | set | messages: + msg73388 |
2008-09-18 15:20:13 | jcea | set | files:
+ bsddb.patch messages: + msg73387 |
2008-09-17 13:17:23 | amaury.forgeotdarc | set | messages: + msg73334 |
2008-09-17 11:25:05 | amaury.forgeotdarc | set | messages: + msg73330 |
2008-09-17 10:22:08 | jcea | set | messages: + msg73329 |
2008-09-17 09:38:04 | vstinner | set | messages: + msg73328 |
2008-09-17 08:53:29 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages: + msg73327 |
2008-09-17 00:20:58 | vstinner | set | files:
+ _bsddb_test.patch messages: + msg73320 |
2008-09-16 22:48:22 | jcea | set | priority: high keywords: + needs review messages: + msg73319 |
2008-09-16 21:14:21 | vstinner | set | messages: + msg73313 |
2008-09-16 21:13:00 | benjamin.peterson | set | assignee: jcea nosy: + jcea |
2008-09-16 21:08:59 | vstinner | create |