msg73348 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2008-09-18 01:18 |
This happens on the Windows buildbot everything once and a while [1]:
======================================================================
FAIL: test01_basic_replication
(bsddb.test.test_replication.DBReplicationManager)
----------------------------------------------------------------------
Traceback (most recent call last):
File
"S:\buildbots\python\trunk.nelson-windows\build\lib\bsddb\test\test_replication.py",
line 122, in test01_basic_replication
self.assertTrue(time.time()<timeout)
AssertionError
----------------------------------------------------------------------
http://python.org/dev/buildbot/stable/x86%20W2k8%20trunk/builds/159/step-test/0
|
msg73351 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2008-09-18 01:44 |
We're going to disable this test for 2.6rc2. Please jcea and
gregory.p.smith, take a look at it for 2.6 final. I've made it a
deferred blocker for that release.
|
msg73352 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2008-09-18 01:51 |
I instrumented the code a little. The error is happening because
self.client_startupdone never gets set to True. This is supposed to be
set in the client_startupdone() method. It expects an event type of
db.DB_EVENT_REP_STARTUPDONE, but we see exactly one call to this
function with a value of 2, which is apparently DB_EVENT_REP_CLIENT.
After this call no further calls are made to the method and after 10
seconds the test gives up.
|
msg73357 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2008-09-18 02:50 |
As discussed with Barry on #python-dev, I committed r66498 which skips
the failing assertion on Windows and replaces it with some noise to
stderr. Note that only that one assertion fails - the rest of the test
passes on Windows.
Also, Brett Cannon on #python-dev experimented and could see the
callback function being called a number of times - with 2 first as
Windows does, but then a number of other times and once with the
expected value. I'm afraid I've no insight into why these aren't
delivered on Windows.
|
msg73370 - (view) |
Author: Jesús Cea Avión (jcea) *  |
Date: 2008-09-18 11:22 |
Adding Nelson to the nosy list, since I'm unable to debug any Windows
issue by myself.
I'm very interested in the report saying that MS Windows Berkeley DB
correctly sends the db.DB_EVENT_REP_STARTUPDONE event in his setup. That
could indicate some issue in the buildbot.
Barry, I don't consider this a "release blocker" for 2.6.0. This is a
very advanced feature (database replication). This must be solved, but
2.6 doesn't need to be delayed for it.
|
msg73371 - (view) |
Author: Jesús Cea Avión (jcea) *  |
Date: 2008-09-18 11:25 |
¿Somebody can try this issue in a Windows box?. It could be a Berkeley
DB bug in that platform, but I would like to verify this before
informing Oracle.
Thanks.
|
msg73386 - (view) |
Author: Hirokazu Yamamoto (ocean-city) *  |
Date: 2008-09-18 14:43 |
I've tried this issue on VC6 + db4.7.25.0 + win2k, and I could reproduce
error.
I added following patch for investigation.
Index: Lib/bsddb/test/test_replication.py
===================================================================
--- Lib/bsddb/test/test_replication.py (revision 66483)
+++ Lib/bsddb/test/test_replication.py (working copy)
@@ -40,6 +40,7 @@
self.confirmed_master=True
def client_startupdone(a,b,c) :
+ print "DBReplicationManager:", (a, b, c)
if b==db.DB_EVENT_REP_STARTUPDONE :
self.client_startupdone=True
@@ -201,6 +202,7 @@
self.confirmed_master = True
def client_startupdone(a,b,c) :
+ print "DBBaseReplication:", (a, b, c)
if b == db.DB_EVENT_REP_STARTUPDONE :
self.client_startupdone = True
And this is result.
DBReplicationManager: (<DBEnv object at 0x00A80148>, 2, None)
DBReplicationManager: (<DBEnv object at 0x00A80148>, 5, 0)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 5, 13)
DBBaseReplication: (<DBEnv object at 0x00B35CD0>, 7, None)
DBBaseReplication: (<DBEnv object at 0x00B49658>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B49658>, 5, 13)
DBBaseReplication: (<DBEnv object at 0x00B49730>, 2, None)
DBBaseReplication: (<DBEnv object at 0x00B49730>, 5, 13)
|
msg73415 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2008-09-19 02:15 |
We are seeing one more error almost identical to the one I fixed (even
the method name is the same), but its at line 315 - this is the last
error in the Windows buildbot! Please let me know if you would like me
to make a similar "fix" to this line, or if you think you will be able
to back out my hack or skip the test some other way.
|
msg73421 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2008-09-19 05:56 |
just make a similar "fix" for now.
|
msg73452 - (view) |
Author: Mark Hammond (mhammond) *  |
Date: 2008-09-20 01:18 |
Actually, I've decided to leave it alone. The buildbots most recent
failure was:
test test_bsddb3 failed -- Traceback (most recent call last):
File
"S:\buildbots\python\trunk.nelson-windows\build\lib\bsddb\test\test_replication.py",
line 315, in test01_basic_replication
self.assertTrue(time.time()<timeout)
AssertionError
But I can't repro that - instead, I now see:
Traceback (most recent call last):
File "o:\src\python-svn\lib\bsddb\test\test_lock.py", line 127, in
test03_lock_timeout
self.assertTrue((end_time-start_time) >= 0.1)
AssertionError
So I go back to the windows buildbots and they are all green!! So any
remaining issues appear truly transient and don't seem to be restricted
to a single place. I can't think of an obvious improvement to make here.
|
msg73785 - (view) |
Author: Jesús Cea Avión (jcea) *  |
Date: 2008-09-25 12:53 |
Oracle guys are studying this issue. I will keep you informed.
This issue is not a release blocker, in any case. See rational in msg73370.
|
msg73912 - (view) |
Author: Jesús Cea Avión (jcea) *  |
Date: 2008-09-27 01:37 |
I need some MS Windows user able to replicate this issue locally (not in
the buildbot). Oracle people need to do some test and I would like to
avoid to mess with builtbots.
Please, help!.
|
msg73920 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-09-27 08:53 |
I reproduce the test failure very consistently, on both win2k and winXP,
and may be of some help to test stuff.
|
msg80652 - (view) |
Author: Hirokazu Yamamoto (ocean-city) *  |
Date: 2009-01-27 14:29 |
I've investigated more. I think this is timeout problem. According to
following site,
http://www.oracle.com/technology/documentation/berkeley-
db/db/api_c/rep_timeout.html
timeout should be specified in microseconds. This means
self.dbenvMaster.rep_set_timeout(db.DB_REP_CONNECTION_RETRY,100123)
only waits 10msec. This is too small comparing to default timeout.
Probably 10sec is appropriate here.
After I applied an attached patch "fix_timeout.py", I ran test several
times and found no error.
# I changed time.time()+10 to time.time()+30 because my machine is too
slow, so 10sec was not enough. :-[
|
msg94932 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2009-11-05 14:49 |
I changed the timeout to +30, and the bsddb3 test has been passing on
the windows buildbot. That buildbot is a virtualbox instance running on
a P4, so it makes sense that it just took a long time for the (second)
test to pass. While testing I noticed that on my gentoo buildbot the
ReplicationManager test would time out but the base test would pass...so
I wonder if the test that is currently being skipped would pass if you
gave that windows buildbot even longer to think about it. But since
bsddb3 isn't in py3k, I'm not inclined to take the time to find out.
|
msg95247 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2009-11-14 18:22 |
Failures still occur occasionally even with the timeout set to 60. So
I've turned the check that is skipped on Windows from an assertion into
a warning only on all other platforms, since bsddb support isn't
actively maintained and is gone in py3k. Fix applied to trunk in r76265
and 2.6 in r76267.
|
msg97735 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2010-01-13 19:45 |
Still occurs, on 2010-01-13.
http://www.python.org/dev/buildbot/all/builders/x86%20Ubuntu%20trunk/builds/264
|
msg98521 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2010-01-29 18:35 |
Hmm. That's not actually the same timeout. What that means is that if the startup timeout triggers, then a subsequent test (one that is looking for the results of a client transaction) fails, which contradicts the earlier observation that even if the startup timeout fails then everything else still passes.
|
msg98537 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2010-01-29 20:46 |
Since this isn't likely to get fixed and it is causing buildbot instability, I propose the attached patch to just skip the remainder of the test method if this second timeout happens.
|
msg100234 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2010-03-01 01:55 |
r.david.murray - sounds like a good idea.
|
msg100266 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2010-03-01 20:09 |
Applied skip patch to trunk in r78558, and 2.6 in 78560.
|
msg214525 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2014-03-23 00:10 |
Is any follow up needed to this or can it be closed as bsddb has been removed from Python 3?
|
msg214529 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2014-03-23 00:52 |
Yeah, let's just close it. Nobody is likely to care enough to try to fix it properly.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:39 | admin | set | github: 48142 |
2014-03-23 00:52:52 | r.david.murray | set | status: languishing -> closed resolution: out of date messages:
+ msg214529
stage: needs patch -> resolved |
2014-03-23 00:10:57 | BreamoreBoy | set | nosy:
+ BreamoreBoy
messages:
+ msg214525 versions:
- Python 2.6 |
2010-03-01 20:09:33 | r.david.murray | set | status: open -> languishing
messages:
+ msg100266 |
2010-03-01 01:55:27 | gregory.p.smith | set | messages:
+ msg100234 |
2010-01-29 20:47:00 | r.david.murray | set | files:
+ bsddb3-skip-random-failure.patch keywords:
+ patch messages:
+ msg98537
|
2010-01-29 18:38:33 | r.david.murray | set | keywords:
+ buildbot, - patch |
2010-01-29 18:35:40 | r.david.murray | set | messages:
+ msg98521 stage: patch review -> needs patch |
2010-01-13 19:45:53 | flox | set | nosy:
+ flox messages:
+ msg97735
|
2009-11-14 18:22:42 | r.david.murray | set | title: bsddb: test01_basic_replication fails on Windows sometimes -> bsddb: test01_basic_replication fails sometimes nosy:
mhammond, barry, gregory.p.smith, jcea, amaury.forgeotdarc, ocean-city, benjamin.peterson, trent, r.david.murray messages:
+ msg95247
components:
+ Tests type: behavior stage: patch review |
2009-11-06 02:47:35 | r.david.murray | set | keywords:
- buildbot |
2009-11-05 14:49:12 | r.david.murray | set | nosy:
+ r.david.murray
messages:
+ msg94932 versions:
+ Python 2.7 |
2009-10-30 19:47:16 | r.david.murray | set | keywords:
+ buildbot |
2009-01-27 14:30:01 | ocean-city | set | files:
+ fix_timeout.patch keywords:
+ patch messages:
+ msg80652 |
2008-09-27 08:53:20 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages:
+ msg73920 |
2008-09-27 01:37:59 | jcea | set | messages:
+ msg73912 |
2008-09-26 23:37:33 | benjamin.peterson | set | priority: release blocker -> normal |
2008-09-25 12:53:10 | jcea | set | messages:
+ msg73785 |
2008-09-20 01:18:06 | mhammond | set | messages:
+ msg73452 |
2008-09-19 05:56:44 | gregory.p.smith | set | messages:
+ msg73421 |
2008-09-19 02:15:25 | mhammond | set | messages:
+ msg73415 |
2008-09-18 14:43:06 | ocean-city | set | nosy:
+ ocean-city messages:
+ msg73386 |
2008-09-18 11:25:50 | jcea | set | messages:
+ msg73371 |
2008-09-18 11:22:42 | jcea | set | nosy:
+ trent messages:
+ msg73370 |
2008-09-18 05:44:17 | barry | set | priority: deferred blocker -> release blocker |
2008-09-18 02:50:17 | mhammond | set | messages:
+ msg73357 |
2008-09-18 01:51:41 | mhammond | set | nosy:
+ mhammond messages:
+ msg73352 |
2008-09-18 01:44:45 | barry | set | priority: deferred blocker nosy:
+ barry messages:
+ msg73351 |
2008-09-18 01:43:36 | barry | set | nosy:
+ gregory.p.smith |
2008-09-18 01:18:07 | benjamin.peterson | create | |