dtoa.c: oversize b in quorem, and a menagerie of other bugs #51881
Comments
In a debug build: Python 3.2a0 (py3k:76671M, Dec 22 2009, 19:41:08)
[GCC 4.1.3 20080623 (prerelease) (Ubuntu 4.1.2-23ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "2183167012312112312312.23538020374420446192e-370"
[30473 refs]
>>> f = float(s)
oversize b in quorem |
Nice catch! I'll take a look. We should find out whether this is something that happens with Gay's original code, or whether it was introduced in the process of adapting that code for Python. |
I can reproduce this on OS X 10.6 (64-bit), both in py3k and trunk debug builds. In non-debug builds it appears to return the correct result (0.0), so the oversize b appears to have no ill effects. So this may just be an overeager assert; it may be a symptom of a deeper problem, though. |
I'm testing on a Fedora Core 6 i386 box and an Intel Mac 32-bit 10.5 box. I only see this on debug builds. I've tested trunk, py3k, release31-maint, and release26-maint (just for giggles). The error shows up in debug builds of trunk, py3k, and release31-maint on both machines, and does not show up in non-debug builds. |
The bug is present in the current version of dtoa.c from http://www.netlib.org/fp, so I'll report it upstream. As far as I can tell, though, it's benign, in the sense that if the check is disabled then nothing bad happens, and the correct result is eventually returned (albeit after some unnecessary computation). I suspect that the problem is in the if block around lines 1531--1543 of Python/dtoa.c: a subnormal rv isn't being handled correctly here---it should end up being set to 0.0, but is instead set to 2**-968. |
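For reference, the input really is far below half the smallest subnormal double, so the correctly rounded result is 0.0; this is easy to confirm with exact rational arithmetic (a quick sanity check, not part of the patch):

```python
from fractions import Fraction

s = "2183167012312112312312.23538020374420446192e-370"
# the exact value is about 2.2e-349, far below half the smallest
# subnormal double (about 2.5e-324), so it must round to exactly 0.0
assert float(Fraction(s)) == 0.0
# ...and nowhere near the 2**-968 produced by the suspect branch
assert 2.0 ** -968 != 0.0
```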
Here's a patch that seems to fix the problem; I'll wait a while to see if I get a response from David Gay before applying this. Also, if we've got to the stage of modifying the algorithmic part of the original dtoa.c, we should really make sure that we've got our own set of comprehensive tests. |
Randomised testing quickly turned up another troublesome string for str -> float conversion:

    s = "94393431193180696942841837085033647913224148539854e-358"

This one's actually giving incorrectly rounded results (the horror!) in a non-debug build of trunk, and the same 'oversize b in quorem' error in a debug build. With the patch, it doesn't give the 'oversize b' error, but it does still give incorrect results.

Python 2.7a1+ (trunk:77375, Jan 8 2010, 20:33:59)
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "94393431193180696942841837085033647913224148539854e-358"
>>> float(s) # result of dtoa.c
9.439343119318067e-309
>>> from __future__ import division
>>> int(s[:-5])/10**358 # result via (correctly rounded) division
9.43934311931807e-309

I also double-checked this value using a simple pure Python implementation of strtod, and using MPFR (via the Python bigfloat module), with the same result:

>>> from test_dtoa import strtod
>>> strtod(s) # result via a simple pure Python implementation of strtod
9.43934311931807e-309
>>> from bigfloat import *
>>> with double_precision: x = float(BigFloat(s))
>>> x # result from MPFR, via the bigfloat module
9.43934311931807e-309 |
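The pure Python test_dtoa module referred to above isn't attached here, but a minimal correctly rounded reference conversion is easy to sketch: CPython's integer true division is correctly rounded, so evaluating the string as an exact rational gives the right answer (a sketch in that spirit, not the actual test_dtoa code):

```python
from fractions import Fraction

def strtod_ref(s):
    # int / int true division is correctly rounded in CPython, so this
    # rounds the exact decimal value to the nearest double (ties to even)
    return float(Fraction(s))

s = "94393431193180696942841837085033647913224148539854e-358"
assert strtod_ref(s) == float("9.43934311931807e-309")
```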
Okay, I think I've found the cause of the second rounding bug above: at the end of the bigcomp function there's a correction block that looks like

    ...
    else if (dd < 0) {
        if (!dsign)	/* does not happen for round-near */
    retlow1:
            dval(rv) -= ulp(rv);
    }
    else if (dd > 0) {
        if (dsign) {
    rethi1:
            dval(rv) += ulp(rv);
        }
    }
    else ...

The problem is that the += and -= corrections don't take into account the possibility that bc->scale is nonzero, and in the case where the scaled rv is subnormal, they'll typically have no effect. I'll work on a fix... tomorrow. |
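The failure mode, that an ulp-sized correction computed at the wrong scale is simply absorbed by rounding, is easy to demonstrate in Python (an illustration of the floating-point effect, not of the C code itself):

```python
import math

# a nudge well below half an ulp is absorbed by rounding...
assert 1.0 + 1e-17 == 1.0          # ulp(1.0) is about 2.2e-16
# ...while a nudge of the right magnitude moves the value
assert 1.0 + math.ulp(1.0) > 1.0
# in the subnormal range the spacing is the minimum subnormal, so a
# correction sized for the scaled (normalised) value can vanish entirely
x = 1e-310
assert x + math.ulp(x) > x
```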
Second patch, adding a fix for the rounding bug to the first patch. |
Here's the (rather crude) testing program that turned up these errors. |
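The attachment itself isn't shown here, but the shape of such a tester is simple: generate random decimal strings and compare float() against an exact-rational reference (a sketch along those lines, not Mark's actual script; the names and seed are invented):

```python
import random
from fractions import Fraction

def check(s):
    # exact-rational reference: CPython's int/int division is
    # correctly rounded, so this catches any misrounded conversion
    assert float(s) == float(Fraction(s)), s

random.seed(7632)
for _ in range(1000):
    ndigits = random.randint(1, 50)
    digits = "".join(random.choice("0123456789") for _ in range(ndigits))
    exponent = random.randint(-326, -250)  # aim at the subnormal range
    check("%se%d" % (digits, exponent))
```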
One more incorrectly rounded result, this time for a normal number:

    AssertionError: Incorrectly rounded str->float conversion for 99999999999999994487665465554760717039532578546e-47: expected 0x1.0000000000000p+0, got 0x1.fffffffffffffp-1

|
Showing once again that a proof of FP code correctness is about as compelling as a proof of God's ontological status ;-) Still, have to express surprised admiration for 99999999999999994487665465554760717039532578546e-47! That one's not even close to being a "hard" case. |
Clearly we need a 1000-page Isabelle/HOL-style machine-checked formal proof, rather than a ten-page TeX proof. Any takers? All of the above bugs seem to have been introduced with the new 'bigcomp' code that arrived on March 16, 2009, just a couple of weeks before I downloaded the version that got adapted for Python; in retrospect, I probably should have used the NO_STRTOD_BIGCOMP #define to bypass the new code. |
Progress report: I've had a response, and fix, from David Gay for the first 2 bugs (Stefan's original bug and the incorrect subnormal result); I'm still arguing with him about a 3rd one (not reported here; there's some possibly incorrect code in bigcomp that probably never actually gets called). I reported the 4th bug (the incorrect rounding for values near 1) to him today. In the meantime, here's bug number 5, found by eyeballing the bigcomp code until it surrendered. :-)

>>> 1000000000000000000000000000000000000000e-16
1e+23
>>> 10000000000000000000000000000000000000000e-17
1.0000000000000001e+23 |
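Both spellings above denote exactly 10**23, so they must convert identically; a quick exact-arithmetic check of this family of zero-padded strings (a sanity check, not part of any patch) looks like:

```python
from fractions import Fraction

# every string below is exactly 10**23, just with more trailing zeros
# and a correspondingly smaller exponent; all must convert identically
for pad in range(39, 60):
    s = "1" + "0" * pad + "e-%d" % (pad - 23)
    assert Fraction(s) == 10 ** 23
    assert float(s) == float(Fraction(s))
```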
Fixed the crash that Stefan originally reported in r77450. That revision also removes the 'possibly incorrect code in bigcomp that probably never actually gets called'. |
Second bug fixed in r77451 (trunk), using a fix from David Gay, modified slightly for correctness. |
Merged fixes so far, and a couple of other cleanups, to py3k in r77452, and release31-maint in r77453. |
Just so I don't forget, there are a couple more places in dtoa.c that look suspicious and need to be checked; I haven't tried to generate failures for them yet. Since we're up to bug 5, I'll number these 6 and 7:

(6) At the end of bigcomp, when applying the round-to-even rule for halfway cases, the lsb of rv is checked. This looks wrong if bc->scale is nonzero.

(7) In the main strtod correction loop, after computing delta and i, there's a block:

    bc.nd = nd;
    i = -1;	/* Discarded digits make delta smaller. */

This logic seems invalid if all the discarded digits are zero. (This is the same logic error that's causing bug 5: the bigcomp comparison code also incorrectly assumes that digit nd-1 of s0 is nonzero.) |
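The all-discarded-digits-zero scenario of bug 7 corresponds to inputs like the ones below, where an 18-digit core is padded with zeros past any truncation limit (the limit of 18 is taken from the surrounding discussion; whether these particular strings misround depends on details not checked here, so this is just a probe):

```python
from fractions import Fraction

# pad a fixed 18-digit core with zeros beyond the truncation limit,
# adjusting the exponent so every string denotes the same value
core = "123456789123456789"
for pad in (0, 10, 30, 60):
    s = core + "0" * pad + "e-%d" % (300 + pad)
    assert Fraction(s) == Fraction(core + "e-300")
    assert float(s) == float(core + "e-300")
```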
Bug 6 is indeed a bug; an example incorrectly-rounded string is:

It's fixed in r77491. I'll add tests once the remaining (known) dtoa.c bugs are fixed. |
Bug 4 fixed in r77492. This just leaves bugs 5 and 7; I have a fix for these in the works. |
Tests committed in r77493. |
Fixes and tests so far merged to py3k in r77494, release31-maint in r77496. |
I was considering downgrading this to 'normal'. Then I found bug 8, and it's a biggie:

>>> 10.900000000000000012345678912345678912345
10.0

Now I'm thinking it should be upgraded to release blocker instead. The cause is in the _Py_strtod block that starts with 'if (nd > STRTOD_DIGLIM) {': it truncates the input to 18 digits and then deletes trailing zeros, but the code that deletes the zeros is buggy, and passes over the digit '9' just before the point. |
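The intended transformation is easy to state: keep the first 18 significant digits, then strip trailing zeros from that kept prefix, folding the count of dropped digits into the exponent. A correct sketch of the digit-string operation (illustrative Python, not the C code; the limit of 18 comes from the comment above):

```python
def truncate_significant(digits, diglim=18):
    """Keep the first diglim significant digits, then strip trailing
    zeros from the kept prefix; the caller folds the number of dropped
    digits into the exponent."""
    kept = digits[:diglim].rstrip("0")
    return kept, len(digits) - len(kept)

# digit string of 10.900000000000000012345678912345678912345,
# with the decimal point removed:
kept, shift = truncate_significant(
    "10900000000000000012345678912345678912345")
assert kept == "109"   # the '9' must survive; losing it is what gives 10.0
```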
Mark, I agree that last one should be a release blocker -- it's truly dreadful. BTW, did you guess in advance just how many bugs there could be in this kind of code? I did ;-) |
Upgrading to release blocker. It'll almost certainly be fixed before the weekend is out. (And I will, of course, report it upstream.) |
Here's a patch for the release blocker. Eric, would you be interested in double-checking the logic for this patch? Tim: no, I have to admit I didn't foresee quite this number of bugs. :) |
issue7632_bug8.patch uploaded to Rietveld: |
It looks correct to me, assuming this comment is correct:
I didn't verify the comment itself. |
I have a few minor comments posted on Rietveld, but nothing that would keep you from checking this in. |
Applied the bug 8 patch in r77519 (thanks Eric for reviewing!). For safety, I'll leave this as a release blocker until fixes have been merged to py3k and release31-maint. I've uploaded a fix for bugs 5 and 7 to Rietveld: http://codereview.appspot.com/186182 I still don't like the parsing code much: I'm tempted to pull out the calculation of y and z and do it after the parsing is complete. It's probably marginally less efficient that way, but it would help make the code clearer. |
I've applied a minimal fix for bugs 5 and 7 in r77530 (trunk). (I wasn't able to produce any strings that trigger bug 7, so it may not technically be a bug.) I'm continuing to review, comment on, and clean up the rest of the _Py_dg_strtod code. |
Fixes merged to py3k and release31-maint in r77535 and r77537. |
One of the buildbots just produced a MemoryError from test_strtod:

http://www.python.org/dev/buildbot/all/builders/i386%20Ubuntu%203.x/builds/411

It looks as though there's a memory leak somewhere in dtoa.c. It's a bit difficult to tell, though, since the memory allocation functions in that file deliberately hold on to small pieces of memory. |
Okay, so there's a memory leak for overflowing values: if an overflow is detected in the main correction loop of _Py_dg_strtod, then 'references' to bd0, bd, bb, bs and delta aren't released. There may be other leaks; I'm trying to come up with a good way to detect them reliably. |
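A crude way to hammer the overflow path (under Valgrind, or while watching the process's memory footprint) is just a loop; the exact string doesn't matter as long as the parsed value overflows a double:

```python
# hammer the strtod overflow path; on a leaking build, watching the
# process's memory (or running under Valgrind) shows growth with the
# loop count
s = "1" + "0" * 400   # parses to a value far above DBL_MAX
for _ in range(10000):
    assert float(s) == float("inf")
```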
This is what Valgrind complains about:

    ==4750== 3,456 (1,440 direct, 2,016 indirect) bytes in 30 blocks are definitely lost in loss record 3,302 of 3,430
    ==4750== 9,680 bytes in 242 blocks are still reachable in loss record 3,369 of 3,430
    ==4750== 270,720 bytes in 1,692 blocks are indirectly lost in loss record 3,423 of 3,430
    ==4750== 382,080 bytes in 2,388 blocks are indirectly lost in loss record 3,424 of 3,430
    ==4750== 414,560 bytes in 2,591 blocks are indirectly lost in loss record 3,425 of 3,430
    ==4750== 414,960 (414,768 direct, 192 indirect) bytes in 2,604 blocks are definitely lost in loss record 3,426 of 3,430
    ==4750== 890,720 (532,960 direct, 357,760 indirect) bytes in 3,331 blocks are definitely lost in loss record 3,428 of 3,430
    ==4750== 1,021,280 (566,080 direct, 455,200 indirect) bytes in 3,538 blocks are definitely lost in loss record 3,429 of 3,430
    ==4750== 1,465,280 (676,640 direct, 788,640 indirect) bytes in 4,229 blocks are definitely lost in loss record 3,430 of 3,430

|
Stefan, thanks for that! I'm not entirely sure how to make use of it, though. Is there a way to tell Valgrind that some leaks are expected?

The main problem with leak detection is that dtoa.c deliberately keeps hold of any malloc'ed chunks less than a certain size (which I think is something like 2048 bytes, but I'm not sure). These chunks are never freed in normal use; instead, they're added to a bunch of free lists for the next time that strtod or dtoa is called. The logic isn't too complicated: it's in the functions Balloc and Bfree in dtoa.c.

So the right thing to do is just to check that for each call to strtod, the total number of calls to Balloc matches the total number of calls to Bfree with non-NULL argument. And similarly for dtoa, except that in that case one of the Balloc'd blocks gets returned to the caller (it's the caller's responsibility to call free_dtoa to free it when it's no longer needed), so there should be a difference of 1.

And there's one further wrinkle: dtoa.c maintains a list of powers of 5 of the form 5**2**k, and this list is automatically extended with newly allocated Bigints when necessary. Those Bigints are never freed either, so calls to Balloc from that source should be ignored. Another way round this is just to ignore any leak from the first call to strtod and then do a repeat call with the same parameters; the second call will already have all the powers of 5 it needs. |
Upgrading to release blocker again: the memory leak should be fixed for 2.7 (and more immediately, for 3.1.2). |
Mark, thanks for the explanation! - You can generate suppressions for the Misc/valgrind-python.supp file, but you have to know exactly which errors can be ignored. Going through the Valgrind output again, it looks like most of it is about what you already mentioned (bd0, bd, bb, bs and delta not being released). Would it be much work to provide Valgrind-friendly versions of Balloc, Bfree and pow5mult? Balloc and Bfree are already mentioned in an XXX |
Stefan, I'm not particularly familiar with Valgrind: can you tell me what would need to be done? Is a non-caching version of pow5mult all that's required? Here's the patch that I'm using to detect leaks at the moment. (It includes a slow pow5mult version.) |
With the latest dtoa.c, your non-caching pow5mult and a quick hack for Balloc and Bfree, I get zero (dtoa.c-related) Valgrind errors. So the attached memory_debugger.diff is pretty much all that's needed for Valgrind. |
Thanks, Stefan. Applied in r77589 (trunk), r77590 (py3k), r77591 (release31-maint) with one small change: I moved the freelist and p5s declarations inside the #ifndef Py_USING_MEMORY_DEBUGGER conditionals. The leak itself was fixed in revisions r77578 through r77580; from Stefan's Valgrind report, and my own refcount testing, it looks as though that was the only leak point. I haven't finished reviewing/testing the _Py_dg_strtod code yet, but I'm going to close this issue; if anything new turns up I'll open another one. |