Author mark.dickinson
Recipients collinwinter, eric.smith, gawain, gregory.p.smith, mark.dickinson, vstinner
Date 2009-09-16.19:41:13
Message-id <1253130076.26.0.42390916583.issue6713@psf.upfronthosting.co.za>
In-reply-to
Content
Updated patch, with minor changes:
  - remove an incorrect Py_DECREF(str)
  - rename _PyLong_ToDecimal;  no need for the _Py prefix, since this
    function isn't shared across files
  - absorb special case for 0 into the rest of the code
  - whitespace and indentation fixes

Not that it matters much, but it's curious that on my machine (gcc-4.2,
OS X 10.6.1, x86-64) it's significantly faster (~6% increase in str()
speed for large integers) to use the line:

    pout[j] = z - (twodigits)hi * _PyLong_DECIMAL_BASE;

in the middle of the inner loop, rather than the line:

    pout[j] = z - hi * _PyLong_DECIMAL_BASE;

I'm wondering whether this is just a quirk of my OS/compiler combination,
or whether there's a good reason for this difference. The lines are
functionally equivalent, since the result is reduced modulo 2**32 either
way, but the first line involves a 32x32->64 multiplication and a 64-bit
subtraction, whereas the second involves a 32x32->32 multiplication and a
32-bit subtraction. Despite being slower, the second line even produces
assembly code that is one instruction shorter (there's a move opcode saved
somewhere).
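
To make the equivalence claim concrete, here is a minimal, self-contained
sketch. It assumes digit is a 32-bit unsigned type and twodigits a 64-bit
unsigned type, and it uses 10**9 as a stand-in for _PyLong_DECIMAL_BASE.
The typedefs and the helper names (reduce_wide, reduce_narrow, self_check)
are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed types: 32-bit digit, 64-bit twodigits. */
typedef uint32_t digit;
typedef uint64_t twodigits;

/* Stand-in for _PyLong_DECIMAL_BASE; the value 10**9 is an assumption. */
#define DECIMAL_BASE ((digit)1000000000)

/* First form: the product is computed in 64 bits. */
static digit reduce_wide(twodigits z, digit hi)
{
    return (digit)(z - (twodigits)hi * DECIMAL_BASE);
}

/* Second form: the product wraps modulo 2**32 before the subtraction. */
static digit reduce_narrow(twodigits z, digit hi)
{
    return (digit)(z - hi * DECIMAL_BASE);
}

/* Spot-check that both forms store the same value, and that with
   hi = z / DECIMAL_BASE this value is the remainder z % DECIMAL_BASE. */
static void self_check(void)
{
    const twodigits samples[] = {
        0, 999999999, 1000000000, 1234567890123ULL, (twodigits)1 << 59
    };
    size_t i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        twodigits z = samples[i];
        digit hi = (digit)(z / DECIMAL_BASE);  /* quotient fits in 32 bits here */

        assert(reduce_wide(z, hi) == reduce_narrow(z, hi));
        assert(reduce_wide(z, hi) == (digit)(z % DECIMAL_BASE));
    }
}
```

The sketch only shows that the stored values agree once truncated to 32
bits. It says nothing about which instructions a given compiler emits for
either form, so it doesn't explain the timing difference.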