Message 92711 - Python tracker


Author	mark.dickinson
Recipients	collinwinter, eric.smith, gawain, gregory.p.smith, mark.dickinson, vstinner
Date	2009-09-16.19:41:13
Message-id	<1253130076.26.0.42390916583.issue6713@psf.upfronthosting.co.za>

Content
Updated patch, with minor changes:
  - remove an incorrect Py_DECREF(str)
  - rename _PyLong_ToDecimal;  no need for the _Py prefix, since this
    function isn't shared across files
  - absorb special case for 0 into the rest of the code
  - whitespace and indentation fixes

Not that it matters much, but it's curious that on my machine (gcc-4.2, OS 
X 10.6.1, x86-64) it's significantly faster (~6% increase in str() speed 
for large integers) to use the line:

    pout[j] = z - (twodigits)hi * _PyLong_DECIMAL_BASE;

in the middle of the inner loop, rather than the line:

    pout[j] = z - hi * _PyLong_DECIMAL_BASE;

I'm wondering whether this is just a quirk of my OS/compiler combination, 
or whether there's a good reason for this difference.  The lines are 
functionally equivalent, since the result is reduced modulo 2**32 either 
way when it's stored in pout[j], but the first line involves a 32x32->64 
multiplication and a 64-bit subtraction, whereas the second involves a 
32x32->32 multiplication and a 32-bit subtraction;  the generated assembly 
code for the second (slower) line is also one instruction shorter (there's 
a move opcode saved somewhere).
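
For anyone wanting to check the equivalence claim independently of the patch, here is a minimal sketch.  It assumes digit is a 32-bit unsigned type, twodigits is a 64-bit unsigned type, int is 32 bits wide, the digit shift is 30 bits and the decimal base is 10**9; the names DECIMAL_BASE, SHIFT, rem_wide, rem_narrow and count_mismatches are made up for illustration and are not taken from the patch.  It only compares results, and says nothing about the speed difference.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t digit;      /* assumption: 32-bit digit type */
typedef uint64_t twodigits;  /* assumption: 64-bit double-width type */

#define DECIMAL_BASE ((digit)1000000000)  /* stand-in for _PyLong_DECIMAL_BASE */
#define SHIFT 30                          /* assumption: 30-bit digits */

/* Variant 1: widen hi first, giving a 32x32->64 multiply and a
   64-bit subtraction before the result is truncated to a digit. */
digit rem_wide(twodigits z, digit hi)
{
    return (digit)(z - (twodigits)hi * DECIMAL_BASE);
}

/* Variant 2: the multiply is done in 32 bits and wraps modulo 2**32;
   only the low 32 bits of the difference survive the final cast. */
digit rem_narrow(twodigits z, digit hi)
{
    return (digit)(z - hi * DECIMAL_BASE);
}

/* Feed both variants pseudo-random inputs shaped like the inner loop's
   (pout[j] < DECIMAL_BASE, carry < 2**SHIFT) and count disagreements,
   either with each other or with the true remainder. */
unsigned long count_mismatches(void)
{
    unsigned long mismatches = 0;
    uint64_t x = 12345;

    for (int i = 0; i < 1000000; i++) {
        /* simple 64-bit linear congruential step */
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;

        digit pout_j = (digit)((x >> 32) % DECIMAL_BASE);
        digit carry = (digit)(x & ((1u << SHIFT) - 1));
        twodigits z = ((twodigits)pout_j << SHIFT) | carry;
        digit hi = (digit)(z / DECIMAL_BASE);  /* fits: z < 10**9 * 2**30 */

        digit expected = (digit)(z % DECIMAL_BASE);
        if (rem_wide(z, hi) != expected || rem_narrow(z, hi) != expected)
            mismatches++;
    }
    return mismatches;
}
```

Calling count_mismatches() from a small main() should return 0 under the assumptions above: the true remainder is below 10**9 < 2**32, so dropping the high bits of the product early or late gives the same low 32 bits.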
