float(0.0) singleton #48274

Closed

ldeller mannequin opened this issue Oct 3, 2008 · 7 comments

ldeller mannequin commented Oct 3, 2008

BPO 4024
Nosy: @tim-one, @birkenfeld, @rhettinger, @terryjreedy, @tiran
Superseder:
  • bpo-14381: Intern certain integral floats for memory savings and performance
Files:
  • python_zero_float.patch: patch for svn trunk / py2.6 / py2.5

    Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/tim-one'
    closed_at = <Date 2009-03-20.01:07:50.816>
    created_at = <Date 2008-10-03.03:25:06.248>
    labels = ['interpreter-core', 'performance']
    title = 'float(0.0) singleton'
    updated_at = <Date 2012-03-22.11:55:09.738>
    user = 'https://bugs.python.org/ldeller'

    bugs.python.org fields:

    activity = <Date 2012-03-22.11:55:09.738>
    actor = 'kristjan.jonsson'
    assignee = 'tim.peters'
    closed = True
    closed_date = <Date 2009-03-20.01:07:50.816>
    closer = 'rhettinger'
    components = ['Interpreter Core']
    creation = <Date 2008-10-03.03:25:06.248>
    creator = 'ldeller'
    dependencies = []
    files = ['11686']
    hgrepos = []
    issue_num = 4024
    keywords = ['patch']
    message_count = 7.0
    messages = ['74224', '74228', '74243', '74244', '74245', '74261', '83884']
    nosy_count = 6.0
    nosy_names = ['tim.peters', 'georg.brandl', 'rhettinger', 'terry.reedy', 'ldeller', 'christian.heimes']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = '14381'
    type = 'resource usage'
    url = 'https://bugs.python.org/issue4024'
    versions = ['Python 2.6', 'Python 3.0']


    ldeller mannequin commented Oct 3, 2008

    Here is a patch to make PyFloat_FromDouble(0.0) always return the same
    float instance. This is similar to the existing optimization in
    PyInt_FromLong(x) for small x.

    My own motivation is that the patch reduces memory by several megabytes
    for a particular in-house data processing script, but I think that it
    should be generally useful assuming that zero is a very common float
    value, and at worst nearly neutral when that assumption does not hold. The
    minimal performance impact of the test for zero should be easily
    recovered by the reduction in memory allocation calls. I am happy to look
    into benchmarking if you require empirical performance data.
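
    A rough Python model of the proposed behaviour may help illustrate the
    idea (the actual patch changes PyFloat_FromDouble in C; the names below
    are illustrative only, not taken from the patch):

        _zero = 0.0  # the single shared instance the patch would return

        def float_from_double(value):
            # Hand out the shared object for zero, a fresh float otherwise,
            # mirroring the small-int cache used by PyInt_FromLong.
            # Note: -0.0 == 0.0 is also true here, which is exactly the
            # question raised in the next comment.
            if value == 0.0:
                return _zero
            return float(value)

        x = 1.5
        assert float_from_double(x - x) is float_from_double(0.0)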

    @ldeller ldeller mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Oct 3, 2008
    birkenfeld (Member) commented:

    Will it correctly distinguish between +0.0 and -0.0?


    ldeller mannequin commented Oct 3, 2008

    No, it won't distinguish between +0.0 and -0.0 in its present form,
    because these two have the same value according to the C equality
    operator. This should be easy to adjust; e.g. we could exclude -0.0 by
    changing the comparison
        if (fval == 0.0)
    into 
        static double positive_zero = 0.0;
        ...
        if (!memcmp(&fval, &positive_zero, sizeof(double)))
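
    For illustration, the same distinction sketched in Python: +0.0 and -0.0
    compare equal, but their byte patterns differ, which is what a
    memcmp-style check exploits (struct here is purely a stand-in for the C
    comparison):

        import struct

        # +0.0 and -0.0 compare equal, so an == test alone cannot exclude -0.0.
        assert 0.0 == -0.0

        # ...but their byte representations differ in the sign bit, which is
        # what the memcmp-based check above relies on.
        def is_positive_zero(value):
            return struct.pack("d", value) == struct.pack("d", 0.0)

        assert is_positive_zero(0.0)
        assert not is_positive_zero(-0.0)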


    vstinner commented Oct 3, 2008

    Maybe we need more hardcoded floats, i.e. a "cache" of common float
    values. Example pseudocode:

        cache = {}  # created lazily, on first use

        def cache_float(value):
            # Only a handful of very common values are worth caching.
            return abs(value) in (0.0, 1.0, 2.0)

        def create_float(value):
            try:
                return cache[value]
            except KeyError:
                obj = float(value)
                if cache_float(value):
                    cache[value] = obj
                return obj

    Since some (most?) programs don't use float, the cache is created on
    demand and not at startup.

    Since the goal is speed, only a benchmark can answer my question (is
    Python faster with such a cache?) ;-) Instead of cache_float(), an RCU
    cache might be used.
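
    As a usage sketch of the pseudocode above (the arguments are computed at
    run time so that CPython's constant folding does not hand back shared
    literal objects):

        a = 1.5

        x = create_float(a - a)    # 0.0: cached on the first call
        y = create_float(a * 0.0)  # 0.0: served from the cache
        assert x is y              # one shared object for zero

        p = create_float(a + 2.0)  # 3.5 is not in the cached set...
        q = create_float(a + 2.0)
        assert p is not q          # ...so each call yields a distinct object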


    tiran commented Oct 3, 2008

    Please use copysign(1.0, fval) == 1.0 instead of your memcmp trick. It's
    the canonical way to check for negative zero. copysign() is always
    available because we have our own implementation if the platform doesn't
    provide one. We might also want to special-case 1.0 and -1.0.

    I'll have to check with Guido and Barry whether we can get the
    optimization into 2.6.1 and 3.0.1. It may have to wait until 2.7 and 3.1.
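
    For illustration, the same check written in Python with math.copysign (a
    sketch only, not the C code the patch would use):

        import math

        def is_negative_zero(value):
            # copysign(1.0, value) carries value's sign bit, so it is -1.0
            # for -0.0 even though -0.0 == 0.0.
            return value == 0.0 and math.copysign(1.0, value) == -1.0

        assert not is_negative_zero(0.0)
        assert is_negative_zero(-0.0)
        assert not is_negative_zero(-1.0)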

    @tiran tiran self-assigned this Oct 3, 2008
    rhettinger (Contributor) commented:

    I question whether this should be done at all. Making the creation of a
    float even slightly slower is bad. This is on the critical path for all
    floating point intensive computations. If someone really cares about
    the memory savings, it is not hard to take a single instance of float
    and use it everywhere: ZERO=0.0; arr=[ZERO if x == 0.0 else x for x in
    arr]. That technique also works for 1.0 and -1.0 and pi and other
    values that may commonly occur in a particular app. Also, the technique
    is portable to implementations other than CPython. I don't mind this
    sort of optimization for immutable containers but feel that floats are
    too granular. Special cases aren't special enough to break the rules.
    If the OP is insistent, then at least this should be discussed with the
    numeric community who will have a better insight into whether the
    speed/space trade-off makes sense in other applications beyond the OP's
    original case.

    Tim, any insights?
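
    For illustration, a small sketch of the manual interning described above
    (the helper name intern_common is made up here, not part of any proposal):

        ZERO = 0.0
        ONE = 1.0

        def intern_common(values):
            # Replace floats equal to a common constant with the one shared
            # instance, so the duplicates can be garbage collected.
            shared = {0.0: ZERO, 1.0: ONE}
            return [shared.get(x, x) for x in values]

        data = [i * 0.0 for i in range(1000)]  # 1000 distinct 0.0 objects
        data = intern_common(data)             # now 1000 references to ZERO
        assert all(x is ZERO for x in data)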

    @rhettinger rhettinger assigned tim-one and unassigned tiran Oct 3, 2008
    terryjreedy (Member) commented:

    I have 3 comments for future readers who might want to reopen.

    1. This would have little effect on calculation with numpy.

    2. According to sys.getrefcount, when '>>>' appears, 3.0.1 has 1200
      duplicate references to 0 and 1 alone, and about 2000 to all of them.
      So small-int caching really needs to be done by the interpreter. Are
      there *any* duplicate internal references to 0.0 that would help justify
      this proposal? (A quick check is sketched after this list.)

    3. It is (or certainly was) standard in certain Fortran circles to NAME
      constants as Raymond suggested. One reason given was to ease conversion
      between single and double precision. In Python, named constants in
      functions would ease conversion between, for instance, float and decimal.
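
    A quick way to see the disparity point 2 refers to (the exact numbers vary
    by Python version and session, and recent CPython releases make small ints
    immortal, so treat the output as indicative only):

        import sys

        # Small ints are shared interpreter-wide, so 0 and 1 have very many
        # references; a float literal such as 0.0 is not shared that way.
        print(sys.getrefcount(0))    # large (or an immortality sentinel)
        print(sys.getrefcount(1))    # large
        print(sys.getrefcount(0.0))  # small: just this code object's uses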

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022