classification
Title: Peephole binops folding can lead to memory and bytecache ballooning
Type: resource usage Stage: resolved
Components: Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: mjpieters
Priority: low Keywords:

Created on 2017-05-06 19:04 by mjpieters, last changed 2017-05-23 01:52 by rhettinger. This issue is now closed.

Messages (4)
msg293167 - (view) Author: Martijn Pieters (mjpieters) * Date: 2017-05-06 19:04
The following expression produces 127MB in constants in `co_consts` due to two 63.5MB integer objects produced when folding:

    ((200*200 - 2) & ((1 << 500000000) - 1)) + ((200*200 - 2) >> 500000000)

The optimizer already does not store optimized *sequences* of more than 20 elements to avoid making bytecode files too large:

   If the new constant is a sequence, only folds when the size
   is below a threshold value.  That keeps pyc files from
   becoming large in the presence of code like:  (None,)*1000.

Perhaps the same should be extended to number objects?

Context: https://stackoverflow.com/questions/43823807/why-does-using-arguments-make-this-function-so-much-slower
msg293172 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-06 20:12
A few thoughts:

* This is very unlikely to come up in real code. Accordingly, I've marked this a low priority -- the peepholer is over a decade old and seems to work-out fine in actual practice.

* The user explicitly asked for 1 << 500000000 to be computed.  At some point, either during compilation or during execution, a large object is going to be produced.

* The binops folding in peephole.c was originally much simpler and did not recursively combine results.  The patch that added the recursive application of constant folding added a lot of complexity, but it demonstrated very little real world benefit, and it opened the door to examples like this where the computer is being explicitly told to do a lot of work.  Perhaps constant folding should be reverted to its simpler state.

* We could do a size check after each folding step and abandon folding when the size hit some limit.  I'm not sure that is worth it though.  Someone can just add more zeros and then complain that the compilation step took a lot of time and memory without producing any benefit.

* We could try to detect an expensive computation before it is run, but in general that is hard to do right and it adds yet more complexity to something that started-out simple.

* One of the design objectives for the peephole transformations was to keep it fast and slim rather than trying to be all things to all people.  Burdening the optimizer with insanity checks just slows down the compilation of normal, sane code.

* With sum(), Guido and Alex took the position that users should be blocked from using it with list-of-strings, but we didn't do block other unfavorable possibilities like a list-of-lists.  The rationale was that the list-of-strings was too tempting and likely but that other cases weren't common or easy to detect.  Likewise for Tim and I, the early decision for the peephole optimized (the decision being second guessed by this issue) was that large strings were easy to detect and likely enough to warrant extra code but that numeric folding issues were much less likely and didn't warrant extra code. 

* Constant folding was introduced before the AST branch.  Once that branch landed, it has been a long standing objective to move constant folding out of the peepholer and into the AST stage.  Accordingly, we've tried to refrain from further build-outs of constant folding.  It is a pit of snakes.  If someone really cares about this, they should work on an AST transformation.

* Programming languages are different from applications.  Users are allowed to explicitly create code that tells the computer to do something resource intensive and there are a myriad of ways to do so. In general, it will be difficult to prevent the execution of such code without unduly burdening execution of normal code.
msg293292 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-09 09:01
Looking back at the OP's timings in the referenced SO question, I would expect that if someone "fixed" this issue, it wouldn't be long before someone else filed a performance regression bug claiming a 63,000x slowdown in exactly the same code.

I'm marking this as closed because if this ever did arise in real code, it is unclear whether the desirable behavior is to eat memory but run fast, or to save memory upfront but run dog slow and eat memory later when the function is called.  Either way, the situation is likely to be very rare.

Your guess is as good as mine regarding which behavior would be more desirable to the user.  Presumably, if they direct the computer to build a large object, they won't be surprised if a large object is created.
msg293295 - (view) Author: Martijn Pieters (mjpieters) * Date: 2017-05-09 09:13
Thanks Raymond, for the response. I agree, we can't prevent all possible misuse, and avoiding the memory issue would require overly costly checks as to what is being multiplied or added.
History
Date User Action Args
2017-05-23 01:52:46rhettingersetnosy: - rhettinger
2017-05-09 09:13:18mjpieterssetmessages: + msg293295
2017-05-09 09:01:32rhettingersetstatus: open -> closed
resolution: not a bug
messages: + msg293292

stage: resolved
2017-05-06 20:12:37rhettingersetpriority: normal -> low
versions: + Python 3.7
nosy: + rhettinger

messages: + msg293172

type: resource usage
2017-05-06 19:04:17mjpieterscreate