Message 63693 - Python tracker


Author	nnorwitz
Recipients	alecthomas, aronacher, nnorwitz
Date	2008-03-17.17:48:46
Message-id	<ee2a432c0803171048w487a3afeo23a04919d5695a34@mail.gmail.com>
In-reply-to	<1205772952.0.0.890459758617.issue2321@psf.upfronthosting.co.za>

Content
On Mon, Mar 17, 2008 at 11:55 AM, Alec Thomas <report@bugs.python.org> wrote:
>
>  Alec Thomas <alec@swapoff.org> added the comment:
>
>  Hi Neal,
>
>  This seems to be a more general problem than just unicode.

Kinda, sorta. The general issue is the pattern of memory
allocation/deallocation.  In the case of

>>> x = [(1, 2, 3, 4, i) for i in xrange(800000)]

The memory that is not returned is in the integer free list.  If this
code is changed to:

>>> for x in ((1, 2, 3, 4, i) for i in xrange(800000)): pass

That doesn't hold on to any extra memory.  The problem is that holes
are left in the heap, so much of the freed memory often can't be
returned to the O/S.  It can still be re-used by Python.
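
A rough way to see the difference between the two snippets above is to
compare their peak allocations.  This is only a sketch under stated
assumptions: it is written for Python 3 (so `range` stands in for
`xrange`, and there is no integer free list there), the helper names
and the smaller N are made up for illustration, and `tracemalloc`
reports what the interpreter allocated, not what was handed back to
the O/S.

```python
import tracemalloc

N = 200_000  # assumption: smaller than the 800000 in the message, to keep the run short


def peak_bytes(work):
    """Run work() and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        work()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def build_list():
    # All N tuples are alive at the same time.
    x = [(1, 2, 3, 4, i) for i in range(N)]
    return len(x)


def stream_generator():
    # Only one tuple is alive at a time; each is released before the next is built.
    count = 0
    for x in ((1, 2, 3, 4, i) for i in range(N)):
        count += 1
    return count


list_peak = peak_bytes(build_list)
gen_peak = peak_bytes(stream_generator)
print("list peak bytes:     ", list_peak)
print("generator peak bytes:", gen_peak)
```

The list version should show a far higher peak, because every tuple is
alive at once.  Whether that memory later goes back to the O/S depends
on the allocator and the platform, which this sketch does not measure.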

>  Both exhibit the same behaviour. Naively to me it seems like using the
>  custom allocator uniformly would fix this problem.

In general, we should find places (like unicode) that use PyMem_*
(i.e., system malloc) and replace them with PyObject_* (pymalloc).
That should improve the behaviour, but there will always be some
allocation patterns that are suboptimal.  Note that using pymalloc
will only help for objects < 256 bytes.  Larger objects are delegated
to the system malloc and will still exhibit some of the same problems.
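
To get a feel for which objects fall under that size limit, here is a
toy sketch.  Assumptions: `would_use_pymalloc` is a hypothetical
helper, not a CPython API; the 256-byte figure is taken from this
message rather than read from the interpreter (other versions may use
a different limit); and `sys.getsizeof` is only an approximation of
the size actually requested from the allocator.

```python
import sys

# Assumption: limit quoted in the message; not queried from the running interpreter.
SMALL_OBJECT_LIMIT = 256  # bytes


def would_use_pymalloc(nbytes, limit=SMALL_OBJECT_LIMIT):
    """Toy model of the rule in the message: only requests below the limit go to pymalloc."""
    return nbytes < limit


# Objects that CPython stores in a single allocation, so getsizeof is a fair proxy.
samples = {
    "5-tuple": (1, 2, 3, 4, 5),
    "short bytes": b"x" * 10,
    "long bytes": b"x" * 1000,
}

for name, obj in samples.items():
    size = sys.getsizeof(obj)
    target = "pymalloc" if would_use_pymalloc(size) else "system malloc"
    print(f"{name}: about {size} bytes -> {target}")
```

Under this model the small tuple would be served by pymalloc, while
the 1000-byte buffer would still go to the system malloc no matter
which interface the calling code uses.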

Alec, can you find places that are using the PyMem_* interface and
create a patch to fix them?  That would be great!
