classification
Title: fcntl module doesn't support F_NOCACHE (OS X specific) results in high 'inactive' memory performance issues
Type: enhancement Stage: resolved
Components: Extension Modules Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Alex.Stewart, jcea, ned.deily, neologix, python-dev, ronaldoussoren
Priority: normal Keywords: patch

Created on 2011-11-02 16:51 by Alex.Stewart, last changed 2011-11-02 18:45 by jcea. This issue is now closed.

Files
File name Uploaded Description Edit
fcntlmodule.patch Alex.Stewart, 2011-11-02 16:51 fcntlmodule.c (v2.6.7) patch adding support for F_NOCACHE
Messages (3)
msg146846 - Author: Alex Stewart (Alex.Stewart) Date: 2011-11-02 16:51
----------------------------
ISSUE DESCRIPTION: 
----------------------------
In 2.6.7, 2.7.2 and also HEAD (r 73301) the fcntl module does not support the F_NOCACHE OSX specific mode.  

The effect of this is that the default behaviour (data caching enabled) is used when parsing files etc.  When data caching is enabled, OSX caches the data parsed from a file by python, keeping it available as 'inactive memory' even when it has been freed by python.  In *theory*, this should be fine as OSX *should* recycle the inactive memory as it is required.  

Unfortunately, at least under OSX 10.6.8 (and it seems 10.7) the system will do almost anything to avoid recycling inactive memory, including swallowing up all 'pure' free memory and then paging manically to the disk.  The net effect of this is significantly degraded system performance.

For most users, working with relatively small files and a large quantity of RAM, this issue is probably not that obvious.  However, for various reasons I'm working with files of 5-125+GB, so it rapidly becomes a major issue.

----------------------------
FIX
----------------------------
Very simply, all the attached patch does is add support for F_NOCACHE to fcntl, just like F_FULLFSYNC (another OSX-specific fcntl flag). It's a trivial change that allows you to control whether OSX uses data caching on a file handle, e.g. the following turns OFF data caching for the specified file:

fcntl.fcntl(theFile.fileno(), fcntl.F_NOCACHE, 1)

With this patch in place the inactive memory on my system stays (low) and flat, without it it rises until it maxes out all available memory and then starts paging.
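
For illustration only, here is a minimal sketch of how the flag might be used when reading a large file in chunks. The helper names (disable_data_caching, count_bytes_uncached) and the chunk size are hypothetical and not part of the patch. Since F_NOCACHE only exists on OS X, the sketch looks it up with getattr and simply skips the call on other platforms or on Python builds without the patch.

```python
import fcntl


def disable_data_caching(fileobj):
    """Ask the OS not to cache data for this file; return True if applied.

    fcntl.F_NOCACHE is only defined on OS X (and only in Python builds that
    include the constant), so fall back to doing nothing elsewhere.
    """
    flag = getattr(fcntl, "F_NOCACHE", None)
    if flag is None:
        return False
    fcntl.fcntl(fileobj.fileno(), flag, 1)  # 1 turns data caching OFF
    return True


def count_bytes_uncached(path, chunk_size=1 << 20):
    """Read a file chunk by chunk with caching disabled where supported."""
    total = 0
    with open(path, "rb") as f:
        disable_data_caching(f)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)  # stand-in for real parsing work
    return total
```

Passing 0 instead of 1 as the last fcntl argument should turn data caching back on for the same file handle.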
msg146861 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-11-02 17:59
New changeset cee6fdd6436d by Charles-François Natali in branch 'default':
Issue #13324: fcntlmodule: Add the F_NOCACHE flag. Patch by Alex Stewart.
http://hg.python.org/cpython/rev/cee6fdd6436d
msg146864 - Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-11-02 18:05
> With this patch in place the inactive memory on my system stays (low)
> and flat, without it it rises until it maxes out all available memory
> and then starts paging.

It's often a desired behaviour: paging out unused memory makes room for a bigger page cache, which yields better performance. Also, it's a simple heuristic to know which pages are actually in use.
Linux has a sysctl that can be used to tune the swap tendency (vm.swappiness).
Anyway, since it seems to improve performance under some workloads and it's just a constant, I've applied your patch.
Thanks!
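
As a side note on the vm.swappiness sysctl mentioned above: on Linux its current value is normally exposed through /proc/sys/vm/swappiness. The sketch below only reads the value. The function name read_swappiness is hypothetical, and it returns None where that file is missing (for example on OS X).

```python
from pathlib import Path

# Standard procfs location of the vm.swappiness sysctl on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # No procfs entry (non-Linux system) or unexpected file content.
        return None
```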
History
Date User Action Args
2011-11-02 18:45:28 jcea set nosy: + jcea
2011-11-02 18:30:54 neologix set type: behavior -> enhancement
2011-11-02 18:05:13 neologix set status: open -> closed
type: resource usage -> behavior
versions: + Python 3.3, - Python 2.6, Python 2.7
nosy: + neologix
messages: + msg146864
resolution: fixed
stage: resolved
2011-11-02 17:59:26 python-dev set nosy: + python-dev
messages: + msg146861
2011-11-02 16:51:14 Alex.Stewart create