Author schlamar
Recipients schlamar
Date 2012-12-21.14:43:30
Message-id <1356101011.63.0.640174538358.issue16743@psf.upfronthosting.co.za>
In-reply-to
Content
Platform: Windows 7 64 bit
Interpreter: Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32

Here are the steps to reproduce:

1. Create a big file (5 GB):

with open('big', 'wb') as fobj:
    for _ in xrange(1024 * 1024 * 5):
        fobj.write('1' + '0' * 1023)

2. Open and process it with `mmap`:

import mmap
import re

with open('big', 'rb') as fobj:
    data = mmap.mmap(fobj.fileno(), 0, access=mmap.ACCESS_READ)
    print data.size()
    try:
        counter = 0
        for match in re.finditer('1' + '0' * 1023, data):
            counter += 1
        print len(data[1073740800:1073741824]) # (1 GB - 1024, 1 GB)
        print len(data[1073741824:1073742848]) # (1 GB, 1 GB + 1024)
    finally:
        data.close()

    print counter

This prints the following lines:

    5368709120
    1024
    0
    1048576
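
The printed numbers can be sanity-checked with a little arithmetic. The sketch below uses only the sizes from the reproduction steps; the variable names are illustrative and not part of the original script.

```python
# Sizes taken from the reproduction steps above.
RECORD_SIZE = 1024                       # one '1' followed by 1023 '0' characters
RECORD_COUNT = 1024 * 1024 * 5           # iterations of the write loop in step 1
GIB = 1024 ** 3

file_size = RECORD_SIZE * RECORD_COUNT   # same value that data.size() printed
expected_matches = RECORD_COUNT          # one match per record in the full file
observed_matches = 1048576               # counter value printed by the script

# Number of bytes covered by the matches that were actually found.
bytes_scanned = observed_matches * RECORD_SIZE
```

Here `file_size` is 5368709120, `expected_matches` is 5242880, and `bytes_scanned` equals `GIB`, so the search covered exactly the first 1 GB of the file.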

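So this is a behavioral issue. `mmap` accepts a file that does not fit in the memory available to the 32-bit interpreter but does fit in system memory, and `size()` reports the full 5 GB. When the data is processed, however, only the first 1 GB (the maximum interpreter memory) is read: `re.finditer` stops after 1048576 matches, and a slice starting at the 1 GB boundary comes back empty, without any error being raised.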
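
One possible way to sidestep the limit, rather than a fix for the behavior itself, is to map the file in smaller windows using the `offset` argument of `mmap.mmap`. The following is a minimal sketch under stated assumptions: `count_records` and its `window` parameter are hypothetical names, the file is assumed to consist of fixed-size records so that no record straddles a window boundary, and the code has not been run against the interpreter from this report.

```python
import mmap
import os


def count_records(path, record, window=64 * 1024 * 1024):
    """Count occurrences of `record` by mapping `path` one window at a time.

    Assumption: the file is a sequence of fixed-size records, so choosing a
    window that is a multiple of the record size keeps records whole.
    """
    # mmap requires the offset to be a multiple of ALLOCATIONGRANULARITY.
    if window % mmap.ALLOCATIONGRANULARITY or window % len(record):
        raise ValueError('window must be a multiple of the allocation '
                         'granularity and of the record size')
    total = 0
    size = os.path.getsize(path)
    with open(path, 'rb') as fobj:
        offset = 0
        while offset < size:
            # The last window may be shorter than the others.
            length = min(window, size - offset)
            view = mmap.mmap(fobj.fileno(), length,
                             access=mmap.ACCESS_READ, offset=offset)
            try:
                pos = view.find(record)
                while pos != -1:
                    total += 1
                    pos = view.find(record, pos + len(record))
            finally:
                view.close()
            offset += length
    return total
```

For the file from step 1, the call would be `count_records('big', b'1' + b'0' * 1023)`. Because each mapping stays far below 1 GB, the count should in principle cover the whole file, but that remains to be confirmed on the affected platform.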