Title: mmap enhancement - resize with sequence notation
Type: enhancement
Stage: test needed
Components: Library (Lib)
Versions: Python 3.5, Python 2.7
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: bmearns, josh.r, neologix
Priority: normal
Keywords:

Created on 2009-04-30 18:34 by bmearns, last changed 2019-03-15 23:39 by BreamoreBoy.

Messages (3)
msg86849 - Author: Brian Mearns (bmearns) Date: 2009-04-30 18:34
I thought it would be nice if mmaps could generally look a little more
like sequences. Specifically, it would help to be able to resize and
write in one step using square-bracket notation, as with lists:

>>> x = [1,2,3,4,5]
>>> x
[1, 2, 3, 4, 5]
>>> x[2:2] = [6,7,8,9]
>>> x
[1, 2, 6, 7, 8, 9, 3, 4, 5]

If that could be done when x is an mmap.mmap, it'd be great.
Alternatively, if mmap had insert or extend methods that worked as they
do for lists, the same behavior could be achieved without relying on
mmap-specific method names.
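
For reference, here is a minimal sketch of how slice assignment on an mmap behaves today, assuming Python 3 and a small temporary file (the file contents and the variable names are only illustrative). A same-length slice assignment is written in place, while the list-style insertion from the example above is rejected:

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"12345")
    f.flush()  # make sure the file has its final size before mapping
    with mmap.mmap(f.fileno(), 0) as m:
        # Same-length slice assignment overwrites bytes in place.
        m[1:3] = b"ab"
        same_size = m[:]
        try:
            # Zero-length slice, four bytes of data: a list would grow here,
            # but mmap refuses because the sizes do not match.
            m[2:2] = b"6789"
            error = None
        except (IndexError, ValueError) as exc:
            error = exc

print(same_size)
print(type(error).__name__)
```

The `except` clause catches both IndexError and ValueError only to keep the sketch tolerant across versions; the point is that the mismatched assignment raises rather than resizing.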
msg220767 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-16 21:14
@Brian, this will go nowhere without a patch covering code, tests and documentation. Are you interested in providing one?
msg220771 - Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-06-16 21:40
I see a few issues with this:

1. Changing the default behavior is a compatibility issue. I've written code that depends on exceptions being raised if slice assignment sizes don't match.
2. The performance cost is high. Moving from in-place rewriting to slice assignment that shrinks or expands requires (in different orders for shrink and expand) truncating the file to the correct length, memcpy-ing an amount of data proportional to what follows the end of the slice (not to the slice size), and probably remapping the file (which causes problems if someone has a buffer attached to the existing mapping). With non-file-backed sequences, this kind of work happens entirely in memory and the data is typically smallish; with a file, most of it has to be read from and written to disk, and I'd assume the data being worked with is "largish" (if it's reliably small, the advantages of mmap-ing are small).
3. Behavior in cases where the whole file isn't mapped is hard to intuit or define reasonably. If I map the first 1024 bytes of a 2 GB file, and I add 20 bytes in the middle of the block, what happens? Does data from the unmapped portions get moved? Overwritten? What about removing 20 bytes from the middle of the block? Do we write 0s, or copy down the data that appears after? And remember, for all but the "shrink and write 0s" option, we're moving or modifying data the user explicitly didn't mmap.
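
The expand case from point 2 can be sketched roughly as follows. This is not a proposed patch: `insert_bytes` is a hypothetical helper, it assumes the whole file is mapped (so it sidesteps point 3 entirely), and it remaps by opening a fresh mapping after growing the file instead of resizing an existing one.

```python
import mmap
import os
import tempfile


def insert_bytes(path, pos, data):
    """Hypothetical helper: insert `data` at byte offset `pos` of `path`."""
    with open(path, "r+b") as f:
        old_size = os.fstat(f.fileno()).st_size
        if not 0 <= pos <= old_size:
            raise ValueError("pos out of range")
        if not data:
            return
        # 1. Grow the file first so the mapping covers the final length.
        f.truncate(old_size + len(data))
        with mmap.mmap(f.fileno(), 0) as m:
            # 2. Shift the tail up. The cost is proportional to the data
            #    after the insertion point (old_size - pos), not to len(data).
            m.move(pos + len(data), pos, old_size - pos)
            # 3. Write the new bytes into the gap (same-length assignment).
            m[pos:pos + len(data)] = data
            m.flush()


# Usage: reproduce the list example from the first message with bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"12345")
insert_bytes(tmp.name, 2, b"6789")
with open(tmp.name, "rb") as f:
    result = f.read()
os.remove(tmp.name)
print(result)
```

Even in this simplified form, every insertion touches everything after `pos`, and any buffer exported from an earlier mapping of the same file would not see the new length.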
Date                 User          Action  Args
2019-03-15 23:39:40  BreamoreBoy   set     nosy: - BreamoreBoy
2014-06-24 01:55:42  josh.r        set     title: mmap ehancement - resize with sequence notation -> mmap enhancement - resize with sequence notation
2014-06-23 01:38:17  ned.deily     set     nosy: + neologix
2014-06-16 21:40:09  josh.r        set     nosy: + josh.r
                                           messages: + msg220771
2014-06-16 21:14:17  BreamoreBoy   set     nosy: + BreamoreBoy
                                           messages: + msg220767
                                           versions: + Python 2.7, Python 3.5, - Python 3.4
2012-11-09 13:27:23  ezio.melotti  set     versions: + Python 3.4, - Python 3.2
2010-07-10 06:36:32  terry.reedy   set     stage: test needed
                                           components: + Library (Lib)
                                           versions: + Python 3.2, - Python 2.6
2009-04-30 18:34:09  bmearns       create