This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: mmap constructor resets the file pointer on Windows
Type: behavior
Stage:
Components: IO, Library (Lib), Windows
Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: benrg, eryksun, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2022-02-25 20:47 by benrg, last changed 2022-04-11 14:59 by admin.

Messages (2)
msg414039 - Author: (benrg) Date: 2022-02-25 20:47
On Windows, `mmap.mmap(f.fileno(), ...)` has the undocumented side effect of setting f's file pointer to 0.

The responsible code in mmapmodule is this:

    /* Win9x appears to need us seeked to zero */
    lseek(fileno, 0, SEEK_SET);

Win9x is no longer supported, and I'm quite sure that NT doesn't have whatever problem they were trying to fix. I think this code should be deleted, and a regression test added to verify that mmap leaves the file pointer alone on all platforms.

(mmap also maintains its own file pointer, the `pos` field of `mmap_object`, which is initially set to zero. This issue is about the kernel file pointer, not mmap's pointer.)
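
A regression test along these lines might look like the following minimal sketch. The helper names `kernel_pos` and `positions_around_mmap` are hypothetical and not part of CPython's test suite. The sketch reads the kernel file pointer with `os.lseek(fd, 0, os.SEEK_CUR)` on a raw descriptor, so no Python-level buffering is involved, and it also reports mmap's own `pos` via `tell()` to keep the two pointers apart.

```python
import mmap
import os
import tempfile


def kernel_pos(fd):
    """Return the kernel file pointer for fd without moving it."""
    return os.lseek(fd, 0, os.SEEK_CUR)


def positions_around_mmap(offset=5):
    """Return (kernel pos before mmap, mmap's own pos, kernel pos after mmap)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, offset, os.SEEK_SET)
        before = kernel_pos(fd)
        m = mmap.mmap(fd, 0)  # length 0 maps the whole file
        try:
            mmap_pos = m.tell()      # mmap's private pointer, starts at 0
            after = kernel_pos(fd)   # should equal `before` if mmap is well behaved
        finally:
            m.close()
    finally:
        os.close(fd)
        os.remove(path)
    return before, mmap_pos, after


if __name__ == "__main__":
    before, mmap_pos, after = positions_around_mmap()
    print(f"before={before} mmap_pos={mmap_pos} after={after}")
```

A test built on this would assert `after == before`. On a Windows build that still contains the `lseek` call quoted above, `after` would be expected to come back as 0 instead.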
msg414071 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-02-26 05:08
The resize() method also modifies the file pointer. Instead of fixing that oversight, I think it should directly set the file's FileEndOfFileInfo and FileAllocationInfo. For example:

            // resize the file
            if (!SetFileInformationByHandle(
                    self->file_handle, FileEndOfFileInfo,
                    &max_size, sizeof(max_size)) ||
                !SetFileInformationByHandle(
                    self->file_handle, FileAllocationInfo,
                    &max_size, sizeof(max_size)))
            {
                // resizing failed. try to remap the file
                file_resize_error = GetLastError();
                max_size.QuadPart = self->size;
                new_size = self->size;
            }

This is cheaper in terms of system calls. The existing implementation makes four system calls: one to set the file pointer in SetFilePointerEx() and three in SetEndOfFile(), which queries the file pointer, sets the end-of-file info, and sets the allocation info. 

Note that this approach doesn't modify the file pointer in any case. This may be surprising if the file size shrinks to less than the existing file pointer. But os.ftruncate() behaves the same way, as does the resize() method in Linux.
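
The `os.ftruncate()` comparison can be checked with a short sketch like this one. The helper name `pos_after_truncate` is hypothetical. It parks the file pointer past the intended new end of file, shrinks the file, and reports where the pointer ends up.

```python
import os
import tempfile


def pos_after_truncate(seek_to=8, new_size=4):
    """Return (file pointer, file size) after shrinking the file below the pointer."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, seek_to, os.SEEK_SET)
        os.ftruncate(fd, new_size)           # shrink below the current pointer
        pos = os.lseek(fd, 0, os.SEEK_CUR)   # query the pointer without moving it
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.remove(path)
    return pos, size


if __name__ == "__main__":
    print(pos_after_truncate())
```

If the pointer is left alone, the first value stays at `seek_to` even though the file is now only `new_size` bytes long.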
History
Date                 User     Action  Args
2022-04-11 14:59:56  admin    set     github: 91014
2022-02-26 05:08:48  eryksun  set     nosy: + eryksun; messages: + msg414071; versions: - Python 3.7, Python 3.8
2022-02-25 20:47:49  benrg    create