classification
Title: os.ftruncate on Windows should be sparse
Type:
Stage:
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Artoria2e5, steve.dower
Priority: normal
Keywords:

Created on 2020-03-09 11:19 by Artoria2e5, last changed 2022-04-11 14:59 by admin.

Messages (1)
msg363717 - Author: Mingye Wang (Artoria2e5) Date: 2020-03-09 11:19
Consider this interaction:

cmd> echo > 1.txt
cmd> python -c "__import__('os').truncate('1.txt', 1024 ** 3)"
cmd> fsutil sparse queryFlag 1.txt

The truncate call not only takes a long time, as is typical for a zero-write, but the file is also reported as non-sparse, as an actual write would suggest. This is because, internally, _chsize_s and friends enlarge files by writing zeros in a loop.[1]
  [1]: https://github.com/leelwh/clib/blob/master/c/chsize.c
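
The flag that fsutil reports can also be read from Python. A minimal sketch, assuming the attribute bits that os.stat() exposes on Windows are enough for this check; the helper name is_flagged_sparse is made up for illustration:

```python
import os
import stat


def is_flagged_sparse(path):
    """Return True/False for the sparse attribute, or None where it is not exposed."""
    attrs = getattr(os.stat(path), "st_file_attributes", None)
    if attrs is None:
        # st_file_attributes only exists on Windows.
        return None
    return bool(attrs & stat.FILE_ATTRIBUTE_SPARSE_FILE)
```

Calling is_flagged_sparse('1.txt') after the truncate above should agree with what fsutil prints.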

On Unix systems, enlarging a file with ftruncate is described as "... as if the extra space is zero-filled", but this is not to be taken literally. In practice, sparse files are used whenever available (GNU dd expects that), and people expect the operation to be very fast, without a lot of real writes. A FreeBSD bug report exists about ftruncate being too slow on UFS.
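
For comparison, the Unix behaviour can be observed by comparing the logical size with the space actually allocated. This is only a sketch: grow_and_measure is a made-up helper, st_blocks is assumed to be counted in 512-byte units (as on Linux and the BSDs), and the result depends on whether the filesystem holding the temporary file supports holes.

```python
import os
import tempfile


def grow_and_measure(length):
    """Extend an empty temporary file with ftruncate; return (logical size, allocated bytes)."""
    with tempfile.TemporaryFile() as f:
        os.ftruncate(f.fileno(), length)
        st = os.fstat(f.fileno())
        # st_blocks is absent on Windows, so report None there.
        blocks = getattr(st, "st_blocks", None)
        allocated = None if blocks is None else blocks * 512
        return st.st_size, allocated


if __name__ == "__main__":
    size, allocated = grow_and_measure(64 * 1024 * 1024)
    print("logical size:", size, "allocated:", allocated)
```

On a filesystem with sparse file support, the allocated figure would be expected to stay far below the logical size.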

The aria2 downloader gives a good example of how to truncate into a sparse file on Windows.[2] First, an FSCTL_SET_SPARSE control is issued; then a seek followed by SetEndOfFile finishes the job. Of course, an lseek to the end is needed first to determine the current size of the file, so that we know whether we are enlarging (make it sparse) or shrinking (don't).
  [2]: https://github.com/aria2/aria2/blob/master/src/AbstractDiskWriter.cc#L507
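
The steps described above could be sketched from Python with ctypes roughly as follows. This is an illustration under stated assumptions, not how CPython or aria2 implements it: truncate_sparse is a made-up name, a failing FSCTL_SET_SPARSE is deliberately ignored (for filesystems without sparse support), the file position is restored afterwards, and on non-Windows platforms the function simply falls back to os.ftruncate.

```python
import os
import sys

FSCTL_SET_SPARSE = 0x000900C4  # control code value from winioctl.h
FILE_BEGIN = 0


def truncate_sparse(fd, length):
    """Resize the file behind fd to length bytes, marking it sparse before growing."""
    if sys.platform != "win32":
        # Elsewhere the plain call is already expected to leave a hole.
        os.ftruncate(fd, length)
        return

    import ctypes
    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.DeviceIoControl.argtypes = [
        wintypes.HANDLE, wintypes.DWORD, wintypes.LPVOID, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD, ctypes.POINTER(wintypes.DWORD),
        wintypes.LPVOID,
    ]
    kernel32.DeviceIoControl.restype = wintypes.BOOL
    kernel32.SetFilePointerEx.argtypes = [
        wintypes.HANDLE, wintypes.LARGE_INTEGER,
        ctypes.POINTER(wintypes.LARGE_INTEGER), wintypes.DWORD,
    ]
    kernel32.SetFilePointerEx.restype = wintypes.BOOL
    kernel32.SetEndOfFile.argtypes = [wintypes.HANDLE]
    kernel32.SetEndOfFile.restype = wintypes.BOOL

    handle = msvcrt.get_osfhandle(fd)
    # Remember the position, then seek to the end to learn the current size.
    saved_pos = os.lseek(fd, 0, os.SEEK_CUR)
    current_size = os.lseek(fd, 0, os.SEEK_END)
    try:
        if length > current_size:
            # Enlarging: set the sparse flag first; the result is not checked.
            returned = wintypes.DWORD(0)
            kernel32.DeviceIoControl(handle, FSCTL_SET_SPARSE, None, 0,
                                     None, 0, ctypes.byref(returned), None)
        # Move the file pointer to the target length and cut or extend there.
        if not kernel32.SetFilePointerEx(handle, length, None, FILE_BEGIN):
            raise ctypes.WinError(ctypes.get_last_error())
        if not kernel32.SetEndOfFile(handle):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        os.lseek(fd, saved_pos, os.SEEK_SET)
```

The descriptor must be open for writing. Shrinking skips the FSCTL_SET_SPARSE step, so an existing non-sparse file is not flagged as a side effect.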

History
Date                 User        Action  Args
2022-04-11 14:59:27  admin       set     github: 84091
2020-03-09 11:19:49  Artoria2e5  create