classification
Title: fstat64 required on Windows
Type: crash Stage: resolved
Components: IO Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: steve.dower Nosy List: benjamin.peterson, hynek, josh.r, pitrou, python-dev, steve.dower, stutzbach, tim.golden, vstinner, zach.ware
Priority: critical Keywords: patch

Created on 2015-01-02 20:23 by steve.dower, last changed 2015-02-22 19:35 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
py_fstat.patch vstinner, 2015-01-08 10:30 review
py_fstat_2.patch steve.dower, 2015-01-25 01:59 review
py_fstat_3.patch steve.dower, 2015-01-28 03:38 review
Messages (26)
msg233327 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-02 20:23
Currently test_largefile is failing on the Windows buildbots because an fstat() call in Modules/_io/fileio.c is failing. fstat() returns a 32-bit size, but the file being opened is larger than 2GB.

This appears to be a change in the CRT where it would previously succeed with an incorrect value but now returns an error. (While we're here, there is no exception set in this case, so you get "SystemError: NULL result without error in PyObject_Call", but that's not the core issue.)

I can fix this instance, but I suspect we may need to fix this in multiple places. fstat64 (or _fstat64) seems to exist on multiple platforms, and the docs I found (e.g. http://linux.die.net/man/2/fstat64) suggest that EOVERFLOW is always going to occur in this case, so is there any reason not to switch to fstat64 everywhere? Maybe enable it through pyport.h/pyconfig.h and have a #define that is set based on availability?
msg233333 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2015-01-03 00:02
I believe explicitly calling the 64 bit version of a function is usually frowned upon. At least on *NIX systems, the standard solution is to define -D_FILE_OFFSET_BITS=64 during the build process, so off_t seamlessly becomes a 64 bit value, and the 64 bit version of appropriate functions is called and linked (fstat64, mmap64, etc.).
msg233335 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2015-01-03 00:15
Ugh. Looking into it further, POSIX systems tend to support _FILE_OFFSET_BITS=64, but Windows isn't POSIX-y enough. So Windows might need to be special cased regardless. Blech.
msg233336 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-01-03 00:27
Josh is right, we would only need to use fstat64() under Windows here. fstat() under POSIX defines st_size as a off_t, which should usually be large enough even on modern 32-bit systems.
msg233338 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-03 01:00
Okay, I'll try and find a way to redefine it only under Windows. Unfortunately, the CRT defines fstat() as a function, which makes it hard to redefine without eagerly including sys/stat.h.
msg233341 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-03 01:51
Looks like the easiest fix here is to remove the HAVE_SYS_STAT_H definition and replace it with the include directly:

/* Define to 1 if you have the <sys/stat.h> header file.  */
/* #define HAVE_SYS_STAT_H 1 */
#ifndef MS_WINCE
/* Rather than define HAVE_SYS_STAT_H, we include it now and
   rename two of the functions. The rename must be after the
   header is included. */
#include <sys/stat.h>
#define fstat _fstati64
#define stat _stati64
#endif


Does anyone know whether this sort of thing will cause problems with the build? It seems fine to me, but someone with more experience may know better.
msg233638 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-08 10:30
Here is a patch adding a new _Py_fstat() function which uses signed 64-bit integer to store the file size and so is not limited to 2 GB files. I just moved the code from posixmodule.c to fileutils.c.

The patch replaces calls to fstat() with _Py_stat() (and also change the stat structure to _Py_stat_struct).

I didn't modified _Py_stat() which calls _wstat() on Windows and so also have this issue (limited to 2 GB files on Windows). _Py_stat() is called in zipimport.c, so zipped packages are limited to 2 GB on Windows. I don't know if it's a severe issue or not. os.stat() implementation is more complex than os.fstat() on Windows.

I'm unable to test my patch on Windows because Visual Studio 2010 is unable to open the new PCbuild/pcbuild.sln: see my comment on the issue #22919.
msg233639 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-08 10:34
> #define fstat _fstati64

This change (alone) is not safe because _fstati64() doesn't use "struct stat" but "struct __stat64".

I don't like such global define, I may have unexpected side effect. I prefer to modify directly calls to fstat() in .c files (like the change implemented in my patch).
msg233653 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-08 13:22
I prefer your patch too. (I've posted on the other thread about the build problems, and I'll test this when I get a chance.)
msg234092 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-15 17:51
Victor, I've been testing your patch and it's mostly good (a few obscure errors you'd never find without a compiler), but I think we also need to update the win32_xstat functions in posixmodule.c, since they all try and use the same struct.

I don't know how familiar you are with the posixmodule functions, so I can study up on them if needed. It's probably due for some simplifications anyway, since we can now assume that Vista's APIs are always available. I think a lot of the functionality there is now in fileutils.c too, so we don't need so much duplicated code.
msg234512 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-22 21:42
> Currently test_largefile is failing on the Windows buildbots

Oh yes, I just noticed the following bug on "AMD64 Windows8 3.x":
http://buildbot.python.org/all/builders/AMD64%20Windows8%203.x/builds/307/steps/test/logs/stdio

[352/391/1] test_largefile
Assertion failed: (result != NULL && !PyErr_Occurred()) || (result == NULL && PyErr_Occurred()), file ..\Objects\abstract.c, line 2095
R6010
- abort() has been called
Fatal Python error: Aborted

Current thread 0x00000ec8 (most recent call first):
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\test_largefile.py", line 83 in test_lseek
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\case.py", line 577 in run
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\case.py", line 625 in __call__
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 125 in run
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 87 in __call__
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 125 in run
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 87 in __call__
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 125 in run
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\suite.py", line 87 in __call__
  File "D:\buildarea\3.x.bolen-windows8\build\lib\unittest\runner.py", line 168 in run
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\support\__init__.py", line 1770 in _run_suite
  File "D:\buildarea\3.x.bolen-windows8\build\lib\test\support\__init__.py", line 1804 in run_unittest
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 1283 in test_runner
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 1284 in runtest_inner
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 967 in runtest
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 763 in main
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 1568 in main_in_temp_cwd
  File "D:\buildarea\3.x.bolen-windows8\build\PCbuild\..\lib\test\regrtest.py", line 1593 in <module>
msg234567 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-23 17:52
Yeah, that's the sole buildbot currently running VS 2015. I'm expecting to have more after VS 2015 RC is released, since that will be "basically finished". Until then, I'm also regularly building with the latest internal versions and tracking issues, but nothing seems to have been introduced since Preview, so that buildbot is a very good guide.
msg234647 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-25 01:58
I updated and tested your patch - basically making posixmodule.c use the new _Py_stat_struct throughout instead of the Win32 definition. The tests run fine on my VC14 machine.

Should _Py_fstat be conditioned on Py_LIMITED_API in some way, since it's not available pre-3.5? I've never figured out exactly how that should work.
msg234654 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-25 05:35
Huh, okay, looks like the patch still isn't sufficient (I forgot to put -uall when testing it...) - there are calls in fileio.c that need changing too.

I'll try and do a thorough sweep of calls to fstat and post another patch in the next day or so.
msg234870 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-01-28 03:38
Looks like it was only the _io module needing more updates. New patch attached.
msg235790 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-02-12 02:21
Anyone interested in reviewing this patch?
msg236208 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-02-19 02:13
Last call before I let the buildbots be the reviewers :)
msg236374 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-21 16:50
New changeset 4f6f4aa0d80f by Steve Dower in branch 'default':
Issue #23152: Implement _Py_fstat() to support files larger than 2 GB on Windows.
https://hg.python.org/cpython/rev/4f6f4aa0d80f
msg236376 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-02-21 17:45
The name of attribute_data_to_stat() and other shared functions must be
prefixed by "_Py".

+/* Return size of file in bytes; < 0 if unknown or INT_MAX if too big */
static off_t getfilesize(FILE *fp)

Hum since we have a type able yo store the file size and the function is
private, you should use the type able to store the size. Maybe off_t on
POSIX and something else on Windows. Is Py_off_t type 64 bits?

I'm the sure that the caller works if getfilesize() returns a truncated
size.

@@ -420,9 +420,11 @@ fileio_init(PyObject *oself, PyObject *a }
self->blksize = DEFAULT_BUFFER_SIZE; -#ifdef HAVE_FSTAT - if
(fstat(self->fd, &fdfstat) < 0) +#if defined(HAVE_FSTAT) ||
defined(MS_WINDOWS) + if (_Py_fstat(self->fd, &fdfstat) < 0) { +
PyErr_SetFromErrno(PyExc_OSError); goto error; + }

Why do you raise an exception here? Is it a bug fix? (I cannot read the
code at error label right now.)
msg236377 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-02-21 17:53
The caller to getfilesize is only using it to check whether it's small enough to load the file into memory all at once, so "too big" is an okay response (that function is in marshal.c and not used anywhere else).

The error label just returns back to the interpreter loop, previously without an exception set, so we got a SystemError. I mentioned this in my first post.

Happy to rename the now shared functions - I just took those straight from your patch :)
msg236380 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-21 18:04
New changeset 307713759a62 by Steve Dower in branch 'default':
Issue #23152: Renames attribute_data_to_stat to _Py_attribute_data_to_stat
https://hg.python.org/cpython/rev/307713759a62
msg236393 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-21 23:30
New changeset a824c40e8fc0 by Steve Dower in branch 'default':
Issue #23152: Renames time_t_to_FILE_TIME to _Py_time_t_to_FILE_TIME, removes unused struct win32_stat and return value
https://hg.python.org/cpython/rev/a824c40e8fc0
msg236401 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-02-22 07:37
You didn't reply to my question on getfilesize().
msg236409 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2015-02-22 14:11
Two posts ago: "The caller to getfilesize is only using it to check whether it's small enough to load the file into memory all at once, so "too big" is an okay response (that function is in marshal.c and not used anywhere else)."
msg236413 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-22 17:40
New changeset b4f9e787b857 by Serhiy Storchaka in branch 'default':
Issue #23152: Move declaration into a header and exclude from stable API.
https://hg.python.org/cpython/rev/b4f9e787b857
msg236415 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-02-22 19:35
New changeset 72cf174cc0eb by Serhiy Storchaka in branch 'default':
Issue #23152: Move declarations back to posixmodule.c.
https://hg.python.org/cpython/rev/72cf174cc0eb
History
Date User Action Args
2015-02-22 19:35:42python-devsetmessages: + msg236415
2015-02-22 17:40:23python-devsetmessages: + msg236413
2015-02-22 14:11:30steve.dowersetmessages: + msg236409
2015-02-22 07:37:14vstinnersetmessages: + msg236401
2015-02-22 06:56:51steve.dowersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2015-02-21 23:30:40python-devsetmessages: + msg236393
2015-02-21 18:04:26python-devsetmessages: + msg236380
2015-02-21 17:53:43steve.dowersetmessages: + msg236377
2015-02-21 17:45:53vstinnersetmessages: + msg236376
2015-02-21 16:50:45python-devsetnosy: + python-dev
messages: + msg236374
2015-02-19 02:13:48steve.dowersetpriority: normal -> critical

messages: + msg236208
2015-02-12 02:21:51steve.dowersetassignee: steve.dower
type: crash
messages: + msg235790
stage: patch review
2015-01-28 03:38:49steve.dowersetfiles: + py_fstat_3.patch

messages: + msg234870
2015-01-25 05:35:26steve.dowersetmessages: + msg234654
2015-01-25 01:59:17steve.dowersetfiles: + py_fstat_2.patch
2015-01-25 01:58:56steve.dowersetmessages: + msg234647
2015-01-23 17:52:21steve.dowersetmessages: + msg234567
2015-01-22 21:42:08vstinnersetmessages: + msg234512
2015-01-15 17:51:26steve.dowersetmessages: + msg234092
2015-01-08 13:22:26steve.dowersetmessages: + msg233653
2015-01-08 10:34:16vstinnersetmessages: + msg233639
2015-01-08 10:31:02vstinnersetfiles: + py_fstat.patch

nosy: + vstinner
messages: + msg233638

keywords: + patch
2015-01-03 01:51:29steve.dowersetmessages: + msg233341
2015-01-03 01:00:53steve.dowersetmessages: + msg233338
2015-01-03 00:27:44pitrousetmessages: + msg233336
2015-01-03 00:15:33josh.rsetmessages: + msg233335
2015-01-03 00:02:32josh.rsetnosy: + josh.r
messages: + msg233333
2015-01-02 20:23:52steve.dowercreate