This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x: Linux VFS/XFS bug
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: David.Edelsohn, Dormouse759, fweimer, hroncok, miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2020-01-27 09:24 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
utime_stat_localtime.py vstinner, 2020-01-29 00:21
utime_stat_localtime2.py vstinner, 2020-01-29 01:38
mtime.c vstinner, 2020-01-29 09:13
Pull Requests
URL Status Linked Edit
PR 18247 merged vstinner, 2020-01-29 11:36
PR 18253 merged miss-islington, 2020-01-29 14:25
Messages (38)
msg360747 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-27 09:24
s390x Fedora Rawhide 3.x:
https://buildbot.python.org/all/#/builders/323/builds/6

======================================================================
FAIL: test_add_file_after_2107 (test.test_zipfile.StoredTestsWithSourceFile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-fedora-rawhide-z/build/Lib/test/test_zipfile.py", line 620, in test_add_file_after_2107
    self.assertRaises(struct.error, zipfp.write, TESTFN)
AssertionError: error not raised by write

Test added in bpo-34097 by:

commit a2fe1e52eb94c41d9ebce1ab284180d7b1faa2a4
Author: Marcel Plch <gmarcel.plch@gmail.com>
Date:   Thu Aug 2 15:04:52 2018 +0200

    bpo-34097: Add support for zipping files older than 1980-01-01 (GH-8270)
    
    ZipFile can zip files older than 1980-01-01 and newer than 2107-12-31 using
    a new strict_timestamps parameter at the cost of setting the timestamp
    to the limit.

see also the following fix:

commit 7b41dbad78c6b03ca2f98800a92a1977d3946643
Author: Marcel Plch <gmarcel.plch@gmail.com>
Date:   Fri Aug 3 17:59:19 2018 +0200

    bpo-34325: Skip zipfile test for large timestamps when filesystem don't support them. (GH-8656)
    
    When the filesystem doesn't support files with large timestamps,
    skip testing that such files can be zipped.
msg360774 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-27 15:59
The failure has been seen on other architectures on Fedora Rawhide when building the Python 3.8.1 package:
https://koji.fedoraproject.org/koji/taskinfo?taskID=40870434
msg360845 - (view) Author: Miro Hrončok (hroncok) * Date: 2020-01-28 09:14
In Fedora, this has started happening after the update of the following packages:

https://koschei.fedoraproject.org/build/7758715

 gcc
10.0.1-0.3.fc32 (in fact, there was gcc 9 in there, the report is maybe broken)
10.0.1-0.4.fc32

 libuuid
2.35-0.5.fc32
2.35-1.fc32

 glibc
2.30.9000-30.fc32
2.30.9000-31.fc32

 libselinux
3.0-1.fc32
3.0-2.fc32

 libblkid
2.35-0.5.fc32
2.35-1.fc32

 libgomp
10.0.1-0.3.fc32
10.0.1-0.4.fc32

 util-linux
2.35-0.5.fc32
2.35-1.fc32

 libmount
2.35-0.5.fc32
2.35-1.fc32

 libsmartcols
2.35-0.5.fc32
2.35-1.fc32

 libfdisk
2.35-0.5.fc32
2.35-1.fc32
msg360866 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-28 12:26
I tried but failed to reproduce the issue on Fedora Rawhide x86-64. I built Python from source and run Python from its source tree:

./python -m test -v test_zipfile

Maybe the issue only occurs if Python is installed? Or it depends on the filesystem where the ZIP file is created?

On my Fedora 31, /tmp is mounted as a tmpfs filesystem. Using it explicitly doesn't make the test to fail:

./python -m test -v test_zipfile -m test_add_file_after_2107 --tempdir /tmp
msg360881 - (view) Author: Marcel Plch (Dormouse759) * Date: 2020-01-28 15:38
Is currently anybody actively working on this? Please, report what you have found out, if so.
I'd like to start digging into this tomorrow and possibly avoid any duplicit work.
msg360892 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-28 18:38
> Is currently anybody actively working on this? Please, report what you have found out, if so. I'd like to start digging into this tomorrow and possibly avoid any duplicit work.

I tried but failed to reproduce the bug on Fedora 31 and Fedora Rawhide.

I tested 3.8 and master development branches on Fedora 31 x86-64 laptop: "./configure --with-pydebug CFLAGS=-O0 && make && ./python -m test -v test_zipfile" pass.

I tested Python 3.9.0a3 on my up-to-date Fedora Rawhide x86-64 VM: "./configure --with-pydebug CFLAGS=-O0 && make && ./python -m test -v test_zipfile" pass. I also tried to install Python in /opt/py39 and run tests using /opt/py39/bin/python3.9: the test still pass.

I also tested Python 3.9.0a3 on an up-to-date Fedora Rawhide ppc64 mock container: Python configured as the Fedora package does (LTO, PGO, CFLAGS, LDFLAGS, etc.). test_zipfile pass as well.

Since I cannot reproduce the issue, nor access a machine which the bug occurs, I don't know how to debug this issue.

Can I get access to "s390x Fedora Rawhide 3.x" buildbot worker?

I didn't try to rebuild Python using the Fedora specfile. I only built Python manually.
msg360896 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-28 18:55
$ ./python -m test tet_zipfile
0:00:00 load avg: 0.03 Run tests sequentially
0:00:00 load avg: 0.03 [1/1] tet_zipfile
test tet_zipfile crashed -- Traceback (most recent call last):
  File "/home/dje/src/cpython/Lib/test/libregrtest/runtest.py", line 270, in _runtest_inner
    refleak = _runtest_inner2(ns, test_name)
  File "/home/dje/src/cpython/Lib/test/libregrtest/runtest.py", line 221, in _runtest_inner2
    the_module = importlib.import_module(abstest)
  File "/home/dje/src/cpython/Lib/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1021, in _gcd_import
  File "<frozen importlib._bootstrap>", line 998, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'test.tet_zipfile'

tet_zipfile failed

== Tests result: FAILURE ==

1 test failed:
    tet_zipfile

Total duration: 76 ms
Tests result: FAILURE
msg360897 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-28 18:57
Sorry, posted the wrong output above.

$ ./python -m test test_zipfile
0:00:00 load avg: 0.01 Run tests sequentially
0:00:00 load avg: 0.01 [1/1] test_zipfile
test test_zipfile failed -- Traceback (most recent call last):
  File "/home/dje/src/cpython/Lib/test/test_zipfile.py", line 620, in test_add_file_after_2107
    self.assertRaises(struct.error, zipfp.write, TESTFN)
AssertionError: error not raised by write

test_zipfile failed in 55.6 sec

== Tests result: FAILURE ==

1 test failed:
    test_zipfile

Total duration: 55.6 sec
Tests result: FAILURE
msg360912 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 00:09
Fedora downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=1795576
msg360913 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 00:15
Do you believe that a single GCC 10 issue is affecting PPC64LE, ARM, and s390x, but expressed in different manners on the different architectures OR is the PPC64LE issue separate and architecture-depdendent?
msg360914 - (view) Author: Miro Hrončok (hroncok) * Date: 2020-01-29 00:17
I think this happens "randomly" regardless of the architecture. I've seen it on x86_64 as well.
msg360915 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 00:21
I suspect that the timestamp set by os.utime() is not the one seen by os.stat(). Or maybe time.localtime() behavior changed.

Try attached utime_stat_localtime.py script.

Output on Fedora 31 in my home directory which uses btrfs:

$ python3.7 utime_stat_localtime.py 
os.utime (sec): 4386268800
os.stat (sec): 4386268800
os.stat (ns): 4386268800000000000
stat==utime? True
localtime: (2108, 12, 30, 1, 0, 0)
msg360916 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 00:26
Output on s390x Fedora Rawhide:
$ ./python utime_stat_localtime.py 
os.utime (sec): 4386268800
os.stat (sec): 4386268800
os.stat (ns): 2147483647000000000
stat==utime? False
localtime: (2038, 1, 18, 22, 14, 7)
msg360918 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 00:46
> os.stat (sec): 4386268800
> ...
> localtime: (2038, 1, 18, 22, 14, 7)

Aha, localtime() behavior changed for timestamp larger than 2**31. What is the version of the glibc package?

> s390x Fedora Rawhide 3.x:
> https://buildbot.python.org/all/#/builders/323/builds/6

pythoninfo of this build says:

platform.libc_ver: glibc 2.30.9000

(Same in the latest build, build 25.)

--

On AMD64 Fedora Rawhide 3.x with glibc-2.30.9000-31.fc32.x86_64, I get years after 2038:

vstinner@python-builder-rawhide$ ./python
Python 3.9.0a3+ (heads/master:0cd5bff6b7, Jan 28 2020, 19:39:07) 
[GCC 10.0.1 20200123 (Red Hat 10.0.1-0.5)] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import time
>>> time.localtime(2**31)
time.struct_time(tm_year=2038, tm_mon=1, tm_mday=18, tm_hour=22, tm_min=14, tm_sec=8, tm_wday=0, tm_yday=18, tm_isdst=0)

>>> time.localtime(2**31-3600*24)
time.struct_time(tm_year=2038, tm_mon=1, tm_mday=17, tm_hour=22, tm_min=14, tm_sec=8, tm_wday=6, tm_yday=17, tm_isdst=0)

>>> time.localtime(2**32)
time.struct_time(tm_year=2106, tm_mon=2, tm_mday=7, tm_hour=1, tm_min=28, tm_sec=16, tm_wday=6, tm_yday=38, tm_isdst=0)

>>> time.localtime(2**33)
time.struct_time(tm_year=2242, tm_mon=3, tm_mday=16, tm_hour=8, tm_min=56, tm_sec=32, tm_wday=2, tm_yday=75, tm_isdst=1)
msg360921 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 00:51
I don't see anything about localtime() or the year 2038 bug in glibc 2.30 release notes:
https://lwn.net/Articles/795127/

I'm aware of work in the Linux kernel and glibc for the year 2038 bug, but for 32-bit systems:
https://lwn.net/Articles/776435/
msg360922 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 00:54
> s390x Fedora Rawhide 3.x:
> https://buildbot.python.org/all/#/builders/323/builds/6

I also checked the configure: time_t size is 64-bit on this system, so it should not be impacted on the glibc/Linux kernel work on the year 2038 bug.
msg360923 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 01:04
Notes on the glibc:

* localtime() calls __tz_convert()
* localtime() and __tz_convert() have been modified recently to support 64-bit timestamp
* __tz_convert() is implemented in time/tzset.c
* localtime() is implemented in time/localtime.c
msg360924 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 01:21
$ ./python
Python 3.9.0a3+ (heads/master:aabdeb766b, Jan 28 2020, 13:50:48) 
[GCC 10.0.1 20200121 (Red Hat 10.0.1-0.4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> time.localtime(2**31)
time.struct_time(tm_year=2038, tm_mon=1, tm_mday=18, tm_hour=22, tm_min=14, tm_sec=8, tm_wday=0, tm_yday=18, tm_isdst=0)
>>> time.localtime(2**31-3600*24)
time.struct_time(tm_year=2038, tm_mon=1, tm_mday=17, tm_hour=22, tm_min=14, tm_sec=8, tm_wday=6, tm_yday=17, tm_isdst=0)
>>> time.localtime(2**32)
time.struct_time(tm_year=2106, tm_mon=2, tm_mday=7, tm_hour=1, tm_min=28, tm_sec=16, tm_wday=6, tm_yday=38, tm_isdst=0)
>>> time.localtime(2**33)
time.struct_time(tm_year=2242, tm_mon=3, tm_mday=16, tm_hour=8, tm_min=56, tm_sec=32, tm_wday=2, tm_yday=75, tm_isdst=1)
msg360925 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 01:38
David Edelsohn:

> Output on s390x Fedora Rawhide:
> (...)
> os.stat (sec): 4386268800
> os.stat (ns): 2147483647000000000

Oh wait, ns != sec * 10**9 here. "2147483647" is 2**31-1 (INT_MAX).

It looks like a bug in fill_time() of Modules/posixmodule.c. I suspect a bug in _PyLong_FromTime_t().

On this buildbot worker, we have:

    checking size of long long... 8
    checking size of time_t... 8

So _PyLong_FromTime_t() should be implemented as:

    return PyLong_FromLongLong((long long)t);

_PyLong_FromTime_t() raw value is exported in os.stat() as os.stat(filename)[8].

Attached utime_stat_localtime2.py exposes it:

vstinner@apu$ ./python utime_stat_localtime2.py 
os.utime (sec): 4386268800
os.stat (sec int): 4386268800
os.stat (sec float): 4386268800.0
os.stat (ns): 4386268800000000000

On Fedora 31 x86-64 with GCC 9.2.1, gcc -O3, _PyLong_FromTime_t() really just calls PyLong_FromLongLong():

(gdb) disassemble _PyLong_FromTime_t
Dump of assembler code for function _PyLong_FromTime_t:
   0x0000000000521f30 <+0>:	jmp    0x457800 <PyLong_FromLongLong>
msg360926 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 01:52
$ ./python utime_stat_localtime2.py 
os.utime (sec): 4386268800
os.stat (sec int): 2147483647
os.stat (sec float): 2147483647.0
os.stat (ns): 2147483647000000000
msg360928 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 01:56
Not -O3, but it's calling PyLong_FromLongLong on s390x as well

   0x00000000011ca524 <+28>:	brasl	%r14,0x10649b0 <PyLong_FromLongLong>
msg360930 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 02:03
"""
$ ./python utime_stat_localtime2.py 
os.utime (sec): 4386268800
os.stat (sec int): 2147483647
os.stat (sec float): 2147483647.0
os.stat (ns): 2147483647000000000
"""

It doesn't make sense !? In msg360916, you had:

os.stat (sec): 4386268800 <= os.stat().st_mtime
os.stat (ns): 2147483647000000000 <= os.stat().st_mtime._ns

How is it possible that sometimes st_mtime is right, and sometimes st_mtime is so plain wrong?

--

Test setting mtime in Python, reading mtime using /usr/bin/stat, then reading file stat in Python:

$ touch testfn

$ python3 -c 'import os; os.utime("testfn", (4386268800, 4386268800))'

$ stat testfn 
(...)
Modify: 2108-12-30 01:00:00.000000000 +0100
(...)

$ python3 -c 'import os; st=os.stat("testfn"); print(st); print(tuple(st)); print(st.st_mtime_ns)'

os.stat_result(st_mode=33204, st_ino=24648296, st_dev=40, st_nlink=1, st_uid=1000, st_gid=1000, st_size=0, st_atime=4386268800, st_mtime=4386268800, st_ctime=1580263179)
(33204, 24648296, 40, 1, 1000, 1000, 0, 4386268800, 4386268800, 1580263179)
4386268800000000000
msg360931 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 02:22
[dje@rawhide ~]$ touch testfn
[dje@rawhide ~]$ python3 -c 'import os; os.utime("testfn", (4386268800, 4386268800))'
[dje@rawhide ~]$ stat testfn
  File: testfn
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 17887487    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1001/     dje)   Gid: ( 1001/     dje)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2038-01-18 22:14:07.000000000 -0500
Modify: 2038-01-18 22:14:07.000000000 -0500
Change: 2020-01-28 21:19:14.707112199 -0500
 Birth: 2020-01-28 21:19:01.627112199 -0500
[dje@rawhide ~]$ python3 -c 'import os; st=os.stat("testfn"); print(st); print(tuple(st)); print(st.st_mtime_ns)'
os.stat_result(st_mode=33204, st_ino=17887487, st_dev=64768, st_nlink=1, st_uid=1001, st_gid=1001, st_size=0, st_atime=2147483647, st_mtime=2147483647, st_ctime=1580264354)
(33204, 17887487, 64768, 1, 1001, 1001, 0, 2147483647, 2147483647, 1580264354)
2147483647000000000
msg360932 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2020-01-29 02:27
In utime_stat_localtime.py, "os.stat (sec)" is the result of os.utime.
In utime_stat_localtime2.py "os.stat (sec int)" is the result of os.stat.
msg360942 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 09:13
> [dje@rawhide ~]$ python3 -c 'import os; os.utime("testfn", (4386268800, 4386268800))'
> [dje@rawhide ~]$ stat testfn
> (...)
> Modify: 2038-01-18 22:14:07.000000000 -0500

Oh. It means that os.utime() replaces mtime=4386268800 with a value close to INT_MAX (2**31-1) to workaround the year 2038 bug.

But the buildbot worker runs Fedora Rawhide and uses the architecture s390x with is a 64-bit system: sizeof(time_t)=8. The glibc is not supposed to change mtime on such platform.

I wrote attached mtime.c which is supposed to reproduce the issue.

Example on my Fedora Rawhide x86-64 VM (glibc-2.30.9000-31.fc32.x86_64):
---
$ gcc mtime.c -o mtime && ./mtime
uname -a:
Linux rawhide 5.5.0-0.rc6.git3.1.fc32.x86_64 #1 SMP Fri Jan 17 18:29:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
sizeof(time_t) = 8 bytes
sizeof(void*) = 8 bytes

open(testfn, O_WRONLY | O_CREAT)
utimensat(AT_FDCWD, testfn, {atime=mtime=4386268800.0}, 0)
stat(testfn)
st.st_mtime = 4386268800
st.st_mtim.tv_nsec = 0
(ignored: st.st_mtime.tv_sec = 0)
unlink(testfn)
---

I get "st.st_mtime = 4386268800" as expected.
msg360943 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 09:23
I'm now curious which syscalls are used by the glibc to implement utimensat() and stat().

On a ppc64le Fedora Rawhide running Linux kernel 5.4.8-200.fc31.ppc64le with glibc 2.30.9000, I get:

$ sudo dnf install -y strace # if neeeded
$ strace -o trace ./mtime
(...)

$ grep -E '^(utime|stat)' trace 
utimensat(AT_FDCWD, "testfn", [{tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T00:00:00+0000 */, {tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T00:00:00+0000 */], 0) = 0
stat("testfn", {st_mode=S_IFREG|000, st_size=0, ...}) = 0

=> the glibc uses utimensat() and stat() syscalls.

I get the same syscalls on a x64-64 Fedora Rawhide (Linux kernel 5.5.0-0.rc6.git3.1.fc32.x86_64, libc 2.30.9000):

$ grep -E '^(utime|stat)' trace 
utimensat(AT_FDCWD, "testfn", [{tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T01:00:00+0100 */, {tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T01:00:00+0100 */], 0) = 0
stat("testfn", {st_mode=S_IFREG|000, st_size=0, ...}) = 0

Note: I use "uname -r" to get the Linux kernel version and I run directly "/lib64/libc.so.6" to get the glibc version.
msg360944 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 09:36
glibc commit adding support for utimensat_time64() syscall:
https://sourceware.org/git/?p=glibc.git;a=commit;h=f5b6fd258b6dd520403a20024e58cb491aca4cbd

See also: https://sourceware.org/glibc/wiki/Y2038ProofnessDesign
msg360947 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 10:25
I failed to reproduce the issue on s390x Fedora Rawhide, Linux kernel 5.5.0-0.rc4.git2.1.fc32.s390x, glibc-2.30.9000-31.fc32.s390x.

I tested filesystem nfs and tmpfs.

test_add_file_after_2107() pass.

[vstinner@devel10 ~]$ uname -r
5.5.0-0.rc4.git2.1.fc32.s390x

[vstinner@devel10 ~]$ rpm -q glibc
glibc-2.30.9000-31.fc32.s390x

[vstinner@devel10 ~]$ strace -o trace ./mtime
uname -a:
Linux devel10.s390.bos.redhat.com 5.5.0-0.rc4.git2.1.fc32.s390x #1 SMP Fri Jan 3 20:19:26 UTC 2020 s390x s390x s390x GNU/Linux
sizeof(time_t) = 8 bytes
sizeof(void*) = 8 bytes

open(testfn, O_WRONLY | O_CREAT)
utimensat(AT_FDCWD, testfn, {atime=mtime=4386268800.0}, 0)
stat(testfn)
st.st_mtime = 4386268800
st.st_mtim.tv_nsec = 0
(ignored: st.st_mtime.tv_sec = 0)
unlink(testfn)

[vstinner@devel10 ~]$ grep -E '^(utime|stat)' trace 
utimensat(AT_FDCWD, "testfn", [{tv_sec=4386268800, tv_nsec=0} /* 2108-12-29T19:00:00-0500 */, {tv_sec=4386268800, tv_nsec=0} /* 2108-12-29T19:00:00-0500 */], 0) = 0
stat("testfn", {st_mode=S_IFREG|041, st_size=0, ...}) = 0
msg360948 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 10:41
> I failed to reproduce the issue on s390x Fedora Rawhide, Linux kernel 5.5.0-0.rc4.git2.1.fc32.s390x, glibc-2.30.9000-31.fc32.s390x.

I still cannot reproduce after a "dnf upgrade -y":

* Linux 5.5.0-0.rc6.git3.1.fc32.s390x
* glibc-2.30.9000-31.fc32.s390x

I tested attached mtime.c and Python test_add_file_after_2107() on nfs and tmpfs filesystems.

The s390x Fedora Rawhide 3.x buildbot worker where the bug occurs uses the same Linux kernel (5.5.0-0.rc6.git3.1.fc32.s390x) and it looks like the same glibc version (glibc 2.30.9000). Checking "rpm -q glibc" would say if it's exactly the same Fedora package.

David: what is the GCC version? (rpm -q gcc)
msg360949 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 11:00
Summary:

* test_zipfile.test_add_file_after_2107() fails on Fedora Rawhide 3.x
* Bug seen on Python 3.8 and Python 3.9
* Bug seen on s390x and x86-64 architectures
* It seems like the bug occurs in os.utime() with atime=mtime=4386268800 => stat returns mtime=2147483647
* It may be a recent change in the glibc to workaround year 2038 bug

Failure on x86-64 Fedora Rawhide, on Fedora Koji build server:

* https://koji.fedoraproject.org/koji/taskinfo?taskID=40870460
* Linux 5.4.8-200.fc31.x86_64
* glibc 2.30.9000
* gcc (GCC) 10.0.1 20200121 (Red Hat 10.0.1-0.4)
* utimensat() available

Failure on "s390x Fedora Rawhide 3.x" buildbot worker:

* https://buildbot.python.org/all/#/builders/323/builds/27
* Linux 5.5.0-0.rc6.git3.1.fc32.s390x
* glibc 2.30.9000
* gcc (GCC) 10.0.1 20200121 (Red Hat 10.0.1-0.4)
* utimensat() available
msg360950 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 11:00
Next question: does the issue come from the glibc or the filesystem? Attached mtime.c should ease debug. If you reproduce the issue, please test different filesystem and report which filesystems you tested.

To detect the filesystem used by a directory, I'm using df to find the mount point (ex: "/home") and then mount to get the filesystem:

$ df .
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p3  461G  368G   93G  80% /home

$ mount|grep /home
/dev/nvme0n1p3 on /home type btrfs (rw,relatime,seclabel,ssd,space_cache,subvolid=258,subvol=/home)

=> btrfs here
msg360951 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 11:08
I failed to reproduce the bug on 3 machines.

My x86-64 Fedora VM:

* Linux 5.5.0-0.rc6.git3.1.fc32.x86_64
* glibc-2.30.9000-31.fc32.x86_64
* gcc-10.0.1-0.5.fc32.x86_64
* utimensat() available
* I tested ext4 (/home) and tmpfs (/tmp)

ppc64le mock:

* Linux 5.4.8-200.fc31.ppc64le (kernel of the Fedora 31 host)
* glibc-2.30.9000-31.fc32.ppc64le
* gcc-10.0.1-0.5.fc32.ppc64le
* I tested ext4 (/home) and tmpfs (/tmp)

s390x devel machine:

* Linux 5.5.0-0.rc6.git3.1.fc32.s390x
* glibc-2.30.9000-31.fc32.s390x
* gcc-10.0.1-0.5.fc32.s390x
* I tested nfs4 (/home) and tmpfs (/tmp)
msg360952 - (view) Author: Florian Weimer (fweimer) Date: 2020-01-29 11:19
I believe you might be observing an XFS limitation in combination with a Linux VFS bug.

On disk, XFS only supports 32-bit timestamps:

typedef struct xfs_timestamp {
	__be32		t_sec;		/* timestamp seconds */
	__be32		t_nsec;		/* timestamp nanoseconds */
} xfs_timestamp_t;

This is on the roadmap being fixed.

However, the Linux VFS code does not appear to know about this. It caches the full 64-bit value. You only see the truncated value if it is read back from disk:

# touch -t 222201020304 /tmp/t
# ls -l /tmp/t
-rw-r--r--. 1 root root 0 Jan  2  2222 /tmp/t
# echo 3 > /proc/sys/vm/drop_caches 
# ls -l /tmp/t
-rw-r--r--. 1 root root 0 Oct 19  1949 /tmp/t

This is a bug in the Linux VFS layer.
msg360954 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 11:29
I managed to reproduce the issue on Fedora 31 on a XFS filesystem:

$ touch testfn
$ strace -o trace python3 -c 'import os; os.utime("testfn", (4386268800, 4386268800))'
$ grep ^utimensat trace
utimensat(AT_FDCWD, "testfn", [{tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T01:00:00+0100 */, {tv_sec=4386268800, tv_nsec=0} /* 2108-12-30T01:00:00+0100 */], 0) = 0
$ stat testfn
(...)
Modify: 2038-01-19 04:14:07.000000000 +0100
(...)

It looks like a Linux kernel issue in the XFS filesystem: I don't think that mtime should be *silently* truncated to 2147483647.
msg360956 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 11:51
The kernel bug is now tracked at https://bugzilla.redhat.com/show_bug.cgi?id=1795576

I wrote PR 18247 to skip the test.

I suggest to leave this issue is open until the kernel is fixed: until the test is no longer skipped.
msg360963 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-29 14:23
New changeset 3cb49b62e61208efcefbc04414e769fc173f205d by Victor Stinner in branch 'master':
bpo-39460: Fix test_zipfile.test_add_file_after_2107() (GH-18247)
https://github.com/python/cpython/commit/3cb49b62e61208efcefbc04414e769fc173f205d
msg360964 - (view) Author: miss-islington (miss-islington) Date: 2020-01-29 14:41
New changeset 2b675f0c8fd96f61977f6dc636f44fbd5587b6b3 by Miss Islington (bot) in branch '3.8':
bpo-39460: Fix test_zipfile.test_add_file_after_2107() (GH-18247)
https://github.com/python/cpython/commit/2b675f0c8fd96f61977f6dc636f44fbd5587b6b3
msg361561 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-02-07 10:29
It seems like AIX also has a kernel issue with timestamp after year 2038. The year 2107 is stored as year 1972.

test_add_file_after_2107 (test.test_zipfile.StoredTestsWithSourceFile) ... skipped 'Linux VFS/XFS kernel bug detected: mtime_ns=91301504000000000'

https://bugs.python.org/issue39502#msg361116

>>> print(datetime.datetime.fromtimestamp(91301504))
1972-11-22 18:31:44

--

The test is now skipped if the kernel or filesystem stores incorrectly the timestamp 4386268800 (2108-12-30). I don't think that we can do much more.

On the Linux kernel side, we reported the issue to:
https://bugzilla.redhat.com/show_bug.cgi?id=1795576

I close this issue.
History
Date User Action Args
2022-04-11 14:59:25adminsetgithub: 83641
2020-02-07 10:29:48vstinnersetstatus: open -> closed
versions: + Python 3.8
messages: + msg361561

resolution: fixed
stage: patch review -> resolved
2020-01-29 14:41:54miss-islingtonsetnosy: + miss-islington
messages: + msg360964
2020-01-29 14:25:52miss-islingtonsetpull_requests: + pull_request17630
2020-01-29 14:23:36vstinnersetmessages: + msg360963
2020-01-29 11:51:38vstinnersetmessages: + msg360956
title: test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x -> test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x: Linux VFS/XFS bug
2020-01-29 11:36:58vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request17626
2020-01-29 11:29:41vstinnersetmessages: + msg360954
2020-01-29 11:19:15fweimersetnosy: + fweimer
messages: + msg360952
2020-01-29 11:08:21vstinnersetmessages: + msg360951
2020-01-29 11:00:54vstinnersetmessages: + msg360950
2020-01-29 11:00:50vstinnersetmessages: + msg360949
2020-01-29 10:41:16vstinnersetmessages: + msg360948
2020-01-29 10:25:59vstinnersetmessages: + msg360947
2020-01-29 09:36:41vstinnersetmessages: + msg360944
2020-01-29 09:23:47vstinnersetmessages: + msg360943
2020-01-29 09:13:54vstinnersetfiles: + mtime.c

messages: + msg360942
2020-01-29 02:27:38David.Edelsohnsetmessages: + msg360932
2020-01-29 02:22:04David.Edelsohnsetmessages: + msg360931
2020-01-29 02:03:03vstinnersetmessages: + msg360930
2020-01-29 01:56:01David.Edelsohnsetmessages: + msg360928
2020-01-29 01:52:12David.Edelsohnsetmessages: + msg360926
2020-01-29 01:39:15vstinnersettitle: test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x: time.localtime() limited to year 2038 -> test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x
2020-01-29 01:38:37vstinnersetfiles: + utime_stat_localtime2.py

messages: + msg360925
2020-01-29 01:21:05David.Edelsohnsetmessages: + msg360924
2020-01-29 01:04:57vstinnersetmessages: + msg360923
title: test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x -> test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x: time.localtime() limited to year 2038
2020-01-29 00:54:19vstinnersetmessages: + msg360922
2020-01-29 00:51:09vstinnersetmessages: + msg360921
2020-01-29 00:46:41vstinnersetmessages: + msg360918
2020-01-29 00:26:42David.Edelsohnsetmessages: + msg360916
2020-01-29 00:21:33vstinnersetfiles: + utime_stat_localtime.py

messages: + msg360915
2020-01-29 00:17:35hroncoksetmessages: + msg360914
title: test_zipfile: test_add_file_after_2107() fails on s390x Fedora Rawhide 3.x -> test_zipfile: test_add_file_after_2107() sometimes fails on Fedora Rawhide 3.x
2020-01-29 00:15:01David.Edelsohnsetmessages: + msg360913
2020-01-29 00:09:15vstinnersetmessages: + msg360912
2020-01-28 18:57:39David.Edelsohnsetmessages: + msg360897
2020-01-28 18:55:41David.Edelsohnsetmessages: + msg360896
2020-01-28 18:38:43vstinnersetmessages: + msg360892
2020-01-28 15:38:05Dormouse759setmessages: + msg360881
2020-01-28 12:26:16vstinnersetmessages: + msg360866
2020-01-28 10:34:03xtreaksettitle: README.md -> test_zipfile: test_add_file_after_2107() fails on s390x Fedora Rawhide 3.x
nosy: + David.Edelsohn, hroncok, Dormouse759, - paul.moore, tim.golden, ezio.melotti, docs@python, zach.ware, steve.dower, Alex.Willmer, Hinsonlcrystal

assignee: docs@python ->
components: - Build, Documentation, Installation, Interpreter Core, Library (Lib), Unicode, Windows, Cross-Build, SSL
type: security -> behavior
2020-01-28 09:20:10Hinsonlcrystalsettitle: test_zipfile: test_add_file_after_2107() fails on s390x Fedora Rawhide 3.x -> README.md
nosy: + Hinsonlcrystal, paul.moore, tim.golden, ezio.melotti, docs@python, Alex.Willmer, zach.ware, steve.dower, - David.Edelsohn, hroncok, Dormouse759

assignee: docs@python
components: + Build, Documentation, Installation, Interpreter Core, Library (Lib), Unicode, Windows, Cross-Build, SSL
type: security
2020-01-28 09:14:51hroncoksetnosy: + hroncok
messages: + msg360845
2020-01-27 15:59:06vstinnersetmessages: + msg360774
2020-01-27 15:35:35vstinnersetnosy: + David.Edelsohn
2020-01-27 09:25:05vstinnersetnosy: + Dormouse759
2020-01-27 09:24:50vstinnercreate