classification
Title: year 2038 problem in compileall.py
Type: compile error Stage: resolved
Components: Build Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ammar2, bmwiedemann, matrixise, miss-islington, serhiy.storchaka, vstinner, xtreak
Priority: normal Keywords: patch

Created on 2018-10-15 11:22 by bmwiedemann, last changed 2021-08-24 15:13 by ammar2. This issue is now closed.

Pull Requests
URL Status Linked
PR 9892 closed matrixise, 2018-10-15 18:40
PR 19651 closed ammar2, 2020-04-22 10:07
PR 19708 merged ammar2, 2020-04-25 03:53
PR 27928 merged miss-islington, 2021-08-24 09:14
PR 27929 merged miss-islington, 2021-08-24 09:14
Messages (13)
msg327743 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2018-10-15 11:22
To reproduce:
touch -d 2038-01-20 /usr/lib/python3.6/site-packages/six.py
python3 /usr/lib64/python3.6/compileall.py


  File "/usr/lib64/python3.6/compileall.py", line 198, in compile_path
    legacy=legacy, optimize=optimize)
  File "/usr/lib64/python3.6/compileall.py", line 90, in compile_dir
    legacy, optimize):
  File "/usr/lib64/python3.6/compileall.py", line 138, in compile_file
    mtime)
struct.error: 'l' format requires -2147483648 <= number <= 2147483647

It could use either
a 64-bit int (requires a new .pyc format with a different magic number) or
an unsigned 32-bit int (gives us only another 68 years)
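
For illustration, the failure and the two ranges mentioned above can be sketched outside compileall. This is only a sketch: the names MTIME_2038, pack_signed, pack_unsigned and last_representable are made up for this example and do not appear in compileall.py.

```python
import struct
from datetime import datetime, timedelta, timezone

# 2038-01-20 00:00:00 UTC, just past the signed 32-bit limit (2**31 - 1).
MTIME_2038 = 2147558400

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def pack_signed(mtime):
    """Pack mtime as a signed 32-bit little-endian int ('l')."""
    return struct.pack('<l', mtime)


def pack_unsigned(mtime):
    """Pack mtime as an unsigned 32-bit little-endian int ('L')."""
    return struct.pack('<L', mtime)


def last_representable(max_value):
    """Return the UTC datetime for the largest timestamp a field can hold."""
    return EPOCH + timedelta(seconds=max_value)


try:
    pack_signed(MTIME_2038)
    signed_ok = True
except struct.error:
    # Same error class as in the traceback above.
    signed_ok = False

unsigned_bytes = pack_unsigned(MTIME_2038)
signed_limit = last_representable(2**31 - 1)    # falls in January 2038
unsigned_limit = last_representable(2**32 - 1)  # falls in February 2106
```

The unsigned field runs out in 2106, which is where the "another 68 years" comes from.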
msg327747 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-10-15 13:08
With 3.8a

Traceback (most recent call last):
  File "/home/stephane/src/github.com/python/cpython/Lib/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/stephane/src/github.com/python/cpython/Lib/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/stephane/src/github.com/python/cpython/Lib/compileall.py", line 326, in <module>
    exit_status = int(not main())
  File "/home/stephane/src/github.com/python/cpython/Lib/compileall.py", line 303, in main
    if not compile_file(dest, args.ddir, args.force, args.rx,
  File "/home/stephane/src/github.com/python/cpython/Lib/compileall.py", line 142, in compile_file
    expect = struct.pack('<4sll', importlib.util.MAGIC_NUMBER,
struct.error: 'l' format requires -2147483648 <= number <= 2147483647
msg327748 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-10-15 13:09
By 2038, though, there may be a new format for the .pyc file.

Should we keep this issue open and try to fix it for 3.8 or 3.9?
msg327749 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2018-10-15 13:15
It does not need to be fixed tomorrow, but 2037 is too late, because by then there will be a lot of legacy systems around.
(Un)fortunately, many systems live 10+ years now.
msg327750 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-15 13:16
Timestamps with year >= 2038 are accepted: importlib._bootstrap_external._code_to_timestamp_pyc() uses (int(x) & 0xFFFFFFFF). The masking is not a bug; it is by design. compileall should just do the same. Sorry, I don't know if it's specified somewhere, but I know that it's done on purpose.
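
A minimal sketch of what "doing the same" could look like when building the header that compileall compares against. It assumes the 12-byte prefix of a timestamp-based .pyc is the magic number, a zero flags word, and the mtime; expected_pyc_header is a hypothetical helper name, not the actual compileall code.

```python
import importlib.util
import struct


def expected_pyc_header(mtime):
    """Build magic + flags + mtime, masking mtime to 32 bits.

    Hypothetical helper: the mask keeps the value inside the range of
    the unsigned 'L' format, so struct.pack() cannot raise for large
    or negative timestamps.
    """
    return struct.pack('<4sLL', importlib.util.MAGIC_NUMBER,
                       0, int(mtime) & 0xFFFFFFFF)


header_2038 = expected_pyc_header(2147558400)   # 2038-01-20 UTC
header_negative = expected_pyc_header(-1)       # one second before the epoch
header_wrapped = expected_pyc_header(2**32 + 5)  # wraps around to 5
```

Because both the writer and the checker apply the same mask, the comparison stays consistent even though the stored value is no longer the full timestamp.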
msg327751 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-10-15 13:18
So we need to fix compileall.py.

Maybe we could add the label 'easy' to this issue.
msg327753 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-10-15 13:29
A reproducer in Python that can be added to test_compileall if needed:

def test_compile_all_2038(self):
    # Set atime and mtime to 2038-01-20 00:00:00 UTC, as in the touch reproducer.
    os.utime(self.source_path, (2147558400, 2147558400))
    self.assertTrue(compileall.compile_file(pathlib.Path(self.source_path)))


./python.exe -m unittest -v test.test_compileall.CompileallTestsWithSourceEpoch.test_compile_all_2038
test_compile_all_2038 (test.test_compileall.CompileallTestsWithSourceEpoch) ... ERROR

======================================================================
ERROR: test_compile_all_2038 (test.test_compileall.CompileallTestsWithSourceEpoch)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_py_compile.py", line 30, in wrapper
    return fxn(*args, **kwargs)
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_compileall.py", line 114, in test_compile_all_2038
    self.assertTrue(compileall.compile_file(pathlib.Path(self.source_path)))
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/compileall.py", line 142, in compile_file
    expect = struct.pack('<4sll', importlib.util.MAGIC_NUMBER,
struct.error: 'l' format requires -2147483648 <= number <= 2147483647

----------------------------------------------------------------------


Thanks
msg327774 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-10-15 19:16
Victor, it seems there was some discussion about the 2038 problem in the original PR, but I don't know if it's related to this. Reference: https://github.com/python/cpython/pull/4575#discussion_r153376173

Thanks
msg360270 - (view) Author: Bernhard M. Wiedemann (bmwiedemann) * Date: 2020-01-19 20:27
ping.
Another 19th of January passed.

I'd still like to see progress on this, because this hinders my other y2038 bug discovery work.
msg367679 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-29 17:13
I would prefer to mimic importlib._bootstrap_external, which uses:

def _pack_uint32(x):
    """Convert a 32-bit integer to little-endian."""
    return (int(x) & 0xFFFFFFFF).to_bytes(4, 'little')

Using a 64-bit timestamp (PR 19651) or treating the timestamp as unsigned (PR 9892 and PR 19708) both have drawbacks:

* 64-bit timestamps make .pyc files larger
* unsigned timestamps no longer support timestamps before 1970, which can cause practical issues
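
The trade-offs listed above can be checked with a small comparison. This is a sketch only: try_pack and pack_masked are illustrative names, and pack_masked mirrors the _pack_uint32 snippet quoted earlier in this message.

```python
import struct


def try_pack(fmt, value):
    """Return the packed bytes, or None if the value does not fit fmt."""
    try:
        return struct.pack(fmt, value)
    except struct.error:
        return None


def pack_masked(x):
    """Mask to 32 bits, then emit 4 little-endian bytes."""
    return (int(x) & 0xFFFFFFFF).to_bytes(4, 'little')


after_2038 = 2**31 + 1   # just past the signed 32-bit limit
before_1970 = -1         # one second before the epoch

results = {
    'signed 32-bit': (try_pack('<l', after_2038), try_pack('<l', before_1970)),
    'unsigned 32-bit': (try_pack('<L', after_2038), try_pack('<L', before_1970)),
    'signed 64-bit': (try_pack('<q', after_2038), try_pack('<q', before_1970)),
    'masked 32-bit': (pack_masked(after_2038), pack_masked(before_1970)),
}

for name, (late, early) in results.items():
    print(name,
          'post-2038:', None if late is None else len(late),
          'pre-1970:', None if early is None else len(early))
```

Only the 64-bit and masked variants accept both values, and only the masked variant does so in 4 bytes.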

"& 0xFFFFFFFF" looks dead simple, uses a fixed size of 4 bytes and doesn't have any limitation of year 2038.

The timestamp doesn't have to be exact. In practice, it sounds very unlikely that two timestamps are equal when compared using (ts1 & 0xFFFFFFFF) == (ts2 & 0xFFFFFFFF). I expect file modification times to be close by a few days, not separated by 2**32 seconds (136 years).
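
To make the collision argument concrete, here is a small sketch of the masked comparison. The names same_masked, WRAP and wrap_years are only for illustration, and the year length of 365.25 days is an approximation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # approximate (Julian year)
WRAP = 2**32                            # masked timestamps repeat every 2**32 seconds


def same_masked(ts1, ts2):
    """Compare two timestamps the way a masked 32-bit header field would."""
    return (int(ts1) & 0xFFFFFFFF) == (int(ts2) & 0xFFFFFFFF)


mtime = 2147558400  # 2038-01-20 UTC

# A file touched a day later is still detected as changed.
changed_next_day = not same_masked(mtime, mtime + 86400)

# Only a difference that is an exact multiple of 2**32 seconds goes unnoticed.
missed_after_wrap = same_masked(mtime, mtime + WRAP)

wrap_years = WRAP / SECONDS_PER_YEAR  # roughly 136 years
```

A stale .pyc would be accepted only if the source mtime moved by an exact multiple of 2**32 seconds, which is the case described above as very unlikely.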

Use hash-based .pyc files to avoid any issues with file modification time: it should make Python more deterministic (more "reproducible").
https://docs.python.org/dev/reference/import.html#pyc-invalidation
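
As a sketch of the hash-based route, py_compile accepts an invalidation_mode argument. The example below compiles a throwaway file whose mtime is past 2038 and reads back the flags word of the resulting header. The helper name compile_hash_based and the file names are made up for this example.

```python
import os
import py_compile
import tempfile


def compile_hash_based(source_path, cfile):
    """Compile source_path to cfile as a checked hash-based .pyc."""
    return py_compile.compile(
        source_path,
        cfile=cfile,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'example.py')
    with open(src, 'w') as f:
        f.write('x = 1\n')
    # Push the mtime past 2038; a hash-based .pyc does not store it.
    os.utime(src, (2147558400, 2147558400))

    pyc = compile_hash_based(src, os.path.join(tmp, 'example.pyc'))
    with open(pyc, 'rb') as f:
        header = f.read(16)

# Bytes 4..8 of the header are the flags word; the lowest bit marks a
# hash-based .pyc, the next bit marks it as "checked" against the source.
flags = int.from_bytes(header[4:8], 'little')
is_hash_based = bool(flags & 0b01)
is_checked = bool(flags & 0b10)
```

The same mode is available from the command line as `python -m compileall --invalidation-mode checked-hash`.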
msg400200 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-08-24 09:13
New changeset bb21e28fd08f894ceff2405544a2f257d42b1354 by Ammar Askar in branch 'main':
bpo-34990: Treat the pyc header's mtime in compileall as an unsigned int (GH-19708)
https://github.com/python/cpython/commit/bb21e28fd08f894ceff2405544a2f257d42b1354
msg400214 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-08-24 15:07
New changeset 9d3b6b2472f7c7ef841e652825de652bc8af85d7 by Miss Islington (bot) in branch '3.9':
[3.9] bpo-34990: Treat the pyc header's mtime in compileall as an unsigned int (GH-19708)
https://github.com/python/cpython/commit/9d3b6b2472f7c7ef841e652825de652bc8af85d7
msg400215 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-08-24 15:09
New changeset 0af681b652c43f0ba90988400ecc1e7934fbfc5d by Miss Islington (bot) in branch '3.10':
[3.10] bpo-34990: Treat the pyc header's mtime in compileall as an unsigned int (GH-19708)
https://github.com/python/cpython/commit/0af681b652c43f0ba90988400ecc1e7934fbfc5d
History
Date User Action Args
2021-08-24 15:13:34 ammar2 set status: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: + Python 3.10, Python 3.11, - Python 3.5, Python 3.6, Python 3.7, Python 3.8
2021-08-24 15:09:19 ammar2 set messages: + msg400215
2021-08-24 15:07:35 ammar2 set messages: + msg400214
2021-08-24 09:14:26 miss-islington set pull_requests: + pull_request26379
2021-08-24 09:14:21 miss-islington set nosy: + miss-islington
pull_requests: + pull_request26378
2021-08-24 09:13:39 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg400200
2020-04-29 17:13:16 vstinner set messages: + msg367679
2020-04-25 03:53:51 ammar2 set pull_requests: + pull_request19029
2020-04-22 10:07:55 ammar2 set nosy: + ammar2
pull_requests: + pull_request18977
2020-01-19 20:27:16 bmwiedemann set messages: + msg360270
versions: + Python 3.5, Python 3.8, Python 3.9
2018-10-15 19:16:38 xtreak set messages: + msg327774
2018-10-15 18:40:08 matrixise set keywords: + patch
stage: patch review
pull_requests: + pull_request9256
2018-10-15 13:29:37 xtreak set messages: + msg327753
2018-10-15 13:18:02 matrixise set messages: + msg327751
2018-10-15 13:16:45 vstinner set nosy: + vstinner
messages: + msg327750
2018-10-15 13:15:46 bmwiedemann set messages: + msg327749
2018-10-15 13:09:21 matrixise set messages: + msg327748
2018-10-15 13:08:30 matrixise set nosy: + matrixise
messages: + msg327747
2018-10-15 12:50:29 xtreak set nosy: + xtreak
2018-10-15 11:22:16 bmwiedemann create