classification
Title: test_zoneinfo fails when lzma module is unavailable
Type: crash Stage: patch review
Components: Tests Versions: Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: doodspav, nmaynes, p-ganssle, xtreak
Priority: normal Keywords: easy, newcomer friendly, patch

Created on 2020-07-22 20:17 by doodspav, last changed 2020-08-04 19:58 by nmaynes.

Files
File name Uploaded Description
test_output.txt doodspav, 2020-07-22 20:17 Output from running `make test`
Pull Requests
URL Status Linked
PR 21730 closed nmaynes, 2020-08-04 11:08
PR 21734 open nmaynes, 2020-08-04 17:55
Messages (6)
msg374106 - (view) Author: (doodspav) * Date: 2020-07-22 20:17
Issue:
======
`_lzma` is not built because the required libraries are not available on my machine. `test_zoneinfo` assumes it is always available, leading it to crash on my machine.

How I built and ran the tests:
==============================
git clone https://github.com/python/cpython.git  (bpo-41364)
cd cpython
mkdir build && cd build
../configure
make -j8
make test > test_output.txt

Test traceback:
===============
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/test/libregrtest/runtest.py", line 272, in _runtest_inner
    refleak = _runtest_inner2(ns, test_name)
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/test/libregrtest/runtest.py", line 223, in _runtest_inner2
    the_module = importlib.import_module(abstest)
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/test/test_zoneinfo/__init__.py", line 1, in <module>
    from .test_zoneinfo import *
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/test/test_zoneinfo/test_zoneinfo.py", line 9, in <module>
    import lzma
  File "/home/doodspav/Jetbrains/CLionProjects/cpython/Lib/lzma.py", line 27, in <module>
    from _lzma import *
ModuleNotFoundError: No module named '_lzma'
msg374113 - (view) Author: (doodspav) * Date: 2020-07-22 21:50
Note:
=====
I marked the type as `crash` because during the test run, the first attempt at `test_zoneinfo` failed with a segfault.
msg374186 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-07-24 14:38
I got this error as well. Since lzma is needed to decode the test data, the ImportError can be caught to skip the tests in setUpModule [0], as other test modules do for their required dependencies. I am adding the easy tag. Feel free to retriage this if the test data needs to be encoded in a different format so that the tests support platforms that don't have lzma.

import unittest

# Skip the whole test module when lzma was not built.
try:
    import lzma
except ImportError:
    raise unittest.SkipTest("lzma is needed")


[0] https://github.com/python/cpython/blob/0dd98c2d00a75efbec19c2ed942923981bc06683/Lib/test/test_zoneinfo/test_zoneinfo.py#L43
msg374801 - (view) Author: Nathan Maynes (nmaynes) * Date: 2020-08-04 11:01
I'm creating a pull request that implements the suggestion by xtreak.
msg374822 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-08-04 15:07
I think for now skipping the tests when lzma is missing is the easiest thing, though another option would be to drop the compression on the input test data so that the tests don't depend on lzma.

Taking a look at the data files, it looks like we get around 50% compression using either lzma or gzip, but the uncompressed file is only 32k to start with:

    $ du -b tests/data/*
    31054   tests/data/zoneinfo_data.json
    15127   tests/data/zoneinfo_data.json.gz
    12895   tests/data/zoneinfo_data.json.lz

We're also currently using the "fat" binaries that `zic` produces (which include hard-coded transitions all the way until 2038). The new default for `zic` is to produce "slim" binaries, and the script to update test data does nothing to explicitly request fat binaries. If we were to switch over to "slim" binaries, the result would be more like this:

    $ du -b tests/data/*
    8297    tests/data/zoneinfo_data_slim.json.gz
    7750    tests/data/zoneinfo_data_slim.json.lz
    15551   tests/data/zoneinfo_data_unc_slim.json

So we're still looking at ~2:1 compression for both gzip and lzma, but the overall file size is 50% of what it was to start with. The biggest downside to this is that the way the "slim" binaries work is that once a rule repeats indefinitely, `zic` stops producing explicit transitions for it, and falls back to a simple repeating rule, meaning that the current set of tests would take a different code path.
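
A rough way to reproduce this kind of size comparison from Python is sketched below. It assumes nothing about the real data files: `sample` is a made-up, repetitive JSON payload, and `compressed_sizes` is a hypothetical helper, so the ratios it prints will differ from the `du` numbers above. The `lzma` import is guarded because that module is exactly what may be missing.

```python
import gzip
import json

try:
    import lzma
except ImportError:  # lzma is optional in some CPython builds
    lzma = None


def compressed_sizes(payload: bytes) -> dict:
    """Return byte counts for the raw payload and each available codec."""
    sizes = {"raw": len(payload), "gzip": len(gzip.compress(payload))}
    if lzma is not None:
        sizes["lzma"] = len(lzma.compress(payload))
    return sizes


# Hypothetical stand-in for the test data: repetitive JSON, not the real file.
sample = json.dumps(
    {f"Zone/{i}": ["transition"] * 50 for i in range(100)}
).encode("utf-8")

sizes = compressed_sizes(sample)
for codec, size in sizes.items():
    print(f"{codec:>5}: {size} bytes ({size / sizes['raw']:.0%} of raw)")
```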

I think we can go with the following course of action (3 or 4 different PRs):

1. Start by skipping the tests when `lzma` is missing.
2. Update the test suite so that it is testing more or less the same thing when the binaries are compiled with `-b slim`.
3. Change `Lib/test/test_zoneinfo/data/update_test_data.py` so that it pulls the raw data from the `tzdata` module on PyPI (which is compiled with `-b slim`) instead of the user's machine.
4. Change `update_test_data.py` to stop using `lzma` and change the tests so that they are able to process the new format of the JSON files.

If we ever decide that we really want the compression again, I assume that `gzip` is found more commonly than `lzma` among systems that don't build the whole standard library, so it might be mildly preferable to switch to `gzip`.
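
If the `gzip` route were ever taken, the round trip could look roughly like the sketch below. This is only an illustration: `dump_zone_data`, `load_zone_data`, and the dictionary layout are hypothetical names and shapes, not what `update_test_data.py` or the test suite actually use, and `gzip` itself still depends on `zlib` being built.

```python
import gzip
import io
import json


def dump_zone_data(data: dict, fileobj) -> None:
    """Write `data` as gzip-compressed JSON to a binary file object."""
    with gzip.GzipFile(fileobj=fileobj, mode="wb") as gz:
        gz.write(json.dumps(data).encode("utf-8"))


def load_zone_data(fileobj) -> dict:
    """Read gzip-compressed JSON back from a binary file object."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        return json.loads(gz.read().decode("utf-8"))


# Placeholder content; the real test data has its own structure.
original = {"metadata": {"version": "hypothetical"}, "data": {"UTC": ["placeholder"]}}

buffer = io.BytesIO()
dump_zone_data(original, buffer)
buffer.seek(0)  # rewind before reading the compressed bytes back
restored = load_zone_data(buffer)
```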
msg374842 - (view) Author: Nathan Maynes (nmaynes) * Date: 2020-08-04 19:58
I'm still trying to get the hang of the PR workflow, so my apologies in advance.

I closed the first PR by accident. I made the mistake of including a commit for another issue as well as the commit for this issue. When trying to clean up, I reverted too far and GitHub closed the PR. I have submitted another PR that imports the lzma library as follows:

from test.support.import_helper import import_module

lzma = import_module('lzma')

Let me know if something still does not look right. I'll have some time this evening to work it out.
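
For readers unfamiliar with the helper used in the snippet above: `test.support.import_helper.import_module` raises `unittest.SkipTest` when the import fails, so the test module is reported as skipped instead of erroring. The sketch below mimics that behaviour with a hypothetical stand-in, `import_or_skip`, so it runs even where the `test` package is not installed; it is not the actual implementation, and the missing module name is invented for the demonstration.

```python
import importlib
import unittest


def import_or_skip(name):
    """Import `name`, or raise unittest.SkipTest if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise unittest.SkipTest(str(exc)) from exc


# An available module is returned as usual.
json_module = import_or_skip("json")

# A missing module turns into a skip rather than an import error.
try:
    import_or_skip("surely_missing_module_for_demo")
except unittest.SkipTest as skip:
    skipped_reason = str(skip)
else:
    skipped_reason = None
```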
History
Date User Action Args
2020-08-04 19:58:40 nmaynes set messages: + msg374842
2020-08-04 17:55:49 nmaynes set pull_requests: + pull_request20879
2020-08-04 15:07:30 p-ganssle set messages: + msg374822
2020-08-04 11:08:50 nmaynes set keywords: + patch; stage: patch review; pull_requests: + pull_request20874
2020-08-04 11:01:52 nmaynes set nosy: + nmaynes; messages: + msg374801
2020-07-24 14:39:01 xtreak set title: test_zoneinfo fails -> test_zoneinfo fails when lzma module is unavailable
2020-07-24 14:38:38 xtreak set versions: + Python 3.9; nosy: + xtreak; messages: + msg374186; keywords: + easy, newcomer friendly
2020-07-23 03:05:37 xtreak set nosy: + p-ganssle
2020-07-22 21:50:02 doodspav set messages: + msg374113
2020-07-22 20:17:30 doodspav create