Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 615: Add zoneinfo module #84683

Closed
pganssle opened this issue May 4, 2020 · 19 comments
Closed

PEP 615: Add zoneinfo module #84683

pganssle opened this issue May 4, 2020 · 19 comments
Assignees
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@pganssle
Copy link
Member

pganssle commented May 4, 2020

BPO 40503
Nosy @malemburg, @Yhg1s, @abalkin, @serhiy-storchaka, @zooba, @pganssle, @isuruf, @FFY00
PRs
  • bpo-40503: Add tests and implementation for ZoneInfo #19909
  • bpo-40503: Add documentation and what's new entry for zoneinfo #20006
  • bpo-40503: Add compile-time configuration to PEP 615 #20034
  • bpo-40503: Check in <prefix>/share/zoneinfo for zoneinfo files on windows #28495
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/pganssle'
    closed_at = None
    created_at = <Date 2020-05-04.18:51:21.137>
    labels = ['type-feature', 'library', '3.9']
    title = 'PEP 615: Add zoneinfo module'
    updated_at = <Date 2021-09-21.20:27:37.648>
    user = 'https://github.com/pganssle'

    bugs.python.org fields:

    activity = <Date 2021-09-21.20:27:37.648>
    actor = 'steve.dower'
    assignee = 'p-ganssle'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2020-05-04.18:51:21.137>
    creator = 'p-ganssle'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40503
    keywords = ['patch']
    message_count = 17.0
    messages = ['368076', '368461', '368619', '368634', '368649', '368656', '368658', '368800', '369021', '369059', '369206', '401992', '402247', '402330', '402350', '402352', '402357']
    nosy_count = 8.0
    nosy_names = ['lemburg', 'twouters', 'belopolsky', 'serhiy.storchaka', 'steve.dower', 'p-ganssle', 'isuruf', 'FFY00']
    pr_nums = ['19909', '20006', '20034', '28495']
    priority = 'high'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue40503'
    versions = ['Python 3.9']

    @pganssle
    Copy link
    Member Author

    pganssle commented May 4, 2020

    This is an issue to track the implementation of PEP-615: https://www.python.org/dev/peps/pep-0615/

    It should mostly involve migrating from the reference implementation: https://github.com/pganssle/zoneinfo/

    @pganssle pganssle added the 3.9 only security fixes label May 4, 2020
    @pganssle pganssle self-assigned this May 4, 2020
    @pganssle pganssle added stdlib Python modules in the Lib dir type-feature A feature request or enhancement 3.9 only security fixes labels May 4, 2020
    @pganssle pganssle self-assigned this May 4, 2020
    @pganssle pganssle added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels May 4, 2020
    @pganssle
    Copy link
    Member Author

    pganssle commented May 8, 2020

    I've separated this into two separate PRs, one for docs and one for tests/implementation.

    I have not yet implemented the logic for the ability to configure the TZPATH at compile time because I'm not quite sure how to start on that. How are other compile-time options implemented? Do I need to do something special to handle both Windows and POSIX systems? Is there already a configuration file somewhere that I should use for this? Adding Thomas to the nosy list because he's the only one listed for "autoconf/makefiles" – don't know if that extends to the Windows build as well.

    @Yhg1s
    Copy link
    Member

    Yhg1s commented May 11, 2020

    The normal way to do this (for make/autoconf) is to add a --with-tzpath argument to configure.ac, and a make variable to pass it to the compilation of anything that needs it. You can then access it from Python code with sysconfig.get_config_var().

    In configure.ac, AC_SUBST(TZPATH) makes configure replace @TZPATH@ in the Makefile with the value you set to $TZPATH in configure.ac. You then either add that to the global PY_CFLAGS_NODIST, or modify the build rule for the module that needs it to pass it along. (See for example how GITTAG/GITVERSION/GITBRANCH are passed to Modules/getbuildinfo.o.)

    AC_ARG_WITH() is how you add a new --with-* argument to configure. The usual way people do this is by copying one of the other AC_ARG_WITH blocks and modifying it to suit their needs. It's a mixture of m4 and shell that can be a bit annoying to get right, but it's pretty flexible. Run autoreconf to regenerate configure. You can manually check that the shell in configure makes sense.

    Something will have to be done on the Windows side as well, but I'm not sure what. Adding Steve Dower for that.

    @pganssle
    Copy link
    Member Author

    Thanks Thomas, that was super helpful. I've created #64233 to add in the compile-time arguments on POSIX systems at least, do you mind having a look?

    For the moment I have made it non-configurable on Windows, but I think the right thing to do is to add an argument to PCbuild\build.bat. Hopefully Steve can point me in the right direction for mapping that argument (or something else) to a new config variable in sysconfig.

    @serhiy-storchaka
    Copy link
    Member

    Do we need the C implementation if there is the Python implementation?

    @pganssle
    Copy link
    Member Author

    I mean, theoretically we don't "need" it, but it's much, much faster, and without it nearly every operation that needs time zone offsets will be slower than pytz (which has a mechanism for caching).

    Also, I've already written it, so I see no reason why not use it.

    @pganssle
    Copy link
    Member Author

    Here are some benchmarks run using the latest implementation. The pure Python code is pretty optimized, but the C code is still ~4-5x faster.

    Running from_utc in zone Europe/Paris
    c_zoneinfo: mean: 494.82 ns ± 3.80 ns; min: 489.23 ns (k=5, N=500000)
    py_zoneinfo: mean: 2.48 µs ± 79.17 ns; min: 2.42 µs (k=5, N=100000)
    dateutil: mean: 10.41 µs ± 209.97 ns; min: 10.17 µs (k=5, N=50000)
    pytz: mean: 4.69 µs ± 252.70 ns; min: 4.39 µs (k=5, N=50000)

    Running to_utc in zone Europe/Paris
    c_zoneinfo: mean: 539.61 ns ± 25.68 ns; min: 514.39 ns (k=5, N=500000)
    py_zoneinfo: mean: 2.01 µs ± 61.69 ns; min: 1.94 µs (k=5, N=100000)
    dateutil: mean: 7.88 µs ± 506.89 ns; min: 7.25 µs (k=5, N=50000)
    pytz: mean: 773.02 ns ± 14.11 ns; min: 759.56 ns (k=5, N=500000)

    Running utcoffset in zone Europe/Paris
    c_zoneinfo: mean: 329.34 ns ± 36.31 ns; min: 302.88 ns (k=5, N=1000000)
    py_zoneinfo: mean: 1.57 µs ± 9.58 ns; min: 1.55 µs (k=5, N=200000)
    dateutil: mean: 6.28 µs ± 86.61 ns; min: 6.16 µs (k=5, N=50000)
    pytz: mean: 461.47 ns ± 2.07 ns; min: 458.91 ns (k=5, N=500000)

    utcoffset() is very likely to be called possibly many times in certain hot loops (including implicitly as it's part of hash and equality checks).

    @pganssle
    Copy link
    Member Author

    Talked to Steve Dower in a sidebar about the issue with compile-time configuration, he is convinced that compile-time configuration is not something that would be useful or worth doing on Windows. I am indifferent on the matter, so I am fine with calling the compile-time configuration part of this done at this point. If anyone building Python for Windows shows up needing support for this, we can re-visit the issue — I don't believe it's technically infeasible, just that the usage patterns of Python on Windows mean that it's not worth doing.

    So, once we've got a critical mass of reviews done for zoneinfo, I think this feature is done. 🎉

    @vstinner
    Copy link
    Member

    New changeset 62972d9 by Paul Ganssle in branch 'master':
    bpo-40503: PEP-615: Tests and implementation for zoneinfo (GH-19909)
    62972d9

    @pganssle
    Copy link
    Member Author

    New changeset b17e49e by Paul Ganssle in branch 'master':
    bpo-40503: Add documentation and what's new entry for zoneinfo (GH-20006)
    b17e49e

    @vstinner
    Copy link
    Member

    I suggest to open a separated issue to discuss how tzdata can be installed on Travis CI, Azure Pipelines, Buildbots, etc. when running tests.

    @isuruf
    Copy link
    Mannequin

    isuruf mannequin commented Sep 16, 2021

    If anyone building Python for Windows shows up needing support for this, we can re-visit the issue — I don't believe it's technically infeasible, just that the usage patterns of Python on Windows mean that it's not worth doing.

    At conda-forge, we need this and our current solution is
    https://github.com/conda-forge/python-feedstock/blob/8195ba1178041b7461238e8c5680eee62f5ea9d0/recipe/patches/0032-Fix-TZPATH-on-windows.patch#L19

    I can change to check if that directory exists. What do you think?

    @zooba
    Copy link
    Member

    zooba commented Sep 20, 2021

    That looks like a relocatable path anyway, which means you wouldn't want to compile it into the binary. Your patch to select a default value at runtime is probably better.

    The compile-time option is meant for a system directory.

    @isuruf
    Copy link
    Mannequin

    isuruf mannequin commented Sep 21, 2021

    Thanks @steve.dower for the info. I've created #28495. Let me know if it needs to have a separate bpo issue.

    @zooba
    Copy link
    Member

    zooba commented Sep 21, 2021

    I meant I don't think we want that path upstream, and you should keep it as your own patch.

    We don't have a "share" directory as part of CPython, but this would codify it. That would then conflict with your use of it in conda-forge.

    It also makes sysconfig non-deterministic, which is going to be problematic against some other efforts around being able to treat it as static data to make native module compilation easier, as well as more generally something that often leads to security vulnerabilities and is best avoided.

    So on balance, I think we will reject that patch.

    @isuruf
    Copy link
    Mannequin

    isuruf mannequin commented Sep 21, 2021

    Do you have a suggestion for how to make it configurable at compile time?
    In POSIX platforms, we can set --with-tzpath to make it configurable at compile time.

    @zooba
    Copy link
    Member

    zooba commented Sep 21, 2021

    We'd need to implement some form of data store for values on Windows, which currently are not compiled in anywhere, and it would need a substitution syntax to be updated at runtime.

    We're talking a big feature now, and you'd end up having just as much code on your side to make it work.

    The only justification that'd make sense on our side is if we were going to bundle the tzdata file with the Windows installer, which doesn't affect you at all, apart from us then having an existing path listed that you'd want to override still. And even then, we'd probably preinstall the tzdata package which already overrides the path, and so there'd be no changes necessary to sysconfig and so we wouldn't touch it anyway.

    @serhiy-storchaka
    Copy link
    Member

    Should not this issue be closed?

    @pganssle
    Copy link
    Member Author

    Yeah I think so.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    Archived in project
    Development

    No branches or pull requests

    5 participants