Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of a "list of available time zones" function to zoneinfo #84716

Closed
pganssle opened this issue May 6, 2020 · 12 comments
Closed

Addition of a "list of available time zones" function to zoneinfo #84716

pganssle opened this issue May 6, 2020 · 12 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@pganssle
Copy link
Member

pganssle commented May 6, 2020

BPO 40536
Nosy @malemburg, @abalkin, @vstinner, @ambv, @pganssle, @gaborbernat, @FFY00
PRs
  • bpo-40536: Add zoneinfo.available_timezones #20158
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/pganssle'
    closed_at = <Date 2020-09-14.18:08:27.166>
    created_at = <Date 2020-05-06.17:19:51.795>
    labels = ['type-feature', 'library', '3.10']
    title = 'Addition of a "list of available time zones" function to zoneinfo'
    updated_at = <Date 2020-09-14.18:08:27.165>
    user = 'https://github.com/pganssle'

    bugs.python.org fields:

    activity = <Date 2020-09-14.18:08:27.165>
    actor = 'p-ganssle'
    assignee = 'p-ganssle'
    closed = True
    closed_date = <Date 2020-09-14.18:08:27.166>
    closer = 'p-ganssle'
    components = ['Library (Lib)']
    creation = <Date 2020-05-06.17:19:51.795>
    creator = 'p-ganssle'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40536
    keywords = ['patch']
    message_count = 12.0
    messages = ['368285', '368530', '368799', '369143', '369148', '369162', '369163', '369209', '369243', '373595', '376889', '376891']
    nosy_count = 7.0
    nosy_names = ['lemburg', 'belopolsky', 'vstinner', 'lukasz.langa', 'p-ganssle', 'gaborjbernat', 'FFY00']
    pr_nums = ['20158']
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue40536'
    versions = ['Python 3.10']

    @pganssle
    Copy link
    Member Author

    pganssle commented May 6, 2020

    One thing that I sort of overlooked in PEP-615 that I think will be a common feature request for zoneinfo is the ability to get a list of time zones available on the current TZPATH.

    This is more complicated to implement than it sounds like, but luckily I already did it in order to implement the property tests for the zoneinfo module:

    https://github.com/pganssle/zoneinfo/blob/ffd21a6d065e04725e04b37bb430c2559fefd5fa/tests/test_zoneinfo_property.py#L23-L70

    The biggest complication is that there are files on TZPATH that are not necessarily time zones, and when I looked I couldn't easily find a list of all available time zones, so the only way to tell which ones are time zones and which ones aren't is to open each one and read the first 4 bytes. However, tzdata does ship with a list of available time zones, so the way I've got the function working right now is:

    1. If tzdata is installed – despite the fact that it's technically the end of the search path, convert the list of available zones that ships with tzdata to your starting set (since you know these are all valid zones).
    2. Walk the tzpath, and every time you find something not in the set of valid keys, open it and read the first 4 bytes, then add that to the set.

    The common cases will be that tzdata is not available (in which case no harm done), or tzdata has the same set of keys as the TZPATH, in which case you never have to open any of the TZif files. The fact that the search order is inverted from what it normally is doesn't matter because the output is the union of all the sets anyway.

    I don't know that the PEP needs to be amended – if this were a feature request for Python 3.10 I don't think it would need a PEP to go through, but I don't mind amending the PEP to document it.

    Design decisions (with my initial answers, loosely held):

    1. Should this be a classmethod or a free-standing function?

    I'm inclined towards free-standing function because zoneinfo.TZPATH is at the module level and not the class level.

    1. What should the return type be?

    I think frozenset() or set(); a sorted list is also a reasonable option, but for people who just want to use "in" or show some subset of available zones, that would be wasteful.

    We should go with frozenset() if we want there to be some possibility of caching the result in the future.

    1. Should we try to cache the result?

    I would say no, at least for now. It would not be super easy to get the cache invalidation right in the general case, and not many people are likely to be sensitive to the performance of this operation – the people who are can easily create their own cache.

    1. What should it be called?

    Naming things is hard. Options:

    • zoneinfo.timezones()
    • zoneinfo.all_timezones()
    • zoneinfo.timezone_{list,set}()
    • zoneinfo.valid_timezones()
    • zoneinfo.valid_keys()
    • zoneinfo.available_timezones()

    pytz has a similar thing and calls it all_timezones. It also has something called common_timezones, but that is a bit harder to pull off and I'm not confident that we'll get something satisfactory before the feature freeze.

    @pganssle pganssle added 3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels May 6, 2020
    @pganssle
    Copy link
    Member Author

    pganssle commented May 9, 2020

    I have an initial implementation against the reference implementation here: pganssle/zoneinfo#60

    Once #64108 is merged, I will turn that into a PR against CPython.

    For the first pass I went with:

    1. free-standing function
    2. returning set()
    3. no cache
    4. available_timezones()

    @pganssle
    Copy link
    Member Author

    From some discussion on the reference implementation PR, it seems that this may be a more complicated feature than I had bargained for: pganssle/zoneinfo#60

    The issue is that the current implementation picks up the posix/ and right/ directories as well as the posixrules file, none of which is wrong — they are available time zones, after all — but they're not really canonical zones. In tzdata we have a list of zones sourced from tzdata's make zonenames, which does not include the posix/ and right/ directories.

    There is zone1970.tab and zone.tab, but that has too strict of a subset — it only includes canonical zones associated with a specific country, as far as I can tell, and not things like UTC. It also includes, for example, America/Nuuk but not America/Godthab (which was the canonical name up until 2020a).

    I'm considering postponing this feature to Python 3.10 so that we can have more time to figure out a decent API for this.

    Łukasz: Question for you as release manager — would you prefer that we put this in for 3.9 with the understanding that before the final release we may end up needing to remove it as a nuisance or change what it returns (and possibly flags that would be passed to it), or would that be too disruptive in the beta period? I'm confident that we can make a final decision before October, just not confident that we can make a final decision before Monday.

    @pganssle pganssle self-assigned this May 13, 2020
    @pganssle
    Copy link
    Member Author

    I've opened a PR adding this feature and tagged this as release blocker, since I'd like to make sure this is in the beta if Łukasz does not object.

    The concerns about the stability of the function I expressed earlier have not changed much, though we did get some feedback from the tz database list that at least make me think that the new approach (excluding posix/, right/ and posixrules) is *probably* the right way to go.

    @gaborbernat
    Copy link
    Mannequin

    gaborbernat mannequin commented May 17, 2020

    I think semantically the correct expression would be available timezones, free function, set and no cache. Document the operation is expensive and users should cache if they want repeated access or provide an available timezones cached function 👍 my 2c

    @pganssle
    Copy link
    Member Author

    New changeset e527ec8 by Paul Ganssle in branch 'master':
    bpo-40536: Add zoneinfo.available_timezones (GH-20158)
    e527ec8

    @pganssle
    Copy link
    Member Author

    I've merged the existing implementation, but I'm leaving this staged as "release blocker" so that Łukasz can have final say on whether this goes into 3.9.

    Thanks for the help everyone!

    @vstinner
    Copy link
    Member

    def valid_key(fpath):
            try:
                with open(fpath, "rb") as f:
                    return f.read(4) == b"TZif"
            except Exception:  # pragma: nocover
                return False

    Why not only catching "except OSError:" rather than any exception?

    Or even "except (FileNotFoundError, PermissionError):" if you want to be pedantic :-p

    @ambv
    Copy link
    Contributor

    ambv commented May 18, 2020

    This goes into 3.9, but please settle on a final design long before October. Scratch that, please do it ideally before beta 2.

    I'm downgrading from release blocker, exception handling can be discussed and changed after beta 1 as well.

    @ambv
    Copy link
    Contributor

    ambv commented Jul 13, 2020

    Note: next week is 3.9.0b5, the last beta before 3.9.0. Please decide what to do with the rest of the issue before then.

    @ambv
    Copy link
    Contributor

    ambv commented Sep 14, 2020

    Note: the release of the last RC is imminent. Whatever API changes were planned, they have to wait for 3.10.

    @ambv ambv added 3.10 only security fixes and removed 3.9 only security fixes deferred-blocker labels Sep 14, 2020
    @pganssle
    Copy link
    Member Author

    Thanks Łukasz. Sorry I forgot to respond last time. We've had no feedback on this whatsoever, and I think we've probably made the right choice, so we can go ahead and call this resolved.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants