datetime.strptime: Support for parsing offsets with a colon #75981

Closed
mariocj89 mannequin opened this issue Oct 16, 2017 · 9 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

mariocj89 mannequin commented Oct 16, 2017

BPO 31800
Nosy @abalkin, @vadmium, @pganssle, @mariocj89, @pablogsal
PRs
  • bpo-31800: Support for colon when parsing time offsets #4015
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-26.00:35:43.737>
    created_at = <Date 2017-10-16.22:51:49.234>
    labels = ['3.7', 'type-feature', 'library']
    title = 'datetime.strptime: Support for parsing offsets with a colon'
    updated_at = <Date 2017-10-26.00:35:43.731>
    user = 'https://github.com/mariocj89'

    bugs.python.org fields:

    activity = <Date 2017-10-26.00:35:43.731>
    actor = 'belopolsky'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-26.00:35:43.737>
    closer = 'belopolsky'
    components = ['Library (Lib)']
    creation = <Date 2017-10-16.22:51:49.234>
    creator = 'mariocj89'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31800
    keywords = ['patch']
    message_count = 9.0
    messages = ['304486', '304489', '304490', '304510', '304620', '304644', '304645', '304836', '305017']
    nosy_count = 5.0
    nosy_names = ['belopolsky', 'martin.panter', 'p-ganssle', 'mariocj89', 'pablogsal']
    pr_nums = ['4015']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue31800'
    versions = ['Python 3.7']

    mariocj89 mannequin commented Oct 16, 2017

    Currently, datetime.strptime does not support parsing UTC offsets that include a colon. "+0000" is parsed without issues, whilst "+00:00" fails.

    "+NN:NN" is not only valid ISO 8601 but also the way the offset is presented to the user when calling .isoformat on a datetime with a timezone/offset.

    This leads users to rely on external libraries like dateutil or iso8601 just to be able to parse the datetime strings that "datetime" itself produces.
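For readers on interpreters that predate this change, one possible stopgap is to drop the colon from the offset before calling strptime. The sketch below is an illustration only: the helper name `parse_with_colon_offset` and the regular expression are assumptions, not part of the proposed patch, and it only handles a trailing ±HH:MM offset.

```python
import datetime as dt
import re

# Matches a trailing offset such as +05:00 or -03:30 (hypothetical pattern).
_OFFSET_WITH_COLON = re.compile(r"([+-]\d{2}):(\d{2})$")


def parse_with_colon_offset(text, fmt="%Y-%m-%dT%H:%M:%S%z"):
    """Rewrite a trailing +HH:MM as +HHMM, then parse with strptime."""
    normalized = _OFFSET_WITH_COLON.sub(r"\1\2", text)
    return dt.datetime.strptime(normalized, fmt)


parsed = parse_with_colon_offset("2004-01-01T10:10:10+05:00")
print(parsed.utcoffset())
```

Because the colon is removed before parsing, this does not depend on the %z behaviour requested in this issue.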

    Even if the long-term goal is to provide a way to parse any isoformatted string, this issue only aims to address the problem with %z parsing. That alone unblocks users who need to parse datetime objects serialized with isoformat.

    With this change, the following will just work:

    >>> import datetime as dt
    >>> iso_fmt = '%Y-%m-%dT%H:%M:%S%z'
    >>> d = dt.datetime.strptime('2004-01-01T10:10:10+05:00', iso_fmt)

    *'2004-01-01T10:10:10+05:00' is a sample string generated via datetime.isoformat()
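The round trip described above can be sketched as follows, assuming Python 3.7 or later (the version this change is targeted at). The `roundtrip` helper is a hypothetical name used only for illustration, and the format string does not cover microseconds.

```python
import datetime as dt

ISO_FMT = "%Y-%m-%dT%H:%M:%S%z"


def roundtrip(value):
    """Serialize with isoformat() and parse the result back with strptime()."""
    return dt.datetime.strptime(value.isoformat(), ISO_FMT)


original = dt.datetime(
    2004, 1, 1, 10, 10, 10, tzinfo=dt.timezone(dt.timedelta(hours=5))
)
parsed = roundtrip(original)
print(original.isoformat())
print(parsed == original)
```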

    Other options, such as adding a new %:z directive, were proposed, but having just %z seems much simpler for the user.

    Note: There have already been conversations about adding support in datetime for parsing any ISO-formatted string. This is a simpler approach. We might be able to get there after this patch, but this aims just to unblock us.

    Related:
    http://www.loc.gov/standards/datetime/iso-tc154-wg5_n0039_iso_wd_8601-2_2016-02-16.pdf
    https://mail.python.org/pipermail/python-ideas/2014-March/027018.html
    https://bugs.python.org/issue15873

    @mariocj89 mariocj89 mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Oct 16, 2017
    vadmium commented Oct 17, 2017

    FWIW it looks like “strptime” in glibc, and Open and Free BSD support parsing this and even more formats (RFC 822 and RFC 3339; includes “Z”, U.S. time zones, ±HH). Also, there is bpo-24954 for adding “%:z” like GNU “date”.

    vadmium commented Oct 17, 2017

    Sorry, I meant Net BSD not Free BSD

    mariocj89 mannequin commented Oct 17, 2017

    Yep, http://man7.org/linux/man-pages/man3/strptime.3.html does support it, even if it might look asymmetrical.

    Example:

        struct tm tm;
        char buf[255];

        memset(&tm, 0, sizeof(struct tm));
        strptime("+00:00", "%z", &tm);
        strftime(buf, sizeof(buf), "%z", &tm);
        puts(buf); // Will print +0000
        exit(EXIT_SUCCESS);
    

    Martin, do you want me to clean up the PR, add docs, a news entry, etc.?
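The same asymmetry can be observed from Python once %z accepts a colon. This is a minimal sketch assuming Python 3.7 or later: parsing takes "+00:00", while formatting with %z still emits the offset without a colon.

```python
import datetime as dt

# Parse an offset written with a colon, then format it back with %z.
parsed = dt.datetime.strptime("2004-01-01 +00:00", "%Y-%m-%d %z")
formatted = parsed.strftime("%z")
print(formatted)
```

Code that needs the colon form on output can keep using isoformat() rather than strftime("%z").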

    @pganssle

    This seems very useful to me. I very frequently advise people *against* using dateutil.parser (despite my conflict of interest as maintainer of dateutil) for well-known formats, but the problem frequently comes up of, "what should I do when I have a date created by isoformat()?", to which there's no clean, satisfying answer other than, "use dateutil.parser even though you know the format."

    I think the strptime page that Mario linked to is evidence that the %z directive is *intended* to match against -HH:MM, and so that might be the most "standard" solution.

    That said, I somewhat prefer the granularity of the GNU date extensions %z, %:z and %::z, since this allows downstream users to be stricter about what they are willing to accept. I think either approach is defensible, but that *something* should be done soon, preferably for the 3.7 release.
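If a single permissive %z is adopted, downstream code that wants to be stricter can still validate the offset shape itself before parsing. The sketch below assumes Python 3.7 or later; the helper `parse_strict_colon` and its pattern are hypothetical and only illustrate the idea of accepting the ±HH:MM form while rejecting ±HHMM.

```python
import datetime as dt
import re

ISO_FMT = "%Y-%m-%dT%H:%M:%S%z"
# Hypothetical strictness rule: the string must end in +HH:MM or -HH:MM.
_COLON_OFFSET = re.compile(r"[+-]\d{2}:\d{2}$")


def parse_strict_colon(text):
    """Parse text, but reject offsets that are not written with a colon."""
    if not _COLON_OFFSET.search(text):
        raise ValueError("expected a +HH:MM or -HH:MM offset in %r" % text)
    return dt.datetime.strptime(text, ISO_FMT)


accepted = parse_strict_colon("2004-01-01T10:10:10+05:00")
try:
    parse_strict_colon("2004-01-01T10:10:10+0500")
    rejected = False
except ValueError:
    rejected = True
print(accepted.utcoffset(), rejected)
```

Separate directives such as %:z would move this check into the format string; the sketch keeps it in user code instead.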

    mariocj89 mannequin commented Oct 19, 2017

    As a note, it seems support for the ":" was added to glibc in 2015:
    http://code.metager.de/source/xref/gnu/glibc/time/strptime_l.c#765

    Commit e952e1df

    Before that, it basically just ignored the minutes.

    mariocj89 mannequin commented Oct 19, 2017

    I also have a patch to add 'Z' support, if we are interested in matching what glibc does (it supports 'Z' as well).
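For reference, the Python 3.7 documentation for %z describes 'Z' as identical to '+00:00' when parsing. A minimal sketch, assuming Python 3.7 or later:

```python
import datetime as dt

ISO_FMT = "%Y-%m-%dT%H:%M:%S%z"

# 'Z' is read as a zero UTC offset by the %z directive.
parsed = dt.datetime.strptime("2004-01-01T10:10:10Z", ISO_FMT)
print(parsed.utcoffset())
```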

    abalkin commented Oct 23, 2017

    Note that bpo-5288 relaxed the whole number of minutes restriction on UTC offsets. Since the goal is to be able to parse the output of .isoformat(), I think %z should accept sub-minute offsets.
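A sub-minute offset round trip might look like the sketch below. It assumes Python 3.7 or later, where timezone accepts offsets that are not a whole number of minutes and %z is documented as accepting ±HH[:]MM[:SS[.ffffff]]. The specific offset value is arbitrary.

```python
import datetime as dt

ISO_FMT = "%Y-%m-%dT%H:%M:%S%z"

# An offset with a seconds component (arbitrary example value).
offset = dt.timedelta(hours=5, minutes=30, seconds=15)
original = dt.datetime(2004, 1, 1, 10, 10, 10, tzinfo=dt.timezone(offset))

text = original.isoformat()
parsed = dt.datetime.strptime(text, ISO_FMT)
print(text)
print(parsed.utcoffset() == offset)
```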

    abalkin commented Oct 26, 2017

    New changeset 3231893 by Alexander Belopolsky (Mario Corchero) in branch 'master':
    Closes bpo-31800: Support for colon when parsing time offsets (#4015)

    @abalkin abalkin closed this as completed Oct 26, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022