Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime-RFC2822 roundtripping #37748

Closed
doerwalter opened this issue Jan 9, 2003 · 36 comments
Closed

datetime-RFC2822 roundtripping #37748

doerwalter opened this issue Jan 9, 2003 · 36 comments
Assignees
Labels
easy stdlib Python modules in the Lib dir topic-email type-feature A feature request or enhancement

Comments

@doerwalter
Copy link
Contributor

BPO 665194
Nosy @tim-one, @warsaw, @akuchling, @doerwalter, @abalkin, @devdanzin, @merwok, @bitdancer
Dependencies
  • bpo-5094: datetime lacks concrete tzinfo implementation for UTC
  • Files
  • datetime-mimeformat.diff
  • formatdate_datetime_support.patch
  • util_datetime.patch
  • issue665194.diff
  • localtime.patch
  • localtime.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/abalkin'
    closed_at = <Date 2012-08-23.04:10:12.456>
    created_at = <Date 2003-01-09.18:24:53.000>
    labels = ['easy', 'type-feature', 'library', 'expert-email']
    title = 'datetime-RFC2822 roundtripping'
    updated_at = <Date 2012-08-23.04:10:12.455>
    user = 'https://github.com/doerwalter'

    bugs.python.org fields:

    activity = <Date 2012-08-23.04:10:12.455>
    actor = 'belopolsky'
    assignee = 'belopolsky'
    closed = True
    closed_date = <Date 2012-08-23.04:10:12.456>
    closer = 'belopolsky'
    components = ['Library (Lib)', 'email']
    creation = <Date 2003-01-09.18:24:53.000>
    creator = 'doerwalter'
    dependencies = ['5094']
    files = ['8214', '21886', '22649', '26094', '26954', '26955']
    hgrepos = []
    issue_num = 665194
    keywords = ['patch', 'easy']
    message_count = 36.0
    messages = ['53725', '53726', '53727', '53728', '53729', '53730', '53731', '53732', '53733', '81810', '107436', '135154', '135211', '135222', '135226', '135228', '140317', '140747', '140748', '161644', '163486', '163516', '163620', '168834', '168835', '168837', '168838', '168840', '168841', '168912', '168913', '168914', '168916', '168919', '168920', '168922']
    nosy_count = 10.0
    nosy_names = ['tim.peters', 'barry', 'akuchling', 'doerwalter', 'belopolsky', 'ajaksu2', 'eric.araujo', 'r.david.murray', 'Alexander.Belopolsky', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue665194'
    versions = ['Python 3.3']

    @doerwalter
    Copy link
    Contributor Author

    It would be good to have a simply way to convert
    between datetime objects and RFC2822 style strings.
    From string to datetime is easy with
    datetime.datetime(*email.Utils.parsedate(m)[:7]) (but
    this drops the timezone), but the other direction seems
    impossible. email.Utils.formatdate takes a timestamp
    argument, but AFAICT there's no way to get a timestamp
    out of a datetime object.

    Of course the best solution (ignoring backwards
    compatibility) would be for parsedate und formatdate to
    return/accept datetime objects or for datetime to have
    the appropriate methods.

    @doerwalter doerwalter added type-feature A feature request or enhancement labels Jan 9, 2003
    @tim-one
    Copy link
    Member

    tim-one commented Jan 11, 2003

    Logged In: YES
    user_id=31435

    You can get a timestamp like so:

    >>> time.mktime(datetime.date(2002, 1, 1).timetuple())
    1009861200.0
    >>>

    The dates for which this works depends on the platform
    mktime implementation, though.

    BTW, this sounds a lot more like a new feature request
    than a bug!

    @doerwalter
    Copy link
    Contributor Author

    Logged In: YES
    user_id=89016

    OK, I'll mark this a feature request.

    datetime has fromordinal() and toordinal(), it has
    fromtimestamp(), so I'd say a totimestamp() method would be
    a logical addition.

    @tim-one
    Copy link
    Member

    tim-one commented Jan 11, 2003

    Logged In: YES
    user_id=31435

    Define what totimestamp() should do. The range of
    timestamps supported by the *platform* C library (and so
    indirectly by Python's time module) isn't defined by any
    standard, and isn't easily discoverable either. It may or
    may not work before 1970, may or may not after 2038.
    datetime covers days far outside that range. Note that
    even a double doesn't have enough bits of precision to
    cover the full range of datetime values, either.

    In contrast, ordinals are wholly under Python's control, so
    we can promise surprise-free conversion in both directions.
    All we can promise about timestamps is that if the platform
    supports a timestamp for a time in datetime's range,
    datetime can make sense of it.

    @doerwalter
    Copy link
    Contributor Author

    Logged In: YES
    user_id=89016

    totimestamp() should be the inverse of fromtimestamp(), i.e.
    foo.totimestamp() should be the same as
    time.mktime(foo.timetuple()).
    datetime.datetime.totimestamp() should raise OverflowError
    iff time.mktime() raises OverflowError.

    But as this may lose precision, I'd say it would be better,
    if datetime supported RFC1123 conversion directly, i.e. two
    methods frommime() and tomime(), that parse and format
    strings like "Sun, 06 Nov 1994 08:49:37 GMT" (what
    rfc822.parsedate() and rfc822.formatdate() do)

    @brettcannon
    Copy link
    Member

    Logged In: YES
    user_id=357491

    time.strptime doesn't solve your problem?

    @doerwalter
    Copy link
    Contributor Author

    Logged In: YES
    user_id=89016

    time.strptime() is locale aware, but RFC1123 & RFC822
    require english day and month names, so time.strptime()
    can't be used as-is. (And of course time.strptime() only
    works for formatting, not for parsing)

    @akuchling
    Copy link
    Member

    Moving to feature requests.

    @doerwalter
    Copy link
    Contributor Author

    Here is a patch that implements to formatting part. It adds a method mimeformat() to the datetime class. The datetime object must have a tzinfo for this to work.
    File Added: datetime-mimeformat.diff

    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Feb 12, 2009

    Patch includes tests, should be updated.

    @devdanzin devdanzin mannequin added stdlib Python modules in the Lib dir labels Feb 12, 2009
    @devdanzin devdanzin mannequin added easy labels Apr 22, 2009
    @abalkin
    Copy link
    Member

    abalkin commented Jun 9, 2010

    -1 on adding more formatting/parsing methods to datetime. +1 on adding functions to email.utils that work with datetime objects. Adding bpo-5094 as a dependency because RFC 2822 requires timezone information for proper formatting.

    @bitdancer
    Copy link
    Member

    Here is a patch that adds datetime support to email.utils.formatdate. Ultimately the email package will give programs access to datetime+timezone representations of the dates in various headers, so this provides the output end of the round trip.

    Alexander, if you could review this that would be great. There may be bugs in the logic of formatdate, but my hope is that I haven't added any new ones.

    @abalkin
    Copy link
    Member

    abalkin commented May 5, 2011

    Rather than shoehorning datetime class support into existing functions, I think a separate set of functions should be written to convert between RFC 2822 timestamps and datetime instances. Currently, email.utils has three functions dealing with date-time specification in email messages:

    formatdate(timeval=None, localtime=False, usegmt=False),
    parsedate(data),
    parsedate_tz(data)

    To work with datetime instances, we can just provide two functions. For lack of better names, I'll call them format_datetime and parse_datetime. Rather than using a localtime flag in the format function, I suggest to always interpret naive datetime instances (those with tzinfo = None) as time given in UTC with no information about the local time zone. Per RFC 2822, this should be formatted with "-0000" in the timezone field. The format_datetime() may take usegmt flag, but it may be better to handle usegmt=True case by a separate format_datetime_gmt() function.

    The parse_datetime() function should use a similar convention and produce aware datetime instances unless TZ field contains "-0000". In this case a naive datetime containing unchanged time data should be produced.

    The problem of guessing the local timezone offset or converting naive datetime instance presumed to be in local time to an aware instance does not belong to the email package. This functionality should be added to the datetime module. See bpo-9527.

    There is a remaining question to which I don't have an immediate answer: How should parse_datetime() handle valid RFC 2882 date-time specifications that cannot be represented as a valid datetime. For example, a spec with seconds=60 or timezone > 2400.

    @bitdancer
    Copy link
    Member

    Do you think we can get 9527 in? My patch was based on the non-existence of a LocalTimezone facility in the stdlib.

    Good point about the special interpretation of -0000. Given 9527 (and only given 9527) it would indeed make sense to restrict email to aware datetimes plus naive datetimes for -0000 timestamps.

    I'll have to keep a flag for the 60th second outside of the datetime instance (or pretend it doesn't exist :)

    @abalkin
    Copy link
    Member

    abalkin commented May 5, 2011

    On Thu, May 5, 2011 at 12:44 PM, R. David Murray <report@bugs.python.org> wrote:
    ..

    Do you think we can get 9527 in?

    I hope we can. Pure Python implementation can be improved by deducing
    the TZ offset from localtime() and gmtime() calls. In C we can use
    additional struct tm fields when they are available to do even better.
    Would you like to add your voice to support bpo-9527?

    ..

    I'll have to keep a flag for the 60th second  outside of the datetime instance (or pretend it doesn't exist :)

    If you can find an e-mail message archived somewhere with 60 seconds
    in the timestamp, it will be a powerful argument to extend seconds
    range that can be stored in datetime objects. I doubt such messages
    exist, though. Few systems can produce such a timestamp even if they
    happen to process an e-mail during a leap second. In
    parse_datetime(), your choice will be between raising an error and
    approximating the leap second with the nearest representable time. I
    think clamping 60 seconds to 59 is the best option and this is what
    datetime.fromtimestamp does if the system happens to produce a leap
    second in the timetuple.

    @bitdancer
    Copy link
    Member

    Yes, since the package will save the original text anyway, I think just clamping to 59 is best.

    Hmm. Maybe instead I could put in an assert that says "please report this incident to bugs.python.org so we can argue that datetime should get support for leap seconds) :)

    @bitdancer
    Copy link
    Member

    OK, here is a new patch proposal. As suggested by Alexander, it adds two functions to email.utils: format_datetime and parsedate_to_datetime. Adding these does not require having localtime. In fact, with this patch you can get an aware datetime representing email's current best guess at localtime by doing:

       dt = parsedate_to_datetime(formatdate(localtime=True))

    Reviews welcome, but this is simple enough I'll probably just commit it if no one objects.

    @bitdancer bitdancer assigned bitdancer and unassigned abalkin Jul 13, 2011
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 20, 2011

    New changeset 5f7b03dcd523 by R David Murray in branch 'default':
    bpo-665194: support roundtripping RFC2822 date stamps in the email.utils module
    http://hg.python.org/cpython/rev/5f7b03dcd523

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 26, 2012

    New changeset df12ce0c96eb by R David Murray in branch 'default':
    bpo-665194: Add a localtime function to email.utils.
    http://hg.python.org/cpython/rev/df12ce0c96eb

    @abalkin
    Copy link
    Member

    abalkin commented Jun 22, 2012

    Most of the localtime() logic is now implemented (correctly) in datetime.astimezone(). Attached patch fixes email.utils.localtime() implementation.

    @abalkin
    Copy link
    Member

    abalkin commented Jun 23, 2012

    David,

    bpo-665194.diff patch is a bug fix for localtime(). If you decide to keep localtime(), there is not much of a rush because bug fixes can go in after beta.

    @bitdancer
    Copy link
    Member

    Well, I guess I'd like to keep it since it does satisfy a need in the email package: to manipulate dates in the local timezone when composing messages. It isn't a critical need, though; the critical need is just to be able to have a datetime that has the correct timezone as its tzinfo for 'now', and your astimezone change supplies that. On the third hand, 'util.localtime()' is a lot more intuitive then 'datetime.now().astimezone()', so I'd probably still have a util.localtime() helper even if I restricted it to only generating 'now'.

    So, on balance, since the method is in and tested with the extra functionality, and that functionality will probably prove useful to programs manipulating emails, I'm coming down in favor of keeping it.

    @bitdancer
    Copy link
    Member

    Alexander, this slipped off my radar.

    Some of the tests Brian added fail with the patch applied. I fixed most of them by adding back the support for converting an aware datetime to an aware localtime. The remaining two I believe are what you are fixing with this patch (in addition to converting to using astimezone for the current-localtime case). So there I fixed the tests.

    Hopefully I got this right, and hopefully you can find time to review it. I'll probably check it in before the RC regardless, since it looks like an improvement to me.

    @abalkin
    Copy link
    Member

    abalkin commented Aug 22, 2012

    I'll take a look tomorrow morning. (EDT:-)

    @abalkin
    Copy link
    Member

    abalkin commented Aug 22, 2012

    I noticed this part:

    + # We have an aware datetime. Use aware datetime arithmetic to find the
    + # seconds since the epoch.
    + delta = dt - datetime.datetime(*time.gmtime(0)[:6],
    + tzinfo=datetime.timezone.utc)
    + seconds = delta.total_seconds()

    Why don't you just return dt.astimezone() here? A round trip through a floating point timestamp always makes me nervous. I doubt loss of precision is a problem for the e-mail package, but who knows.

    @bitdancer
    Copy link
    Member

    Heh, I was just copying the previous code, and didn't think about being able to update it. Will make that change.

    @bitdancer
    Copy link
    Member

    And my tests and code were wrong, and I was wrong about what you were trying to fix. So since the other tests were passing before, presumably there is some test that could be added to exercise the bug you were fixing. Do you remember what that was?

    @abalkin
    Copy link
    Member

    abalkin commented Aug 22, 2012

    So since the other tests were passing before, presumably there
    is some test that could be added to exercise the bug you were
    fixing. Do you remember what that was?

    Yes, the issue was the one that was mentioned in an XXX comment: in many places UTC offset was different at different historical times but standard C library and Python's time module provides a single constant for it. There are plenty of examples that can be found in Olson's database (Europe/Kiev comes to mind), but it is not easy to devise a test that will work cross-platform. Maybe we should just restrict the test to Linux/BSD family of OSes?

    @bitdancer
    Copy link
    Member

    I think restricting the test is fine. If we find a platform-specific bug on another platform we can add a test specific to that platform when we fix the bug.

    Can you provide the test? I'm going to commit what I've got at this point to make sure I don't miss the RC.

    @AlexanderBelopolsky
    Copy link
    Mannequin

    AlexanderBelopolsky mannequin commented Aug 23, 2012

    Please commit. I'll add the test.

    On Wed, Aug 22, 2012 at 9:11 PM, R. David Murray <report@bugs.python.org> wrote:

    R. David Murray added the comment:

    I think restricting the test is fine. If we find a platform-specific bug on another platform we can add a test specific to that platform when we fix the bug.

    Can you provide the test? I'm going to commit what I've got at this point to make sure I don't miss the RC.

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue665194\>


    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2012

    New changeset 71b9cca81598 by R David Murray in branch 'default':
    bpo-665194: Update email.utils.localtime to use astimezone, and fix bug.
    http://hg.python.org/cpython/rev/71b9cca81598

    @bitdancer
    Copy link
    Member

    Leaving open until the test is committed.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2012

    New changeset a2d83fba8fd8 by R David Murray in branch 'default':
    bpo-665194: fix variable name in exception code path.
    http://hg.python.org/cpython/rev/a2d83fba8fd8

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2012

    New changeset 604222c1f8a0 by Alexander Belopolsky in branch 'default':
    Added test for a bug fixed in issue bpo-665194.
    http://hg.python.org/cpython/rev/604222c1f8a0

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2012

    New changeset 4c134e6ba0df by Alexander Belopolsky in branch 'default':
    Issue bpo-665194: Added a small optimization
    http://hg.python.org/cpython/rev/4c134e6ba0df

    @abalkin abalkin closed this as completed Aug 23, 2012
    @abalkin abalkin closed this as completed Aug 23, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy stdlib Python modules in the Lib dir topic-email type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants