classification
Title: imaplib: Internaldate2tuple() string/bytes issues, does not handle negative TZ offsets, does not handle DST correctly
Type: behavior Stage: resolved
Components: Documentation, Library (Lib), Tests Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: imaplib: Internaldate2tuple() is documented to return UTC, but it returns local time
View: 10934
Assigned To: docs@python Nosy List: belopolsky, docs@python, georg.brandl, lavajoe, r.david.murray
Priority: normal Keywords: patch

Created on 2011-01-16 23:32 by lavajoe, last changed 2011-01-19 17:53 by belopolsky. This issue is now closed.

Files
File name Uploaded Description Edit
imaplib_Internaldate2tuple_python32.patch lavajoe, 2011-01-18 01:22 Patch for Python 3.2
imaplib_Internaldate2tuple_python27.patch lavajoe, 2011-01-18 01:36 Patch to Python 2.7 that fixes the pre-py3k subset of issues
Messages (11)
msg126386 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-16 23:32
Internaldate2tuple() is broken in several ways.  The last two issues have existed in the code for some time.

New issues in Python 3:

1. It crashes with "KeyError".  This is because the Mon2num dictionary's keys are strings, not bytes objects (note that other strings in imaplib have been updated, but not Mon2num).

2. The sign of the TZ offset (e.g. -0700) is compared to the string '-', not b'-', so the compare is never true, causing a large error when TZ offset is negative.

Left over from Python 2.x:

3. DST is not handled correctly.  Specifically, when the date is such that its local time form and its UTC form (if both interpreted as local time) are on different sides of a DST changeover, the result will be off by one hour.  This is because the check for DST is done after treating the time as local time, even if it is not local time, causing it be tested while sometimes on the wrong side of a DST change.  This can be corrected by using calls that keep time in UTC.

4. Related to #3, the result is returned in local time, whereas the documentation states that the result is in UT.  The fix for #3 takes care of this one, as well.

Here is an example:  Run the following two dates, that represent exactly the same time, through Internaldate2tuple:

'25 (INTERNALDATE "01-Apr-2000 19:02:23 -0700")'
'101 (INTERNALDATE "02-Apr-2000 02:02:23 +0000")'

Once the Mon2num issue is fixed, Python 3 will perform the conversions, but with a 15 hour difference.  Python 2 will produce results with a one hour difference.

Note that the last two issues (but describing only #4 above) were also addressed in a similar way in an old post I found by Colin Brown in 2004 (http://www.velocityreviews.com/forums/t336162-imaplib-function-bug.html).
msg126388 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-17 01:46
Regarding problem #4, it actually appears that returning local time is the right thing to do, since it matches the behavior of Time2Internaldate().  Returning UTC, as the doc states, would potentially break IMAP append() that can include an internaldate possibly returned from Internaldate2tuple() in typical usage (like when copying messages).  In this case, the doc is wrong on Internaldate2tuple().

I have attached a new patch to both the code and doc that replaces the old patch.  I now return localtime rather than gmtime, but other than that, the DST fix remains the same and now handles the cases near DST changes correctly.

Ultimately, it might be desirable to be able always stay in UTC, so perhaps adding UTC variants of both Internaldate2tuple() and Time2Internaldate() (and a UTC option to append()) would be a good enhancement for later.
msg126418 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-17 17:40
> 1. It crashes with "KeyError". ...

I assume this means it raises a KeyError when given a bytes object as an argument.

>>> Internaldate2tuple(b'INTERNALDATE "01-Jan-2000 12:00:00 +0000"')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Lib/imaplib.py", line 1326, in Internaldate2tuple
    mon = Mon2num[mo.group('mon')]
KeyError: b'Jan'

> 2.  The sign of the TZ offset ..

Once Mon2num is fixed, the sign error show up as follows:

>>> Internaldate2tuple(b'INTERNALDATE "01-Jan-2000 12:00:00 +0500"')[3:6]
(2, 0, 0)
>>> Internaldate2tuple(b'INTERNALDATE "01-Jan-2000 12:00:00 -0500"')[3:6]
(2, 0, 0)

 
This looks like a 2 to 3 port oversight and we can probably fix it in RC.  Georg?
msg126421 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-17 18:01
> I assume this means it raises a KeyError when given a bytes object as an argument.

Yes, a KeyError is raised when arg is bytes, but passing a string also fails (raising TypeError).  The latter might also be a separate bug, in that strings cannot be passed as they could be in Python 2.

> This looks like a 2 to 3 port oversight and we can probably fix it in RC.

Probably, since many strings have been changed to bytes elsewhere in the file.

BTW, I just attached a patch for Python 2.7 that fixes the subset of non-py3k-related issues.
msg126444 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-18 01:07
Added fix for ParseFlags (another string/bytes issue) and now accept strings as args to ParseFlags and Internaldate2tuple.

Also added unit tests for changes.
msg126448 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-18 03:18
There are at least 3 issues here: a documentation issue, a py3k bug and at least one feature request.  Given that the logistics are different for different kinds of issues, I think it would be best to split them on the tracker.  Here is how I see what is being proposed:

1. Internaldate2tuple is documented to return UTC timetuple, but implemented to return local time (ignoring the DST issues.)  The proposal is to change the documentation.

2. There are a couple of str vs bytes bugs that obviously need to be fixed.

3. The proposed patch also make both str and bytes acceptable as Internaldate2tuple argument.

As discussed in issue 9864, it would be best if datetime formatting and parsing routines would operate on datetime objects.  Something along the lines of


def parseinternaldate(datestr):
    return datetime.strptime(datestr, "%d-%b-%Y %H:%M:%S %z")

def formatinternaldate(dateobj):
    return dateobj.strftime("%d-%b-%Y %H:%M:%S %z")

The above uses recently added timezone support in datetime module.

There is still a problem, though.  The code above would only work as expected in the C locale, but Time2Internaldate already has this limitation.
msg126451 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-18 04:19
>There are at least 3 issues here: a documentation issue, a py3k bug and at least one feature request.

Which is a feature request?  In these patches, I am attempting to fix the DST problems and regain the previous behavior in Python 2.  Are you talking about the ability to accept a string vs. a bytes object?

> 1. Internaldate2tuple is documented to return UTC timetuple, but implemented to return local time (ignoring the DST issues.)  The proposal is to change the documentation.

I prefer UTC, so this is a bit of a shame, I agree, but the use of the pervious interfaces assumed localtime, so changing to UTC would definitely break existing code.  I do think it would be nice to extend this to deal with UTC instead, but in this patch, I am only trying to retain current functionality.

2. There are a couple of str vs bytes bugs that obviously need to be fixed.

> 3. The proposed patch also make both str and bytes acceptable as Internaldate2tuple argument.

True, but given the new role of str and bytes, it is unclear what existing code would try to pass.

> As discussed in issue 9864, it would be best if datetime formatting and parsing routines would operate on datetime objects.

I can see that redoing some of this would be a good idea.  But I am only trying to keep the existing stuff from being broken in this patch.  I agree that the interfaces could be a lot better, and I would indeed like to see it improved (and I am willing to help with doing that).

> There is still a problem, though.  The code above would only work as expected in the C locale, but Time2Internaldate already has this limitation.

As long as we assume strings passed are ASCII, it should work.  And email headers should be ASCII (although I have seen some really weird deviations from this on old emails).  It's not perfect, certainly, and going forward, the IMAP lib could be tightened up.  Maybe this first patch could be thought of as a step, at least fixing what is broken until improved.
msg126464 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-18 13:59
I have started splitting these up as recommended.  First one (documentation) is: issue 10934.  I will split out more later today...
msg126466 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-18 15:09
If I understand correctly, what Alexander means by "only work in the C locale" is that both strptime and strftime are locale dependent, and so if the locale is something other than C it may fail to parse the month name and may generate a non-standard month name.
msg126504 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-18 23:49
Two more issues split out into their own issues:

issue 10939 (bytes/str issues)
issue 10941 (DST handled incorrectly)
msg126534 - (view) Author: Joe Peterson (lavajoe) Date: 2011-01-19 15:40
This issue has been split, as suggested by Alexander Belopolsky, into separate issues:

issue 10934 - doc change to correctly reflect return of local time vs. UTC
issue 10939 - bytes/str issues
issue 10941 - DST handled incorrectly
issue 10947 - compatibility using str/bytes

Closing this issue (see separate ones above).
History
Date User Action Args
2011-01-19 17:53:12belopolskysetresolution: duplicate
superseder: imaplib: Internaldate2tuple() is documented to return UTC, but it returns local time
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
stage: test needed -> resolved
2011-01-19 15:40:12lavajoesetstatus: open -> closed
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126534
2011-01-18 23:49:49lavajoesetnosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126504
2011-01-18 15:09:45r.david.murraysetnosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126466
2011-01-18 13:59:45lavajoesetnosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126464
2011-01-18 04:19:11lavajoesetnosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126451
2011-01-18 03:18:44belopolskysetnosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126448
2011-01-18 01:36:03lavajoesetfiles: + imaplib_Internaldate2tuple_python27.patch
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-18 01:35:46lavajoesetfiles: - imaplib_Internaldate2tuple_python27.patch
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-18 01:22:51lavajoesetfiles: + imaplib_Internaldate2tuple_python32.patch
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-18 01:22:34lavajoesetfiles: - imaplib_Internaldate2tuple_python32.patch
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-18 01:22:24lavajoesetfiles: - imaplib_Internaldate2tuple.patch
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-18 01:07:35lavajoesetfiles: + imaplib_Internaldate2tuple_python32.patch
title: imaplib: Internaldate2tuple() crashes, does not handle negative TZ offsets, does not handle DST correctly -> imaplib: Internaldate2tuple() string/bytes issues, does not handle negative TZ offsets, does not handle DST correctly
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
messages: + msg126444

assignee: docs@python
components: + Tests
2011-01-17 18:01:06lavajoesetfiles: + imaplib_Internaldate2tuple_python27.patch

versions: + Python 2.7
messages: + msg126421
nosy: georg.brandl, belopolsky, r.david.murray, docs@python, lavajoe
2011-01-17 17:40:26belopolskysetnosy: + georg.brandl

messages: + msg126418
stage: patch review -> test needed
2011-01-17 14:58:18pitrousetnosy: belopolsky, r.david.murray, docs@python, lavajoe
stage: patch review
2011-01-17 14:58:13pitrousetassignee: docs@python -> (no value)

nosy: + r.david.murray, belopolsky
2011-01-17 14:25:31brian.curtinsettype: crash -> behavior
2011-01-17 01:53:07lavajoesetcomponents: + Documentation
2011-01-17 01:51:59lavajoesetcomponents: - Documentation
2011-01-17 01:47:26lavajoesetfiles: - imaplib_Internaldate2tuple.patch
2011-01-17 01:46:35lavajoesetfiles: + imaplib_Internaldate2tuple.patch

assignee: docs@python
components: + Documentation
title: imaplib: Internaldate2tuple() crashes, does not handle negative TZ offsets, does not handle DST correctly, and outputs localtime (not UTC) -> imaplib: Internaldate2tuple() crashes, does not handle negative TZ offsets, does not handle DST correctly
nosy: + docs@python

messages: + msg126388
2011-01-16 23:35:50lavajoesettitle: imaplib: Internaldate2tuple crashes, does not handle negative TZ offsets, does not handle DST correctly, and outputs localtime (not UTC) -> imaplib: Internaldate2tuple() crashes, does not handle negative TZ offsets, does not handle DST correctly, and outputs localtime (not UTC)
2011-01-16 23:32:46lavajoesettype: crash
2011-01-16 23:32:31lavajoecreate