classification
Title: Bugs in parsedate_tz
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: barry Nosy List: barry, therve, tzot (3)
Priority: normal Keywords

Created on 2005-03-02 21:03 by therve, last changed 2009-02-15 23:56 by ajaksu2.

Messages (3)
msg24433 - Author: Thomas Herve (therve) Date: 2005-03-02 21:03
Date parsing for emails is incomplete in both rfc822.py
and _parseaddr.py.

For example, "Wed, 02 Mar 2005 09:26:53+0800" is parsed
but "Wed, 02 Mar 2005 09:26:53-0800" is not.

The problem is clear from reading the code: only "+"
timezones are handled.
A patch follows:

Index : _parseaddr.py
----------------------------------------------------------------
@@ -60,7 +66,11 @@ def parsedate_tz(data):
         if i > 0:
             data[3:] = [s[:i], s[i+1:]]
         else:
-            data.append('') # Dummy tz
+            i = s.find('-')
+            if i > 0:
+                data[3:] = [s[:i], s[i:]]
+            else:
+                data.append('') # Dummy tz
     if len(data) < 5:
         return None
     data = data[:5]
----------------------------------------------------------------
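
For illustration, the splitting step touched by the patch can be sketched as a standalone function. The name split_time_and_zone and the shape of its argument are assumptions made for this example and are not part of _parseaddr.py; the body mirrors the patched branch shown above.

```python
def split_time_and_zone(fields):
    """Split a fused time/zone field such as '09:26:53-0800' in place.

    `fields` is assumed to be the whitespace-split date with the day of
    week already removed, e.g. ['02', 'Mar', '2005', '09:26:53-0800'].
    """
    if len(fields) == 4:
        s = fields[3]
        i = s.find('+')
        if i > 0:
            # '+' is dropped, as in the existing code
            fields[3:] = [s[:i], s[i + 1:]]
        else:
            i = s.find('-')
            if i > 0:
                # '-' is kept so the zone stays negative
                fields[3:] = [s[:i], s[i:]]
            else:
                fields.append('')  # dummy tz
    return fields


print(split_time_and_zone(['02', 'Mar', '2005', '09:26:53-0800']))
print(split_time_and_zone(['02', 'Mar', '2005', '09:26:53+0800']))
```

Keeping the sign for "-" while dropping it for "+" follows the slices used in the patch (s[i:] versus s[i+1:]).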
msg24434 - Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) Date: 2005-03-20 11:48
Note that parsedate_tz currently parses "Wed,
02 Mar 2005 09:26:53 -0800" (space before '-') correctly, because
data.split() in line 43 produces six parts: [dow, date,
month, year, time, timezone] (reduced to five by removing the
initial dow). The function includes a special check for
"+" in the time part, and this patch adds the "-" check.
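
The effect of the space can be checked directly with str.split(), independently of the email package. This is only a small illustration using the two example headers from this report; the variable names are made up for the example.

```python
with_space = "Wed, 02 Mar 2005 09:26:53 -0800".split()
without_space = "Wed, 02 Mar 2005 09:26:53-0800".split()

# With the space, the zone is its own field.
print(len(with_space), with_space[-1])        # 6 -0800

# Without it, the zone stays fused to the time field.
print(len(without_space), without_space[-1])  # 5 09:26:53-0800
```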

I didn't find any date header in my whole email and
newsgroup archive (12095 messages) missing the space before
[-+]. However, if mail clients or servers exist that
produce such date headers, the patch should be applied and the bug
closed.

Notes:
A test should be added too. I updated
test_parsedate_no_dayofweek (line 2076 of
lib/email/test/test_email.py), adding the same test with the
space before '-' dropped; the test fails before the patch and
succeeds after it. Perhaps a separate test case should be included.
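
A separate test case along these lines might look like the sketch below. The class and method names are hypothetical, and it assumes an interpreter whose parsedate_tz accepts a "-" fused to the time, so it is expected to fail on an unpatched version. It uses the example header from this report and checks that the fused form matches the spaced form.

```python
import unittest
from email.utils import parsedate_tz


class ParsedateNoSpaceTest(unittest.TestCase):
    # Hypothetical test case; names are illustrative only.

    def test_no_space_before_negative_offset(self):
        spaced = parsedate_tz('Wed, 02 Mar 2005 09:26:53 -0800')
        fused = parsedate_tz('Wed, 02 Mar 2005 09:26:53-0800')
        # Both forms should yield the same 10-tuple.
        self.assertEqual(fused, spaced)
        # The last element is the UTC offset in seconds (-8 hours).
        self.assertEqual(fused[-1], -8 * 3600)


if __name__ == '__main__':
    unittest.main()
```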
msg24435 - Author: Thomas Herve (therve) Date: 2005-03-20 12:35
In fact, the mails I've seen with this problem are likely to
be spam. 

I just thought it would be more logical to test both "+" and
"-", since "+" was already parsed correctly.
History
Date                 User     Action  Args
2009-02-15 23:56:25  ajaksu2  set     stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.3
2005-03-02 21:03:14  therve   create