classification
Title: mailbox.MH chokes on directories without .mh_sequences
Type: enhancement Stage: patch review
Components: email, Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: barry, gumnos, r.david.murray, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2014-09-01 00:44 by gumnos, last changed 2017-04-16 16:58 by r.david.murray.

Files
File name Uploaded Description
mailbox_mh_sequences.diff gumnos, 2014-09-01 00:52 A patch that wraps the open() call in a try/except/else block
mailbox_mh_sequences_lbyl.diff gumnos, 2014-09-01 00:57 A look-before-you-leap version of the patch that is much simpler, but has a minor race-condition compared to the other one.
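The two attachments are not reproduced here. As a rough illustration of the two styles their descriptions name, the sketch below reads a file that may be missing, once with try/except/else and once with a look-before-you-leap check. The function names `read_lines_eafp` and `read_lines_lbyl` are hypothetical and are not taken from either diff; Python 3.3 or later is assumed for FileNotFoundError.

```python
import os
import tempfile


def read_lines_eafp(path):
    """Try to open the file; treat a missing file as empty."""
    try:
        f = open(path, 'r')
    except FileNotFoundError:
        return []
    else:
        with f:
            return f.read().splitlines()


def read_lines_lbyl(path):
    """Check first, then open.

    The file could be removed between the check and open(), which is
    the small race window the second description mentions.
    """
    if not os.path.exists(path):
        return []
    with open(path, 'r') as f:
        return f.read().splitlines()


def demo():
    """Run both helpers against a missing file, then an existing one."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, '.mh_sequences')
        missing = (read_lines_eafp(path), read_lines_lbyl(path))
        with open(path, 'w') as f:
            f.write('unseen: 1 2\n')
        present = (read_lines_eafp(path), read_lines_lbyl(path))
    return missing, present
```

Both helpers return the same result whenever the file does not change during the call; they differ only in how they behave if it disappears at the wrong moment.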
Pull Requests
URL Status Linked
PR 804 open serhiy.storchaka, 2017-03-24 16:09
Messages (8)
msg226197 - Author: Tim Chase (gumnos) Date: 2014-09-01 00:44
If a mailbox.MH() object is created by pointing at a path that exists but doesn't contain a ".mh_sequences" file, it raises an exception upon iteration over .{iter,}items() rather than gracefully assuming that the file is empty.  I encountered this by pointing it at a Claws Mail IMAP-cache folder (Claws Mail claims to store its messages in MH format¹ but doesn't place a .mh_sequences file in those folders).

To replicate:
$ mkdir empty
$ python
>>> import mailbox
>>> for msg in mailbox.MH('empty').values(): pass

I suspect this could simply wrap the "f = open(os.path.join(self._path, '.mh_sequences'), 'r')" and following lines in a check to ignore the file if it doesn't exist (returning the empty "results").
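As a stopgap on the caller's side, the idea above can be sketched as a small subclass. This is only an illustration, not one of the attached patches: `TolerantMH` and `demo` are hypothetical names, and the sketch assumes Python 3.3 or later, where a missing file raises FileNotFoundError.

```python
import mailbox
import os
import tempfile


class TolerantMH(mailbox.MH):
    """Hypothetical subclass: treat a missing .mh_sequences as no sequences."""

    def get_sequences(self):
        try:
            return super().get_sequences()
        except FileNotFoundError:
            # No .mh_sequences file: report an empty name-to-keys mapping.
            return {}


def demo():
    """Build an MH-style folder with one message and no .mh_sequences."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, '1'), 'w') as fh:
            fh.write('From: tim@example.com\nSubject: test\n\nbody\n')
        # create=False: the folder already exists, so nothing is created.
        box = TolerantMH(folder, create=False)
        return [msg['Subject'] for msg in box]
```

Because get_message() looks up sequences through self.get_sequences(), overriding that one method is enough for iteration to work in this sketch. It does nothing for lock() or set_sequences().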

¹ http://www.claws-mail.org/faq/index.php/General_Information#How_does_Claws_Mail_store_mails.3F
msg226228 - Author: Tim Chase (gumnos) Date: 2014-09-01 12:52
I had to tweak the example reproduction code as it seemed to succeed (i.e., fail to demonstrate the problem) in some instances.  The same exception occurs, but here's the full original traceback:


$ cd /home/tim/.claws-mail/imapcache/mail.example.com/tim@example.com/INBOX/

$ python3
Python 3.2.3 (default, Feb 20 2013, 14:44:27) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mailbox
>>> m = mailbox.MH('.')
>>> for msg in m:
...     print(msg)
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/mailbox.py", line 114, in itervalues
    value = self[key]
  File "/usr/lib/python3.2/mailbox.py", line 78, in __getitem__
    return self.get_message(key)
  File "/usr/lib/python3.2/mailbox.py", line 1019, in get_message
    for name, key_list in self.get_sequences().items():
  File "/usr/lib/python3.2/mailbox.py", line 1128, in get_sequences
    f = open(os.path.join(self._path, '.mh_sequences'), 'r')
IOError: [Errno 2] No such file or directory: '/home/tim/.claws-mail/imapcache/mail.example.com/tim@example.com/INBOX/.mh_sequences'
msg290093 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 16:16
I consider this issue a new feature.

PR 804 makes get_sequences() and set_sequences() work when the ".mh_sequences" file does not exist.

The open question is what to do with lock(). Currently it fails if the ".mh_sequences" file does not exist. Is it correct to create the ".mh_sequences" file in lock(), or does this invalidate the lock?

Is it safe to change the file open mode in set_sequences() from "r+" to "w" (the file is truncated later)?
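The difference between the two open modes can be checked in isolation. The sketch below works in a temporary directory and does not touch mailbox.py; `compare_open_modes` is a hypothetical helper name. It only shows the general behaviour of open(): "r+" requires the file to exist, while "w" creates it if needed and truncates it at open time.

```python
import os
import tempfile


def compare_open_modes(folder):
    """Report how 'r+' and 'w' behave for a missing .mh_sequences file."""
    path = os.path.join(folder, '.mh_sequences')
    outcome = {}
    try:
        with open(path, 'r+'):
            outcome['r+'] = 'opened'
    except FileNotFoundError:
        outcome['r+'] = 'FileNotFoundError'
    # 'w' creates the file when it is absent ...
    with open(path, 'w') as f:
        f.write('unseen: 1\n')
    # ... and truncates it as soon as it is opened again.
    with open(path, 'w'):
        pass
    outcome['size_after_w'] = os.path.getsize(path)
    return outcome


def demo():
    """Run the comparison in a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as folder:
        return compare_open_modes(folder)
```

Whether that earlier truncation is acceptable for set_sequences() is the question raised above; the sketch does not answer it.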
msg291738 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-16 06:24
Ping.
msg291749 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-04-16 15:55
Honestly, given the open questions my inclination would be to reject this.
msg291754 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-16 16:48
Do you mean rejecting support for Claws Mail IMAP-cache folders, or just the patch?
msg291757 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-04-16 16:56
The support.  ClawsMail is broken, IMO.
msg291759 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-04-16 16:58
If there is a way to resolve the lock issue we can consider it.  But (without looking at the code again) I suspect the locking is too deeply embedded in the mbox logic for this to be a safe change.
History
Date User Action Args
2017-04-16 16:58:40 r.david.murray set messages: + msg291759
2017-04-16 16:56:06 r.david.murray set messages: + msg291757
2017-04-16 16:48:44 serhiy.storchaka set messages: + msg291754
2017-04-16 15:55:27 r.david.murray set messages: + msg291749
2017-04-16 06:24:35 serhiy.storchaka set messages: + msg291738
2017-03-24 16:16:01 serhiy.storchaka set versions: + Python 3.7, - Python 2.7, Python 3.4, Python 3.5; nosy: + serhiy.storchaka; messages: + msg290093; type: behavior -> enhancement
2017-03-24 16:09:40 serhiy.storchaka set pull_requests: + pull_request709
2014-09-01 12:52:25 gumnos set messages: + msg226228
2014-09-01 01:26:34 r.david.murray set nosy: + barry, r.david.murray; stage: patch review; components: + email; versions: + Python 3.4, Python 3.5, - Python 3.1, Python 3.2
2014-09-01 00:57:22 gumnos set files: + mailbox_mh_sequences_lbyl.diff
2014-09-01 00:52:20 gumnos set files: + mailbox_mh_sequences.diff; keywords: + patch
2014-09-01 00:44:46 gumnos create