Message 51108 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	paul.moore
Recipients
Date	2007-03-20.09:23:25
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to

Content
The documentation isn't clear, but I would say that the code is written on the basis that the file is opened in text or universal newline mode (so that CRLF will ve translated to \n). On that assumption, the patch is unnecessary, as CRLF will never appear in the data. (Actually, the existence of Utils.fix_eols makes me wonder...) On the other hand, 99% of the time, the code will work fine if a CRLF file is opened in binary mode. This patch fixes one of the obscure corner cases, and (as far as I can see) does no real harm - I can't imagine a real mail file containing literal CR bytes which should be preserved! The documentation should be updated to require that the file is in text or universal newline mode, and note that files with CRLF opened in binary mode could be processed incorrectly. I'm probably +0 on applying this patch, on the basis that it makes the code more robust given the current lack of any documentation of the required file mode. If the documentation is patched as above, I'd be -0. If the rest of the email module is known not to robust in the face of files with CRLF opened in binary mode, the documentation should definitely be changed, and that fact made very clear. (This patch becomes irrelevant then). BTW, the test case looks wrong - surely the line 'self.assertNotEqual(data[8193], "\n")' should check for a \r? I assume it's asserting that the line after the break is not the end of the headers - if so, that line will be \r\n as the earlier translation changes LF to CRLF. Summary: CRLF handling is a bit of a can of worms - the module documentation could do with being clearer in any case, but the patch is probably good to apply on the basis that it fixes a small corner case without causing harm elsewhere. The test should be fixed as above.

The documentation isn't clear, but I would say that the code is written on the basis that the file is opened in text or universal newline mode (so that CRLF will ve translated to \n). On that assumption, the patch is unnecessary, as CRLF will never appear in the data. (Actually, the existence of Utils.fix_eols makes me wonder...)

On the other hand, 99% of the time, the code will work fine if a CRLF file is opened in binary mode. This patch fixes one of the obscure corner cases, and (as far as I can see) does no real harm - I can't imagine a *real* mail file containing literal CR bytes which should be preserved!

The documentation should be updated to require that the file is in text or universal newline mode, and note that files with CRLF opened in binary mode could be processed incorrectly.

I'm probably +0 on applying this patch, on the basis that it makes the code more robust given the current lack of any documentation of the required file mode. If the documentation is patched as above, I'd be -0.

If the rest of the email module is known not to robust in the face of files with CRLF opened in binary mode, the documentation should definitely be changed, and that fact made very clear. (This patch becomes irrelevant then).

BTW, the test case looks wrong - surely the line 'self.assertNotEqual(data[8193], "\n")' should check for a \r? I assume it's asserting that the line after the break is not the end of the headers - if so, that line will be \r\n as the earlier translation changes LF to CRLF.

Summary: CRLF handling is a bit of a can of worms - the module documentation could do with being clearer in any case, but the patch is probably good to apply on the basis that it fixes a small corner case without causing harm elsewhere. The test should be fixed as above.

History
Date	User	Action	Args
2007-08-23 15:54:38	admin	link	issue1555570 messages
2007-08-23 15:54:38	admin	create