Issue1823
Created on 2008-01-14 11:17 by Sharebear, last changed 2008-01-19 18:40 by Sharebear.
| Messages | |||
|---|---|---|---|
| msg59896 (view) | Author: Jonathan Share (Sharebear) | Date: 2008-01-14 11:17 | |
Steps to Reproduce
==================
>>> from email.mime.multipart import MIMEMultipart
>>> from email.mime.text import MIMEText
>>> multipart = MIMEMultipart()
>>> multipart.set_charset('UTF-8')
>>> text = MIMEText("sample text")
>>> multipart.attach(text)
>>> print multipart.as_string()
MIME-Version: 1.0
Content-Type: multipart/mixed; charset="utf-8";
boundary="===============0973828728=="
Content-Transfer-Encoding: base64
--===============0973828728==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
sample text
--===============0973828728==--
>>> multipart = MIMEMultipart()
>>> multipart.attach(text)
>>> multipart.set_charset('UTF-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/
email/message.py", line 262, in set_charset
self._payload = charset.body_encode(self._payload)
File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/
email/charset.py", line 384, in body_encode
return email.base64mime.body_encode(s)
File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/
email/base64mime.py", line 148, in encode
enc = b2a_base64(s[i:i + max_unencoded])
TypeError: b2a_base64() argument 1 must be string or read-only buffer,
not list
Explanation
===========
The first example above demonstrates that if you call set_charset('UTF-
8') on a MIMEMultipart instance before adding child parts then it is
possible to generate a multipart/* message with an illegal Content-
Transfer-Encoding as specified by RFC 2045[1] "If an entity is
of type "multipart" the Content-Transfer-Encoding is not permitted to
have any value other than "7bit", "8bit" or "binary"."
In the second example, I demonstrate that if you try and call
set_charset after adding child parts, the code exceptions. The user
should at least be provided with a more targeted exception.
Notes
=====
Where should this be fixed? The smallest fix would be to add a check to
set_charset to see if it is dealing with with a multipart message but
as I express in issue1822 I feel the better design would be to move
this subtype specific logic into the appropriate subclass.
Again, this is something I'm willing to work on in next saturday's bug
day if I can get some feedback on my architectural concerns.
[1] http://tools.ietf.org/html/rfc2045#section-6.4
|
|||
| msg59922 (view) | Author: Martin v. Löwis (loewis) | Date: 2008-01-14 21:36 | |
I'd like to question whether anything needs to be fixed at all, i.e. whether it is the responsibility of the email package to reject all kinds of non-sensical data. Garbage in, garbage out. Barry, can you take a look? |
|||
| msg60139 (view) | Author: Jonathan Share (Sharebear) | Date: 2008-01-19 10:02 | |
Martin, I can almost agree with you _if_ I was setting the Content-Transfer- Encoding myself, however I am not. I am setting the charset and the library chooses an appropriate Content-Transfer-Encoding to represent the mime part with. Currently I can't see any way other than reading the source or writing a test case (and that would require understanding what the email.mime module was doing "under the hood") for a developer to find out which Content-Transfer-Encoding was going to be used. Also, just from a usability point of view I would expect that creating an invalid mime part would be a little more difficult. Especially considering the fix should be as small as adding "if not encoding in valid encodings: raise SensibleException". |
|||
| msg60155 (view) | Author: Christian Heimes (christian.heimes) | Date: 2008-01-19 12:44 | |
Please provide an unit test which verifies the bug and a fix for the bug. |
|||
| msg60189 (view) | Author: Barry A. Warsaw (barry) | Date: 2008-01-19 16:28 | |
You're right that this should probably be fixed in the subclass, but you also have to remember that the parser generally doesn't create subclass instances. It only creates instances of Message. As long as you can make it work properly with the parser and generator, I'm okay with overriding set_charset() in the subclass to do the right thing. |
|||
| msg60193 (view) | Author: Jonathan Share (Sharebear) | Date: 2008-01-19 16:48 | |
I'm beginning to realise this is slightly bigger than I first thought ;-) Trying to make a nice test case for this issue, I thought it would be a good idea for the parser to register a defect for invalid content- transfer-encoding so I can test against that in the test case rather than fragile substring tests. Unfortunately the parser code isn't the easiest code to get your head around on a first look. |
|||
| msg60208 (view) | Author: Jonathan Share (Sharebear) | Date: 2008-01-19 18:40 | |
Run out of time to look at this today. In order to write a nice test case for this issue I need the parser to notice this error in messages. I've filed issue1874 for the parser not reporting the invalid cte in the msg.defects |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2008-01-19 18:40:59 | Sharebear | set | messages: + msg60208 |
| 2008-01-19 16:48:11 | Sharebear | set | messages: + msg60193 |
| 2008-01-19 16:28:05 | barry | set | messages: + msg60189 |
| 2008-01-19 12:44:11 | christian.heimes | set | priority: low nosy: + christian.heimes messages: + msg60155 |
| 2008-01-19 10:02:41 | Sharebear | set | messages: + msg60139 |
| 2008-01-14 21:36:04 | loewis | set | assignee: barry messages: + msg59922 nosy: + barry, loewis |
| 2008-01-14 11:17:11 | Sharebear | create | |