classification
Title: Add a datatype to represent mime types to the email module
Type: enhancement
Stage: needs patch
Components: email
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, ezio.melotti, martin.panter, pitrou, r.david.murray
Priority: normal
Keywords: easy

Created on 2013-10-17 14:13 by r.david.murray, last changed 2014-09-05 23:52 by martin.panter.

Messages (3)
msg200128 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-10-17 14:13
In issue 18891, Stephen Turnbull wondered whether adding a datatype to represent MIME types would be worthwhile.

I think it would be.  A MIME type is a pair (maintype/subtype), and while one may test the two parts independently in logic, what gets represented and passed from place to place in the code is always the pair.  Most importantly, having a datatype to represent this would eliminate a common class of errors: forgetting to compare the component strings case-insensitively.  If one is manipulating a Message object, the get_xxx methods used to access the MIME type do perform case coercion, but within the email code itself there are a number of places where the raw strings are manipulated, and I have already made, discovered, and fixed case-insensitivity bugs in that code.

It is not clear at this point whether the object should be exposed publicly, though I lean that way.  I'd propose using a string subclass with maintype and subtype attributes; this object could then be returned by get_content_type without breaking backward compatibility.  Another advantage of using a string subclass is that the original casing of the values is easily retained and easily accessible, which, while not critical, is something the email package normally does (it preserves the case of the original data).
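
To make the proposal concrete, here is a minimal sketch of what such a string subclass might look like. The class name MimeType, the lowercased maintype and subtype attributes, and the case-insensitive comparison methods are all assumptions made for illustration; nothing like this exists in the email package, and the issue does not settle any of these details.

```python
class MimeType(str):
    """Hypothetical str subclass for a 'maintype/subtype' pair.

    The string value keeps the original casing; comparisons ignore case.
    """

    def __new__(cls, value):
        maintype, sep, subtype = value.partition('/')
        if not (sep and maintype and subtype):
            raise ValueError('not a maintype/subtype pair: %r' % (value,))
        self = super().__new__(cls, value)
        # Assumption: the attributes are normalized to lower case.
        self.maintype = maintype.lower()
        self.subtype = subtype.lower()
        return self

    def __eq__(self, other):
        if isinstance(other, str):
            return str.lower(self) == str.lower(other)
        return NotImplemented

    def __ne__(self, other):
        # str defines its own __ne__, so it must be overridden as well.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Defining __eq__ disables the inherited hash; restore one that
        # agrees with the case-insensitive comparison.
        return hash(str.lower(self))


mt = MimeType('Text/HTML')
print(mt == 'text/html')        # case-insensitive comparison
print(str(mt))                  # original casing is still available
print(mt.maintype, mt.subtype)  # normalized components
```

One wrinkle the sketch exposes: once equality ignores case, the hash has to follow, so MimeType('Text/HTML') compares equal to the plain string 'Text/HTML' but generally does not hash like it. Mixing the two kinds of object as dict keys or set members could therefore behave unexpectedly.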
msg200265 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2013-10-18 14:45
I don't know much about the email module, but FWIW I think str subclasses (or any subclass of built-in types) are a delicate thing to expose in an API. I think a namedtuple would be the more idiomatic choice here (perhaps with an appropriate __str__ for convenient conversion / %-formatting / {}-formatting).
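
For comparison, the namedtuple alternative could be sketched roughly as follows. The name MimeTypePair and the parse() helper are hypothetical, chosen only for this example.

```python
from collections import namedtuple


class MimeTypePair(namedtuple('MimeTypePair', ['maintype', 'subtype'])):
    """Hypothetical namedtuple-based MIME type with a convenient __str__."""

    __slots__ = ()

    @classmethod
    def parse(cls, value):
        # Hypothetical helper: split 'maintype/subtype' into its two fields.
        maintype, sep, subtype = value.partition('/')
        if not (sep and maintype and subtype):
            raise ValueError('not a maintype/subtype pair: %r' % (value,))
        return cls(maintype, subtype)

    def __str__(self):
        return '{}/{}'.format(self.maintype, self.subtype)


mt = MimeTypePair.parse('Text/HTML')
print(str(mt))
print('{}'.format(mt))
# A tuple on the right of % is treated as the argument list, so the
# value has to be wrapped in a one-element tuple for %-formatting.
print('%s' % (mt,))
```

Two things are worth noting about this variant. Tuple comparison remains case-sensitive, so the fields would have to be normalized at construction time to get case-insensitive equality. And because the object is a tuple, a bare '%s' % mt raises TypeError rather than calling __str__.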
msg200638 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-10-20 21:17
Well, it's about backward compatibility, and the email module already uses str subclasses for headers in the new code, for backward compatibility reasons.  I hope this does not prove fragile in practice, but I have no way of knowing for sure, of course.

It occurs to me, however, that the (new) content-type header's value already has the maintype/subtype attributes, so there's really no need to change the return type of the get_content_type method.
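
The attributes in question can be seen with the existing API. This is a short sketch, assuming the message is parsed with email.policy.default so that headers are the new header objects; the sample message text is made up for the example.

```python
from email import message_from_string
from email.policy import default

# Made-up sample message with a mixed-case content type.
raw = "Content-Type: Text/HTML; charset=utf-8\n\nbody\n"
msg = message_from_string(raw, policy=default)

header = msg['Content-Type']
print(isinstance(header, str))            # the header object is a str subclass
print(header.maintype, header.subtype)    # components of the content type
print(header.content_type)                # the combined maintype/subtype
print(msg.get_content_type())             # still returns a plain string
```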

For internal use...a named tuple is not adequate, since I need to preserve the original case of the values.
History
Date                 User            Action  Args
2014-09-05 23:52:54  martin.panter   set     nosy: + martin.panter
2013-10-20 21:17:47  r.david.murray  set     messages: + msg200638
2013-10-19 05:18:50  ezio.melotti    set     nosy: + ezio.melotti
2013-10-18 14:45:16  pitrou          set     nosy: + pitrou
                                             messages: + msg200265
2013-10-17 14:13:03  r.david.murray  create