classification
Title: Message.get_filename produces exception if the RFC2231 encoding is ill-formed
Type: behavior Stage: needs patch
Components: email Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Ankur.Ankan, barry, karlcow, python-dev, r.david.murray
Priority: normal Keywords: easy, patch

Created on 2013-03-06 22:25 by r.david.murray, last changed 2016-02-28 19:26 by mj.

Files
File name Uploaded Description
collapse_rfc2231_value.patch r.david.murray, 2013-03-07 20:03
rfc2231_issue17369_in_progress.patch r.david.murray, 2013-04-11 19:53
Messages (7)
msg183619 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-06 22:25
>>> m = message_from_string("Content-Disposition: attachment; filename*0*="can't decode this filename")
>>> m.get_filename()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdmurray/python/p32/Lib/email/message.py", line 752, in get_filename
    return utils.collapse_rfc2231_value(filename).strip()
  File "/home/rdmurray/python/p32/Lib/email/utils.py", line 303, in collapse_rfc2231_value
    return str(rawbytes, charset, errors)
TypeError: str() argument 2 must be str, not None
msg183630 - Author: karl (karlcow) * Date: 2013-03-07 03:20
r.david.murray,

How did you enter the first line without a syntax error?

>>> import email.message
>>> m = message_from_string("Content-Disposition: attachment; filename*0*="can't decode this filename")
  File "<stdin>", line 1
    m = message_from_string("Content-Disposition: attachment; filename*0*="can't decode this filename")
                                                                             ^
SyntaxError: invalid syntax
msg183631 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-07 03:28
Heh.  I used """ in the original, and edited it when I posted it "for conciseness".  Sigh.  My apologies.
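For anyone trying to reproduce this, the following is a minimal sketch of the triple-quoted form described above. The trailing blank line and the try/except wrapper are additions for illustration and were not part of the original report. On affected versions the call is expected to raise the TypeError shown in the first message; on versions that include a fix it should return a string instead.

```python
from email import message_from_string

# Triple quotes let the header value contain both " and ' without escaping.
raw = """Content-Disposition: attachment; filename*0*="can't decode this filename"

"""

m = message_from_string(raw)
try:
    filename = m.get_filename()
except TypeError as exc:
    # Affected versions fail here because no charset could be parsed
    # from the ill-formed RFC 2231 value.
    filename = None
    print("get_filename() failed:", exc)
else:
    print("get_filename() returned:", repr(filename))
```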
msg183704 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-07 20:03
Here's a patch.  Note that this fixes a regression relative to Python2, where fallback_charset was used in this case.
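The fallback idea can be sketched roughly as follows. This is not the attached patch; `collapse_with_fallback` is a hypothetical stand-in written for illustration, and it assumes the value arrives as a `(charset, language, text)` tuple in which `charset` is None when the RFC 2231 syntax could not be parsed (which is what `email.utils.decode_rfc2231` returns for a value with fewer than two ticks).

```python
from email.utils import decode_rfc2231


def collapse_with_fallback(value, errors="replace", fallback_charset="us-ascii"):
    """Hypothetical helper: decode a (charset, language, text) tuple,
    using fallback_charset when no charset could be parsed."""
    charset, language, text = value
    if charset is None:
        # Ill-formed RFC 2231 value: fall back instead of passing None to str().
        charset = fallback_charset
    # Reinterpret the characters of the text as raw bytes before decoding.
    rawbytes = bytes(text, "raw-unicode-escape")
    try:
        return str(rawbytes, charset, errors)
    except LookupError:
        # Unknown codec name: return the text undecoded.
        return text


# A value with a single tick cannot be split into charset'language'text,
# so decode_rfc2231 reports the charset and language as None.
parts = tuple(decode_rfc2231("can't decode this filename"))
result = collapse_with_fallback(parts)
print(parts[0], repr(result))
```

With the fallback in place, the ill-formed value is decoded as us-ascii rather than triggering the TypeError from `str(rawbytes, None, errors)`.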
msg186585 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-04-11 19:53
It turns out the new header parsing code also missed this error recovery.  Or, rather, it recovers from it by completely ignoring the parameter with the bad syntax. 

I haven't worked out a solution that does more useful error recovery yet, but I'm posting an updated patch with tests for the new parser and something that may (or may not!) be heading in the direction of a fix, in case anyone else wants to work on it before I get back to it.
msg210538 - Author: Roundup Robot (python-dev) Date: 2014-02-07 20:05
New changeset 63f8ea0eeb6d by R David Murray in branch '3.3':
#17369: Improve handling of broken RFC2231 values in get_filename.
http://hg.python.org/cpython/rev/63f8ea0eeb6d

New changeset e0a90b1c4cdf by R David Murray in branch 'default':
Merge: #17369: Improve handling of broken RFC2231 values in get_filename.
http://hg.python.org/cpython/rev/e0a90b1c4cdf
msg210539 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-02-07 20:07
I've applied the first fix.  I'll leave the issue open until I make the equivalent fix for the new header parsing code.
History
Date User Action Args
2016-02-28 19:26:50 mj set components: + email, - Library (Lib)
versions: - Python 3.3, Python 3.4
2016-02-28 19:19:27 mj set versions: - Python 3.5, Python 3.6
2016-02-28 19:18:41 mj set components: + Library (Lib), - email
versions: + Python 3.5, Python 3.6
2014-02-07 20:07:09 r.david.murray set stage: patch review -> needs patch
messages: + msg210539
versions: - Python 3.2
2014-02-07 20:05:02 python-dev set nosy: + python-dev
messages: + msg210538
2013-04-12 20:26:10 Ankur.Ankan set nosy: + Ankur.Ankan
2013-04-11 19:53:36 r.david.murray set files: + rfc2231_issue17369_in_progress.patch
messages: + msg186585
2013-03-07 20:03:17 r.david.murray set stage: needs patch -> patch review
2013-03-07 20:03:05 r.david.murray set files: + collapse_rfc2231_value.patch
keywords: + patch
messages: + msg183704
versions: - Python 2.7
2013-03-07 03:28:57 r.david.murray set nosy: + barry
components: + email
2013-03-07 03:28:31 r.david.murray set messages: + msg183631
2013-03-07 03:20:24 karlcow set nosy: + karlcow
messages: + msg183630
2013-03-06 22:25:40 r.david.murray create