msg397478 - (view) |
Author: Baptiste (Abridbus) |
Date: 2021-07-14 12:44 |
Hello,
When using as_string() on a Reply-To header like the following:
msg['Reply-To'] = '"foo Research, Inc. Foofoo BarBar on Summer Special Friday: 0.50 days (2021-02-31)" <catchall@foobar.exchange.com>'
The double quotes disappear, which leads to a wrong header value.
See attached file for example
|
msg397480 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2021-07-14 13:47 |
There is definitely a problem here, though I see a different problem when I run it (AttributeError: 'Group' object has no attribute 'local_part', presumably because of the ':' not getting escaped correctly). I believe it applies to any address header, not just Reply-To. Unfortunately I don't have time to investigate the cause, at least right now. An interesting first step on diagnosing it might be to produce a minimal example: start deleting special characters from inside that quoted string until you find the one (or ones) that is triggering it.
|
msg397529 - (view) |
Author: Baptiste (Abridbus) |
Date: 2021-07-15 08:11 |
Thanks David,
Here are some other tests I ran.
The issue occurs with:
- msg['Reply-To'] = '"foo Research Inc Foofoo BarBar on Summer Special Friday 050 days (2021-02-31" <catchall@foobar.exchange.com>'
- msg['Reply-To'] = '"foo Research Inc Foofoo BarBar on Summer Special Friday 050 days 20210231 " <catchall@foobar.exchange.com>'
But:
msg['Reply-To'] = '"foo Research Inc Foofoo BarBar on Summer Special Friday 050 days 20210231 " <catchall@foobar.exchange.com>'
worked. It looks more related to the length of the name than to the characters used.
|
msg397545 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2021-07-15 12:35 |
Forget what I said about my different error, I made a mistake running the test script.
Interesting. If it is related to the length of the name, then the problem is most likely in the folding algorithm, specifically in what happens when the "display-name" token is wrapped across lines. And indeed, if we clone the SMTP policy and set max_line_length to 1000 in your sample script, it renders the header correctly.
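A minimal sketch of that experiment, assuming the sample script builds an EmailMessage (the attached script is not reproduced in this thread, and the helper name render_reply_to is made up for illustration). Cloning email.policy.SMTP with a larger max_line_length keeps the header on one line, so the folder never has to split the display name:

```python
from email import policy
from email.message import EmailMessage

REPLY_TO = ('"foo Research, Inc. Foofoo BarBar on Summer Special Friday: '
            '0.50 days (2021-02-31)" <catchall@foobar.exchange.com>')


def render_reply_to(value, max_line_length=1000):
    """Serialize a message whose only header is Reply-To (hypothetical helper)."""
    # Clone the stock SMTP policy; only the line length limit is changed.
    wide_policy = policy.SMTP.clone(max_line_length=max_line_length)
    msg = EmailMessage(policy=wide_policy)
    msg['Reply-To'] = value
    return msg.as_string()


if __name__ == '__main__':
    # The quoted display name should come back intact on a single line.
    print(render_reply_to(REPLY_TO))
```

This only avoids the folding path; it does not change how the folder treats quoted strings once a header is longer than the configured limit.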
The problem here is that the surrounding quotation marks are added by the 'value' property of DisplayName, but that property isn't invoked when handling parts of the display name separately during multi-line folding. I was always bothered by the handling of the quotation marks in the part of the parser and folder dealing with quoted strings, but I never hit on a better way to do it. This, unfortunately, is going to be a non-trivial problem to solve. It is probably going to require an ugly hack in the folding code :(
Really, the handling of quoted strings throughout the _header_value_parser code is...a hack :( There are probably other places where it breaks down during multi-line folding. If we are lucky the hack can just add special handling for the quoted-string token type in the folder. If we aren't it will get messier :(
Glancing at the folder code (it's been a long time since I worked on it), one possible approach (not necessarily the best one) would be to mark the first and last sub-tokens in a quoted-string so that the folder knows to put in a leading or trailing quote mark, respectively, during folding.
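To see why the quotation marks go missing when the sub-tokens are handled one by one, the parse tree can be inspected directly. This is only a sketch: email._header_value_parser is a private module whose internals may change between Python versions, and find_tokens is a hypothetical helper written for this illustration. It shows that the quotes appear when the bare-quoted-string token is rendered as a whole, but are not part of any of its children:

```python
from email import _header_value_parser as parser  # private module, may change

VALUE = ('"foo Research, Inc. Foofoo BarBar on Summer Special Friday: '
         '0.50 days (2021-02-31)" <catchall@foobar.exchange.com>')


def find_tokens(tree, token_type):
    """Collect every token of the given type, depth-first (hypothetical helper)."""
    found = []
    if getattr(tree, 'token_type', None) == token_type:
        found.append(tree)
    if isinstance(tree, list):  # token lists subclass list; terminals are str
        for child in tree:
            found.extend(find_tokens(child, token_type))
    return found


address_list, _ = parser.get_address_list(VALUE)
bare = find_tokens(address_list, 'bare-quoted-string')[0]

whole = str(bare)                                 # rendered as one unit
pieces = ''.join(str(token) for token in bare)    # rendered child by child

print(whole)
print(pieces)
```

Any code that walks the children individually, as a folder has to when the token does not fit on one line, therefore needs to put the quotes back itself.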
|
msg397546 - (view) |
Author: Julien Castiaux (drlazor8) * |
Date: 2021-07-15 13:15 |
Hello David,
I'm working in the same company as Baptiste and I'm trying to solve the problem. The issue is indeed related to the folding algorithm: the DBQUOTE character is lost in the parse_tree AST, so when the folding algorithm splits the children to find a sweet spot to split the line, it doesn't re-introduce the DBQUOTE and instead injects the content of the BareQuotedString right away.
I'm working on a fix which consists of adding two DBQUOTE tokens, one at the beginning and one at the end, to the BareQuotedString token when it is created (_header_value_parser.py@get_bare_quoted_string()). I was inspired by how the angle brackets < and > are injected around the AddrSpec token in an AngleAddr token.
Right now my fix isn't correct; some unit tests are failing. I'm trying to get it working and hopefully get back to you with a nice pull-request :)
Regards,
Julien
|
msg397555 - (view) |
Author: Julien Castiaux (drlazor8) * |
Date: 2021-07-15 14:48 |
Update, it works fine with the compat32 policy
|
msg397563 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2021-07-15 16:15 |
Yes, compat32 uses a different parser and folder (the legacy ones), that have a lot of small bugs relative to the RFCs (which is why I rewrote it).
|
msg399390 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-11 12:57 |
I change the issue type to security. The bug can be abused to send emails to the wrong email address.
|
msg399393 - (view) |
Author: Julien Castiaux (drlazor8) * |
Date: 2021-08-11 13:39 |
Hello David, Victor,
Thank you for the triage, it reminds me about this issue. David, the
solution I tried last month was wrong, it was breaking (for good
reasons) tons of unittests. It seems to me that there is indeed no other
solution than to bloat the re-folding function a bit more and to fix the
dbquotes there as your last email suggested.
I agree with you that the code will be even messier, honestly I spent
quite some time understanding the _refold_parse_tree function and I
don't feel like patching it.
Regards,
On 11.08.21 14:57, STINNER Victor wrote:
> STINNER Victor <vstinner@python.org> added the comment:
>
> I change the issue type to security. The bug can be abused to send emails to the wrong email address.
>
> ----------
> nosy: +vstinner
> type: behavior -> security
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue44637>
> _______________________________________
|
msg407545 - (view) |
Author: Alexander Mohr (thehesiod) * |
Date: 2021-12-02 20:44 |
btw my work-around was to set maxheaderlen=sys.maxsize, worked for AWS SES at least
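A rough sketch of that work-around, assuming the message is an EmailMessage using the default policy (the helper name serialize_unfolded is invented for this example). Passing maxheaderlen to as_string() overrides the policy's max_line_length for that one call, so headers are written without folding:

```python
import sys
from email.message import EmailMessage

REPLY_TO = ('"foo Research, Inc. Foofoo BarBar on Summer Special Friday: '
            '0.50 days (2021-02-31)" <catchall@foobar.exchange.com>')


def serialize_unfolded(msg):
    """Serialize msg with header folding effectively disabled (hypothetical helper)."""
    # maxheaderlen replaces the policy's max_line_length for this call only.
    return msg.as_string(maxheaderlen=sys.maxsize)


msg = EmailMessage()
msg['Reply-To'] = REPLY_TO
text = serialize_unfolded(msg)
print(text)
```

The trade-off is that very long headers are emitted on a single line, which can exceed the 998-character line limit in RFC 5322, so whether this is acceptable depends on the receiving mail system.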
|
msg407901 - (view) |
Author: Julien Castiaux (drlazor8) * |
Date: 2021-12-07 08:56 |
Hello there,
There is a pull-request on github. I had to modify `_refold_parse_tree` but I could keep the diff quite small. It is properly tested and it is awaiting review :)
We have a patch at work so it is *absolutely not* urgent, feel free to review it *anytime*. Since we are using the Ubuntu LTS version of python, we might be interested in a backport down to 3.7. Quite honestly, I'm happy it was flagged as a security issue :D
|
msg411492 - (view) |
Author: Julien Castiaux (drlazor8) * |
Date: 2022-01-24 16:37 |
Hello there,
Friendly reminder that this issue is still open and that there is a pull request ready. We continue to face the issue in production and our customers are getting upset.
Can you provide us with a schedule for when this issue will be addressed, so that we can decide either to wait or to start thinking about possible mitigations on our side?
Regards,
Julien
|
|
Date | User | Action | Args |
2022-04-11 14:59:47 | admin | set | github: 88803 |
2022-01-24 16:37:33 | drlazor8 | set | messages: + msg411492 |
2021-12-07 08:56:53 | drlazor8 | set | messages: + msg407901 |
2021-12-02 20:44:28 | thehesiod | set | messages: + msg407545 |
2021-12-01 16:29:24 | drlazor8 | set | keywords: + patch; stage: patch review; pull_requests: + pull_request28107 |
2021-11-30 15:19:04 | vstinner | set | nosy: - vstinner |
2021-11-30 14:59:41 | r.david.murray | set | nosy: + thehesiod; title: Quoting issue on header Reply-To -> Quoting issue on header Reply-To and other address headers |
2021-08-11 13:39:30 | drlazor8 | set | messages: + msg399393 |
2021-08-11 12:57:41 | vstinner | set | type: behavior -> security; messages: + msg399390; nosy: + vstinner |
2021-07-15 16:15:18 | r.david.murray | set | messages: + msg397563 |
2021-07-15 14:48:35 | drlazor8 | set | messages: + msg397555 |
2021-07-15 13:15:28 | drlazor8 | set | messages: + msg397546 |
2021-07-15 12:35:42 | r.david.murray | set | messages: + msg397545 |
2021-07-15 08:11:45 | Abridbus | set | messages: + msg397529 |
2021-07-14 13:47:17 | r.david.murray | set | messages: + msg397480 |
2021-07-14 12:44:57 | Abridbus | create | |