New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
set_payload does not handle binary payloads correctly #62524
Comments
In order to maintain model consistency without exposing the need for 'surrogateescape' to library users, it should be possible to pass binary data to set_payload and have it do the correct conversion to the expected storage format for the model. Currently, this does not work. The attached patch provides one example test out of a class of tests that should be written and made to pass. |
Here is the preliminary patch for email module to pass the test. |
Thanks, but the patch is incorrect. The model consistently stores its data as surrogateescaped strings, and this assumption is baked in to other parts of the code. So the correct fix is to do the surrogateescape encoding at the time the payload is set. It might in fact be better to store a binary payload as binary, but making that kind of change to the model requires much more extensive review and testing. |
I see. Thanks for the explanation. I'll do this patch if nobody is interested. |
If you want to work on it that would be great. Note that one of the things that is needed is a bunch more tests of setting various *kinds* of binary payload, including ones containing non-ascii data, and making sure the right thing happens when the payload is later fetched/serialized. |
Here is the patch for this ticket. David Murray, am I on the right path? If yes, I'll put more robust tests, such as the ones with Asian encodings and unusual encodings. |
Sorry, got typo for the last patch. |
It looks like you are still patching get_payload. This should be a really simple patch against set_payload. It occurs to me that there could be a backward compatibility concern if passing binary to set_payload currently actually works in some cases, so we definitely needs a bunch of test that do that to make sure they all fail before we fix the bug :) |
"It looks like you are still patching get_payload. This should be a really simple patch against set_payload." Okay, do I get it right at this time? About your second point, I need more time to think about it. |
Yes, that's what I had in mind. |
Here is the third version of the patch. I am not sure what to do with the invalid data for base64 and uuencode. I decided to raise exception instead of converting it to None silently. |
New changeset 64e004737837 by R David Murray in branch '3.3': New changeset a4afcf93ef7b by R David Murray in branch 'default': |
Thanks, Vajrasky. The v2 patch was almost correct. What you couldn't know without being as deeply enmeshed in this code as I am is that the test failures from the encoders module were actually invalid. We'd previously "fixed" them, but the fixes were incorrectly compensating for this bug in set_payload. Once this bug was fixed, those "fixes" just needed to be backed out, a 'decode=True' added to their get_payload calls, and all the tests pass. I *think* this is the last inconsistency in the model. I hope. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: