classification
Title: binascii doesn't work on some base64
Type: behavior
Stage: resolved
Components: C API, Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ammar2, kwatsen
Priority: normal
Keywords:

Created on 2020-12-12 21:23 by kwatsen2, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg382922 - Author: Kent Watsen (kwatsen2) Date: 2020-12-12 21:23
[Tested on 3.8.2 and 3.9.0, bug may manifest in other versions too]

The IETF sometimes uses the dummy base64 value "base64encodedvalue==" in specifications in lieu of a block of otherwise meaningless base64.

Even though it is a dummy value, the value should be convertible to binary and back again.  This works using the built-in `base64` command as well as the OpenSSL command line, but binascii is unable to do it.  See below:

$ echo "base64encodedvalue==" | base64 | base64 -d
base64encodedvalue==

$ echo "base64encodedvalue==" | openssl enc -base64 -A | openssl enc -d -base64 -A
base64encodedvalue==

$ printf "import binascii\nprint(binascii.b2a_base64(binascii.a2b_base64('base64encodedvalue=='), newline=False).decode('ascii'))" |  python -
base64encodedvaluQ==

After some investigation, it appears that almost any valid base64 matching the pattern "??==" fails.  For instance:

$ printf "import binascii\nprint(binascii.b2a_base64(binascii.a2b_base64('ue=='), newline=False).decode('ascii'))" |  python -                                                           
uQ==

$ printf "import binascii\nprint(binascii.b2a_base64(binascii.a2b_base64('aa=='), newline=False).decode('ascii'))" |  python -                                                           
aQ==

$ printf "import binascii\nprint(binascii.b2a_base64(binascii.a2b_base64('a0=='), newline=False).decode('ascii'))" |  python -                                                           
aw==


Is this a bug?
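
To gauge how widespread the mismatch is, here is a minimal sketch that tries every two-character group followed by "==" and records which ones survive a decode and re-encode. The helper name `round_trips` and the `ALPHABET` constant are illustrative, not part of binascii, and the sketch assumes the default (non-strict) decoding used in the commands above.

```python
import binascii
import string

# Standard base64 alphabet in index order (an illustrative constant,
# not something binascii exports).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def round_trips(text: str) -> bool:
    """Return True if decoding then re-encoding reproduces `text` exactly."""
    raw = binascii.a2b_base64(text)
    return binascii.b2a_base64(raw, newline=False).decode("ascii") == text


# Try all 64 * 64 groups of the form "xy==".
stable = [
    a + b + "=="
    for a in ALPHABET
    for b in ALPHABET
    if round_trips(a + b + "==")
]

# Collect the second characters that survive the round trip.
surviving_second_chars = sorted({s[1] for s in stable})

print(len(stable))             # 256 of 4096
print(surviving_second_chars)  # ['A', 'Q', 'g', 'w']
```

Only groups whose second character is A, Q, g, or w come back unchanged. Those are the alphabet entries at indexes 0, 16, 32, and 48, that is, the ones whose low four bits are zero.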
msg385537 - Author: Kent Watsen (kwatsen) Date: 2021-01-23 13:52
No activity in 3 weeks.  Selecting a couple components to give it a bump.
msg385789 - Author: Ammar Askar (ammar2) (Python committer) Date: 2021-01-27 16:59
It seems to me that your commands are just sequenced wrong. In Python you're performing (examples in parens):

* base64 (ue==) -> decode to binary (0xB9)
* binary (0xB9) -> encode to base64 (uQ==)

whereas in your command line commands you're doing:

* base64 (ue==) -> encode to double base64 (dWU9PQo=)
* double base64 (dWU9PQo=) -> decode to base64 (ue==)
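
The two orderings above can be sketched side by side in Python. This is only an illustration using binascii; the variable names are made up for the example, and the trailing newline stands in for what `echo` writes to the pipe.

```python
import binascii

# What `echo "ue=="` writes, including the trailing newline.
text = b"ue==\n"

# Order used by the command-line pipelines: encode first, then decode.
double_encoded = binascii.b2a_base64(text, newline=False)
encode_then_decode = binascii.a2b_base64(double_encoded)
print(double_encoded)       # b'dWU9PQo='
print(encode_then_decode)   # b'ue==\n'

# Order used by the Python snippet: decode first, then encode.
raw = binascii.a2b_base64(text)
decode_then_encode = binascii.b2a_base64(raw, newline=False)
print(raw)                  # b'\xb9'
print(decode_then_encode)   # b'uQ=='
```

Encoding first always round-trips, because any byte string can be encoded and then decoded back. Decoding first only round-trips when the input is already in the form an encoder would produce.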


If we do the same thing on the command line as you're doing in Python, we get:

$ echo "base64encodedvalue==" | base64 -d | base64
base64encodedvaluQ==

$ echo "ue==" | base64 -d | base64
uQ==

$ echo "base64encodedvalue==" | openssl enc -d -base64 -A | openssl enc -base64 -A
base64encodedvaluQ==
msg385921 - Author: Kent Watsen (kwatsen) Date: 2021-01-29 17:19
I see.  There are two issues:

1) my `base64` and `openssl` CLI commands were flipped, as you point out, giving a false positive - oops ;)

2) more importantly, the base64 value "ue==" is invalid (there is no binary input that could possibly generate it) and none of the implementations issued a warning or error, which is reasonable IMO.

Thank you for your help.  Please close this issue.
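
The point about "ue==" having no possible binary source can be checked by hand at the bit level. The sketch below is illustrative only: `ALPHABET` and the variable names are assumptions for the example, and the arithmetic follows the usual base64 layout in which a "xy==" group carries 12 bits but only the top 8 form the output byte.

```python
import binascii
import string

# Standard base64 alphabet in index order (illustrative constant).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

first = ALPHABET.index("u")    # 46
second = ALPHABET.index("e")   # 30

# A "xy==" group holds 6 + 6 = 12 bits; the decoder keeps the top 8.
byte = (first << 2) | (second >> 4)

# The low 4 bits of the second character are dropped by the decoder.
discarded = second & 0b1111

# An encoder always writes zeros in those 4 bits, so it would emit this
# character in the second position instead of "e".
canonical_second = ALPHABET[(byte & 0b11) << 4]

print(hex(byte))          # 0xb9
print(discarded)          # 14
print(canonical_second)   # Q

# Cross-check the hand computation against binascii.
print(binascii.a2b_base64("ue==") == bytes([byte]))   # True
```

Because the discarded bits are nonzero, no encoder output can equal "ue=="; the closest encoder output for the same byte is "uQ==".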
History
Date                 User              Action  Args
2022-04-11 14:59:39  admin             set     github: 86794
2021-01-29 18:08:08  serhiy.storchaka  set     status: open -> closed; resolution: not a bug; stage: resolved
2021-01-29 17:19:56  kwatsen           set     messages: + msg385921
2021-01-27 16:59:46  ammar2            set     nosy: + ammar2; messages: + msg385789
2021-01-23 13:52:59  kwatsen           set     nosy: - kwatsen2
2021-01-23 13:52:23  kwatsen           set     nosy: + kwatsen; messages: + msg385537; components: + Library (Lib), C API
2020-12-12 21:23:18  kwatsen2          create