classification
Title: Quadratic complexity in b32encode
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: pitrou, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2013-04-21 21:34 by serhiy.storchaka, last changed 2013-05-19 08:52 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
base32_fix_2.patch serhiy.storchaka, 2013-05-18 21:12 Fix quadratic complexity review
base32_optimize_2.patch serhiy.storchaka, 2013-05-18 21:13 Optimize b32encode and b32decode review
Messages (10)
msg187527 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-21 21:34
b32encode accumulates encoded data in a bytes object and this operation has quadratic complexity.

Here is a patch, which fixes this issue by accumulating in a list.
msg187529 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-21 21:42
And here are other patch, which not only fixes an issue with quadratic complexity, but optimize b32encode and b32decode about 2.5 times.

Microbenchmarks:

./python -m timeit -r 1 -n 10 -s "from base64 import b32encode as encode; data = open('python', 'rb').read(1000001)"  "encode(data)"
./python -m timeit -r 1 -n 1 -s "from base64 import b32encode as encode, b32decode as decode; data = encode(open('python', 'rb').read(1000001))"  "decode(data)"

Results:
           First patch  Second patch
b32encode    1.25 sec     486 msec
b32decode    2.08 sec     835 msec
msg189538 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-18 18:45
Note that without patches both tests run many minutes.

base32_fix.patch causes little (5%) performance degradation for small data (< KB), this is perhaps the cause for the use of string concatenation. However base32_optimize.patch speeds up about 3x for such data.
msg189539 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-18 18:46
Note that without patches both tests run many minutes.

base32_fix.patch causes little (5%) performance degradation for small data (< KB), this is perhaps the cause for the use of string concatenation. However base32_optimize.patch speeds up about 3x for such data.
msg189544 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-18 19:53
Using a bytearray, rather than accumulating into a list, may reduce memory consumption.
msg189551 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-18 21:12
Thank you Antoine. Here are updated patches.
msg189554 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-18 22:14
I think 3.3 needs a fix. The quadratic complexity is a bug.
msg189555 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-18 22:15
Then I would recommend committing the simple patch in 3.3 and the more optimized one in 3.4. Both look fine to me, althgugh I haven't really examined the padding stuff.
msg189573 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-19 08:50
New changeset 4b5467d997f1 by Serhiy Storchaka in branch '3.3':
Issue #17812: Fixed quadratic complexity of base64.b32encode().
http://hg.python.org/cpython/rev/4b5467d997f1

New changeset 1b5ef05d6ced by Serhiy Storchaka in branch 'default':
Issue #17812: Fixed quadratic complexity of base64.b32encode().
http://hg.python.org/cpython/rev/1b5ef05d6ced
msg189574 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-19 08:51
That is why I proposed two patches.
History
Date User Action Args
2013-05-19 08:52:26serhiy.storchakasetstatus: open -> closed
assignee: serhiy.storchaka
resolution: fixed
stage: patch review -> resolved
2013-05-19 08:51:48serhiy.storchakasetmessages: + msg189574
versions: + Python 3.3
2013-05-19 08:50:01python-devsetnosy: + python-dev
messages: + msg189573
2013-05-18 22:15:41pitrousetmessages: + msg189555
2013-05-18 22:14:27serhiy.storchakasetmessages: + msg189554
2013-05-18 22:00:10pitrousetversions: - Python 3.3
2013-05-18 21:14:17serhiy.storchakasetfiles: - base32_optimize.patch
2013-05-18 21:14:06serhiy.storchakasetfiles: - base32_fix.patch
2013-05-18 21:13:24serhiy.storchakasetfiles: + base32_optimize_2.patch
2013-05-18 21:12:57serhiy.storchakasetfiles: + base32_fix_2.patch

messages: + msg189551
2013-05-18 19:53:19pitrousetmessages: + msg189544
2013-05-18 18:46:00serhiy.storchakasetmessages: + msg189539
2013-05-18 18:45:24serhiy.storchakasetmessages: + msg189538
2013-04-21 21:42:54serhiy.storchakasetfiles: + base32_optimize.patch
2013-04-21 21:42:00serhiy.storchakasetmessages: + msg187529
2013-04-21 21:34:19serhiy.storchakacreate