Issue 35559: Optimize base64.b16decode to use compiled regex

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/79740

classification

Title:	Optimize base64.b16decode to use compiled regex
Type:	performance	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.8

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:	xtreak	Nosy List:	barry, djhoulihan, pitrou, scoder, serhiy.storchaka, xtreak
Priority:	normal	Keywords:	patch

Created on 2018-12-22 07:29 by xtreak, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 11287	closed	cheryl.sabella, 2018-12-22 21:22

Messages (10)
msg332330 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2018-12-22 07:29
I came across this as a result of issue35557 and thought to make a new issue to keep the discussion separate. Currently the b16decode function uses a regex with re.search that can be compiled at the module level as a static variable to give up to 30% improvement when executed on Python 3.7. I am proposing a PR for this change since it looks safe to me. $ python3 -m perf compare_to default.json optimized.json --table +--------------------+---------+------------------------------+ \| Benchmark \| default \| optimized \| +====================+=========+==============================+ \| b16decode \| 2.97 us \| 2.03 us: 1.46x faster (-32%) \| +--------------------+---------+------------------------------+ \| b16decode_casefold \| 3.18 us \| 2.19 us: 1.45x faster (-31%) \| +--------------------+---------+------------------------------+ Benchmark script : import perf import re import binascii import base64 _B16DECODE_PAT = re.compile(b'[^0-9A-F]') def b16decode_re_compiled_search(s, casefold=False): s = base64._bytes_from_decode_data(s) if casefold: s = s.upper() if _B16DECODE_PAT.search(s): raise binascii.Error('Non-base16 digit found') return binascii.unhexlify(s) if __name__ == "__main__": hex_data = "806903d098eb50957b1b376385f233bb3a5d54f54191c8536aefee21fc9ba3ca" hex_data_upper = hex_data.upper() assert base64.b16decode(hex_data_upper) == b16decode_re_compiled_search(hex_data_upper) assert base64.b16decode(hex_data, casefold=True) == b16decode_re_compiled_search(hex_data, casefold=True) runner = perf.Runner() if True: # toggle to False for default.json runner.timeit(name="b16decode", stmt="b16decode_re_compiled_search(hex_data_upper)", setup="from __main__ import b16decode_re_compiled_search, hex_data, hex_data_upper") runner.timeit(name="b16decode_casefold", stmt="b16decode_re_compiled_search(hex_data, casefold=True)", setup="from __main__ import b16decode_re_compiled_search, hex_data, hex_data_upper") else: runner.timeit(name="b16decode", stmt="base64.b16decode(hex_data_upper)", setup="from __main__ import hex_data, hex_data_upper; import base64") runner.timeit(name="b16decode_casefold", stmt="base64.b16decode(hex_data, casefold=True)", setup="from __main__ import hex_data, hex_data_upper; import base64")
msg332331 - (view)	Author: Stefan Behnel (scoder) *	Date: 2018-12-22 07:42
One regex related code pattern that I generally like is to assign bound methods to good names and use those. In this case, I would write _has_non_base16_digits = re.compile(b'[^0-9A-F]').search ... if _has_non_base16_digits(s): raise ...
msg332333 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-12-22 07:55
How this affects the import time (use -X importtime)?
msg332334 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2018-12-22 08:03
Thanks @scoder . I took the convention since most of the places I have seen capitalized variable ending with PAT but this looks more readable to me. I have made the changes in my PR.
msg332335 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2018-12-22 08:10
> How this affects the import time (use -X importtime)? Is there reliable way to benchmark this? On multiple runs with regex for python3 -X importtime -c 'import base64' import time: 677 \| 11151 \| base64 On multiple runs without regex for python3 -X importtime -c 'import base64' import time: 551 \| 5726 \| base64
msg332339 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-12-22 08:55
According to your results, you will save 0.94 us per call of b16decode, but loss 126 us at import time. You will get a benefit if b16decode is used more than 130 time. Currently, if the performance is critical, the user can use binascii.unhexlify() directly or use bytes.fromhex(). base64.b16decode() is only needed if you need to ensure that the input strictly conforms to RFC 3548. Other options, which will maximize the performance while keeping the validation is to add an option to binascii.unhexlify() for making it more strict. At the time of writing the base64 module its performance was not considered critical, and additional checks and preprocessing was implemented in Python.
msg332341 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2018-12-22 09:20
Thanks Serhiy, the other issue noted about performance improvement removing casefold and I thought re.search per call to be inefficient. My bad that I didn't consider the cost of moving the compilation to module level that affects import time and about using -X importtime. I agree that the cost is not worthy given that regex is used only inside b16decode. I will keep these factors in mind when I am doing similar sort of work and try to do a better analysis.
msg332372 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-12-22 21:39
I don't think performance of base16 encoding is important enough to increase import times. I would recommend rejection.
msg332563 - (view)	Author: Stefan Behnel (scoder) *	Date: 2018-12-26 21:36
I agree with Antoine. After all, we are optimising a safety check here that runs in linear time. If people want speed, they should consider methods that do not do this check in the first place.
msg332574 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2018-12-27 05:32
Thanks for the feedback. I am closing this as rejected since it's not worth the cost of increasing import time and for performance reasons there are other options as Serhiy noted in msg332339.

History
Date	User	Action	Args
2022-04-11 14:59:09	admin	set	github: 79740
2018-12-27 05:32:23	xtreak	set	status: open -> closed resolution: rejected messages: + msg332574 stage: patch review -> resolved
2018-12-26 21:36:07	scoder	set	messages: + msg332563
2018-12-22 21:39:32	pitrou	set	messages: + msg332372
2018-12-22 21:22:49	cheryl.sabella	set	pull_requests: + pull_request10517
2018-12-22 21:22:22	cheryl.sabella	set	pull_requests: - pull_request10516
2018-12-22 21:21:40	cheryl.sabella	set	keywords: + patch stage: patch review pull_requests: + pull_request10516
2018-12-22 09:20:04	xtreak	set	messages: + msg332341
2018-12-22 08:55:41	serhiy.storchaka	set	nosy: + barry, pitrou messages: + msg332339
2018-12-22 08:10:28	xtreak	set	messages: + msg332335
2018-12-22 08:03:08	xtreak	set	messages: + msg332334
2018-12-22 07:55:19	serhiy.storchaka	set	messages: + msg332333
2018-12-22 07:42:06	scoder	set	nosy: + scoder messages: + msg332331
2018-12-22 07:29:08	xtreak	create