Optimize bytearray(int) constructor to use calloc() #65843

vstinner · 2014-06-02T20:25:58Z

BPO	21644
Nosy	@pitrou, @vstinner, @skrah, @bmerry

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2015-03-18.13:28:53.038>
created_at = <Date 2014-06-02.20:25:58.126>
labels = ['performance']
title = 'Optimize bytearray(int) constructor to use calloc()'
updated_at = <Date 2021-11-14.13:34:15.379>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2021-11-14.13:34:15.379>
actor = 'bmerry'
assignee = 'none'
closed = True
closed_date = <Date 2015-03-18.13:28:53.038>
closer = 'vstinner'
components = []
creation = <Date 2014-06-02.20:25:58.126>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 21644
keywords = []
message_count = 9.0
messages = ['219632', '219633', '219636', '219672', '223642', '238434', '376942', '405945', '406321']
nosy_count = 4.0
nosy_names = ['pitrou', 'vstinner', 'skrah', 'bmerry']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = None
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue21644'
versions = ['Python 3.5']

vstinner · 2014-06-02T20:25:58Z

Python 3.5 has a new PyObject_Calloc() function which can be used for fast memory allocation of memory block initialized with zeros.

I already implemented an optimization, but Stefan Krah found issues in my change:
http://bugs.python.org/issue21233#msg217826

I reverted the optimization in the changeset dff6b4b61cac.

vstinner · 2014-06-02T20:26:52Z

Stefan also wrote:

"3) Somewhat similarly, I wonder if it was necessary to refactor
PyBytes_FromStringAndSize(). I find the new version more difficult
to understand."

This issue can also be addressed here.

vstinner · 2014-06-02T20:30:28Z

Stefan wrote:
"3) Somewhat similarly, I wonder if it was necessary to refactor
PyBytes_FromStringAndSize(). I find the new version more difficult
to understand."

Do you mean that the optimization is useless or that the implementation should be changed?

pitrou · 2014-06-03T09:08:10Z

Responding to a comment in bpo-21233:

Before:
=======

>        >>> x = bytearray(0)
>        >>> m = memoryview(x)
>        >>> x.__init__(10)
>        Traceback (most recent call last):
>          File "<stdin>", line 1, in <module>
>        BufferError: Existing exports of data: object cannot be re-sized

I don't think such use cases are supported. Generally, reinitializing an
object by calling __init__() explicitly is not well-defined, except when
advertised explicitly in the documentation. The only property we should
guarantee here is that it doesn't lead to an inconsistent object state, or to
hard crashes. Raising an exception is fine, and changing the raised
exception to another one should be fine as well.

vstinner · 2014-07-22T07:49:23Z

See also issue bpo-22030 (set).

vstinner · 2015-03-18T13:28:53Z

I'm not interested to work on this optimization, so I just close the issue.

bmerry · 2020-09-15T15:27:19Z

Was this abandoned just because nobody had the time, or was there a problem with the approach? I independently wanted this optimisation, and have ended up implementing something very similar to what was reverted in https://hg.python.org/lookup/dff6b4b61cac.

In a benchmark that creates a large bytearray, then fills it with socket.readinto, I'm seeing a 2x performance improvement on Linux, and from some quick benchmarking it seems to be just as fast as the old code for small arrays that are allocated from the pool.

vstinner · 2021-11-08T12:47:36Z

I abandonned the issue because I didn't have time to work on it. If you want, you can open a new issue for that.

bmerry · 2021-11-14T13:34:15Z

I abandonned the issue because I didn't have time to work on it. If you want, you can open a new issue for that.

If I make a pull request and run some microbenchmarks, will you (or some other core dev) have time to review it? I've had a bad experience before with a PR that I'm still unable to get reviewed after several years, so I'd like to get at least a tentative agreement before I invest time in it.

vstinner added the performance Performance or resource usage label Jun 2, 2014

vstinner closed this as completed Mar 18, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize bytearray(int) constructor to use calloc() #65843

Optimize bytearray(int) constructor to use calloc() #65843

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

pitrou commented Jun 3, 2014

vstinner commented Jul 22, 2014

vstinner commented Mar 18, 2015

bmerry mannequin commented Sep 15, 2020

vstinner commented Nov 8, 2021

bmerry mannequin commented Nov 14, 2021

Optimize bytearray(int) constructor to use calloc() #65843

Optimize bytearray(int) constructor to use calloc() #65843

Comments

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

pitrou commented Jun 3, 2014

vstinner commented Jul 22, 2014

vstinner commented Mar 18, 2015

bmerry mannequin commented Sep 15, 2020

vstinner commented Nov 8, 2021

bmerry mannequin commented Nov 14, 2021