I posted benchmarks two years ago, in msg165795. Here are updated results:

$ ./python -m timeit -s "import io; n=100; d=[b'a'*n,b'bb'*n,b'ccc'*n]*10000"  "s=io.BytesIO(); w=s.write"  "for x in d: w(x)"  "s.getvalue()"

Before patch: 10 loops, best of 3: 42.3 msec per loop
After patch: 10 loops, best of 3: 27.6 msec per loop

$ ./python -m timeit -s "import io; n=1000; d=[b'a'*n,b'bb'*n,b'ccc'*n]*1000"  "s=io.BytesIO(); w=s.write"  "for x in d: w(x)"  "s.getvalue()"

Before patch: 10 loops, best of 3: 28.7 msec per loop
After patch: 100 loops, best of 3: 14.8 msec per loop
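
For anyone who prefers a script to the command line, here is a rough equivalent of the two timeit runs above. This is only a sketch: the helper names (make_chunks, write_all, best_msec) are made up for illustration, and the numbers will vary by machine and build.

```python
import io
import timeit


def make_chunks(n, repeat):
    # Same data shape as the -s setup above: three chunk sizes, repeated.
    return [b'a' * n, b'bb' * n, b'ccc' * n] * repeat


def write_all(chunks):
    # The timed statement: many small writes followed by one getvalue().
    s = io.BytesIO()
    w = s.write
    for x in chunks:
        w(x)
    return s.getvalue()


def best_msec(chunks, number=10, repeat=3):
    # Best of `repeat` runs, reported per loop in milliseconds.
    timings = timeit.repeat(lambda: write_all(chunks),
                            number=number, repeat=repeat)
    return min(timings) / number * 1000.0


if __name__ == "__main__":
    for n, rep in ((100, 10000), (1000, 1000)):
        chunks = make_chunks(n, rep)
        print("n=%d: %.1f msec per loop" % (n, best_msec(chunks)))
```

Run it once with the unpatched build and once with the patched build to compare.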

These results don't depend on the resizing factor on Linux. I increased it in the hope that it will help on Windows.
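
To illustrate what the resizing factor controls, here is a generic model of an over-allocating buffer. It is not the code from the patch; count_resizes and the factors passed to it are hypothetical, and it only counts how often a growing buffer would have to be reallocated for a given factor.

```python
def count_resizes(total_size, chunk_size, factor):
    # Simulate appending fixed-size chunks to a buffer that grows
    # geometrically by `factor` whenever it runs out of room.
    capacity = 0
    used = 0
    resizes = 0
    while used < total_size:
        used += chunk_size
        if used > capacity:
            # Grow to at least what is needed, or by the factor if larger.
            capacity = max(used, int(capacity * factor))
            resizes += 1
    return resizes


if __name__ == "__main__":
    for factor in (1.0, 1.5, 2.0):
        print(factor, count_resizes(6000000, 100, factor))
```

A larger factor means fewer reallocations at the cost of more slack memory. Whether that matters in practice depends on how cheap realloc is on the platform.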
