classification
Title: StringIO uses inefficient PyUnicode_AsUCS4
Type: performance Stage: resolved
Components: IO Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: bhavishya, methane, pitrou, vstinner
Priority: normal Keywords:

Created on 2017-06-30 12:26 by methane, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (9)
msg297394 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-06-30 12:26
According to PEP 393, PyUnicode_AsUCS4 is inefficient, and the C implementation of io.StringIO() uses it.

That's why Python 3 is slower than Python 2 on logging_format and logging_simple benchmarks.
https://mail.python.org/pipermail/speed/2017-February/000503.html

Maybe it could use the _PyUnicodeWriter APIs instead.
msg297395 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-30 12:31
There was a discussion about using an adaptive implementation depending on *how* the API is used. Write-only usage is different from write, seek back, write, read, seek, etc.

The idea was to use the unicode writer when it is the most efficient (write-only usage), and to switch to something else (e.g. the current code) when other methods are used.

See bpo-15612.
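
To illustrate the adaptive idea described above, here is a minimal pure-Python sketch. It is not the C implementation discussed in bpo-15612: the class name `AdaptiveStringIO` and its internals are hypothetical, and a plain list of strings stands in for the unicode writer. Writes are accumulated cheaply until a seek or read forces a switch to a regular io.StringIO.

```python
import io


class AdaptiveStringIO:
    """Hypothetical sketch: append-only fast path, full buffer on demand."""

    def __init__(self):
        self._chunks = []    # write-only fast path: pieces to join later
        self._buffer = None  # real io.StringIO, created lazily

    def _materialize(self):
        # Switch from the fast path to a full io.StringIO exactly once.
        if self._buffer is None:
            self._buffer = io.StringIO()
            self._buffer.write("".join(self._chunks))
            self._chunks = None
        return self._buffer

    def write(self, text):
        if self._buffer is None:
            self._chunks.append(text)
            return len(text)
        return self._buffer.write(text)

    def getvalue(self):
        # getvalue() does not need random access, so it stays on the fast path.
        if self._buffer is None:
            return "".join(self._chunks)
        return self._buffer.getvalue()

    def seek(self, pos, whence=0):
        return self._materialize().seek(pos, whence)

    def read(self, size=-1):
        return self._materialize().read(size)
```

A write-only user (the common logging case) never pays for the full buffer, while code that seeks or reads gets ordinary StringIO behaviour after the first such call.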
msg297396 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-30 12:48
I rewrote my old benchmarks using the new perf module API: bench_stringio3.py. This benchmark suite now takes forever with perf, since perf computes many more values and the suite contains a total of 108 benchmarks! Most lines should be commented out to make it run in a reasonable time :-)
msg297397 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-30 12:51
According to my results, computed 5 years ago, the most significant difference was on *reading* from a StringIO which contains UCS1 text:

reader long lines ascii               |  103 us (*) | 33.4 us (-68%)
reader long lines latin1              |  105 us (*) | 34.2 us (-67%)
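
For readers who want to try a comparable measurement, here is a rough sketch of a "reader long lines" micro-benchmark. It is not the original bench_stringio3.py (that file was not attached to this issue); it uses the stdlib timeit module rather than perf, and the line length, line count and helper names are assumptions chosen for illustration.

```python
import io
import timeit


def make_text(char, line_length=1000, lines=100):
    # Build `lines` long lines made of a single repeated character.
    return (char * line_length + "\n") * lines


def read_all_lines(text):
    # Iterate over a StringIO line by line and count the lines read.
    stream = io.StringIO(text)
    return sum(1 for _ in stream)


def bench(text, number=200):
    # Best of 3 runs, reported as seconds per call.
    timings = timeit.repeat(lambda: read_all_lines(text), number=number, repeat=3)
    return min(timings) / number


# Both samples fit in one byte per character (UCS1) under PEP 393.
SAMPLES = {"ascii": make_text("a"), "latin1": make_text("\xe9")}

if __name__ == "__main__":
    for name, text in SAMPLES.items():
        print(f"reader long lines {name}: {bench(text) * 1e6:.1f} us")
```

Absolute numbers will differ from the table above, since the workload and the measurement tool are not the same.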
msg297398 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-06-30 12:55
I'm sorry, it's my mistake.

I used vmprof on macOS and thought as_ucs4 was the bottleneck.
But vmprof on Linux (and perf) shows a totally different result.

Maybe the current vmprof doesn't work well for native code on macOS.
msg297399 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-30 12:58
> I rewrote my old benchmarks using the new perf module API: bench_stringio3.py.

WTF? The file is not attached to this issue, but I removed it locally :-(

It seems like Roundup cleared the file field of this form when I got a conflict when I wanted to post my message...
msg297400 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-06-30 13:33
FYI, https://github.com/python/performance/pull/27 will fix performance regression.
Python 3 performance is similar to Python 2 after s/warn/warning/
msg297401 - (view) Author: Bhavishya (bhavishya) Date: 2017-06-30 13:38
I'm running Arch Linux on a Mac, and I'm not very confident in my system
even though I tried CPU isolation with "isolcpus". So if anyone else can
also run it and confirm whether it actually helps, please do. Thanks.

msg297402 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-30 13:39
"FYI, https://github.com/python/performance/pull/27 will fix performance regression. Python 3 performance is similar to Python 2 after s/warn/warning/"

I was surprised to see that Logger.warn() is slower than Logger.warning()! It is because warn() emits a deprecation warning, which isn't cheap...
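
The effect can be checked with a small sketch like the one below. The logger name and helper functions are hypothetical, and the `getattr` guard is there because the deprecated `Logger.warn()` may not exist in every Python version. The first helper confirms that `warn()` emits a DeprecationWarning on each call, the second times a call with warnings silenced (the warning machinery still runs).

```python
import logging
import timeit
import warnings

logger = logging.getLogger("warn_vs_warning_demo")  # hypothetical name
logger.addHandler(logging.NullHandler())  # discard records, no output
logger.propagate = False
logger.setLevel(logging.WARNING)

# Deprecated alias; guarded in case this Python version no longer has it.
deprecated_warn = getattr(logger, "warn", None)


def count_deprecations(func):
    """Call func once and count the DeprecationWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func("message")
    return sum(issubclass(w.category, DeprecationWarning) for w in caught)


def time_call(func, number=10_000):
    """Total seconds for `number` calls, with warnings ignored."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return timeit.timeit(lambda: func("message"), number=number)


if __name__ == "__main__":
    print("warning():", time_call(logger.warning))
    if deprecated_warn is not None:
        print("warn():   ", time_call(deprecated_warn))
```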
History
Date | User | Action | Args
2022-04-11 14:58:48 | admin | set | github: 74998
2017-06-30 13:39:18 | vstinner | set | messages: + msg297402
2017-06-30 13:38:43 | bhavishya | set | messages: + msg297401
2017-06-30 13:33:44 | methane | set | messages: + msg297400
2017-06-30 12:58:25 | vstinner | set | messages: + msg297399
2017-06-30 12:55:32 | methane | set | status: open -> closed; resolution: not a bug; messages: + msg297398; stage: resolved
2017-06-30 12:51:55 | vstinner | set | messages: + msg297397
2017-06-30 12:48:28 | vstinner | set | messages: + msg297396
2017-06-30 12:41:52 | bhavishya | set | nosy: + bhavishya
2017-06-30 12:31:05 | vstinner | set | nosy: + vstinner, pitrou; messages: + msg297395
2017-06-30 12:26:04 | methane | create |