Message149695
The iobench benchmarking tool shows that the UTF-8 encoder is slower in Python 3.3 than in Python 3.2. The performance depends on the characters of the input string:
* 8x faster (!) for a string of 50,000 ASCII characters
* 1.5x slower for a string of 50,000 UCS-1 characters
* 2.5x slower for a string of 50,000 UCS-2 characters
The bottleneck looks to be the PyUnicode_READ() macro.
* Python 3.2: s[i++]
* Python 3.3: PyUnicode_READ(kind, data, i++)
Because encoding a string to UTF-8 is a very common operation, performance matters. Antoine suggests having a different version of the function for each Unicode kind (1, 2, 4).
Date                | User     | Action | Args
2011-12-17 18:49:12 | vstinner | set    | recipients: + vstinner, pitrou, ezio.melotti
2011-12-17 18:49:11 | vstinner | set    | messageid: <1324147751.99.0.957308589374.issue13624@psf.upfronthosting.co.za>
2011-12-17 18:49:11 | vstinner | link   | issue13624 messages
2011-12-17 18:49:11 | vstinner | create |