Message180439
>it would probably be reasonable to make these protocols use str objects at the heart, and only convert to bytes after the formatting is done.
I presume this would mean adding 'if py3: out = out.encode()' after the formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some actual numbers:
from timeit import timeit

for size in (0, 100, 1000, 10000, 100000):  # 'size' avoids shadowing the builtin len
    a = 'a' * size
    print(timeit("a.encode()", "from __main__ import a"))
>>>
0.19305401378265558
0.22193721412302575
0.2783227054755883
0.677596406192696
7.124387897799184
With timeit's default of number=1000000 runs, each total in seconds is also the time in microseconds per encode() call. Of note:
the copying of bytes does not double the total time until there are a few thousand chars. Would protocols be using .format for much more than this?
[If speed is really an issue, we could make binary file/socket write methods unicode implementation aware. They could directly access the ascii (or latin-1) bytes in a unicode object, just as they do with a bytes object, and the extra copy could be skipped.]
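For concreteness, here is a minimal sketch of the "format as str, encode last" idea from the quoted suggestion. The helper name build_request, the HTTP-style header layout, and the choice of the ascii codec are assumptions made for illustration; none of them come from the issue itself.

```python
import sys

# True on Python 3, where str is unicode and must be encoded before sending.
py3 = sys.version_info[0] >= 3


def build_request(method, path, host):
    """Hypothetical helper: do all formatting on str, convert to bytes once."""
    # All formatting happens on str objects.
    out = "{0} {1} HTTP/1.1\r\nHost: {2}\r\n\r\n".format(method, path, host)
    # Only after formatting is done is the result converted to bytes.
    if py3:
        out = out.encode("ascii")  # assumed codec; raises on non-ASCII input
    return out


request = build_request("GET", "/index.html", "example.org")
print(request)
```

A header like this is well under a hundred characters, which by the timings above is in the range where the extra copy made by encode() is a small part of the total cost.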
Date | User | Action | Args
2013-01-22 23:34:32 | terry.reedy | set | recipients: + terry.reedy, gvanrossum, loewis, exarkun, pitrou, vstinner, eric.smith, christian.heimes, benjamin.peterson, glyph, ezio.melotti, arjennienhuis, uau, martin.panter, serhiy.storchaka
2013-01-22 23:34:32 | terry.reedy | set | messageid: <1358897672.74.0.323995453244.issue3982@psf.upfronthosting.co.za>
2013-01-22 23:34:32 | terry.reedy | link | issue3982 messages
2013-01-22 23:34:32 | terry.reedy | create |