Author glyph
Recipients arjennienhuis, benjamin.peterson, christian.heimes, eric.smith, exarkun, ezio.melotti, glyph, gvanrossum, loewis, martin.panter, pitrou, serhiy.storchaka, terry.reedy, uau, vstinner
Date 2013-01-23.01:03:22
Message-id <3ABE017B-5DCC-4569-A7D5-625707002EA3@twistedmatrix.com>
In-reply-to <1358897672.74.0.323995453244.issue3982@psf.upfronthosting.co.za>
Content
On Jan 22, 2013, at 3:34 PM, Terry J. Reedy <report@bugs.python.org> wrote:

> I presume this would mean adding 'if py3: out = out.encode()' after the formatting. As I said before, this works much better in 3.3+ than in 3.2-. Some actual numbers:

I'm glad that this operation has been optimized, but treating blocks of protocol data as text is a hackish workaround that still doesn't perform as well (even on 3.3+) as bytes formatting in 2.7.
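As an illustration only, here is a minimal sketch of the two approaches being compared. The function names and the header example are hypothetical and not taken from Twisted. Note the assumption about versions: the bytes `%` operator exists on Python 2.7 and, in Python 3, only from 3.5 onward (PEP 461), so when this message was written only the first form worked on 3.x.

```python
def header_via_text(name, value):
    # Workaround under discussion: format as str, then encode.
    # This adds an encode step and treats protocol data as text.
    return "{}: {}\r\n".format(name, value).encode("ascii")


def header_via_bytes(name, value):
    # Direct bytes formatting: name is bytes, value is an int.
    # Assumes Python 2.7 or Python 3.5+ (bytes % formatting).
    return b"%s: %d\r\n" % (name, value)


print(header_via_text("Content-Length", 42))
print(header_via_bytes(b"Content-Length", 42))
```

Both calls produce the same bytes; the difference is whether the data ever passes through a text type on the way to the wire.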

> [If speed is really an issue, we could make binary file/socket write methods unicode implementation aware. They could directly access the ascii (or latin-1) bytes in a unicode object, just as they do with a bytes object, and the extra copy could be skipped.]

Yes, speed is really an issue - this kind of message construction is on the critical path of many of the more popular protocols implemented with Twisted. But trying to work around the performance issue by pretending that strings are bytes will just give new life to old bugs. We've been loudly rejecting unicode from sockets, I think, for as long as Python has had unicode, and that's the way it should remain.
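For readers unfamiliar with the behavior being defended, the sketch below shows how a standard-library socket on Python 3 refuses text outright. It is an assumption-laden illustration, not Twisted's own check: the helper name `send_rejects_text` is hypothetical, and it relies on `socket.socketpair()` being available on the platform.

```python
import socket


def send_rejects_text(payload):
    """Return True if a socket refuses the payload with TypeError."""
    left, right = socket.socketpair()
    try:
        # Python 3 sockets accept bytes-like objects only.
        left.sendall(payload)
    except TypeError:
        return True
    finally:
        left.close()
        right.close()
    return False


print(send_rejects_text("PING\r\n"))   # str payload
print(send_rejects_text(b"PING\r\n"))  # bytes payload
```

Making the write path accept `str` directly, as the quoted suggestion proposes, would remove this error and let text reach the wire unnoticed.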