Author: pitrou
Recipients: benjamin.peterson, daniel.urban, jcon, pitrou, stutzbach
Date: 2011-05-07.09:01:26
SpamBayes Score: 2.4501655e-06
Marked as misclassified: No
Message-id: <1304758887.2.0.143952046493.issue9971@psf.upfronthosting.co.za>
In-reply-to:
Content
Oops... It hadn't jumped out at me earlier, but the patch is actually problematic performance-wise. The reason is that it doesn't buffer data at all, so small readintos become slower (they have to go through raw I/O every time):
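In effect, with the patch every readinto() call maps straight onto a raw read, roughly like this Python-level sketch (illustrative only, not the actual C code):

    class UnbufferedReader:
        # Illustrative only: what an unbuffered readinto() amounts to.
        def __init__(self, raw):
            self.raw = raw  # the underlying raw file object (e.g. FileIO)

        def readinto(self, b):
            # No read-ahead: even a 4-byte request costs one raw
            # (system-level) read, hence the slowdown measured below.
            return self.raw.readinto(b)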

$ ./python -m timeit -s "f=open('LICENSE', 'rb'); b = bytearray(4)" \
  "f.seek(0)" "while f.readinto(b): pass"
-> without patch: 2.53 msec per loop
-> with patch: 3.37 msec per loop

$ ./python -m timeit -s "f=open('LICENSE', 'rb'); b = bytearray(128)" \
  "f.seek(0)" "while f.readinto(b): pass"
-> without patch: 90.3 usec per loop
-> with patch: 103 usec per loop

The patch does make large reads faster, as expected:

$ ./python -m timeit -s "f=open('LICENSE', 'rb'); b = bytearray(4096)" \
  "f.seek(0)" "while f.readinto(b): pass"
-> without patch: 13.2 usec per loop
-> with patch: 6.71 usec per loop

(that's a good reminder for the future: when optimizing something, always try to measure the "improvement" :-))

One solution would be to refactor _bufferedreader_read_generic() to accept an existing buffer and write into it.
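Something along these lines, written here as a Python-level sketch of the idea (the names and the exact logic are only illustrative; the real change would be in bufferedio.c):

    class BufferAwareReader:
        """Sketch only: serve small readintos from an internal buffer,
        and bypass the buffer for requests at least buffer_size big."""

        def __init__(self, raw, buffer_size=8192):
            self.raw = raw                # underlying raw file object (e.g. FileIO)
            self.buffer_size = buffer_size
            self._buf = b""               # internal read-ahead buffer
            self._pos = 0                 # current position inside self._buf

        def readinto(self, b):
            view = memoryview(b)
            written = 0
            while written < len(b):
                avail = len(self._buf) - self._pos
                if avail:
                    # Copy out of the internal buffer first.
                    take = min(avail, len(b) - written)
                    view[written:written + take] = self._buf[self._pos:self._pos + take]
                    self._pos += take
                    written += take
                elif len(b) - written >= self.buffer_size:
                    # Large request: read directly into the caller's buffer,
                    # skipping the intermediate copy (the point of the patch).
                    n = self.raw.readinto(view[written:])
                    if not n:
                        break
                    written += n
                else:
                    # Small request: refill the internal buffer so that the
                    # following small readintos don't each hit raw I/O.
                    chunk = self.raw.read(self.buffer_size)
                    if not chunk:
                        break
                    self._buf, self._pos = chunk, 0
            return written

That way small readintos cost roughly one raw read per buffer_size bytes instead of one per call, while large readintos keep the zero-copy fast path that makes the 4096-byte case above faster.
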
History
Date                 User    Action  Args
2011-05-07 09:01:27  pitrou  set     recipients: + pitrou, benjamin.peterson, stutzbach, daniel.urban, jcon
2011-05-07 09:01:27  pitrou  set     messageid: <1304758887.2.0.143952046493.issue9971@psf.upfronthosting.co.za>
2011-05-07 09:01:26  pitrou  link    issue9971 messages
2011-05-07 09:01:26  pitrou  create