classification
Title: BufferedWriter non-blocking overage
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.0, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, benjamin.peterson, christian.heimes, pitrou, severb
Priority: normal Keywords: patch

Created on 2008-11-05 18:22 by severb, last changed 2009-03-05 22:32 by benjamin.peterson. This issue is now closed.

Files
File name Uploaded Description Edit
bw_overage.diff severb, 2008-11-05 18:22
bw_overage2.diff severb, 2008-11-06 15:05
nonblocking.patch pitrou, 2008-12-30 17:17
nonblocking2.patch severb, 2009-01-07 11:38
Messages (17)
msg75523 - (view) Author: Sever Băneșiu (severb) Date: 2008-11-05 18:22
In some corner cases io.BufferedWriter raises an io.BlockingIOError
"lying" about the number of characters written.
I've added some tests and a small change to fix this issue.
msg75563 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-06 13:34
The patch is good. 

I was first surprised by the fact that e.characters_written is not used
in the write() method; but _flush_unlocked() already adjusts the
_write_buf according to the original e.characters_written raised by the
underlying raw file. Everything is fine.

I suggest however to add some tests around the first "except
BlockingIOError". This would answer the question:
    # XXX Why not just let the exception pass through?
For example, I modified a function in your patch:

    def testWriteNonBlockingOverage(self):
        raw = MockNonBlockWriterIO((-1, -2))
        [...]

        # Subsequent calls to write() try to flush the raw file.
        try:
            bufio.write(b"x")
        except io.BlockingIOError as e:
            # Two more chars were written at the raw level
            self.assertEqual(bufio._write_buf, write_buf[2:])
            # But write() did not accept anything.
            self.assertEqual(e.characters_written, 0)
        else:
            self.fail("BlockingIOError not raised")
msg75564 - (view) Author: Sever Băneșiu (severb) Date: 2008-11-06 15:05
Thanks for your review, here's a new patch.
I've added a new test for the pre-flush condition and made the comments
less cryptic.
msg75565 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-06 16:05
We have discussed this bug in the python developer chat yesterday. I
decided to wait until after the 3.0.0 release. The problem is not
critical enough for 3.0.0. I like to keep the amount of changes during
the RC phase to a minimum.
msg75566 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-06 16:43
I do concur with the desire to restrict changes during RC phase. Do this
also mean that merges from trunk will be reduced to the strict minimum?
No global merge, only on a revision basis after review.

In this case we could apply the patch to the trunk, and let a future
global merge propagate the change to py3k.
msg77218 - (view) Author: Sever Băneșiu (severb) Date: 2008-12-07 10:28
Christian, if the patch looks good to you I think now it's a good time
to commit it.
msg78546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-30 16:59
The tests should be written so as not to rely on internal implementation
details (the _write_buf attribute).
msg78548 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-30 17:17
Here is a patch which replaces testWriteNonBlocking with a reasonable
implementation-independent test (it works with e.g. the io-c sandbox).
The new test also checks for the current problem, i.e. it passes with
the fix to io.py and fails without.
msg78599 - (view) Author: Sever Băneșiu (severb) Date: 2008-12-31 12:12
Thanks for the new implementation of MockNonBlockWriterIO class. It
makes tests so much easier to read.

There are some minor things in your patch that I would change. For example:

    # 1 byte will be written, the rest will be buffered
    raw.block_on(b"k")
    self.assertEquals(bufio.write(b"jklmn"), 5)
    # ...
    raw.block_on(b"0")

The comment is misleading because in fact no byte is written at raw
level. That's because the data size is smaller than the buffer size and
the buffer is empty (was emptied by the last write call). All this
renders raw.block_on(b"k") call useless. I also think this is the
correct behavior regardless of implementation language of BufferedWriter
class i.e. no write call should write at raw level smaller chunks of
data than buffer's size unless it has to.

Your tests can't cover the pre-flush condition because max_buffer_size
equals buffer_size.

Unless you'll beat me to it or prove me wrong, I'll update your patch
next year.
msg78601 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-31 12:46
> The comment is misleading because in fact no byte is written at raw
> level. That's because the data size is smaller than the buffer size and
> the buffer is empty (was emptied by the last write call).

It depends on the implementation. A different implementation may use a
different algorithm.

> I also think this is the
> correct behavior regardless of implementation language of BufferedWriter
> class i.e. no write call should write at raw level smaller chunks of
> data than buffer's size unless it has to.

But how do you decide when it "has to"? Unless you want to constrain the
exact implemented algorithm, you can't do that in your tests.
msg78606 - (view) Author: Sever Băneșiu (severb) Date: 2008-12-31 13:49
>> The comment is misleading because in fact no byte is written at raw
>> level. That's because the data size is smaller than the buffer size and
>> the buffer is empty (was emptied by the last write call).

> It depends on the implementation. A different implementation may use a
> different algorithm.

I feel that no matter what implementation algorithm BufferedWriter uses
it shouldn't write smaller chunks of data than buffer's size or else the
buffer is useless.

>> I also think this is the
>> correct behavior regardless of implementation language of BufferedWriter
>> class i.e. no write call should write at raw level smaller chunks of
>> data than buffer's size unless it has to.

> But how do you decide when it "has to"? Unless you want to constrain the
> exact implemented algorithm, you can't do that in your tests.

When a direct or indirect (e.g. on close) flush is called for the file
object.
msg78608 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-31 14:00
> I feel that no matter what implementation algorithm BufferedWriter uses
> it shouldn't write smaller chunks of data than buffer's size or else the
> buffer is useless.

If you rewrite the above sentence using the word "statistically", then I
can agree :)
But if I look at e.g. the fwrite() manpage, I see no guarantee that the
stdio layer will never make a call to write() with a size smaller than
the buffer size. The buffered layer should be free to manage its buffer
in what it believes is the most efficient way. The only guarantee is
that it won't buffer more than max_buffer_size.

Anyway :) Practically, the test does work on both py3k and another
implementation, so I don't see any urgency to remove anything from it.
msg79324 - (view) Author: Sever Băneșiu (severb) Date: 2009-01-07 11:38
> Anyway :) Practically, the test does work on both py3k and another
> implementation, so I don't see any urgency to remove anything from it.
Indeed, it doesn't hurt keeping it.

For completeness' sake I've updated your tests to cover the pre-flush
condition existent in the python implementation. Theoretically this
should be the desired behavior in any other implementation.
msg82922 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-02-28 17:10
This has been cured in the io-c branch.
msg83137 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-04 21:31
This has been fixed in io-c branch. (r70152)
msg83185 - (view) Author: Sever Băneșiu (severb) Date: 2009-03-05 09:11
Looks like the test covering the pre-flush condition is missing.
msg83224 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-03-05 22:32
2009/3/5 Sever Băneșiu <report@bugs.python.org>:
>
> Sever Băneșiu <banesiu.sever@gmail.com> added the comment:
>
> Looks like the test covering the pre-flush condition is missing.

That test is no longer applicable because max_buffer_size is
deprecated and unused.
History
Date User Action Args
2009-03-05 22:32:04benjamin.petersonsetmessages: + msg83224
2009-03-05 09:11:05severbsetmessages: + msg83185
2009-03-04 21:31:12benjamin.petersonsetstatus: open -> closed
resolution: fixed
messages: + msg83137
2009-02-28 17:12:32benjamin.petersonlinkissue4565 dependencies
2009-02-28 17:10:44benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg82922
2009-01-07 11:38:04severbsetfiles: + nonblocking2.patch
messages: + msg79324
2008-12-31 14:00:17pitrousetmessages: + msg78608
2008-12-31 13:49:03severbsetmessages: + msg78606
2008-12-31 12:46:40pitrousetmessages: + msg78601
2008-12-31 12:12:51severbsetmessages: + msg78599
2008-12-30 17:17:40pitrousetfiles: + nonblocking.patch
messages: + msg78548
2008-12-30 16:59:45pitrousetnosy: + pitrou
messages: + msg78546
2008-12-07 10:28:38severbsetmessages: + msg77218
2008-11-06 16:43:14amaury.forgeotdarcsetmessages: + msg75566
2008-11-06 16:05:24christian.heimessetversions: - Python 3.1
nosy: + christian.heimes
messages: + msg75565
priority: normal
type: behavior
stage: patch review
2008-11-06 15:05:51severbsetfiles: + bw_overage2.diff
messages: + msg75564
2008-11-06 13:34:51amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg75563
2008-11-05 18:22:29severbcreate