classification
Title: BufferedWriter not thread-safe
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.0, Python 2.6

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: pitrou
Nosy List: amaury.forgeotdarc, benjamin.peterson, loewis, pitrou, wplappert
Priority: high
Keywords: patch

Created on 2008-07-31 09:03 by pitrou, last changed 2008-12-07 08:03 by wplappert. This issue is now closed.

Files
File name Uploaded Description
bufferedwriter.patch pitrou, 2008-07-31 14:20
bufferedwriter2.patch pitrou, 2008-08-01 20:11
bufferedwriter3.patch pitrou, 2008-08-14 19:23
Messages (14)
msg70494 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-07-31 09:03
As discovered in #3139, io.BufferedWriter mutates the size of its
internal bytearray object. Consequently, write() calls from multiple
threads can raise exceptions when one thread holds a buffer view of the
bytearray while another thread tries to resize it. This especially
affects calling print() from multiple threads.

A solution is to use a fixed-size (preallocated) bytearray object.
Another solution is to get rid of the bytearray approach and replace it
with a growing list of immutable bytes objects.
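[Editor's note: a minimal sketch of the first idea, in modern Python 3 rather than the 2.6/3.0-era code of this issue. The class and attribute names are illustrative, not taken from the attached patches; the buffer is preallocated once and never resized, so no writer can trigger a reallocation while another thread holds a view of it. As Martin points out later in this issue, this alone does not prevent lost updates.]

```python
class FixedBufferSketch:
    """Illustrative fixed-size buffered writer: the bytearray is
    preallocated once and only sliced in place, never resized."""

    def __init__(self, raw, buffer_size=8192):
        self.raw = raw                       # underlying binary stream
        self._buf = bytearray(buffer_size)   # preallocated, fixed size
        self._end = 0                        # number of valid bytes

    def write(self, b):
        if self._end + len(b) > len(self._buf):
            self.flush()                     # not enough room: drain first
        if len(b) >= len(self._buf):
            return self.raw.write(b)         # too big to buffer: write through
        # Equal-length slice assignment copies in place without resizing.
        self._buf[self._end:self._end + len(b)] = b
        self._end += len(b)
        return len(b)

    def flush(self):
        if self._end:
            self.raw.write(bytes(self._buf[:self._end]))
            self._end = 0
```

A quick single-threaded exercise of the three paths (buffer, overflow flush, write-through):

```python
import io
raw = io.BytesIO()
w = FixedBufferSketch(raw, buffer_size=8)
w.write(b"abc")
w.write(b"defgh")          # exactly fills the buffer
w.flush()
w.write(b"0123456789")     # larger than the buffer: bypasses it
```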

Here is the test script provided by Amaury:

import sys, io, threading
stdout2 = io.open(sys.stdout.fileno(), mode="w")
def f(x):
    for i in range(10):
        stdout2.write(unicode((x, i)) + '\n')
for x in range(10):
    t = threading.Thread(target=f, args=(x,))
    t.start()

(with py3k, replace "stdout2.write" with a simple "print")
msg70503 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-07-31 14:20
Here is a sample implementation (using a fixed-size bytearray) with test.
Amaury, would you like to take a look?

I'm a bit puzzled by the shady BufferedRandom semantics: though the
tests do pass, it's hard to say why e.g. BufferedRandom.tell() was
implemented like that.
msg70504 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-07-31 14:22
It seems that, as with the quadratic binary buffered reading, the best
solution is a list of bytes objects, which should benefit especially
from your bytes-joining optimization.
msg70509 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-07-31 14:36
Selon Benjamin Peterson <report@bugs.python.org>:
>
> Benjamin Peterson <musiccomposition@gmail.com> added the comment:
>
> It seems that, as with the quadratic binary buffered reading, the best
> solution is a list of bytes objects, which should benefit especially
> from your bytes-joining optimization.

I don't really know. The logic is quite different (and harder to get right) than
in BufferedReader. I might try to write a list-of-bytes version and do some
timings, but it's not trivial... I should first compare against the current
version.

Also, an advantage of using a fixed-size bytearray is that it more closely
mimics what will have to be written for an efficient C implementation.
msg70511 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-07-31 14:39
On Thu, Jul 31, 2008 at 9:36 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> I don't really know. The logic is quite different (and harder to get right) than
> in BufferedReader. I might try to write a list-of-bytes version and do some
> timings, but it's not trivial... I should first compare against the current
> version.
>
> Also, an advantage of using a fixed-size bytearray is that it more closely
> mimics what will have to be written for an efficient C implementation.

You are correct that this should probably be rewritten in C at some
point, so let's go with this.
msg70561 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-01 15:45
I don't think the proposed patch (file11012) makes it thread-safe, and I
believe you cannot get it thread-safe without using thread synchronization.

For example, if two threads simultaneously determine that there is still
space left (len(b) <= free), they might both write to self._write_buf
and self._write_end, and the last writer would win. (In case it isn't
clear how this might happen: thread 1 exhausts its Py_Ticker just after
checking the size, then thread 2 runs a full .write(), then thread 1
continues filling the buffer)
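[Editor's note: the check-then-act race Martin describes can be replayed deterministically without real threads. The helper names below are illustrative, standing in for the patch's free-space check and buffer copy; the point is that both "threads" pass the check before either one commits, so the last writer wins and the other write is silently lost.]

```python
# Deterministic replay of the interleaving: no real threads needed.
buf = bytearray(8)
end = 0

def has_room(n):
    """Step 1 of write(): the unsynchronized free-space check."""
    return end + n <= len(buf)

def commit(data, at):
    """Step 2 of write(): copy into the buffer and return the new end."""
    buf[at:at + len(data)] = data
    return at + len(data)

ok1 = has_room(4)          # thread 1: check passes, offset 0 captured...
ok2 = has_room(4)          # ...thread 1 is preempted; thread 2 also sees offset 0
end = commit(b"2222", 0)   # thread 2 runs its full write
end = commit(b"1111", 0)   # thread 1 resumes with its stale offset
assert ok1 and ok2
assert bytes(buf[:4]) == b"1111"   # thread 2's bytes are silently gone
```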
msg70563 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-01 15:55
Hi,

Selon "Martin v. Löwis" <report@bugs.python.org>:
>
> I don't think the proposed patch (file11012) makes it thread-safe, and I
> believe you cannot get it thread-safe without using thread synchronization.

I should have made clear that in the context of this issue, "thread-safe" does not
mean "produces perfectly correct output" but simply "does not raise exceptions
when using the same buffered object from two different threads". The former
would be preferable but is not required, IMHO, for a buffered IO library; the
latter is much more critical because, as Amaury points out, you otherwise get
exceptions when printing e.g. debug output from multiple threads.
msg70590 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-01 20:11
Attaching a new patch with better performance characteristics than my
previous one, and the non-blocking test rewritten in a sane way.

Some timeit runs (each line below is an argument list for `python -m timeit`):

-s "import io; f=io.open('/dev/null', 'wb'); s=b'a'*1" "for i in xrange(100): f.write(s)"

without patch: 533 usec per loop
with patch: 724 usec per loop
with builtin open(): 59.6 usec per loop

-s "import io; f=io.open('/dev/null', 'wb'); s=b'a'*100" "for i in xrange(100): f.write(s)"

without patch: 563 usec per loop
with patch: 768 usec per loop
with builtin open(): 67.8 usec per loop

-s "import io; f=io.open('/dev/null', 'wb'); s=b'a'*10000" "for i in xrange(100): f.write(s)"

without patch: 1.35 msec per loop
with patch: 1.34 msec per loop
with builtin open(): 194 usec per loop

-s "import io; f=io.open('/dev/null', 'wb'); s=b'a'*100000" "for i in xrange(100): f.write(s)"

without patch: 4.92 msec per loop
with patch: 1.34 msec per loop
with builtin open(): 199 usec per loop

-s "import io; f=io.open('/dev/null', 'wb'); s=b'a'*1000000" "for i in xrange(100): f.write(s)"

without patch: 173 msec per loop
with patch: 1.35 msec per loop
with builtin open(): 194 usec per loop
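[Editor's note: an in-process equivalent of the command lines above, for readers who want to reproduce the comparison. This is a sketch under stated assumptions: `os.devnull` stands in for `/dev/null` so it also runs on Windows, `range` replaces the Python 2 `xrange`, and the absolute figures will of course differ from the 2008 numbers.]

```python
import io
import os
import timeit

def bench(size, number=100):
    """Time `number` repetitions of 100 buffered writes of `size` bytes."""
    f = io.open(os.devnull, 'wb')
    s = b'a' * size
    try:
        stmt = lambda: [f.write(s) for i in range(100)]
        return timeit.timeit(stmt, number=number)
    finally:
        f.close()
```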
msg70598 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-01 21:51
> I should have made clear that in the context of this issue, "thread-safe" does not
> mean "produces perfectly correct output" but simply "does not raise exceptions
> when using the same buffered object from two different threads".

In that case, I'm -1 on this patch. Raising exceptions is far
preferable to silently losing data or writing garbage.

> The former would be preferable but is not required, IMHO, for a buffered IO library; the
> latter is much more critical because, as Amaury points out, you otherwise get
> exceptions when printing e.g. debug output from multiple threads.

And that's a good thing, when using a library that is not thread-safe.

Either the library provides thread-safety, or it doesn't. If it doesn't,
it's the application's responsibility to not use the library from
multiple threads, or protect all access with appropriate
synchronization. Now that print is a function, it's easy to implement
a version of it that synchronizes all prints.

With the status quo, people have at least a chance of learning that the
library is not thread-safe. If the shallow problems are resolved, people
will cry FOUL loudly when they learn about the deep problems.
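[Editor's note: a minimal sketch of the synchronized print Martin mentions, in modern Python 3. The name `locked_print` and the module-level lock are illustrative, not from any patch in this issue.]

```python
import builtins
import threading

# A single process-wide lock serializes every call, so output from
# concurrent threads is never interleaved mid-call.
_print_lock = threading.Lock()

def locked_print(*args, **kwargs):
    with _print_lock:
        builtins.print(*args, **kwargs)
```

Assigning `builtins.print = locked_print` would apply it process-wide, at the cost of taking the lock on every print; as Antoine notes below, code calling `sys.stdout.write` directly would still bypass it.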
msg70600 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-01 22:08
Le vendredi 01 août 2008 à 21:51 +0000, Martin v. Löwis a écrit :
> With the status quo, people have at least a chance of learning that the
> library is not thread-safe. If the shallow problems are resolved, people
> will cry FOUL loudly when they learn about the deep problems.

But the current implementation can also silently produce garbage without
raising an exception. An exception is raised only if the context switch
happens at one precise point. If the internal buffer is mutated without
the backing memory area being resized (or while no buffer view is
currently held), no exception is raised and the behaviour is still
incorrect.

> Now that print is a function, it's easy to implement
> a version of it that synchronizes all prints.

Well, if that resolution is preferred, I think it should be integrated into
the builtin print function, rather than forcing users to monkeypatch it
(but a debugging facility that directly calls sys.stdout.write or
equivalent will not be helped).
msg70612 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-02 01:50
> Well, if that resolution is prefered, I think it should be integrated to
> the builtin print function, rather than forcing users to monkeypatch it
> (but a debugging facility directly calling sys.stdout.write or
> equivalent will not be helped).

True. I think it is the stream's write method that must be
synchronized, e.g. by renaming the current write function to
_write_locked, and adding a write function that obtains a per-file
lock and calls _write_locked. Other methods accessing the buffer
need to be synchronized with the same lock as well.

This is how it's done in just about any stdio implementation I have ever looked at.
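[Editor's note: a sketch of the renaming scheme described above, in modern Python 3. The class and method names are illustrative; the committed fix (r65686) wraps the real BufferedWriter methods, not this toy.]

```python
import threading

class LockedWriterSketch:
    """The unsynchronized logic lives in _write_unlocked(); the public
    write() takes a per-file lock around it. The lock is reentrant so
    other synchronized methods could call helpers without deadlocking."""

    def __init__(self, raw):
        self.raw = raw
        self._lock = threading.RLock()

    def _write_unlocked(self, b):
        return self.raw.write(b)

    def write(self, b):
        with self._lock:
            return self._write_unlocked(b)

    def flush(self):
        # Every method touching the buffer takes the same lock.
        with self._lock:
            self.raw.flush()
```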
msg71144 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-14 19:23
Here is a new patch which simply wraps the current BufferedWriter
methods with a lock. It has a test case, and Amaury's example works fine
too.

Martin, do you think it's fine?

(as for BufferedReader, I don't see the use cases for multithreaded reading)
msg71149 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-08-14 20:08
The patch looks fine (as far as it goes).

I do think the same should be done to the reader: IO libraries typically
promise that concurrent threads can read, and that between them they will
get the complete stream with no overlap (i.e. each input byte goes to
exactly one thread - no input byte gets lost, and no input byte is
delivered to multiple threads). I don't think this is currently the
case: two threads reading simultaneously may very well read the same
bytes twice, and then subsequently skip bytes (i.e. when both
increment _read_pos, but both saw its original value)
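[Editor's note: a sketch of the exactly-once promise for readers, in modern Python 3. The class name is illustrative and the "raw stream" is just an in-memory bytes object; the point is that `_read_pos` is only read and advanced while the lock is held, so no two threads can consume the same bytes.]

```python
import threading

class LockedReaderSketch:
    """Each read() atomically captures and advances _read_pos under a
    lock, so every input byte is delivered to exactly one caller."""

    def __init__(self, data):
        self._data = data
        self._read_pos = 0
        self._lock = threading.Lock()

    def read(self, n):
        with self._lock:
            chunk = self._data[self._read_pos:self._read_pos + n]
            self._read_pos += len(chunk)
            return chunk
```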
msg71153 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-14 21:07
Both BufferedReader and BufferedWriter are now fixed in r65686.
Perhaps someone wants to open a separate issue for TextIOWrapper...
History
Date User Action Args
2008-12-07 08:03:46  wplappert  set  nosy: + wplappert
2008-08-14 21:07:06  pitrou  set  status: open -> closed
                                  resolution: fixed
                                  messages: + msg71153
2008-08-14 20:08:28  loewis  set  messages: + msg71149
2008-08-14 19:23:16  pitrou  set  files: + bufferedwriter3.patch
                                  assignee: pitrou
                                  messages: + msg71144
2008-08-02 01:50:24  loewis  set  messages: + msg70612
2008-08-01 22:08:56  pitrou  set  messages: + msg70600
2008-08-01 21:51:02  loewis  set  messages: + msg70598
2008-08-01 20:11:40  pitrou  set  files: + bufferedwriter2.patch
                                  messages: + msg70590
2008-08-01 15:55:03  pitrou  set  messages: + msg70563
2008-08-01 15:45:38  loewis  set  nosy: + loewis
                                  messages: + msg70561
2008-07-31 14:39:21  benjamin.peterson  set  messages: + msg70511
2008-07-31 14:36:16  pitrou  set  messages: + msg70509
2008-07-31 14:22:08  benjamin.peterson  set  nosy: + benjamin.peterson
                                  messages: + msg70504
2008-07-31 14:20:46  pitrou  set  files: + bufferedwriter.patch
                                  keywords: + patch
                                  messages: + msg70503
2008-07-31 09:03:11  pitrou  create