This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: [SSL: BAD_WRITE_RETRY] bad write retry in _ssl.c:1636
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.4, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: alex, christian.heimes, dstufft, giampaolo.rodola, janssen, nikratio, pitrou, skrah, tacocat, xgdomingo
Priority: normal Keywords:

Created on 2014-09-26 01:24 by nikratio, last changed 2022-04-11 14:58 by admin.

Messages (9)
msg227582 - Author: Nikolaus Rath (nikratio) * Date: 2014-09-26 01:24
I received a bug report about a crash when calling SSLObject.send(). The traceback ends with:

[...]
  File "/usr/local/lib/python3.4/dist-packages/dugong-3.2-py3.4.egg/dugong/__init__.py", line 584, in _co_send
    len_ = self._sock.send(buf)
  File "/usr/lib/python3.4/ssl.py", line 679, in send
    v = self._sslobj.write(data)
ssl.SSLError: [SSL: BAD_WRITE_RETRY] bad write retry (_ssl.c:1636)

At first I thought that this was an exception that my application should catch and handle. However, while trying to figure out what exactly BAD_WRITE_RETRY means, I got the impression that the fault is actually in Python's _ssl.c. The only places where this error is returned by OpenSSL are ssl/s2_pkt.c:480 and ssl/s3_pkt.c:1179, and in each case the problem seems to be the caller supplying an invalid buffer after an initial write request failed to complete due to non-blocking IO.

This does not seem to be something that could be caused by any Python code, so I think there is a problem in _ssl.c.
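
To illustrate the constraint described above, here is a minimal Python sketch. It is not the dugong or _ssl.c code. It assumes that once send() on a non-blocking SSL socket raises SSLWantReadError or SSLWantWriteError, the retry has to present the same data again. The class name PendingSender and its methods are hypothetical.

```python
import ssl


class PendingSender:
    """Hypothetical helper: remembers data whose send() must be retried."""

    def __init__(self, sock):
        self._sock = sock      # any object with a send(bytes) method
        self._pending = None   # bytes of an interrupted send, if any

    def send(self, data=None):
        """Send `data`, or retry the interrupted send when one is pending.

        Returns the number of bytes accepted, or None if the caller must
        wait for the socket and then call send() again without new data.
        """
        if self._pending is not None:
            if data is not None:
                raise RuntimeError("previous send has not completed yet")
            data = self._pending
        elif data is None:
            raise ValueError("nothing to send")
        try:
            sent = self._sock.send(data)
        except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
            # Keep the exact same bytes object for the retry.
            self._pending = data
            return None
        self._pending = None
        return sent
```

A caller would invoke send(buf) once and, whenever it returns None, wait for the socket and call send() with no argument. Handling of partial sends (a return value smaller than len(buf)) is left to the caller in this sketch.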
msg227583 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-26 02:06
Hmm... this sounds like issue8240, except that it should be fixed in 3.4...
msg254343 - Author: Nikolaus Rath (nikratio) * Date: 2015-11-08 15:39
This just happened again to someone else, also using Python 3.4: https://bitbucket.org/nikratio/s3ql/issues/87

Is there anything the affected people can do to help debug this?
msg254389 - Author: Stefan Krah (skrah) * (Python committer) Date: 2015-11-09 14:28
I had a similar issue with ucspi-ssl that was fixed by following
the O'Reilly book's recommendations w.r.t WANT_READ/WANT_WRITE
with non-blocking sockets to the letter.

The recommendations are quite complex since apparently
WANT_READ/WANT_WRITE mean different things depending
on whether they occur *during a read* or *during a write*.


I haven't used Python's SSL module much: Since those flags
are exposed on the Python level, are users supposed to take
care of the above issues themselves for non-blocking sockets?
msg257350 - Author: Nikolaus Rath (nikratio) * Date: 2016-01-02 17:45
*ping*

Just letting people know that this is still happening regularly and still present in 3.5.

Some reports:

https://bitbucket.org/nikratio/s3ql/issues/87/
https://bitbucket.org/nikratio/s3ql/issues/109/ (last comment)
msg257413 - Author: Stefan Krah (skrah) * (Python committer) Date: 2016-01-03 13:08
As I said in msg254389, the read/write handling for non-blocking
sockets is far from trivial.

I'm not sure that this is a Python bug:

Looking at dugong/__init__.py, I don't think this implements the
recommendations in the OpenSSL book that I mentioned.

The book recommends keeping a state ...

struct iostate {
        int read_waiton_write;
        int read_waiton_read;
        int write_waiton_write;
        int write_waiton_read;
        int can_read;
        int can_write;
};

... a check_availability() function that sets 'can_read', 'can_write'

... a write_allowed() function that determines whether a write is
even possible to attempt ...

int write_allowed(struct iostate *state)
{
        if (state->read_waiton_write || state->read_waiton_read)
                return 0;
        if (state->can_write)
                return 1;
        if (state->can_read && state->write_waiton_read)
                return 1;

        return 0;
}

... and finally, the glorious loop:

while (!done) {

        while (check_availability(ssl, &state) == -1 || !state.can_write)
                nanosleep(&ts, NULL);

        if (write_allowed(&state)) {

                state.write_waiton_read = 0;
                state.write_waiton_write = 0;

                retval = SSL_write(ssl, wbuf, strlen(wbuf));
                switch (SSL_get_error(ssl, retval)) {
                        case SSL_ERROR_NONE:
                                done = 1;
                                break;
                        case SSL_ERROR_ZERO_RETURN:
                                log_sslerr();
                                return -1;
                                break;
                        case SSL_ERROR_WANT_READ:
                                state.write_waiton_read = 1;
                                break;
                        case SSL_ERROR_WANT_WRITE:
                                state.write_waiton_write = 1;
                                break;
                        default:
                                log_sslerr();
                                break;
                }
        }
}
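
For readers who think in Python, the state and the write_allowed() check quoted above can be transliterated roughly as follows. This is only a sketch: the names IOState and try_write are hypothetical, the check_availability() step is not reproduced (the caller is assumed to set can_read and can_write), and it assumes Python's ssl module signals the two conditions by raising ssl.SSLWantReadError and ssl.SSLWantWriteError instead of returning error codes. Other SSL errors simply propagate as exceptions here.

```python
import ssl
from dataclasses import dataclass


@dataclass
class IOState:
    """Python counterpart of the C `struct iostate` quoted above."""
    read_waiton_write: bool = False
    read_waiton_read: bool = False
    write_waiton_write: bool = False
    write_waiton_read: bool = False
    can_read: bool = False
    can_write: bool = False


def write_allowed(state: IOState) -> bool:
    """Same decision logic as the C write_allowed()."""
    if state.read_waiton_write or state.read_waiton_read:
        return False  # an interrupted read has to be finished first
    if state.can_write:
        return True
    if state.can_read and state.write_waiton_read:
        return True   # the earlier write was waiting for incoming data
    return False


def try_write(sock, data, state: IOState):
    """One pass of the loop body: bytes sent, or None to try again later."""
    if not write_allowed(state):
        return None
    state.write_waiton_read = False
    state.write_waiton_write = False
    try:
        return sock.send(data)
    except ssl.SSLWantReadError:
        state.write_waiton_read = True
    except ssl.SSLWantWriteError:
        state.write_waiton_write = True
    return None
```

The caller would refresh can_read and can_write (for example from select()) before each call to try_write() and pass the same data again until a byte count is returned.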
msg257535 - Author: Nikolaus Rath (nikratio) * Date: 2016-01-05 17:33
Stefan, sorry for ignoring your earlier reply. I somehow missed the question at the end.

I believe that users of the Python module are *not* expected to make use of the WANT_READ, WANT_WRITE flags. Firstly because the documentation (of Python's ssl module) doesn't say anything about that, and secondly because the code that's necessary to handle these flags is a prime example of complexity that is imposed by the C API and should be hidden from Python users.

That said, could you give a more specific reference to the O'Reilly book (and maybe even page or chapter)? At the moment it's a little hard for me to follow the rest of your message.

Essentially, if I'm trying to write to a non-blocking Python SSL socket, I would expect that this either succeeds or raises SSL_WANT_WRITE/READ. Not having read the book, it seems to me this is the only information that's useful to a Python caller. In what situation would you need the more exact state that your C example tracks?
msg257562 - Author: Stefan Krah (skrah) * (Python committer) Date: 2016-01-05 22:33
https://books.google.com/books?id=IIqwAy4qEl0C&redir_esc=y , page 159 ff.

Modules/_ssl.c:_ssl__SSLSocket_write_impl() just raises
PySSLWantReadError etc. if the socket is non-blocking.

IOW, it's a thin wrapper around SSL_write().

So yes, I think you do need complete error handling on
the Python level.
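
As a rough idea of what such handling could look like for a single socket, here is a minimal sketch. The helper name call_nonblocking is hypothetical and this is not code from the ssl module or from dugong. It assumes that waiting with select() for the direction named by the exception and then repeating the call with unchanged arguments is sufficient.

```python
import select
import ssl


def call_nonblocking(sock, op, *args, timeout=None):
    """Call op(*args), waiting on `sock` in the direction OpenSSL asks for.

    `op` is typically a bound method such as sock.send or sock.recv.
    The same arguments are passed again on every retry.
    """
    while True:
        try:
            return op(*args)
        except ssl.SSLWantReadError:
            # Even a write can need incoming data (e.g. during renegotiation).
            ready = select.select([sock], [], [], timeout)
        except ssl.SSLWantWriteError:
            ready = select.select([], [sock], [], timeout)
        if not any(ready):
            raise TimeoutError("socket not ready within timeout")
```

Usage would be along the lines of call_nonblocking(ssl_sock, ssl_sock.send, buf). The sketch blocks inside select(), so it does not fit an event loop or coroutine-based code as is.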
msg257614 - Author: Nikolaus Rath (nikratio) * Date: 2016-01-06 17:06
Would you be willing to review a patch to incorporate the handling into the SSL module?
History
Date User Action Args
2022-04-11 14:58:08 admin set github: 66689
2018-09-23 17:16:46 tacocat set nosy: + tacocat
2018-01-22 20:25:10 xgdomingo set nosy: + xgdomingo
2016-01-06 17:06:03 nikratio set messages: + msg257614
2016-01-05 22:33:28 skrah set messages: + msg257562
2016-01-05 17:33:15 nikratio set messages: + msg257535
2016-01-03 13:08:40 skrah set messages: + msg257413
2016-01-02 17:45:54 nikratio set messages: + msg257350; versions: + Python 3.5
2015-11-09 14:28:20 skrah set nosy: + skrah; messages: + msg254389
2015-11-08 15:39:41 nikratio set messages: + msg254343
2014-09-26 02:06:18 pitrou set type: crash -> behavior
2014-09-26 02:06:12 pitrou set messages: + msg227583
2014-09-26 01:24:30 nikratio create