classification
Title: nntplib is broken when responses are longer than _MAXLINE
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, martin.panter, python-dev, serhiy.storchaka, xdegaye, xiang.zhang, zach.ware
Priority: high Keywords: buildbot, patch

Created on 2016-12-14 11:54 by xdegaye, last changed 2016-12-24 06:54 by martin.panter.

Files
File name Uploaded Description Edit
test_nntplib.log xdegaye, 2016-12-14 11:54
nntplib_maxline.patch xdegaye, 2016-12-14 21:12 review
nntplib_maxbytes.patch xdegaye, 2016-12-17 15:15 review
max_over_line.patch martin.panter, 2016-12-24 06:54 review
Messages (25)
msg283189 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-14 11:54
All the buildbots are currently failing at test_nntplib.
See the errors in attached test_nntplib.log.
The same error messages can be reproduced with a 3.5.2 interpreter:

$ python
Python 3.5.2 (default, Nov  7 2016, 11:31:36)
[GCC 6.2.1 20160830] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from nntplib import NNTP_SSL
>>> s = NNTP_SSL('nntp.aioe.org')
>>> resp, count, first, last, name = s.group('comp.lang.python')
>>> resp, overviews = s.over((last - 1, last))                      # does not fail
>>> resp, overviews = s.over((last - 9, last))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/nntplib.py", line 831, in over
    resp, lines = self._longcmdstring(cmd, file)
  File "/usr/lib/python3.5/nntplib.py", line 525, in _longcmdstring
    resp, list = self._getlongresp(file)
  File "/usr/lib/python3.5/nntplib.py", line 494, in _getlongresp
    line = self._getline()
  File "/usr/lib/python3.5/nntplib.py", line 434, in _getline
    raise NNTPDataError('line too long')
nntplib.NNTPDataError: line too long
>>> resp, overviews = s.over((last - 1, last))                      # fails now
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/nntplib.py", line 831, in over
    resp, lines = self._longcmdstring(cmd, file)
  File "/usr/lib/python3.5/nntplib.py", line 525, in _longcmdstring
    resp, list = self._getlongresp(file)
  File "/usr/lib/python3.5/nntplib.py", line 476, in _getlongresp
    resp = self._getresp()
  File "/usr/lib/python3.5/nntplib.py", line 458, in _getresp
    raise NNTPProtocolError(resp)
nntplib.NNTPProtocolError: a657tKkPSmKkOzCP9RkpDChwanytHhX4XFYLLMWqmA@mail.gmail.com> <CANc-5UxXPRF3qYKEmnyDdANoN-5GLiRXDrOU6E=R-8N8ZV2UdQ@mail.gmail.com> <CANc-5UwNFgS4aFN5qbwOPNdTMfXrEpJrv4b00_ycGmwAAoK81Q@mail.gmail.com> <CANc-5UyhtmjE9rXwhMBx=H9tBmYARhfzVjw2O=iKDmyEdCpqSA@mail.gmail.com> <CANc-5UyHLP3W8-_D18f8qE-9z0Y-DMbuHRMLpLY10eA52YZMew@mail.gmail.com> <CANc-5UxQA=-9Q=1GPh+5epX9E3KbAgd6Ye_63=kB55aB-MHqow@mail.gmail.com> <CANc-5UwqKjfxrxS0tvSk6r0kRQbr5kiAyCxJhUT8JPfcySSG2A@mail.gmail.com>
<CANc-5Uyj9VnJbVGSepVLLHeCgoB9uAOD61Ft4b=bdLTQJ5dE=A@mail.gmail.com>    10039   25      Xref: aioe.org comp.lang.python:183369
msg283205 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2016-12-14 16:20
This appears to have been a transient failure; all builders are passing a forced build.  I'm no expert on nntplib, but I'm not sure there's much we can do here beyond replacing the outside connection with a mock server.  That pretty much defeats the purpose of the tests, though.
msg283207 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-14 16:34
The problem is when getting an overview of the following comp.lang.python mail:
    OT - "Soft" ESC key on the new MacBook Pro   Michael Torrie
    Tue Dec 13 21:33:07 EST 2016

This is at index (last - 16) now and that is why the buildbots do not fail anymore.
The response received from the server for this mail in the _getline() function is 2058 bytes long and _getline() raises NNTPDataError because this is greater than _MAXLINE (2048). Setting _MAXLINE to 4096 fixes the problem.

There remains to explain the failures in the other tests.
msg283208 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-14 16:57
BTW while searching for this problem, I found that there were three other mails raising this same  NNTPDataError('line too long') in the 40 last mails.
msg283218 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-14 21:12
When the server sends a line longer than _MAXLINE, nntplib reads only _MAXLINE + 1 bytes leaving the remaining bytes left to be processed by the next command that will interpret those bytes as a protocol error. Hence the failing tests that follow the first NNTPDataError exception in the failing test_nntplib.

The patch increases _MAXLINE from 2048 to 4096 and fixes the above problem. It also takes care to read up to (and including) the terminator so that the following lines or the terminator is not interpreted as a protocol error by the next nntp command.

The patch does not include a test case.
msg283226 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-14 23:05
> I found that there were three other mails raising this same  NNTPDataError('line too long') in the 40 last mails.

Cannot reproduce it. These three exceptions must have been NNTPProtocolError instead, caused by the initial NNTPDataError.
msg283236 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-15 04:07
Just a quick note for the moment: It may not be wise to drop the limit to readline(). That is undoing Issue 16040. Maybe we need a better test if this change doesn't fail the test suite.
msg283261 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-15 09:09
The patch:
1) Increases _MAXLINE to 4096.
2) Reverts issue 16040 and that is not correct, please ignore that part.

The changes made in issue 16040 limit the amount of data read by readline() and does not close the nntp session when the server sends a message whose size is greater than this limit. In that case, should not nntplib close the session as the session has become unusable as shown by the test_nntplib results in test_nntplib.log?
msg283265 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-15 09:29
It seems to me there are two issues:

1) The limit of line length is not large enough.
2) After raising an error on too long line the NNTP object is left in broken state.

The first issue can be solved by increasing the default limit or by patching the nntp module in tests.

For the second issue I suggest to open new tracker issue.
msg283294 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-15 12:51
It seems that the comment placed above the definition of _MAXLINE in the nntplib module is not correct:
    "RFC 3977 limits NNTP line length to 512 characters, including CRLF. We have selected 2048 just to be on the safe side."
The 512 characters limit in RFC 3977 only applies to command lines and to the initial line of a response.

RC 3977 says instead:
    "This document does not place any limit on the length of a line in a multi-line block.  However, the standards that define the format of articles may do so."

So I think _MAXLINE should have a large value (64 K ?) and its semantic is that a line whose length is above that value is considered by nntplib as a Dos attack (and not a protocol violation). In that case nntplib should behave in consequence and prevent any further reads from that connection (either by closing the connection or raising an exception on each of these attempts). IMHO this should be handled in the same issue because it is one single problem, and this may possibly be handled in two different changesets.
msg283367 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-16 06:28
The limit to readline() was added to prevent consuming an excessive amount of memory. But this doesn't help in case of long multiline responses, since all lines are accumulated in a list in memory. A malicious server could cause a client consuming an excessive amount of memory by sending large number of short lines instead of one long line.

Christian, what are you think about this?
msg283423 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2016-12-16 17:48
This is causing all buildbots to fail again; I'd be in favor of temporarily skipping the affected tests with a message that points back to here until we have a permanent solution.
msg283424 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-12-16 18:37
Xavier's plan sounds good. We could increase the line length limitation to 64K and add another limitation of the maximum lines a multi-line block could contain. Any limitation is violated the connection is refused. This situation seems quite similar to http.client.parse_headers, which doesn't get a standard length limitation or number limitation either.
msg283425 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-12-16 18:49
A larger limit is totally fine. The check protects against DoS with hundreds of MB.

I'm currently travelling and won't be available much until next week.

Am 16. Dezember 2016 19:37:03 MEZ, schrieb Xiang Zhang <report@bugs.python.org>:
>
>Xiang Zhang added the comment:
>
>Xavier's plan sounds good. We could increase the line length limitation
>to 64K and add another limitation of the maximum lines a multi-line
>block could contain. Any limitation is violated the connection is
>refused. This situation seems quite similar to
>http.client.parse_headers, which doesn't get a standard length
>limitation or number limitation either.
>
>----------
>nosy: +xiang.zhang
>
>_______________________________________
>Python tracker <report@bugs.python.org>
><http://bugs.python.org/issue28971>
>_______________________________________
msg283429 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-16 19:40
> I'd be in favor of temporarily skipping the affected tests with a message that points back to here until we have a permanent solution.

I am working on it.
msg283430 - (view) Author: Roundup Robot (python-dev) Date: 2016-12-16 19:52
New changeset 1386795d266e by Xavier de Gaye in branch '3.5':
Issue #28971: Temporarily skip test_over until a permanent solution is found
https://hg.python.org/cpython/rev/1386795d266e

New changeset a33047e08711 by Xavier de Gaye in branch '3.6':
Issue #28971: Merge 3.5
https://hg.python.org/cpython/rev/a33047e08711

New changeset b51914417b41 by Xavier de Gaye in branch 'default':
Issue #28971: Merge 3.6
https://hg.python.org/cpython/rev/b51914417b41
msg283490 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-17 15:15
This patch defines a _MAXBYTES limit of 100 Mb. The limit is checked upon reading one line and also upon reading the multi-lines of the response to a user command.
msg283508 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-17 19:30
If sender sends a lot of empty lines and file is not None, LF or CRLF is stripped from lines, and len(line) is 0. Every empty line increases the size of the lines list by 4 or 8 bytes. Since count is not changed, the loop is not bounded. Every LF byte sent by malicious sender increases memory consumption by 4 or 8 bytes.
msg283509 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-17 19:35
I responded to your last review Serhiy, but the psf mail system reports a delivery failure, so you may have missed the notification as well as the other reviewers.
msg283510 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-17 19:58
> If sender sends a lot of empty lines and file is not None, LF or CRLF is stripped from lines

Oh, I missed that. Maybe give a weight of 4 to 8 bytes or even more to a line, this value being added to the bytes count whether the line is empty or not.
Do you have another suggestion ?
msg283512 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-17 20:12
On 32-bit platform the size of empty bytes object is 17 bytes not counting padding and GC links. Plus 4 bytes for a pointer in a list. On 64 bit platform numbers are about twice larger. Therefore add at least 20-40 bytes per line.

Is not 100 MiB too large for a list in memory? For files we could use larger limit, but the problem is that file can be io.BytesIO().
msg283536 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-18 03:50
The first offending message I found is number 183465:

>>> s.group("comp.lang.python")
('211 4329 179178 183507 comp.lang.python', 4329, 179178, 183507, 'comp.lang.python')
>>> s._putcmd("OVER 183465")
>>> s._getresp()
'224 Overview information for 183465 follows'
>>> line = s.file.readline()
>>> len(line)
2702
>>> [number, subject, from_header, date, message_id, references, bytes, lines, extra] = line.split(b"\t", 8)
>>> message_id
b'<1481898601.2322.python-list@python.org">mailman.122.1481898601.2322.python-list@python.org>'
>>> len(references)
2483
>>> len(references.split())
36

Assuming the References value is the only large field in an OVER response line, I would guess that a reasonable limit per line might be

200 bytes/message-id * 200 messages/thread = 40,000 bytes per OVER line

I agree with closing the session whenever the client’s protocol state is broken, but only for a different, tangential reason. See Issue 22350. Even if you raised an explicit exception when using a closed session, like in nntplib_maxbytes.patch, won’t that still cause the subsequent tests to fail? I think the best solution is to stop sharing one connection across multiple tests: Issue 19756.
msg283619 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2016-12-19 13:02
Martin in response to your last review, I still hold to my opinion stated in msg283294 but this should not prevent this high priority issue to progress. Can you propose another patch.
msg283683 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-20 11:54
I will try to come up with something in a few days
msg283924 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-24 06:45
Max_over_line.patch is my attempt: Keep the original _MAXLINES = 2048 code, but override it with _MAX_OVER_LINE = 64000 when reading OVER response lines. I also added a test case.
History
Date User Action Args
2016-12-24 06:54:18martin.pantersetfiles: + max_over_line.patch
2016-12-24 06:53:57martin.pantersetfiles: - max_over_line.patch
2016-12-24 06:45:34martin.pantersetfiles: + max_over_line.patch

messages: + msg283924
2016-12-20 11:54:32martin.pantersetmessages: + msg283683
2016-12-19 13:02:39xdegayesetmessages: + msg283619
2016-12-18 03:50:02martin.pantersetmessages: + msg283536
2016-12-17 20:12:18serhiy.storchakasetmessages: + msg283512
2016-12-17 19:58:08xdegayesetmessages: + msg283510
2016-12-17 19:35:16xdegayesetmessages: + msg283509
2016-12-17 19:30:35serhiy.storchakasetmessages: + msg283508
2016-12-17 15:15:32xdegayesetfiles: + nntplib_maxbytes.patch

messages: + msg283490
2016-12-16 19:52:03python-devsetnosy: + python-dev
messages: + msg283430
2016-12-16 19:40:14xdegayesetmessages: + msg283429
2016-12-16 18:50:00christian.heimessetmessages: + msg283425
2016-12-16 18:37:03xiang.zhangsetnosy: + xiang.zhang
messages: + msg283424
2016-12-16 17:48:23zach.waresetkeywords: + buildbot
priority: normal -> high
messages: + msg283423
2016-12-16 06:28:49serhiy.storchakasetmessages: + msg283367
2016-12-15 12:51:34xdegayesetmessages: + msg283294
2016-12-15 09:29:10serhiy.storchakasetmessages: + msg283265
2016-12-15 09:09:51xdegayesetmessages: + msg283261
2016-12-15 06:12:40serhiy.storchakasetnosy: + christian.heimes, serhiy.storchaka
2016-12-15 04:07:39martin.pantersetmessages: + msg283236
2016-12-14 23:05:12xdegayesetmessages: + msg283226
2016-12-14 21:12:14xdegayesetfiles: + nntplib_maxline.patch
versions: + Python 2.7
title: nntplib fails on all buildbots -> nntplib is broken when responses are longer than _MAXLINE
messages: + msg283218

keywords: + patch
stage: patch review
2016-12-14 16:57:47xdegayesetmessages: + msg283208
2016-12-14 16:34:44xdegayesetmessages: + msg283207
2016-12-14 16:20:59zach.waresetmessages: + msg283205
2016-12-14 15:24:00xdegayesetnosy: + martin.panter
2016-12-14 11:54:14xdegayecreate