This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: poplib maxline behaviour may be wrong
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Chris Smowton, Ingo Ruhnke, berker.peksag, christian.heimes, doko, gnarvaja, introom, r.david.murray, rblank
Priority: normal Keywords:

Created on 2015-04-10 15:26 by gnarvaja, last changed 2022-04-11 14:58 by admin.

Messages (11)
msg240425 - (view) Author: Guillermo Narvaja (gnarvaja) Date: 2015-04-10 15:26
After #16041 was fixed, Python started to validate that lines coming from the POP server should be under 2048 bytes. 

This breaks mail retrieval from at least dovecot servers, as this mail server does not break responses into lines of 512 or 2048 bytes.

On dovecot's side, they said there is a misunderstanding of the RFC on the Python side, and that RFC 1939 "is talking about POP3 responses themselves - not about the actual email message body". You can see the related mail thread here:

http://dovecot.org/pipermail/dovecot/2015-April/100475.html

I'm not sure who is right, but I think it's a problem (at least it was for me).
msg245900 - (view) Author: Ingo Ruhnke (Ingo Ruhnke) Date: 2015-06-28 06:40
This also breaks mail retrieval from both gmx.de and gmail.com (two rather large and popular mail providers). After setting _MAXLINE in /usr/lib/python2.7/poplib.py to some arbitrary higher number, mail retrieval from both services worked fine again.

This 2048 limit definitely looks badly broken.
msg245909 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-06-28 15:40
The RFC is in fact not clear on this point.  It is entirely possible to read it as saying that each line of a multiline response is limited to 512 octets.  I agree, however, that that is not the most reasonable interpretation.  Instead, the line length of RETR message lines should be governed by RFC 5322, which specifies a maximum line length of 998 octets.

That, however, means that technically dovecot is still broken, since 2048 is quite a bit larger than 998.  In reality, it means that the *internet* is broken, in that I presume the root of the problem is that there are mail originators out there that are not obeying RFC 5322 (and its predecessors...this limit goes back to 821/822).

We use 8192 in smtplib, and that hasn't caused any problems...but then again smtplib is originating email, not receiving it.  The IMAP protocol has its own problems, quite aside from the length of message body lines, so we ended up with a very large MAXLINE there.  It may be that we have no choice except to do something similar in poplib.

An interesting question in this context is what smtp servers do, since if anyone was going to reject messages with overlong lines, it would be the smtp server's job to do it.
msg246726 - (view) Author: Chris Smowton (Chris Smowton) Date: 2015-07-14 11:01
I found the same problem retrieving mail from my ISP's (unknown) POP3 server. I was sent an HTML email as one long 50KB line, which naturally broke everything.

Instead of limiting line length, I suggest you should limit total message body size, since that's what you're actually trying to defend against here. You could also use the +OK XXX octets line to enforce a more conservative limit (and fail fast if the server announces intent to send more than your limit).
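
The fail-fast idea could be sketched roughly as follows. This is only an illustration and not code from poplib: the names `MAX_MESSAGE_BYTES`, `announced_size` and `check_announced_size` are hypothetical, and it assumes the server happens to send a status line of the form `+OK <n> octets`, which not every server does.

```python
import re

# Hypothetical limit, chosen only for illustration.
MAX_MESSAGE_BYTES = 50 * 1024 * 1024

# Matches status lines such as b"+OK 120 octets". Servers are not
# obliged to use this wording, so a non-match must be tolerated.
_OCTETS_RE = re.compile(rb"^\+OK\s+(\d+)\s+octets\b", re.IGNORECASE)


def announced_size(status_line):
    """Return the size announced in a '+OK n octets' line, or None."""
    match = _OCTETS_RE.match(status_line)
    return int(match.group(1)) if match else None


def check_announced_size(status_line, limit=MAX_MESSAGE_BYTES):
    """Raise ValueError early if the server announces more than `limit`."""
    size = announced_size(status_line)
    if size is not None and size > limit:
        raise ValueError(
            "server announced %d octets, limit is %d" % (size, limit)
        )
    return size
```

Because the announced size is only advisory, a check like this would complement a limit enforced while reading, not replace it.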

As above the workaround was to insert import poplib; poplib._MAXLINE = 1000000 at the top of the 'getmail' script.
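
Spelled out, the workaround looks like this. Note that `_MAXLINE` is a private, module-level name in poplib, so this is a monkey-patch that relies on an implementation detail; the value 1000000 is just the one mentioned above, not a recommended setting.

```python
import poplib

# Raise the per-line limit before any POP3 connection reads data.
# _MAXLINE is private to poplib and may change between versions.
poplib._MAXLINE = 1000000
```

The assignment has to run before any messages are retrieved, which is why it goes at the top of the script.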

A side-note: one message that is broken this way causes all future messages to fail because poplib does not flush the connection when bailing due to a 'line too long' error. If it isn't prepared to read the rest of the incoming data, it *must* hang up the connection and re-login to fetch the next message.
msg246730 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-07-14 14:28
Could you open a separate bug for the recovery problem, please?

Using a maximum message size would not solve this problem, but it would give the library user control of when it failed, so it is a good feature request.
msg247282 - (view) Author: Chris Smowton (Chris Smowton) Date: 2015-07-24 15:20
Why wouldn't that fix the problem? The issue is poplib not tolerating server behaviour seen in the wild, and if you limit by message size not line length you shouldn't see this problem?

(Side note, I'm surprised not to have been emailed when you replied, any idea what I'm missing?)
msg247284 - (view) Author: Chris Smowton (Chris Smowton) Date: 2015-07-24 15:25
Created #24706 to describe the unflushed connection problem.
msg247289 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-07-24 16:06
Sorry, I was unclear.  In order to implement maximum message size we have to do a bit more to the logic than just use the max message size as the readline limit.  But it does seem like the right approach to me.
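
To illustrate what "a bit more logic" might involve, here is a minimal sketch of reading a multi-line response under a total byte budget. It is not the poplib implementation: `read_multiline` and `MessageTooLarge` are hypothetical names, and the sketch assumes a binary file-like object such as the one returned by `socket.makefile('rb')`.

```python
import io


class MessageTooLarge(Exception):
    """Raised when a multi-line response exceeds the byte budget."""


def read_multiline(fileobj, max_bytes):
    """Read lines up to the terminating '.' line, bounded by max_bytes.

    The budget counts every byte read, including line endings and the
    terminator line itself.
    """
    lines = []
    remaining = max_bytes
    while True:
        # Ask for one byte more than the budget allows, so that a line
        # which does not fit shows up as an over-long read.
        line = fileobj.readline(remaining + 1)
        if not line:
            raise EOFError("connection closed before terminating '.' line")
        remaining -= len(line)
        if remaining < 0:
            raise MessageTooLarge("response exceeds %d bytes" % max_bytes)
        if line in (b".\r\n", b".\n"):
            return lines
        # Undo POP3 byte-stuffing: a leading '..' stands for a single '.'.
        if line.startswith(b".."):
            line = line[1:]
        lines.append(line)


# Usage: one very long line is accepted as long as the total fits.
data = b"a" * 5000 + b"\r\n" + b"..dot\r\n" + b".\r\n"
result = read_multiline(io.BytesIO(data), max_bytes=10000)
```

The point of the design is that the shrinking budget is passed to `readline` on each call, so no single line can exceed what is left of the message limit, and no separate line limit is needed.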
msg248455 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-08-12 11:46
Note that the max message size solution can be applied to the maintenance releases as a fix for this issue by choosing a suitably large default message size.  The 'feature' part is just the part exposing the size limit in the library API...that part is a feature for 3.6.
msg248459 - (view) Author: shiyao.ma (introom) * Date: 2015-08-12 15:05
Instead of setting a MAXSIZE for the email body, raising MAXLINE might be more meaningful.


Consider the case of MAXSIZE: it's essentially the same as MAXLINE. If MAXSIZE is relatively small, some messages won't pass through. If MAXSIZE is relatively large, then what's the point of setting it?


Thus, it might be more practical to increase the value of MAXLINE so that 99% of messages can pass through.
msg248460 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-08-12 15:24
If maxline is too small, messages won't get through.  If maxline is too large *huge* messages will get through...and the DDOS danger of exhausting the server's resources will occur.  So, we really ought to provide a way to limit the maximum message size *anyway*...at which point a separate maxline value doesn't make any sense, since the RFC specifies no maximum line size.

I'm much more comfortable setting a large maximum message size than setting a large enough maximum line size to permit that size of message consisting of mostly a single line.  Since we aren't going to back out the DDOS fix, we have to put the limit *somewhere*.  At least in 3.6 we can make it easy for the application to set it.  (Programs using earlier versions will just have to monkey-patch, unfortunately...which they have to do right now anyway.)
History
Date                 User            Action  Args
2022-04-11 14:58:15  admin           set     github: 68094
2015-08-12 15:24:20  r.david.murray  set     messages: + msg248460
2015-08-12 15:05:26  introom         set     nosy: + introom; messages: + msg248459
2015-08-12 11:46:44  r.david.murray  set     messages: + msg248455
2015-07-24 16:06:23  r.david.murray  set     messages: + msg247289
2015-07-24 15:25:14  Chris Smowton   set     messages: + msg247284
2015-07-24 15:20:08  Chris Smowton   set     messages: + msg247282
2015-07-14 14:28:43  r.david.murray  set     messages: + msg246730
2015-07-14 11:01:17  Chris Smowton   set     nosy: + Chris Smowton; messages: + msg246726
2015-06-28 15:41:06  r.david.murray  set     versions: + Python 3.4, Python 3.5, Python 3.6, - Python 3.2
2015-06-28 15:40:38  r.david.murray  set     nosy: + r.david.murray; messages: + msg245909
2015-06-28 06:40:22  Ingo Ruhnke     set     nosy: + Ingo Ruhnke; messages: + msg245900
2015-05-28 18:55:26  rblank          set     nosy: + rblank
2015-04-10 15:26:33  gnarvaja        create