Author doerwalter
Recipients doerwalter, lemburg, loewis, martin.panter, ncoghlan, vstinner
Date 2014-01-10.11:26:49
Message-id <1389353210.68.0.603563067958.issue20132@psf.upfronthosting.co.za>
In-reply-to
Content
The best solution IMHO would be to implement real incremental codecs for all of those.

Maybe iterencode() with an empty iterator should never call encode()? (But IMHO it would be better to document that iterencode()/iterdecode() should only be used with "real" codecs.)
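To make the first option concrete, here is a rough sketch of an iterencode() variant that skips the final encode() call when the iterator yielded nothing. The names iterencode_skip_empty and CountingEncoder are made up for illustration and are not part of the codecs module; the toy encoder only exists to count calls.

```python
import codecs


class CountingEncoder(codecs.IncrementalEncoder):
    """Hypothetical toy encoder that records how often encode() is called."""

    def __init__(self, errors="strict"):
        super().__init__(errors)
        self.calls = 0

    def encode(self, input, final=False):
        self.calls += 1
        return input.upper().encode("ascii")


def iterencode_skip_empty(iterator, encoder):
    """Like codecs.iterencode(), but never touches the encoder for empty input."""
    seen_input = False
    for chunk in iterator:
        seen_input = True
        output = encoder.encode(chunk)
        if output:
            yield output
    # Only flush with final=True if at least one chunk was encoded.
    if seen_input:
        output = encoder.encode("", True)
        if output:
            yield output
```

With an empty iterator the encoder is never called, so a codec that cannot cope with the empty final input would not be hit. The cost is that a codec which wants to emit a header or trailer even for empty input would produce nothing.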

Note that the comment before PyUnicode_DecodeUTF7Stateful() in unicodeobject.c reads:

/* The decoder.  The only state we preserve is our read position,
 * i.e. how many characters we have consumed.  So if we end in the
 * middle of a shift sequence we have to back off the read position
 * and the output to the beginning of the sequence, otherwise we lose
 * all the shift state (seen bits, number of bits seen, high
 * surrogate). */

Changing that would require introducing a state object that the codec updates and from which decoding can be restarted.
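For comparison, this is how the current behaviour looks from Python, as far as I understand it: the incremental UTF-7 decoder keeps the unconsumed bytes of an unfinished shift sequence, and getstate()/setstate() can carry them over to another decoder. This is only a sketch of the existing buffer-based state, not of the state object discussed above; the split point in the input is chosen purely for illustration.

```python
import codecs

data = b"a+IKw-b"  # UTF-7 for "a\u20acb"
first, rest = data[:4], data[4:]  # split in the middle of the shift sequence

decoder = codecs.getincrementaldecoder("utf-7")()
# The decoder backs off to the "+" and only returns what precedes it.
head = decoder.decode(first)
state = decoder.getstate()

# A fresh decoder can pick up from the saved state.
resumed = codecs.getincrementaldecoder("utf-7")()
resumed.setstate(state)
tail = resumed.decode(rest, final=True)

print(repr(head + tail))
```

Note that the saved state here is just the bytes that were not consumed, so the shift sequence is decoded again from its start once more input arrives.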

Also, the encoder does not buffer anything. To implement the suggested behaviour, the encoder might have to buffer an unbounded amount of data.
History
Date User Action Args
2014-01-10 11:26:50 doerwalter set recipients: + doerwalter, lemburg, loewis, ncoghlan, vstinner, martin.panter
2014-01-10 11:26:50 doerwalter set messageid: <1389353210.68.0.603563067958.issue20132@psf.upfronthosting.co.za>
2014-01-10 11:26:50 doerwalter link issue20132 messages
2014-01-10 11:26:49 doerwalter create