
Author nascheme
Recipients belopolsky, doerwalter, ezio.melotti, lemburg, nascheme, r.david.murray, serhiy.storchaka, vstinner, wpk
Date 2018-10-05.00:20:20
Message-id <1538698823.56.0.545547206417.issue18291@psf.upfronthosting.co.za>
In-reply-to
Content
Attached is a rough patch that tries to fix this problem.  It changes behavior in one respect: the Unicode character U+2028 (LINE SEPARATOR) is no longer treated as a line separator.  It would be trivial to change the regex to support that too, if we want to preserve backwards compatibility.  Personally, I think readlines() on a codecs reader should do the same line splitting as an 'io' file.
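
To illustrate the difference being discussed, here is a minimal sketch. It is not the attached patch: the pattern, the helper name split_keepends and its `extra` parameter are assumptions made up for this example. It splits only on \r\n, \r and \n, the way an 'io' text file does, and compares that with str.splitlines(), which also breaks on FS, GS, RS and U+2028.

```python
import io
import re

# Hypothetical pattern, not taken from the attached patch: only \r\n, \r
# and \n count as line endings, matching what 'io' text files do.
_LINE_END = re.compile(r"\r\n|\r|\n")


def split_keepends(text, extra=""):
    """Split text into lines, keeping the line endings.

    `extra` is an illustrative knob: pass "\u2028" to also split on
    U+2028, roughly what preserving backwards compatibility would mean.
    """
    pattern = _LINE_END
    if extra:
        pattern = re.compile(r"\r\n|\r|\n|[" + re.escape(extra) + "]")
    lines = []
    start = 0
    for match in pattern.finditer(text):
        lines.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        # Trailing text without a line ending is still a line.
        lines.append(text[start:])
    return lines


sample = "a\x1cb\x1dc\x1ed\u2028e\nf"

# str.splitlines() breaks on FS, GS, RS and U+2028 as well as on \n.
by_splitlines = sample.splitlines(True)

# An 'io' text object only breaks on \n here.
by_io = io.StringIO(sample, newline="").readlines()

by_sketch = split_keepends(sample)
by_sketch_compat = split_keepends(sample, extra="\u2028")
```

With this sample, by_sketch matches by_io (two lines), while by_splitlines yields six lines. Passing extra="\u2028" adds one more split after the U+2028 character.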

If we want to use the patch, the following still needs to be done: write tests that check the splitting on the FS, RS, and GS characters, and write a news entry.  I didn't do any profiling to see what the performance effect of my change is, so that should be checked too.
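
A rough idea of what such a test could look like, as a sketch only. The helper names (io_reader, codecs_reader, lines_for, check_no_split) are invented for this example and are not from the patch or the existing test suite. The check is run against an 'io' reader here, since that is the reference behavior; pointing it at codecs_reader is what would exercise the patched code.

```python
import codecs
import io

# The three separators named above (code points from ASCII).
SEPARATORS = {"FS": "\x1c", "GS": "\x1d", "RS": "\x1e"}


def io_reader(data):
    # Reference behavior: an 'io' text file over the same bytes.
    return io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")


def codecs_reader(data):
    # The reader whose readlines() this issue is about.
    return codecs.getreader("utf-8")(io.BytesIO(data))


def lines_for(make_reader, sep):
    data = ("left" + sep + "right\n").encode("utf-8")
    return make_reader(data).readlines()


def check_no_split(make_reader):
    # Each separator should stay inside a single line.
    for name, sep in SEPARATORS.items():
        got = lines_for(make_reader, sep)
        assert got == ["left" + sep + "right\n"], (name, got)


# Passes for the 'io' reader; whether check_no_split(codecs_reader)
# passes depends on whether the codecs reader has been changed.
check_no_split(io_reader)
```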
History
Date User Action Args
2018-10-05 00:20:23 nascheme set recipients: + nascheme, lemburg, doerwalter, belopolsky, vstinner, ezio.melotti, r.david.murray, serhiy.storchaka, wpk
2018-10-05 00:20:23 nascheme set messageid: <1538698823.56.0.545547206417.issue18291@psf.upfronthosting.co.za>
2018-10-05 00:20:23 nascheme link issue18291 messages
2018-10-05 00:20:22 nascheme create