Message 327112 - Python tracker


Author	lemburg
Recipients	belopolsky, doerwalter, ezio.melotti, lemburg, nascheme, r.david.murray, serhiy.storchaka, vstinner, wpk, xtreak
Date	2018-10-05.08:07:02
Message-id	<1538726822.65.0.545547206417.issue18291@psf.upfronthosting.co.za>
In-reply-to

Content
The Unicode .splitlines() method splits strings on what Unicode defines as line break characters (all code points with general category Zl or bidirectional class B).

This differs from what typical CSV file parsers, or other parsers built for ASCII text files, treat as a newline. They usually break only on CR, CRLF, and LF, so it is the use of .splitlines() in this context that is wrong, not the method itself.
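To make the difference concrete, here is a small sketch. The sample string is made up for illustration; it puts U+2028 (LINE SEPARATOR) and U+001C (FILE SEPARATOR) inside what an ASCII-oriented parser would consider a single LF-terminated record.

```python
# Hypothetical sample data: the first record contains U+2028 and U+001C,
# which .splitlines() treats as line boundaries but an LF-only parser does not.
text = "a\u2028b,c\x1cd\nsecond\n"

# Unicode-aware splitting: breaks at U+2028, U+001C and LF.
unicode_lines = text.splitlines()

# LF-only splitting: the first record stays in one piece
# (note the trailing empty string left by the final "\n").
lf_only_lines = text.split("\n")

print(unicode_lines)
print(lf_only_lines)
```

The first call yields four pieces while the second keeps the first record intact, which is why feeding .splitlines() output to a CSV-style parser can misalign records.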

It may make sense to extend .splitlines() to accept a list of linebreak characters to break on, but that would make it a lot slower, and the same result can already be had by using re.split() on Unicode strings.
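A minimal sketch of the re.split() approach, assuming the caller only wants CR, CRLF and LF treated as line endings. The helper name split_ascii_lines and the pattern name are made up for this example and are not part of the standard library.

```python
import re

# Hypothetical pattern: CRLF is listed first so it is consumed as a single break.
ASCII_NEWLINE = re.compile(r"\r\n|\r|\n")


def split_ascii_lines(text):
    """Split text on CR, CRLF and LF only, ignoring other Unicode line breaks."""
    lines = ASCII_NEWLINE.split(text)
    # re.split() leaves a trailing empty string when the text ends with a
    # line break (or is empty); drop it to mirror .splitlines().
    if lines and lines[-1] == "":
        lines.pop()
    return lines


print(split_ascii_lines("a\r\nb\rc\nd"))
print(split_ascii_lines("x\u2028y\n"))
```

Unlike .splitlines(), this keeps characters such as U+2028 inside the line, at the cost of going through the regular expression engine.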

Closing this as won't fix.

History
Date	User	Action	Args
2018-10-05 08:07:02	lemburg	set	recipients: + lemburg, doerwalter, nascheme, belopolsky, vstinner, ezio.melotti, r.david.murray, serhiy.storchaka, wpk, xtreak
2018-10-05 08:07:02	lemburg	set	messageid: <1538726822.65.0.545547206417.issue18291@psf.upfronthosting.co.za>
2018-10-05 08:07:02	lemburg	link	issue18291 messages
2018-10-05 08:07:02	lemburg	create