classification
Title: TextIOWrapper.readline and str.splitlines have different behavior
Type: behavior Stage: resolved
Components: Documentation, Interpreter Core, IO Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: docs@python Nosy List: benjamin.peterson, docs@python, iritkatriel, pitrou, r.david.murray
Priority: normal Keywords:

Created on 2011-02-24 02:19 by benjamin.peterson, last changed 2020-10-20 10:49 by iritkatriel. This issue is now closed.

Messages (7)
msg129240 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-02-24 02:19
For example:
>>> 'print 1\n\x0cprint 2\n\n'.splitlines()
['print 1\n', '\x0cprint 2\n', '\n']
>>> list(io.StringIO('print 1\n\x0cprint 2\n\n'))


I'm not sure which is preferable.
msg129241 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-24 02:29
Your example got a little messed up.

>>> list(io.StringIO('print 1\n\x0cprint 2\n\n'))
['print 1\n', '\x0cprint 2\n', '\n']
>>> 'print 1\n\x0cprint 2\n\n'.splitlines(True)
['print 1\n', '\x0c', 'print 2\n', '\n']
>>> list(io.StringIO('print 1\x0cprint 2\n\n'))
['print 1\x0cprint 2\n', '\n']
>>> 'print 1\x0cprint 2\n\n'.splitlines(True)
['print 1\x0c', 'print 2\n', '\n']

I think splitlines has it correct.
msg129242 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-24 02:31
On the other hand, I believe io is documented as only recognizing \r and \n, so its behavior matches its documentation.
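
To make the io side concrete, here is a minimal sketch, assuming Python 3. With newline='', a text stream ends lines only at '\n', '\r', or '\r\n' and leaves them untranslated, so a form feed stays inside the line. The helper io_style_splitlines is a hypothetical name introduced for illustration only; it is not part of the io module or any library.

```python
import io
import re

# Hypothetical helper: split the way a text stream opened with newline=''
# does, i.e. break only after '\r\n', '\r', or '\n' and keep the endings.
_IO_LINE = re.compile(r'[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+')

def io_style_splitlines(text):
    return _IO_LINE.findall(text)

sample = 'print 1\x0cprint 2\r\nprint 3\rprint 4\n'

# The stream keeps the form feed (\x0c) inside the first line.
stream_lines = list(io.StringIO(sample, newline=''))

# The regex-based helper is meant to agree with the stream.
helper_lines = io_style_splitlines(sample)

# str.splitlines additionally breaks at the form feed.
str_lines = sample.splitlines(True)

print(stream_lines)
print(helper_lines)
print(str_lines)
```

Note that io.StringIO defaults to newline='\n', which splits on '\n' only; newline='' is passed here so that '\r' and '\r\n' also end lines.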
msg129244 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-02-24 02:33
I don't see that, but the chances of changing either of these are quite low, so I suppose we should just document.
msg129245 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-24 02:45
"newline controls how universal newlines works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'..."

Whereas splitlines says:

"Return a list of the lines in the string, breaking at line boundaries."

So if we are fixing docs, we need to add that "line boundaries" are based on the relevant Unicode properties (see issue 7643).

And, indeed, Antoine has already pronounced on this in issue 6664.

Since this has come up more than once, adding a note that they are not recognized by design in the io module to the io docs might be a good idea.
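
The gap between the two definitions can be listed with a short sketch, assuming Python 3. The candidate list below is taken from the line boundaries given in the current str.splitlines documentation; the function names are hypothetical and exist only for this example.

```python
import io

# Line boundaries listed in the str.splitlines documentation
# ('\r\n' is omitted because it is a two-character sequence).
candidates = ['\n', '\r', '\x0b', '\x0c', '\x1c', '\x1d', '\x1e',
              '\x85', '\u2028', '\u2029']

def splits_str(ch):
    # True if str.splitlines treats ch as a line boundary.
    return len(('a' + ch + 'b').splitlines()) == 2

def splits_io(ch):
    # True if a text stream with newline='' ends a line at ch.
    return len(list(io.StringIO('a' + ch + 'b', newline=''))) == 2

# Characters that split a str but are not line endings for io.
only_splitlines = [ch for ch in candidates
                   if splits_str(ch) and not splits_io(ch)]
print([ascii(ch) for ch in only_splitlines])
```

Every candidate other than '\n' and '\r' ends up in only_splitlines, which is the set of characters a note in the io docs would need to cover.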
msg377499 - (view) Author: Irit Katriel (iritkatriel) * (Python triager) Date: 2020-09-25 18:54
I think this documentation enhancement was done under issue36642 (PR 12855).
msg377500 - (view) Author: Irit Katriel (iritkatriel) * (Python triager) Date: 2020-09-25 18:55
Sorry, I copied the wrong numbers. The doc change is here: https://github.com/python/cpython/commit/8218bd4caf683ee98c450a093bf171dbca6c4849
History
Date                 User               Action  Args
2020-10-20 10:49:12  iritkatriel        set     status: open -> closed
                                                resolution: out of date
                                                stage: needs patch -> resolved
2020-09-25 18:55:54  iritkatriel        set     messages: + msg377500
2020-09-25 18:54:44  iritkatriel        set     nosy: + iritkatriel
                                                messages: + msg377499
2011-02-24 02:45:03  r.david.murray     set     assignee: docs@python
                                                type: behavior
                                                components: + Documentation
                                                versions: + Python 2.7
                                                nosy: + docs@python, pitrou
                                                messages: + msg129245
                                                stage: needs patch
2011-02-24 02:33:42  benjamin.peterson  set     nosy: benjamin.peterson, r.david.murray
                                                messages: + msg129244
2011-02-24 02:31:22  r.david.murray     set     nosy: benjamin.peterson, r.david.murray
                                                messages: + msg129242
2011-02-24 02:29:40  r.david.murray     set     nosy: + r.david.murray
                                                messages: + msg129241
2011-02-24 02:19:52  benjamin.peterson  create