Qualifier: This is the first issue that I've raised, so I apologise beforehand for any protocol flubs.
str.splitlines() behaves inconsistently with str.split() in a way I didn't expect.
In this code snippet, a server buffers input until it receives a blank line, and then it processes the input:
request_buffer = ""
while request_buffer.split("\n")[-1] != "" or request_buffer == "":
    request_buffer += self.conn.recv(1024)
    print("Got a line!")
print("Got an empty line!")
self.handleRequest(request_buffer)
I found out the hard way that this code isn't prepared to handle clients that use a different "new line" standard, such as those that send "\r". I discovered str.splitlines() at that point and found that, to some extent, it works as advertised: splitting lines regardless of exactly what new line character is being used.
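As a quick illustration of that difference (the sample string is made up for this example and is not taken from my server), splitting on "\n" leaves stray "\r" characters behind, while str.splitlines() recognises all three common line endings:

```python
# Hypothetical sample mixing Unix (\n), Windows (\r\n) and old Mac (\r) endings.
mixed = "one\ntwo\r\nthree\rfour"

# Splitting on "\n" only understands one convention.
unix_only = mixed.split("\n")

# splitlines() splits on every line-break convention it knows about.
universal = mixed.splitlines()

print(unix_only)
print(universal)
```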
However, this code doesn't work:
request_buffer = ""
while request_buffer.splitlines()[-1] != "" or request_buffer == "":
    request_buffer += self.conn.recv(1024)
    print("Got a line!")
print("Got an empty line!")
self.handleRequest(request_buffer)
Python complains that -1 is out of range for request_buffer.splitlines(). I know that str.splitlines() handles empty lines, because I've used it on longer strings to test for trailing blank lines before; it only refuses to count a line as blank if there isn't another line after it. "derp".splitlines() has a length of 1, but "".splitlines() has a length of 0. "derp\n".splitlines() also has a length of 1, thus excluding the trailing blank line.
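The lengths described above can be reproduced with a short script. This is only a sketch for comparison; the names samples and lengths are made up for the example:

```python
# Compare how many elements str.splitlines() and str.split("\n") return.
samples = ["", "derp", "derp\n", "derp\n\n"]

# Map each sample to (len of splitlines(), len of split("\n")).
lengths = {}
for text in samples:
    lengths[text] = (len(text.splitlines()), len(text.split("\n")))

for text in samples:
    print(repr(text), lengths[text])

# The IndexError from the loop above comes from indexing an empty list.
try:
    "".splitlines()[-1]
except IndexError as exc:
    print("IndexError:", exc)
```

Only the last sample, where the blank line is followed by another line break, gets the blank line counted by str.splitlines().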
In my opinion, "".splitlines() should have 1 element. "derp".splitlines() should continue to have 1 element, but "derp\n".splitlines() should have 2 elements. This would give the same behaviour as str.split("\n") (where "\n".split("\n") results in two empty-string elements), but it would have the benefit of working predictably with all line-breaking standards, which I assume was the idea all along.
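To make the request concrete, here is a minimal sketch of the behaviour I have in mind, written as a wrapper around the existing method. The function name proposed_splitlines is hypothetical and not part of Python; it relies only on str.splitlines() and its existing keepends argument:

```python
def proposed_splitlines(text):
    """Hypothetical helper: like str.splitlines(), but a trailing line
    break (or an empty string) yields a final empty element, the way
    str.split("\n") does."""
    lines = text.splitlines()
    # With keepends=True the last element still carries its line break,
    # so it differs from the stripped version only if the text ended
    # with a line break.
    if not text or text.splitlines(True)[-1] != lines[-1]:
        lines.append("")
    return lines


print(proposed_splitlines(""))
print(proposed_splitlines("derp"))
print(proposed_splitlines("derp\n"))
print(proposed_splitlines("derp\r\n"))
```

With a helper like this, indexing [-1] is always safe because the result is never an empty list.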