
Author susurrus
Recipients susurrus
Date 2013-01-30.18:32:04
Message-id <1359570725.1.0.7766942653.issue17083@psf.upfronthosting.co.za>
In-reply-to
Content
When opening binary files in Python 3, the newline parameter of open() cannot be set. While this kind of makes sense, readline() can still be used on binary files. This is great for my usage, but I believe it operates in universal newlines mode, so that any \r, \n, or \r\n is treated as an end of line.

The data I'm working with is mixed ASCII/binary, with line termination specified by \r\n. I can't read a line (even though that concept occurs in my file) because some of the binary data includes \r or \n even though they aren't newlines in this context.
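
The problem can be reproduced with a small check. This is only a sketch: io.BytesIO stands in for a file opened in 'rb' mode, and the payload is made up for illustration. Note that the io documentation states the line terminator is always b'\n' for binary files, so a bare \r is not expected to end a line, while an embedded \n is.

```python
import io

# io.BytesIO stands in for a file opened with mode 'rb'.
# Made-up payload: an ASCII header, then a binary record that contains
# a bare \r and a bare \n before its real \r\n terminator, then a tail.
stream = io.BytesIO(b'HEADER\r\n\x00\r\x01\n\x02\r\nTAIL')

first = stream.readline()   # the ASCII header line
second = stream.readline()  # stops at the \n embedded in the binary record
third = stream.readline()   # the rest of that same record

for piece in (first, second, third):
    print(repr(piece))
```

The binary record comes back in two pieces, because the embedded \n is treated as a terminator even though the format only uses \r\n for that purpose.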

The issue here is that if the newline string can't be specified, readline() is useless on binary data, which often uses custom EOL strings. So would it be reasonable to add newline parameter support to binary files? If not, shouldn't readline() raise an exception when used on binary files?

I don't know if it's helpful here, but I've written a binary_readline() function supporting arbitrary EOL strings:

def binary_readline(file, newline=b'\r\n'):
    """Read one line from a binary file, up to and including newline."""
    line = bytearray()
    while True:
        x = file.read(1)
        if not x:
            # End of file: return what was read, or None if nothing was read.
            return line if line else None
        line += x
        # Stop once the data read so far ends with the full newline string.
        if line.endswith(newline):
            return line
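
Reading one byte at a time costs one read() call per byte. A chunked variant might look like the following; this is a rough sketch, not an existing io API. The name iter_delimited and the chunk_size parameter are hypothetical, and it assumes a blocking stream whose read() returns b'' at end of file.

```python
import io


def iter_delimited(stream, delimiter=b'\r\n', chunk_size=4096):
    """Yield records from a binary stream, each ending with delimiter.

    Hypothetical helper for illustration; not part of the io module.
    A final record without a trailing delimiter is yielded as-is.
    """
    buffer = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file
        buffer += chunk
        start = 0
        while True:
            end = buffer.find(delimiter, start)
            if end == -1:
                break  # no complete record left in the buffer
            end += len(delimiter)
            yield bytes(buffer[start:end])
            start = end
        # Drop the records already yielded; keep the incomplete remainder.
        del buffer[:start]
    if buffer:
        yield bytes(buffer)


# A tiny chunk size forces the delimiter to straddle chunk boundaries.
data = io.BytesIO(b'HEAD\r\n\x00\r\x01\n\x02\r\nTAIL')
records = list(iter_delimited(data, chunk_size=3))
print(records)
```

Searching the accumulated buffer, rather than only the newest chunk, is what lets a delimiter split across two reads still be found.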
History
Date                 User      Action  Args
2013-01-30 18:32:05  susurrus  set     recipients: + susurrus
2013-01-30 18:32:05  susurrus  set     messageid: <1359570725.1.0.7766942653.issue17083@psf.upfronthosting.co.za>
2013-01-30 18:32:05  susurrus  link    issue17083 messages
2013-01-30 18:32:04  susurrus  create