
Author Matthew.Boehm
Recipients Matthew.Boehm
Date 2011-08-29.21:42:29
Message-id <1314654150.68.0.797504547224.issue12855@psf.upfronthosting.co.za>
Content
A file opened with codecs.open() splits lines on the form feed character (\x0c), while a file opened with open() does not.

>>> with open("formfeed.txt", "w") as f:
...   f.write("line \fone\nline two\n")
...
>>> with open("formfeed.txt", "r") as f:
...   s = f.read()
...
>>> s
'line \x0cone\nline two\n'
>>> print s
line
    one
line two

>>> import codecs
>>> with open("formfeed.txt", "rb") as f:
...   lines = f.readlines()
...
>>> lines
['line \x0cone\n', 'line two\n']
>>> with codecs.open("formfeed.txt", "r", encoding="ascii") as f:
...   lines2 = f.readlines()
...
>>> lines2
[u'line \x0c', u'one\n', u'line two\n']
>>>

Note that lines contains two items while lines2 contains three: the codecs reader treated \x0c as a line boundary.

Issue 7643 has a good discussion of newlines in Python, but I did not see this discrepancy mentioned.
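
The same split can be reproduced without any file, which may help narrow down where it comes from. The sketch below uses Python 3 syntax (the session above is Python 2) and compares str.splitlines(), bytes.splitlines(), and an io text stream on the same data. That the codecs reader splits decoded text the way str.splitlines() does is my assumption about the cause, not something confirmed here. read_lines_newline_only() is a hypothetical helper name, shown only as one possible workaround based on io.open().

```python
import io

text = "line \x0cone\nline two\n"

# str.splitlines() treats form feed (\x0c) as a line boundary.
unicode_lines = text.splitlines(True)

# bytes.splitlines() only splits on \n, \r and \r\n.
byte_lines = text.encode("ascii").splitlines(True)

# A text stream from the io module also leaves \x0c alone.
stream_lines = io.StringIO(text).readlines()

print(unicode_lines)  # ['line \x0c', 'one\n', 'line two\n']
print(byte_lines)     # [b'line \x0cone\n', b'line two\n']
print(stream_lines)   # ['line \x0cone\n', 'line two\n']


def read_lines_newline_only(path, encoding="ascii"):
    """Hypothetical helper: decode a file but split lines on newlines only.

    io.open() is the same function as the built-in open() in Python 3 and
    is also available in Python 2.6+.
    """
    with io.open(path, "r", encoding=encoding) as f:
        return f.readlines()
```

With this helper, reading formfeed.txt should give two lines, matching what open() returns in the session above.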
History
Date                 User           Action  Args
2011-08-29 21:42:30  Matthew.Boehm  set     recipients: + Matthew.Boehm
2011-08-29 21:42:30  Matthew.Boehm  set     messageid: <1314654150.68.0.797504547224.issue12855@psf.upfronthosting.co.za>
2011-08-29 21:42:30  Matthew.Boehm  link    issue12855 messages
2011-08-29 21:42:29  Matthew.Boehm  create