classification
Title: Seeking to the beginning of a text file a second time will return the BOM as first character
Type: behavior Stage:
Components: Unicode Versions: Python 3.0, Python 2.7, Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eggy, pitrou (2)
Priority: low Keywords:

Created on 2009-06-11 18:26 by eggy, last changed 2009-06-11 18:33 by pitrou.

Messages (2)
msg89257 - Author: Mark Florisson (eggy) Date: 2009-06-11 18:26
>>> f = open('foo', 'wt+', encoding='UTF-16')
>>> f.write('spam ham eggs')
13
>>> f.seek(0)
0
>>> f.read()
'spam ham eggs'
>>> f.seek(0)
0
>>> f.read()
'\ufeffspam ham eggs'

Although the BOM character is a ZERO WIDTH NO-BREAK SPACE, and should
therefore not cause many problems, the behavior is inconsistent and
unexpected: the first read after seek(0) strips the BOM, the second
one returns it as part of the text.
codecs.open in 2.x suffers from this same behavior.
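
For anyone stuck on an affected version, a possible workaround is to drop a leading U+FEFF after every rewind. This is only a minimal sketch: the names read_from_start and demo are hypothetical and not part of any library. It assumes the file content never legitimately starts with a ZERO WIDTH NO-BREAK SPACE, since that character would be stripped as well.

```python
import os
import tempfile

BOM = '\ufeff'  # U+FEFF, the character the second read() returns on affected versions


def read_from_start(f):
    """Rewind a text file and return its contents without a leading BOM.

    Hypothetical helper: on versions where the decoder already consumes
    the BOM after seek(0), the startswith() check is simply a no-op.
    """
    f.seek(0)
    text = f.read()
    if text.startswith(BOM):
        text = text[len(BOM):]
    return text


def demo():
    """Repeat the session from the report, reading from the start twice."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wt+', encoding='UTF-16') as f:
            f.write('spam ham eggs')
            first = read_from_start(f)
            second = read_from_start(f)
        return first, second
    finally:
        os.remove(path)
```

With the helper, both reads in demo() should return the same string regardless of whether the underlying bug is present.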
msg89258 - Author: Antoine Pitrou (pitrou) Date: 2009-06-11 18:33
This is fixed in 3.1.
History
Date User Action Args
2009-06-11 18:33:53 pitrou set priority: low; versions: - Python 2.5, Python 2.4, Python 3.1, Python 3.2; nosy: + pitrou; messages: + msg89258
2009-06-11 18:26:42 eggy create