import and execfile don't handle UTF-16 encoded files,
but if I read the file with the appropriate codec,
exec works fine on the loaded unicode string.
Also, changing the encoding in site.py to utf-16 has a
detrimental effect. (I need to understand this better.)
I understand that the general problem is difficult to
solve, but it seems it would be fairly easy to handle
the specific case of a UTF-16 file with a byte order
mark (BOM) at the beginning: if import/execfile fail
and the file starts with a BOM, re-read the file
with the appropriate codec.
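A rough sketch of the fallback I have in mind, written as
user-level code rather than as a patch to the interpreter.
The helper names (starts_with_utf16_bom, read_source,
exec_file) are made up for illustration and are not part of
import or execfile. It only looks for the two UTF-16 byte
order marks and assumes that any other file can be read the
usual way.

```python
import codecs

# Hypothetical helper names; nothing here is part of import or execfile.
UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def starts_with_utf16_bom(path):
    """Return True if the file begins with a UTF-16 byte order mark."""
    f = open(path, "rb")
    try:
        head = f.read(2)
    finally:
        f.close()
    return head in UTF16_BOMS


def read_source(path):
    """Return the file's text, decoding as UTF-16 when a BOM is present."""
    if starts_with_utf16_bom(path):
        # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
        f = codecs.open(path, "r", "utf-16")
    else:
        # Assumption: anything without a UTF-16 BOM is read as before.
        f = open(path, "r")
    try:
        return f.read()
    finally:
        f.close()


def exec_file(path, namespace):
    """Stand-in for execfile: decode first, then exec the decoded string."""
    exec(read_source(path), namespace)
```

With this, exec_file("foo.py", {}) would take the same route
as the final exec(uu) in the script below, so it should only
be expected to work where that call works.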
Use this code to reproduce the problem:
--------------
import sys
print sys.getdefaultencoding()
code = u'print "this is a test: OK"'
import traceback
import codecs
codecs.open("foo.py","w+","utf-16").write(code)
try:
    execfile("foo.py")
except:
    traceback.print_exc()
try:
    import foo
except:
    traceback.print_exc()
uu = codecs.open("foo.py","r","utf-16").read()
exec(uu)
--------------
It produces this output:
--------------
ascii
Traceback (most recent call last):
  File "C:\opt\unicode-exec.py", line 12, in ?
    execfile("foo.py")
  File "<string>", line 1
    p
    ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "C:\opt\unicode-exec.py", line 17, in ?
    import foo
  File "<string>", line 1
    p
    ^
SyntaxError: invalid syntax
this is a test: OK
--------------
If I edit site.py to change encoding to "utf-16", I get:
--------------
utf-16
Traceback (most recent call last):
  File "C:\opt\unicode-exec.py", line 15, in ?
    execfile("foo.py")
  File "<string>", line 1
    p
    ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "C:\opt\unicode-exec.py", line 20, in ?
    import foo
  File "<string>", line 1
    p
    ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "C:\opt\unicode-exec.py", line 27, in ?
    exec(uu)
TypeError: expected string without null bytes
--------------
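A small check that may help explain both errors. It only
shows that the UTF-16 form of the test source contains a NUL
byte for every ASCII character. Linking that to the
SyntaxError stopping right after "p" and to the "null bytes"
TypeError is my assumption and is not confirmed here. The
variable names are made up for illustration.

```python
import codecs

# Same source text as in the reproduction script (it is only data here).
source = u'print "this is a test: OK"'
raw = source.encode("utf-16")

# The encoded form starts with a 2-byte BOM, then two bytes per character.
has_bom = raw[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# Each ASCII character contributes exactly one zero byte in UTF-16.
nul_count = raw.count(b"\x00")

print(has_bom)
print(len(raw) == 2 + 2 * len(source))
print(nul_count == len(source))
```

If the assumption holds, anything that treats these bytes as
a plain C string would see the text end at the first NUL.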