Message28769 (user_id=561546)
It turns out the code is already written but disabled; simply turning it on would work.
tokenizer.c(321):

#if 0
    /* Disable support for UTF-16 BOMs until a decision
       is made whether this needs to be supported. */
    } else if (ch == 0xFE) {
        ch = get_char(tok); if (ch != 0xFF) goto NON_BOM;
        if (!set_readline(tok, "utf-16-be")) return 0;
        tok->decoding_state = -1;
    } else if (ch == 0xFF) {
        ch = get_char(tok); if (ch != 0xFE) goto NON_BOM;
        if (!set_readline(tok, "utf-16-le")) return 0;
        tok->decoding_state = -1;
#endif
With that enabled, executing a UTF-16 text file with a BOM works. However, if I also include an encoding declaration along with the BOM, like this:
# -*- coding: UTF-16le -*-
it results in the error below, due to some logic in the code that I couldn't sort out (tokenizer.c(291)):
g:\bin\py_repos\python-svn\PCbuild>python_d.exe test16le.py
  File "test16le.py", line 1
SyntaxError: encoding problem: utf-8
If you need a justification for checking the UTF-16 BOM, it is Microsoft. As an early adopter of Unicode before UTF-8 became popular, Microsoft ships software that generates UTF-16 by default. Not a fatal issue, but I see no reason not to support it either.
Date                | User  | Action | Args
2007-08-23 14:40:30 | admin | link   | issue1503789 messages
2007-08-23 14:40:30 | admin | create |