
Author tungwaiyip
Date 2006-06-23.01:31:10

It turns out the code is already written but disabled. Simply 
enabling it would work.

#if 0
	/* Disable support for UTF-16 BOMs until a decision
	   is made whether this needs to be supported.  */
	} else if (ch == 0xFE) {
		ch = get_char(tok); if (ch != 0xFF) goto NON_
		if (!set_readline(tok, "utf-16-be")) return 0;
		tok->decoding_state = -1;
	} else if (ch == 0xFF) {
		ch = get_char(tok); if (ch != 0xFE) goto NON_
		if (!set_readline(tok, "utf-16-le")) return 0;
		tok->decoding_state = -1;
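
For readers who do not want to trace the C, the check above can be sketched in Python. This is only an illustration of the same byte comparison, not the tokenizer's code: the function name `sniff_utf16_bom` is hypothetical, and it inspects both bytes at once rather than reading one character at a time as the C does.

```python
import codecs


def sniff_utf16_bom(data: bytes):
    """Return the codec implied by a leading UTF-16 BOM, or None.

    Hypothetical helper for illustration; it mirrors the two
    branches of the disabled C block (FE FF and FF FE).
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    return None  # no UTF-16 BOM found
```

Calling `sniff_utf16_bom(open(path, "rb").read(2))` on a file would give the codec name that the C branches pass to set_readline, or None when neither BOM is present.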

Executing a UTF-16 text file with a BOM would work. 
However, if I also include an encoding declaration in 
addition to the BOM, like this

  # -*- coding: UTF-16le -*-

It would result in this error, due to some logic in the code 
that I couldn't sort out (tokenizer.c(291)):

  g:\bin\py_repos\python-svn\PCbuild>python_d.exe test16le.
    File "", line 1
  SyntaxError: encoding problem: utf-8
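
To reproduce the two cases (BOM only, and BOM plus declaration), a test file can be generated rather than saved from an editor. The sketch below is an assumption about how such a file could be built; the helper name `build_utf16le_source` and the output file name in the note after it are hypothetical, and it makes no claim about how any particular interpreter build responds to the result.

```python
import codecs


def build_utf16le_source(body: str, declare_encoding: bool) -> bytes:
    """Build UTF-16-LE source bytes that start with a BOM.

    Hypothetical helper: when declare_encoding is true, the first
    line is the coding declaration quoted in this message.
    """
    lines = []
    if declare_encoding:
        lines.append("# -*- coding: UTF-16le -*-")
    lines.append(body)
    text = "\n".join(lines) + "\n"
    # The BOM (FF FE) comes first, followed by the encoded text.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

Writing `build_utf16le_source("print(1)", True)` to a file opened in binary mode, for example a hypothetical `bom_demo.py`, gives the BOM plus declaration case; passing `False` gives the BOM-only case.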

If you need a justification for checking the UTF-16 BOM, it 
is Microsoft. Because Microsoft adopted Unicode early, before 
UTF-8 became popular, some software generates UTF-16 by 
default. This is not a fatal issue, but I see no reason not 
to support it either.

Date                 User   Action  Args
2007-08-23 14:40:30  admin  link    issue1503789 messages
2007-08-23 14:40:30  admin  create