Noncharacters
These codes are intended for process internal uses, but
are not permitted for interchange.
FFFE <not a character>
• the value FFFE is guaranteed not to be
a Unicode character at all
• may be used to detect byte order by
contrast with FEFF which is a character
→ FEFF zero width no-break space
FFFF <not a character>
• the value FFFF is guaranteed not to be
a Unicode character at all
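The byte-order remark in the excerpt above can be illustrated with a minimal Python 3 sketch. It assumes the input is already known to be UTF-16 (a UTF-32 little-endian stream starts with the same two bytes), and the function name detect_utf16_byte_order is hypothetical, not part of any library.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Guess the byte order of UTF-16 data from its first two bytes.

    Assumption: the caller already knows the data is UTF-16.
    """
    prefix = data[:2]
    if prefix == codecs.BOM_UTF16_BE:
        # Bytes FE FF: U+FEFF when read big-endian.
        return "big-endian"
    if prefix == codecs.BOM_UTF16_LE:
        # Bytes FF FE: would decode to the noncharacter U+FFFE if read
        # big-endian, so the data must be little-endian.
        return "little-endian"
    return "unknown"
```

Because FFFE can never be a valid character, a reader that sees it in the first position can conclude that it is reading the bytes in the wrong order.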
In particular, an XML document that contains such an
alleged Unicode entity is not well-formed.
All Unicode-aware versions of Python treat these
code points in the same manner as other code points, e.g.
both unichr(0xFFFE) and u'\uffff' pass without complaint.
I believe the correct behavior would be for Python to
raise an exception, or at least a warning, on access to
these spurious characters.
This is on purpose: you do need a way to write programs
which write and handle BOMs. If you want your program to
raise exceptions for these code points, you can easily
implement the required checks.
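As a rough illustration of such a check, here is a minimal sketch in Python 3 (where chr() replaces unichr()). The names NONCHARACTERS and reject_noncharacters are hypothetical, and the set covers only the two code points discussed in this report.

```python
# Hypothetical helper: only the two noncharacters from this report are checked.
NONCHARACTERS = frozenset({0xFFFE, 0xFFFF})


def reject_noncharacters(text: str) -> str:
    """Return text unchanged, or raise ValueError if it contains U+FFFE or U+FFFF."""
    for index, ch in enumerate(text):
        if ord(ch) in NONCHARACTERS:
            raise ValueError(
                f"noncharacter U+{ord(ch):04X} at index {index}"
            )
    return text
```

A program that wants strict behavior could call reject_noncharacters() on text before serializing it, for example before writing XML, while code that handles BOMs keeps working with U+FEFF as before.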