ncoghlan
2013-08-23
Prompted by issue 18713, here are some possible utilities we could add to the codecs module to help deal with and debug issues related to surrogate-escaped strings:

    def has_escaped_bytes(s):
        """Returns true if string contains surrogate escaped bytes"""

    def replace_escaped_bytes(s):
        """Replaces each surrogate escaped byte with a valid code point"""

    def decode_escaped_bytes(s, nominal_encoding, actual_encoding):
        """Reinterprets incorrectly decoded text using a new encoding"""
        return s.encode(nominal_encoding, 'surrogateescape').decode(actual_encoding)
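
For illustration, here is a minimal sketch of how the first two helpers might be implemented. It is not a settled design. It assumes that "surrogate escaped bytes" means the lone surrogates U+DC80 through U+DCFF that the 'surrogateescape' error handler emits for undecodable bytes 0x80 through 0xFF. The `replacement` parameter and its U+FFFD default are assumptions added for the sketch, as is the module-level `_ESCAPED_BYTE` pattern.

```python
import re

# Assumption: an "escaped byte" is a lone surrogate in U+DC80..U+DCFF,
# the range 'surrogateescape' uses for undecodable bytes 0x80..0xFF.
_ESCAPED_BYTE = re.compile('[\udc80-\udcff]')

def has_escaped_bytes(s):
    """Returns true if string contains surrogate escaped bytes"""
    return _ESCAPED_BYTE.search(s) is not None

def replace_escaped_bytes(s, replacement='\ufffd'):
    """Replaces each surrogate escaped byte with a valid code point"""
    # 'replacement' is a hypothetical extra parameter; U+FFFD is one
    # plausible default, not something the proposal specifies.
    return _ESCAPED_BYTE.sub(replacement, s)

def decode_escaped_bytes(s, nominal_encoding, actual_encoding):
    """Reinterprets incorrectly decoded text using a new encoding"""
    return s.encode(nominal_encoding, 'surrogateescape').decode(actual_encoding)

# Usage: UTF-8 data that was wrongly decoded as ASCII.
raw = b'caf\xc3\xa9'
text = raw.decode('ascii', 'surrogateescape')
print(has_escaped_bytes(text))                       # True
print(ascii(replace_escaped_bytes(text)))            # 'caf\ufffd\ufffd'
print(ascii(decode_escaped_bytes(text, 'ascii', 'utf-8')))  # 'caf\xe9'
```

Note that replace_escaped_bytes is lossy (the original byte values are discarded), while decode_escaped_bytes round-trips the original bytes before decoding them again.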
