This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add method to detect if a string contains surrogates
Type: enhancement Stage:
Components: Interpreter Core, Unicode Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, r.david.murray, vstinner
Priority: normal Keywords:

Created on 2015-09-29 12:48 by r.david.murray, last changed 2022-04-11 14:58 by admin.

Messages (1)
msg251853 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-09-29 12:48
Because surrogates are in several contexts used to "smuggle" bytes through string APIs using surrogateescape, it is very useful to be able to determine if a given string contains surrogates.  The email package, for example, uses different logic to handle strings that contain smuggled bytes and strings that don't when serializing a Message object.  Currently it uses x.encode() and checks for an exception (we determined that for CPython this was the most efficient method to check).  It would be better, I think, to have a dedicated method on str for this, among other reasons so that different python implementations could optimize it appropriately.

(Note that another aspect of dealing with surrogateescaped strings is discussed in issue 18814.)
History
Date User Action Args
2022-04-11 14:58:21adminsetgithub: 69456
2015-09-29 13:05:02vstinnersetnosy: + ezio.melotti, vstinner
components: + Unicode
2015-09-29 12:48:03r.david.murraycreate