Message283271
It seems like encodings.normalize_encoding() currently has no unit test! Before modifying it, I would prefer to see a few unit tests:
* " utf 8 "
* "UtF 8"
* "utf8\xE9"
* etc.
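A minimal sketch of what such tests could look like, using the inputs listed above. The class name, the `run_tests()` helper, and the assertions are assumptions, not existing CPython tests. The expectations are kept loose on purpose: case is compared case-insensitively, and the non-ASCII input is only checked for its ASCII prefix, since the handling of "\xE9" is exactly the kind of behaviour that would need to be decided.

```python
import encodings
import unittest


class NormalizeEncodingTest(unittest.TestCase):
    """Hypothetical tests for encodings.normalize_encoding()."""

    def test_spaces(self):
        # Leading/trailing spaces are dropped, the inner run becomes "_".
        self.assertEqual(encodings.normalize_encoding(" utf 8 "), "utf_8")

    def test_mixed_case(self):
        # Compared case-insensitively so the test does not depend on
        # whether the function preserves or folds case.
        result = encodings.normalize_encoding("UtF 8")
        self.assertEqual(result.lower(), "utf_8")

    def test_non_ascii(self):
        # Only the ASCII prefix is checked; what happens to "\xe9"
        # is left open here (assumption: the call returns a str).
        result = encodings.normalize_encoding("utf8\xe9")
        self.assertIsInstance(result, str)
        self.assertTrue(result.startswith("utf8"))


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizeEncodingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests().wasSuccessful()` reports whether all three cases pass on the interpreter at hand.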
Since we are talking about an optimization, I would like to see benchmark results before and after the change. I would also like to test Marc-Andre's idea of exposing the C function _Py_normalize_encoding().
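A before/after comparison could be measured with something like the following sketch, run once on each build. The `bench()` helper and its default parameters are hypothetical. It only times the pure-Python entry point and reports the best per-call time.

```python
import encodings
import timeit


def bench(name, number=100_000, repeat=5):
    """Return the best per-call time (seconds) of normalize_encoding(name)."""
    timer = timeit.Timer(lambda: encodings.normalize_encoding(name))
    # min() of the repeats is the least noisy estimate of the raw cost.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for sample in ("utf-8", " utf 8 ", "ISO-8859-1"):
        print(f"{sample!r}: {bench(sample) * 1e9:.0f} ns per call")
```

The lambda adds a constant call overhead to every measurement, so the numbers are mainly useful for comparing two builds rather than as absolute costs.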
_Py_normalize_encoding() works on a byte string encoded to Latin1. To implement encodings.normalize_encoding() on top of it, we might rewrite the function to work on Py_UCS4 characters, or have a fast version for char* and a more generic version for UCS2 and UCS4?
Date | User | Action | Args
2016-12-15 09:53:01 | vstinner | set | recipients: + vstinner, lemburg, jcea, belopolsky, ezio.melotti, sdaoden, serhiy.storchaka
2016-12-15 09:53:01 | vstinner | set | messageid: <1481795581.59.0.160043064791.issue11322@psf.upfronthosting.co.za>
2016-12-15 09:53:01 | vstinner | link | issue11322 messages
2016-12-15 09:53:01 | vstinner | create |