
Author lemburg
Recipients eryksun, ezio.melotti, lemburg, paul.moore, python-dev, rafaelblsilva, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware
Date 2021-09-17.07:34:38
Message-id <1631864078.28.0.775823855831.issue45120@roundup.psfhosted.org>
In-reply-to
Content
Just to be clear: The Python code page encodings are (mostly) taken from the unicode.org set of mappings (ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/). The Unicode Consortium is our standards body for such mappings, where possible. Where the Unicode Consortium does not provide a mapping, we resort to other sources (ISO, mapping files commonly used in OSes, Wikipedia, etc.).

Changes to the existing mapping codecs should only be made when the standards bodies apply corrections to the mappings published under those names.
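
To illustrate what following the published mapping means in practice, here is a minimal sketch using the existing cp1252 codec. The helper name encode_strict is hypothetical and exists only for this example; the behaviour shown is that a character without an exact entry in the mapping is rejected rather than approximated.

```python
def encode_strict(text, encoding="cp1252"):
    """Return the encoded bytes, or None if a character has no exact mapping."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # The codec refuses to guess a "close enough" replacement.
        return None


print(encode_strict("\u20ac"))  # EURO SIGN is in the cp1252 mapping -> b'\x80'
print(encode_strict("\u0100"))  # A WITH MACRON is not in the mapping -> None
```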

If you want to add variants such as the best-fit mappings from Microsoft, we'd have to add them under a different name, e.g. bestfit1252 (see ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/).

Otherwise, interop with other systems would no longer work.
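
As a rough sketch of how a best-fit variant could live under its own name without touching cp1252, the following registers a separate codec through codecs.register(). Everything here is an assumption made for illustration: the codec name bestfit1252_demo, the helper functions, and the two-entry table, which is only a small excerpt and not the full WindowsBestFit data. This is not how CPython implements any of its codecs.

```python
import codecs

# Illustrative excerpt only; a real codec would load the full best-fit table.
BEST_FIT_EXTRA = {"\u0100": "A", "\u0101": "a"}


def _encode(text, errors="strict"):
    # Fold best-fit characters first, then defer to the strict cp1252 codec.
    folded = "".join(BEST_FIT_EXTRA.get(ch, ch) for ch in text)
    return folded.encode("cp1252", errors), len(text)


def _decode(data, errors="strict"):
    # Decoding is unchanged: best fit only affects the encoding direction.
    raw = bytes(data)
    return raw.decode("cp1252", errors), len(raw)


def _search(name):
    # Hypothetical codec name, kept distinct from "cp1252".
    if name == "bestfit1252_demo":
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=name)
    return None


codecs.register(_search)

print("\u0100bc".encode("bestfit1252_demo"))  # uses the best-fit table
```

Because the variant has its own name, "\u0100".encode("cp1252") keeps raising UnicodeEncodeError, so data exchanged with other systems under the cp1252 name is unaffected.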

From Eryk's description it sounds like we should always pass WC_NO_BEST_FIT_CHARS as a flag to WideCharToMultiByte() in order to make sure it doesn't use best-fit variants unless explicitly requested.
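
WC_NO_BEST_FIT_CHARS is a flag of the Windows function WideCharToMultiByte(). The following Windows-only sketch calls that function through ctypes to show where the flag goes. The helper name encode_code_page and its return shape are assumptions for this example, and it is not the code path CPython itself uses.

```python
import ctypes
import sys

WC_NO_BEST_FIT_CHARS = 0x00000400  # value from the Windows SDK headers


def encode_code_page(text, code_page=1252, flags=WC_NO_BEST_FIT_CHARS):
    """Encode text with WideCharToMultiByte; return (bytes, used_default)."""
    if sys.platform != "win32":
        raise OSError("WideCharToMultiByte is only available on Windows")
    if not text:
        return b"", False  # the API rejects a zero-length input

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    wide = ctypes.create_unicode_buffer(text)
    n_wide = len(wide) - 1  # UTF-16 code units, without the trailing NUL
    used_default = ctypes.c_int(0)

    # First call with a zero-sized output buffer asks for the required size.
    size = kernel32.WideCharToMultiByte(
        code_page, flags, wide, n_wide, None, 0, None,
        ctypes.byref(used_default))
    if size == 0:
        raise ctypes.WinError(ctypes.get_last_error())

    out = ctypes.create_string_buffer(size)
    written = kernel32.WideCharToMultiByte(
        code_page, flags, wide, n_wide, out, size, None,
        ctypes.byref(used_default))
    if written == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return out.raw[:written], bool(used_default.value)
```

On Windows, comparing encode_code_page("\u0100") with encode_code_page("\u0100", flags=0) should show the difference: with the flag set, the character is expected to be replaced by the default character and reported through used_default, instead of being silently mapped to a look-alike.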