Author vstinner
Recipients erlendaasland, hroncok, koubaa, methane, vstinner
Date 2021-06-14.17:09:38
Content
Oh, I forgot about this issue. Let me rebuild the context.

Copy of the What's New in Python 3.10 entry:

"Removed the unicodedata.ucnhash_CAPI attribute which was an internal PyCapsule object. The related private _PyUnicode_Name_CAPI structure was moved to the internal C API. (Contributed by Victor Stinner in bpo-42157.)"


The C API changed as follows. Python <= 3.9:

typedef struct {

    /* Size of this struct */
    int size;

    /* Get name for a given character code.  Returns non-zero if
       success, zero if not.  Does not set Python exceptions.
       If self is NULL, data come from the default version of the database.
       If it is not NULL, it should be a unicodedata.ucd_X_Y_Z object */
    int (*getname)(PyObject *self, Py_UCS4 code, char* buffer, int buflen,
                   int with_alias_and_seq);

    /* Get character code for a given name.  Same error handling
       as for getname. */
    int (*getcode)(PyObject *self, const char* name, int namelen, Py_UCS4* code,
                   int with_named_seq);

} _PyUnicode_Name_CAPI;

Python >= 3.10:

typedef struct {

    /* Get name for a given character code.
       Returns non-zero if success, zero if not.
       Does not set Python exceptions. */
    int (*getname)(Py_UCS4 code, char* buffer, int buflen,
                   int with_alias_and_seq);

    /* Get character code for a given name.
       Same error handling as for getname(). */
    int (*getcode)(const char* name, int namelen, Py_UCS4* code,
                   int with_named_seq);

} _PyUnicode_Name_CAPI;

Changes:

* _PyUnicode_Name_CAPI.size was removed
* The getname and getcode functions no longer take a "self" argument

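To make the new shape concrete, here is a minimal, self-contained sketch of a function-pointer table with the same layout as the Python >= 3.10 struct. It does not use the real CPython headers: demo_ucs4, demo_name_capi, demo_table and the demo_* functions are hypothetical stand-ins, and the two-entry name table is toy data, not the real Unicode database.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for Py_UCS4 so the sketch compiles without Python.h. */
typedef uint32_t demo_ucs4;

/* Same layout as the Python >= 3.10 struct: no "size" field, no "self". */
typedef struct {
    int (*getname)(demo_ucs4 code, char *buffer, int buflen,
                   int with_alias_and_seq);
    int (*getcode)(const char *name, int namelen, demo_ucs4 *code,
                   int with_named_seq);
} demo_name_capi;

/* Toy name database (assumption: two entries are enough to illustrate). */
static const struct { demo_ucs4 code; const char *name; } demo_table[] = {
    {0x41, "LATIN CAPITAL LETTER A"},
    {0x20AC, "EURO SIGN"},
};
#define DEMO_TABLE_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Returns non-zero on success, zero on failure, like the real getname. */
static int demo_getname(demo_ucs4 code, char *buffer, int buflen,
                        int with_alias_and_seq)
{
    (void)with_alias_and_seq;  /* aliases and sequences are not modelled */
    for (size_t i = 0; i < DEMO_TABLE_LEN; i++) {
        if (demo_table[i].code == code) {
            size_t len = strlen(demo_table[i].name);
            if (buflen <= 0 || len >= (size_t)buflen) {
                return 0;  /* buffer too small for name plus NUL */
            }
            memcpy(buffer, demo_table[i].name, len + 1);
            return 1;
        }
    }
    return 0;
}

/* Same error handling as demo_getname. */
static int demo_getcode(const char *name, int namelen, demo_ucs4 *code,
                        int with_named_seq)
{
    (void)with_named_seq;  /* named sequences are not modelled */
    if (namelen < 0) {
        return 0;
    }
    for (size_t i = 0; i < DEMO_TABLE_LEN; i++) {
        if (strlen(demo_table[i].name) == (size_t)namelen
            && memcmp(demo_table[i].name, name, (size_t)namelen) == 0) {
            *code = demo_table[i].code;
            return 1;
        }
    }
    return 0;
}

static const demo_name_capi demo_capi = { demo_getname, demo_getcode };

/* Usage: a name/code round trip with no "self" argument anywhere. */
static void demo_roundtrip(void)
{
    char buf[64];
    demo_ucs4 code = 0;
    assert(demo_capi.getname(0x20AC, buf, (int)sizeof(buf), 0));
    assert(demo_capi.getcode(buf, (int)strlen(buf), &code, 0));
    assert(code == 0x20AC);
}
```

A caller only needs the table and a code point or a name; there is no object to pass and no struct size to check.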
There was also a "void *state" parameter in commit https://github.com/python/cpython/commit/47e1afd2a1793b5818a16c41307a4ce976331649 but it was removed later.


Inside CPython, this C API is used in two places:

* unicodeobject.c: the "\N{...}" escape sequence, to get a code point by its name
* codecs.c: PyCodec_NameReplaceErrors(), the "namereplace" error handler

Both used self=NULL in Python 3.9.


It was simpler to remove the C API than to try to keep backward compatibility. The difficulty was supporting the "self" parameter.

See the comment:
---
// Check if self is an unicodedata.UCD instance.
// If self is NULL (when the PyCapsule C API is used), return 0.
// PyModule_Check() is used to avoid having to retrieve the ucd_type.
// See unicodedata_functions comment to the rationale of this macro.
#define UCD_Check(self) (self != NULL && !PyModule_Check(self))
---
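As a rough illustration of why "self" complicated the old API, here is a toy model of the Python <= 3.9 calling convention. It is a sketch under stated assumptions, not the unicodedata implementation: toy_ucs4, toy_ucd, TOY_UCD_CHECK, toy_names and toy_getname are hypothetical names, the struct layout is invented, and the toy check only tests for NULL (the real macro also excludes module objects).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t toy_ucs4;  /* stand-in for Py_UCS4 */

/* Stand-in for a unicodedata.ucd_X_Y_Z object (hypothetical layout). */
typedef struct {
    int major;  /* Unicode major version this object is pinned to */
} toy_ucd;

/* Toy counterpart of UCD_Check(): a non-NULL self selects an old database. */
#define TOY_UCD_CHECK(self) ((self) != NULL)

/* Toy name table; since_major is the version that introduced the name. */
static const struct {
    toy_ucs4 code;
    int since_major;
    const char *name;
} toy_names[] = {
    {0x20AC, 2, "EURO SIGN"},
    {0x1F600, 6, "GRINNING FACE"},
};

/* Python <= 3.9 shape: self comes first, NULL means the default database. */
static int toy_getname(const toy_ucd *self, toy_ucs4 code, char *buffer,
                       int buflen, int with_alias_and_seq)
{
    (void)with_alias_and_seq;  /* not modelled in this sketch */
    for (size_t i = 0; i < sizeof(toy_names) / sizeof(toy_names[0]); i++) {
        if (toy_names[i].code != code) {
            continue;
        }
        /* An old-version object must hide characters added later. */
        if (TOY_UCD_CHECK(self) && toy_names[i].since_major > self->major) {
            return 0;
        }
        size_t len = strlen(toy_names[i].name);
        if (buflen <= 0 || len >= (size_t)buflen) {
            return 0;  /* buffer too small for name plus NUL */
        }
        memcpy(buffer, toy_names[i].name, len + 1);
        return 1;
    }
    return 0;
}

/* Usage: the capsule-style call passes self=NULL, as both callers did. */
static void toy_demo(void)
{
    char buf[32];
    assert(toy_getname(NULL, 0x1F600, buf, (int)sizeof(buf), 0));
    assert(strcmp(buf, "GRINNING FACE") == 0);
}
```

Every implementation behind the old struct had to branch on "self" like this, even though the only in-tree callers always passed NULL.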

In my PR, I wrote:

"I prefer to merge this early in the 3.10 dev cycle, to increase chances of getting early user feedback if this change breaks 3rd party applications.

Thanks for the review @methane. Usually, features require 2 Python release to be removed, with a deprecation first. But this specific case is really weird. I chose to remove it immediately. IMO it was exposed in public "by mistake", whereas a private attribute would be enough for internal usage."

https://github.com/python/cpython/pull/22994#issuecomment-716958371