New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get rid of rare format units in PyArg_Parse* #68197
Comments
There are a lot of format units supported in PyArg_Parse* functions, but some of them are rarely or never used in current CPython code. Some of format units are legacy from Python 2 and are not needed in modern Python 3 code or can be replaced with custom converter. Here are results of grepping (not including Modules/_testcapimodule.c). "es", "es#", "et#", "z*", "Z#" are not used. "y#": "z#": "u#": "y": "et": "s*": "s#": In future may be we could deprecate some format units and remove them in 4.0. This issue is a meta issue. Every case should be considered individually. |
In textio.c, the decoder always should return bytes, not arbitrary read-only buffer (this is required in other parts of the code). So "y#" can be replaced with "O" with PyBytes_GET_SIZE. |
+1. I was recently trying to use the C API for a 3rd party library, and all of these subtly different string parameter formats made things surprisingly confusing. These are part of the Python C API, so removing them could break 3rd party code. Simply searching through the stdlib is not enough to show that these are not in use. So removal would require a deprecation period. |
New changeset d65233f630e1 by Serhiy Storchaka in branch 'default': |
Note that these format characters can also be used outside of CPython. |
Yes, of course, I think we shouldn't drop support of these format units. But using them likely is a sign of outdated or transitional code. It should be discouraged in new code, and every case should be analyzed and cleaned. |
“u#” should not be deprecated without first deprecating “u”, which is less useful due to not returning a buffer length. Also, I have always been mystified about how “s#”, “z#”, “y” and “y#” can properly to return a pointer into a buffer for arbitrary immutable bytes-like objects, without requiring PyBuffer_Release() to be called. Perhaps this is bad design to be discouraged. Or maybe a documentation oversight somewhere. |
“s#”, “z#”, “y” and “y#” work only with read-only buffers, for which PyBuffer_Release() is no-op operation. Initially they was designed for work with old buffer protocol that doesn't support releasing a buffer. |
Yes I just figured out that myself. Specifically, PyBufferProcs.bf_releasebuffer has to be NULL, and the buffer stays alive as long as the object stays alive. Also it looks like I was wrong about “u” being useless. I was tricked by a contradiction in the documentation, but I will try to fix this in a patch to bpo-24278. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: