Issue18710
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2013-08-12 12:40 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
getmodattr.patch | pitrou, 2013-08-12 12:40 | review |
Messages (13) | |||
---|---|---|---|
msg194949 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2013-08-12 12:40 | |
Attached patch adds PyState_GetModuleAttr() and converts the _csv module to use it (as an example). As you can see, the _csv module grows a little but it now has proper error handling (previously, it didn't check for PyState_FindModule() returning NULL). (A few lines would be removed if we added PyErr_FormatV as well...) |
|||
msg194953 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-08-12 12:50 | |
I don't understand your change. Why do we need to change the _csv module and why storing module global variables in the module dict is better than storing them in a simple C structure? |
|||
msg194954 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2013-08-12 12:51 | |
> I don't understand your change. Why do we need to change the _csv > module and why storing module global variables in the module dict is > better than storing them in a simple C structure? I won't repeat what was already said in the python-dev thread: http://mail.python.org/pipermail/python-dev/2013-August/127862.html |
|||
msg194955 - (view) | Author: Christian Heimes (christian.heimes) * | Date: 2013-08-12 12:55 | |
The patch contains the new function and a patch for the CSV module. How about you split it up across two patches: one for the new feature and one for the CSV module? It makes review easier. |
|||
msg194956 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2013-08-12 12:56 | |
Bundling the two was aimed at showcasing the effect the new function can have. The function itself is trivial, there's not much point in reviewing it alone. |
|||
msg194957 - (view) | Author: STINNER Victor (vstinner) * | Date: 2013-08-12 13:01 | |
+PyObject * +PyState_GetModuleAttr(struct PyModuleDef *def, + const char *name, + PyObject *restrict_type) When the char* type is used, the function has usually the suffix String. I prefer the PyIdentifier API because it avoids to create a temporary Python Unicode object. It's the first type that I see a function getting an attribute which also checks for its type. I don't know if the check should be done in PyState_GetModuleAttr() or in the _csv module. |
|||
msg194960 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2013-08-12 13:51 | |
> When the char* type is used, the function has usually the suffix > String. I prefer the PyIdentifier API because it avoids to create a > temporary Python Unicode object. This is a convenience API, not a performance "optimization". > It's the first type that I see a function getting an attribute which > also checks for its type. I don't know if the check should be done > in PyState_GetModuleAttr() or in the _csv module. The aim is to help writing extension modules, which is why the type check is done inside the helper function. |
|||
msg194962 - (view) | Author: Eli Bendersky (eli.bendersky) * | Date: 2013-08-12 13:55 | |
The patched code looks better than the original one in several respects, but I think it raises some questions. I agree with Victor that the type-checking API is unnatural, but I also think there may be a deeper issue hiding behind. You felt compelled to add the type-checking to provide some level of safety to the C code - but is it enough? Previously, the only way to add a dialect was through register_dialect that does type checking to make sure it gets a legit dialect object. Now, the _dialects dict is directly accessible to Python code and it can add arbitrary objects to it (both as keys and as values). Does this mean that the C code now has to do type checking in all internal code that accesses _dialects? I haven't had time to find a crasher example, but it's plausible that it's possible to crash the interpreter now by assigning un-expected stuff to members of _dialects. Maybe it's not a problem in practice for _csv but something worth thinking about if we're going to switch other methods to this approach. Also, the problem with http://mail.python.org/pipermail/python-dev/2013-August/127862.html does not go away with this patch, as expected. However, it's a step in the right direction in case we do have multiple instances of the extension module alive at the same time in the future. Although then it would be interesting to consider how to find the actually correct module instance from internal functions. |
|||
msg194963 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2013-08-12 14:00 | |
> Previously, the only way to add a dialect was through register_dialect that does > type checking to make sure it gets a legit dialect object. Now, the _dialects dict is > directly accessible to Python code and it can add arbitrary objects to it (both as > keys and as values). Does this mean that the C code now has to do type checking in all > internal code that accesses _dialects? You are right, I forgot about that part. That will make the patch significantly more complicated. > However, it's a step in the right direction in case we do have multiple instances of > the extension module alive at the same time in the future. Although then it would be > interesting to consider how to find the actually correct module instance from internal > functions. This sounds basically impossible (which is why we *can't* have multiple instances of an extension module alive in a single interpreter :-)). |
|||
msg194964 - (view) | Author: Eli Bendersky (eli.bendersky) * | Date: 2013-08-12 14:07 | |
> Previously, the only way to add a dialect was through register_dialect that does > > type checking to make sure it gets a legit dialect object. Now, the > _dialects dict is > > directly accessible to Python code and it can add arbitrary objects to > it (both as > > keys and as values). Does this mean that the C code now has to do type > checking in all > > internal code that accesses _dialects? > > You are right, I forgot about that part. That will make the patch > significantly more complicated. > :-/ Turns out there's real merit in having a clear dividing line between the safe world of Python and unsafe world of C. Within the C code, some invariants are assumed to be correct (and can be efficiently assert()-ed) - this makes the code simpler and more performant. > > However, it's a step in the right direction in case we do have multiple > instances of > > the extension module alive at the same time in the future. Although then > it would be > > interesting to consider how to find the actually correct module instance > from internal > > functions. > > This sounds basically impossible (which is why we *can't* have multiple > instances of an extension module alive in a single interpreter :-)). > Well, it surely isn't possible without significant code rewriting. But I can envision carrying such "state" around in extension-specific objects. Although it's not entirely clear whether the memory cost is justified. But without it, it's not clear to me how we plan to have multiple living extension modules at all. Perhaps this is a discussion outside this specific issue though (maybe for the PEP Stefan writes) |
|||
msg194965 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2013-08-12 14:08 | |
Breaking the 1:1 interpreter <-> extension module mapping involves adding custom types to sys.modules rather than module objects. Making that work sensibly will involve larger changes to the extension initialisation APIs. import-sig already has plans for this :) |
|||
msg352319 - (view) | Author: Petr Viktorin (petr.viktorin) * | Date: 2019-09-13 13:24 | |
PEP 573 proposes an alternative to PyState_FindModule. |
|||
msg383065 - (view) | Author: Petr Viktorin (petr.viktorin) * | Date: 2020-12-15 15:17 | |
Let me summarize the original issue with the following quote from the python-dev thead: > doing some sys.modules acrobatics and re-importing suddenly changes the internal state of a previously imported [_csv] module. The issue is now solved; both reproducers from that thread pass. If you're interested, please read PEP 630 for a general overview of where we're at w.r.t. C-API changes that started with PEP 3121. For the API used to fix this issue, see PEP 573; for the specific fix in _csv, see GH-23224. Thanks to everyone involved, and especially to Nick Coghlan for helping me start this journey way back in 2013 :) |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:49 | admin | set | github: 62910 |
2020-12-15 15:17:34 | petr.viktorin | set | status: open -> closed resolution: fixed messages: + msg383065 stage: patch review -> resolved |
2019-09-13 13:24:04 | petr.viktorin | set | nosy:
+ petr.viktorin messages: + msg352319 |
2013-08-12 14:08:35 | ncoghlan | set | messages: + msg194965 |
2013-08-12 14:07:46 | eli.bendersky | set | messages: + msg194964 |
2013-08-12 14:00:18 | pitrou | set | messages: + msg194963 |
2013-08-12 13:55:27 | eli.bendersky | set | messages: + msg194962 |
2013-08-12 13:51:16 | pitrou | set | messages: + msg194960 |
2013-08-12 13:01:31 | vstinner | set | messages: + msg194957 |
2013-08-12 12:56:51 | pitrou | set | messages: + msg194956 |
2013-08-12 12:55:21 | christian.heimes | set | nosy:
+ christian.heimes messages: + msg194955 |
2013-08-12 12:51:43 | pitrou | set | messages: + msg194954 |
2013-08-12 12:50:28 | vstinner | set | nosy:
+ vstinner messages: + msg194953 |
2013-08-12 12:40:41 | pitrou | create |