Issue1658799
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2007-02-13 10:34 by hniksic, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
subtype-intern-patch | hniksic, 2007-02-13 10:34 |
Messages (7) | |||
---|---|---|---|
msg51863 - (view) | Author: Hrvoje Nikšić (hniksic) * | Date: 2007-02-13 10:34 | |
This patch implements a small modification of PyString_InternInPlace that allows for safe interning of string subtype instances. The change should be fully backward compatible -- for a rationale and discussion, see: http://mail.python.org/pipermail/python-dev/2007-February/070973.html |
|||
msg51864 - (view) | Author: Armin Rigo (arigo) * | Date: 2007-02-13 22:36 | |
I think that this opens an attack (untested, though): it allows a previously-interned string to be removed from the dictionary. This might lead to a crash because the old string still thinks it is interned. Try something along the lines of: s1 = "hel" s1 = intern(s1 + "lo") class S(str): def __hash__(self): return hash(s1) def __eq__(self, other): return other == s1 s = S("world") intern(s) del s1 |
|||
msg51865 - (view) | Author: Hrvoje Nikšić (hniksic) * | Date: 2007-02-14 08:31 | |
I don't think an attack is possible. This patch retains the property that only exact strings are interned. If you create a pathological string subtype that hashes like a different string instance (one that has already been interned), all you'll achieve is that "interning" will return the other instance. As far as I can tell, no string is actually removed from the interned dictionary (until it becomes unreachable, that is.) What is the expected result of your test code? I tried it and it ran without error. |
|||
msg51866 - (view) | Author: Armin Rigo (arigo) * | Date: 2007-02-14 12:13 | |
Ah, the code was the wrong way around. The following causes an Fatal Python error in a debug build: s1 = "hel" s1 = intern(s1 + "lo") class S(str): def __hash__(self): return 0 def __eq__(self, other): return False s = S(s1) s2 = intern(s) del s1 |
|||
msg51867 - (view) | Author: Armin Rigo (arigo) * | Date: 2007-02-14 12:21 | |
Btw, any reason why you cannot simply say in your Python program: intern(str(s)) ? |
|||
msg51868 - (view) | Author: Hrvoje Nikšić (hniksic) * | Date: 2007-03-20 14:25 | |
Thanks for the detailed analysis. I missed this case and I retract the patch until I think of a way to resolve this problem. The obvious possibility is to always copy subtype instances before attempting to intern them, but right now I don't have the time to investigate if this slows things down. As for using s = intern(str(s)) in Python, it's a start, but it does somewhat more than I'd like -- for example, it "interns" all kinds of objects, which is not desirable. My patch was written with the intention of making PyString_InternInPlace more robust wrt string subtype instances, so that all the code in core and extensions that simply calls PyString_InternInPlace keeps working without modification. In the long run, the interface of PyString_InternInPlace is a bit too undeterministic for my taste. It has no error reporting, so it silently ignores some kinds of errors (not enough memory), throws fatal error on others (non-string being passed), and also completely ignores string subtypes. In my code I use this utility function: int intern_force(PyObject **s) { if (PyString_CheckExact(*s)) /* Most likely case: we're passed an exact string. */ PyString_InternInPlace(s); else if (PyString_Check(*s)) { /* The case we're covering with this function: we got a string subtype. Intern a copy. */ PyObject *copy; copy = PyString_FromStringAndSize(PyString_AS_STRING(*s), PyString_GET_SIZE(*s)); if (!copy) return -1; Py_DECREF(*s); *s = copy; PyString_InternInPlace(s); } else { PyErr_SetString(PyExc_TypeError, "intern_force passed a non-string"); return -1; } if (!PyString_CHECK_INTERNED(s)) { /* PyString_InternInPlace failed and cleared the exception, most likely due to insufficient memory. */ PyErr_Format(PyExc_RuntimeError, "failed to intern string '%s'", PyString_AS_STRING(*s)); return -1; } return 0; } I don't expect a function like this one to become a part of Python, but PyString_InternInPlace could be usefully improved even without breaking compatibility. |
|||
msg84651 - (view) | Author: Daniel Diniz (ajaksu2) * | Date: 2009-03-30 21:24 | |
Closing as the patch was retracted. Please reopen if you disagree. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:22 | admin | set | github: 44576 |
2009-03-30 21:24:51 | ajaksu2 | set | status: open -> closed type: enhancement nosy: + ajaksu2 messages: + msg84651 resolution: not a bug stage: resolved |
2007-02-13 10:34:01 | hniksic | create |