msg320107 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-20 20:06 |
The _PyCoreConfig structure in pystate.h has some interesting fields that I don't think are exposed anywhere else to Python-land. I was particularly interested recently in hash_seed and use_hash_seed. I'm thinking that it may be useful to expose this structure in the sys module.
|
msg320108 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2018-06-20 20:28 |
hash_seed and use_hash_seed could be added to sys.hash_info. This would be the first place I'd look for the information. After all I implemented it. :)
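For context, here is a small sketch of what sys.hash_info exposes today. The field names are the documented ones for the struct sequence; the helper name describe_hash_info is made up for illustration. Note that it reports seed_bits (the size of the seed) but not the seed value itself.

```python
import sys


def describe_hash_info() -> dict:
    """Collect the documented sys.hash_info fields into a plain dict.

    describe_hash_info is a hypothetical helper, not part of any API.
    """
    info = sys.hash_info
    fields = ("width", "modulus", "inf", "nan", "imag",
              "algorithm", "hash_bits", "seed_bits", "cutoff")
    return {name: getattr(info, name) for name in fields}


if __name__ == "__main__":
    for name, value in describe_hash_info().items():
        print(f"{name:>10} = {value}")
```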
|
msg320109 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-20 20:30 |
On Jun 20, 2018, at 13:28, Christian Heimes <report@bugs.python.org> wrote:
>
> Christian Heimes <lists@cheimes.de> added the comment:
>
> hash_seed and use_hash_seed could be added to sys.hash_info. This would be the first place I'd look for the information. After all I implemented it. :)
That was the first place I looked too :)
|
msg320111 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-06-20 20:53 |
Is it still a secret seed if it's public? :)
|
msg320198 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-21 23:39 |
I think the basic implementation problem is that by the time you get to get_hash_info() in sysmodule.c, you no longer have access to the _PyCoreConfig object, nor the _PyMain object that it's generally attached to.
|
msg320211 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-06-22 07:04 |
> I think the basic implementation problem is that by the time you get to get_hash_info() in sysmodule.c, you no longer have access to the _PyCoreConfig object, nor the _PyMain object that it's generally attached to.
An interpreter now keeps a copy of _PyCoreConfig and _PyMainInterpreterConfig.
See for example make_flags() in sysmodule.c:
_PyCoreConfig *core_config = &_PyGILState_GetInterpreterStateUnsafe()->core_config;
...
PyStructSequence_SET_ITEM(seq, pos++, PyBool_FromLong(core_config->dev_mode));
The interpreter really owns the copy of these configs and they are kept until the interpreter object is destroyed.
Another example:
static PyObject *
import_find_and_load(PyObject *abs_name)
{
    ...
    int import_time = interp->core_config.import_time;
    ...
}
|
msg320250 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-22 18:13 |
Thanks for the hint! I had a feeling there had to be an API to get at it, but I couldn’t find it. Maybe we should start documenting the Python Secret Underscore API? :)
On Jun 22, 2018, at 00:04, STINNER Victor <report@bugs.python.org> wrote:
>
> _PyCoreConfig *core_config = &_PyGILState_GetInterpreterStateUnsafe()->core_config;
> ...
> PyStructSequence_SET_ITEM(seq, pos++, PyBool_FromLong(core_config->dev_mode));
>
> The interpreter really owns the copy of these configs and they are kept until the interpreter object is destroyed.
|
msg320265 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-22 20:53 |
I think there's another thing I'd like to change, and it seems like it's "just" an implementation detail. In _Py_HashRandomization_Init(), if use_hash_seed is 0, then we directly inject the random bits into the buffer, and then there's no hash_seed. I'd like to change that so that if use_hash_seed is 0, then we create a random hash seed first, and then call lcg_urandom() for the hash secret. That way, even if Python itself uses a random hash seed, we'll have a record of that in the runtime that can be used to reproduce the hashing. In this case, I'd still leave use_hash_seed == 0, and that would tell you what combinations of env vars were used.
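A rough Python model of the proposed flow may make this easier to discuss. It is only a sketch under assumptions: the lcg_urandom below is a re-creation from memory of the C helper (a 32-bit linear congruential generator), init_hash_secret is a hypothetical name, the 24-byte secret size is taken from the figure quoted later in this thread, and the PYTHONHASHSEED=0 case (randomization disabled) is left out.

```python
import os


def lcg_urandom(x0: int, size: int) -> bytes:
    """Expand a 32-bit seed into `size` bytes with a simple LCG.

    Assumption: constants and byte extraction mirror the C helper of the
    same name as recalled, not copied from the source tree.
    """
    x = x0 & 0xFFFFFFFF
    out = bytearray()
    for _ in range(size):
        # Multiply/add with wraparound at 32 bits, like C unsigned int.
        x = (x * 214013 + 2531011) & 0xFFFFFFFF
        out.append((x >> 16) & 0xFF)
    return bytes(out)


def init_hash_secret(env_seed=None, secret_size=24):
    """Return (use_hash_seed, hash_seed, secret) under the proposed scheme.

    env_seed stands in for a seed supplied by the user; None means
    "pick one at random but remember it".
    """
    if env_seed is not None:
        use_hash_seed, hash_seed = 1, env_seed
    else:
        # Proposed change: draw a random 32-bit seed first and keep it,
        # instead of filling the secret buffer with random bytes directly.
        use_hash_seed = 0
        hash_seed = int.from_bytes(os.urandom(4), "little")
    return use_hash_seed, hash_seed, lcg_urandom(hash_seed, secret_size)
```

With this shape, the recorded hash_seed is enough to rebuild the secret, while use_hash_seed still says whether the seed came from the user.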
|
msg320267 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-06-22 20:54 |
Nick plans to finish his PEP 432 for Python 3.8 and make the API public.
Maybe check with him? The PEP should document these structures, but I got ahead of it and
made changes which were not scheduled, so the PEP is now outdated.
|
msg320269 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-22 20:56 |
Nosying Nick. I agree there's some overlap with Python startup restructuring, but it feels kind of orthogonal too. I really am only exposing (some elements of) that structure to Python.
What might be interesting though would be if we want to expose the entire structure and not just the hash seeds, as I'm leaning toward here (given that we already have sys.hash_info).
|
msg320270 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-06-22 20:57 |
Barry: generating a 32 bit seed gives less entropy and so makes Python
easier to crash. If you need reproducible Python: generate a seed and set
the env var before starting Python. Tox does that. Regrtest should do that.
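A minimal sketch of that approach, assuming a launcher script that picks the seed itself and passes it through PYTHONHASHSEED; run_with_hash_seed is a hypothetical helper name. The seed is drawn from 1 to 4294967295 because 0 disables hash randomization.

```python
import os
import random
import subprocess
import sys


def run_with_hash_seed(seed: int, code: str) -> str:
    """Run `code` in a child interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Generate the seed outside the child and log it for later reproduction.
    seed = random.SystemRandom().randrange(1, 2**32)
    print("PYTHONHASHSEED =", seed)
    print(run_with_hash_seed(seed, "print(hash('abc'))"))
```

Re-running the child with the logged seed should give the same string hashes on the same Python build.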
|
msg320285 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-22 22:20 |
We could make the hash_seed 64 bits.
|
msg320289 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-22 22:30 |
Although I guess that would require modifications to lcg_urandom(). I don't feel qualified to change that function.
|
msg320299 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-06-23 08:32 |
> We could make the hash_seed 64 bits.
On my 64-bit Linux, _Py_HashSecret_t takes 24 bytes (192 bits).
|
msg320345 - (view) |
Author: Alyssa Coghlan (ncoghlan) *  |
Date: 2018-06-24 03:03 |
I'm thoroughly open to co-author requests for PEP 432 - the "Let's implement it as a private API for Python 3.7 and see what we learn from the experience" plan worked beautifully, but it *also* means the PEP text is now woefully out of date with reality :)
The pieces that are missing are:
- bring it up to date with what we actually did for 3.7
- decide which of those pieces we want to make public as-is, and which we want to tweak before making them generally available (e.g. does the "ConfigureMainInterpreter" naming still make sense? Or should we go back to the earlier "BeginInitialization" and "EndInitialization" pair?)
- now that we store this state in a more coherent way, what do we want to make public at the Python layer, and where should we make it public to avoid causing too many problems for other implementations?
However, while I'd definitely be able to make time to review a PR to the PEP, I can't make any promises as to when I'd be able to sit down and actually draft that update myself.
|
msg320347 - (view) |
Author: Alyssa Coghlan (ncoghlan) *  |
Date: 2018-06-24 03:21 |
Back on the original hash seed topic:
1. The exact size of the seed ranges from 128 bits (SipHash) down to 32 bits, depending on exactly which hash algorithm you're talking about (https://www.python.org/dev/peps/pep-0456/#hash-secret)
2. While PEP 456 doesn't state it explicitly, my recollection is that omitting the exact hash seed value from the Python level API was a deliberate decision, since one of the *purposes* of PEP 456 was to protect against seed recovery attacks like https://131002.net/siphash/poc.py. Being able to read the seed directly from the sys module would rather simplify the task of seed recovery :)
Only exposing a `forced_hash_seed` (and hiding randomly generated ones as `forced_hash_seed=None`) seems reasonable though, since those can already be read from os.environ anyway.
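To illustrate the semantics, here is a pure-Python approximation built on os.environ. The function name matches the suggestion above but is hypothetical, and it is only an approximation: os.environ can be changed after startup, and the interpreter ignores the variable under -E or -I.

```python
import os


def forced_hash_seed(environ=None):
    """Return the seed requested via PYTHONHASHSEED, or None.

    None covers both "unset" and "random", i.e. the cases where the
    interpreter picks its own secret. Hypothetical helper, not a real
    sys attribute.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get("PYTHONHASHSEED", "")
    if raw in ("", "random"):
        return None
    # Accept only plain decimal integers in the documented range.
    if not (raw.isascii() and raw.isdigit()):
        return None
    seed = int(raw)
    return seed if seed <= 4294967295 else None
```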
|
msg320438 - (view) |
Author: Barry A. Warsaw (barry) *  |
Date: 2018-06-25 20:25 |
On Jun 23, 2018, at 20:21, Nick Coghlan <report@bugs.python.org> wrote:
>
> Only exposing a `forced_hash_seed` (and hiding randomly generated ones as `forced_hash_seed=None`) seems reasonable though, since those can already be read from os.environ anyway.
Only mirroring $PYTHONHASHSEED probably makes the whole ask less useful. Maybe I should abandon the PR, although it may still make sense to export the full _PyCoreConfig structure.
|
msg342530 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-05-15 02:07 |
Update. I implemented _testinternalcapi.get_configs() which exports *all* Python configuration used to initialize Python. It contains the hash seed for example. The function is only written for tests.
Moreover, I proposed PEP 587 to expose the new _PyCoreConfig as a public C API.
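For anyone who wants to try it, a hedged sketch of reading the seed through that test helper. Assumptions: _testinternalcapi is only present in builds that ship the test extension modules, the name of the top-level key holding the config ("config", or "core_config" in earlier development versions) is a guess that may differ between versions, and read_hash_seed_config is a made-up helper name.

```python
def read_hash_seed_config():
    """Return hash seed settings from _testinternalcapi, or None if unavailable.

    This relies on a private, test-only module; treat it as a debugging
    aid rather than an API.
    """
    try:
        import _testinternalcapi
    except ImportError:
        return None
    get_configs = getattr(_testinternalcapi, "get_configs", None)
    if get_configs is None:
        return None
    configs = get_configs()
    # Assumed key names; fall back to an empty dict if neither is present.
    config = configs.get("config") or configs.get("core_config") or {}
    return {
        "use_hash_seed": config.get("use_hash_seed"),
        "hash_seed": config.get("hash_seed"),
    }


if __name__ == "__main__":
    print(read_hash_seed_config())
```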
|
msg343706 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-05-27 23:40 |
So far, there is no clear agreement to expose the C PyConfig structure in Python, so I am closing the issue.
My PEP 587 has been accepted. I chose to not expose PyConfig in Python in the PEP. But I'm open to revisit this idea later, especially to move towards PEP 432: implement multi-phase initialization (only partially supported in my PEP 587).
But I would prefer a different rationale than exposing hash_seed. For hash_seed alone, I don't think that it's worth it. Moreover, Christian wrote:
> hash_seed and use_hash_seed could be added to sys.hash_info. This would be the first place I'd look for the information. After all I implemented it. :)
|
|
Date | User | Action | Args |
2022-04-11 14:59:02 | admin | set | github: 78100 |
2019-05-27 23:40:30 | vstinner | set | status: open -> closed; resolution: rejected; messages: + msg343706; stage: patch review -> resolved |
2019-05-15 02:07:27 | vstinner | set | messages: + msg342530 |
2018-06-25 20:25:45 | barry | set | messages: + msg320438 |
2018-06-24 03:21:08 | ncoghlan | set | messages: + msg320347 |
2018-06-24 03:03:58 | ncoghlan | set | messages: + msg320345 |
2018-06-23 08:32:15 | vstinner | set | messages: + msg320299 |
2018-06-23 00:35:33 | barry | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request7475 |
2018-06-22 23:50:29 | eric.snow | set | nosy: + eric.snow, emilyemorehouse |
2018-06-22 22:30:52 | barry | set | messages: + msg320289 |
2018-06-22 22:20:26 | barry | set | messages: + msg320285 |
2018-06-22 20:57:38 | vstinner | set | messages: + msg320270 |
2018-06-22 20:56:31 | barry | set | nosy: + ncoghlan; messages: + msg320269 |
2018-06-22 20:54:22 | vstinner | set | messages: + msg320267 |
2018-06-22 20:53:09 | barry | set | messages: + msg320265 |
2018-06-22 18:13:28 | barry | set | messages: + msg320250 |
2018-06-22 07:04:41 | vstinner | set | messages: + msg320211 |
2018-06-21 23:39:36 | barry | set | messages: + msg320198 |
2018-06-20 20:53:35 | vstinner | set | nosy: + vstinner; messages: + msg320111 |
2018-06-20 20:30:22 | barry | set | messages: + msg320109 |
2018-06-20 20:28:55 | christian.heimes | set | nosy: + christian.heimes; messages: + msg320108 |
2018-06-20 20:06:15 | barry | create | |