msg385365 - (view) |
Author: Illia Volochii (illia-v) * |
Date: 2021-01-20 20:06 |
Documentation [1] suggests using at least 100,000 iterations of SHA-256 as of 2013.
It is now 2021, and it is common to use many more iterations.
For example, Django will use 260,000 by default in the next 3.2 LTS release and 320,000 in 4.0 [2][3].
I suggest recommending at least 250,000 iterations, which is a somewhat round number close to the ones used by modern libraries.
[1] https://docs.python.org/3/library/hashlib.html#hashlib.pbkdf2_hmac
[2] https://github.com/django/django/commit/f2187a227f7a3c80282658e699ae9b04023724e5
[3] https://github.com/django/django/commit/a948d9df394aafded78d72b1daa785a0abfeab48
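For readers skimming the thread, here is a minimal sketch of the call under discussion. The password, the salt size, and the helper name `derive_key` are illustrative assumptions; the iteration count is simply the figure proposed in this message, not a confirmed recommendation.

```python
import hashlib
import os

# Assumption for illustration: the count proposed in this issue.
ITERATIONS = 250_000


def derive_key(password: bytes, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (hypothetical helper)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


salt = os.urandom(16)  # a fresh random salt per password
key = derive_key(b"correct horse battery staple", salt)
print(key.hex())
```

The salt and the iteration count both need to be stored next to the derived key so the same derivation can be repeated at verification time.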
|
msg385442 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2021-01-21 19:14 |
Is there any scientific research or mathematical proof for 250,000 iterations?
|
msg385455 - (view) |
Author: Illia Volochii (illia-v) * |
Date: 2021-01-21 22:39 |
I didn't find any. I think it is based on some benchmarks like `openssl speed sha`.
|
msg385939 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-01-29 20:59 |
FWIW, OnePass uses 100,000. https://support.1password.com/pbkdf2/
Also, I don't think an additional time factor of 2.5x would make substantial difference in security, but it may make a noticeable difference in user authentication time.
|
msg385944 - (view) |
Author: Illia Volochii (illia-v) * |
Date: 2021-01-29 21:40 |
> FWIW, OnePass uses 100,000. https://support.1password.com/pbkdf2/
There is a history section on that page. The current 100,000 is ten times what 1Password used in 2013, when the suggestion was added to the documentation.
> Also, I don't think an additional time factor of 2.5x would make substantial difference in security, but it may make a noticeable difference in user authentication time.
2.5x difference can be substantial if x is hours, days, or years :)
|
msg385992 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2021-01-30 18:30 |
PBKDF2-HMAC is a serialized algorithm. It cannot be parallelized. That means the runtime depends on single-core performance. The single-core performance of desktop and server CPUs hasn't improved much in the last decade. Modern CPUs have more cores, larger caches, and better IPC. Intel's Nehalem architecture from 2009 had up to 3.33 GHz. Fast 2020 Comet Lake CPUs have up to 3.7 GHz base frequency and about 5 GHz turbo.
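To illustrate the serial dependency described here, a rough pure-Python sketch of a single PBKDF2 output block, following the construction in RFC 8018. The function name `pbkdf2_sha256_block` is hypothetical, and this is far slower than the `hashlib` implementation; it only shows that each HMAC call consumes the output of the previous one.

```python
import hashlib
import hmac


def pbkdf2_sha256_block(password: bytes, salt: bytes, iterations: int,
                        block_index: int = 1) -> bytes:
    """Compute one 32-byte PBKDF2-HMAC-SHA256 block (illustrative only)."""
    # U_1 = HMAC(password, salt || INT_32_BE(block_index))
    u = hmac.new(password, salt + block_index.to_bytes(4, "big"), hashlib.sha256).digest()
    acc = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        # U_j = HMAC(password, U_{j-1}): each step needs the previous result.
        u = hmac.new(password, u, hashlib.sha256).digest()
        acc ^= int.from_bytes(u, "big")
    # The block is the XOR of all U_j values.
    return acc.to_bytes(32, "big")


expected = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000)
print(pbkdf2_sha256_block(b"password", b"salt", 1000) == expected)
```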
|
msg386605 - (view) |
Author: Illia Volochii (illia-v) * |
Date: 2021-02-07 20:32 |
Clock rate is not the only indicator. Some new instructions supporting SHA were introduced during the last decade.
https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html
https://software.intel.com/content/www/us/en/develop/articles/improving-openssl-performance.html
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/sha-256-implementations-paper.pdf
|
msg411524 - (view) |
Author: April King (april) |
Date: 2022-01-24 22:42 |
Django uses 390,000 iterations as of late 2021, as does the Python Cryptography project. We should be aligned with their recommendations, or at least a good deal closer than we are now.
390,000 actually makes it a conservative recommendation for key derivation, as that number of rounds takes ~133ms to compute on my M1, versus 36ms for 100,000. Usually you're shooting for ~250ms.
Being off by ~50% is probably okay; being off by this much is considerably worse.
Anyways, I'd be happy to make such a PR if folks are amenable to it.
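Timings like the ones above can be reproduced on any machine with a small script along these lines. This is only a sketch: the helper name `time_pbkdf2`, the fixed password and salt, and the choice to keep the best of three runs are all assumptions, and the results will differ by CPU and OpenSSL build.

```python
import hashlib
import time


def time_pbkdf2(iterations: int, repeats: int = 3) -> float:
    """Return the best wall-clock time in milliseconds for one derivation."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"password", b"0123456789abcdef", iterations)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0


# The two counts compared in this thread.
for count in (100_000, 390_000):
    print(f"{count:>7,} iterations: {time_pbkdf2(count):.1f} ms")
```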
|
msg411566 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2022-01-25 08:50 |
My question from last year has not been answered yet. Is there any valid scientific research on the number of rounds or duration? I neither know nor understand how Django came up with the numbers. PyCA cryptography copied the numbers without questioning them.
Where does 250ms come from? 250ms at 100% CPU load sounds way too costly for a website login and too fast for a password manager. For comparison, Argon2's default runtime on my laptop is 50ms.
|
msg411610 - (view) |
Author: April King (april) |
Date: 2022-01-25 15:14 |
Django probably stores and computes more passwords than every other Python framework combined, and it doesn't provide you any control over the number of iterations. And it hasn't for years. If this were truly a problem, wouldn't their users be complaining about it constantly?
Werkzeug was doing 150,000 iterations as of 0.15.x, released three years ago, and does 260,000 iterations today. Again, no complaints or issues.
In practice, this is almost never a problem: user logins and password changes are extremely rare events compared to all other activity, so the computation time is essentially irrelevant outside the response time for that individual user. No matter how many users there are, systems scale such that the computation time of that rare event remains a fraction of overall CPU use.
|
msg411623 - (view) |
Author: Paul Kehrer (reaperhulk) |
Date: 2022-01-25 15:46 |
NIST provides no official guidance on iteration count other than NIST SP 800-132 Appendix A.2.2, which states "The number of iterations should be set as high as can be tolerated for the environment, while maintaining acceptable performance."
I can think of no better resource for what constitutes acceptable performance at the highest iteration count than popular packages like Django. Django's choice (and lack of evidence that they've had any cause to revert due to performance issues) argues that 390k iterations is a reasonable number in 2022. Certainly the 100k suggested in these docs as of 2013 is no longer best practice as we've seen 9 years of computational improvement in the intervening time.
I would, additionally, suggest that the documentation recommend the use of scrypt where possible over any iteration count of PBKDF2, but increasing the iteration count is still a useful improvement to the docs!
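For comparison, a minimal sketch of the scrypt alternative through `hashlib.scrypt`, which is only available when Python is built against a sufficiently recent OpenSSL. The cost parameters below (`n=2**14`, `r=8`, `p=1`) and the `maxmem` limit are assumptions chosen for illustration, not values taken from this thread.

```python
import hashlib
import os

salt = os.urandom(16)  # a fresh random salt per password

# Assumed illustrative cost parameters; tune them for the target environment.
key = hashlib.scrypt(
    b"correct horse battery staple",
    salt=salt,
    n=2**14,                   # CPU/memory cost factor
    r=8,                       # block size
    p=1,                       # parallelization factor
    maxmem=64 * 1024 * 1024,   # memory limit in bytes
    dklen=32,                  # length of the derived key
)
print(key.hex())
```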
|
msg411624 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2022-01-25 15:56 |
You are arguing from the perspective of a Django/werkzeug developer and you are using experiential domain knowledge to argue for a higher recommendation.
I'm asking for a scientific answer. Based on my experience, 100k PBKDF2 HMAC-SHA256 rounds is already a DoS issue for some use cases. For other use cases even 500k rounds is not the right answer, because the application should rather use a different algorithm altogether.
If you are concerned about PBKDF2's strength, then better switch to Scrypt or Argon2. They are better suited against GPU-based crackers. PBKDF2 is still required for FIPS compliance, but most people can (and should!) ignore FIPS.
|
msg411644 - (view) |
Author: Alex Gaynor (alex) * |
Date: 2022-01-25 17:48 |
Sticking with 100k is not scientific though ;-) Empiricism is science!
I'm probably the person responsible for Django's process, which is to increase by some % (10% or 20% IIRC) every release.
As you point out, the exact value one should use is a function of context, which we don't have as documentation authors. However, what we can do is try to select a value that's most likely to be practical for many users and will in turn protect their users' data most. 100k isn't that value, and taking inspiration from places that have had their values tested by many users is intuitive to me.
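As a quick arithmetic check, the step between the Django defaults quoted earlier in this thread (260,000 for 3.2, 320,000 for 4.0, and 390,000 as of late 2021) can be computed directly. The list below contains only those three quoted figures; the variable names are illustrative.

```python
# Django defaults as quoted in this thread.
django_defaults = [260_000, 320_000, 390_000]

# Percentage increase between each pair of consecutive defaults.
increases = [
    round((new / old - 1) * 100, 1)
    for old, new in zip(django_defaults, django_defaults[1:])
]
print(increases)
```

Both steps come out a little above 20%, which would be consistent with a roughly 20% bump followed by rounding to a tidy number.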
|
msg411789 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2022-01-26 20:12 |
Rather than suggesting an actual number, perhaps we should link to an external resource that covers how to choose the number?
Or we leave it vague and say "The number of iterations should be chosen based on the hash algorithm and computing power; there is no universal recommendation, but hundreds of thousands of iterations may be reasonable." This avoids bikeshedding a specific number, but still gives a general idea of the magnitude of number involved.
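One way to choose a count "based on the hash algorithm and computing power" is to measure on the deployment hardware and scale to a latency budget. The sketch below assumes the cost grows linearly with the iteration count; the helper name `calibrate_iterations`, the probe size, the 10,000 floor, and the 100 ms budget in the usage line are hypothetical choices, not values endorsed anywhere in this issue.

```python
import hashlib
import time


def calibrate_iterations(target_ms: float, probe: int = 50_000) -> int:
    """Estimate an iteration count that takes about target_ms on this machine."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", b"probe-salt-16byt", probe)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Scale the probe count linearly to the requested time budget.
    estimate = int(probe * target_ms / elapsed_ms)
    # Round down to a multiple of 10,000, with an arbitrary floor of 10,000.
    return max(10_000, estimate // 10_000 * 10_000)


print(calibrate_iterations(target_ms=100.0))
```

A single probe is noisy, so a real deployment would likely want to repeat the measurement and pick a budget that matches its own login traffic.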
|
msg411844 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2022-01-27 08:39 |
New changeset 897ce9018775bcd679fb49aa17258f8f6e818e23 by Illia Volochii in branch 'main':
bpo-42982: Improve the text on suggested number of iterations of PBKDF2 (GH-24276)
https://github.com/python/cpython/commit/897ce9018775bcd679fb49aa17258f8f6e818e23
|
msg411845 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2022-01-27 08:41 |
I reworked the PR and went with less specific text, linking to the NIST SP 800-132 appendix as guidance on how people should determine what is right for them.
There is no one right number. It is application specific.
Thanks for everyone's valuable input!
|
msg411846 - (view) |
Author: miss-islington (miss-islington) |
Date: 2022-01-27 09:02 |
New changeset 1ecc98dedb7ae77c2d806a70b52dfecdac39ff5b by Miss Islington (bot) in branch '3.10':
bpo-42982: Improve the text on suggested number of iterations of PBKDF2 (GH-24276)
https://github.com/python/cpython/commit/1ecc98dedb7ae77c2d806a70b52dfecdac39ff5b
|
msg411879 - (view) |
Author: April King (april) |
Date: 2022-01-27 14:16 |
The code snippet still uses 100000. Given that many people will simply copy-and-paste without questioning, should we update that too?
|
msg411916 - (view) |
Author: miss-islington (miss-islington) |
Date: 2022-01-27 20:18 |
New changeset ace0aa2a2793ba4a2b03e56c4ec375c5470edee8 by Gregory P. Smith in branch 'main':
bpo-42982: update pbkdf2 example & add another link (GH-30966)
https://github.com/python/cpython/commit/ace0aa2a2793ba4a2b03e56c4ec375c5470edee8
|
msg414293 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2022-03-01 20:56 |
New changeset 7dbb2f8eaf07c105f4d2bb0fe61763463e68372d by Miss Islington (bot) in branch '3.10':
bpo-42982: update pbkdf2 example & add another link (GH-30966) (#30968)
https://github.com/python/cpython/commit/7dbb2f8eaf07c105f4d2bb0fe61763463e68372d
|
|
Date | User | Action | Args |
2022-04-11 14:59:40 | admin | set | github: 87148 |
2022-03-01 20:56:40 | ned.deily | set | nosy: + ned.deily; messages: + msg414293 |
2022-01-27 20:18:46 | miss-islington | set | pull_requests: + pull_request29146 |
2022-01-27 20:18:36 | miss-islington | set | messages: + msg411916 |
2022-01-27 19:34:10 | gregory.p.smith | set | pull_requests: + pull_request29145 |
2022-01-27 14:16:30 | april | set | messages: + msg411879 |
2022-01-27 09:02:02 | miss-islington | set | messages: + msg411846 |
2022-01-27 08:41:18 | gregory.p.smith | set | status: open -> closed; messages: + msg411845; assignee: docs@python -> gregory.p.smith; resolution: fixed; stage: patch review -> commit review |
2022-01-27 08:39:22 | gregory.p.smith | set | nosy: + gregory.p.smith; messages: + msg411844 |
2022-01-27 08:39:19 | miss-islington | set | nosy: + miss-islington; pull_requests: + pull_request29130 |
2022-01-26 20:12:02 | zach.ware | set | nosy: + zach.ware; messages: + msg411789 |
2022-01-25 17:48:06 | alex | set | nosy: + alex; messages: + msg411644 |
2022-01-25 15:56:53 | christian.heimes | set | messages: + msg411624 |
2022-01-25 15:46:07 | reaperhulk | set | nosy: + reaperhulk; messages: + msg411623 |
2022-01-25 15:14:07 | april | set | messages: + msg411610 |
2022-01-25 08:50:50 | christian.heimes | set | messages: + msg411566 |
2022-01-24 22:42:04 | april | set | nosy: + april; messages: + msg411524 |
2021-02-07 20:32:14 | illia-v | set | messages: + msg386605 |
2021-01-30 18:30:14 | christian.heimes | set | messages: + msg385992 |
2021-01-29 21:40:15 | illia-v | set | messages: + msg385944 |
2021-01-29 20:59:48 | rhettinger | set | nosy: + rhettinger; messages: + msg385939 |
2021-01-21 22:39:09 | illia-v | set | messages: + msg385455 |
2021-01-21 19:14:30 | christian.heimes | set | nosy: + christian.heimes; messages: + msg385442 |
2021-01-20 20:16:38 | illia-v | set | keywords: + patch; stage: patch review; pull_requests: + pull_request23099 |
2021-01-20 20:06:39 | illia-v | create | |