Update suggested number of iterations for pbkdf2_hmac() #87148
Documentation [1] suggests using at least 100,000 iterations of SHA-256 as of 2013. It is now 2021, and it is common to use many more iterations. I suggest recommending at least 250,000 iterations, a somewhat round number close to the values used by modern libraries. [1] https://docs.python.org/3/library/hashlib.html#hashlib.pbkdf2_hmac
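For concreteness, a minimal sketch of what a call with the proposed count might look like (250,000 is the suggestion in this issue, not an official recommendation; the password and salt handling here are purely illustrative):

```python
import hashlib
import os

# Illustrative only: derive a key with the iteration count proposed
# in this issue rather than the 100,000 currently in the docs.
password = b"correct horse battery staple"
salt = os.urandom(16)  # a fresh random salt per stored password

key = hashlib.pbkdf2_hmac("sha256", password, salt, 250_000)
print(key.hex())
```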
Is there any scientific research or mathematical proof for 250,000 iterations?
I didn't find any. I think it is based on some benchmarks.
FWIW, 1Password uses 100,000. https://support.1password.com/pbkdf2/ Also, I don't think an additional time factor of 2.5x would make a substantial difference in security, but it may make a noticeable difference in user authentication time.
There is a history section on that page. And current 100,000 is ten times more than 1Password used in 2013 when the suggestion was added to the documentation.
2.5x difference can be substantial if x is hours, days, or years :)
PBKDF2-HMAC is a serialized algorithm. It cannot be parallelized. That means the runtime depends on single-core performance. The single-core performance of desktop and server CPUs hasn't improved much in the last decade. Modern CPUs have more cores, larger caches, and better IPC, but the Intel Nehalem architecture from 2009 already reached up to 3.33 GHz, while fast 2020 Comet Lake CPUs have up to 3.7 GHz base frequency and about 5 GHz turbo.
Clock rate is not the only indicator. Some new instructions supporting SHA were introduced during the last decade. https://software.intel.com/content/www/us/en/develop/articles/intel-sha-extensions.html
Django uses 390,000 iterations as of late 2021, as does the Python Cryptography project. We should be aligned with their recommendations, or at least a good deal closer than we are now. 390,000 is actually a conservative recommendation for key derivation: that number of rounds takes ~133 ms to compute on my M1, versus 36 ms for the current 100,000. Usually you're shooting for ~250 ms. Being off by ~50% is probably okay; being off by this much is considerably worse. Anyway, I'd be happy to make such a PR if folks are amenable to it.
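The timing claims above are easy to reproduce locally; here is a rough benchmark sketch (the helper name and the iteration counts compared are just for illustration, and absolute numbers will differ per CPU):

```python
import hashlib
import os
import time

# Rough benchmark: best-of-N wall-clock time for a given iteration
# count on this machine. Results vary widely across hardware.
def time_pbkdf2(iterations: int, rounds: int = 3) -> float:
    salt = os.urandom(16)
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark-password", salt, iterations)
        best = min(best, time.perf_counter() - start)
    return best

for n in (100_000, 390_000):
    print(f"{n:>7} iterations: {time_pbkdf2(n) * 1000:.1f} ms")
```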
My question from last year has not been answered yet. Is there any valid scientific research on the number of rounds or duration? I neither know nor understand how Django came up with the numbers, and PyCA cryptography copied the numbers without questioning them. Where does 250 ms come from? 250 ms at 100% CPU load sounds way too costly for a website login and too fast for a password manager. For comparison, Argon2's default runtime on my laptop is 50 ms.
Django probably stores and computes more passwords than every other Python framework combined, and it doesn't provide you any control over the number of iterations. And it hasn't for years. If this were truly a problem, wouldn't their users be complaining about it constantly? Werkzeug was doing 150,000 iterations as of 0.15.x, released three years ago, and does 260,000 iterations today. Again, no complaints or issues. In practice, this is almost never a problem: user logins and password changes are extremely rare events compared to all other activity, so the computation time is essentially irrelevant outside response time for that individual user. No matter how many users there are, systems are scaled such that the computation time of that rare event remains a fraction of overall CPU use.
NIST provides no official guidance on iteration count other than NIST SP 800-132 Appendix A.2.2, which states "The number of iterations should be set as high as can be tolerated for the environment, while maintaining acceptable performance." I can think of no better resource for what constitutes acceptable performance at the highest iteration count than popular packages like Django. Django's choice (and lack of evidence that they've had any cause to revert due to performance issues) argues that 390k iterations is a reasonable number in 2022. Certainly the 100k suggested in these docs as of 2013 is no longer best practice as we've seen 9 years of computational improvement in the intervening time. I would, additionally, suggest that the documentation recommend the use of scrypt where possible over any iteration count of PBKDF2, but increasing the iteration count is still a useful improvement to the docs! |
You are arguing from the perspective of a Django/Werkzeug developer, and you are using experiential domain knowledge to argue for a higher recommendation. I'm asking for a scientific answer. Based on my experience, 100k PBKDF2 HMAC-SHA256 rounds is already a DoS issue for some use cases. For other use cases even 500k rounds is not the right answer, because the application should rather use a different algorithm altogether. If you are concerned about PBKDF2's strength, then better to switch to scrypt or Argon2. They are better suited against GPU-based crackers. PBKDF2 is still required for FIPS compliance, but most people can (and should!) ignore FIPS.
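For reference, the scrypt alternative mentioned here is also available in hashlib (Python 3.6+ when built against OpenSSL 1.1+). A minimal sketch; the cost parameters below are commonly cited interactive-login settings, not values taken from this thread:

```python
import hashlib
import os

# Illustrative scrypt key derivation. n=2**14, r=8, p=1 uses
# 128 * r * n = 16 MiB of memory, which fits within hashlib's
# default maxmem limit; raising n further requires passing maxmem.
password = b"correct horse battery staple"
salt = os.urandom(16)

key = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)
print(key.hex())
```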
Sticking with 100k is not scientific though ;-) Empiricism is science! I'm probably the person responsible for Django's process, which is to increase by some % (10% or 20% IIRC) every release. As you point out, the exact value one should use is a function of context, which we don't have as documentation authors. However, what we can do is try to select a value that's most likely to be practical for many users and will in turn protect their users' data most. 100k isn't that value, and taking inspiration from places that have had their values tested by many users is intuitive to me.
Rather than suggesting an actual number, perhaps we should link to an external resource that covers how to choose the number? Or we leave it vague and say "The number of iterations should be chosen based on the hash algorithm and computing power; there is no universal recommendation, but hundreds of thousands of iterations may be reasonable." This avoids bikeshedding a specific number, but still gives a general idea of the magnitude of number involved.
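One way to make "chosen based on computing power" concrete is to calibrate empirically on the deployment hardware. A sketch under assumed parameters (the function name, the starting count, and the target duration are arbitrary placeholders, not anything from the docs):

```python
import hashlib
import os
import time

# Grow the iteration count until one PBKDF2 computation takes at
# least the target duration on this machine. Purely illustrative;
# real deployments would average several runs and pin the result.
def calibrate_iterations(target_seconds: float = 0.1) -> int:
    salt = os.urandom(16)
    iterations = 50_000
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"calibration", salt, iterations)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return iterations
        # Scale proportionally toward the target (at most doubling),
        # always increasing so the loop terminates.
        scaled = int(iterations * min(2.0, target_seconds / elapsed))
        iterations = max(iterations + 1_000, scaled)

print(calibrate_iterations(0.05))
```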
I reworked the PR and went with less specific text, linking to the NIST SP 800-132 appendix as guidance on how people should determine what is right for them. There is no one right number; it is application specific. Thanks for everyone's valuable input!
The code snippet still uses 100000. Given that many people will simply copy-and-paste without questioning, should we update that too? |
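A hypothetical version of what an updated snippet might look like (the 390,000 count mirrors the Django value cited earlier in this thread; the wording and number actually chosen for the docs may differ):

```python
import hashlib
import os

# Hypothetical revision of the docs example: same shape as the
# existing snippet, but with a random salt and a higher iteration
# count than the 100000 currently shown.
salt = os.urandom(16)
dk = hashlib.pbkdf2_hmac("sha256", b"password", salt, 390_000)
print(dk.hex())
```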