Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hmac.secure_compare() leaks information about length of strings #59266

Closed
tiran opened this issue Jun 13, 2012 · 96 comments
Closed

hmac.secure_compare() leaks information about length of strings #59266

tiran opened this issue Jun 13, 2012 · 96 comments
Labels
stdlib Python modules in the Lib dir type-security A security issue

Comments

@tiran
Copy link
Member

tiran commented Jun 13, 2012

BPO 15061
Nosy @loewis, @birkenfeld, @gpshead, @ncoghlan, @pitrou, @tiran, @alex, @akheron, @hynek, @serhiy-storchaka
Files
  • secure_compare_length.patch
  • timingsafe.h
  • timingsafe_cmp.patch
  • timingsafe_cmp-2.patch
  • compare_digest_c.patch
  • compare_digest_c-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2012-06-24.13:12:05.157>
    created_at = <Date 2012-06-13.23:00:23.931>
    labels = ['type-security', 'library']
    title = 'hmac.secure_compare() leaks information about length of strings'
    updated_at = <Date 2012-06-24.13:12:05.156>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2012-06-24.13:12:05.156>
    actor = 'christian.heimes'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-06-24.13:12:05.157>
    closer = 'christian.heimes'
    components = ['Library (Lib)']
    creation = <Date 2012-06-13.23:00:23.931>
    creator = 'christian.heimes'
    dependencies = []
    files = ['26003', '26068', '26079', '26106', '26112', '26120']
    hgrepos = []
    issue_num = 15061
    keywords = ['patch', 'needs review']
    message_count = 96.0
    messages = ['162739', '162758', '162759', '162760', '162761', '162762', '162763', '162764', '162765', '162766', '162767', '162768', '162769', '162770', '162773', '162775', '162777', '162778', '162838', '162845', '162846', '162847', '162848', '162850', '162852', '162853', '162855', '162856', '162857', '162858', '162859', '162860', '162861', '162862', '162863', '162864', '162865', '162866', '162867', '162868', '162871', '162872', '162873', '162875', '162877', '162880', '162882', '162885', '162888', '162891', '162892', '162893', '162895', '162899', '162914', '162949', '162950', '163159', '163163', '163168', '163170', '163186', '163188', '163192', '163193', '163196', '163204', '163329', '163333', '163343', '163347', '163365', '163366', '163368', '163371', '163377', '163378', '163385', '163390', '163468', '163469', '163613', '163614', '163615', '163617', '163619', '163623', '163625', '163626', '163627', '163630', '163652', '163671', '163696', '163780', '163784']
    nosy_count = 13.0
    nosy_names = ['loewis', 'georg.brandl', 'gregory.p.smith', 'ncoghlan', 'pitrou', 'christian.heimes', 'alex', 'fijall', 'python-dev', 'petri.lehtinen', 'hynek', 'serhiy.storchaka', 'Jon.Oberheide']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue15061'
    versions = ['Python 3.3']

    @tiran
    Copy link
    Member Author

    tiran commented Jun 13, 2012

    The secure_compare() function immediately returns False when both strings don't have equal length. With the patch the run time of secure_compare() always depends on the length of the right side. It no longer gives away information about the length of the left side.

    The patch should be applied in combination with the patch in issue bpo-14955.

    @tiran tiran added type-bug An unexpected behavior, bug, or error topic-IO labels Jun 13, 2012
    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    secure_compare leaks the password always. Note that it takes different time to create a result of ord() depending whether it's <=100 or > 100 due to caching of small numbers. Such functions should be written in C.

    @hynek
    Copy link
    Member

    hynek commented Jun 14, 2012

    We should. Adding secure functions that aren't really secure is something we should rather avoid. :)

    Christian, are you willing to do that?

    @hynek hynek added stdlib Python modules in the Lib dir type-security A security issue and removed topic-IO type-bug An unexpected behavior, bug, or error labels Jun 14, 2012
    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Jun 14, 2012

    fijal: while I agree with you, the limit for small ints has actually been pushed to 257 in recent CPythons. So it should still theoretically work --- of course, assuming a predictable CPU, which is wrong, and assuming a simple interpreter. (We can probably dig enough to find a timing issue even with CPython, and of course it's clear with any of the other Python interpreters out there.)

    @tiran
    Copy link
    Member Author

    tiran commented Jun 14, 2012

    I don't see how the function is going to leak this information when both this patch and the patch in bpo-14955 are applied. With http://bugs.python.org/file25801/secure-compare-fix-v2.patch ord() is no longer used and thus avoid the timing difference for integers > 256 (NSMALLPOSINTS is defined as 257, not 100).

    @tiran tiran changed the title hmac.secure_compare() leaks information of length of strings hmac.secure_compare() leaks information about length of strings Jun 14, 2012
    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    Ah unicodes. is encode('unicode-internal') independent on the string characters? I heavily doubt so. you leak at least some information through that function alone.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 14, 2012

    With PEP-393 unicode objects can have several representations, which makes it unlikely that *really* constant-timing functions can be devised.

    Speaking about this particular patch, I don't understand the point. secure_compare() is obviously meant to be used on inputs of some known length, not arbitrary data.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 14, 2012

    IMHO it's not obvious to all users. Better safe than sorry. ;)

    The invariant 'known and equal length' impresses an artificial limitation. Code may need to compare outside data with internal data without exposing too many details about the structure of the internal data.

    The other matter should be discussed at bpo-14955.

    @hynek
    Copy link
    Member

    hynek commented Jun 14, 2012

    I don’t want to be the killjoy but I find it highly questionable to add a function that is advertised as "secure" while we can't fully grok the complexities at play. If we can't produce a provable secure one, we should scrub the function for good; or at least rename it somehow.

    Unjustified perceived security is IMHO the worst possible case.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 14, 2012

    I don’t want to be the killjoy but I find it highly questionable to
    add a function that is advertised as "secure" while we can't fully
    grok the complexities at play. If we can't produce a provable secure
    one, we should scrub the function for good; or at least rename it
    somehow.

    The function is probably secure (modulo unseen bugs) in the
    bytestrings-of-the-same-size case. To make it "provably" secure, we
    could write a C version (which would be quite easy).

    For unicode strings things are a bit trickier though. Again, a C version
    could provide some guarantees (and could raise an error if the passed
    unicode strings use a different representation from each other).

    @pitrou pitrou changed the title hmac.secure_compare() leaks information about length of strings hmac.secure_compare() leaks information about length of strings Jun 14, 2012
    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    Antoine, seriously? You want to explore a function that's called "secure" when the only thing you know about it is "probably secure"? This is extremely tricky business and I think it should be called secure only if you can prove it's secure. Otherwise it's plain insecure and should not be named that.

    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    export not explore. Why can't I edit my own post?

    @pitrou
    Copy link
    Member

    pitrou commented Jun 14, 2012

    Antoine, seriously? You want to explore a function that's called
    "secure" when the only thing you know about it is "probably secure"?
    This is extremely tricky business and I think it should be called
    secure only if you can prove it's secure. Otherwise it's plain
    insecure and should not be named that.

    What's the methodology to "prove" that it's secure?

    We could rename "secure" to "safe" to downtone it a bit, but it's still
    an improvement on the nominal equality comparison.

    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    For unicode at the very least it's not an improvement at all. With the patch mentioned that does encode it's also not an improvement at all. Prove as in reason about the function in C and make sure it does not do any conditionals depending on the input data. This is much easier than it is in Python.

    We did this exercise for PyPy once, just for the sake of it. We looked at generated IR and made sure a comparison is not leaking any data.

    As far as the function goes right now - I don't know. For now following the entire code of long_bitwise is a lot of effort - I genuinely can't say that it'll be the same for all numbers of 0-255. Can you? It's easier with low-level language simply (And yes, this is one of the few cases where I would argue it makes sense to implement something in C :)

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 14, 2012

    I recommend to revert the addition of the function, given that it can't be made secure.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 14, 2012

    I've two suggestions:

    • rename the function to 'total_compare'. The name explains what the function actually does in comparison to '=='. It takes the total input values into account instead of using short circuit comparison.

    • restrict the function to bytes. The implementation works well with bytes but not with unicode. It's not an issue since hash digests are bytes. The docs could explain the issue with unicode and how user code can implement a reasonable safe conversion.

    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 14, 2012

    Hi Christian. It's either secure or it's not. If it's not, there is no point in introducing it at all as I don't think it's a good idea to have a kind-of-secure-but-i-dont-know functions in stdlib.

    If you restrict input to bytes it looks okish, but I looked at all the code that's invoked on the C side and it's quite a lot of code. Does you or anyone else actually go and review all the C code that's called via various operations to check if it does or does not depend on the value of various characters? I can't tell myself, it's too long.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 14, 2012

    It's either secure or it's not.

    I don't think that's true. By that reasoning, Python is not secure so there's no point in fixing crashes or providing a hashlib module.

    That said, I think renaming to "total_compare" isn't really helpful. The point of the function is to be secure (as much as possible), not to do a "total" comparison.

    @ncoghlan
    Copy link
    Contributor

    Maciej, please read http://mjg59.dreamwidth.org/13061.html

    "Secure" vs "not secure" is not a binary state - it's about making attacks progressively more difficult. Something that is secure against a casual script kiddie scatter gunning attacks on various sites with an automated script won't stand up to a systematic attack from a motivated attacker (also see the reporting on Flame and Stuxnet for what a *really* motivated and well resourced attacker can achieve).

    The hash randomisation changes didn't make Python completely secure against hashing DoS attacks - it just made them harder, by requiring attackers to figure out the hashing seed for the currently running process first. It's protecting against scatter gun attacks, not targeted ones.

    Performing a timing attack on Python's default short-circuiting comparison operation is *relatively easy* because the timing variations can be so large (send strings of increasing length until the time stops increasing - now you know the target digest length. Then try various initial characters until the time starts increasing again - now you know the first character. Repeat for the last character, then start with the second character and work through the string. Now you have the target hash, which you can try to crack offline at your leisure.

    The new comparison function is designed to significantly *reduce* the variance, thus leaking *less* information about the target hash, and making the attack *harder* (albeit, probably still not impossible).

    I agree with Christian's last two suggestions: change the name to total_compare, and only allow use on byte sequences (where the integer values are certain to be cached).

    Nothing should ever be called "safe" or "secure" in the standard library, because the immediate question from anyone that knows what they're talking about is "Secure/safe against what level of threat and what kind of threat"? People that *don't* know what they're doing will think "Secure/safe against everything" and that's dangerously misleading.

    Improving protection against remote timing attacks (e.g. by reducing comparison timing variance to the point where it is small relative to network timing variance) is a *lot* easier than protecting against local timing attacks. Protecting against someone with physical access to the machine hosting the target hash is harder still.

    Regardless, the target needs to be *improving the status quo*.

    Being able to tell people "using hmac.total_compare will make you less vulnerable to timing attacks than using ordinary short circuiting comparisons" is a *good thing*. We just need to be careful not to oversell it as making you *immune* to timing attacks.

    @hynek
    Copy link
    Member

    hynek commented Jun 15, 2012

    "Secure" vs "not secure" is not a binary state - it's about making attacks progressively more difficult. Something that is secure against a casual script kiddie scatter gunning attacks on various sites with an automated script won't stand up to a systematic attack from a motivated attacker (also see the reporting on Flame and Stuxnet for what a *really* motivated and well resourced attacker can achieve).

    The problem here is, that _if_ you add a "secure" to the name of a method, it becomes binary. At least in the minds of the users. I know you address that, but that's the main point here.

    Regardless, the target needs to be *improving the status quo*.

    Being able to tell people "using hmac.total_compare will make you less vulnerable to timing attacks than using ordinary short circuiting comparisons" is a *good thing*. We just need to be careful not to oversell it as making you *immune* to timing attacks.

    Why not write a C function which can be more secure than Python code? I would argue that would be an general asset for the stdlib, not just for HMAC (therefore, I'd put it elsewhere).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 15, 2012

    On 14.06.2012 14:26, Antoine Pitrou wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > It's either secure or it's not.

    I don't think that's true. By that reasoning, Python is not secure so
    there's no point in fixing crashes or providing a hashlib module.

    The proper statement is "It's either time-independent or it's not".
    This *is* a binary state (I agree that being secure is not a binary
    state).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 15, 2012

    Being able to tell people "using hmac.total_compare will make you
    less vulnerable to timing attacks than using ordinary short
    circuiting comparisons" is a *good thing*.

    No, it's not. It's a *bad thing*. The two issues that have been
    opened since the function was first submitted indicate that people
    will keep inspecting the code and find out that it's not
    time-independent. If they had been relying on that it is, they will
    be upset. Since it's inherently impossible to make the function
    time-independent, people will be constantly annoyed about this function.
    I can't find anything good in that.

    If nobody else does, I'll revert the addition before the beta. Note
    that there is no *actual* issue that is being resolved by this function;
    it was added only because of its cuteness value.

    @fijall
    Copy link
    Mannequin

    fijall mannequin commented Jun 21, 2012

    Hi.

    This is what we did with Armin: http://bpaste.net/show/32123/

    It seems there is still *some* information leaking via side-channels, although it's a bit unclear what. Feel free to play with it (try swapping, having different object etc.)

    @tiran
    Copy link
    Member Author

    tiran commented Jun 21, 2012

    I've attached a header for that implements a single C function timingsafe_eq(a, b). The file is targeted for Objects/stringlib/timingsafe.h. Please review the file.

    Comments
    --------

    • I only handle exact byte or unicode types (no subclasses) since a user may have overwritten __eq__ and I don't want to special case it.

    • The unicode path works only with compact ASCII strings. I'm not familiar with the new API so please scream if I did it wrong.

    • length difference is currently optimized, length 0 isn't. I could easily un-optimize the len(a) != len(b) case or optimize the len(a) == len(b) == 0 case.

    Open questions
    --------------

    Where should I place the function? hashlib would be a nice place but there are multiple backends for hashlib. _hashopenssl.c seems wrong.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 21, 2012

    The file is targeted for Objects/stringlib/timingsafe.h.

    stringlib is for type-generic functions, so I don't think it should be
    put there.

    • I only handle exact byte or unicode types (no subclasses) since a
      user may have overwritten __eq__ and I don't want to special case it.

    We could handle all bytes-compatible objects, using the buffer API.

    • The unicode path works only with compact ASCII strings. I'm not
      familiar with the new API so please scream if I did it wrong.

    It looks ok to me.

    Where should I place the function? hashlib would be a nice place but
    there are multiple backends for hashlib. _hashopenssl.c seems wrong.

    Well, if it's meant to be a private function called from hmac, where
    it's defined is an implementation detail. I think practicality beats
    purity, so _hashopenssl is a good enough place.

    @serhiy-storchaka
    Copy link
    Member

    > - I only handle exact byte or unicode types (no subclasses) since a
    > user may have overwritten __eq__ and I don't want to special case it.
    We could handle all bytes-compatible objects, using the buffer API.

    It is timing unsafe.

    > - The unicode path works only with compact ASCII strings. I'm not
    > familiar with the new API so please scream if I did it wrong.
    It looks ok to me.

    The user can just do timingsafe_eq(a.decode('ascii'),
    b.decode('ascii')). I do not see a necessity in support of unicode
    strings. Support ASCII strings will create the false impression that all
    strings are supported.

    About code. Instead (PyBytes_CheckExact(a) && PyBytes_CheckExact(b)) you
    should use ((PyBytes_CheckExact(a) != 0) & (PyBytes_CheckExact(b) !=
    0)).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 21, 2012

    The user can just do timingsafe_eq(a.decode('ascii'),
    b.decode('ascii')).

    You mean .encode()?

    I do not see a necessity in support of unicode
    strings. Support ASCII strings will create the false impression that all
    strings are supported.

    I agree.

    About code. Instead (PyBytes_CheckExact(a) && PyBytes_CheckExact(b)) you
    should use ((PyBytes_CheckExact(a) != 0) & (PyBytes_CheckExact(b) !=
    0)).

    What's the difference? They are the same.

    @serhiy-storchaka
    Copy link
    Member

    You mean .encode()?

    Yes, of cause. timingsafe_eq(a.encode('ascii'), b.encode('ascii')).

    > About code. Instead (PyBytes_CheckExact(a) && PyBytes_CheckExact(b)) you
    > should use ((PyBytes_CheckExact(a) != 0) & (PyBytes_CheckExact(b) !=
    > 0)).

    What's the difference? They are the same.

    Laziness. If "a" (a secret key) is not bytes then PyBytes_CheckExact(b)
    ("b" is a user input) is not called. It exposes secret key type. I'm not
    sure if it is real secret however.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 21, 2012

    > > - I only handle exact byte or unicode types (no subclasses) since a
    > > user may have overwritten __eq__ and I don't want to special case it.
    > We could handle all bytes-compatible objects, using the buffer API.

    It is timing unsafe.

    How so?

    > > - The unicode path works only with compact ASCII strings. I'm not
    > > familiar with the new API so please scream if I did it wrong.
    > It looks ok to me.

    The user can just do timingsafe_eq(a.decode('ascii'),
    b.decode('ascii')).

    I don't think that's the right answer, because people will instead e.g.
    encode('utf-8'), and suddently the encodingly will not be timing-safe.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 21, 2012

    I'm a bit rusty and I hope I got it right. The ASCII unicode case is a good idea and IMO timing safe. The buffer path is also timing safe once I have both views.

    The function leaks some timing information when an error occurs. Since the timing just reveals minimal information about the involved types and none about the bytes it's IMO safe. The acquiring of the buffer views may leak an unknown amount of timing data which may be an issue. The comparison is still safe.

    I've introduced a new module _hashlibfb (fb = fallback) for systems without openssl. I'm also open for a completely new module for future implementation of other digest, key derivation (PBKDF2) and password related C code.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 21, 2012

    The patch has another flaw. The compiler may choose to fold and optimize code in _tscmp(). I'm going to declare the length of the right side and both char* as volatile. That should stop any compiler.

    I could also add some pragmas:

    MSVC:
    #pragma optimize("", off)
    code
    #pragma optimize("", on)

    GCC 4.4+:
    #pragma GCC push_options
    #pragma GCC optimize ("O0")
    code
    #pragma GCC pop_options

    @serhiy-storchaka
    Copy link
    Member

    > > We could handle all bytes-compatible objects, using the buffer API.
    > It is timing unsafe.
    How so?

    I checked myself, and I see that most likely I was wrong. At least for
    bytes and bytearrays it is timing safe.

    I don't think that's the right answer, because people will instead e.g.
    encode('utf-8'), and suddently the encodingly will not be timing-safe.

    And what of that? It is outside of the timingsafe_eq function. People
    can also do other timing unsafe operations with the secret key (for
    example reading it from file or from DB) or not to use timingsafe_eq at
    all. The secret key should be pre-encoded.

    The error will be if code works for developer from ASCII word, and then
    on the other side of ocean it will no longer work with non-ASCII
    strings. You are expected to be familiar with such issues. In any case,
    the obvious (and simplest, and fastest) way to check that a string is
    ASCII-only is try to encoded it to ASCII.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 22, 2012

    The error will be if code works for developer from ASCII word, and then
    on the other side of ocean it will no longer work with non-ASCII
    strings. You are expected to be familiar with such issues. In any case,
    the obvious (and simplest, and fastest) way to check that a string is
    ASCII-only is try to encoded it to ASCII.

    No, the fastest way is to check the kind attribute of the unicode object in C code. That doesn't involve any additional conversion or Python function call. The function is deliberately limited.

    The ASCII fallback is very useful as most people will store hex encoded bytes of their passphrases in their databases. With ASCII support you can do timingsafe_compare(hex_from_db, hmac.hexdigest()).

    Maciej:
    http://pastebin.com/ZAAjSkJh

    The C function is one order of magnitude faster and the spread is one order smaller. 1e-07 is within the noise level on my idle computer. A busy server will have a higher noise level.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 22, 2012

    In order to get the patch in before the beta release I'm willing to drop the promise and document that the function may leak some information if the arguments differ in length or the arguments' types are incompatible.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 22, 2012

    In order to get the patch in before the beta release I'm willing to
    drop the promise and document that the function may leak some
    information if the arguments differ in length or the arguments' types
    are incompatible.

    That's not a problem to me. Programming errors are not the nominal case.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 23, 2012

    Updated patch with volatile, better error report for non-ASCII strings and updated comments

    @pitrou
    Copy link
    Member

    pitrou commented Jun 23, 2012

    I'm not really happy with the addition of a separate extension module for a single private function. You could just put it in the operator module, for instance.

    Also, the idea was not to expose timingsafe_cmp but to use it in compare_digest().

    @tiran
    Copy link
    Member Author

    tiran commented Jun 23, 2012

    Me neither but you didn't want it in the operator module in the first place (msg162882). :) Please make a decision. I'm happy to follow it.

    My idea is to drop the pure Python implementation of compare_digest() and just use the C implementation.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 23, 2012

    Me neither but you didn't want it in the operator module in the first
    place (msg162882). :) Please make a decision. I'm happy to follow it.

    Oh, sorry. I've changed my mind about it, but I think the operator module should only export a private function (and hmac or hashlib export it publicly).

    @birkenfeld
    Copy link
    Member

    Doesn't belong into operator IMO. We used to have a "strop" module where it would have fitted...

    @ncoghlan
    Copy link
    Contributor

    This is why I wanted to close the issue with the pure Python implementation, and punt on the question of a C accelerator for the moment.

    compare_digest is effectively the same as what all the Python web servers and frameworks do now for signature checking. Yes, it's more vulnerable to timing attacks than a C implementation, but it's going to to take a sophisticated attacker to attack that through the noise of network jitter. It's sufficiently hardened that's it's unlikely to be the weakest link in the security chain.

    For 3.4, I hope to see a discussion open up regarding the idea of something like a "securitytools" module that aims to provide some basic primitives for operations where Python's standard assumptions (such as flexibility and short circuiting behaviour) are a bad fit for security reasons. That would include exposing a C level full_compare option, as well as the core pbkdf2 algorithm.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 23, 2012

    Doesn't belong into operator IMO. We used to have a "strop" module
    where it would have fitted...

    Again, it can be a private function in the operator module that happens
    to be wrapped or exposed in the hmac module. Practicality beats purity.

    @birkenfeld
    Copy link
    Member

    Yes.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 23, 2012

    Again, it can be a private function in the operator module that happens
    to be wrapped or exposed in the hmac module. Practicality beats purity.

    Yes, we just need a place for the function. The operator module is a
    good place if we don't want to introduce a new module.

    Nick:
    I'll target pbkdf2 and other password hashing / security related for 3.4.

    @hynek
    Copy link
    Member

    hynek commented Jun 23, 2012

    For 3.4, I hope to see a discussion open up regarding the idea of something like a "securitytools" module that aims to provide some basic primitives for operations where Python's standard assumptions (such as flexibility and short circuiting behaviour) are a bad fit for security reasons. That would include exposing a C level full_compare option, as well as the core pbkdf2 algorithm.

    Strong +1 on that one. We could even consider adding bcrypt and scrypt as C isn't really an issue for us.

    Ideally we'd add a module with docs which both promote and leverage secure behavior. Basically how to realize http://www.daemonology.net/blog/2009-06-11-cryptographic-right-answers.html in Python.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 23, 2012

    New patch. The compare_digest method now lives in the operator module as operator._compare_digest

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 23, 2012

    >> About code. Instead (PyBytes_CheckExact(a) && PyBytes_CheckExact(b)) you
    >> should use ((PyBytes_CheckExact(a) != 0) & (PyBytes_CheckExact(b) !=
    >> 0)).
    >
    > What's the difference? They are the same.

    Laziness. If "a" (a secret key) is not bytes then PyBytes_CheckExact(b)
    ("b" is a user input) is not called. It exposes secret key type. I'm not
    sure if it is real secret however.

    I see; I missed that your version was using &. In any case, I don't
    think this is a threat: you couldn't use it to get the secret key
    faster.

    @tiran
    Copy link
    Member Author

    tiran commented Jun 24, 2012

    New patch.

    I've removed the special handling of PyBytes_CheckExact, support subclasses of str, non compact ASCII str and updated the docs.

    (Next time I'll create a sandbox and push my work to its own branch.)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 24, 2012

    New changeset c82451eeb595 by Christian Heimes in branch 'default':
    Issue bpo-15061: Re-implemented hmac.compare_digest() in C
    http://hg.python.org/cpython/rev/c82451eeb595

    @tiran
    Copy link
    Member Author

    tiran commented Jun 24, 2012

    Thanks to all for your input and assistance!

    @tiran tiran closed this as completed Jun 24, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants