Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing.connection challenge implicitly uses MD5 #61460

Closed
davidmalcolm opened this issue Feb 20, 2013 · 11 comments
Closed

multiprocessing.connection challenge implicitly uses MD5 #61460

davidmalcolm opened this issue Feb 20, 2013 · 11 comments
Assignees
Labels
3.12 bugs and security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement type-security A security issue

Comments

@davidmalcolm
Copy link
Member

davidmalcolm commented Feb 20, 2013

BPO 17258
Nosy @warsaw, @doko42, @pitrou, @tiran, @davidmalcolm, @miss-islington
PRs
  • bpo-17258: use sha256 instead of md5 within multiprocessing.connection #16264
  • gh-61460: Stronger HMAC in multiprocessing #20380
  • bpo-17258: Add requires_hashdigest to multiprocessing tests #20412
  • [3.9] bpo-17258: Add requires_hashdigest to multiprocessing tests (GH-20412) #20626
  • Files
  • avoid-implicit-usage-of-md5-in-multiprocessing.patch: Patch to multiprocessing.connection to specify which hash algorithm to use
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2013-02-20.20:11:36.540>
    labels = ['3.7', 'library']
    title = 'multiprocessing.connection challenge implicitly uses MD5'
    updated_at = <Date 2020-06-04.17:22:43.520>
    user = 'https://github.com/davidmalcolm'

    bugs.python.org fields:

    activity = <Date 2020-06-04.17:22:43.520>
    actor = 'miss-islington'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2013-02-20.20:11:36.540>
    creator = 'dmalcolm'
    dependencies = []
    files = ['29134']
    hgrepos = []
    issue_num = 17258
    keywords = ['patch']
    message_count = 7.0
    messages = ['182547', '182550', '182553', '309103', '309134', '370705', '370717']
    nosy_count = 7.0
    nosy_names = ['barry', 'doko', 'pitrou', 'christian.heimes', 'dmalcolm', 'sbt', 'miss-islington']
    pr_nums = ['16264', '20380', '20412', '20626']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue17258'
    versions = ['Python 3.7']

    @davidmalcolm
    Copy link
    Member Author

    Within multiprocessing.connection, deliver_challenge() and
    answer_challenge() use hmac for a challenge/response.

    hmac implicitly defaults to using MD5.

    MD5 should no longer be used for security purposes. See e.g.
    http://www.kb.cert.org/vuls/id/836068

    This fails in a FIPS-compliant environment (e.g. with the patches I
    apply to hashlib in bpo-9216).

    There's thus a possibility of an attacker defeating the multiprocessing
    authenticator.

    I'm attaching a patch which changes multiprocessing to use a clearly
    identified algorithm (for the day when it needs changing again),
    hardcoding it as "sha256"; presumably all processes within a
    multiprocess program that share authkey can share the algorithm.

    It's not clear to me whether hmac.py should also be changed (this would
    seem to have tougher backwards-compat concerns).

    [Note to self: I'm tracking this downstream for RHEL as
    https://bugzilla.redhat.com/show_bug.cgi?id=879695 (this bug is
    currently only visible to RH employees)]

    @davidmalcolm davidmalcolm added the stdlib Python modules in the Lib dir label Feb 20, 2013
    @tiran
    Copy link
    Member

    tiran commented Feb 20, 2013

    The statement "MD5 should no longer be used for security purposes" is not entirely correct. MD5 should no longer be used as cryptographic hash function for signatures. However HMAC-MD5 is a different story.

    From https://tools.ietf.org/html/rfc6151

    The attacks on HMAC-MD5 do not seem to indicate a practical
    vulnerability when used as a message authentication code.
    [...]
    Therefore, it may not be urgent to remove HMAC-MD5 from the existing
    protocols. However, since MD5 must not be used for digital
    signatures, for a new protocol design, a ciphersuite with HMAC-MD5
    should not be included.

    I agree that we should slowly migrate to a more modern MAC such as HMAC-SHA256. AES-CBC is too hard to get right and most AES implementation are vulnerable to timing attacks, too.

    How about we include the name of the MAC in multiprocessing's wire protocol and define "no MAC name given" as HMAC-MD5? Please don't call it SHA256 but HMAC-SHA256, too.

    @sbt
    Copy link
    Mannequin

    sbt mannequin commented Feb 20, 2013

    Banning md5 as a matter of policy may be perfectly sensible.

    However, I think the way multiprocessing uses hmac authentication is *not* affected by the collision attacks the advisory talks about. These depend on the attacker being able to determine for himself whether a particular candidate string is a "solution".

    But with the way multiprocessing uses hmac authentication there is no way for the attacker to check for himself whether a candidate string has the desired hash: he does not know what the desired hash value is, or even what the hash function is. (The effective hash function, though built on top of md5, depends on the secret key.)

    @tiran
    Copy link
    Member

    tiran commented Dec 27, 2017

    Dave, are you still interested to address the issue?

    I think it's a good idea to replace HMAC-MD5 in the long run. But instead of hard-coding another hash algorithm, I would like to see an improved handshake protocol that supports flexible authentication algorithms. You could send an algorithm indicator (e.g. HMAC_SHA256) in the request.

    It would be really cool to run multiprocessing protocol over TLS with support for SASL with SCRAM or EXTERNAL (TLS cert auth, AF_UNIX PEERCRED, GSSAPI)...

    @tiran tiran added the 3.7 (EOL) end of life label Dec 27, 2017
    @pitrou
    Copy link
    Member

    pitrou commented Dec 28, 2017

    Indeed, we probably want a flexible handshake mechanism. This needn't be very optimized: probably a magic number followed by a JSON-encoded dict is sufficient.

    (of course, several years down the road, someone will engineer a downgrade attack)

    @miss-islington
    Copy link
    Contributor

    New changeset b022e5c by Christian Heimes in branch 'master':
    bpo-17258: Add requires_hashdigest to multiprocessing tests (GH-20412)
    b022e5c

    @miss-islington
    Copy link
    Contributor

    New changeset 196810a by Miss Islington (bot) in branch '3.9':
    bpo-17258: Add requires_hashdigest to multiprocessing tests (GH-20412)
    196810a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @AA-Turner AA-Turner added 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes 3.8 only security fixes 3.12 bugs and security fixes labels Jun 7, 2022
    @gpshead gpshead added type-feature A feature request or enhancement and removed 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes 3.8 only security fixes 3.7 (EOL) end of life labels Oct 22, 2022
    @gpshead
    Copy link
    Member

    gpshead commented Oct 22, 2022

    This isn't something to backport to a release as what we've got isn't broken. Changing this is a feature.

    @gpshead
    Copy link
    Member

    gpshead commented Nov 13, 2022

    So #20380 is a more complicated version of my draft PR above. In other PRs and issues related to this in the past, I see one claim by @tiran in particular that bothers me - #16264 (comment) - "The change breaks backward compatibility. multiprocessing supports distributed computing across multiple machines and works with multiple Python versions. With the change a controller with Python 3.N+1 would no longer be able to talk to a 3.N server or the other way around."

    What evidence is there that multi-python-version use of multiprocessing rather than use as a single Python process launching and controlling a bunch of children is a supported use case? People really should not be using multiprocessing that way. This isn't a distributed computing system.

    gpshead added a commit that referenced this issue Nov 20, 2022
    …tocol (gh-99623)
    
    Describe the multiprocessing connection protocol.
    
    It isn't a good protocol, but it is what it is.  This way we can more
    easily reason about making changes to it in a backwards compatible way.
    @CAM-Gerlach
    Copy link
    Member

    Just to note, this came up again, in #100755 , an apparent duplicate of this issue

    gpshead added a commit that referenced this issue May 20, 2023
    bpo-17258:  `multiprocessing` now supports stronger HMAC algorithms for inter-process connection authentication rather than only HMAC-MD5.
    
    Signed-off-by: Christian Heimes <christian@python.org>
    
    gpshead: I Reworked to be more robust while keeping the idea.
    
    The protocol modification idea remains, but we now take advantage of the
    message length as an indicator of legacy vs modern protocol version.  No
    more regular expression usage.  We now default to HMAC-SHA256, but do so
    in a way that will be compatible when communicating with older clients
    or older servers. No protocol transition period is needed.
    
    More integration tests to verify these claims remain true are required. I'm
    unaware of anyone depending on multiprocessing connections between
    different Python versions.
    
    ---------
    
    Signed-off-by: Christian Heimes <christian@python.org>
    Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
    @gpshead
    Copy link
    Member

    gpshead commented May 20, 2023

    3.12beta1 is going out using hmac-sha256 by default with the above PR merge.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 bugs and security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement type-security A security issue
    Projects
    Development

    No branches or pull requests

    8 participants