random.seed mutates input bytearray #88184

Closed
arjaz mannequin opened this issue May 3, 2021 · 8 comments

Comments

@arjaz
Mannequin

arjaz mannequin commented May 3, 2021

BPO 44018
Nosy @rhettinger, @mdickinson, @pablogsal, @miss-islington, @shreyanavigyan, @miguendes, @arjaz
PRs
  • bpo-44018: random.seed() no longer mutates its inputs #25856
  • [3.9] bpo-44018: random.seed() no longer mutates its inputs (GH-25856) #25864
  • [3.8] bpo-44018: random.seed() no longer mutates its inputs (GH-25856) #25865
  • [3.10] bpo-44018: random.seed() no longer mutates its inputs (GH-25856) #25867
  • [3.10] bpo-44018: random.seed() no longer mutates its inputs (GH-25856) #25872
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2021-05-04.04:02:13.950>
    created_at = <Date 2021-05-03.09:13:38.608>
    labels = ['library', '3.9']
    title = 'random.seed mutates input bytearray'
    updated_at = <Date 2021-07-18.15:30:52.175>
    user = 'https://github.com/arjaz'

    bugs.python.org fields:

    activity = <Date 2021-07-18.15:30:52.175>
    actor = 'pablogsal'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2021-05-04.04:02:13.950>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2021-05-03.09:13:38.608>
    creator = 'arjaz'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44018
    keywords = ['patch']
    message_count = 8.0
    messages = ['392782', '392784', '392786', '392789', '392801', '392821', '392844', '392849']
    nosy_count = 7.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'pablogsal', 'miss-islington', 'shreyanavigyan', 'miguendes', 'arjaz']
    pr_nums = ['25856', '25864', '25865', '25867', '25872']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue44018'
    versions = ['Python 3.9']

    @arjaz
    Mannequin Author

    arjaz mannequin commented May 3, 2021

    https://github.com/python/cpython/blob/master/Lib/random.py#L157
    If a bytearray is passed as the seed, the function mutates it.
    Either this behavior should be documented, or the implementation should change.

    @arjaz arjaz mannequin added 3.9 only security fixes stdlib Python modules in the Lib dir labels May 3, 2021
    @rhettinger
    Contributor

    I don't see a bug here. As documented, "a str, bytes, or bytearray object gets converted to an int and all of its bits are used." The various input types are converted to an integer and it doesn't really matter how it is done as long as it is consistent. FWIW, the docs link to the source code so it can be examined if needed.

    @arjaz
    Mannequin Author

    arjaz mannequin commented May 3, 2021

    I find the following behaviour very confusing:

    >>> import random
    >>> a = bytearray("1234", "utf-8")
    >>> b = bytearray("1234", "utf-8")
    >>> a == b
    True
    >>> random.seed(a)
    >>> a == b
    False
    >>> a
    bytearray(b'1234\xd4\x04U\x9f`.\xabo\xd6\x02\xacv\x80\xda\xcb\xfa\xad\xd1603^\x95\x1f\tz\xf3\x90\x0e\x9d\xe1v\xb6\xdb(Q/.\x00\x0b\x9d\x04\xfb\xa5\x13>\x8b\x1cn\x8d\xf5\x9d\xb3\xa8\xab\x9d`\xbeK\x97\xcc\x9e\x81\xdb')
    >>> b
    bytearray(b'1234')

    The function's documentation does not say that it will change the input argument.
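As a workaround on affected versions, the caller can hand `random.seed` an immutable copy of the buffer. The sketch below assumes the seed is held in a bytearray named `seed_buf` (a name chosen here for illustration); since `bytes` objects cannot be modified in place, the original buffer is left alone whether or not the interpreter includes the fix.

```python
import random

# seed_buf is an illustrative name, not something from the report.
seed_buf = bytearray(b"1234")

# bytes(seed_buf) is an immutable copy, so seed() cannot alter seed_buf.
random.seed(bytes(seed_buf))
first = random.random()

# Re-seeding with the same bytes reproduces the same stream.
random.seed(bytes(seed_buf))
second = random.random()

print(seed_buf)         # the buffer still holds only b'1234'
print(first == second)
```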

    @miguendes
    Mannequin

    miguendes mannequin commented May 3, 2021

    The problem is that random.seed() does the following:

                if isinstance(a, str):
                    a = a.encode()
                a += _sha512(a).digest()
                a = int.from_bytes(a, 'big')
    

    and that will modify the bytearray in place.

    >>> a = bytearray("1234", "utf-8")
    >>> a += b"digest"
    >>> a
    bytearray(b'1234digest')

    IMHO, seed shouldn't modify its input. Since str and bytes are immutable, the mutation only happens when a bytearray is passed, which is inconsistent.
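The difference comes down to how augmented assignment treats mutable and immutable sequences. The sketch below uses two hypothetical helper functions (not part of `random`) to contrast `+=`, which extends a bytearray in place, with `+`, which builds a new object and only rebinds the local name.

```python
def extend_in_place(buf, suffix):
    # For a bytearray, += extends the caller's object in place.
    # For bytes, += creates a new object and rebinds the local name only.
    buf += suffix
    return buf


def extend_copy(buf, suffix):
    # + always builds a new object; the caller's buffer is untouched.
    buf = buf + suffix
    return buf


mutated = bytearray(b"1234")
result_in_place = extend_in_place(mutated, b"digest")

preserved = bytearray(b"1234")
result_copy = extend_copy(preserved, b"digest")

immutable = b"1234"
result_bytes = extend_in_place(immutable, b"digest")

print(mutated)    # bytearray(b'1234digest')
print(preserved)  # bytearray(b'1234')
print(immutable)  # b'1234'
```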

    @shreyanavigyan
    Mannequin

    shreyanavigyan mannequin commented May 3, 2021

    One solution would be to copy the parameter with copy.deepcopy and then operate on that copy. (The downside is that this changes the existing behavior.)
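A full deep copy is probably more than is needed here: building the concatenation with `+` instead of `+=` already yields a fresh object. The following is a minimal sketch of a non-mutating conversion, written as a standalone hypothetical helper `seed_to_int`; it mirrors the snippet quoted earlier in the thread and is not presented as the patch that was merged.

```python
from hashlib import sha512


def seed_to_int(a):
    """Hypothetical helper: turn a str/bytes/bytearray seed into an int
    without modifying the argument."""
    if isinstance(a, str):
        a = a.encode()
    # a + digest creates a new object, so a caller's bytearray is not touched.
    return int.from_bytes(a + sha512(a).digest(), "big")


buf = bytearray(b"1234")
n = seed_to_int(buf)

print(buf)                          # bytearray(b'1234')
print(n == seed_to_int(b"1234"))    # True: same bytes, same integer
```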

    @rhettinger
    Contributor

    Okay, I now understand your report and will prepare a fix.

    @rhettinger rhettinger changed the title from "Bug in random.seed" to "random.seed mutates input bytearray" May 3, 2021
    @rhettinger
    Contributor

    New changeset e733e99 by Miss Islington (bot) in branch '3.9':
    bpo-44018: random.seed() no longer mutates its inputs (GH-25856) (GH-25864)
    e733e99

    @rhettinger
    Contributor

    New changeset 2995bff by Miss Islington (bot) in branch '3.10':
    bpo-44018: random.seed() no longer mutates its inputs (GH-25856) (GH-25872)
    2995bff

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022