classification
Title: zipfile with multiprocessing: zipfile.BadZipFile
Type: crash
Stage:
Components: Library (Lib)
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: maxime-lemonnier
Priority: normal
Keywords:

Created on 2020-01-16 18:24 by maxime-lemonnier, last changed 2020-01-16 18:32 by maxime-lemonnier.

Files
File name, Uploaded, Description
test_filesource.py maxime-lemonnier, 2020-01-16 18:24 python script
foo_bar_small.zip maxime-lemonnier, 2020-01-16 18:24
Messages (2)
msg360134 - Author: maxime-lemonnier, Date: 2020-01-16 18:24
zipfile sometimes raises zipfile.BadZipFile when the same zip file is opened from multiple processes.

See the attached script to reproduce the error. Reproducing it requires a zip archive that contains multiple files.
msg360135 - Author: maxime-lemonnier, Date: 2020-01-16 18:32
Here's my console output:

python3 test_filesource.py
lock
file
access_mode = file, nb processes = 1, res = 110289, 0.08039402961730957 ms/frame
file
access_mode = file, nb processes = 4, res = 110289, 0.32297492027282715 ms/frame
lock
access_mode = lock, nb processes = 4, res = 110289, 0.2950408458709717 ms/frame
lock
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/path/to/script/test_filesource.py", line 64, in read_small
    return fs_small[i%len(fs_small)][42]
  File "/path/to/script/test_filesource.py", line 55, in __getitem__
    data_bytes = self.access( lambda archive: archive.read(member))
  File "/path/to/script/test_filesource.py", line 27, in access_lock
    return f(self.archive)
  File "/path/to/script/test_filesource.py", line 55, in <lambda>
    data_bytes = self.access( lambda archive: archive.read(member))
  File "/usr/lib/python3.6/zipfile.py", line 1337, in read
    with self.open(name, "r", pwd) as fp:
  File "/usr/lib/python3.6/zipfile.py", line 1419, in open
    % (zinfo.orig_filename, fname))
zipfile.BadZipFile: File name in directory '00000005.pkl' and header b'00000004.pkl' differ.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/script/test_filesource.py", line 90, in <module>
    f(4, "lock") #crash
  File "/path/to/script/test_filesource.py", line 81, in f
    for i in pool.imap_unordered(read_small, frames):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
zipfile.BadZipFile: File name in directory '00000005.pkl' and header b'00000004.pkl' differ.
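
The header mismatch above ('00000005.pkl' expected, b'00000004.pkl' read) is consistent with several processes seeking and reading through one inherited file handle. This is an assumption, since the attached script is not reproduced here: if self.archive is a ZipFile opened before the pool forks, the workers share a single file offset, and the lock inside ZipFile is a threading lock that only serializes threads within one process. The sketch below shows one possible workaround under that assumption, opening a separate ZipFile in each worker. The names init_worker, read_member and make_demo_zip are hypothetical and do not come from test_filesource.py.

```python
import multiprocessing
import os
import tempfile
import zipfile

# Per-process archive handle (hypothetical name); set by init_worker.
_archive = None


def init_worker(zip_path):
    """Open a private ZipFile in the calling process.

    Used as a Pool initializer, so every worker gets its own file
    handle and its own file offset instead of one inherited via fork.
    """
    global _archive
    _archive = zipfile.ZipFile(zip_path, "r")


def read_member(name):
    """Read one member through this process's own handle."""
    return name, _archive.read(name)


def make_demo_zip(path, count=8):
    """Write a small multi-member archive (stand-in for foo_bar_small.zip)."""
    names = ["%08d.pkl" % i for i in range(count)]
    with zipfile.ZipFile(path, "w") as zf:
        for i, name in enumerate(names):
            zf.writestr(name, b"payload-%d" % i)
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "demo.zip")
        names = make_demo_zip(zip_path)
        # Each of the 4 workers calls init_worker(zip_path) once at startup.
        with multiprocessing.Pool(
            4, initializer=init_worker, initargs=(zip_path,)
        ) as pool:
            for name, data in pool.imap_unordered(read_member, names * 50):
                # Every read must return the payload written for that member.
                assert data == b"payload-%d" % int(name[:8])
```

Opening the archive per worker costs one extra parse of the central directory per process, which is usually small next to the reads themselves. Whether this removes the failure in the reporter's setup has not been verified here.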
History
Date User Action Args
2020-01-16 18:32:16 maxime-lemonnier set messages: + msg360135
2020-01-16 18:24:25 maxime-lemonnier set files: + foo_bar_small.zip
2020-01-16 18:24:07 maxime-lemonnier create