zipfile has problem reading zip files over 2GB #47785

Closed
alonwas mannequin opened this issue Aug 10, 2008 · 14 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@alonwas
Mannequin

alonwas mannequin commented Aug 10, 2008

BPO 3535
Nosy @loewis, @amauryfa, @pitrou
Files
  • large.c
  • largezip.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2008-09-05.23:43:33.596>
    created_at = <Date 2008-08-10.09:27:27.354>
    labels = ['type-bug', 'library']
    title = 'zipfile has problem reading zip files over 2GB'
    updated_at = <Date 2008-09-05.23:43:33.595>
    user = 'https://bugs.python.org/alonwas'

    bugs.python.org fields:

    activity = <Date 2008-09-05.23:43:33.595>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-09-05.23:43:33.596>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2008-08-10.09:27:27.354>
    creator = 'alonwas'
    dependencies = []
    files = ['11105', '11137']
    hgrepos = []
    issue_num = 3535
    keywords = ['patch', 'needs review']
    message_count = 14.0
    messages = ['70968', '70987', '71003', '71025', '71076', '71101', '71265', '71269', '72590', '72630', '72631', '72632', '72649', '72651']
    nosy_count = 5.0
    nosy_names = ['loewis', 'amaury.forgeotdarc', 'alanmcintyre', 'pitrou', 'alonwas']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3535'
    versions = ['Python 3.1', 'Python 2.7']

    @alonwas
    Mannequin Author

    alonwas mannequin commented Aug 10, 2008

    zipfile complains about "Bad magic number for central directory" when I
    give it files over 2GB. I believe the problem is that the offset of the
    central directory should be read as an unsigned long rather than as a
    signed long. Modifying structEndArchive from "<4s4H2lH" to "<4s4H2LH"
    (note the capital L) should probably fix it. When the offset is 2^31 or
    larger, it is read as a negative number and the code fails to find the
    central directory. I'd appreciate it if someone more knowledgeable
    looked at the problem and the suggested fix.
    Thanks,
    Alon
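
The sign problem described in the report can be reproduced with the struct module alone. The sketch below is only an illustration: the record contents (a single entry, a 100-byte central directory, a 3 GiB offset) are made-up values and the helper name central_dir_offset is hypothetical; only the two format strings come from the report.

```python
import struct

# Format strings quoted in the report: 'l' is a signed 32-bit field,
# 'L' an unsigned one (standard sizes, because of the '<' prefix).
SIGNED_FMT = "<4s4H2lH"
UNSIGNED_FMT = "<4s4H2LH"


def central_dir_offset(record: bytes, fmt: str) -> int:
    """Return the central-directory offset (field index 6) of a record."""
    return struct.unpack(fmt, record)[6]


# Hypothetical end-of-archive record whose central directory starts at 3 GiB.
offset = 3 * 1024**3
record = struct.pack(UNSIGNED_FMT, b"PK\x05\x06", 0, 0, 1, 1, 100, offset, 0)

signed_offset = central_dir_offset(record, SIGNED_FMT)      # negative value
unsigned_offset = central_dir_offset(record, UNSIGNED_FMT)  # original value

print(signed_offset, unsigned_offset)
```

Read with the lowercase "l", the same four bytes decode to a negative number, which is consistent with the reader looking for the central directory at the wrong position.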

    @alonwas alonwas mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Aug 10, 2008
    @loewis
    Mannequin

    loewis mannequin commented Aug 10, 2008

    What Python version exactly are you using? This might have been fixed in
    2.5.2, with r60117.

    @alonwas
    Mannequin Author

    alonwas mannequin commented Aug 11, 2008

    Hi,
    I'm using 2.5.2 (r252:60911),
    Thanks,
    Alon


    @pitrou
    Member

    pitrou commented Aug 11, 2008

    Do you have a public URL for such a zip file?

    @alonwas
    Mannequin Author

    alonwas mannequin commented Aug 13, 2008

    Hi Antoine,
    The problem happens for files between 2GB and 4GB. I can't really send
    you a link to such a big file. To reproduce the problem, you can
    generate one. I created (and attach) a tiny C program that helps
    generate one. If you want to, you can run it, save its output to a file
    and then add it to a zip file (it should compress around 12%). The
    resulting zip file will fail to open from python using the zipfile
    package because of the bug I mentioned. Please let me know whether this
    is enough information to reproduce the problem.
    Thanks,
    Alon
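
For readers without a C compiler, an input file of the right size can also be produced from Python. This is a stand-in under stated assumptions, not a port of the attached large.c (whose contents are not shown in this thread): the function name write_large_file, the output path and the use of random data are all hypothetical choices. Random data barely compresses, so the resulting archive stays close to the size of the input.

```python
import os


def write_large_file(path, total_bytes, chunk_size=1024 * 1024):
    """Write total_bytes of random data to path in chunks.

    Returns the number of bytes written. Writing in chunks keeps memory
    use bounded regardless of the requested size.
    """
    written = 0
    with open(path, "wb") as out:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            out.write(os.urandom(n))
            written += n
    return written


# Example call (not executed here, since it writes more than 2 GiB):
# write_large_file("large.bin", 2**31 + 1024 * 1024)
```

The file can then be added to an archive with an external zip tool, as described above.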


    @pitrou
    Member

    pitrou commented Aug 13, 2008

    > The problem happens for files between 2GB and 4GB. I can't really send
    > you a link to such a big file. To reproduce the problem, you can
    > generate one.

    The problem is that the "zip" command fails to create a zip file larger
    than 2GB (I get "zip I/O error: Invalid argument"). And even if it
    didn't fail, the internal structure of the zip file might not be exactly
    the same as with other compression tools. That's why I was asking you
    for an existing file.

    If I give you ssh/sftp access somewhere, would you be able to upload
    such a file?

    @alonwas
    Mannequin Author

    alonwas mannequin commented Aug 17, 2008

    Antoine,
    I had a similar problem with zip version 2.32, but this is fixed in
    version 3.0 (or on 64-bit architectures). Would you be able to give it a
    try with the newer version (which can be obtained from info-zip.org)?
    Unfortunately, my upload bandwidth will not allow me to upload such a
    big file.
    Thanks,
    Alon


    @pitrou
    Member

    pitrou commented Aug 17, 2008

    Alon, can you try with the following patch? It seems to fix it here.

    @pitrou
    Member

    pitrou commented Sep 5, 2008

    Alan, do you have an opinion on this?

    @alanmcintyre
    Mannequin

    alanmcintyre mannequin commented Sep 5, 2008

    Your patch seems like a better way to detect whether a file is written
    as Zip64, and it seems to be able to properly handle extracting a >2GB
    file from a >2GB archive, so I'd vote to include it.

    I tested it with r66233, using a file made from the output of large.c,
    zipped with the built-in archiver on OS X 10.4.11. All regression tests
    pass, including test_zipfile64, on both Linux and OS X.
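
The patch itself is not reproduced in this thread, so the following is not its code. It is a minimal sketch of one way Zip64 can be detected, assuming the layout in the ZIP format specification: a 20-byte Zip64 locator record, when present, sits immediately before the 22-byte end-of-archive record. The function name has_zip64_locator is hypothetical, and the sketch assumes an archive with no trailing comment.

```python
import struct

EOCD_SIG = b"PK\x05\x06"           # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # Zip64 end of central directory locator
EOCD_SIZE = 22                     # fixed part, archive comment excluded
LOCATOR_FMT = "<4sLQL"             # signature, disk number, offset, disk count
LOCATOR_SIZE = struct.calcsize(LOCATOR_FMT)


def has_zip64_locator(tail: bytes) -> bool:
    """Check the trailing bytes of a comment-less archive for a Zip64 locator.

    Assumption: `tail` ends exactly at the end of the archive and the
    archive has no comment, so the end-of-archive record is the last
    22 bytes.
    """
    if len(tail) < EOCD_SIZE + LOCATOR_SIZE:
        return False
    eocd_start = len(tail) - EOCD_SIZE
    if tail[eocd_start:eocd_start + 4] != EOCD_SIG:
        return False
    locator_start = eocd_start - LOCATOR_SIZE
    return tail[locator_start:locator_start + 4] == ZIP64_LOCATOR_SIG
```

Checking for the locator signature does not depend on how any 32-bit field in the end-of-archive record is interpreted, which is one reason a signature-based check can be more robust than inspecting field values.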

    @pitrou
    Member

    pitrou commented Sep 5, 2008

    Alan, do you have commit access? Otherwise the patch needs approval from
    another core developer.

    @alanmcintyre
    Mannequin

    alanmcintyre mannequin commented Sep 5, 2008

    No, I don't have commit access at the moment.

    @amauryfa
    Member

    amauryfa commented Sep 5, 2008

    I also agree with the patch. This seems to be the correct way to detect
    the Zip64 format.

    @pitrou
    Member

    pitrou commented Sep 5, 2008

    Fixed in r66240, r66241. Thanks!

    @pitrou pitrou closed this as completed Sep 5, 2008
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022