ZIP does not support timestamps before 1980 #78278

Closed
encukou opened this issue Jul 11, 2018 · 16 comments
Labels
3.8 (only security fixes), stdlib (Python modules in the Lib dir)

Comments

@encukou
Member

encukou commented Jul 11, 2018

BPO 34097
Nosy @vstinner, @encukou, @serhiy-storchaka, @Dormouse759
PRs
  • bpo-34097: Add support for zipping files older than 1980-01-01 #8270
  • bpo-34097: Polish API design #8725
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-08-31.14:44:46.500>
    created_at = <Date 2018-07-11.14:35:58.247>
    labels = ['3.8', 'library']
    title = 'ZIP does not support timestamps before 1980'
    updated_at = <Date 2018-08-31.14:44:46.499>
    user = 'https://github.com/encukou'

    bugs.python.org fields:

    activity = <Date 2018-08-31.14:44:46.499>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-08-31.14:44:46.500>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2018-07-11.14:35:58.247>
    creator = 'petr.viktorin'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 34097
    keywords = ['patch']
    message_count = 16.0
    messages = ['321460', '321464', '321595', '321597', '321598', '321600', '321607', '322949', '322950', '322954', '322957', '322960', '323023', '323095', '324423', '324424']
    nosy_count = 4.0
    nosy_names = ['vstinner', 'petr.viktorin', 'serhiy.storchaka', 'Dormouse759']
    pr_nums = ['8270', '8725']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue34097'
    versions = ['Python 3.8']

    @encukou
    Member Author

    encukou commented Jul 11, 2018

    The ZIP format cannot handle times before 1980.
    bpo-6090 provided a nice error message for trying to add such files.

    I'm seeing a system for reproducible builds that sets mtime to 1970 (zero UNIX timestamp), resulting in files that Python can't package into (zip-based) wheels.

    At least here on Fedora, the zip command-line utility silently bumps old timestamps to 1980-01-01. Of course, silently corrupting data would not be good default behavior for Python.
    But in many cases timestamps don't matter. It would be nice to give ZipFile and ZipFile.write() a strict_timestamps=True keyword argument that could be turned off.
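For illustration, the failure described above can be reproduced with a short sketch. The helper name `try_zip_epoch_file` and the file names are made up for this example; it only assumes that `zipfile` rejects pre-1980 modification times by default, as the report states.

```python
import os
import tempfile
import zipfile


def try_zip_epoch_file():
    """Try to zip a file whose mtime is the UNIX epoch.

    Returns the error message if zipfile refuses the file, else None.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "old.txt")
        with open(src, "w") as f:
            f.write("hello")
        # Mimic a reproducible-build system: atime and mtime set to timestamp 0.
        os.utime(src, (0, 0))
        try:
            with zipfile.ZipFile(os.path.join(tmp, "out.zip"), "w") as zf:
                zf.write(src, "old.txt")
        except ValueError as exc:
            return str(exc)
    return None


if __name__ == "__main__":
    print(try_zip_epoch_file())
```

With the default settings the call is expected to return the error text rather than `None`, which is the situation that blocks wheel building in the report.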

    encukou added the 3.8 (only security fixes) and stdlib (Python modules in the Lib dir) labels on Jul 11, 2018
    @Dormouse759
    Mannequin

    Dormouse759 mannequin commented Jul 11, 2018

    I'm going to have a closer look at this and try to add the option.

    @Dormouse759
    Mannequin

    Dormouse759 mannequin commented Jul 13, 2018

    I have created a PR for this: #8270

    @serhiy-storchaka
    Member

    There are ZIP extensions that allow saving timestamps before 1980 and with better than 2-second resolution. It would be better to use this feature, but first we need to resolve bpo-17681.
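As background for the 1980 floor and the 2-second resolution mentioned here, this is a minimal sketch of the MS-DOS date/time packing used by the basic ZIP header fields. The function names `dos_encode` and `dos_decode` are hypothetical helpers written for this illustration, not part of `zipfile`.

```python
def dos_encode(date_time):
    """Pack (year, month, day, hour, minute, second) into two 16-bit fields."""
    year, month, day, hour, minute, second = date_time
    # The year is stored as a 7-bit offset from 1980, so 1980..2107 is the range.
    if not 1980 <= year <= 2107:
        raise ValueError("year out of range for the MS-DOS date field")
    dosdate = (year - 1980) << 9 | month << 5 | day
    # Only 5 bits are left for seconds, so they are stored halved.
    dostime = hour << 11 | minute << 5 | second // 2
    return dosdate, dostime


def dos_decode(dosdate, dostime):
    """Unpack the two 16-bit fields back into a 6-tuple."""
    return (
        (dosdate >> 9) + 1980,
        (dosdate >> 5) & 0xF,
        dosdate & 0x1F,
        dostime >> 11,
        (dostime >> 5) & 0x3F,
        (dostime & 0x1F) * 2,
    )


if __name__ == "__main__":
    packed = dos_encode((2018, 7, 11, 14, 35, 59))
    print(dos_decode(*packed))  # the odd second is rounded down to 58
```

A year before 1980 would need a negative offset, which the unsigned field cannot represent, and an odd second is lost on the round trip.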

    @encukou
    Member Author

    encukou commented Jul 13, 2018

    I'm not sure the extensions will solve this problem fully.
    If an implementation doesn't support these extensions, how does it handle them?

    For example, if Python 3.8 implements the extensions, I use py3.8 to create a zipfile containing old timestamps, and then want to uncompress the file using Python 2.7, how do I avoid losing the timestamp information?

    @serhiy-storchaka
    Member

    You can't, because Python 2.7 doesn't support it. But you will be able to pack files into a ZIP archive and extract them without loss on 3.8, and with loss of the timestamp on 2.7. And you will be able to use third-party utilities to extract files without loss.

    @encukou
    Member Author

    encukou commented Jul 13, 2018

    Which third-party utilities support these? As I said, AFAIK zip on my system does not.

    I assume the loss of data is the reason we have an error now -- if that wasn't a concern, zipfile could just silently bump the timestamp to 1980.

    When the extensions are implemented, strict_timestamps=False should additionally use the extension, but the default should still be to raise the error (to prevent metadata loss when decompressing with older/simpler tools).

    @vstinner
    Member

    vstinner commented Aug 2, 2018

    New changeset a2fe1e5 by Victor Stinner (Marcel Plch) in branch 'master':
    bpo-34097: Add support for zipping files older than 1980-01-01 (GH-8270)
    a2fe1e5

    @vstinner
    Member

    vstinner commented Aug 2, 2018

    Thank you Petr Viktorin for the bug report and thanks to Marcel Plch for the implementation of the new strict_timestamps keyword-only parameter!

    @vstinner vstinner closed this as completed Aug 2, 2018
    @serhiy-storchaka
    Member

    If we add a new option, I prefer to add it to the ZipFile constructor, similarly to allowZip64. Initially allowZip64 was False by default, because not all third-party tools supported the ZIP64 extension. Later the default was changed to True, but you can still force an error to be raised if the archive is not compatible with old tools. I think the same policy should be applied here: add the ability to write dates before 1980, and add an option for raising an error for dates before 1980.

    @vstinner
    Member

    vstinner commented Aug 2, 2018

    Serhiy:

    If we add a new option, I prefer to add it to the ZipFile constructor, similarly to allowZip64.

    Aha, that would make sense. I'm not sure it's useful to control the parameter per added file; controlling it per ZIP archive is enough.

    Marcel: would you mind trying to move the strict_timestamps parameter from ZipFile.write() to the ZipFile constructor?
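Per-archive control, as requested here, could look like the sketch below. It assumes the keyword-only `strict_timestamps` argument on the `ZipFile` constructor as documented for Python 3.8 and later; the helper name `zip_with_clamped_timestamp` and the file names are invented for the example.

```python
import os
import tempfile
import zipfile


def zip_with_clamped_timestamp():
    """Zip a file dated at the UNIX epoch and return the stored date_time."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "old.txt")
        with open(src, "w") as f:
            f.write("hello")
        os.utime(src, (0, 0))  # mtime before 1980

        archive = os.path.join(tmp, "out.zip")
        # One switch for the whole archive: old timestamps are bumped to
        # 1980-01-01 instead of raising ValueError (requires Python 3.8+).
        with zipfile.ZipFile(archive, "w", strict_timestamps=False) as zf:
            zf.write(src, "old.txt")

        with zipfile.ZipFile(archive) as zf:
            return zf.getinfo("old.txt").date_time


if __name__ == "__main__":
    print(zip_with_clamped_timestamp())  # (1980, 1, 1, 0, 0, 0)
```

Leaving the argument at its default of `True` keeps the error, so the metadata loss only happens when the caller opts in.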

    @vstinner vstinner reopened this Aug 2, 2018
    @Dormouse759
    Mannequin

    Dormouse759 mannequin commented Aug 2, 2018

    It seems reasonable, I'll have a look at it.

    @serhiy-storchaka
    Member

    Tests are failing on some buildbots. See bpo-34325.

    @vstinner
    Member

    vstinner commented Aug 3, 2018

    Tests are failing on some buildbots. See bpo-34325.

    Marcel Plch already fixed it: commit 7b41dba. The two buildbots are back to green.

    @vstinner
    Member

    New changeset 77b112c by Victor Stinner (Marcel Plch) in branch 'master':
    bpo-34097: Polish API design (GH-8725)
    77b112c

    @vstinner
    Member

    Thanks Petr Viktorin for reporting this issue and thanks Marcel Plch for the fix!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022