Stored (uncompressed) ZipExtFile in zipfile can be seekable at lower cost #88339

Closed
JuniorJPDJ mannequin opened this issue May 19, 2021 · 2 comments
Labels: 3.12 (bugs and security fixes), stdlib (Python modules in the Lib dir), type-feature (a feature request or enhancement)

Comments

JuniorJPDJ mannequin commented May 19, 2021

BPO 44173
Nosy @Yhg1s, @gpshead, @serhiy-storchaka, @JuniorJPDJ
PRs
  • bpo-44173: better approach for seeking in non-compressed ZipExtFile #26227
  • gh-88339: enable fast seeking of uncompressed unencrypted zipfile.ZipExtFile #27737

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = None
    created_at = <Date 2021-05-19.02:28:39.857>
    labels = ['type-feature', 'library', '3.11']
    title = 'Stored (uncompressed) ZipExtFile in zipfile can be seekable at lower cost'
    updated_at = <Date 2021-08-18.17:30:30.110>
    user = 'https://github.com/JuniorJPDJ'

    bugs.python.org fields:

    activity = <Date 2021-08-18.17:30:30.110>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-05-19.02:28:39.857>
    creator = 'juniorjpdj'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44173
    keywords = ['patch']
    message_count = 1.0
    messages = ['393918']
    nosy_count = 5.0
    nosy_names = ['twouters', 'gregory.p.smith', 'alanmcintyre', 'serhiy.storchaka', 'juniorjpdj']
    pr_nums = ['26227', '27737']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue44173'
    versions = ['Python 3.11']

    JuniorJPDJ mannequin commented May 19, 2021

    At the moment, seeking in a stored (uncompressed) ZipExtFile reads and discards data up to the target position, just as it does for the compressed variants.
    This is unnecessary: an uncompressed member inside a zip file can be seeked freely without this penalty.

    Many applications depend on ZipExtFile being seekable, so seeking directly would significantly reduce the performance and I/O cost.

    I have created a proof-of-concept patch.
    It disables CRC checking after the first seek, because the CRC cannot be verified unless the whole file is read.
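
To make the access pattern concrete, here is a minimal sketch that uses only the public `zipfile` API. It is not the patch itself: the in-memory archive, the member name `data.bin`, and the chosen offset are made up for illustration. It writes one stored member, seeks forward inside it, and reads a few bytes. The bytes returned should be the same with or without the change discussed here; only the amount of intermediate data read during the seek is expected to differ.

```python
import io
import zipfile

# Build a small archive in memory with one stored (uncompressed) member.
# The member name and payload are hypothetical, chosen only for this sketch.
payload = bytes(range(256)) * 4096  # 1 MiB of predictable data
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("data.bin", payload)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("data.bin")
    # Confirm the member really is stored rather than deflated.
    assert info.compress_type == zipfile.ZIP_STORED
    with zf.open(info) as member:  # member is a zipfile.ZipExtFile
        assert member.seekable()
        offset = 700_000
        member.seek(offset)        # forward seek within the member
        chunk = member.read(16)    # read a few bytes at the new position
        position = member.tell()

# The data at the seek target matches the original payload.
assert chunk == payload[offset:offset + 16]
```

Note that the sketch only reads a slice after seeking, which is the case where, per the comment above, a whole-file CRC check cannot apply.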

    JuniorJPDJ mannequin added the 3.11 (only security fixes), stdlib, and type-feature labels May 19, 2021
    ezio-melotti transferred this issue from another repository Apr 10, 2022
    gpshead pushed a commit that referenced this issue Aug 6, 2022
    …ExtFile (GH-27737)
    
    Avoid reading all of the intermediate data in uncompressed items in a zip file when the user seeks forward.
    
    Contributed by: @JuniorJPDJ
    gpshead added the 3.12 (bugs and security fixes) label and removed the 3.11 (only security fixes) label Aug 6, 2022

    gpshead commented Aug 6, 2022

    Merged for 3.12. Thanks for the contribution!

    gpshead closed this as completed Aug 6, 2022
    iritkatriel pushed a commit to iritkatriel/cpython that referenced this issue Aug 11, 2022
    …le.ZipExtFile (pythonGH-27737)
    
    Avoid reading all of the intermediate data in uncompressed items in a zip file when the user seeks forward.
    
    Contributed by: @JuniorJPDJ