Author anadius
Recipients anadius, shaanbhaya
Date 2021-11-05.00:04:58
Message-id <1636070698.99.0.861558169316.issue44067@roundup.psfhosted.org>
Content
I was looking at `zipfile._strip_extra`, trying to figure out how it works. It doesn't: it drops every extra field that comes after the last matching one, because the bytes kept since the last match are never flushed once the loop ends. That's what causes this issue.
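
To make the failure concrete, here is a minimal sketch. It assumes the current loop flushes kept bytes only when it reaches a matching field and never after the loop ends, which is what the description above implies. `buggy_strip_extra`, `field`, and the header IDs are illustrative names, and `_EXTRA_FIELD_STRUCT` is redefined locally as a stand-in so the snippet runs on its own.

```python
import struct

# Local stand-in for zipfile._EXTRA_FIELD_STRUCT:
# 2-byte header ID + 2-byte data size, little-endian.
_EXTRA_FIELD_STRUCT = struct.Struct('<HH')


def field(xid, data):
    # Hypothetical helper: build one extra field (4-byte header + payload).
    return _EXTRA_FIELD_STRUCT.pack(xid, len(data)) + data


def buggy_strip_extra(extra, xids):
    # Assumed shape of the current logic: kept bytes are appended only
    # when a matching field is found, so the tail is lost.
    unpack = _EXTRA_FIELD_STRUCT.unpack
    modified = False
    buffer = []
    start = i = 0
    while i + 4 <= len(extra):
        xid, xlen = unpack(extra[i : i + 4])
        j = i + 4 + xlen
        if xid in xids:
            if i != start:
                buffer.append(extra[start : i])
            start = j
            modified = True
        i = j
    if not modified:
        return extra
    return b''.join(buffer)


a = field(0x7075, b'aa')    # kept, sits before the match
b = field(0x0001, b'bbbb')  # the field to strip
c = field(0x5455, b'c')     # should be kept, but follows the match

result = buggy_strip_extra(a + b + c, (0x0001,))
print(result == a)      # True: only the leading field survives
print(result == a + c)  # False: the trailing field was dropped
```

With the field to strip placed last (`a + c + b`), the same function returns `a + c`, which is probably why the problem is easy to miss.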

Here's a fixed version:

def _strip_extra(extra, xids):
    # Remove Extra Fields with specified IDs.
    unpack = _EXTRA_FIELD_STRUCT.unpack
    modified = False
    buffer = []
    start = i = 0
    while i + 4 <= len(extra):
        xid, xlen = unpack(extra[i : i + 4])
        j = i + 4 + xlen
        if xid in xids:
            if i != start:
                buffer.append(extra[start : i])
            start = j
            modified = True
        i = j
    # Flush the fields that follow the last stripped one.
    if i != start:
        buffer.append(extra[start : i])
    if not modified:
        return extra
    return b''.join(buffer)

Or this one, easier to understand:

def _strip_extra(extra, xids):
    # Remove Extra Fields with specified IDs.
    unpack = _EXTRA_FIELD_STRUCT.unpack
    modified = False
    buffer = []
    i = 0
    while i + 4 <= len(extra):
        xid, xlen = unpack(extra[i : i + 4])
        j = i + 4 + xlen
        if xid in xids:
            modified = True
        else:
            buffer.append(extra[i : j])
        i = j
    if not modified:
        return extra
    return b''.join(buffer)

Not sure which one is better.
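
One way to compare them is to check that both return the same bytes for a handful of inputs. The sketch below does that; `strip_extra_runs`, `strip_extra_fields`, `field`, and the sample IDs are illustrative names, and `_EXTRA_FIELD_STRUCT` is again a local stand-in for the one in `zipfile`.

```python
import struct

# Local stand-in for zipfile._EXTRA_FIELD_STRUCT.
_EXTRA_FIELD_STRUCT = struct.Struct('<HH')


def field(xid, data):
    # Hypothetical helper: build one extra field (4-byte header + payload).
    return _EXTRA_FIELD_STRUCT.pack(xid, len(data)) + data


def strip_extra_runs(extra, xids):
    # First version: copy each contiguous run of kept fields in one slice.
    unpack = _EXTRA_FIELD_STRUCT.unpack
    modified = False
    buffer = []
    start = i = 0
    while i + 4 <= len(extra):
        xid, xlen = unpack(extra[i : i + 4])
        j = i + 4 + xlen
        if xid in xids:
            if i != start:
                buffer.append(extra[start : i])
            start = j
            modified = True
        i = j
    if i != start:
        buffer.append(extra[start : i])
    if not modified:
        return extra
    return b''.join(buffer)


def strip_extra_fields(extra, xids):
    # Second version: copy kept fields one at a time.
    unpack = _EXTRA_FIELD_STRUCT.unpack
    modified = False
    buffer = []
    i = 0
    while i + 4 <= len(extra):
        xid, xlen = unpack(extra[i : i + 4])
        j = i + 4 + xlen
        if xid in xids:
            modified = True
        else:
            buffer.append(extra[i : j])
        i = j
    if not modified:
        return extra
    return b''.join(buffer)


a = field(0x7075, b'aa')
b = field(0x0001, b'bbbb')  # the field to strip
c = field(0x5455, b'c')

# Includes empty input, match first/last/middle, repeated matches,
# and one stray trailing byte that is too short to be a header.
cases = [b'', a, b, a + b, b + a, a + b + c, b + a + b + c, a + b + c + b'\x00']
for extra in cases:
    assert strip_extra_runs(extra, (0x0001,)) == strip_extra_fields(extra, (0x0001,))
```

On these inputs the two agree, so the difference is one of style and slice count: the first makes one slice per run of kept fields, the second one slice per kept field. Both also drop trailing bytes that are too short to form a header whenever at least one field was stripped.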
History
Date                 User     Action  Args
2021-11-05 00:04:59  anadius  set     recipients: + anadius, shaanbhaya
2021-11-05 00:04:58  anadius  set     messageid: <1636070698.99.0.861558169316.issue44067@roundup.psfhosted.org>
2021-11-05 00:04:58  anadius  link    issue44067 messages
2021-11-05 00:04:58  anadius  create