Author gregory.p.smith
Recipients aroussel, bckohan, gregory.p.smith, iritkatriel, vstinner
Date 2020-10-24.19:05:16
Message-id <1603566316.55.0.803768614836.issue42096@roundup.psfhosted.org>
In-reply-to
Content
For what it's worth: false positives are always going to be possible in any "magic" check such as is_zipfile.

We don't check the start of the file because zip files are defined by the end-of-central-directory record at the end of the file, which contains the length information used to determine where within the file the zip archive actually starts.

The issue28494 tests are a demonstration of this; it is somewhat common practice to append a zipfile to an executable of various forms for use as application-specific data.
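To make the appended-archive case concrete, here is a minimal sketch using only the standard library. The `fake_executable` bytes and the `build_zip_bytes` helper are hypothetical stand-ins invented for illustration; they are not taken from the issue28494 tests or from the test data in this issue.

```python
import io
import zipfile


def build_zip_bytes() -> bytes:
    """Build a small in-memory zip archive (hypothetical example data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data.txt", "application data")
    return buf.getvalue()


# Hypothetical stand-in for an executable: arbitrary non-zip leading bytes.
fake_executable = b"\x7fELF" + b"\x00" * 64

# Append the archive to the "executable", as described above.
combined = fake_executable + build_zip_bytes()

# The combined file does not begin with the local-file-header magic ...
starts_with_zip_magic = combined.startswith(b"PK\x03\x04")

# ... but is_zipfile looks at the end of the file, so it still detects it.
detected = zipfile.is_zipfile(io.BytesIO(combined))

# ZipFile compensates for the prepended bytes when locating the members.
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    names = zf.namelist()
    payload = zf.read("data.txt")

print(starts_with_zip_magic, detected, names, payload)
```

A check that only looked at the first bytes of the file would reject this input, which is why the detection works from the end of the file instead.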

If you need a more reliable determination of file type that is not tied to a specific Python release, you might look at what the various file-type-sniffing "magic" libraries do for you. Some examples include:
 https://pypi.org/project/filetype/
 https://pypi.org/project/puremagic/
 https://pypi.org/project/python-magic/

I _can_ reproduce this issue with the test data @bckohan provided.

But I can't promise there is anything to fix here. Even if we make the test slightly more robust by looking at another byte or two, it is always possible for a file to appear to be several things at once based on small data signatures.

If nothing else, we should reinforce in the documentation that is_zipfile is at best a guess. False means the file is not a zip file as far as the zipfile module is concerned. True cannot guarantee that it is one.
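For callers who want more than a guess, one possible approach is to treat a True result as a hint only and then confirm the archive by actually opening it. The sketch below is an assumption about how that could look, not a proposed change to the zipfile module; `looks_like_expected_zip` and its `required_member` parameter are hypothetical names.

```python
import io
import zipfile


def looks_like_expected_zip(data: bytes, required_member: str) -> bool:
    """Hypothetical helper: confirm an is_zipfile() guess by opening the archive."""
    # Step 1: a False result is definitive as far as zipfile is concerned.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return False
    # Step 2: True is only a guess, so open the archive and inspect it.
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            if required_member not in zf.namelist():
                return False
            # testzip() returns None when every member passes its CRC check.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False


# A genuine archive containing the member the application expects.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("manifest.txt", "hello")
real_zip = buf.getvalue()

# Arbitrary bytes that merely end in an empty end-of-central-directory record;
# a small trailing signature like this is the kind of data that can satisfy
# a signature-only check.
signature_only = b"not really an archive" + b"PK\x05\x06" + b"\x00" * 18

print(looks_like_expected_zip(real_zip, "manifest.txt"))
print(looks_like_expected_zip(signature_only, "manifest.txt"))
print(looks_like_expected_zip(b"plain text", "manifest.txt"))
```

Requiring a known member name is a design choice for this sketch: it rejects inputs that merely parse as an empty archive, at the cost of needing application-specific knowledge about what the archive should contain.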
History
Date                 User             Action  Args
2020-10-24 19:05:17  gregory.p.smith  set     recipients: + gregory.p.smith, vstinner, iritkatriel, aroussel, bckohan
2020-10-24 19:05:16  gregory.p.smith  set     messageid: <1603566316.55.0.803768614836.issue42096@roundup.psfhosted.org>
2020-10-24 19:05:16  gregory.p.smith  link    issue42096 messages
2020-10-24 19:05:16  gregory.p.smith  create