Author bckohan
Recipients aroussel, bckohan, gregory.p.smith, iritkatriel, serhiy.storchaka, vstinner
Date 2020-10-27.18:53:33
Message-id <1603824813.94.0.592521996609.issue42096@roundup.psfhosted.org>
In-reply-to
Content
I concur with Gregory. It seems the right action here is simply to make the very real possibility of false positives apparent in the docs.

In my experience processing data from the wild, I see a fairly high false-positive rate of about 1 in 1000. I'm sure the probability depends on the types of files I'm working with. In any case, is_zipfile can't be made sufficient in and of itself for reliably identifying zip files. It still has utility in weeding out true negatives, though. In my case I never expect to see a self-extracting file or a file bundled into an executable, so I use the result of is_zipfile together with a manual check of the magic bytes at the start. So far so good.
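
For anyone wanting to do the same, the combined check can be sketched roughly as below. This is only an illustration, not the exact code I run: the helper name looks_like_plain_zip and the signature tuple are made up for this example, and the whole approach assumes you never expect data prepended to the archive (self-extracting stubs and the like), since those would be rejected.

```python
import zipfile

# Byte sequences a plain zip file starts with. Assumption for this sketch:
# nothing is ever prepended to the archive.
ZIP_START_SIGNATURES = (
    b"PK\x03\x04",  # local file header (archive with at least one member)
    b"PK\x05\x06",  # end of central directory record (empty archive)
)


def looks_like_plain_zip(path):
    """Return True only if is_zipfile() accepts the file AND the file
    starts with a zip signature. Hypothetical helper, not a stdlib API."""
    # is_zipfile() is cheap and reliably weeds out true negatives.
    if not zipfile.is_zipfile(path):
        return False
    # Manual magic-byte check to reject the false positives that
    # is_zipfile() lets through.
    with open(path, "rb") as f:
        return f.read(4) in ZIP_START_SIGNATURES
```

Note the trade-off: a valid zip with anything in front of it is reported as not a zip, which is fine for my data but would be wrong for a general-purpose check.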