
Author vstinner
Recipients brett.cannon, christian.heimes, vstinner
Date 2013-12-04.09:57:49
Message-id <1386151069.82.0.541926480726.issue19883@psf.upfronthosting.co.za>
In-reply-to
Content
read_directory() uses fseek() and ftell(), which don't support offsets larger than LONG_MAX (2 GB on 32-bit systems). I don't know whether this is an issue. What happens if the file is larger?
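
As a rough illustration of the concern, the sketch below checks both calls for failure instead of assuming the position fits in a long. The helper name end_position() is hypothetical and not part of zipimport.c. On POSIX systems, ftell() is specified to fail with EOVERFLOW when the offset cannot be represented in a long; behaviour on other platforms is not covered here.

```c
#include <stdio.h>

/* Hypothetical helper, not from zipimport.c: seek to the end of the
 * stream and report the position. Returns 0 on success and -1 if
 * fseek() or ftell() fails, which includes the case where the
 * position cannot be represented in a long. */
static int end_position(FILE *fp, long *pos)
{
    long p;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    p = ftell(fp);
    if (p < 0)          /* ftell() returns -1L on error */
        return -1;
    *pos = p;
    return 0;
}
```

A caller would treat a -1 result as "cannot read this archive" rather than continuing with a bogus offset.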

Can "header_offset += arc_offset;" overflow? This instruction looks weird.

    header_position = ftell(fp);
    ...
    header_offset = get_long((unsigned char *)endof_central_dir + 16);
    arc_offset = header_position - header_offset - header_size;
    header_offset += arc_offset;

If I computed correctly, the final line can be replaced with:

    arc_offset = header_position - header_offset - header_size;
    header_offset = header_position - header_size;

(It is weird to reuse header_offset for two different values; a new variable could be added.)
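
The equivalence of the two forms can be checked with a small sketch. The zip_layout struct and both function names are hypothetical and exist only for this comparison; the arithmetic assumes inputs small enough that nothing overflows.

```c
/* Hypothetical result type: keeps the two meanings of "header_offset"
 * in separate fields instead of reusing one variable. */
typedef struct {
    long arc_offset;      /* bytes prepended before the archive data */
    long central_offset;  /* absolute position of the central directory */
} zip_layout;

/* Original form: header_offset is first the value stored in the file,
 * then the adjusted absolute position. */
static zip_layout layout_original(long header_position,
                                  long header_offset, long header_size)
{
    zip_layout r;
    r.arc_offset = header_position - header_offset - header_size;
    header_offset += r.arc_offset;
    r.central_offset = header_offset;
    return r;
}

/* Proposed form: the adjusted position is computed directly, since
 * header_offset + (header_position - header_offset - header_size)
 * simplifies to header_position - header_size. */
static zip_layout layout_simplified(long header_position,
                                    long header_offset, long header_size)
{
    zip_layout r;
    r.arc_offset = header_position - header_offset - header_size;
    r.central_offset = header_position - header_size;
    return r;
}
```

For example, with header_position = 1000, header_offset = 300 and header_size = 200, both functions give arc_offset = 500 and central_offset = 800.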

Instead of checking that "header_offset > LONG_MAX", it may be safer to check that:

 - header_size >= 0
 - header_offset >= 0
 - header_offset + header_size <= LONG_MAX ---> header_offset <= LONG_MAX - header_size
 - arc_offset >= 0 ---> header_position >= header_offset + header_size
 - header_offset >= 0 after the adjustment ---> header_position >= header_size
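
The checks above could be written so that no intermediate sum overflows. This is only a sketch: zip_offsets_valid() is a hypothetical name, and the values are assumed to have already been read into long variables.

```c
#include <limits.h>

/* Hypothetical validation helper. Returns 1 if the three values are
 * mutually consistent, 0 otherwise. Comparisons are ordered so that
 * no addition is performed before it is known to be safe. */
static int zip_offsets_valid(long header_position,
                             long header_offset, long header_size)
{
    /* header_size >= 0 and header_offset >= 0 */
    if (header_size < 0 || header_offset < 0)
        return 0;
    /* header_offset + header_size <= LONG_MAX, tested without
     * computing the sum */
    if (header_offset > LONG_MAX - header_size)
        return 0;
    /* arc_offset >= 0; the sum cannot overflow at this point.
     * Because header_offset >= 0, this also implies
     * header_position >= header_size. */
    if (header_position < header_offset + header_size)
        return 0;
    return 1;
}
```

A negative header_position, such as the -1 returned by a failed ftell(), is rejected by the last comparison because the right-hand side is never negative.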

If all these values must be positive according to ZIP format, get_long() may be replaced with get_ulong() to simplify these checks.
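
A possible shape for such a get_ulong() is sketched below. It is hypothetical, not the existing zipimport.c code, and assumes the field is a 4-byte little-endian value, as ZIP fields are.

```c
/* Hypothetical get_ulong(): decode a 4-byte little-endian unsigned
 * field. Each byte is widened before shifting so the result is never
 * negative, whatever the width of long on the platform. */
static unsigned long get_ulong(const unsigned char *buf)
{
    return (unsigned long)buf[0]
         | ((unsigned long)buf[1] << 8)
         | ((unsigned long)buf[2] << 16)
         | ((unsigned long)buf[3] << 24);
}
```

With unsigned values the ">= 0" checks become unnecessary, but on a system with a 32-bit long the result can still exceed LONG_MAX, so an upper-bound check would still be needed before passing it to fseek().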
History
Date                 User      Action  Args
2013-12-04 09:57:49  vstinner  set     recipients: + vstinner, brett.cannon, christian.heimes
2013-12-04 09:57:49  vstinner  set     messageid: <1386151069.82.0.541926480726.issue19883@psf.upfronthosting.co.za>
2013-12-04 09:57:49  vstinner  link    issue19883 messages
2013-12-04 09:57:49  vstinner  create