
Author jafo
Recipients cpn, georg.brandl, jafo
Date 2007-08-28.10:26:07
Message-id <1188296768.17.0.940722994522.issue1597011@psf.upfronthosting.co.za>
Content
There are some bugs in the bz2 module.  The problem boils down to the
following code.  Notice how c is stored into *buf *BEFORE* the check to
see if there was a read error:

   do {
      BZ2_bzRead(&bzerror, f->fp, &c, 1);
      f->pos++;
      *buf++ = c;
   } while (bzerror == BZ_OK && c != '\n' && buf != end);

This could be fixed by putting an "if (bzerror != BZ_OK) break;" after
the BZ2_bzRead() call.
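
A minimal sketch of that fix follows.  It is not the attached patch: so
that it compiles without libbz2, it swaps in a hypothetical in-memory
reader (Source, src_read, SRC_OK and SRC_END are all made-up names) that
has the same shape as BZ2_bzRead(&bzerror, f->fp, &c, 1).  It assumes a
failed read delivers no byte; the real BZ2_bzRead() also returns a byte
count, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes and in-memory source standing in for libbz2. */
enum { SRC_OK = 0, SRC_END = 1 };

typedef struct {
    const char *data;  /* bytes still to be delivered */
    size_t len;        /* number of bytes remaining */
    size_t pos;        /* bytes consumed so far, like f->pos */
} Source;

/* Same shape as BZ2_bzRead(&bzerror, fp, &c, 1): when no byte is
   available, the status is set and *c is left untouched. */
static void src_read(int *status, Source *s, char *c)
{
    if (s->len == 0) {
        *status = SRC_END;
        return;
    }
    *c = *s->data++;
    s->len--;
    *status = SRC_OK;
}

/* Reads up to n bytes, stopping after the first '\n'.  Returns the number
   of bytes stored.  The status is checked before c is copied, so a failed
   read neither stores a stale byte nor advances pos. */
static size_t read_line(Source *s, char *buf, size_t n, int *status)
{
    char *p = buf;
    char *end = buf + n;
    char c = 0;

    assert(n > 0);
    do {
        src_read(status, s, &c);
        if (*status != SRC_OK)
            break;
        s->pos++;
        *p++ = c;
    } while (c != '\n' && p != end);
    return (size_t)(p - buf);
}
```

With the check in place, reading the two-byte input "ab" stores exactly
two bytes; without it, the failed third read would append the stale value
of c and bump the position a third time.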

However, I also noticed that the universal newline section of the code
reads a character, increments f->pos, and only *THEN* checks whether
buf == end; if so, it throws the character away.

I changed the code around so that the read loop is unified between
universal newlines and regular newlines.  I guess this is a small
performance penalty, since it checks the newline mode for each
character read.  However, we're already making a BZ2_bzRead() call for
every character, so one additional comparison and jump to merge
duplicate code for maintenance reasons is probably a good plan,
especially since the cause of this bug only existed in one of the two
duplicated parts of the code.
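
Roughly, a unified loop could look like the sketch below.  This is an
illustration only, not the contents of python-trunk-bz2.patch: Source,
src_read, skip_lf and the universal flag are hypothetical names, and the
in-memory reader stands in for BZ2_bzRead() under the assumption that a
failed read delivers no byte.  The loop tests for buffer space before
reading, so no character is consumed and then thrown away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes and in-memory source standing in for libbz2. */
enum { SRC_OK = 0, SRC_END = 1 };

typedef struct {
    const char *data;  /* bytes still to be delivered */
    size_t len;        /* number of bytes remaining */
    int skip_lf;       /* last byte was '\r': swallow a following '\n' */
} Source;

/* Same shape as BZ2_bzRead(&bzerror, fp, &c, 1): when no byte is
   available, the status is set and *c is left untouched. */
static void src_read(int *status, Source *s, char *c)
{
    if (s->len == 0) {
        *status = SRC_END;
        return;
    }
    *c = *s->data++;
    s->len--;
    *status = SRC_OK;
}

/* One loop for both modes.  With universal != 0, "\r" and "\r\n" are
   both reported as "\n".  Returns the number of bytes stored in buf. */
static size_t read_line(Source *s, char *buf, size_t n,
                        int universal, int *status)
{
    char *p = buf;
    char *end = buf + n;
    char c = 0;

    assert(n > 0);
    *status = SRC_OK;
    while (p != end) {          /* check for space BEFORE reading */
        src_read(status, s, &c);
        if (*status != SRC_OK)
            break;              /* nothing was read, store nothing */
        if (universal) {
            if (s->skip_lf) {
                s->skip_lf = 0;
                if (c == '\n')
                    continue;   /* second half of "\r\n", already reported */
            }
            if (c == '\r') {
                s->skip_lf = 1;
                c = '\n';
            }
        }
        *p++ = c;
        if (c == '\n')
            break;
    }
    return (size_t)(p - buf);
}
```

The per-character cost of merging the two paths is the single test of
universal inside the loop.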

Please let me know if this looks good to commit.
Files
File name Uploaded
python-trunk-bz2.patch jafo, 2007-08-28.10:26:07
History
Date User Action Args
2007-08-28 10:26:08 jafo set spambayes_score: 0.537581 -> 0.53758144
recipients: + jafo, georg.brandl, cpn
2007-08-28 10:26:08 jafo set spambayes_score: 0.537581 -> 0.537581
messageid: <1188296768.17.0.940722994522.issue1597011@psf.upfronthosting.co.za>
2007-08-28 10:26:08 jafo link issue1597011 messages
2007-08-28 10:26:07 jafo create