This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: TextIOWrapper: issues with interlaced read-write
Type: Stage: resolved
Components: IO Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, martin.panter, pitrou, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2011-05-30 12:38 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
textiowrapper_interlaced_read_write.patch vstinner, 2011-05-30 13:07
Messages (9)
msg137260 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-30 12:38
The following code fails with an assertion error (a Python exception for _pyio, a C assertion for io):

------------------
import io
import _pyio

with io.BytesIO(b'abcd') as raw:
    with _pyio.TextIOWrapper(raw, encoding='ascii') as f:
        f.read(1)
        f.write('2')
        f.tell()
------------------

I found this assertion while testing interlaced read-write on TextIOWrapper:
------------------
with io.BytesIO(b'abcd') as raw:
    with _pyio.TextIOWrapper(raw, encoding='ascii') as f:
        f.write("1")
        # read() must call writer.flush()
        assertEqual(f.read(1), 'b')
        # write() must rewind the raw stream
        f.write('2')
        assertEqual(f.read(), 'd')
        f.flush()
        assertEqual(raw.getvalue(), b'1b2d')

with io.BytesIO(b'abc') as raw:
    with _pyio.TextIOWrapper(raw, encoding='ascii') as f:
        assertEqual(f.read(1), 'a')
        # write() must undo reader readahead
        f.write("2")
        assertEqual(f.read(1), 'c')
        f.flush()
        assertEqual(raw.getvalue(), b'a2c')
------------------
These tests fail on the "read, write, read" path: write() breaks TextIOWrapper's internal state if a read() occurred just before. Note that _pyio.TextIOWrapper.write() contains the following comment...

# XXX What if we were just reading?

See also issue #12213 for BufferedRandom and BufferedRWPair.
msg137262 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-05-30 13:07
textiowrapper_interlaced_read_write.patch: TextIOWrapper.write() calls self.seek(self.tell()) if it has a decoder or if snapshot is not None.

I suppose that we can do better, but at least it does fix this issue.

"read(); write(); write()" only calls self.seek(self.tell()) once, at the first write. So I don't think that it changes anything for performance.

In which case can snapshot be something other than None while decoder is None?
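
The same seek(tell()) idea can also be applied from calling code, between the read and the write. The following is only a minimal sketch under stated assumptions: it uses the C io.TextIOWrapper over a BytesIO with the ASCII codec, and the names first, written and rest are illustrative. It is not the patch itself.

```python
import io

raw = io.BytesIO(b'abcd')
f = io.TextIOWrapper(raw, encoding='ascii')

first = f.read(1)      # the wrapper reads ahead internally, returns 'a'
f.seek(f.tell())       # drop the read-ahead and reposition the raw stream
f.write('2')           # now lands right after the character just read
f.flush()
written = raw.getvalue()
rest = f.read()        # reading resumes after the written character
```

With a stateless codec such as ASCII this is expected to leave b'a2cd' in the raw stream. It only sidesteps the problem for the caller and says nothing about codecs that carry state.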
msg137596 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-06-03 22:00
For C stdio files, intermixed reads and writes require a file positioning operation. This is a nuisance and a source of program bugs. I do not see any such limitation documented for our io module. So for both reasons, it would be nice not to have the limitation in the code.

If I understand, the essence of the patch is to do the file positioning automatically internally when needed.
msg137598 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-03 22:02
> If I understand, the essence of the patch is to do
> the file positioning automatically internally when needed.

My patch is just a proposal to fix the issue. I wrote "I suppose that we can do better": self.seek(self.tell()) is more of a workaround than a real fix. I don't understand why it does fix this bug :-)
msg137616 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-06-04 00:53
Perhaps the stdio requirement was based on an underlying OS (*nix?) requirement, which io has to fulfill even if it does not use stdio.

Stdio was, I presume, optimized for speed.  In the relatively rare case of mixed read/write, it *should* put the burden on the programmer.  Python is a bit different.
msg221439 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-24 08:54
Does anybody want to follow up on this?  #12213 was closed as fixed, #12513 is still open.
msg258636 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-20 02:53
Consider codecs that maintain an internal buffer (UTF-7) or other state (ISO-2022). When you call TextIOWrapper.read() and then tell(), I think the returned number is supposed to hold the _decoder_ state, so you can seek back and read again. But I don’t think the number holds any _encoder_ state, so seek() cannot set up the encoder properly for these more awkward codecs.
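
To illustrate what encoder state means here, a small sketch using the incremental encoder API from the codecs module. The choice of iso2022_jp as an ISO-2022 variant and the variable names are assumptions made for illustration only.

```python
import codecs

make_encoder = codecs.getincrementalencoder('iso2022_jp')

# A fresh encoder starts in ASCII mode, so 'a' is emitted unchanged.
fresh = make_encoder()
fresh_bytes = fresh.encode('a')

# After a Japanese character the encoder is in a different character set,
# so the same 'a' must be preceded by an escape sequence back to ASCII.
used = make_encoder()
used.encode('\u3042')
used_bytes = used.encode('a')
```

The bytes produced for the same text differ depending on what was encoded before, which is the state a seek() in the middle of a file would somehow have to reconstruct.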

I don’t think it is practical to fix this problem using the incremental codec API. You would need to construct the encoder’s state from the decoder. There are a couple of bugs marked as duplicates of this. What are the real-world use cases?
msg297126 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-28 01:38
Given that nobody has complained in the last 9 years (since Python 3.0 was released), I'm not sure that it's worth fixing this corner case.

If you consider that I'm wrong, please reopen the issue ;-)
msg330367 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-11-24 00:31
For the record, the more recent bug I mentioned was a complaint from 2015 (one and a half years before Victor’s comment). Even if it is not worth supporting writing after reading, the problem could be documented.
History
Date User Action Args
2022-04-11 14:57:17 admin set github: 56424
2019-11-17 23:20:53 martin.panter link issue38710 superseder
2018-11-24 00:31:23 martin.panter set resolution: out of date -> wont fix
messages: + msg330367
2017-06-28 01:38:47 vstinner set status: open -> closed
resolution: out of date
messages: + msg297126
stage: resolved
2016-01-20 11:25:50 BreamoreBoy set nosy: - BreamoreBoy
2016-01-20 02:53:42 martin.panter set nosy: + martin.panter
messages: + msg258636
2015-12-20 09:56:50 SilentGhost link issue25915 superseder
2015-03-02 15:45:52 r.david.murray link issue23562 superseder
2014-06-24 14:02:12 terry.reedy set versions: + Python 3.4, Python 3.5, - Python 3.2, Python 3.3
2014-06-24 08:54:14 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg221439
2011-07-01 16:51:00 Arfrever set nosy: + Arfrever
2011-06-04 00:53:06 terry.reedy set messages: + msg137616
2011-06-03 22:02:40 vstinner set messages: + msg137598
2011-06-03 22:00:05 terry.reedy set nosy: + terry.reedy
messages: + msg137596
2011-05-30 13:08:20 vstinner set versions: + Python 2.7
2011-05-30 13:08:13 vstinner set versions: + Python 3.2
2011-05-30 13:07:55 vstinner set files: + textiowrapper_interlaced_read_write.patch
keywords: + patch
messages: + msg137262
2011-05-30 12:38:03 vstinner create