Python and C implementations of io are out of sync #54067

pitrou · 2010-09-15T08:53:09Z

BPO	9858
Nosy	@gpshead, @pitrou, @rbtcollins, @benjaminp
Files	missing_methods.py issue9858.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2015-05-20.19:52:57.042>
created_at = <Date 2010-09-15.08:53:09.266>
labels = ['easy', 'type-bug', 'expert-IO']
title = 'Python and C implementations of io are out of sync'
updated_at = <Date 2015-05-20.19:52:57.040>
user = 'https://github.com/pitrou'

bugs.python.org fields:

activity = <Date 2015-05-20.19:52:57.040>
actor = 'pitrou'
assignee = 'none'
closed = True
closed_date = <Date 2015-05-20.19:52:57.042>
closer = 'pitrou'
components = ['IO']
creation = <Date 2010-09-15.08:53:09.266>
creator = 'pitrou'
dependencies = []
files = ['18889', '39212']
hgrepos = []
issue_num = 9858
keywords = ['patch', 'easy']
message_count = 21.0
messages = ['116440', '116442', '116443', '116444', '116446', '116455', '116460', '116462', '116463', '116464', '116468', '116501', '116502', '221727', '241007', '242101', '243686', '243688', '243689', '243691', '243692']
nosy_count = 9.0
nosy_names = ['gregory.p.smith', 'pitrou', 'rbcollins', 'benjamin.peterson', 'stutzbach', 'BreamoreBoy', 'python-dev', 'jcon', 'laura']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue9858'
versions = ['Python 3.5']

pitrou · 2010-09-15T08:53:08Z

RawIOBase is defined as having the readinto() method but it doesn't. It should at least have a failing one (raising io.UnsupportedOperation like the BufferedIOBase.{read,readinto} methods do).

This is, of course, mainly for auto-documenting purposes (with help() and friends).

benjaminp · 2010-09-15T12:28:24Z

+1 for a failing one. (Does the base implementation not raise?)

Is it even possible to implement a real one without buffering?

pitrou · 2010-09-15T12:39:58Z

+1 for a failing one. (Does the base implementation not raise?)

The base implementation doesn't exist :)

Is it even possible to implement a real one without buffering?

FileIO does. It's the same as read(), except that you read into an existing buffer rather than allocating a new bytes object.

stutzbach · 2010-09-15T13:36:51Z

Attached is a script to find all of the mismatches between the C and Python implementations. There are several. Below is the output:

RawIOBase C is missing: ['readinto', 'write']
StringIO C is missing: ['name']
StringIO python is missing: ['__getstate__', '__setstate__']
BufferedRandom python is missing: ['raw']
BlockingIOError python is missing: ['characters_written']
TextIOWrapper python is missing: ['buffer']
BufferedReader python is missing: ['raw']
BufferedWriter python is missing: ['raw']
BytesIO python is missing: ['__setstate__']

Since the Python version was the original reference implementation, adding the attributes missing from the C side seems to be a straightforward decision (and it should simply match whatever the Python version does). It looks like a number of new attributes have creeped into the C side, which will require more thoughtful decision making.

We should probably double-check that each of these is documented, while we're at it.

stutzbach · 2010-09-15T13:47:18Z

FWIW, I just opened bpo-9859: Add tests to verify API match of modules with 2 implementations.

benjaminp · 2010-09-15T15:06:10Z

2010/9/15 Daniel Stutzbach <report@bugs.python.org>:

Daniel Stutzbach <daniel@stutzbachenterprises.com> added the comment:

Attached is a script to find all of the mismatches between the C and Python implementations. There are several. Below is the output:

BufferedRandom python is missing: ['raw']
TextIOWrapper python is missing: ['buffer']
BufferedReader python is missing: ['raw']
BufferedWriter python is missing: ['raw']

These attributes exist; they're just not properties.

stutzbach · 2010-09-15T15:44:16Z

These attributes exist; they're just not properties.

Yes, I see. They're added to the instance in the constructor, so they don't exist as attributes of the class. Also in that category:

BlockingIOError python is missing: ['characters_written']

That leaves:

RawIOBase C is missing: ['readinto', 'write']
StringIO C is missing: ['name']
StringIO python is missing: ['__getstate__', '__setstate__']
BytesIO python is missing: ['__setstate__']

The Python version of StringIO throws an AttributeException on the .name attribute. It's a property inherited from the TextIOWrapper. Effectively, TextIOWrapper provides the .name attribute if the object that it's wrapping provides the .name attribute. This behavior is undocumented.

Is that reasonable behavior? Or should TextIOWrapper define .name always and return some suitable value if the wrapped object doesn't define .name? (e.g., None)

In any case, it needs documentation.

benjaminp · 2010-09-15T15:51:09Z

2010/9/15 Daniel Stutzbach <report@bugs.python.org>:

Daniel Stutzbach <daniel@stutzbachenterprises.com> added the comment:

> These attributes exist; they're just not properties.

Yes, I see. They're added to the instance in the constructor, so they don't exist as attributes of the class. Also in that category:

BlockingIOError python is missing: ['characters_written']

That leaves:

RawIOBase C is missing: ['readinto', 'write']
StringIO C is missing: ['name']
StringIO python is missing: ['__getstate__', '__setstate__']
BytesIO python is missing: ['__setstate__']

I'm not sure if this pickling stuff matters as long as they both pickle.

The Python version of StringIO throws an AttributeException on the .name attribute. It's a property inherited from the TextIOWrapper. Effectively, TextIOWrapper provides the .name attribute if the object that it's wrapping provides the .name attribute. This behavior is undocumented.

Is that reasonable behavior? Or should TextIOWrapper define .name always and return some suitable value if the wrapped object doesn't define .name? (e.g., None)

Yes, all the code expects an AttributeError on name is it's not
present. For example, FileIO only sets its name attribute if it is
passed a filename and not an fd.

stutzbach · 2010-09-15T16:01:35Z

On Wed, Sep 15, 2010 at 10:51 AM, Benjamin Peterson
<report@bugs.python.org>wrote:

I'm not sure if this pickling stuff matters as long as they both pickle.

I'm not sure either. Do other Python implementations try to maintain a
compatible pickle format with CPython? If so, I think CPython's C and
Python versions should also pickle to the same format.

Yes, all the code expects an AttributeError on name is it's not
present. For example, FileIO only sets its name attribute if it is
passed a filename and not an fd.

Actually, FileIO sets the name attribute to the file descriptor if it isn't
passed a name:

1

stutzbach · 2010-09-15T16:10:32Z

Roundup does not play well with Gmail when Gmail is in Rich Format mode ;-)

That example should have read:

>>> f = io.FileIO(1)
>>> f.name
1

benjaminp · 2010-09-15T16:47:51Z

2010/9/15 Daniel Stutzbach <report@bugs.python.org>:
>
> Daniel Stutzbach <daniel@stutzbachenterprises.com> added the comment:
>
> Roundup does not play well with Gmail when Gmail is in Rich Format mode ;-)
>
> That example should have read:
>
>>>> f = io.FileIO(1)
>>>> f.name
> 1

Hmm, none the less other code expects it.

stutzbach · 2010-09-16T00:39:52Z

Hmm, none the less other code expects it.

Does that imply that there's some code that will break on FileIO objects that are created using a file descriptor instead of a filename? If so, where?

benjaminp · 2010-09-16T00:53:36Z

2010/9/15 Daniel Stutzbach <report@bugs.python.org>:

Daniel Stutzbach <daniel@stutzbachenterprises.com> added the comment:

> Hmm, none the less other code expects it.

Does that imply that there's some code that will break on FileIO objects that are created using a file descriptor instead of a filename? If so, where?

No, just that all code that access name expects an AttributeError,
indicating that when there is no "name" an error should result.

BreamoreBoy · 2014-06-27T20:50:58Z

Is there anything left to do on this or can it be closed as fixed?

gpshead · 2015-04-14T20:06:38Z

A test has been added as part of bpo-9859, it is marked with @unittest.skip as the API surface does not yet match.

Laura · 2015-04-27T06:14:47Z

There were originally three methods present in RawIOBase that were not present in PyRawIOBase_Type:

readinto
write
__weakref__

I've created a patch that adds the first two to PyRawIOBase_Type. The python class readinto and write methods raise UnsupportedOperation, so the c methods return a PyExc_NotImplementedError.

The next major question I have is whether we need to implement a __weakref__ method or this should be ignored in the test.

Laura · 2015-05-20T18:47:17Z

Can anyone provide feedback on this patch?

rbtcollins · 2015-05-20T19:08:43Z

I think ignoring weakref is wrong: it means the two implementations are different.

pitrou · 2015-05-20T19:41:42Z

__weakref__ is just an implementation detail of how heap types expose weak references (actually, I'm not sure why it's exposed at all).

Laura, thank you for contributing. Your patch looks good to me.

python-dev · 2015-05-20T19:52:31Z

New changeset 962b42d67b9e by Antoine Pitrou in branch 'default':
Issue bpo-9858: Add missing method stubs to _io.RawIOBase. Patch by Laura Rupprecht.
https://hg.python.org/cpython/rev/962b42d67b9e

pitrou · 2015-05-20T19:52:57Z

I've pushed this to the default branch. Thanks!

pitrou added topic-IO easy type-bug An unexpected behavior, bug, or error labels Sep 15, 2010

stutzbach mannequin changed the title ~~RawIOBase doesn't define readinto~~ Python and C implementations of io are out of sync Sep 15, 2010

gpshead self-assigned this Apr 14, 2015

pitrou unassigned gpshead May 20, 2015

pitrou closed this as completed May 20, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python and C implementations of io are out of sync #54067

Python and C implementations of io are out of sync #54067

pitrou commented Sep 15, 2010

pitrou commented Sep 15, 2010

benjaminp commented Sep 15, 2010

pitrou commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 16, 2010

benjaminp commented Sep 16, 2010

BreamoreBoy mannequin commented Jun 27, 2014

gpshead commented Apr 14, 2015

Laura mannequin commented Apr 27, 2015

Laura mannequin commented May 20, 2015

rbtcollins commented May 20, 2015

pitrou commented May 20, 2015

python-dev mannequin commented May 20, 2015

pitrou commented May 20, 2015

Python and C implementations of io are out of sync #54067

Python and C implementations of io are out of sync #54067

Comments

pitrou commented Sep 15, 2010

pitrou commented Sep 15, 2010

benjaminp commented Sep 15, 2010

pitrou commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

stutzbach mannequin commented Sep 15, 2010

benjaminp commented Sep 15, 2010

stutzbach mannequin commented Sep 16, 2010

benjaminp commented Sep 16, 2010

BreamoreBoy mannequin commented Jun 27, 2014

gpshead commented Apr 14, 2015

Laura mannequin commented Apr 27, 2015

Laura mannequin commented May 20, 2015

rbtcollins commented May 20, 2015

pitrou commented May 20, 2015

python-dev mannequin commented May 20, 2015

pitrou commented May 20, 2015