classification
Title: Add pure Python operator module
Type: enhancement Stage: committed/rejected
Components: Extension Modules, Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, alex, brett.cannon, eric.araujo, ezio.melotti, jcea, meador.inge, pitrou, python-dev, r.david.murray, rhettinger, serhiy.storchaka, zach.ware
Priority: normal Keywords: patch

Created on 2012-12-16 07:24 by zach.ware, last changed 2013-05-11 02:57 by python-dev. This issue is now closed.

Files
File name Uploaded Description
py_operator.v10.diff zach.ware, 2013-01-29 21:30 Version 10
py_operator.v11.diff zach.ware, 2013-04-14 05:18 Version 11, now with proper git format
py_operator.v12.diff zach.ware, 2013-04-15 16:01 Version 12
py_operator.v13.diff zach.ware, 2013-04-16 17:24 Version 13
Messages (39)
msg177579 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-16 07:24
(Brett, I've made you nosy due to the relation to Issue16651.)

Here is a pure Python implementation of the operator module, or at least a first draft thereof :).  I'm attaching the module itself, as well as a patch to integrate it.

Any and all review is quite welcome. I'm confident that the module as it stands passes all current tests, but how it gets there is entirely up for debate (namely, the attrgetter, itemgetter, and methodcaller classes, as well as length_hint(), countOf(), and indexOf()).

Note that there's also a change to hmac.py; _compare_digest() in operator.c doesn't seem to have any relation to the rest of the module (see issue15061 discussion) and is private anyway, so operator.py doesn't go near it.  hmac.py has to import directly from _operator.

Thanks,

Zach Ware
msg177585 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-16 11:29
Here is a functional (and more efficient) equivalent of attrgetter:

def attrgetter(attr, *attrs):
    """
    Return a callable object that fetches the given attribute(s) from its operand.
    After f=attrgetter('name'), the call f(r) returns r.name.
    After g=attrgetter('name', 'date'), the call g(r) returns (r.name, r.date).
    After h=attrgetter('name.first', 'name.last'), the call h(r) returns
    (r.name.first, r.name.last).
    """
    if not attrs:
        if not isinstance(attr, str):
            raise TypeError('attribute name must be a string')
        names = attr.split('.')
        def func(obj):
            for name in names:
                obj = getattr(obj, name)
            return obj
        return func
    else:
        getters = tuple(map(attrgetter, (attr,) + attrs))
        def func(obj):
            return tuple(getter(obj) for getter in getters)
        return func
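
The behaviour described in the docstring above can be checked with a short usage example. This is only an illustration: it uses the standard library's operator.attrgetter, which follows the same contract, and the record object and its field values are made up.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical record with a nested "name" object, for illustration only.
record = SimpleNamespace(
    name=SimpleNamespace(first="Ada", last="Lovelace"),
    date="1815-12-10",
)

f = attrgetter("date")                     # single attribute
g = attrgetter("name.first", "date")       # several attributes -> tuple
h = attrgetter("name.first", "name.last")  # dotted names are followed

single = f(record)
pair = g(record)
dotted = h(record)

# A non-string attribute name is rejected when the getter is created.
try:
    attrgetter(1)
except TypeError:
    rejected = True
else:
    rejected = False
```

A single name yields a bare value, while two or more names yield a tuple, which is why the functional version builds one getter per name in the multi-attribute branch.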
msg177587 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-16 11:37
Perhaps Modules/operator.c should be renamed to Modules/_operator.c.

Also note that error messages in the Python and C implementations sometimes differ.
msg177795 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-20 01:25
Sorry to have disappeared on this, other things took priority...

Thank you for the comments, Serhiy.  v2 of the patch renames Modules/operator.c to Modules/_operator.c, and changes that name every place I could find it.

I also tried to tidy up some of the error message mismatches.  I didn't bother with the ones regarding missing arguments, as that would mean checking args and throwing an exception in each and every function.

I do like the functional attrgetter better than the object version I wrote.  The main reason I went with an object version in the first place was because that's what the C implementation used.  Is there any reason not to break with the C implementation and use a function instead?  The updated patch uses a rather ugly hack to try to use the functional version in an object.

length_hint() was horrible and has been rewritten.  It should be less horrible now :).  It should also follow the C implementation quite a bit better.
msg177799 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-20 05:13
Considering what a huge headache it was to get my own patch to apply at home on Linux rather than at work on Windows, here's a new version of the patch that straightens out the line ending nightmare present in v2. No other changes made.
msg177810 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-20 11:42
Sorry, I forgot to push the "Publish All My Drafts" button. Please consider my other comments on the first patch. I have also added new comments about length_hint().

Your implementation of attrgetter() looks good. One possible disadvantage of a purely functional approach is that attrgetter() would no longer be a class. It is unlikely that anyone subclasses attrgetter, but it can be used in an isinstance() check. Your version solves this issue.

The same approach can be applied to itemgetter().
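
One way this could look for itemgetter() is sketched below: a class (so isinstance() checks keep working) whose constructor builds a plain closure and whose __call__ delegates to it. The name itemgetter_sketch and the _call attribute are hypothetical, and this is not necessarily the code in the patch.

```python
class itemgetter_sketch:
    """Hypothetical pure-Python itemgetter kept as a class."""

    def __init__(self, item, *items):
        if not items:
            # One key: return the bare item.
            def func(obj):
                return obj[item]
        else:
            # Several keys: return a tuple of items, in order.
            all_items = (item,) + items

            def func(obj):
                return tuple(obj[i] for i in all_items)
        self._call = func

    def __call__(self, obj):
        return self._call(obj)


first = itemgetter_sketch(1)
several = itemgetter_sketch(0, 2)
```

The closure does the work, and the class only exists to preserve the type-based behaviour of the C version.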
msg177865 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-21 06:42
Here's v4, addressing Serhiy's comments on Rietveld.
msg177871 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-21 09:39
About length_hint():

I meant something like this (an explicit getattr() is not even needed):

try:
    hint = type(obj).__length_hint__
except AttributeError:
    return default
try:
    val = hint(obj)
except TypeError:
    return default
...

This is a little faster because there is only one attribute lookup instead of two. It is also a little safer because there is less chance of a race when the attribute changes between the two lookups (this is improbable enough that it hardly matters).

type(obj) is used here because the C code uses _PyObject_LookupSpecial(), which ignores instance attributes and looks only at class attributes.
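
A complete function built around that lookup might look like the sketch below. The parts not shown in the snippet above (trying len() first, treating NotImplemented as "no hint", and rejecting non-integer or negative hints) are assumptions based on PEP 424, not a copy of the patch. The names length_hint_sketch, Hinted, and Plain are hypothetical.

```python
def length_hint_sketch(obj, default=0):
    """Hypothetical pure-Python length_hint following PEP 424."""
    try:
        return len(obj)
    except TypeError:
        pass
    try:
        # Look on the type only, so instance attributes are ignored.
        hint = type(obj).__length_hint__
    except AttributeError:
        return default
    try:
        val = hint(obj)
    except TypeError:
        return default
    if val is NotImplemented:
        return default
    if not isinstance(val, int):
        raise TypeError("__length_hint__ must be integer, not %s"
                        % type(val).__name__)
    if val < 0:
        raise ValueError("__length_hint__() should return >= 0")
    return val


class Hinted:
    def __length_hint__(self):
        return 5


class Plain:
    pass


plain = Plain()
# Set on the instance only; the type-level lookup does not see it.
plain.__length_hint__ = lambda: 99
```

Here length_hint_sketch(plain, 7) falls back to the default because the hint exists only on the instance, which mirrors what the C lookup does.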


About concat() and iconcat():

I think only the first argument can be checked. If the arguments are not concatenable, the '+'/'+=' operator will raise an exception. I'm not sure, though. Does anyone have any thoughts about this?


About methodcaller():

There is a catch. With this implementation you can't use `methodcaller('foo', name='spam')` or `methodcaller('foo', self='spam')` (please add tests for those cases). A trick is needed here:

def __init__(*args, **kwargs):
    self = args[0]
    self._name = args[1]
    self._args = args[2:]
    self._kwargs = kwargs

(You can add code for better error reporting.)
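
To make the catch concrete, here is a minimal sketch of a class using that trick, with a simple argument-count check added. The names methodcaller_sketch and Echo are hypothetical, and the error message is only an example.

```python
class methodcaller_sketch:
    """Hypothetical methodcaller whose __init__ has no named parameters."""

    def __init__(*args, **kwargs):
        # args[0] is the instance; naming no parameters means keywords
        # such as self= or name= cannot collide with the signature.
        if len(args) < 2:
            raise TypeError("methodcaller needs at least one argument, "
                            "the method name")
        self = args[0]
        self._name = args[1]
        self._args = args[2:]
        self._kwargs = kwargs

    def __call__(self, obj):
        return getattr(obj, self._name)(*self._args, **self._kwargs)


class Echo:
    # Same trick on the target so it can accept a self= keyword too.
    def foo(*args, **kwargs):
        return kwargs


by_name = methodcaller_sketch("foo", name="spam")
by_self = methodcaller_sketch("foo", self="spam")
```

With a conventional `def __init__(self, name, *args, **kwargs)` signature, both of those constructor calls would raise TypeError because the keyword would be given twice.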


I have added smaller comments on Rietveld.
msg177895 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-21 21:05
Here's another new version.  Changes include:

- Address Serhiy's Rietveld comments
- Fix length_hint() the way it was meant to be fixed last time.
- Remove __getitem__ check on 'b' in concat and iconcat.  More notes on this below.
- Fix methodcaller as Serhiy suggested
- Add test case for methodcaller for 'name' and 'self' keyword arguments
- Add comments to 'subdivide' the module into the rough sections the docs are divided into.  Move length_hint() with other sequence operations to also match the doc order.

On concat and iconcat: Looking at the glossary, a sequence should actually have both __getitem__ and __len__.  The test class in the test case for iconcat only defines __getitem__, though.  Should we check only for __getitem__ on the first argument, or check for both __getitem__ and __len__, and add __len__ to the test class?  Requiring __len__ may cause breakage for anyone using the Python implementation with a class they defined and used with the C implementation with only __getitem__, so I'm leaning towards only checking for __getitem__.  I can't really tell what the C implementation really looks for as I don't speak C, but it almost looks to me like it may be only checking for __getitem__.  Latest patch only checks argument 'a' for __getitem__.
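
The check described above, applied to the first argument only, can be sketched as follows. The function name concat_sketch, the GetItemOnly class, and the exact error message are assumptions for illustration.

```python
def concat_sketch(a, b):
    """Hypothetical concat that only checks 'a' for __getitem__."""
    if not hasattr(a, "__getitem__"):
        raise TypeError("'%s' object can't be concatenated"
                        % type(a).__name__)
    # If b is unsuitable, the + operator itself raises.
    return a + b


class GetItemOnly:
    """Defines __getitem__ but not __len__, like the iconcat test class."""

    def __getitem__(self, index):
        raise IndexError(index)

    def __add__(self, other):
        return "added"
```

Requiring __len__ as well would reject GetItemOnly, which is the breakage the message above wants to avoid.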
msg177899 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-21 21:45
Good work, Zachary. I have no more nitpicks for you. ;)

LGTM.
msg177901 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-21 22:06
One comment to a committer. Don't forget to run `hg rename Modules/operator.c Modules/_operator.c` before applying the patch.
msg177902 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-21 22:07
Nits are no fun; thank you for picking them, Serhiy ;)
msg177907 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-12-21 23:22
FYI Mercurial can use the extended diff format invented by git, which supports renames, changes to file permissions, etc.
msg177908 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-12-21 23:28
The base test class should not inherit from TestCase: it will be picked up by test discovery and then will break, as self.module will be None.

Typical usage:

class OperatorTestsMixin:
    module = None

class COperatorTests(OperatorTestsMixin, unittest.TestCase):
    module = _operator
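
A fuller, runnable sketch of the pattern is given below. Only OperatorTestsMixin and COperatorTests come from the snippet above; the second concrete class, the test method, and the ImportError guard are assumptions added for illustration.

```python
import operator
import unittest

try:
    import _operator
except ImportError:  # implementations without the C accelerator
    _operator = None


class OperatorTestsMixin:
    # Not a TestCase, so test discovery never runs it with module = None.
    module = None

    def test_add(self):
        operator = self.module
        self.assertEqual(operator.add(3, 4), 7)


class PublicOperatorTests(OperatorTestsMixin, unittest.TestCase):
    module = operator


@unittest.skipUnless(_operator, "requires the _operator module")
class COperatorTests(OperatorTestsMixin, unittest.TestCase):
    module = _operator
```

Each concrete class inherits every test from the mixin and only supplies the module under test.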
msg177910 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2012-12-21 23:45
Did not know that about test discovery, thank you Éric.  Fixed in v6.

A few other test modules may need the same fix; I based my changes to Lib/test/test_operator.py on Lib/test/test_heapq.py which has the same issue.  I'll open a new report for it and any others I find.

Also, this patch was created with `hg diff -g`; the operator.c rename should be well taken care of by this patch.
msg177926 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-22 09:01
I don't understand what the difference is between v5 and v6.
msg178778 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-01-01 23:03
Sorry, I misunderstood Éric's suggestions regarding the tests; v6 is useless.  v7 forthcoming.
msg178787 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-01-01 23:39
Ok, I believe the attached v7 properly addresses Éric's concerns about test discovery, and has no other changes unrelated to that compared to v5.

Thank you very much to Ezio for directing me towards the json tests for an example to work from.
msg178813 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-02 14:19
v8 LGTM (except for some trailing whitespace).
msg178830 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-01-02 18:18
Note to self: learn to run patchcheck.py before posting.  Whitespace issues fixed in v9.
msg178838 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-02 19:12
If no one objects I will commit this next week.
msg180948 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-01-29 21:30
Since the older Windows project files were removed, v10 removes the patches to them.

Everything else still applies cleanly.

Also, in the spirit of what Brett said in 16651 about not re-implementing blindly, I did just look up what Jython, IronPython, and PyPy do for the operator module.  The first two implement it in their VM language, and PyPy uses a very specialized version that didn't look easy to adapt to CPython, at least at a glance.  It was fun for me to write any way about it, though :)
msg186814 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-13 20:04
Zachary, I suppose Modules/_operator.c is a rename of Modules/operator.c.
Could you generate your patch using "hg diff --git" so that history isn't lost here?

See also http://docs.python.org/devguide/committing.html#minimal-configuration
msg186883 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-04-14 05:18
> Zachary, I suppose Modules/_operator.c is a rename of Modules/operator.c.
> Could you generate your patch using "hg diff --git" so that history isn't lost here?

Of course; I thought I already had, but apparently I messed that up a bit. v11 is in the proper format.  In it, you can actually see what was changed in Modules/operator.c: the necessary s/operator/_operator/ changes, and a few extra commas removed from a couple of docstrings (to match the docstrings in the new Python versions).

> See also http://docs.python.org/devguide/committing.html#minimal-configuration

Thank you for that link! I had read through this some time ago, but either missed the part about the diff section, or it just didn't sink in or something.  That is now added to my hg config file :)
msg186916 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-14 12:04
Thank you!
One optional thing, the code churn could be minimized in test_operator.py by writing "operator = self.module" at the beginning of each test method.
Otherwise, looks good to me.
msg187003 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-04-15 16:01
Here's another new version of the patch, addressing Ezio's review comments and a few things I found after giving operator.py a closer look myself.

Things changed in operator.py in this version:

- all ``__func__ = func`` assignments are moved to the end, after importing * from _operator.  With the assignments after each func, __func__ was still the Python version after importing from _operator.  I suspect this means that _operator.c could be changed to not mess with creating each __func__ and just let operator.py do it, but not being a native C speaker, I don't know how to do it.  Also, there is an added test case to test whether __func__ is func.  It passes with the rest of the patch, but would fail on current operator.c; it seems that operator.c actually creates separate __func__ and func functions (that do the same thing).

- If importing from _operator succeeds, import __doc__ from _operator as well.  The Python implementation has an extra note at the end of __doc__ advertising that it is a Python implementation.


Also, after submitting this patch, I'm going to try to clean up the files list on this issue a bit.  I'll clear the nosy list while I do so to avoid spamming everybody with messages about it.  (At least, I assume I can do so, I haven't tried this before :).  If I can't clear the nosy list, I won't bother with cleaning up the files, again to avoid spamming)
msg187007 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-04-15 16:41
A change that I mentioned in a Rietveld comment on v10, but not in my last message: __all__ in operator.py no longer includes all of the __func__s, as currently doing "from operator import *" does not import all of the __func__s.
msg187023 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-15 19:38
I think Antoine is more appropriate for committing this patch. I waited so long with this because I did not dare to take the responsibility myself (it's almost like adding a new module).
msg187043 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-04-16 01:34
I would like to spend some time with this before it goes forward (especially the attrgetter, itemgetter, methodcaller group).

Right now, it looks like a nice effort but I don't see how it makes Python any better for adding it.  The odds are that this code will add bloat but not benefit any user (it won't get called at all).
msg187046 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-04-16 02:00
Raymond: it's not for the benefit of CPython.
msg187048 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-04-16 04:16
[David]
> Raymond: it's not for the benefit of CPython.

IIRC, all the other implementations of Python already have this code passing tests, so it isn't really for their benefit either.
msg187049 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2013-04-16 04:22
If a pure python operator module were a part of the stdlib, we (PyPy) would probably delete most (if not all) of our own operator module.
msg187050 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-04-16 06:40
I reviewed the attrgetter(), methodcaller(), and itemgetter() code in py_operator.v12.diff.  The code looks clean and correct.
msg187056 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-04-16 08:43
Now we can remove all __func__s from _operator.c.
msg187103 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-04-16 17:24
Thank you for the review, Raymond.

Since Serhiy agrees that the _operator __func__s are unnecessary, here's a v13 that removes them.  Again, I'm not a native C speaker, so these new changes in _operator.c deserve a bit of extra scrutiny.  Everything builds and still passes the test suite, though.

Also changed in this patch: test_pow and test_inplace no longer explicitly test the __func__s.  Those tests are useless, as they merely rerun already-run tests on the same function under a different name, which is confirmed by test_dunder_is_original.  I can extend that test with an explicit list of funcs which should have a __func__ if anyone thinks it's worth it.
msg187148 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-17 09:04
length_hint() looks ok as well.
msg187440 - (view) Author: Roundup Robot (python-dev) Date: 2013-04-20 17:21
New changeset 97834382c6cc by Antoine Pitrou in branch 'default':
Issue #16694: Add a pure Python implementation of the operator module.
http://hg.python.org/cpython/rev/97834382c6cc
msg187441 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-20 17:22
I've now committed the latest patch. Thank you very much, Zachary!
msg188890 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-11 02:57
New changeset 4b3238923b01 by Raymond Hettinger in branch 'default':
Issue #16694:  Add source code link for operator.py
http://hg.python.org/cpython/rev/4b3238923b01
History
Date User Action Args
2013-05-11 02:57:58 python-dev set messages: + msg188890
2013-04-20 17:22:45 pitrou set status: open -> closed
resolution: fixed
messages: + msg187441
stage: commit review -> committed/rejected
2013-04-20 17:21:53 python-dev set nosy: + python-dev
messages: + msg187440
2013-04-17 09:04:28 pitrou set messages: + msg187148
2013-04-16 17:24:36 zach.ware set files: + py_operator.v13.diff
messages: + msg187103
2013-04-16 08:43:01 serhiy.storchaka set messages: + msg187056
2013-04-16 06:40:33 rhettinger set assignee: rhettinger ->
messages: + msg187050
2013-04-16 04:22:58 alex set nosy: + alex
messages: + msg187049
2013-04-16 04:16:42 rhettinger set messages: + msg187048
2013-04-16 02:00:06 r.david.murray set nosy: + r.david.murray
messages: + msg187046
2013-04-16 01:34:56 rhettinger set assignee: rhettinger
messages: + msg187043
nosy: + rhettinger
2013-04-15 19:38:53 serhiy.storchaka set assignee: serhiy.storchaka -> (no value)
messages: + msg187023
2013-04-15 16:41:56 zach.ware set messages: + msg187007
2013-04-15 16:03:39 zach.ware set nosy: + brett.cannon, jcea, pitrou, ezio.melotti, eric.araujo, Arfrever, meador.inge, zach.ware, serhiy.storchaka
2013-04-15 16:03:20 zach.ware set files: - py_operator.v9.diff
2013-04-15 16:03:15 zach.ware set files: - py_operator.v8.diff
2013-04-15 16:03:10 zach.ware set files: - py_operator.v7.diff
2013-04-15 16:03:05 zach.ware set files: - py_operator.v5.diff
2013-04-15 16:03:01 zach.ware set files: - py_operator.v4.diff
2013-04-15 16:02:56 zach.ware set files: - py_operator.v3.diff
2013-04-15 16:02:51 zach.ware set files: - py_operator.diff
2013-04-15 16:02:44 zach.ware set files: - operator.py
2013-04-15 16:02:32 zach.ware set nosy: - brett.cannon, jcea, pitrou, ezio.melotti, eric.araujo, Arfrever, meador.inge, zach.ware, serhiy.storchaka
-> (no value)
2013-04-15 16:01:06 zach.ware set files: + py_operator.v12.diff
messages: + msg187003
2013-04-14 12:04:53 pitrou set messages: + msg186916
2013-04-14 05:18:49 zach.ware set files: + py_operator.v11.diff
messages: + msg186883
2013-04-13 20:04:36 pitrou set nosy: + pitrou
messages: + msg186814
2013-01-29 21:30:32 zach.ware set files: + py_operator.v10.diff
messages: + msg180948
2013-01-02 19:12:22 serhiy.storchaka set messages: + msg178838
2013-01-02 18:18:06 zach.ware set files: + py_operator.v9.diff
messages: + msg178830
2013-01-02 14:19:22 serhiy.storchaka set messages: + msg178813
2013-01-02 04:14:32 zach.ware set files: + py_operator.v8.diff
2013-01-01 23:39:57 zach.ware set files: + py_operator.v7.diff
nosy: + ezio.melotti
messages: + msg178787
2013-01-01 23:03:52 zach.ware set files: - py_operator.v6.diff
2013-01-01 23:03:40 zach.ware set messages: + msg178778
2012-12-29 21:06:41 serhiy.storchaka set assignee: serhiy.storchaka
2012-12-29 03:05:01 meador.inge set nosy: + meador.inge
2012-12-22 09:01:23 serhiy.storchaka set messages: + msg177926
2012-12-21 23:45:35 zach.ware set files: + py_operator.v6.diff
messages: + msg177910
2012-12-21 23:28:45 eric.araujo set messages: + msg177908
2012-12-21 23:22:31 eric.araujo set messages: + msg177907
2012-12-21 22:07:33 zach.ware set messages: + msg177902
2012-12-21 22:06:45 serhiy.storchaka set messages: + msg177901
2012-12-21 21:45:20 serhiy.storchaka set messages: + msg177899
stage: patch review -> commit review
2012-12-21 21:05:46 zach.ware set files: + py_operator.v5.diff
messages: + msg177895
2012-12-21 17:20:58 eric.araujo set nosy: + eric.araujo
2012-12-21 09:39:25 serhiy.storchaka set messages: + msg177871
2012-12-21 06:43:07 zach.ware set files: + py_operator.v4.diff
messages: + msg177865
2012-12-20 11:42:51 serhiy.storchaka set messages: + msg177810
2012-12-20 05:27:42 zach.ware set files: - py_operator.v2.diff
2012-12-20 05:13:29 zach.ware set files: + py_operator.v3.diff
messages: + msg177799
2012-12-20 01:25:22 zach.ware set files: + py_operator.v2.diff
messages: + msg177795
2012-12-17 08:20:36 jcea set nosy: + jcea
2012-12-16 20:00:45 Arfrever set nosy: + Arfrever
2012-12-16 11:37:00 serhiy.storchaka set messages: + msg177587
2012-12-16 11:29:26 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg177585
2012-12-16 09:46:55 serhiy.storchaka set stage: patch review
2012-12-16 09:05:53 serhiy.storchaka link issue16651 dependencies
2012-12-16 07:25:14 zach.ware set files: + py_operator.diff
keywords: + patch
2012-12-16 07:24:41 zach.ware create