msg119247 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-21 00:10 |
(Discovered in that StackOverflow answer: http://stackoverflow.com/questions/3940518/3942509#3942509 ; check the comments too)
operator.attrgetter in its simplest form (i.e. with a single non-dotted name) needs more time to execute than an equivalent lambda expression.
Attached file so3940518.py runs a simple benchmark comparing: a list comprehension of plain attribute access; attrgetter; and lambda. I will append sample benchmark times at the end of the comment.
Browsing Modules/operator.c, I noticed that the dotted_getattr function was using PyUnicode_Check and (possibly) splitting on dots on *every* call of the attrgetter, which I thought to be most inefficient.
I changed the py3k-daily-snapshot source to make the PyUnicode_Check calls in the attrgetter_new function; also, I modified the algorithm to pre-parse the operator.attrgetter functions for possible dots in the names, in order for the dotted_getattr function to become speedier.
The only “drawback” is that now operator.attrgetter raises a TypeError on creation, not on subsequent calls of the attrgetter object; this shouldn't be a compatibility problem. However, I obviously had to update both Doc/library/operator.rst and Lib/test/test_operator.py .
I am not sure whether I should attach a zip/tar file with both the attachments (the sample benchmark and the diff); so I'll attach the diff in a further comment.
On the Ubuntu server 9.10 where I made the changes, I ran the so3940518.py sample benchmark before and after the changes.
Run before the changes (third column is seconds, less is better):
list comp 0.40999999999999925 1000000
map attrgetter 1.3899999999999997 1000000
map lambda 1.0099999999999998 1000000
Run after the changes:
list comp 0.40000000000000036 1000000
map attrgetter 0.5199999999999996 1000000
map lambda 0.96 1000000
|
msg119249 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-21 00:22 |
Here comes the diff to Modules/operator.c, Doc/library/operator.rst and Lib/test/test_operator.py . As far as I could check, there are no leaks, but a more experienced eye in core development could not hurt. Also, obviously test_operatory.py passes all tests.
Should this be accepted, I believe it should be backported to 2.7 (at least). I can do that, just let me know.
|
msg119250 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-21 00:28 |
Newer version of the diff, since I forgot some "if(0) fprintf" debug calls that shouldn't be there.
|
msg119251 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-21 00:40 |
An explanation to the changes.
The old code kept the operator.itemgetter arguments in the ag->attr member. If the argument count (ag->nattrs) was 1, the single argument was kept; if more than 1, a tuple of the original arguments was kept.
On every attrgetter_call call, if ag->nattrs was 1, dotted_getattr was called with the plain ag->attr as attribute name; if > 2, dotted_getattr was called for every one of the original arguments.
Now, ag->attr is always a tuple, containing either dotless strings or tuples of dotless strings:
operator.attrgetter("name1", "name2.name3", "name4")
stores ("name1", ("name2", "name3"), "name4") in ag->attr.
dotted_getattr accordingly chooses based on type (either str or tuple, ensured by attrgetter_new) whether to do a single access or a recursive one.
|
msg119258 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2010-10-21 02:03 |
Thanks for noticing this. I wasn't aware that it had slowed down after dotted name support had been added.
Since it is a mild performance issue, I'm lowering the priority. Will take a further look when I get the chance. A first look at the patch shows that it is bigger than I expected. Isn't it possible to check for a dot on construction and just record the boolean so that a fast, non-splitting path can be used.
I'm reluctant to make the code volume grow much for this function. It is already a bit voluminous for the minor convenience it offers.
|
msg119261 - (view) |
Author: Alex Gaynor (alex) *  |
Date: 2010-10-21 05:33 |
Voice of ignorance here: why can't this be implemented in the "naive" way one might in Python, use the existing string splitting algorithms of stringlib, but just leave it in __new__.
|
msg119264 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-21 06:55 |
Modules/operator.c grows by ~70 lines, most of it the setup code for ag->attr; also I loop twice over the args of attrgetter_new, choosing fast code that runs once per attrgetter creation than temporary data.
Alex's suggestion to make use of Python-level functions to shorten the code of attrgetter_new could obviously work to decrease the source lines. I don't know how fast I would produce such a version if requested, though.
Whatever the way attrgetter_new sets up the data, I would suggest that you keep the logic changes in general, i.e. set-up in attrgetter_new and keep a thinner dotted_getattr , since it avoids running the same checks and splitting over and over again for every attrgetter_call invocation.
|
msg119417 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2010-10-23 04:17 |
>I am not sure whether I should attach a zip/tar file with both the attachments (the sample benchmark and the diff); so I'll attach the diff in a further comment.
Uploading separate files with .py, .diff extensions that can be viewed in a browser is indeed the right thing to do.\
|
msg119419 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-23 06:41 |
A newer version of the patch with the following changes:
- single loop in the ag->attr setup phase of attrgetter_new; interning of the stored attribute names
- added two more tests of invalid attrgetter parameters (".attr", "attr.")
|
msg120006 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-10-30 18:11 |
Some comments about the patch:
- this seems to be some dead code:
if (nattrs <= 1) {
if (!PyArg_UnpackTuple(args, "attrgetter", 1, 1, &attr))
return NULL;
- you can't call PyUnicode_GET_SIZE or PyUnicode_AS_UNICODE before you have called PyUnicode_Check. If the object is not an unicode object, it triggers an assertion in debug mode:
python: ./Modules/operator.c:404: attrgetter_new: Assertion `((((((PyObject*)(item))->ob_type))->tp_flags & ((1L<<28))) != 0)' failed.
- you should also call PyUnicode_InternInPlace() on the last dotless name (after the for loop)
- this comment looks false:
/* attrgetter_new code should ensure we never come here */
- in dotted_getattr, when attr is a tuple, you leak references to intermediate objects (because you don't decref obj before doing "newobj = obj")
Other than that, looks quite worthwhile.
|
msg120023 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2010-10-30 21:31 |
Thank you very much, Antoine, for your review. My comments in reply:
- the dead code: it's not dead, IIRC it ensures that at least one argument is given, otherwise it raises an exception.
- PyUnicode_GET_SIZE: you're right. The previous patch didn't have this problem, because there were two loops: the first one made sure in advance that all arguments are PyUnicode.
- the false comment: right again. A remain from the first patch.
- dotted_getattr and references: right! I should have noted better what Raymond's initial loop did.
Attached a corrected version of the patch according to Antoine's comments.
|
msg120035 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2010-10-31 04:19 |
The patch looks fine.
You're overview of the process presented here in the tracker would make a nice comment in the code.
If Antoine is happy with the latest patch, then it's ready to apply.
|
msg120059 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-10-31 15:27 |
The PyUnicode_GET_SIZE issue was still there, but I've fixed it and committed in r86036. Thanks for your contribution!
|
msg128957 - (view) |
Author: Χρήστος Γεωργίου (Christos Georgiou) (tzot) * |
Date: 2011-02-21 11:33 |
This is not the proper place for it, but in the 3.2 and 2.7 news it is reported that “The multi-argument form of operator.attrgetter() function now runs slightly faster” while it should be “The multi-argument form of operator.attrgetter() function now runs slightly faster and the single-argument form runs much faster”.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:07 | admin | set | github: 54369 |
2011-02-21 11:33:45 | tzot | set | nosy:
rhettinger, terry.reedy, tzot, pitrou, alex messages:
+ msg128957 |
2010-10-31 15:27:40 | pitrou | set | status: open -> closed resolution: fixed messages:
+ msg120059
stage: patch review -> resolved |
2010-10-31 04:19:24 | rhettinger | set | assignee: rhettinger -> pitrou messages:
+ msg120035 |
2010-10-30 21:31:07 | tzot | set | files:
+ issue10160.diff
messages:
+ msg120023 |
2010-10-30 18:11:04 | pitrou | set | nosy:
+ pitrou messages:
+ msg120006
|
2010-10-23 06:41:16 | tzot | set | files:
+ issue10160-2.diff
messages:
+ msg119419 |
2010-10-23 04:17:40 | terry.reedy | set | nosy:
+ terry.reedy messages:
+ msg119417
|
2010-10-21 11:11:20 | eric.araujo | set | nosy:
- docs@python components:
- Documentation
|
2010-10-21 06:56:00 | tzot | set | messages:
+ msg119264 |
2010-10-21 05:33:24 | alex | set | nosy:
+ alex messages:
+ msg119261
|
2010-10-21 02:03:57 | rhettinger | set | priority: normal -> low type: performance messages:
+ msg119258
stage: patch review |
2010-10-21 00:40:51 | tzot | set | messages:
+ msg119251 |
2010-10-21 00:28:46 | tzot | set | files:
+ issue10160.diff
messages:
+ msg119250 |
2010-10-21 00:27:33 | tzot | set | files:
- issue10160.diff |
2010-10-21 00:22:55 | benjamin.peterson | set | assignee: docs@python -> rhettinger
nosy:
+ rhettinger |
2010-10-21 00:22:37 | tzot | set | files:
+ issue10160.diff keywords:
+ patch messages:
+ msg119249
|
2010-10-21 00:10:18 | tzot | create | |