
Author meador.inge
Recipients lemburg, meador.inge, pitrou
Date 2011-10-09.00:49:49
Message-id <CAK1QoooEzxnQhFdWnK0YHSNf9mzFTJok-vb4G2=NqHD4rbvhJQ@mail.gmail.com>
In-reply-to <1318113072.9190.14.camel@localhost.localdomain>
Content
On Sat, Oct 8, 2011 at 5:34 PM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Before going further with this, I'd suggest you have a look at your
>> compiler settings.
>
> They are set by the configure script:
>
> gcc -pthread -c -Wno-unused-result -DNDEBUG -g -fwrapv -O3 -Wall
> -Wstrict-prototypes    -I. -I./Include    -DPy_BUILD_CORE -o
> Objects/unicodeobject.o Objects/unicodeobject.c
>
>> Such optimizations are normally performed by the
>> compiler and don't need to be implemented in C, making maintenance
>> harder.
>
> The fact that the glibc includes such optimization (in much more
> sophisticated form) suggests to me that many compilers don't perform
> these optimizations automatically.

I agree.  This is more of an optimized-runtime-library problem than a
code-optimization problem.

>> I tested using memchr() when writing those "naive" loops.
>
> memchr() is mentioned in another issue, #13134.

Yeah, this conversation is really more relevant to issue13134, but I will
reply to these here anyway ....

>> memchr()
>> is inlined by the compiler just like the direct loop
>
> I don't think so. If you look at the glibc's memchr() implementation,
> it's a sophisticated routine, not a trivial loop. Perhaps you're
> thinking about memcpy().

Without link-time optimization enabled, I doubt the toolchain can
"inline" 'memchr' in the traditional sense (i.e. inserting the body of
the routine inline).  Even if it could, the inline heuristics would most
likely choose not to.  I don't think we use LTO with GCC, but I think we
might with VC++.

GCC does something else called builtin folding that is more likely.
For example, 'memchr ("bca", 'c', 3)' gets replaced with instructions
that compute a pointer index into "bca".  This won't happen in this
case because all of the 'memchr' arguments are variable.

>> and the generated
>> code for the direct version is often easier to optimize for the compiler
>> than the memchr() one, since it receives more knowledge about the used
>> data types.
>
> ?? Data types are fixed in the memchr() definition, there's no knowledge
> to be gained by inlining.

I think what Marc-Andre is alluding to is that the first parameter of
'memchr' is 'void *', which could (in theory) limit optimization
opportunities.  Whereas if it knew that the data being searched is a
'char *' or something, it could take advantage of that.
History
Date                 User          Action         Args
2011-10-09 00:49:52  meador.inge  setrecipients  + meador.inge, lemburg, pitrou
2011-10-09 00:49:51  meador.inge  linkissue13136 messages
2011-10-09 00:49:49  meador.inge  create