classification
Title: Compilation error if configuref --with-computed-gotos
Type: compile error Stage: resolved
Components: Build Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: loewis, mark.dickinson, pitrou, wiget
Priority: normal Keywords: easy, patch

Created on 2009-07-30 14:30 by wiget, last changed 2009-10-31 10:23 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
tsc_x86_64.patch mark.dickinson, 2009-10-12 18:14
tsc2.patch mark.dickinson, 2009-10-13 20:08
tsc3.patch mark.dickinson, 2009-10-13 20:31
Messages (21)
msg91097 - (view) Author: Artur Frysiak (wiget) Date: 2009-07-30 14:30
Building Python 3.1 (or Python from mercural py3k branch) fail if
configured --with-computed-gotos.

Traceback (most recent call last):
  File "./setup.py", line 13, in <module>
    from distutils.core import Extension, setup
  File "/home/users/wiget/mirror/python/cpython/Lib/distutils/core.py",
line 18, in <module>
    from distutils.dist import Distribution
  File "/home/users/wiget/mirror/python/cpython/Lib/distutils/dist.py",
line 12, in <module>
    import warnings
  File "/home/users/wiget/mirror/python/cpython/Lib/warnings.py", line
6, in <module>
    import linecache
  File "/home/users/wiget/mirror/python/cpython/Lib/linecache.py", line
10, in <module>
    import tokenize
  File "/home/users/wiget/mirror/python/cpython/Lib/tokenize.py", line
94, in <module>
    Ignore = Whitespace + any(r'\\\r?\n' + Whitespace) + maybe(Comment)
  File "/home/users/wiget/mirror/python/cpython/Lib/tokenize.py", line
87, in any
    def any(*choices): return group(*choices) + '*'
TypeError: group() argument after ** must be a mapping, not tuple
make: *** [sharedmods] Error 1

Compiled on PLD Linux using GCC 4.4.1
msg91098 - (view) Author: Artur Frysiak (wiget) Date: 2009-07-30 14:33
Exact options used to configure:

./configure \
 --enable-ipv6 \
 --with-wide-unicode \
 --enable-shared \
 --with-computed-gotos \
 --with-dbmliborder=gdbm:bdb \
 --with-signal-module \
 --with-tsc \
 --with-threads \
 --with-doc-strings \
 --with-fpectl \
 --with-system-ffi
msg91306 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-05 10:34
I can't reproduce here (using gcc 4.3.2). Have you tried `make
distclean`? Is it a fresh checkout?
msg91325 - (view) Author: Artur Frysiak (wiget) Date: 2009-08-05 17:14
reproduced on fresh checkout from
http://svn.python.org/projects/python/branches/py3k (rev. 74321)

$ gcc --version|head -n 1
gcc (PLD-Linux) 4.4.1 20090724 (release)
msg91387 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-06 20:52
I'm afraid you'll have to investigate a bit by yourself, or wait for a
core developer to get the same problem :-/
msg91447 - (view) Author: Artur Frysiak (wiget) Date: 2009-08-10 14:18
Adding -fno-caller-saves to OPT in Makefile fixes problem.

BTW All compilations done on x86_64 architecture.
msg91454 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-10 16:31
It sounds like a compiler bug... It would be nice if people could give
results with other gcc versions and builds.
(my test was done on x86-64 too, by the way)
msg93896 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-12 13:54
I'm able to reproduce this on Debian Lenny/x86_64 with a home-built gcc
4.4.0 and the --with-computed-gotos and --with-tsc configure options.

I'm compiling with:

CC=gcc-4.4 ./configure --with-tsc --with-computed-gotos && make

The Python executable builds successfully, and then the module build
stage fails exactly as Artur described.  Here's a minimal failing example:

Python 3.2a0 (py3k:75376, Oct 12 2009, 14:32:36)
[GCC 4.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(*args): return None
...
>>> def g(*args): return f(*args)
...
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in g
TypeError: f() argument after ** must be a mapping, not tuple

With gcc 4.3.2 I get no failure.  With either of the configure options
removed I get no failure.

After some investigation, I believe the problem is in Python/ceval.c: 
what's going on is that the inline x86 assembly for the READ_TIMESTAMP
macro is clobbering the rdx register.  This clobbering changes the value
of the 'flags' variable in _call_function_var_kw, leading the
ext_do_call to think that it's supposed to be doing a
CALL_FUNCTION_VAR_KW instead of a CALL_FUNCTION_VAR.

I believe this *is* a Python bug:  the inline assembly should be fixed
to use the eax and edx registers explicitly instead of using the "=A"
constraint;  for x86_64, "=A" apparently means the 64-bit rax register,
rather than the edx/eax pair of 32-bit registers (though it's difficult
to find any mention of this in the gcc documentation).

I don't know why the --with-computed-gotos configure option affects this.
msg93898 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-12 18:14
Here's a patch.

Martin, would you be able to take a look at this?

N.B.  I also tried './configure --with-tsc && make && make test' for a 
build of the trunk on OS X 10.6;  the configure and make steps succeeded, 
but 'make test' immediately segfaults;  gdb shows the segfault coming from 
ceval.c.  This patch fixes that segfault.
msg93899 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-12 18:16
N.B.  The patch assumes that unsigned int has 32 bits.  This is almost 
certainly true on the platforms of interest, but it might be better to use  
uint64_t and uint32_t throughout the tsc code.  For Python 2.7 and 3.x, 
uint32_t and uint64_t are already detected by the configure script;  for 
2.6, the detection might need to be added.
msg93927 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-10-13 16:07
I'm skeptical that the patch is right (or, rather, that it is not a 
compiler bug). According to 

http://gcc.gnu.org/viewcvs/trunk/gcc/config/i386/i386.h?view=markup

in macro REG_CLASS_CONTENTS, the class AD_REGS includes indeed EDX and 
EAX. The exact place where this gets used has varied recently quite a 
lot; it currently lives in predicates.md:

http://gcc.gnu.org/viewcvs/trunk/gcc/config/i386/constraints.md?
revision=149373&view=markup

where "A" becomes the letter denoting the AD_REGS constraint. So IIUC, 
this means that EDX is the output of the insn, and your change should 
have no effect. I suggest talking to gcc people what the problem might 
be.
msg93928 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 16:28
Here's a relevant gcc bug report:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21249

As I understand it, the rules are that in 32-bit mode "A" means edx/eax 
for a 64-bit quantity, and in 64-bit mode "A" means rdx/rax for a 128-bit 
quantity and "rdx or rax" for a 64-bit quantity.
msg93929 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 16:41
I just found a more recent one:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41133

Whether this is a gcc misfeature or not, it sounds as though it's not 
going to change.

Maybe there should be two separate READ_TIMESTAMP cases?  One for x86 (or 
x86_64 in 32-bit mode), and one for x86_64 in 64-bit mode.
msg93930 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-10-13 16:42
Looking at this further, it seems that the rdtsc code got miscompiled on 
x64 for some time already. Consider this code

typedef unsigned long long uint64;
uint64 f(uint64 b)
{
   uint64 a;
   __asm__ __volatile__("rdtsc" : "=A" (a));
   return a+b;
}

My Apple gcc 4.0.1 compiles that into 

_f:
        pushq   %rbp
        movq    %rsp, %rbp
        rdtsc
        addq    %rdi, %rax
        leave
        ret

Here, %rdi is the incoming parameter; %rdx is not considered at all. 
This seems to come from DImode (double integer) processing: gcc just 
"knows" that a DImode variable lives in a single register on AMD64.

So even if your code is right in principle, I still think there is a gcc 
bug here.

As for the specific code: I'm not sure whether it's guaranteed that you 
can truncate output registers in an asm. If you can't, you should make 
the output registers 64-bit integers on AMD64. If you can, I think you 
can "simplify" the code by directly outputting to (int*)v and 
((int*)v)[1]; this would be worthwhile only if the generated code 
actually gets better by omitting the shift operation.

FWIW, I don't consider a bug that only occurs --with-tsc and only on 
AMD64 critical.
msg93931 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 16:53
> As for the specific code: I'm not sure whether it's guaranteed that
> you  can truncate output registers in an asm. If you can't, you
> should make the output registers 64-bit integers on AMD64.

Specifying "eax" and "edx" directly, rather than using "a" and "d", 
would presumably also work here?


> If you can, I think you 
> can "simplify" the code by directly outputting to (int*)v and 
>((int*)v)[1];

I'll take a look at that.  Thanks.

> FWIW, I don't consider a bug that only occurs --with-tsc and only on 
> AMD64 critical.

Sorry.  I got overexcited.  Changed priority to 'normal'.
msg93933 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 17:17
FWIW, the Linux source splits things up into 32-bit and 64-bit, and uses 
"A" for 32-bit and "a" and "d" for 64-bit;  their 64-bit code (after 
unwinding the macros) is essentially identical to the code in the patch.
See

http://lxr.linux.no/linux+v2.6.31/arch/x86/include/asm/msr.h
msg93937 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 20:07
> "simplify" the code by directly outputting to (int*)v and 
> ((int*)v)[1];

This does indeed seem to produce better compiler output, at least under 
Apple gcc-4.0.1, Apple gcc-4.2.1, and non-Apple gcc-4.4.1.

Here's a revised patch, which also separates the i386 and x86_64 cases.

Assigning to Martin for review.
msg93938 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 20:08
Whoops. Wrong patch.  Here's the right one.
msg93939 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 20:31
Aargh.  I somehow deleted a backslash between testing and submitting that 
patch.  Sorry.

Third time lucky.
msg93945 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-10-13 22:29
The patch is fine, please apply (notice that the 2.6 branch is closed for 
anything but 2.6.3 regressions right now, so you may want to defer this 
after 2.6.4).
msg94750 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-31 10:23
Applied to 2.7, 2.6, 3.2, 3.1 in r75982 through r75985.
History
Date User Action Args
2009-10-31 10:23:02mark.dickinsonsetstatus: open -> closed

messages: + msg94750
stage: patch review -> resolved
2009-10-13 22:29:32loewissetassignee: loewis -> mark.dickinson
resolution: accepted
messages: + msg93945
2009-10-13 20:31:12mark.dickinsonsetfiles: + tsc3.patch

messages: + msg93939
2009-10-13 20:09:00mark.dickinsonsetfiles: - tsc2.patch
2009-10-13 20:08:54mark.dickinsonsetfiles: + tsc2.patch

messages: + msg93938
2009-10-13 20:07:35mark.dickinsonsetfiles: + tsc2.patch
assignee: mark.dickinson -> loewis
messages: + msg93937
2009-10-13 17:17:10mark.dickinsonsetmessages: + msg93933
2009-10-13 16:53:35mark.dickinsonsetpriority: critical -> normal

messages: + msg93931
2009-10-13 16:42:48loewissetmessages: + msg93930
2009-10-13 16:41:23mark.dickinsonsetmessages: + msg93929
2009-10-13 16:28:21mark.dickinsonsetmessages: + msg93928
2009-10-13 16:07:02loewissetmessages: + msg93927
2009-10-12 18:16:56mark.dickinsonsetmessages: + msg93899
2009-10-12 18:14:02mark.dickinsonsetfiles: + tsc_x86_64.patch

nosy: + loewis
messages: + msg93898

keywords: + patch
stage: needs patch -> patch review
2009-10-12 13:54:36mark.dickinsonsetpriority: critical

assignee: mark.dickinson
versions: + Python 2.6, Python 2.7
keywords: + easy
nosy: + mark.dickinson

messages: + msg93896
stage: needs patch
2009-08-10 16:31:04pitrousetmessages: + msg91454
2009-08-10 14:18:28wigetsetmessages: + msg91447
versions: + Python 3.2
2009-08-06 20:52:24pitrousetmessages: + msg91387
2009-08-05 17:14:39wigetsetmessages: + msg91325
2009-08-05 10:34:21pitrousetnosy: + pitrou
messages: + msg91306
2009-07-30 14:33:24wigetsetmessages: + msg91098
2009-07-30 14:30:34wigetcreate