classification
Title: Make python code compilable with a C++ compiler
Type: enhancement
Stage: patch review
Components:
Versions: Python 3.4
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, belopolsky, lemburg, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2009-01-02 04:56 by belopolsky, last changed 2012-12-03 08:08 by Arfrever.

Files
File name          Uploaded
c++-patch.diff     belopolsky, 2009-01-02 04:56
c++-patch-2.diff   belopolsky, 2009-01-03 05:17
Messages (12)
msg78756 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2009-01-02 04:56
I am posting this patch mainly to support python-dev discussion on this 
topic.  In the past (see r45330) it was possible to compile python core 
and standard library modules using a C++ compiler.

According to Martin v. Löwis (issue4665), "It's not a requirement that 
the Python source code is compilable as C++. Only the header files 
should be thus compilable." However, I can see certain benefits from 
such a requirement:

1. It is hard to verify that header files are compilable if source code 
is not.  With compilable source code, CC=g++ ./configure; make will 
supply an adequate test that does not require anything beyond a standard 
distribution.

2. Arguably, C++ compliant code is more consistent and less error prone.   
For example, "new" is a really bad choice for a variable name regardless 
of being a C++ keyword, especially when used instead of prevailing "res" 
for a result of a function producing a PyObject. Even clearly redundant 
explicit casts of malloc return values arguably improve readability by 
reminding the reader of the type of the object being allocated.

3. Compiling with C++ may reveal actual coding errors that otherwise go unnoticed.  For example, use of undefined PyLong_BASE_TWODIGITS_TYPE in Objects/longobject.c.

4. Stricter type checking may promote use of specific types instead of 
void* which in turn may help optimizing compilers.

5. Once achieved, C++ compilability is not that hard to maintain, but 
restoring it with patches like this one is hard because it requires 
review of changes to many unrelated files.
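
As an illustration of point 2, here is a minimal sketch (not taken from 
the patch) of an allocation helper written so that it should compile 
under both a C and a C++ compiler.  The names buffer, make_buffer and 
free_buffer are hypothetical and exist only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type used only for this illustration. */
typedef struct {
    size_t len;
    char *data;
} buffer;

/* Allocate a zero-filled buffer of `len` bytes; return NULL on failure.
   The result is named `res` rather than `new` (a C++ keyword), and the
   malloc() results are cast explicitly because C++ does not convert
   void* to other object pointer types implicitly. */
static buffer *
make_buffer(size_t len)
{
    buffer *res = (buffer *)malloc(sizeof(buffer));
    if (res == NULL)
        return NULL;
    /* Request at least one byte so a zero-length buffer is still valid. */
    res->data = (char *)malloc(len ? len : 1);
    if (res->data == NULL) {
        free(res);
        return NULL;
    }
    memset(res->data, 0, len);
    res->len = len;
    return res;
}

/* Release a buffer obtained from make_buffer(); NULL is accepted. */
static void
free_buffer(buffer *buf)
{
    if (buf != NULL) {
        free(buf->data);
        free(buf);
    }
}
```

Without the casts, or with the result variable named new, a C compiler 
would accept the same function but a C++ compiler would reject it.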
msg78764 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2009-01-02 07:30
A related question discussed on python-dev is whether extern "C" {} 
wrappers should ever be used in .c files.  I argue that the answer is "no" 
even if C++ compilability is desired.

The new patch eliminates several uses of extern "C" {} in .c files while 
preserving C++ compilability.  This is achieved by including the proper 
header files instead of declaring external functions in .c files and, in 
some cases, by adding declarations to the appropriate header files for 
functions that are used outside the files in which they are defined.
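
A rough sketch of the approach, collapsed into one translation unit for 
brevity.  The file names helper.h and helper.c and the functions 
helper_add and use_helper are hypothetical, not names from the patch.

```c
/* ---- would live in a header, e.g. a hypothetical "helper.h" ---- */
#ifndef HELPER_H
#define HELPER_H
#ifdef __cplusplus
extern "C" {
#endif

/* The linkage specification appears once, in the header. */
int helper_add(int a, int b);

#ifdef __cplusplus
}
#endif
#endif /* HELPER_H */

/* ---- would live in helper.c, which includes "helper.h" ---- */
/* No extern "C" wrapper is needed here: the definition takes its
   linkage from the declaration already seen in the header. */
int
helper_add(int a, int b)
{
    return a + b;
}

/* ---- would live in another .c file that includes "helper.h"
   instead of writing its own `extern int helper_add(int, int);` ---- */
int
use_helper(void)
{
    return helper_add(2, 3);
}
```

Because every user sees the same declaration, a mismatch between the 
declaration and the definition becomes a compile-time error rather than 
a silent link-time problem.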
msg78821 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-02 15:51
Moving declarations into header files is not really in line with the way
Python developers use header files:

We usually only put code into header files that is meant for public use. 

By putting declarations into the header files without additional
warning, you implicitly document them and make them usable in
non-interpreter code.
msg78822 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-02 15:54
Also note that by removing the extern "C" declarations, you not only
change the exported symbol names of functions, but also those of
exported globals.

Those would also have to get declared in the header files, to prevent
their names from being mangled (causing the exported C API to change).
msg78823 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-02 15:55
The added type casts are useful to have - even outside the context of
the idea behind the patch.
msg78928 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2009-01-03 03:38
On Fri, Jan 2, 2009 at 10:54 AM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
>
> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>
> Also note that by removing the extern "C" declarations, you not only
> change the exported symbol names of functions, but also those of
> exported globals.
>
What are " exported globals" other than "exported symbol names of
functions"?  AFAIK, C++ does not mangle non-function symbols.

> Those would also have to get declared in the header files, to prevent
> their names from being mangled (causing the exported C API to change).

I believe .c files should only contain static functions and functions
that are declared in an included header file.  If a function is
not advertised in a header, it is not part of the API and it is fair
game to mangle it.  The only exceptions are the module init functions,
which are part of the ABI rather than the API.
msg79130 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-05 12:03
On 2009-01-03 04:38, Alexander Belopolsky wrote:
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> On Fri, Jan 2, 2009 at 10:54 AM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
>> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>>
>> Also note that by removing the extern "C" declarations, you not only
>> change the exported symbol names of functions, but also those of
>> exported globals.
>>
> What are " exported globals" other than "exported symbol names of
> functions"?  AFAIK, C++ does not mangle non-function symbols.

GCC doesn't appear to do so, but there's no guarantee that other
C++ compilers won't touch these symbols:

http://en.wikipedia.org/wiki/Name_mangling

>> Those would also have to get declared in the header files, to prevent
>> their names from being mangled (causing the exported C API to change).
> 
> I believe .c files should only contain static functions and functions
> that are declared in an included header file.  If a function is
> not advertised in a header, it is not part of the API and it is fair
> game to mangle it.  The only exceptions are the module init functions,
> which are part of the ABI rather than the API.

That's right, but I was referring to non-function globals.
These would have to be declared in the header files as well to
prevent their names from being mangled.

OTOH, those globals will often not be accessed directly from other
object files - only through functions providing an interface to
them. Still, this would have to be checked.
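
A minimal sketch of what such a declaration could look like, again 
collapsed into one translation unit.  The names example_alloc_count, 
note_alloc and get_alloc_count are hypothetical; whether a given C++ 
compiler mangles an undeclared global at all is compiler-specific.

```c
/* ---- would live in a header ---- */
#ifdef __cplusplus
extern "C" {
#endif

/* Declaring the global with C linkage keeps its exported symbol name
   unchanged even if the defining file is compiled as C++. */
extern int example_alloc_count;

/* Function interface to the counter, so that other object files
   need not touch the global directly. */
void note_alloc(void);
int get_alloc_count(void);

#ifdef __cplusplus
}
#endif

/* ---- would live in the .c file that includes the header ---- */
int example_alloc_count = 0;

void
note_alloc(void)
{
    example_alloc_count++;
}

int
get_alloc_count(void)
{
    return example_alloc_count;
}
```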
msg79162 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-05 16:43
On 2009-01-05 13:03, Marc-Andre Lemburg wrote:
> Marc-Andre Lemburg <mal@egenix.com> added the comment:
> 
> On 2009-01-03 04:38, Alexander Belopolsky wrote:
>> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>>
>> On Fri, Jan 2, 2009 at 10:54 AM, Marc-Andre Lemburg
>> <report@bugs.python.org> wrote:
>>> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>>>
>>> Also note that by removing the extern "C" declarations, you not only
>>> change the exported symbol names of functions, but also those of
>>> exported globals.
>>>
>> What are " exported globals" other than "exported symbol names of
>> functions"?  AFAIK, C++ does not mangle non-function symbols.
> 
> GCC doesn't appear to do so, but there's no guarantee that other
> C++ compilers won't touch these symbols:
> 
> http://en.wikipedia.org/wiki/Name_mangling

Issue #4846 is a good example of a situation where such name mangling
causes problems even for non-function symbols.
msg79182 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2009-01-05 18:55
On Mon, Jan 5, 2009 at 11:43 AM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
..
>> GCC doesn't appear to do so, but there's no guarantee that other
>> C++ compilers won't touch these symbols:
>>
>> http://en.wikipedia.org/wiki/Name_mangling
>
> Issue #4846 is a good example of a situation where such name mangling
> causes problems even for non-function symbols.
>

You are right, I did not know that about MS compilers.   I am not
sure what this means for the issue of removing extern "C" from the .c
files, though.  Note that global symbols properly declared in header
files will not be affected, only semi-private variables such as, for
example, the allocs counters in Objects/object.c.

The allocs counters (tuple_zero_allocs, fast_tuple_allocs,
quick_int_allocs, quick_neg_int_allocs) present a case where it is
really hard to justify a change that is only motivated by C++
compilability.   Note that currently they are not getting extern "C"
at the point of definition (Objects/tupleobject.c and
Objects/intobject.c) but do at the point of declaration
(Objects/object.c).  Moving them to a header file would require
renaming with a _Py_ prefix.  Affected applications are really
esoteric: MS C++ compilation with -DCOUNT_ALLOCS.

I find it hard to get motivated to do a more thorough review of the
code searching for affected non-function symbols.  My original
motivation was just curiosity as to why extern "C" declarations were
added to .c files.  I got my questions answered and I believe these
declarations serve no valid purpose, particularly inside files
that are no longer valid C++.

I see little to be gained in refining the patch further to support
non-g++ compilers.  It does not look like there is much interest in
C++ compilability to begin with.  Despite my posting to the c++-sig
mailing list, no one has subscribed to this issue so far.   Maybe we
should ask on python-list as well.
msg79194 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-05 19:58
On 2009-01-05 19:55, Alexander Belopolsky wrote:
> The allocs counters (tuple_zero_allocs, fast_tuple_allocs,
> quick_int_allocs, quick_neg_int_allocs) present a case where it is
> really hard to justify a change that is only motivated by C++
> compilability.   Note that currently they are not getting extern "C"
> at the point of definition (Objects/tupleobject.c and
> Objects/intobject.c) but do at the point of declaration
> (Objects/object.c).  Moving them to a header file would require
> renaming with a _Py_ prefix.  Affected applications are really
> esoteric: MS C++ compilation with -DCOUNT_ALLOCS.

For completeness, all exported symbols in Python should have a _Py_
prefix, even if they only get exported in certain debug builds.

> I find it hard to get motivated to do a more thorough review of the
> code searching for affected non-function symbols.  My original
> motivation was just curiosity as to why extern "C" declarations were
> added to .c files.  I got my questions answered and I believe these
> declarations serve no valid purpose, particularly inside files
> that are no longer valid C++.

Like I mentioned earlier on: those declarations did serve a purpose
for early MS VC++ versions (at least AFAIR). It may well be the case
that they are no longer needed nowadays, but then they don't hurt
much either.

Given that you can build Python as library on Unix and as DLL on
Windows, there doesn't appear to be any need to actually be able
to build Python itself using a C++ compiler. Simply using the
header files and linking against those libraries should do the
trick in most cases.
msg79201 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2009-01-05 21:06
On Mon, Jan 5, 2009 at 2:58 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
..
> For completeness, all exported symbols in Python should have a _Py_
> prefix, even if they only get exported in certain debug builds.
>
I actually agree, but I felt that doing this as a part of this patch
would make it even less likely to be accepted.   There is another
change that needs to be done to the alloc counts - namely changing the
type from int to Py_ssize_t and %d to %zd in print formats.  I will
submit that as a separate issue.  (See issue4850.)
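
A small sketch of the kind of change meant here, assuming a POSIX 
system where ssize_t is available.  The typedef example_ssize_t is a 
stand-in for Py_ssize_t, and example_allocs and format_allocs are 
hypothetical names, not the actual counters.

```c
#include <stdio.h>
#include <sys/types.h>  /* ssize_t (POSIX; an assumption of this sketch) */

/* Stand-in for Py_ssize_t, used so the example needs no Python headers. */
typedef ssize_t example_ssize_t;

/* Hypothetical counter, widened from int so it cannot overflow at
   INT_MAX on platforms where ssize_t is wider than int. */
static example_ssize_t example_allocs = 0;

/* Format the counter into buf and return snprintf()'s result.
   The conversion must match the type: %zd for the signed counterpart
   of size_t, not %d, which expects an int. */
static int
format_allocs(char *buf, size_t size)
{
    return snprintf(buf, size, "allocs: %zd", example_allocs);
}
```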

The only downside of keeping the extern "C" declarations is that an
#ifdef __cplusplus directive strongly suggests that a file is intended
to be valid C++, which is currently not the case.

> Given that you can build Python as library on Unix and as DLL on
> Windows, there doesn't appear to be any need to actually be able
> to build Python itself using a C++ compiler. Simply using the
> header files and linking against those libraries should do the
> trick in most cases.

So what is your position on the proposed patch?  Is it worthwhile to
track down the remaining symbols that may be affected by removal of
extern "C" from .c files?  What is your opinion on the original patch
(c++-patch.diff) which restores C++ compilability but does not touch
these declarations?

I think using C++ as a lint variant from time to time is a good
exercise to catch some corner issues as I hope this patch
demonstrates.  I don't think this should be a requirement on every
submission, but once an effort is made to restore C++ compilability,
such changes should be applied unless there are valid concerns against
them.
msg79248 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2009-01-06 10:42
On 2009-01-05 22:06, Alexander Belopolsky wrote:
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>> Given that you can build Python as library on Unix and as DLL on
>> Windows, there doesn't appear to be any need to actually be able
>> to build Python itself using a C++ compiler. Simply using the
>> header files and linking against those libraries should do the
>> trick in most cases.
> 
> So what is your position on the proposed patch?  Is it worthwhile to
> track down the remaining symbols that may be affected by removal of
> extern "C" from .c files?  What is your opinion on the original patch
> (c++-patch.diff) which restores C++ compilability but does not touch
> these declarations?
> 
> I think using C++ as a lint variant from time to time is a good
> exercise to catch some corner issues as I hope this patch
> demonstrates.  I don't think this should be a requirement on every
> submission, but once an effort is made to restore C++ compilability,
> such changes should be applied unless there are valid concerns against
> them.

I agree with using C++ compatibility as additional means of checking
for correctness of the code. The type casts you have added in the
patch should definitely make it into the trunk.

Making sure that all exported private symbols get a _Py_ prefix and
a declaration in the header files that adds a comment explaining their
private nature is also a good idea.

I'm not sure about removing the extern "C" declarations altogether.
We'd need further testing with non-G++ compilers to see whether we
still need them or not. With the above fixes, I doubt that we still
need them nowadays.
History
Date                 User              Action  Args
2012-12-03 08:08:08  Arfrever          set     nosy: + Arfrever
2012-11-30 21:50:14  serhiy.storchaka  set     nosy: + serhiy.storchaka; versions: + Python 3.4, - Python 3.2
2010-07-10 06:13:16  terry.reedy       set     stage: patch review
2010-07-10 06:12:55  terry.reedy       set     versions: + Python 3.2, - Python 2.6
2009-01-06 10:42:09  lemburg           set     messages: + msg79248
2009-01-05 21:06:26  belopolsky        set     messages: + msg79201
2009-01-05 19:58:23  lemburg           set     messages: + msg79194
2009-01-05 18:55:46  belopolsky        set     messages: + msg79182
2009-01-05 16:43:58  lemburg           set     messages: + msg79162
2009-01-05 12:03:23  lemburg           set     messages: + msg79130
2009-01-03 05:17:29  belopolsky        set     files: - c++-patch-1.diff
2009-01-03 05:17:23  belopolsky        set     files: + c++-patch-2.diff
2009-01-03 03:38:09  belopolsky        set     messages: + msg78928
2009-01-02 15:55:34  lemburg           set     messages: + msg78823
2009-01-02 15:54:42  lemburg           set     messages: + msg78822
2009-01-02 15:51:54  lemburg           set     nosy: + lemburg; messages: + msg78821
2009-01-02 07:30:46  belopolsky        set     files: + c++-patch-1.diff; messages: + msg78764
2009-01-02 04:56:46  belopolsky        create