Message 398663 - Python tracker


Author	blastwave
Recipients	Dennis Sweeney, blastwave, lys.nikolaou, pablogsal
Date	2021-07-31.20:55:16
Message-id	<1627764918.5.0.830410841515.issue44789@roundup.psfhosted.org>
In-reply-to

Content

This was an excellent opportunity to review these CFLAGS and to
ponder the value of each. This took me a day to write and was
then reviewed by a team. I hope it answers your question.

---------------------------------------------------------------

There is nothing too unusual in the CFLAGS on any system. I have
used this sort of config for many years without too many problems.
There are always some open source software packages that are a bit
"special" and one cannot expect strictly portable code everywhere.
However some packages are really critical and Python would certainly
be one of those. We have to agree that the usage of some GNU
extensions always breaks "-pedantic".

Let us go over these compiler flags for a Solaris 10 SPARC64 system.

beta $ echo $CC
/opt/developerstudio12.6/bin/c99

    Clearly that is the C99 compiler. Similar to running f77 in order
    to handle Fortran77 code.  However the f77 is just a symlink these
    days and it points to f90.  Such is life in the modern world.


CFLAGS ?

I guess we can go over these one by one. However, they are all clearly
documented in the "Oracle(R) Developer Studio 12.6: C User's Guide" which
we may see here :

    https://docs.oracle.com/cd/E77782_01/html/E77788/index.html

There is a fairly extensive discussion regarding "Features of C 99" :

    https://docs.oracle.com/cd/E77782_01/html/E77788/bjayy.html

Okay, let's look at these flags that I have used almost everywhere for
many years :

    -Xc  is seen in section B.2.84 :

        (c = conformance) Issues errors and warnings for programs
        that use non-ISO C constructs. This option is strictly
        conformant ISO C without K&R C compatibility extensions.

        As a side comment here the compiler in use is C99 and this
        option is somewhat similar to saying "-pedantic" and yes I
        really do mean iso9899:1999 without any special flavour
        sauce added :)

        The documentation states :

           See D.1 for a discussion of supported 1999 ISO/IEC features.
           See Appendix H for a discussion of differences between
           ISO/IEC C and K&R C.

           All of that discussion is in the links above.


    -errtags=yes -errwarn=%none -errfmt=error
    -erroff=%none -errshort=full

        Let's look at these as a group of options that ensure we get a
        really verbose error message when needed.

        From section B.2.12 we see -errfmt[=[no%]error] which is used
        if you want to prefix the string "error:" to the beginning of
        error messages so they are more easily distinguishable from
        warning messages. The prefix is also attached to warnings that
        are converted to errors by -errwarn.

        In section B.2.17 we see -errwarn[=t] where I use t=%none, which
        prevents "... any warning message from causing the compiler to
        exit with a fatal status should any warning message be issued."

        Around the same section we see -errtags=a where a is either
        yes or no. From the manual : "Displays the message tag for each
        warning message of the C compiler that can be suppressed with
        the -erroff option or made a fatal error with the -errwarn
        option."

        This brings us to the -erroff flag discussed in section B.2.14
        where it simply says %none enables all warning messages.

        Finally there is -errshort which will determine how much data we
        get from an error message. The option "full" should be pretty clear
        and the section B.2.15 states "Error messages are printed with
        tag names for types which have tag names. If there is no tag
        name, the type is shown in expanded form."

    -m64 -xarch=sparc

        These are trivial and merely specify that we are building for a
        64 bit platform and the target architecture is a SPARC. In this
        specific case we are using a Fujitsu SPARC64 based server where
        the full cpu description would be SPARC64-VII+ clock 2860 MHz.

    -xO0 -g -xs

        The -xO0 option is similar to what we see from GCC and LLVM/Clang
        and other compilers. It sets the compiler optimization level, and
        here we use zero. That value is not documented, but it is in fact
        the default and the compiler accepts this flag just fine. Any other
        number from 1 upwards to 5 indicates a level of optimization that
        is ever more complex. To be blunt the use of a debugging switch -g
        with any level of optimization above 2 will result in limited debug
        data. Section B.2.150 lays out everything one would want to know
        for the SPARC and AMD64 platforms.

        The -xs option is a bit special in that it allows debug information
        to be encoded into the executable binaries. Section B.2.172 shows
        us the default is in fact -xs=yes, which is the same as simply -xs
        by itself. When the compile command forces linking (that is, -c is
        not specified) there will be no object file(s) and the debug info
        must be placed in the executable. I tend to always have -xs with the
        -g option and I do specify the -xO0 just to be really clear at a
        glance that this is a non-optimized debug build.

    -xstrconst

        Strangely this is a deprecated option that seems to be silently
        accepted. I have had this in my CFLAGS for at least twenty years
        and never had a problem. It still works and as section B.2.178
        says it "might be removed in a future release."  The manual suggests
        that we replace this switch with -features=conststrings which is
        documented in section B.2.20 thus :

            Enables the placement of string literals in read-only memory.
            The default is -features=conststrings which places string
            literals into the read-only data section. Note that compiling
            a program that attempts to write to the memory location of a
            string literal will now cause a segmentation fault when compiled
            with this option. no% prefix disables this sub-option.

    -xildoff

        This is an oldie but a goodie as they say. Not even documented in
        the Oracle copy of the manual. Older manual revisions simply say :

            Turns off the incremental linker and forces the use of ld.

        Which is what I want. Also I can set LD_foo flags if needed and
        those are fully respected by ld.  We may find this option mentioned
        here :

            https://docs.oracle.com/cd/E19957-01/806-3567/cc_options.html

    -xmemalign=8s

        Somewhat complicated but this flag tells the compiler what data
        alignment to assume. The flag takes two parts where the first is a
        number to suggest "at most X byte alignment" and the second is a
        letter to select the behavior in the event of a misaligned access
        where the "s" means "Raise signal SIGBUS." This makes for an easy
        way to detect bad behavior as a SIGBUS is hard to miss on a machine
        where the operating system will trap the signal and then generate
        a core dump with all the data you could ask for. I have
        consistently used this alignment flag for years and ALL libraries
        are built with it.

    -xnolibmil

        Trivial. Do not inline math library routines.  This may allow for
        easier debugging later. Maybe. If I want to go with optimization
        then of course we inline.

    -xcode=pic32

        From section B.2.103 we may specify code address space. Here I use
        "pic32" which results in :

            Generates position-independent code for use in shared
            libraries (large model). Equivalent to -KPIC. Permits
            references to at most 2**30 unique external symbols on
            32-bit architectures, 2**29 on 64-bit architectures.

        There is a reasonable discussion about this in the manual of course.
        It is worth reading some key features here :

            A routine compiled with either -xcode=pic13 or -xcode=pic32
            executes a few extra instructions upon entry to set a
            register to point at a table (_GLOBAL_OFFSET_TABLE_) used
            for accessing a shared library’s global or static variables.

            Each access to a global or static variable involves an extra
            indirect memory reference through _GLOBAL_OFFSET_TABLE_. If
            the compilation includes -xcode=pic32, there are two
            additional instructions per global and static memory reference.

            When considering these costs, remember that the use of
            -xcode=pic13 and -xcode=pic32 can significantly reduce system
            memory requirements due to the effect of library code sharing.
            Every page of code in a shared library compiled -xcode=pic13
            or -xcode=pic32 can be shared by every process that uses the
            library. If a page of code in a shared library contains even
            a single non-pic (that is, absolute) memory reference, the
            page becomes nonsharable, and a copy of the page must be
            created each time a program using the library is executed.

        Therefore it seems very reasonable that code compiled on a 64bit
        system will be done with -xcode=pic32.

    -xregs=no%appl

        See section B.2.170 where we may specify the usage of registers
        for the generated code. Here I tell the compiler not to use the
        application registers g2 and g3. The manual suggests "You should
        compile all system software and libraries using -xregs=no%appl."
        Well golly gee that is just what I do. Why?

            System software (including shared libraries) must preserve
            these registers’ values for the application. Their use is
            intended to be controlled by the compilation system and
            must be consistent throughout the application.

        Clearly I am creating shared libs to be used in many ways long
        term.

    -xlibmieee

        This is trivial. In section B.2.131 we see :

            Forces IEEE 754 style return values for math routines in
            exceptional cases. In such cases, no exception message is
            printed, and you should not rely on errno.

        Generally when I work with floating point it is best to use
        the provided methods to detect fp-exceptions, which certainly
        do happen ALL the time.

    -mc

        Trivial. From section B.2.55 we see :

            Removes duplicate strings from the .comment section of
            the object file. When you use the -mc flag, mcs -c is
            invoked.

    -ftrap=%none

        This flag can seem confusing in that one would think it means
        floating point exceptions are simply ignored. That is not the
        case. With %none no exception is trapped, so no SIGFPE is
        delivered, but the exceptions are still raised and the status
        flags are still set.  There are a pile of options here and in
        section B.2.37 we see we can trap everything and get a SIGFPE
        along with our own handler.  I find it is far better to check
        for floating point exceptions in my code and then deal with the
        issues without a SIGFPE :

            [no%]division          Trap on division by zero.
            [no%]inexact           Trap on inexact result.
            [no%]invalid           Trap on invalid operation.
            [no%]overflow          Trap on overflow.
            [no%]underflow         Trap on underflow.
            %all                   Trap on all of the above.
            %none                  Trap on none of the above.
            common                 Trap on invalid, division by zero,
                                       and overflow.

        You can use ieee_handler(3M) or fex_set_handling(3M) to
        simultaneously enable traps and install a SIGFPE handler.

        If you do not specify -ftrap, the compiler assumes -ftrap=%none.

        So clearly I am being a bit verbose but at least there is no
        confusion about what is happening.

    -xbuiltin=%none

        In section B.2.95 we see where we can choose to inline some
        common library calls, or not. Since I am doing a debug and
        entirely non-optimized build here there is no reason to
        inline anything.  So I don't. :)

    -xunroll=1

        Section B.2.187 allows us to suggest to the compiler that it
        can unroll loops.  When n is 1, it requires the compiler not
        to unroll loops. Pretty clear and also useful when debugging.

    -Qy

        Section B.2.69 tells us that -Qy is the default in any case and
        it determines whether to emit identification information to the
        output file. Generally very helpful to know every header as well
        as the tools that made a binary.  Use mcs -p to print out loads
        of information from a binary that has not been stripped.

I did take a look at https://www.python.org/dev/peps/pep-0007/  where
we are clearly told to not use GNU extensions and that the code should
be C89 clean with a few C99 features for everything recent.

So this did take a day to write and it was a valuable exercise both for
myself and a number of people who did review.

History
Date	User	Action	Args
2021-07-31 20:55:18	blastwave	set	recipients: + blastwave, lys.nikolaou, pablogsal, Dennis Sweeney
2021-07-31 20:55:18	blastwave	set	messageid: <1627764918.5.0.830410841515.issue44789@roundup.psfhosted.org>
2021-07-31 20:55:18	blastwave	link	issue44789 messages
2021-07-31 20:55:16	blastwave	create