classification
Title: Compiler workaround for wide string constants in Modules/getpath.c (patch)
Type: compile error
Stage:
Components: Build
Versions: Python 3.2
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: BreamoreBoy, Jim.Jewett, jschneid, loewis, vstinner
Priority: normal
Keywords: patch

Created on 2011-07-14 17:59 by jschneid, last changed 2014-06-13 16:01 by vstinner. This issue is now closed.

Files
File name Uploaded Description
getpath.patch jschneid, 2011-07-14 17:59 Workaround for compiler string handling limitation
getpath.patch jschneid, 2011-07-15 15:32
Messages (11)
msg140348 - (view) Author: Jim Schneider (jschneid) Date: 2011-07-14 17:59
In Modules/getpath.c, the following line (#138) causes problems with some compilers (HP/UX 11, in particular - there could be others):

static wchar_t *lib_python = L"lib/python" VERSION;

Similarly, line #644:

        module_search_path = L"" PYTHONPATH;

The default HP/UX compiler fails to compile this file with the error "Cannot concatenate character string literal and wide string literal".  The attached patch converts these two string literals to wide string literals that the HP/UX compiler can understand.

Very limited testing indicates that the patch is benign (it does not affect the build on Linux running on x86_64).
msg140355 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-07-14 18:29
Why is the __W macro needed?

Please don't call it WCHAR:
- it conflicts with a same-named macro on Windows
- you are applying it to strings, not characters

FWIW, the compiler doesn't conform to standard C if it rejects this code. 6.4.5p4 says

       [#4] In translation phase 6, the multibyte character
       sequences specified by any sequence of adjacent character
       and wide string literal tokens are concatenated into a
       single multibyte character sequence. If any of the tokens
       are wide string literal tokens, the resulting multibyte
       character sequence is treated as a wide string literal;
       otherwise, it is treated as a character string literal.
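
As a small illustration of the quoted rule (not taken from the patch; the VERSION value below is a stand-in for what the build normally supplies), a conforming compiler should accept a wide literal placed next to a narrow one and treat the result as a wide string literal:

```c
#include <wchar.h>

/* Stand-in for the VERSION macro; the real value comes from the build. */
#define VERSION "3.2"

/* Mixed concatenation: a wide literal followed by a narrow one.
 * Under C99 6.4.5p4 the whole result is a wide string literal,
 * so it can initialize a pointer to wchar_t. */
static const wchar_t *lib_python = L"lib/python" VERSION;

/* Length in wchar_t units, including the terminating null character. */
static const size_t mixed_len = sizeof(L"ab" "cd") / sizeof(wchar_t);
```

A compiler that reports "Cannot concatenate character string literal and wide string literal" is rejecting the initializer of lib_python here.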
msg140359 - (view) Author: Jim Schneider (jschneid) Date: 2011-07-14 18:36
The __W macro is needed because the token-pasting operator is applied to the macro's argument as written, before that argument is macro-expanded. Having WCHAR(y) expand to __W(y) means that __W is passed WCHAR's argument after it has been macro-expanded. Without the intermediate step, WCHAR(VERSION) becomes LVERSION.

As for the name - I have no objection to reasonable name changes.  I picked WCHAR because it converts its argument to a wchar_t *.

Finally - I am aware that the HP/UX C compiler is broken.  Unfortunately, I am required to work with it, and can neither replace it nor ignore it.
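
A minimal sketch of the two-level macro idiom described above. The names WIDEN and WIDEN_PASTE are hypothetical (they are not the names used in the attached patch), and VERSION is a stand-in value:

```c
#include <wchar.h>

/* Stand-in for the VERSION macro; the real value comes from the build. */
#define VERSION "3.2"

/* Inner macro: pastes L onto its argument exactly as written.
 * Called directly, WIDEN_PASTE(VERSION) would yield the token LVERSION. */
#define WIDEN_PASTE(s) L##s

/* Outer macro: its argument is not an operand of ##, so it is fully
 * macro-expanded before being handed to WIDEN_PASTE. */
#define WIDEN(s) WIDEN_PASTE(s)

/* Both pieces are now wide literals, so no mixed concatenation is needed. */
static const wchar_t *lib_python = L"lib/python" WIDEN(VERSION);
```

With this arrangement WIDEN(VERSION) expands to L"3.2", and the compiler only has to concatenate two wide string literals.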
msg140421 - (view) Author: Jim Schneider (jschneid) Date: 2011-07-15 15:32
I am attaching an updated patch.  This version specifically checks for __hpux, and the macro name has been changed to avoid clashing with other uses.
msg140422 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-15 15:36
Using >L"" CONSTANT< to decode a byte string to a character string doesn't work with non-ASCII strings. _Py_char2wchar() should be used instead: see, for example, the fix in commit 5b6e13b6b473.
msg140424 - (view) Author: Jim Schneider (jschneid) Date: 2011-07-15 15:44
I am collecting HP/UX compiler bug workarounds in issue 12572.

Stinner - is the patch you mentioned in a released version of Python 3.2?  Also, how is it affected by the fact that the (wide char) strings in question are constants?
msg140441 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-15 16:14
> Stinner - is the patch you mentioned in a released version
> of Python 3.2?

Yes, Python 3.2.1. (It's not part of Python 3.1.)

> Also, how is it affected by the fact that the (wide char) strings
> in question are constants?

I don't remember exactly. My patch uses the locale encoding at runtime instead of using the locale encoding of the compiler. See issue #6011 for the details.
msg140445 - (view) Author: Jim Schneider (jschneid) Date: 2011-07-15 16:41
Initializers for static objects are required to be constant expressions, not function calls, so _Py_char2wchar cannot be used in the definition of lib_python (line #138 of Modules/getpath.c).
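
One way around that constraint, sketched here under assumptions, is to leave the static pointer NULL and decode it on first use. The helper names decode_locale and get_lib_python are hypothetical, VERSION is a stand-in value, and the standard mbstowcs() is used only as a substitute for _Py_char2wchar(), whose signature is not shown in this issue:

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Stand-in for the VERSION macro; the real value comes from the build. */
#define VERSION "3.2"

/* A function call is not allowed in a static initializer,
 * so the pointer starts out NULL and is filled in at runtime. */
static wchar_t *lib_python = NULL;

/* Decode a byte string using the current locale.
 * Returns a malloc'ed wide string, or NULL on failure. */
static wchar_t *decode_locale(const char *s)
{
    /* A multibyte string never decodes to more wide characters than bytes. */
    size_t cap = strlen(s) + 1;
    wchar_t *w = malloc(cap * sizeof(wchar_t));
    if (w == NULL)
        return NULL;
    if (mbstowcs(w, s, cap) == (size_t)-1) {
        free(w);                /* invalid multibyte sequence */
        return NULL;
    }
    w[cap - 1] = L'\0';         /* defensive: guarantee termination */
    return w;
}

/* Lazily decode the narrow constant the first time it is needed. */
static const wchar_t *get_lib_python(void)
{
    if (lib_python == NULL)
        lib_python = decode_locale("lib/python" VERSION);
    return lib_python;
}
```

Callers would use get_lib_python() wherever lib_python was read directly and must handle a NULL result. The narrow literals "lib/python" and VERSION are both byte strings here, so no mixed concatenation takes place.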
msg220450 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-06-13 14:49
According to msg140433 this should be closed.
msg220460 - (view) Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2014-06-13 16:00
Following up on Mark Lawrence's comment:  http://bugs.python.org/issue12572 is collecting the patches required to compile under HP/UX, and the patch there supersedes those on this issue.  Closing.
msg220461 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-13 16:01
The changeset a7a8ccf69708 (a few months ago) fixed lib_python so that it no longer concatenates a byte string and a wide character string. I don't see any occurrence of char+wchar in the code, so I'm closing the issue.

changeset:   87136:a7a8ccf69708
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Sat Nov 16 00:45:54 2013 +0100
files:       Modules/getpath.c
description:
Don't mix wide character strings and byte strings (L"lib/python" VERSION): use
_Py_char2wchar() to decode lib_python instead.

Some compilers don't support concatenating literals: L"wide" "bytes". Example: IRIX compiler.
History
Date User Action Args
2014-06-13 16:01:52 vstinner set
    superseder: HP/UX compiler workarounds ->
    resolution: duplicate -> fixed
    messages: + msg220461
2014-06-13 16:00:59 Jim.Jewett set
    status: open -> closed
    nosy: + Jim.Jewett
    messages: + msg220460
    superseder: HP/UX compiler workarounds
    resolution: duplicate
2014-06-13 14:49:43 BreamoreBoy set
    nosy: + BreamoreBoy
    messages: + msg220450
2011-07-15 16:41:19 jschneid set
    messages: + msg140445
2011-07-15 16:14:34 vstinner set
    messages: + msg140441
2011-07-15 15:44:02 jschneid set
    messages: + msg140424
2011-07-15 15:36:49 vstinner set
    nosy: + vstinner
    messages: + msg140422
2011-07-15 15:32:13 jschneid set
    files: + getpath.patch
    messages: + msg140421
2011-07-14 18:36:47 jschneid set
    messages: + msg140359
2011-07-14 18:29:21 loewis set
    nosy: + loewis
    messages: + msg140355
2011-07-14 17:59:40 jschneid create