Title: Support for alternate wcstok syntax for Windows compilers
Type: compile error Stage: patch review
Components: Interpreter Core, Windows Versions: Python 3.8
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: erikjanss, r.david.murray, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2014-02-11 13:03 by Jeffrey.Armstrong, last changed 2019-05-19 09:11 by erikjanss. This issue is now closed.

File name Uploaded Description Edit
wcstok.default.old.patch Jeffrey.Armstrong, 2014-02-11 13:03 Changes to support alternate wcstok() syntax on Windows builds
wcstok.default.try2.patch Jeffrey.Armstrong, 2014-02-11 17:27 Changes to support alternate wcstok() syntax on Windows builds review
wcstok.try3.patch Jeffrey.Armstrong, 2014-02-12 01:03 More complete wcstok patch review
wcstok.default.patch Jeffrey.Armstrong, 2014-02-13 02:33 A simpler, complete wcstok patch review
Pull Requests
URL Status Linked Edit
PR 10286 closed erikjanss, 2018-11-01 15:43
PR 11742 erikjanss, 2019-05-19 09:11
Messages (17)
msg210935 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-02-11 13:03
In two locations, the current interpreter code makes some assumptions concerning the syntax of the wcstok() function based solely on the operating system (Windows, in this case).  Compilers other than MSVC may (and do) provide alternative wcstok() syntaxes.

The first change in the attached patch changes a preprocessor check in Modules/main.c to determine if we're compiling with MSVC rather than just whether we're compiling with Windows.  If so, it uses Windows's basic two-argument wcstok() function as it always has.  If the compiler isn't MSVC, the code will now default to the Unix method of converting to ASCII first before tokenizing.  This change is more sensible because the code should really be checking for the compiler's wcstok() capabilities, not what operating system Python is being compiled for.

The second change in the attached patch adds some new code to PC/getpathp.c to support alternate wcstok() syntax in the find_env_config_value() function.  A preprocessor check will now determine if we're compiling for MSVC and, if so, default to the three-argument wcstok_s() function.  If the almost-compatible Open Watcom compiler is detected, a three-argument, POSIX-like wcstok() function is used.  If another compiler is detected, the original two-argument wcstok() is assumed to be adequate.
msg210963 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-11 16:29
IMO your WCSTOK macro should be called Py_WCSTOK and moved to Include/pyport.h, so you can use it in Modules/main.c (please use Unicode for environment variables on Windows).
msg210980 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-02-11 17:27
I've replaced the patch with a newer version that defines Py_WCSTOK in Include/pyport.h as Victor suggested.
msg211019 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-11 20:53
The code can be a little more clear if use indentation in preprocessor directives:

# ifdef _MSC_VER
#  define Py_WCSTOK(x,y,z)  wcstok_s(x,y,z)
# else
#  ifdef __WATCOMC__
#   define Py_WCSTOK(x,y,z)  wcstok(x,y,z)
#  else
#   define Py_WCSTOK(x,y,z)  wcstok(x,y)
#  endif /* __WATCOMC__ */
# endif /* _MSC_VER */
#endif /* MS_WINDOWS */

Or may be even using #elif:

# if defined(_MSC_VER)
#  define Py_WCSTOK(x,y,z)  wcstok_s(x,y,z)
# elif defined(__WATCOMC__)
#  define Py_WCSTOK(x,y,z)  wcstok(x,y,z)
# else
#  define Py_WCSTOK(x,y,z)  wcstok(x,y)
# endif
#endif /* MS_WINDOWS */
msg211031 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-11 22:35
wcstok() is available on other platforms like Linux. You may also
define the constant for these platforms, no?

On Linux, it has 3 parameters:
wchar_t *wcstok(wchar_t *wcs, const wchar_t *delim, wchar_t **ptr);

You should explain that the macro may ignore the third parameter in a
comment and maybe use better names for parameters.
msg211052 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-02-12 01:03
Based on the comments thus far, I've gone ahead with another version of this patch.  Py_WCSTOK is now defined regardless of OS.  For Windows, it chooses between MSVC's wcstok_s, Open Watcom's wcstok, and MinGW's/misc's two-argument wcstok.  If the platform isn't Windows, it defaults to the POSIX-like three-argument wcstok (same as Open Watcom's, actually).

The wcstok functionality is really only used in three places: PC/getpathp.c, Modules/getpath.c, and Modules/main.c.  This patch changes the calls to Py_WCSTOK in all cases.

I appreciate the consideration and input this patch is receiving.
msg211062 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-02-12 08:21

But perhaps it would be better to increase indent to 4 spaces in consistency with other indented code in this file.
msg211069 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-12 09:46
I don't think that "#ifdef MS_WINDOWS" is very useful, you can drop it. So get something simpler:

/* Portable macro for wcstok(): on Windows, it is not thread-safe, the state
 * argument is ignored, except if Visual Studio is used. */
#ifdef _MSC_VER
#  define Py_WCSTOK(str, tok, state)  wcstok_s(str, tok, state)
#elif defined(MS_WINDOWS) && !defined(__WATCOMC__)
#  define Py_WCSTOK(str, tok, state)  wcstok(str, tok)
#  define Py_WCSTOK(str, tok, state)  wcstok(str, tok, state)

But I don't like "#elif defined(MS_WINDOWS) && !defined(__WATCOMC__)": what are the other compilers on Windows which provide wcstok() with 2 parameters?
msg211086 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-02-12 12:30
I know that Borland/Embarcadero also uses the two-argument version like Microsoft.  As I said earlier, MinGW uses the two-argument version, although it is currently marked as deprecated.  Just from searching, it appears that Pelles C/LCC uses the POSIX-like three-argument version.   I'm having trouble thinking of any further Windows C compilers at the moment.

The purpose of wrapping the definition in MS_WINDOWS originally was to preserve any behavior that already existed when building with a compiler that uses the two-argument version on Windows without any explicit case being called out for said compiler.
msg211123 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-02-13 02:33
I've included a revised patch that is far simpler than the others I've proposed.  In this example, the preprocess checks for MSVC or Borland/Embarcadero and, if found, uses three-argument wcstok_s().  If neither is detected, it checks for MinGW and, if found, defaults to old two-argument wcstok().  Otherwise, it defaults to three-argument wcstok().

The default in this case effectively encompasses all other OSes where POSIX-like wcstok() is available.  Additionally, it also allows this default for Open Watcom on Windows or any other theoretically possible compilers on Windows that may provide this more common wcstok() syntax.
msg212967 - (view) Author: Jeffrey Armstrong (Jeffrey.Armstrong) * Date: 2014-03-09 17:53
I didn't receive any feedback on the last patch from 2014-02-12.  Is this issue effectively dead?
msg229174 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-12 16:24
Moving this back to 'patch review' stage since there's a new patch that hasn't been reviewed.  Perhaps this 'ping' will result in someone doing the review (so, no, the issue isn't dead, just sleeping :)
msg323101 - (view) Author: Erik Janssens (erikjanss) * Date: 2018-08-04 11:36
Would anyone mind if I create a new issue to reflect the current situation and create a PR to handle it with a Py_WCSTOK ?
msg329076 - (view) Author: Erik Janssens (erikjanss) * Date: 2018-11-01 15:49
I created a github pull request based on the last patch of Jeffrey Armstrong.

Since mingw now also has a wcstok_s function, the discrimination between wcstok/wcstok_s is now based on MS_WINDOWS being defined, as was the case in Modules/main.c.

Is this correct, or should it be based on the detected compiler ?
msg329512 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-09 13:02
wcstok_s is an optional part of C11 and can be available in other compilers.
msg329547 - (view) Author: Erik Janssens (erikjanss) * Date: 2018-11-09 18:59
@serhiy.storchaka Are you suggesting the discrimination should be based on compiler specific defines, or rather that it should go through configuration (a HAVE_WCSTOK_S pyconfig.h) ?
msg342839 - (view) Author: Erik Janssens (erikjanss) * Date: 2019-05-19 09:11
The same issue was handled in bpo-35890
Date User Action Args
2019-05-19 09:11:19erikjansssetmessages: + msg342839
pull_requests: + pull_request13329
2018-11-09 19:50:05Jeffrey.Armstrongsetnosy: - Jeffrey.Armstrong
2018-11-09 18:59:12erikjansssetmessages: + msg329547
2018-11-09 13:02:48serhiy.storchakasetmessages: + msg329512
versions: + Python 3.8, - Python 3.5
2018-11-01 15:49:22erikjansssetmessages: + msg329076
2018-11-01 15:43:53erikjansssetpull_requests: + pull_request9597
2018-08-04 11:36:11erikjansssetnosy: + erikjanss
messages: + msg323101
2015-05-17 15:21:30Jeffrey.Armstrongsetstatus: open -> closed
resolution: wont fix
2014-10-12 16:24:56r.david.murraysetversions: + Python 3.5, - Python 3.4
nosy: + r.david.murray

messages: + msg229174

stage: commit review -> patch review
2014-03-09 17:53:03Jeffrey.Armstrongsetmessages: + msg212967
2014-02-13 02:33:30Jeffrey.Armstrongsetfiles: + wcstok.default.patch

messages: + msg211123
2014-02-12 12:30:32Jeffrey.Armstrongsetmessages: + msg211086
2014-02-12 09:46:04vstinnersetmessages: + msg211069
2014-02-12 08:21:00serhiy.storchakasetmessages: + msg211062
stage: commit review
2014-02-12 01:03:33Jeffrey.Armstrongsetfiles: + wcstok.try3.patch

messages: + msg211052
2014-02-11 22:35:58vstinnersetmessages: + msg211031
2014-02-11 20:53:24serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg211019
2014-02-11 17:27:15Jeffrey.Armstrongsetfiles: + wcstok.default.try2.patch

messages: + msg210980
2014-02-11 16:29:30vstinnersetnosy: + vstinner
messages: + msg210963
2014-02-11 13:03:36Jeffrey.Armstrongcreate