classification
Title: #ifdef and mbcs: don't check for defined(HAVE_USABLE_WCHAR_T)
Type: Stage:
Components: Unicode, Windows Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, brian.curtin, haypo, loewis, python-dev, tim.golden
Priority: normal Keywords: patch

Created on 2010-08-19 12:02 by haypo, last changed 2011-07-04 12:26 by haypo. This issue is now closed.

Files
File name Uploaded Description
have_mbcs.patch haypo, 2011-06-22 21:14
have_mbcs-2.patch haypo, 2011-07-04 10:55
have_mbcs-3.patch haypo, 2011-07-04 11:37
Messages (11)
msg114350 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-08-19 12:02
mbcs codec functions are surrounded by:

#if defined(MS_WINDOWS) && defined(HAVE_USABLE_WCHAR_T)
(especially in unicodeobject.c and _codecsmodule.c)

or

#ifdef MS_WIN32
(in unicodeobject.h)

or

#if defined(MS_WINDOWS) && !defined(__BORLANDC__)
(in timemodule.c)

I think that all of these tests are wrong. We should just check that we are compiling under Windows: mbcs functions don't use the wchar_t type. It would also be better to use the same macro in every test (MS_WIN32 vs. MS_WINDOWS).

The attached patch replaces all of these #ifdef tests (except the one in timemodule.c, because I don't know what to do with the Borland check; does anyone use this compiler?).

I suppose that my patch doesn't change anything in practice, because mbcs is used in many places and nobody has complained that the mbcs encoding was missing on Windows.
msg114354 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-08-19 12:25
It's true that, for now, MS_WINDOWS implies HAVE_USABLE_WCHAR_T, and PyUnicodeObject is directly used as a WCHAR array.

I'd prefer a new symbol though. Why not something like HAVE_MBCS_CODEC?
msg114411 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-08-19 21:02
> mbcs functions don't use the wchar_t type.

That's not true. MultiByteToWideChar uses LPWSTR, which is a typedef for
wchar_t*.

These functions assume that Py_UNICODE is the same type as WCHAR.
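
The assumption mentioned here can be pictured with a small, portable sketch. The helper usable_as_wchar() is hypothetical and not part of CPython; it only compares element sizes, which is a simplification of what HAVE_USABLE_WCHAR_T stands for.

```c
#include <stddef.h>

/* Hypothetical helper, not part of CPython.
 * Returns 1 if a buffer whose elements are `unit_size` bytes wide has the
 * same element width as wchar_t, so it could be handed to an API expecting
 * wchar_t* without conversion. Returns 0 otherwise. */
static int usable_as_wchar(size_t unit_size)
{
    return unit_size == sizeof(wchar_t);
}
```

On Windows, wchar_t is two bytes wide, so under this simplified check a four-byte Py_UNICODE would return 0 and the buffer would need converting before being passed to MultiByteToWideChar or WideCharToMultiByte.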

> We should just check that we are compiling under Windows:

-1, see above. In the long run, it would be really good if Python
supported a four-byte Py_UNICODE on Windows - people keep asking for it.

I have been meaning to provide versions of the mbcs codecs for years
that work for UCS-4, but haven't found the time yet to implement them.
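
One ingredient such UCS-4 versions would probably need is a conversion from four-byte code points to the UTF-16 code units that the Windows wide-character APIs expect. The following is only a sketch under that assumption: ucs4_to_utf16() is a hypothetical name, and nothing in this issue says the codecs were or would be implemented this way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of CPython.
 * Converts `in_len` UCS-4 code points to UTF-16 code units in `out`.
 * Returns the number of units written, or (size_t)-1 if a code point is
 * invalid (a surrogate or above U+10FFFF) or `out_cap` is too small. */
static size_t ucs4_to_utf16(const uint32_t *in, size_t in_len,
                            uint16_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;
        if (cp < 0x10000) {
            /* BMP code point: one UTF-16 unit. */
            if (n + 1 > out_cap)
                return (size_t)-1;
            out[n++] = (uint16_t)cp;
        } else {
            /* Supplementary code point: encode as a surrogate pair. */
            if (n + 2 > out_cap)
                return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return n;
}
```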
msg138835 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-22 21:14
have_mbcs.patch: use a HAVE_MBCS define instead of the various tests to check whether the MBCS codec can be used. HAVE_MBCS is defined in unicodeobject.h by:

#if defined(MS_WINDOWS) && defined(HAVE_USABLE_WCHAR_T)
#  define HAVE_MBCS
#endif

> > We should just check that we are compiling under Windows:

> -1, see above. In the long run, it would be really good if Python
> supported a four-byte Py_UNICODE on Windows - people keep asking
> for it.

MBCS functions of the Python API are always available on Windows without my patch. I don't know if it's correct or not. Using my patch, they are not available if HAVE_USABLE_WCHAR_T is not defined.
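
As a rough illustration of this availability rule, the sketch below shows how a call site might be guarded by HAVE_MBCS. The helper mbcs_codec_name() is hypothetical and not part of CPython; in a real build, MS_WINDOWS and HAVE_USABLE_WCHAR_T would come from Python's configuration headers rather than from this file.

```c
#include <stddef.h>

/* Same definition as in the patch: the codec exists only on Windows
 * builds where wchar_t is usable as the Unicode storage type. */
#if defined(MS_WINDOWS) && defined(HAVE_USABLE_WCHAR_T)
#  define HAVE_MBCS
#endif

/* Hypothetical helper, not part of CPython.
 * Returns the codec name when the MBCS codec is compiled in,
 * or NULL when it is not available in this build. */
static const char *mbcs_codec_name(void)
{
#ifdef HAVE_MBCS
    return "mbcs";
#else
    return NULL;
#endif
}
```

Compiled outside a Python build, neither macro is defined, so the helper reports that the codec is unavailable. Every call site tests the single HAVE_MBCS symbol instead of repeating the platform checks.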

Supporting a 32-bit Py_UNICODE on Windows requires a lot of work, because in *many* places (everywhere?) Py_UNICODE* is used as wchar_t*. But that is not the topic of this issue :-)
msg138986 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-06-24 21:16
How do the two patches relate?
msg138990 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-24 21:42
> How do the two patches relate?

Oh, I forgot to remove my first patch which was wrong.
msg139750 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-07-04 10:55
Patch version 2, more complete: use HAVE_MBCS everywhere.
msg139753 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-07-04 11:37
Version 3 of the patch:
 - fix the initialization of the filesystem encoding if HAVE_MBCS is not set
 - Python now fails if it is unable to get the filesystem encoding, instead of falling back to UTF-8
 - reorganize the definition of the time_clock() function
 - clean up how TZNAME_ENCODING is defined in timemodule.c

The change to initfsencoding() should go in a separate commit.
msg139755 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-04 11:49
New changeset 7ce685cda0ae by Victor Stinner in branch 'default':
Issue #9642: Fix filesystem encoding initialization: use the ANSI code page on
http://hg.python.org/cpython/rev/7ce685cda0ae
msg139756 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-04 12:03
New changeset 75b18b10064f by Victor Stinner in branch 'default':
Issue #9642: Fix the definition of time.clock() on Windows
http://hg.python.org/cpython/rev/75b18b10064f
msg139759 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-04 12:26
New changeset 13e6d3cb2ecd by Victor Stinner in branch 'default':
Issue #9642: Uniformize the tests on the availability of the mbcs codec
http://hg.python.org/cpython/rev/13e6d3cb2ecd
History
Date User Action Args
2011-07-04 12:26:44 haypo set status: open -> closed; resolution: fixed; versions: + Python 3.3, - Python 3.2
2011-07-04 12:26:02 python-dev set messages: + msg139759
2011-07-04 12:03:52 python-dev set messages: + msg139756
2011-07-04 11:49:05 python-dev set nosy: + python-dev; messages: + msg139755
2011-07-04 11:37:13 haypo set files: + have_mbcs-3.patch; messages: + msg139753
2011-07-04 10:55:05 haypo set files: + have_mbcs-2.patch; messages: + msg139750
2011-06-24 21:42:14 haypo set messages: + msg138990
2011-06-24 21:39:33 haypo set files: - ifdef_mbcs.patch
2011-06-24 21:16:37 loewis set messages: + msg138986
2011-06-22 21:14:55 haypo set files: + have_mbcs.patch; messages: + msg138835
2010-08-19 21:02:20 loewis set nosy: + loewis; title: #ifdef and mbcs: don't check for defined(HAVE_USABLE_WCHAR_T) -> #ifdef and mbcs: don't check for defined(HAVE_USABLE_WCHAR_T); messages: + msg114411
2010-08-19 12:25:39 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg114354
2010-08-19 12:02:25 haypo create