classification
Title: Use WCHAR variant of OutputDebugString
Type: enhancement
Stage: commit review
Components: Windows
Versions: Python 2.7
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: loewis
Nosy List: amaury.forgeotdarc, eckhardt, loewis, rpetrov, theller
Priority: high
Keywords: patch

Created on 2008-10-08 13:13 by eckhardt, last changed 2009-01-02 20:33 by loewis. This issue is now closed.

Files
File name Uploaded Description
Python-OutputDebugStringW.0.patch eckhardt, 2008-10-08 13:13 patch
Messages (16)
msg74527 - Author: Ulrich Eckhardt (eckhardt) Date: 2008-10-08 13:13
The attached patch converts the call to OutputDebugString() with a
'TCHAR' parameter (which boils down to a 'char') to one using a 'WCHAR'
parameter, allowing the code to be compiled under MS Windows CE, which
doesn't have the 'char' version.
msg74529 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-08 14:39
The alloca() function should be avoided here: the function may be called
in extreme conditions, like stack overflow.
I suggest using a small static buffer (50 chars?) and calling
OutputDebugStringW in a loop.
msg74542 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-10-08 18:40
I agree that a static buffer should be used. I think calling it in a
loop is overkill. Instead, if an overrun occurs, adding "(truncated)"
should be good enough. I could find only a single caller that doesn't
pass a static string (_Py_NegativeRefcount), where I think 50 characters
are still plenty.
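
A minimal sketch of the static-buffer idea discussed above, not the patch attached to this issue. The names widen_truncated, debug_output and DEBUG_BUF_CHARS are hypothetical, the buffer size follows the "50 chars?" suggestion, and the one-to-one widening assumes the message is plain ASCII. Outside Windows the sketch writes to stderr only so that it still builds.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical buffer size, taken from the "50 chars?" suggestion. */
#define DEBUG_BUF_CHARS 50

/* Widen msg into buf, which holds cap wide characters including the
   terminator. Bytes are widened one-to-one, which is only correct for
   ASCII messages. On overrun the tail of the buffer is overwritten with
   "(truncated)". Returns the number of characters stored. */
size_t
widen_truncated(const char *msg, wchar_t *buf, size_t cap)
{
    static const wchar_t marker[] = L"(truncated)";
    const size_t marker_len = sizeof(marker) / sizeof(marker[0]) - 1;
    size_t i = 0;

    if (cap == 0)
        return 0;
    while (msg[i] != '\0' && i < cap - 1) {
        buf[i] = (wchar_t)(unsigned char)msg[i];
        i++;
    }
    buf[i] = L'\0';
    /* Input left over means the message did not fit. */
    if (msg[i] != '\0' && cap > marker_len)
        wmemcpy(buf + (cap - 1 - marker_len), marker, marker_len);
    return i;
}

/* Uses a static buffer instead of alloca() or the heap, so the call needs
   very little extra stack. The price is that it is not reentrant. */
void
debug_output(const char *msg)
{
    static wchar_t buf[DEBUG_BUF_CHARS];

    widen_truncated(msg, buf, DEBUG_BUF_CHARS);
#ifdef _WIN32
    OutputDebugStringW(buf);
#else
    fputws(buf, stderr);   /* stand-in so the sketch builds elsewhere */
#endif
}
```

A caller would simply write debug_output("some message"); anything longer than the buffer ends in "(truncated)" instead of being emitted in several chunks.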
msg74558 - Author: Roumen Petrov (rpetrov) * Date: 2008-10-09 08:49
Which CE version? Is the patch required for previous/next CE versions?
If CE can't work with char, why doesn't the compiler always treat strings
as wide characters?
msg74559 - Author: Ulrich Eckhardt (eckhardt) Date: 2008-10-09 09:18
Roumen, just an explanation on the TCHAR/WCHAR/CHAR issue under win32...

In the old days, DOS/Windows was built with 8-bit characters using
codepages. So functions like CreateFile() took a char string that used
the current local codepage as encoding. Now, since NT 4 (maybe even 3)
the internally used char type is a 16-bit type. In order to ease
conversion, the function CreateFile() was removed (it still exists in
oldnames.lib) and replaced with CreateFileW() and CreateFileA(), which
explicitly take either a codepage-encoded 8-bit string or a UCS2/UTF-16
16-bit string. Under win9x, CreateFileW() actually tried to convert to
the internally used 8-bit character type, while under NT, CreateFileA()
converted from the codepage to the UTF-16 character type.

Now, under CE, which is an embedded OS, neither the
(legacy/obsolete/deprecated) codepages nor the corresponding CreateFileA()
functions exist. They have simply been removed to save space and because
they are of limited functionality anyway.

Which CE version? All of them, since at least CE3 (CE6 is current). Why
not treat all strings as wide strings? Because that would actually change
their existing meaning and make it hard or even impossible to write
portable code.
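
To illustrate the TCHAR mechanism described above: in the Windows SDK headers the unsuffixed API names are commonly provided as macros that resolve to the A or W variant depending on whether UNICODE is defined. The sketch below imitates that pattern in portable C. All demo_* and DEMO_* names are hypothetical stand-ins, not Win32 identifiers.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Stand-ins for a Win32 xxxA / xxxW function pair (hypothetical names). */
size_t demo_length_a(const char *s)    { return strlen(s); }
size_t demo_length_w(const wchar_t *s) { return wcslen(s); }

/* The "TCHAR" layer: one define selects the character type, the literal
   prefix and the function variant, all at compile time. */
#ifdef UNICODE
typedef wchar_t demo_tchar;
#define DEMO_TEXT(s) L##s
#define demo_length  demo_length_w
#else
typedef char demo_tchar;
#define DEMO_TEXT(s) s
#define demo_length  demo_length_a
#endif

/* Code written against the generic names compiles in either mode. */
size_t
demo_greeting_length(void)
{
    const demo_tchar *greeting = DEMO_TEXT("hello");
    return demo_length(greeting);
}
```

On a platform that ships only the wide variant, the non-UNICODE branch has nothing to resolve to, which is why code that hard-codes a char-based call fails to build there.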
msg74577 - Author: Ulrich Eckhardt (eckhardt) Date: 2008-10-09 14:20
Actually, even _Py_NegativeRefcount() passes a statically sized buffer
with 300 chars. Other than that, there is get_ref_type() which uses one
with 350 chars, but AFAICT, that's the largest one. Attached is an
accordingly modified patch.
msg74592 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-10-09 18:36
> If CE can't work with char, why doesn't the compiler always treat strings
> as wide characters?

I think this question is pointless - we don't have the power to change
how CE works. You might question whether Ulrich's analysis of the issue
is accurate (I think it is), or whether Python should support CE at all
(I think it should). FWIW, the compiler *does* "work with char", and it
needs to do so to support a byte type. It's just that the CE APIs don't
support char (at least, some of them apparently don't).
msg74596 - Author: Roumen Petrov (rpetrov) * Date: 2008-10-09 19:25
My experience with Windows CE ends at around version 3.1x. I can't
remember wide character support in that version.
The PythonCE project uses xxxA functions for the CE .NET 4.20 platform.

The "pointless" question is about compiler flags and is not related to the OS.
msg74605 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-10-09 21:24
> The "pointless" question is about compiler flags and is not related to the OS.

I don't think the compiler has any such flag that you might consider
useful. Do you have a specific flag in mind?
msg74618 - Author: Ulrich Eckhardt (eckhardt) Date: 2008-10-10 07:26
"The PythonCE project uses xxxA functions for the CE .NET 4.20 platform."

Look again. The PythonCE project adds a header and sourcefile
(wince_compatibility.h/c) which adds these functions. In other words, it
doesn't use the xxxA functions of the win32 API (which don't exist under
CE) but its own replacements.

I was thinking of going that way too, but in the end decided against it
unless absolutely necessary. The point is that this approach allowed
minimal changes to the Python code, which still had to support the xxxA
variants for win9x. However, since support for win9x has been dropped
(IIRC in 2.6), it now makes much more sense to use the WCHAR APIs, which
are what all supported MS Windows versions use internally anyway. This
allows code to work under CE unchanged, avoids unnecessary conversions
and provides better Unicode support.

BTW: in case somebody actually wants to resurrect the win9x support,
there is a library from Microsoft that provides the xxxW functions for
that platform. Of course that's not a cure but just a band-aid with
reduced functionality, but at least it's possible.
msg74654 - Author: Roumen Petrov (rpetrov) * Date: 2008-10-10 23:09
I couldn't find in MSDN any flags for the Windows CE compilers, similar
to those of GCC, that change the representation of strings in C code.
Microsoft recommends the so-called TCHAR technique, which depends on the
UNICODE define, so I have answered my own question.

From MSDN it isn't so clear what the status of CE version 6.0 (R2?) is.
The documentation for CE 5.0 looks complete, including details for the
compiler and a migration guide.

About wince_compatibility.h/c: it is just a part of the patch
(PythonCE-2.5-20061219). If we look at the complete patch we see changes
like GetLocaleInfo->GetLocaleInfoA (_localemodule.c) but
windll.kernel32.GetProcAddress->cdll.coredll.GetProcAddressW
(test_random_things.py). Another change (posixmodule.c):
CRYPTACQUIRECONTEXTA->CRYPTACQUIRECONTEXT.

So we see changes from xxx to xxxA, xxx to xxxW, xxxA to xxx (!).

Is this patch required for CE 5.0? If not, why change now for a possibly
upcoming 6.0?
If Python switches to the W functions then the issue will be resolved.
Without a general switch to W functions and to wide strings we need to
convert every time from single char to wide char. Is this acceptable in
general? (For this particular case yes: the method is called rarely.)
msg74675 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-10-12 17:40
> If Python switches to the W functions then the issue will be resolved.

Python should use the *W functions wherever possible.

> Without a general switch to W functions and to wide strings we need to
> convert every time from single char to wide char. Is this acceptable in
> general?

Most certainly. If we call the *A function instead, it will convert to
*W itself, so it's no loss if we do it right away (and we may save a
copy in some cases).
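
The point that an *A call pays for a conversion anyway can be illustrated with a portable sketch. It assumes the standard C mbstowcs() as a stand-in for the Win32 MultiByteToWideChar(), and demo_output_a, demo_output_w and demo_last_message are hypothetical names rather than real Win32 functions.

```c
#include <stdlib.h>
#include <wchar.h>

#define DEMO_MAX 128

/* Records what the wide function received, so the effect is observable. */
wchar_t demo_last_message[DEMO_MAX];

/* Stand-in for a native *W entry point (hypothetical name). */
void
demo_output_w(const wchar_t *msg)
{
    wcsncpy(demo_last_message, msg, DEMO_MAX - 1);
    demo_last_message[DEMO_MAX - 1] = L'\0';
}

/* Stand-in for an *A wrapper: convert, then forward to the *W function.
   Returns 0 on success, -1 if msg is not valid in the current locale. */
int
demo_output_a(const char *msg)
{
    wchar_t wide[DEMO_MAX];
    size_t n = mbstowcs(wide, msg, DEMO_MAX - 1);

    if (n == (size_t)-1)
        return -1;
    wide[n] = L'\0';   /* mbstowcs does not terminate when it fills the buffer */
    demo_output_w(wide);
    return 0;
}
```

A caller that already holds a wide string can call demo_output_w() directly and skip the conversion and the intermediate copy altogether.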
msg74726 - Author: Ulrich Eckhardt (eckhardt) Date: 2008-10-14 10:41
"Is this patch required for CE 5.0?"

The patch I created is required for all CEs that I know of. I have
personally worked with 4.20, 5 and now 6, and had some exchange with
others who worked on 3.x variants to get STLport (C++ stdlibrary
implementation) to run. AFAIK, none of them support the *A functions, so
this patch or something similar is required for every CE flavor out there. 

Roumen, you mentioned the way that the PythonCE project did it, which
also works, but I'd say that that code is obsolete, because the emulated
functions actually create a win9x-like environment, while Python has
officially dropped support for that. The problem is that the *W
functions are badly supported on win9x while the *A functions are
unsupported on CE. The NT variants (NT, win2000..) support both APIs,
but the *A functions are wrappers around the *W functions and don't
provide the whole functionality that the OS actually supports.

So, what this patch does is help phase out the *A functions, gaining
more thorough Unicode support while at the same time easing porting to CE.
msg74768 - Author: Roumen Petrov (rpetrov) * Date: 2008-10-14 20:02
Maybe OutputDebugStringA has to be part of a "wince-port" library, but if
it is fine for all Windows platforms to call *W, what about a patch that
uses the MultiByteToWideChar() function?
msg74769 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-10-14 20:47
> Maybe OutputDebugStringA has to be part of a "wince-port" library, but if
> it is fine for all Windows platforms to call *W, what about a patch that
> uses the MultiByteToWideChar() function?

Is there a problem with the proposed patch?
msg78873 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-02 20:33
Thanks for the patch. Committed as r68172.
History
Date User Action Args
2009-01-02 20:33:18 loewis set status: open -> closed; messages: + msg78873
2008-12-30 13:31:01 loewis set priority: high; assignee: loewis
2008-12-30 13:28:49 loewis set stage: commit review
2008-12-30 13:28:18 loewis set resolution: accepted
2008-10-14 20:47:51 loewis set messages: + msg74769
2008-10-14 20:02:09 rpetrov set messages: + msg74768
2008-10-14 10:41:59 eckhardt set messages: + msg74726
2008-10-12 17:40:56 loewis set messages: + msg74675
2008-10-11 18:14:41 theller set nosy: + theller
2008-10-10 23:09:19 rpetrov set messages: + msg74654
2008-10-10 07:26:28 eckhardt set messages: + msg74618
2008-10-09 21:24:55 loewis set messages: + msg74605
2008-10-09 19:25:39 rpetrov set messages: + msg74596
2008-10-09 18:37:00 loewis set messages: + msg74592
2008-10-09 14:20:28 eckhardt set messages: + msg74577
2008-10-09 09:18:26 eckhardt set messages: + msg74559
2008-10-09 08:49:30 rpetrov set nosy: + rpetrov; messages: + msg74558
2008-10-08 18:40:53 loewis set nosy: + loewis; messages: + msg74542
2008-10-08 14:39:46 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg74529
2008-10-08 13:13:26 eckhardt create