msvcrt bytes cleanup #49660

ocean-city · 2009-03-03T11:56:13Z

BPO	5410
Nosy	@pitrou, @vstinner, @benjaminp
Dependencies	bpo-5499: only accept byte for getarg('c') and unicode for getarg('C')
Files	py3k_fix_msvcrt.patch msvcrt_wchar.patch msvcrt_wchar-2.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2009-05-01.21:42:45.276>
created_at = <Date 2009-03-03.11:56:13.175>
labels = ['extension-modules', 'type-bug', 'OS-windows', 'release-blocker']
title = 'msvcrt bytes cleanup'
updated_at = <Date 2009-05-01.21:42:45.274>
user = 'https://bugs.python.org/ocean-city'

bugs.python.org fields:

activity = <Date 2009-05-01.21:42:45.274>
actor = 'benjamin.peterson'
assignee = 'none'
closed = True
closed_date = <Date 2009-05-01.21:42:45.276>
closer = 'benjamin.peterson'
components = ['Extension Modules', 'Windows']
creation = <Date 2009-03-03.11:56:13.175>
creator = 'ocean-city'
dependencies = ['5499']
files = ['13233', '13351', '13718']
hgrepos = []
issue_num = 5410
keywords = ['patch']
message_count = 11.0
messages = ['83071', '83685', '83686', '85182', '85410', '85426', '85455', '85461', '86124', '86125', '86916']
nosy_count = 4.0
nosy_names = ['pitrou', 'vstinner', 'ocean-city', 'benjamin.peterson']
pr_nums = []
priority = 'release blocker'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue5410'
versions = ['Python 3.1']

ocean-city · 2009-03-03T11:56:10Z

I came here from bpo-5391. Here is a quote from Victor's message:

msvcrt.putch(char), msvcrt.ungetch(char): msvcrt has also:

msvcrt.getch()->byte string of 1 byte

msvcrt.getwch()->unicode string of 1 character

msvcrt.putwch(unicode string of 1 character)

msvcrt.ungetwch(unicode string of 1 character)

Hum, putch(), ungetch(), getch() use inconsistent types
(unicode/bytes) and should be fixed. Another issue should be opened for
that.

Notes: msvcrt.putwch() accepts string of length > 1 and
msvcrt.ungetwch() doesn't check string length (and so may crash with
length=0 or length > 1?).

Also, msvcrt.ungetwch() calls _ungetch, not _ungetwch. Here is a patch
that hopefully fixes these issues. (I cannot test the wide versions of
the functions because VC6 doesn't have them.)
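
The type and length checks discussed above can be sketched in pure Python. This is only an illustration, not the attached patch: `require_single` is a hypothetical helper, and the real checks belong in the C argument parsing of the msvcrt module.

```python
def require_single(value, kind):
    """Return value if it is a `kind` object of length 1, else raise.

    Hypothetical helper: kind is bytes for the narrow functions
    (putch, ungetch) and str for the wide ones (putwch, ungetwch).
    """
    if not isinstance(value, kind):
        raise TypeError(
            "expected %s of length 1, got %s"
            % (kind.__name__, type(value).__name__)
        )
    if len(value) != 1:
        raise TypeError(
            "expected %s of length 1, got length %d"
            % (kind.__name__, len(value))
        )
    return value


# Narrow functions take bytes, wide functions take str.
require_single(b"a", bytes)
require_single("\u20ac", str)

# Empty or multi-character arguments are rejected instead of being
# passed through to the C runtime.
for bad in ("", "ab"):
    try:
        require_single(bad, str)
    except TypeError as exc:
        print(exc)
```

Rejecting the argument before it reaches the C runtime avoids the possible crash with length 0 or length > 1 mentioned in the notes.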

vstinner · 2009-03-17T17:12:21Z

msvcrt.ungetwch() calls _ungetch not _ungetwch

... are you sure that anyone has ever used these functions? :-)

Assuming that bpo-5499 is fixed, you can leave msvcrt_putch()
and msvcrt_ungetch() unchanged and use the "C" format in
msvcrt_ungetwch() ("Py_UNICODE ch;" has to be replaced by "int ch;"
for the "C" format).

vstinner · 2009-03-17T17:22:46Z

Patch implementing my proposal (depends on bpo-5499).

vstinner · 2009-04-02T08:32:19Z

bpo-5499 is fixed, so msvcrt_wchar.patch can now be used :-) Is anyone
available for a review and/or _a test_? I don't have Windows, so it's
hard for me to test my patch.

pitrou · 2009-04-04T16:56:46Z

There seems to be a problem with ungetwch():

>>> s = msvcrt.getwch()
# Here I type the Euro sign (€)
>>> ascii(s)
"'\\u20ac'"
>>> msvcrt.ungetwch(s)
>>> u = msvcrt.getwch()
>>> ascii(u)
"'\\xac'"

benjaminp · 2009-04-04T18:58:32Z

I think this can wait until the first beta.

vstinner · 2009-04-05T00:39:56Z

There seems to be a problem with ungetwch()

I tested with Visual C++ Express 2008 and it looks like _ungetwch() only
keeps the lower 8 bits (like _ungetwch(x & 255)). But that's a bug in
Microsoft's library, not in the Python code (I added some printf calls
to be sure).
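
The masking can be checked with plain arithmetic. The snippet below is only an illustration of the `x & 255` behaviour described here; it does not call msvcrt at all.

```python
euro = "\u20ac"                   # the euro sign typed in the session above
masked = chr(ord(euro) & 0xFF)    # keep only the low 8 bits of the code point

print(hex(ord(euro)))    # 0x20ac
print(ascii(masked))     # '\xac'
```

The result matches what getwch() returned after ungetwch() in the reported session.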

My patch (msvcrt_wchar.patch) makes the situation better, but it's not
perfect because of a bug in Microsoft's library.

msvcrt.getwch() works correctly with characters with code > 255 (e.g.
the euro sign, U+20AC, 8364 in decimal).

ocean-city · 2009-04-05T01:59:39Z

MSDN says _ungetwch returns WEOF instead of EOF when an error occurs.
http://msdn.microsoft.com/en-us/library/yezzac74(VS.80).aspx

I cannot see any remarks about masking behavior. :-(
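
Why the WEOF/EOF distinction matters can be shown with a little arithmetic. The sketch below assumes that `wint_t` is a 16-bit unsigned type and that `WEOF` is 0xFFFF, as in Microsoft's C runtime headers, while `EOF` is -1; these values are not stated in this thread.

```python
import ctypes

EOF = -1          # assumed value of the C macro EOF
WEOF = 0xFFFF     # assumed value of WEOF for a 16-bit unsigned wint_t

# An error return from a wide function, seen as a 16-bit unsigned value.
result = ctypes.c_ushort(-1).value

print(result == WEOF)   # True
print(result == EOF)    # False
```

Under these assumptions, comparing the return value of a wide function against EOF would never detect the error.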

vstinner · 2009-04-18T17:27:42Z

I cannot see any remarks about masking behavior. :-(

I asked on a French Windows developer channel. The answer is that the
Windows terminal uses the "ANSI" charset even though it's possible to
use Unicode. So it's a bug in Microsoft's msvcrt library (directly in
the terminal implementation), not in Python.

Anyway I think that my patch (msvcrt_wchar.patch) makes the situation
better ;-)

vstinner · 2009-04-18T17:29:44Z

MSDN says _ungetwch returns WEOF instead of EOF when an error occurs.

Ok, I updated my patch (to use WEOF).

benjaminp · 2009-05-01T21:42:45Z

Applied in r72185.

ocean-city mannequin added the extension-modules and OS-windows labels Mar 3, 2009

ocean-city mannequin added the release-blocker label Apr 2, 2009

pitrou added the type-bug label Apr 4, 2009

benjaminp added deferred-blocker release-blocker and removed release-blocker deferred-blocker labels Apr 4, 2009

benjaminp closed this as completed May 1, 2009

ezio-melotti transferred this issue from another repository Apr 10, 2022
