Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msvcrt bytes cleanup #49660

Closed
ocean-city mannequin opened this issue Mar 3, 2009 · 11 comments
Closed

msvcrt bytes cleanup #49660

ocean-city mannequin opened this issue Mar 3, 2009 · 11 comments
Labels
extension-modules C modules in the Modules dir OS-windows release-blocker type-bug An unexpected behavior, bug, or error

Comments

@ocean-city
Copy link
Mannequin

ocean-city mannequin commented Mar 3, 2009

BPO 5410
Nosy @pitrou, @vstinner, @benjaminp
Dependencies
  • bpo-5499: only accept byte for getarg('c') and unicode for getarg('C')
  • Files
  • py3k_fix_msvcrt.patch
  • msvcrt_wchar.patch
  • msvcrt_wchar-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2009-05-01.21:42:45.276>
    created_at = <Date 2009-03-03.11:56:13.175>
    labels = ['extension-modules', 'type-bug', 'OS-windows', 'release-blocker']
    title = 'msvcrt bytes cleanup'
    updated_at = <Date 2009-05-01.21:42:45.274>
    user = 'https://bugs.python.org/ocean-city'

    bugs.python.org fields:

    activity = <Date 2009-05-01.21:42:45.274>
    actor = 'benjamin.peterson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2009-05-01.21:42:45.276>
    closer = 'benjamin.peterson'
    components = ['Extension Modules', 'Windows']
    creation = <Date 2009-03-03.11:56:13.175>
    creator = 'ocean-city'
    dependencies = ['5499']
    files = ['13233', '13351', '13718']
    hgrepos = []
    issue_num = 5410
    keywords = ['patch']
    message_count = 11.0
    messages = ['83071', '83685', '83686', '85182', '85410', '85426', '85455', '85461', '86124', '86125', '86916']
    nosy_count = 4.0
    nosy_names = ['pitrou', 'vstinner', 'ocean-city', 'benjamin.peterson']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue5410'
    versions = ['Python 3.1']

    @ocean-city
    Copy link
    Mannequin Author

    ocean-city mannequin commented Mar 3, 2009

    I came from bpo-5391. Here is quote of Victor's message.

    • msvcrt.putch(char), msvcrt.ungetch(char): msvcrt has also:
    • msvcrt.getch()->byte string of 1 byte
    • msvcrt.getwch()->unicode string of 1 character
    • msvcrt.putwch(unicode string of 1 character)
    • msvcrt_ungetwch(unicode string of 1 character)
      Hum, putch(), ungetch(), getch() use inconsistent types
      (unicode/bytes) and should be fixed. Another issue should be open for
      that.

    Notes: msvcrt.putwch() accepts string of length > 1 and
    msvcrt.ungetwch() doesn't check string length (and so may crash with
    length=0 or length > 1?).

    And msvcrt.ungetwch() calls _ungetch not _ungetwch. Here is the patch
    hopefully fixing these issue. (I cannot test wide version of functions
    because VC6 don't have them)

    @ocean-city ocean-city mannequin added extension-modules C modules in the Modules dir OS-windows labels Mar 3, 2009
    @vstinner
    Copy link
    Member

    msvcrt.ungetwch() calls _ungetch not _ungetwch

    ... are you sure that someone already used these functions? :-)

    If you suppose that bpo-5499 is fixed, you can leave msvcrt_putch()
    and msvcrt_ungetch unchanged and use "C" format in msvcrt_ungetwch()
    ("Py_UNICODE ch;" have to be replaced by "int ch;" for the
    format "C").

    @vstinner
    Copy link
    Member

    Patch implementing my proposition (depends on bpo-5499).

    @ocean-city ocean-city mannequin added the release-blocker label Apr 2, 2009
    @vstinner
    Copy link
    Member

    vstinner commented Apr 2, 2009

    bpo-5499 is fixed, so msvcrt_wchar.patch can now be used :-) Anyone
    available for a review and/or _a test_? I don't have Windows, so it's
    hard for me to test my patch.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 4, 2009

    There seems to be a problem with ungetwch():

    >>> s = msvcrt.getwch()
    # Here I type the Euro sign (€)
    >>> ascii(s)
    "'\\u20ac'"
    >>> msvcrt.ungetwch(s)
    >>> u = msvcrt.getwch()
    >>> ascii(u)
    "'\\xac'"

    @pitrou pitrou added the type-bug An unexpected behavior, bug, or error label Apr 4, 2009
    @benjaminp
    Copy link
    Contributor

    I think this can wait until the first beta.

    @vstinner
    Copy link
    Member

    vstinner commented Apr 5, 2009

    There seems to be a problem with ungetwch()

    I tested Visual C++ Express 2008 and it looks like _ungetwch() only
    keep 8 lower bits (like _ungetwch(x & 255)). But it's a bug in
    Microsoft library, not in Python code (I added some printf to be
    sure).

    My patch (msvcrt_wchar.patch) makes the situation better, but it's not
    perfect because of a bug in Microsoft's library.

    msvcrt.getwch() works correctly with characters with code > 255 (eg.
    euro sign, U+20ac, 8364 in decimal).

    @ocean-city
    Copy link
    Mannequin Author

    ocean-city mannequin commented Apr 5, 2009

    MSDN says _ungetwch returns WEOF instead of EOF when error occurs.
    http://msdn.microsoft.com/en-us/library/yezzac74(VS.80).aspx

    I cannot see any remarks about masking behavior. :-(

    @vstinner
    Copy link
    Member

    I cannot see any remarks about masking behavior. :-(

    I asked on a french Windows developer channel. The answer is that the
    Windows terminal uses "ANSI" charset even if it's possible to use
    unicode. So it's a bug in Microsoft msvcrt library (directly in the
    terminal implementation), not in Python.

    Anyway I think that my patch (msvcrt_wchar.patch) makes the situation
    better ;-)

    @vstinner
    Copy link
    Member

    MSDN says _ungetwch returns WEOF instead of EOF when error occurs.

    Ok, I updated my patch (to use WEOF).

    @benjaminp
    Copy link
    Contributor

    Applied in r72185.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir OS-windows release-blocker type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants