
Author vstinner
Recipients David.Sankel, amaury.forgeotdarc, brian.curtin, christian.heimes, christoph, davidsarah, ezio.melotti, hippietrail, lemburg, mark, pitrou, santoso.wijaya, ssbarnea, terry.reedy, tim.golden, tzot, v+python, vstinner
Date 2011-03-21.14:25:19
Message-id <1300717520.32.0.790975839962.issue1602@psf.upfronthosting.co.za>
Content
I did some tests with WriteConsoleW():
 - with raster fonts, U+00E9 is displayed as é, U+0141 as L and U+042D as ? => good (works as expected)
 - with a TrueType font (Lucida), U+00E9 is displayed as é, U+0141 as Ł and U+042D as Э => perfect! (all characters are rendered correctly)
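
A minimal sketch of how the three code points above could be sent to the console outside of Python. This is not the code used for the tests; the names `test_chars` and `write_test_chars` are made up for illustration, and the helper simply returns -1 when not built for Windows.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* U+00E9, U+0141 and U+042D as UTF-16 code units (all three are in the BMP,
   so each one is a single unit). */
static const uint16_t test_chars[] = { 0x00E9, 0x0141, 0x042D };
#define TEST_CHARS_LEN (sizeof(test_chars) / sizeof(test_chars[0]))

/* Hypothetical helper: write the test characters with WriteConsoleW().
   Returns the number of UTF-16 units written, or -1 on failure or when
   not compiled for Windows. */
static long
write_test_chars(void)
{
#ifdef _WIN32
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;

    if (console == INVALID_HANDLE_VALUE || console == NULL)
        return -1;
    if (!WriteConsoleW(console, test_chars, (DWORD)TEST_CHARS_LEN,
                       &written, NULL))
        return -1;
    return (long)written;
#else
    return -1;
#endif
}
```

WriteConsoleW() only works on a real console handle, so this helper is expected to fail if stdout is redirected to a file or a pipe.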

Now I agree that WriteConsoleW() is the best solution to fix this issue.

My test code (added to Python/sysmodule.c):
---------
static PyObject *
sys_write_stdout(PyObject *self, PyObject *args)
{
    PyObject *textobj;
    wchar_t *text;
    DWORD written, total;
    Py_ssize_t len, chunk;
    HANDLE console;
    BOOL ok;

    if (!PyArg_ParseTuple(args, "U:write_stdout", &textobj))
        return NULL;

    console = GetStdHandle(STD_OUTPUT_HANDLE);
    if (console == INVALID_HANDLE_VALUE) {
        PyErr_SetFromWindowsErr(GetLastError());
        return NULL;
    }

    text = PyUnicode_AS_UNICODE(textobj);
    len = PyUnicode_GET_SIZE(textobj);
    total = 0;
    while (len != 0) {
        /* WriteConsoleW() is limited to 64 KB (32,768 UTF-16 units), but
           this limit depends on the heap usage. Use a safe limit of 10,000
           UTF-16 units.
           http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1232 */
        if (len > 10000)
            chunk = 10000;
        else
            chunk = len;
        ok = WriteConsoleW(console, text, (DWORD)chunk, &written, NULL);
        /* Stop on error, and also if nothing was written, to avoid
           looping forever. */
        if (!ok || written == 0)
            break;
        text += written;
        len -= written;
        total += written;
    }
    return PyLong_FromUnsignedLong(total);
}
---------
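
The chunking loop can also be tried without a Windows console. The sketch below assumes a writer callback standing in for WriteConsoleW(); `write_fn`, `write_chunked`, `stub_write` and `MAX_CHUNK` are hypothetical names for illustration, not part of the test code above or of any Python API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHUNK 10000  /* same safe limit as in the test code above */

/* Hypothetical stand-in for WriteConsoleW(): writes up to `count` UTF-16
   units, stores the number written in *written, returns nonzero on success. */
typedef int (*write_fn)(const uint16_t *text, size_t count,
                        size_t *written, void *ctx);

/* Write `len` UTF-16 units in chunks of at most MAX_CHUNK units.
   Returns the total number of units written. */
static size_t
write_chunked(write_fn writer, void *ctx, const uint16_t *text, size_t len)
{
    size_t total = 0;

    assert(writer != NULL);
    while (len != 0) {
        size_t chunk = (len > MAX_CHUNK) ? MAX_CHUNK : len;
        size_t written = 0;

        /* Stop on error, or if no progress was made. */
        if (!writer(text, chunk, &written, ctx) || written == 0)
            break;
        assert(written <= chunk);
        text += written;
        len -= written;
        total += written;
    }
    return total;
}

/* Stub writer for experiments: accepts everything and records statistics. */
struct stub_stats {
    size_t calls;    /* number of times the writer was called */
    size_t largest;  /* largest chunk size seen */
};

static int
stub_write(const uint16_t *text, size_t count, size_t *written, void *ctx)
{
    struct stub_stats *stats = ctx;

    (void)text;
    stats->calls++;
    if (count > stats->largest)
        stats->largest = count;
    *written = count;
    return 1;
}
```

With this stub, a buffer of 25,000 units is written in three calls (10,000 + 10,000 + 5,000), and no single call exceeds the 10,000 unit limit.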


The question is now how to integrate WriteConsoleW() into Python without breaking the API, for example:
 - Should sys.stdout be a TextIOWrapper or not?
 - Should sys.stdout.fileno() return 1 or raise an error?
 - What about sys.stdout.buffer: should sys.stdout.buffer.write() call WriteConsoleA(), or should sys.stdout not have a buffer attribute at all? I think that many modules and programs now rely on sys.stdout.buffer to write bytes directly to stdout. There is at least python -m base64.
 - Should we use ReadConsoleW() for stdin?
History
Date                 User      Action  Args
2011-03-21 14:25:20  vstinner  set     recipients: + vstinner, lemburg, terry.reedy, tzot, amaury.forgeotdarc, pitrou, christian.heimes, tim.golden, mark, christoph, ezio.melotti, v+python, hippietrail, ssbarnea, brian.curtin, davidsarah, santoso.wijaya, David.Sankel
2011-03-21 14:25:20  vstinner  set     messageid: <1300717520.32.0.790975839962.issue1602@psf.upfronthosting.co.za>
2011-03-21 14:25:19  vstinner  link    issue1602 messages
2011-03-21 14:25:19  vstinner  create