Issue 700921: Wide-character curses


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/38140

classification

Title:	Wide-character curses
Type:	enhancement	Stage:	test needed
Components:	Extension Modules	Versions:

process

Status:	closed	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	akuchling, cben, gpolo, inigoserna, liori, moculus, mwh
Priority:	normal	Keywords:

Created on 2003-03-10 16:45 by cben, last changed 2022-04-10 16:07 by admin. This issue is now closed.

Messages (11)
msg61103 - (view)	Author: Cherniavsky Beni (cben) *	Date: 2003-03-10 16:45
There exists a standard for wide-character curses (http://www.opengroup.org/onlinepubs/7908799/cursesix.html) and at least ncurses implements it (almost completely; you need to enable this when configuring it). It is essential for getting the most out of modern UTF-8 terminals (e.g. xterm). It would make sense for Python's curses module to support all the wide functions on systems where the wide curses interface is present, especially since Python already supports unicode.
msg61104 - (view)	Author: Michael Hudson (mwh)	Date: 2003-03-10 17:48
Logged In: YES user_id=6656 What steps need to be taken to achieve this? Would you be interested in working up a patch? I do most of my terminal hacking below the level of curses these days...
msg61105 - (view)	Author: Cherniavsky Beni (cben) *	Date: 2003-03-10 20:55
Logged In: YES user_id=36166 Good question :-). Here are the basic additions of the wide curses interface:

* `chtype` (which must be an integral type) doesn't have enough room to hold a character OR-ed with the attributes, nor would that be useful enough, since combining characters must be handled. Therefore two types are introduced:
  attr_t - an integral type used to hold an OR-ed set of attributes that begin with the prefix ``WA_``. These attributes are semantically a superset of the ``A_`` ones and can have different values (although in ncurses they are the same).
  cchar_t - a type representing one character cell: at most one spacing character, an implementation-defined number of combining characters, attributes and a color pair.
* A whole lot of new functions are provided using these new types. The distinguishing naming style is the separation of words with underscores. Functions that work on single chars have counterparts (``_wch``) that receive/return cchar_t (except for get_wch which is a bogus mutation). Functions that work on strings have counterparts (``_wstr``) that receive/return (wchar_t *); many also are duplicated with a (cchar_t *) interface (``_wchstr``).
** All old functions having to do with characters are semantically just degenerate compatibility interfaces to the new ones.
* Semantics are defined for adding combining characters: if only non-spacing characters are given, they are added to the existing complex character; if a spacing character is present, the whole cell is replaced.
* Semantics are defined for double-width characters (what happens when you break them in various ways).

The simplest thing is just to wrap all the extra functions, exposing two APIs in Python, with the latter only available when the platform supports it. This would be painful to work with and I'd rather avoid it. A better approach is just to overload the old names to work with unicode strings.

For single-character methods (e.g. `addch`), it's harder. The (character ordinal | attributes) interface should be deprecated and only work for ASCII chars, in a backwards-compatible way. The interface where the character and attributes are given as separate arguments can be cleanly extended to accept unicode characters/ordinals. The behaviour w.r.t. combining and double-width characters should be defined. Complex chars should be represented as multi-char unicode strings (therefore unicode ordinals are a limited representation). I don't think anything special is needed for sensible double-width handling?

The (cchar_t *) interfaces (``_wchstr``) are convenient for storing many characters with individual attributes; I'm not sure how to expose them (list of char, attr tuples?).

There is the question of what to do in the absence of wide curses on the platform when the unicode interface is called. I think that some settable "curses default encoding" should be used as a fallback, so that people can keep their sanity. This should be specific to curses, or maybe even settable per-window, so that some basic input/output methods can be implemented as a codec (this is suboptimal but I think could be useful as a quick solution).

I can write an initial patch but don't expect it quickly. This could use the counsel of somebody with wide-curses experience (I'm a newbie to this, I want to start experimenting in Python rather than C :-).
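
The combining-character rule in the message above can be modelled in a few lines of pure Python. This is only a sketch of the semantics, not code from the curses module: `is_spacing` and `put_cell` are hypothetical names, a cell is represented as a plain unicode string, and treating Unicode categories Mn and Me as the only non-spacing characters is a simplifying assumption.

```python
import unicodedata


def is_spacing(ch):
    """Return True if ch is assumed to occupy a cell by itself.

    Assumption: only nonspacing (Mn) and enclosing (Me) marks count
    as non-spacing; a real implementation would consult wcwidth().
    """
    return unicodedata.category(ch) not in ("Mn", "Me")


def put_cell(existing, new):
    """Return the content of a cell after writing `new` into it.

    If `new` contains a spacing character, the whole cell is replaced;
    if it holds only combining marks, they are appended to the cell.
    """
    if any(is_spacing(ch) for ch in new):
        return new
    return existing + new
```

For example, `put_cell("e", "\u0301")` keeps the `e` and adds the acute accent, while `put_cell("e\u0301", "a")` replaces the whole cell with `a`.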
msg61106 - (view)	Author: Cherniavsky Beni (cben) *	Date: 2003-03-11 13:38
Logged In: YES user_id=36166 One more issue: constants. I think the ``A_`` attribute constants should not be duplicated as ``WA_``; rather the same ``A_`` values will be accepted in all cases and translated to the corresponding ``WA_`` values when passing them to wide curses functions. ``WA_`` attributes that have no ``A_`` counterpart in curses should get one, possibly a long int: A_{HORIZONTAL,LEFT,LOW,RIGHT,TOP,VERTICAL}. Wait, we already define them ;-). But they might be available on some platforms only as ``WA_`` and not ``A_``...

The ``ACS_`` constants also have ``WACS_`` counterparts that are (cchar_t *). Probably the latter should be exposed as unicode strings instead.

Other constants added in XSI curses are:

* EOF/WEOF (actually declared in <stdio.h>, <wchar.h>; returned by `.getch()`) - should be either exposed in the module or converted to something sensible (None?). getkey() currently segfaults(!) on end of file; it should return an empty string for consistency with `file.read()`...
* KEY_CODE_YES - return value of `get_wch()` to indicate the code is a key. Should not be exposed.

Update to previous message: the ``_wchstr`` family need not be supported, at least as long as the ``chstr`` family isn't (I didn't notice there is such a thing). In the long run, it might be useful (but then a completely new high-level wrapper of windows as directly subscriptable/sliceable 2D arrays would be even better... Let's leave it for now).
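
The proposed ``A_`` to ``WA_`` translation could look roughly like the sketch below. The bit values are made up purely for illustration (as noted earlier in the thread, in ncurses the two sets share the same values, so the mapping would be the identity there), and `to_wide_attrs` is a hypothetical helper, not an existing curses function.

```python
# Made-up bit values, chosen to differ so the translation is visible.
A_BOLD, A_UNDERLINE, A_REVERSE = 0x1, 0x2, 0x4
WA_BOLD, WA_UNDERLINE, WA_REVERSE = 0x10, 0x20, 0x40

# Hypothetical table pairing each A_ bit with its WA_ counterpart.
_A_TO_WA = {
    A_BOLD: WA_BOLD,
    A_UNDERLINE: WA_UNDERLINE,
    A_REVERSE: WA_REVERSE,
}


def to_wide_attrs(attrs):
    """Translate an OR-ed set of A_ bits into the matching WA_ bits."""
    wide = 0
    for a_bit, wa_bit in _A_TO_WA.items():
        if attrs & a_bit:
            wide |= wa_bit
    return wide
```

With such a helper the Python level would expose only the ``A_`` names, and the wrapper would call it just before handing attributes to a wide curses function.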
msg61107 - (view)	Author: Michael Hudson (mwh)	Date: 2003-03-11 14:06
Logged In: YES user_id=6656 It sounds like you've given this more thought already than I'm likely to for some time :-) This is unlikely to make 2.3, I'd have thought, so there's no hurry, really. I can answer any questions about details of implementation and review any code you produce, but I think you have a better handle on design issues.
msg61108 - (view)	Author: Cherniavsky Beni (cben) *	Date: 2003-03-11 14:49
Logged In: YES user_id=36166 OK. I don't care much whether it gets official quickly, I do want something to play with :-) (of course I want it to become official when it's good enough). I've looked a bit at the code. The prospect of double-casing all functions to use the wide interface is not pleasant but I don't see a better alternative yet; I'd like to factor things out if I find a way... And yes, any code I produce will certainly require review :-).
msg61109 - (view)	Author: Michael Hudson (mwh)	Date: 2003-03-11 14:52
Logged In: YES user_id=6656 Cool. You should be able to extend the current preprocessor hackery to relieve some of the drudgery, shouldn't you?
msg61110 - (view)	Author: Erik Osheim (moculus)	Date: 2005-03-30 17:29
Logged In: YES user_id=542811 Is there any work still being done on this? I have been hoping for a while that a feature like this would be included, and it's more than two years since this request was posted.
msg61111 - (view)	Author: A.M. Kuchling (akuchling) *	Date: 2006-07-26 19:57
Logged In: YES user_id=11375 I don't know of anyone working on this; feel free to try yourself. I'll continue to maintain the curses module, but don't plan to embark on any significant new work (adding Unicode support, adding wide-character features). If I have an hour of Python hacking time, I'll spend that time on processing bugs or patches, not on enhancing a module that few people use. So, if you'd like to see wide-character support Note that in Python 2.5 the curses module links against ncursesw. It may be possible to display Unicode characters by setting the locale to something that supports UTF-8 and then sending UTF-8 strings to methods such as addstr().
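
A minimal sketch of the workaround suggested above, assuming a curses build linked against ncursesw and a UTF-8 terminal. `curses_bytes` and `demo` are hypothetical names; the `errors="replace"` fallback is an assumption of this sketch, loosely following the "curses default encoding" idea from earlier in the thread.

```python
import locale


def curses_bytes(text, encoding=None):
    """Encode text for a byte-oriented curses call such as addstr().

    Falls back to the locale's preferred encoding; characters that the
    encoding cannot represent are replaced with '?'.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False) or "ascii"
    return text.encode(encoding, errors="replace")


def demo(stdscr):
    # Needs a real terminal; run with curses.wrapper(demo) after
    # calling locale.setlocale(locale.LC_ALL, "") at program start.
    stdscr.addstr(0, 0, curses_bytes("na\u00efve caf\u00e9", "utf-8"))
    stdscr.refresh()
    stdscr.getch()
```

Only `curses_bytes` can be exercised without a terminal; whether the bytes render correctly depends on the locale and on how the underlying curses library was built.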
msg91869 - (view)	Author: Iñigo Serna (inigoserna)	Date: 2009-08-22 18:26
In issue6755 I provide a patch to support get_wch, which is the only wide-character feature I miss in the current bindings (as of Python v2.6.2 or v3.1.1).
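
For readers on later Python versions: `window.get_wch()` is available from Python 3.3 onward and returns a str for ordinary characters and an int for function and keypad keys. The sketch below shows one way a caller might tell the two apart; `classify_key` and `read_key` are hypothetical helpers and are not taken from the patch in issue6755.

```python
def classify_key(key):
    """Tag a get_wch() result as a character or a special key code."""
    if isinstance(key, str):
        return ("char", key)
    return ("key", key)


def read_key(win):
    """Read one key from a curses window object and classify it.

    `win` only needs a get_wch() method; a real window requires a
    terminal, so tests can pass in a stand-in object instead.
    """
    return classify_key(win.get_wch())
```

This keeps the C-level KEY_CODE_YES status out of the Python API, as proposed earlier in the thread: the Python type of the result carries the same information.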
msg91870 - (view)	Author: Guilherme Polo (gpolo) *	Date: 2009-08-22 18:31
Closing this in favour of issue6755; it has been so long since any discussion took place here that I believe it is better to continue in that related issue.

History
Date	User	Action	Args
2022-04-10 16:07:32	admin	set	github: 38140
2009-08-22 18:31:52	gpolo	set	status: open -> closed
2009-08-22 18:31:41	gpolo	set	nosy: + gpolo messages: + msg91870
2009-08-22 18:26:05	inigoserna	set	nosy: + inigoserna messages: + msg91869
2009-04-29 18:25:13	liori	set	nosy: + liori
2009-02-12 03:23:12	ajaksu2	set	stage: test needed
2003-03-10 16:45:09	cben	create