Author ocean-city
Recipients amaury.forgeotdarc, ocean-city, vstinner
Date 2008-10-06.06:49:47
Message-id <1223275850.67.0.522957441889.issue2382@psf.upfronthosting.co.za>
In-reply-to
Content
>At least my "one unicode char is one space" suggestion corrects the case 
>of Western languages, and all messages with single-width characters.

I'm not happy with this solution. ;-(

>Doesn't the exact width depend on 
>the terminal capabilities? and fonts, and combining diacritics...

I have to admit you are right. 

Nevertheless, I have coLinux (Debian), which provides a locale-aware
wcswidth(3), so I created another experimental patch
(py3k_adjust_cursor_at_syntax_error_v2.patch).

The strategy is:
1. Try to convert to unicode. If that fails, leave the offset unchanged.
2. If the system has wcswidth(3), try that function.
3. If the system is Windows, try WideCharToMultiByte with CP_ACP.
4. If step 2 or 3 fails, or on any other system, use the unicode length as
the offset (Amaury's suggestion).
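
To make the fallback order concrete, here is a rough Python sketch of the same idea. It is not the patch itself (the patch is C): `display_width` is a hypothetical stand-in for wcswidth(3) built on `unicodedata`, `adjust_offset` is an illustrative name, and the `encoding` parameter defaulting to UTF-8 is an assumption made only for this example.

```python
import unicodedata


def display_width(text):
    """Hypothetical stand-in for wcswidth(3).

    Returns the approximate number of terminal columns, or -1 if the
    text contains a control character (wcswidth also returns -1 for
    non-printable characters).
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) == "Cc":
            return -1
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Wide and Fullwidth East Asian characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def adjust_offset(line_bytes, byte_offset, encoding="utf-8",
                  width_func=display_width):
    """Turn a byte offset into a caret column, best effort."""
    # Step 1: try to convert to unicode; on failure keep the offset as is.
    try:
        prefix = line_bytes[:byte_offset].decode(encoding)
    except UnicodeDecodeError:
        return byte_offset
    # Steps 2/3: ask the platform-like width function, if there is one.
    width = width_func(prefix) if width_func is not None else -1
    # Step 4: if that failed, fall back to the unicode length.
    if width < 0:
        return len(prefix)
    return width
```

For example, with the line `あい = )` and the byte offset just after `あい` (6 bytes in UTF-8), the sketch puts the caret at column 4, while the plain unicode length would give 2.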

This patch ignores the file encoding. Again, it is experimental and
best effort, but maybe better than the current state.

P.S.
I tested this patch on coLinux with the ja_JP.UTF-8 locale and a manual
#define HAVE_WCSWIDTH 1
because I don't know how to change the configure script.
History
Date User Action Args
2008-10-06 06:50:50 ocean-city set recipients: + ocean-city, amaury.forgeotdarc, vstinner
2008-10-06 06:50:50 ocean-city set messageid: <1223275850.67.0.522957441889.issue2382@psf.upfronthosting.co.za>
2008-10-06 06:49:49 ocean-city link issue2382 messages
2008-10-06 06:49:48 ocean-city create