Message 295513 - Python tracker

Author	steven.daprano
Recipients	Vanessa McHale, steven.daprano
Date	2017-06-09.11:03:00
Message-id	<1497006181.86.0.940527934905.issue30608@psf.upfronthosting.co.za>
In-reply-to

Content

I don't really understand your example code. What result did you expect? The output shown on GitHub seems correct to me:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXXXXX
                        Lanugage for output
  --language2 LANGUAGE  Lanugage for output

I've substituted "X" for the "missing characters" that show up, as I don't have a Tibetan font installed.
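
As a rough illustration of how a layout like the one above comes about, here is a minimal sketch. It is not the reporter's actual code: the parser, the option names' help text, and the metavar (a hypothetical Tibetan string of 11 code points, written as escapes) are all assumptions, and the width is pinned to 80 so the result doesn't depend on the terminal.

```python
import argparse

# Hypothetical metavar: 11 code points, two of which are combining vowel signs.
TIBETAN_METAVAR = "\u0f56\u0f7c\u0f51\u0f0b\u0f66\u0f90\u0f51\u0f0b\u0f61\u0f72\u0f42"

# Pin the width so the layout does not depend on the current terminal size.
parser = argparse.ArgumentParser(
    prog="demo",
    formatter_class=lambda prog: argparse.HelpFormatter(prog, width=80),
)
parser.add_argument("--language1", metavar=TIBETAN_METAVAR,
                    help="Language for output")
parser.add_argument("--language2", metavar="LANGUAGE",
                    help="Language for output")

help_text = parser.format_help()
print(help_text)
```

With these assumptions, the `--language1` entry is measured by its code point count, which is too long for the help column, so its help text moves to the next line, while `--language2` keeps its help text on the same line.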

This is a more complicated "bug" (feature?) than it might seem, and I don't think it is really an argparse issue so much as a string issue. The length of a string is the number of code points in it, without trying to distinguish zero-width code points and combining characters from the rest.
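
To make the point about code points concrete, here is a minimal sketch. The sample string is an assumption chosen for illustration (three Tibetan code points written as escapes), not the text from the original report.

```python
import unicodedata

# Hypothetical sample: a base letter, a combining vowel sign, a base letter.
text = "\u0f56\u0f7c\u0f51"

# len() counts code points, regardless of how they are rendered.
code_points = len(text)

# Count the non-spacing marks (category "Mn"), which a font that handles
# Tibetan would normally draw on top of the preceding letter.
marks = sum(1 for ch in text if unicodedata.category(ch) == "Mn")

for ch in text:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
print("code points:", code_points, "non-spacing marks:", marks)
```

Here `len()` reports 3 even though one of the code points is a non-spacing mark, so a string can have more code points than the columns it occupies when rendered properly.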

I don't believe that argparse has any way of knowing how the string will be displayed. It could be displayed as:

- a series of 10 "missing character" square glyphs;
- the correct glyphs, but still 10 columns wide (if the font has glyphs for Tibetan, but does not render the vowel markers correctly);
- or it might render the text properly, according to the rules for Tibetan, requiring fewer than 10 (I guess) columns.

I believe that, unfortunately, the only way to distinguish those three scenarios would be to render the text in a GUI framework with a rich text widget capable of measuring the *width* of text in pixels.

argparse works in a console, so it is limited to the typefaces the console supports and cannot get the pixel width. I think the only safe way to proceed is to count code points (i.e. the length as reported by Python strings) and assume each code point requires one column. That way you can be reasonably confident that the string won't be any more than that number of columns wide.

(Even that might be wrong, if the string includes full-width East Asian code points, which may take two columns each.)
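
For comparison, a column estimate that does try to account for zero-width and full-width code points might look like the sketch below. This is only a heuristic under stated assumptions, not anything argparse does: the function name `estimated_columns` is hypothetical, and treating categories Mn, Me and Cf as zero-width is a guess that a given console or font may not honour.

```python
import unicodedata


def estimated_columns(text):
    """Guess how many console columns `text` needs (heuristic only)."""
    columns = 0
    for ch in text:
        # Assumption: non-spacing marks, enclosing marks and format
        # characters take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # Wide ("W") and full-width ("F") characters usually take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            columns += 2
        else:
            columns += 1
    return columns


samples = ["abc", "e\u0301", "\u65e5\u672c", "\u0f56\u0f7c\u0f51"]
for sample in samples:
    print(len(sample), estimated_columns(sample))
```

Note that the estimate can come out both smaller than `len()` (combining marks) and larger (full-width characters), and it is only correct if the console actually renders the text that way.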

I don't think there is any good solution here, but I think the status quo might be the least worst. If argparse assumes that the vowel markers are zero-width, it will format the output correctly:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXX   Lanugage for output
  --language2 LANGUAGE  Lanugage for output

but only for those who have the correct Tibetan typeface installed. Everyone else will see:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXXXXX   Lanugage for output
  --language2 LANGUAGE  Lanugage for output

(By the way, I'm guessing what the output might be -- I don't know Tibetan and don't know how many columns the correctly displayed string will take.)

Vanessa, if my analysis is wrong in any way, or if you can think of a patch to argparse that will solve this issue, please tell us.

Otherwise, I think this has to be treated as "won't fix".

History
Date	User	Action	Args
2017-06-09 11:03:01	steven.daprano	set	recipients: + steven.daprano, Vanessa McHale
2017-06-09 11:03:01	steven.daprano	set	messageid: <1497006181.86.0.940527934905.issue30608@psf.upfronthosting.co.za>
2017-06-09 11:03:01	steven.daprano	link	issue30608 messages
2017-06-09 11:03:00	steven.daprano	create