
Author steven.daprano
Recipients Vanessa McHale, steven.daprano
Date 2017-06-09.11:03:00
Message-id <1497006181.86.0.940527934905.issue30608@psf.upfronthosting.co.za>
Content
I don't really understand your example code. What result did you expect? The output shown on GitHub seems correct to me:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXXXXX
                        Lanugage for output
  --language2 LANGUAGE  Lanugage for output


I've substituted "X" for the "missing characters" that show up, as I don't have a Tibetan font installed.

This is a more complicated "bug" (feature?) than it might seem, and I don't think it is really an argparse issue so much as a string issue. The length of a string is the number of code points in it, without trying to distinguish zero-width code points and combining characters from the rest.
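
To illustrate the point about string length, here is a minimal sketch. The sample string is my own choice (the Tibetan letter BA, the vowel sign O, and the letter DA), not the string from the report, and `count_nonspacing` is a hypothetical helper, not anything in argparse or the standard library.

```python
import unicodedata

def count_nonspacing(text):
    """Count code points that Unicode classifies as nonspacing (Mn) or enclosing (Me) marks."""
    return sum(1 for ch in text if unicodedata.category(ch) in ("Mn", "Me"))

# Assumed sample: U+0F56 (letter BA), U+0F7C (vowel sign O), U+0F51 (letter DA).
sample = "\u0f56\u0f7c\u0f51"

print(len(sample))               # len() counts every code point, marks included
print(count_nonspacing(sample))  # how many of those are combining marks
```

Here len() reports 3 even though one of the three code points is a combining mark that a suitable font would draw on top of its base letter.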

I don't believe that argparse has any way of knowing how the string will be displayed. It could be displayed as:

- a series of 10 "missing character" square glyphs;
- the correct glyphs, but still 10 columns wide (if the font has glyphs for Tibetan, but does not render the vowel markers correctly);
- or the properly rendered text, according to the rules for Tibetan, requiring (I guess) fewer than 10 columns.

I believe that, unfortunately, the only way to distinguish those three scenarios would be to render the text in a GUI framework with a rich text widget capable of measuring the *width* of text in pixels.

Because argparse works in a console, it is limited to the typefaces the console supports, and it cannot get the pixel width. I think the only safe way to proceed is to count code points (i.e. the length as reported by Python strings) and assume each code point requires one column. That way you can be reasonably confident that the string won't be any more than that number of columns wide.

(Even that might be wrong, if the string includes full-width Asian code points, which may take two columns each.)
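
For comparison, here is a rough sketch of what a column estimate based on the Unicode database might look like. `estimated_columns` is a hypothetical helper, not part of argparse, and it is only a heuristic: it assumes nonspacing marks take no column and that wide or full-width East Asian code points take two, which is true only if the console and font actually render them that way.

```python
import unicodedata

def estimated_columns(text):
    """Guess the number of console columns needed for text.

    Assumptions: Mn/Me marks occupy zero columns, East Asian Wide (W) and
    Fullwidth (F) code points occupy two, and everything else occupies one.
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue  # assumed to be drawn on top of the previous character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

print(estimated_columns("LANGUAGE"))            # plain ASCII: one column each
print(estimated_columns("\u65e5\u672c\u8a9e"))  # three wide CJK ideographs
print(estimated_columns("e\u0301"))             # "e" plus a combining acute accent
```

Under those assumptions the three calls give 8, 6 and 1, where len() would give 8, 3 and 2. The estimate says nothing about what a particular terminal will really draw.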

I don't think there is any good solution here, but I think the status quo might be the least worst. If argparse assumes that the vowel markers are zero-width, it will format the output correctly:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXX   Lanugage for output
  --language2 LANGUAGE  Lanugage for output


but only for those who have the correct Tibetan typeface installed. Everyone else will see:

optional arguments:
  -h, --help            show this help message and exit
  --language1 XXXXXXXXXX   Lanugage for output
  --language2 LANGUAGE  Lanugage for output


(By the way, I'm guessing what the output might be -- I don't know Tibetan and don't know how many columns the correctly displayed string will take.)

Vanessa, if my analysis is wrong in any way, or if you can think of a patch to argparse that will solve this issue, please tell us.

Otherwise, I think this has to be treated as "won't fix".