Message 216817 - Python tracker


Author	terry.reedy
Recipients	bgailer, docs@python, josh.r, lilbludot, martin.panter, terry.reedy
Date	2014-04-19.00:08:45
Message-id	<1397866128.12.0.739102602152.issue21279@psf.upfronthosting.co.za>
In-reply-to

Content
The docstring is more accurate:
>>> str.translate.__doc__
'S.translate(table) -> str\n\nReturn a copy of the string S, where all characters have been mapped\nthrough the given translation table, which must be a mapping of\nUnicode ordinals to Unicode ordinals, strings, or None.\nUnmapped characters are left untouched. Characters mapped to None\nare deleted.'

To me, even this is a bit unclear on exceptions and 'unmapped'. Based on experiments and then reading the C source, I determined that a LookupError means 'unmapped', while any other exception is passed on and terminates the translation.

"Return a copy of the string S, where all characters have been mapped through the given translation table. When subscripted by a Unicode ordinal (an integer in range(0x110000)), the table must return a Unicode ordinal, string, or None, or else raise a LookupError. A LookupError, which includes instances of the subclasses IndexError and KeyError, indicates that the character is unmapped and should be left untouched. Characters mapped to None are deleted."

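class Table:
    def __getitem__(self, key):
        if key == 99:      # 'c': unmapped, left untouched
            raise LookupError()
        elif key == 100:   # 'd': mapped to None, deleted
            return None
        elif key == 101:   # 'e': replaced by a string
            return 'xyz'
        else:              # anything else: the next code point
            return key + 1

print('abcdef'.translate(Table()))
# bccxyzg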
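
As a further sketch of the same rule, ordinary containers should behave like the Table class above: a dict signals an unmapped character with KeyError and a list with IndexError, while an exception that is not a LookupError propagates out of translate(). The BrokenTable class and the variable names here are invented for this illustration.

```python
# A dict raises KeyError for missing keys, so 'c' is left untouched.
mapping = {ord('a'): 'A', ord('b'): None}
dict_result = 'abc'.translate(mapping)

# A list raises IndexError past its end; only ordinals 0..97 are mapped,
# so 'a' (97) is replaced while 'b' and 'c' are left untouched.
sequence = ['?'] * 98
seq_result = 'abc'.translate(sequence)


class BrokenTable:
    """Hypothetical table whose lookup fails with a non-LookupError."""

    def __getitem__(self, key):
        raise ValueError('not a LookupError')


try:
    'abc'.translate(BrokenTable())
    propagated = False
except ValueError:
    # The ValueError is not swallowed; it ends the translation.
    propagated = True

print(dict_result, seq_result, propagated)
# Ac ?bc True
```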

The current doc ends with "Note
An even more flexible approach is to create a custom character mapping codec using the codecs module (see encodings.cp1251 for an example)."

I don't see how this is supposed to help: encodings.cp1251 just uses a string of 256 characters as a lookup table.
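
To make that concrete, here is a small sketch (not taken from the docs) that inspects the table the cp1251 codec uses. It assumes the module-level name decoding_table, which is what the CPython source of encodings/cp1251.py defines. Because a str supports integer subscripting and raises IndexError out of range, the same 256-character string can be handed directly to translate().

```python
import codecs
import encodings.cp1251 as cp1251

# The codec's decoding table is a plain str indexed by byte value.
table = cp1251.decoding_table
table_len = len(table)

# What the codec itself does with the table.
decoded, consumed = codecs.charmap_decode(b'\xc0\xc1', 'strict', table)

# The same string used as a translate() table: ordinals 0..255 are looked
# up by index; U+03A9 is out of range (IndexError), so it is left untouched.
translated = '\xc0\xc1\u03a9'.translate(table)

print(table_len, decoded, consumed, translated)
```

So the codec route appears to offer nothing that a suitable table passed to translate() does not already cover.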
