Issue403100
Created on 2001-01-04 17:50 by doerwalter, last changed 2022-04-10 16:03 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
None | doerwalter, 2001-01-04 17:50 | None |
Messages (9) | |||
---|---|---|---|
msg53079 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2001-01-04 17:50 | |
This patch modifies PyUnicode_TranslateCharmap in Objects/unicodeobject.c so that the error PyErr_SetString(PyExc_NotImplementedError, "1-n mappings are currently not implemented"); no longer occurs, i.e. u"ab".translate({ord(u"a"): u"bbb", ord(u"b"): u"aaa"}) now works. It does this by growing the result string exponentially whenever it runs out of space. |
|||
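For reference, the behaviour the patch asks for can be checked in a current Python 3 interpreter, where `str.translate` accepts replacement strings of any length. This is only an illustration of the requested result, not the patched C code; the names `table` and `result` are made up for the example.

```python
# A 1-n mapping: each source character expands to three characters.
table = {ord("a"): "bbb", ord("b"): "aaa"}

result = "ab".translate(table)
print(result)  # bbbaaa
```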
msg53080 - (view) | Author: Nobody/Anonymous (nobody) | Date: 2001-01-04 18:33 | |
I like the idea, but the implementation needs some reworking: the common case is 1-1 mapping, so this should be as fast as possible; extra size checks slow things down too much. You can take a different approach, though: leave things as they are and only add a special case for 1-n mappings which resizes depending on how many extra chars are inserted. Then, as a final step, if resizing occurred, call _PyUnicode_Resize() to cut the allocated buffer down to its true size. -- Marc-Andre |
|||
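The special-case approach suggested above might look roughly like the following pure-Python sketch. It is an assumption-laden model, not the code that was checked in: a list of one-character slots stands in for the C buffer, `translate_special_case` is a hypothetical name, and the mapping is assumed to contain only string replacements (no `None` or integer values).

```python
def translate_special_case(text, mapping):
    """Model of the suggestion: fast 1-1 path, resize only for 1-n mappings."""
    buf = [""] * len(text)  # initial buffer: exactly one slot per source char
    pos = 0
    for ch in text:
        repl = mapping.get(ord(ch), ch)  # unmapped characters pass through
        if len(repl) == 1:
            # Common 1-1 case: no size check is needed, because every 1-n
            # replacement below grows the buffer by exactly its extra chars.
            buf[pos] = repl
            pos += 1
        else:
            extra = len(repl) - 1
            if extra > 0:
                buf.extend([""] * extra)  # resize by the number of extra chars
            buf[pos:pos + len(repl)] = repl
            pos += len(repl)
    # Final step: cut the buffer down to its true size
    # (the role _PyUnicode_Resize() plays in the C suggestion).
    return "".join(buf[:pos])


mapping = {ord("a"): "bbb", ord("b"): "aaa"}
print(translate_special_case("ab", mapping))  # bbbaaa
```

The design point is that the 1-1 branch stays free of bounds checks, at the cost of one resize per expanding replacement.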
msg53081 - (view) | Author: Nobody/Anonymous (nobody) | Date: 2001-01-05 18:45 | |
I'll check in a patch for this tomorrow that implements what I had in mind. The patch doesn't change the performance of the charmap codec. Thanks, -- Marc-Andre |
|||
msg53082 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2001-01-06 15:03 | |
Checked in a different patch providing the same functionality. Please see the CVS checkin message for details. |
|||
msg53083 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2001-01-05 17:07 | |
The problem is that you can't know beforehand how long the result string will be, i.e. whether any 1-n replacements will really happen. It would be possible to loop through the replacement strings and see if any are longer than one character, but even if there are, you don't know whether they will really be used. So you have three choices: (1) guess how much space you need and reallocate when the space is not enough, (2) do a dry run of the algorithm to count how much space you need, then run the algorithm a second time and this time use the strings, or (3) keep the strings in a list and join the list into one string at the end. For the case of a 1-1 mapping the following will happen: (1) The first allocation has exactly the right amount of space, so there won't be any reallocations, but a size check will be done for every character (which should be only a few assembler instructions). The mapping will have to be accessed once for every character in the source string. (2) There will be only one allocation, but for every character in the source string the mapping has to be accessed twice, which involves calls to Python functions, exception handling etc. (3) You have to make as many memory allocations as there are parts of the final string that you create, including error handling etc. I think (1) is clearly the fastest method. |
|||
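The three choices listed above can be modelled side by side in Python. This is a hedged sketch for comparison only: the real work happens in C on a raw buffer, the function names and the `lookup` helper are hypothetical, and the mapping is assumed to hold only string replacements.

```python
def lookup(mapping, ch):
    """Hypothetical helper: replacement for ch, or ch itself if unmapped."""
    return mapping.get(ord(ch), ch)


def translate_grow(text, mapping):
    """Choice (1): guess len(text) slots, double the buffer when it is full."""
    buf = [""] * max(len(text), 1)
    used = 0
    for ch in text:
        repl = lookup(mapping, ch)
        while used + len(repl) > len(buf):  # the per-character size check
            buf.extend([""] * len(buf))     # exponential (doubling) growth
        buf[used:used + len(repl)] = repl
        used += len(repl)
    return "".join(buf[:used])              # trim to the space actually used


def translate_two_pass(text, mapping):
    """Choice (2): dry run to count the size, then fill an exact buffer."""
    size = sum(len(lookup(mapping, ch)) for ch in text)  # first mapping access
    buf = [""] * size
    pos = 0
    for ch in text:
        repl = lookup(mapping, ch)                       # second mapping access
        buf[pos:pos + len(repl)] = repl
        pos += len(repl)
    return "".join(buf)


def translate_join(text, mapping):
    """Choice (3): keep the parts in a list and join them at the end."""
    parts = [lookup(mapping, ch) for ch in text]
    return "".join(parts)


mapping = {ord("a"): "bbb", ord("b"): "aaa"}
for fn in (translate_grow, translate_two_pass, translate_join):
    assert fn("ab", mapping) == "ab".translate(mapping)
```

All three produce the same string; they differ only in how often the mapping is consulted and how memory is allocated, which is the trade-off the message weighs.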
msg53084 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2001-06-07 10:09 | |
Logged In: YES user_id=89016 The patch that was checked in changes PyUnicode_DecodeCharmap and PyUnicode_EncodeCharmap, but not PyUnicode_TranslateCharmap, where this functionality is also useful (e.g. for u"<foo>".translate({ord("<"): u"&lt;", ord(">"): u"&gt;"})). |
|||
msg53085 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2001-06-07 12:32 | |
Logged In: YES user_id=38388 Reopened. This should really be marked as feature request but for some reason SF won't let me change the Data Type. |
|||
msg53086 - (view) | Author: Tim Peters (tim.peters) * | Date: 2001-08-09 21:02 | |
Logged In: YES user_id=31435 Changed to Feature Requests, at MvL's request. |
|||
msg53087 - (view) | Author: Walter Dörwald (doerwalter) * | Date: 2002-09-04 20:37 | |
Logged In: YES user_id=89016 This is implemented by the PEP 293 patch. Closing the request. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:03:35 | admin | set | github: 33662 |
2001-01-04 17:50:43 | doerwalter | create |