classification
Title: Standard string encodings should include GSM 03.38
Type: enhancement Stage: resolved
Components: Library (Lib), Unicode Versions: Python 3.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: BreamoreBoy, eric.araujo, haypo, jwishnie, lemburg, pitrou
Priority: normal Keywords:

Created on 2009-06-21 19:28 by jwishnie, last changed 2014-10-01 01:55 by berker.peksag. This issue is now closed.

Messages (5)
msg89576 - Author: (jwishnie) Date: 2009-06-21 19:28
The standard string codecs for converting from unicode to strs do not 
include the GSM 03.38 char mapping used by GSM services (like SMS).

I've written a codec for my use based on 'char_mapper' and the skeleton 
from gencodec.py, though it's a little complicated by the fact that the 
GSM encoding is semi-multibyte and not just a straight table look-up.

Gory details here in the comments:
http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT
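
To illustrate the "semi-multibyte" point: most characters map to a single code, but a few (such as the euro sign and the brackets) are sent as the escape code 0x1B followed by a code from an extension table. The following is only a minimal sketch, not the pygsm codec: the names gsm_encode_sketch, BASIC and EXTENSION are hypothetical, the BASIC table is a deliberately small subset of the mapping file linked above, and the output uses one byte per character without the 7-bit packing used on the wire.

```python
import string

ESCAPE = 0x1B  # escape code that introduces the extension table

# Partial, illustrative subset of the basic table (assumption: not complete).
# ASCII letters, digits and space keep their ASCII values in GSM 03.38.
BASIC = {c: ord(c) for c in string.ascii_letters + string.digits + " "}
BASIC.update({"@": 0x00, "£": 0x01, "$": 0x02, "_": 0x11})

# Characters that need two codes: ESCAPE followed by the value below.
EXTENSION = {
    "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F, "[": 0x3C,
    "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}


def gsm_encode_sketch(text):
    """Encode text to unpacked GSM 03.38 codes, one code per byte."""
    out = bytearray()
    for i, ch in enumerate(text):
        if ch in BASIC:
            out.append(BASIC[ch])
        elif ch in EXTENSION:
            # Not a plain table look-up: one character becomes two codes.
            out.append(ESCAPE)
            out.append(EXTENSION[ch])
        else:
            raise UnicodeEncodeError(
                "gsm-sketch", text, i, i + 1, "character not in sketch tables"
            )
    return bytes(out)
```

For example, gsm_encode_sketch("a{b") yields four bytes for three characters, which is why a simple character-map codec is not enough.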

The codec is available here:
http://github.com/jwishnie/pygsm/tree/f574f6db99c585f785f0c73a080814c0432c6087/pygsm/gsmcodecs

Please consider it, or an optimized/improved version of it, for 
inclusion with the standard codecs distributed with Python.
msg89590 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-06-21 22:19
You should provide your code as a patch against the Python trunk. Also,
unit tests should probably be part of Lib/test/test_codecs.py.
msg105397 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-05-09 14:32
Are there many GSM libraries or applications out there? If not, maybe the codec is best left in your lib, since it wouldn’t be useful for a wide range of uses.

Note also that 2.7 is frozen, so substitute “py3k branch” for “trunk” in Antoine’s previous comment.
msg228008 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-09-30 21:44
@jwishnie can you provide a patch for this, as without it the issue goes nowhere?
msg228020 - Author: STINNER Victor (haypo) * (Python committer) Date: 2014-09-30 22:44
Since the codec has only been requested once, 5 years ago, I consider that there is not enough interest to add yet another encoding. Python already supports a lot of encodings.

It's easy to use your custom codec without having to modify Python. Just register it:
https://docs.python.org/dev/library/codecs.html#codecs.register
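
A minimal sketch of that registration follows. It is an illustration only: the codec name "toy_gsm", the functions toy_encode, toy_decode and toy_search, and the tiny mapping table are all hypothetical stand-ins, not the pygsm codec. The only standard-library pieces used are codecs.register and codecs.CodecInfo.

```python
import codecs
import string

CODEC_NAME = "toy_gsm"  # hypothetical name, chosen for this sketch

# Toy table (assumption): ASCII letters, digits and space map to themselves,
# while "@" and "$" use their GSM 03.38 values.
_ENCODE_TABLE = {c: ord(c) for c in string.ascii_letters + string.digits + " "}
_ENCODE_TABLE.update({"@": 0x00, "$": 0x02})
_DECODE_TABLE = {b: c for c, b in _ENCODE_TABLE.items()}


def toy_encode(text, errors="strict"):
    """Stateless encoder: returns (bytes, number of characters consumed)."""
    out = bytearray()
    for i, ch in enumerate(text):
        if ch not in _ENCODE_TABLE:
            raise UnicodeEncodeError(
                CODEC_NAME, text, i, i + 1, "character not in toy table"
            )
        out.append(_ENCODE_TABLE[ch])
    return bytes(out), len(text)


def toy_decode(data, errors="strict"):
    """Stateless decoder: returns (str, number of bytes consumed)."""
    data = bytes(data)
    chars = []
    for i, b in enumerate(data):
        if b not in _DECODE_TABLE:
            raise UnicodeDecodeError(
                CODEC_NAME, data, i, i + 1, "byte not in toy table"
            )
        chars.append(_DECODE_TABLE[b])
    return "".join(chars), len(data)


def toy_search(name):
    """Search function: return CodecInfo for our name, None for any other."""
    if name == CODEC_NAME:
        return codecs.CodecInfo(
            encode=toy_encode, decode=toy_decode, name=CODEC_NAME
        )
    return None


# After this call, str.encode and bytes.decode can find the codec by name.
codecs.register(toy_search)
```

Once the search function is registered, "hi@x".encode("toy_gsm") and the matching bytes.decode("toy_gsm") go through the custom functions, with no change to Python itself.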
History
Date                 User           Action  Args
2014-10-01 01:55:40  berker.peksag  set     stage: needs patch -> resolved
2014-09-30 22:44:01  haypo          set     status: open -> closed
                                            versions: + Python 3.5, - Python 3.2
                                            nosy: + haypo
                                            messages: + msg228020
                                            resolution: out of date
2014-09-30 21:44:00  BreamoreBoy    set     nosy: + BreamoreBoy
                                            messages: + msg228008
2010-11-12 01:27:15  eric.araujo    set     nosy: + lemburg
2010-05-09 14:32:45  eric.araujo    set     nosy: + eric.araujo
                                            messages: + msg105397
                                            versions: - Python 2.7
2009-06-21 22:20:19  pitrou         set     priority: normal
                                            nosy: pitrou, jwishnie
                                            versions: + Python 2.7, Python 3.2, - Python 2.6
                                            components: + Library (Lib)
                                            stage: needs patch
2009-06-21 22:19:47  pitrou         set     nosy: + pitrou
                                            messages: + msg89590
2009-06-21 19:28:00  jwishnie       create