classification
Title: Additional code pages for EBCDIC
Type: enhancement Stage:
Components: Unicode Versions: Python 3.5
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: BreamoreBoy, ezio.melotti, lemburg, roskakori, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2013-07-02 17:47 by roskakori, last changed 2014-11-15 13:14 by vstinner. This issue is now closed.

Files
File name Uploaded Description
cp_ebcdic.zip roskakori, 2013-07-02 17:47 EBCDIC code pages as listed in http://de.wikipedia.org/wiki/EBCDIC
Messages (7)
msg192214 - (view) Author: (roskakori) Date: 2013-07-02 17:47
Currently Python includes a codec for EBCDIC international (cp500) but seems to be missing any further EBCDIC codecs. These encodings are widely used on mainframe platforms, which are popular in finance and insurance.

Descriptions of these codepages are available from IBM: <http://www-01.ibm.com/software/globalization/cp/cp_cpgid.html>. These descriptions also include mapping files although not in a format that can readily be processed by gencodec.py.

So instead I used the codecs included with Java 1.7 to generate mappings for gencodec.py. You can find them in the attached ZIP archive. As Java also runs on mainframe platforms, IBM should have an interest in the Java codecs being correct and complete.

The converter is available from <https://github.com/roskakori/CodecMapper>. To build the cp*.txt for EBCDIC, simply run:

$ git clone https://github.com/roskakori/CodecMapper.git
$ cd CodecMapper
$ ant ebcdic
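
For readers without a Java toolchain, the idea of dumping a single-byte code page into a mapping file can be sketched in pure Python. This is only an illustration, not the CodecMapper implementation: the helper name mapping_lines is hypothetical, the line layout (byte, code point, comment) follows the style of the Unicode consortium mapping files and is assumed to suit gencodec.py, and cp500 is used because Python already ships it.

```python
import unicodedata


def mapping_lines(encoding):
    """Yield 'byte<TAB>code point<TAB># name' lines for a single-byte codec.

    Hypothetical helper for illustration; not part of CodecMapper.
    """
    for byte in range(256):
        try:
            char = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this byte is unmapped in the code page
        # Control characters have no Unicode name, so use a placeholder.
        name = unicodedata.name(char, "<control>")
        yield "0x%02X\t0x%04X\t# %s" % (byte, ord(char), name)


lines = list(mapping_lines("cp500"))
print("\n".join(lines[:4]))
```

The same loop works for any single-byte codec that Python can already decode; for code pages Python lacks, the mapping has to come from another source such as the Java codecs mentioned above.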

IBM lists a large number of EBCDIC codepages, but I only attached the ones listed in the German Wikipedia: <http://de.wikipedia.org/wiki/EBCDIC>. The archive also includes cp500 for comparison with your current cp500. It lacks EDF03DRV because even Java does not support it.

Currently Java 1.7 supports 43 variants. To get a list of them, use:

$ ant list | grep -i ' ibm'

This would also fix issue 1097797: Encoding for Code Page 273 used by EBCDIC Germany Austria.
msg228602 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-10-05 17:56
What do our unicode experts think about this enhancement request?
msg228607 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-05 18:21
Currently Python includes codecs for cp037, cp273, and cp500.
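
A quick way to see what the bundled EBCDIC codecs do is to encode a short string with them. The snippet below is a minimal sketch that only uses cp037 and cp500, both of which ship with Python; the variable names are illustrative.

```python
# Encode the same text with two EBCDIC code pages bundled with Python.
text = "Hello"
as_cp500 = text.encode("cp500")
as_cp037 = text.encode("cp037")

# Letters share the same byte values in both code pages...
print(as_cp500, as_cp037)

# ...but some punctuation, such as the square bracket, does not.
bracket_cp500 = "[".encode("cp500")
bracket_cp037 = "[".encode("cp037")
print(bracket_cp500, bracket_cp037)

# Decoding with the matching code page restores the original text.
round_trip = as_cp500.decode("cp500")
```

Differences like the bracket position are the reason national variants cannot simply be substituted for one another.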
msg228656 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2014-10-06 10:48
I don't think we should add more EBCDIC codecs to Python's stdlib. It would be better to put a Python package on PyPI which people using these encodings can use.
msg228671 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-06 12:22
If more users request this codec, we may integrate it later. Right now, maintaining a package on PyPI is enough. It's easy to register a "custom" codec from a third-party package: use
https://docs.python.org/dev/library/codecs.html#codecs.register
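
As a rough sketch of what such a registration could look like: the codec name x_demo_ebcdic and the helper functions are made up for illustration, and the decoding table is simply copied from the built-in cp500 so that the example is self-contained. A real third-party package would ship its own tables.

```python
import codecs

# Hypothetical codec name, chosen only for this example.
_NAME = "x_demo_ebcdic"

# 256-character decoding table (byte value -> character), borrowed from cp500.
_DECODING_TABLE = bytes(range(256)).decode("cp500")
_ENCODING_TABLE = codecs.charmap_build(_DECODING_TABLE)


def _encode(text, errors="strict"):
    return codecs.charmap_encode(text, errors, _ENCODING_TABLE)


def _decode(data, errors="strict"):
    return codecs.charmap_decode(data, errors, _DECODING_TABLE)


def _search(name):
    # The search function receives the normalized (lowercase) codec name
    # and returns None for names it does not handle.
    if name == _NAME:
        return codecs.CodecInfo(name=_NAME, encode=_encode, decode=_decode)
    return None


codecs.register(_search)

encoded = "Hello".encode("x_demo_ebcdic")
decoded = encoded.decode("x_demo_ebcdic")
```

Once the search function is registered, the new name can be used anywhere an encoding name is accepted, for example in str.encode() or open().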
msg231208 - (view) Author: (roskakori) Date: 2014-11-15 12:36
I just released a package on PyPI that adds various EBCDIC codecs for Python 2.6+ and Python 3.1+, see <https://pypi.python.org/pypi/ebcdic>.

I agree with Marc-Andre, maintaining this is easier as a separate package.
msg231209 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-15 13:14
There are more and more codecs on PyPI. It would be nice to have a list somewhere.

@roskakori: Could you please create a page at https://wiki.python.org/ ? Example:
https://wiki.python.org/moin/Codecs
History
Date User Action Args
2014-11-15 13:14:16 vstinner set messages: + msg231209
2014-11-15 12:36:50 roskakori set messages: + msg231208
2014-10-06 12:22:47 vstinner set status: open -> closed; resolution: wont fix; messages: + msg228671
2014-10-06 10:48:46 lemburg set messages: + msg228656
2014-10-05 18:21:04 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg228607
2014-10-05 17:56:30 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg228602; versions: + Python 3.5
2013-07-02 20:54:08 pitrou set nosy: + vstinner
2013-07-02 17:47:11 roskakori create