Message 71887 - Python tracker

Author	loewis
Recipients	amaury.forgeotdarc, loewis, pascal.bach
Date	2008-08-24.21:52:16
Message-id	<1219614737.8.0.465342749563.issue3649@psf.upfronthosting.co.za>

Content

I don't think this codec should be named IA-5. IA-5 is specified in
ITU-T Rec. T.50 (International Alphabet No. 5), recently renamed to
"International Reference Alphabet", and it does *not* specify that the
characters 0..31 are printable. Instead, IA5 is identical to ISO 646
(i.e. allowing for national variants), and the International Reference
Version of IA5 (e.g. as used in ASN.1 IA5String) is identical to US-ASCII.

If GSM uses a modified version of this, it should receive a separate
name. If you were looking at section 2 (Structure of EMI messages), what
makes you think that this specification calls the encoding "IA5"? In my
copy, it says:

# Alphanumeric characters are encoded as two numeric IA5 characters,
# the higher 3 bits (0..7) first, the lower 4 bits (0..F) thereafter,
# according to the following table.

So it *uses* IA5 to hex-encode the encoded bytes. To achieve that, one would
have to write

  text.encode("emi-section-2").encode("hex")

[Notice that the "hex" codec already uses IA-5]
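
To illustrate the hex step on its own, here is a minimal sketch in present-day Python (the `.encode("hex")` spelling above is Python 2 only). It assumes the quoted sentence means "one hex digit for the high bits, then one for the low 4 bits"; the helper name `to_hex_pairs` and the sample bytes are made up for illustration and are not taken from the EMI table.

```python
import binascii

HEX_DIGITS = "0123456789ABCDEF"


def to_hex_pairs(data: bytes) -> str:
    """Write each byte as two hex digits: high bits first, low 4 bits second."""
    out = []
    for byte in data:
        out.append(HEX_DIGITS[byte >> 4])    # higher bits (0..7 for 7-bit codes)
        out.append(HEX_DIGITS[byte & 0x0F])  # lower 4 bits (0..F)
    return "".join(out)


# Arbitrary 7-bit code values, chosen only for the demonstration.
sample = bytes([0x00, 0x41, 0x7F])
pairs = to_hex_pairs(sample)

# The stdlib helper produces the same digits, apart from letter case.
assert pairs == binascii.hexlify(sample).decode("ascii").upper()

# Every output character is a plain ASCII digit or letter.
assert all(ch in HEX_DIGITS for ch in pairs)
```

The point is only that the output alphabet is ordinary IA5/ASCII digits and letters; the character table itself is a separate layer.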

In any case, I don't think this is general enough to deserve inclusion
in the standard library. The codec system is designed to be flexible
enough to support additional codecs outside the core.
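
As a rough sketch of what "outside the core" can look like, a third-party module can register its own codec through `codecs.register`. Everything named below is hypothetical: the codec name `demo_emi_table` and the two-entry table are placeholders for illustration, not the real EMI character table.

```python
import codecs

CODEC_NAME = "demo_emi_table"  # hypothetical name, chosen for this sketch

# Placeholder table with two made-up entries; NOT the real EMI table.
_ENCODE_TABLE = {"@": 0x00, "A": 0x41}
_DECODE_TABLE = {code: ch for ch, code in _ENCODE_TABLE.items()}


def _encode(text, errors="strict"):
    """Map each character through the table; only strict handling is sketched."""
    out = bytearray()
    for pos, ch in enumerate(text):
        if ch not in _ENCODE_TABLE:
            raise UnicodeEncodeError(
                CODEC_NAME, text, pos, pos + 1, "character not in demo table"
            )
        out.append(_ENCODE_TABLE[ch])
    return bytes(out), len(text)


def _decode(data, errors="strict"):
    """Inverse mapping from table codes back to characters."""
    data = bytes(data)
    chars = []
    for pos, byte in enumerate(data):
        if byte not in _DECODE_TABLE:
            raise UnicodeDecodeError(
                CODEC_NAME, data, pos, pos + 1, "byte not in demo table"
            )
        chars.append(_DECODE_TABLE[byte])
    return "".join(chars), len(data)


def _search(name):
    """Search function: return CodecInfo for our name, None for anything else."""
    if name == CODEC_NAME:
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=CODEC_NAME)
    return None


codecs.register(_search)

encoded = "@A".encode(CODEC_NAME)
decoded = encoded.decode(CODEC_NAME)
```

Once the search function is registered, `str.encode` and `bytes.decode` pick up the codec by name like any built-in one, without any change to the standard library.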

History
Date	User	Action	Args
2008-08-24 21:52:17	loewis	set	recipients: + loewis, amaury.forgeotdarc, pascal.bach
2008-08-24 21:52:17	loewis	set	messageid: <1219614737.8.0.465342749563.issue3649@psf.upfronthosting.co.za>
2008-08-24 21:52:17	loewis	link	issue3649 messages
2008-08-24 21:52:16	loewis	create