classification
Title: add thai encoding aliases to encodings.aliases
Type: enhancement
Stage: needs patch
Components: Unicode
Versions: Python 3.4
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, fomcl@yahoo.com, lemburg
Priority: normal
Keywords:

Created on 2013-02-20 11:48 by fomcl@yahoo.com, last changed 2013-02-22 23:03 by ezio.melotti.

Messages (6)
msg182489 - Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 11:48
This is almost identical to: http://bugs.python.org/issue854511
However, tis602, which is mentioned in the original bug report, is not an alias for cp874. Therefore, I propose the following:

import encodings

aliases = encodings.aliases.aliases
more_aliases = {'ibm874'     : 'cp874',
                'iso_8859_11': 'cp874',
                'iso8859_11' : 'cp874',
                'windows_874': 'cp874',
               }
aliases.update(more_aliases)
msg182493 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013-02-20 12:22
On 20.02.2013 12:48, albertjan wrote:
> 
> New submission from albertjan:
> 
> This is almost identical to: http://bugs.python.org/issue854511
> However, tis602, which is mentioned in the original bug report, is not an alias for cp874. Therefore, I propose the following:
> 
> import encodings
> 
> aliases = encodings.aliases.aliases
> more_aliases = {'ibm874'     : 'cp874',
>                 'iso_8859_11': 'cp874',
>                 'iso8859_11' : 'cp874',
>                 'windows_874': 'cp874',
>                }
> aliases.update(more_aliases)

Please provide evidence that those encodings are indeed the same.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com
msg182513 - Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 14:40
Hi,
 
I found this report that includes your name:
http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html
 
Other relevant websites:
http://en.wikipedia.org/wiki/ISO/IEC_8859-11  # is wikipedia 'proof'?
http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
http://www.iso.org/iso/catalogue_detail?csnumber=28263  # non-free

Regards,
Albert-Jan

msg182522 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013-02-20 15:25
On 20.02.2013 15:40, albertjan wrote:
> 
> albertjan added the comment:
> 
> Hi,
>  
> I found this report that includes your name:
> http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html
>  
> Other relevant websites:
> http://en.wikipedia.org/wiki/ISO/IEC_8859-11  # is wikipedia 'proof'?
> http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
> http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
> http://www.iso.org/iso/catalogue_detail?csnumber=28263  # non-free

Thanks.

Something is wrong with your request, though:

* we already have an iso8859_11 codec, so aliasing that name to some
  other codec is not possible

* we already have a cp874 codec, so aliasing that name to some
  other codec is not possible

* cp874 differs from iso8859_11 in a few places, so aliasing
  iso8859_11 to cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

What we could do is add aliases 'x-ibm874' and 'windows_874' to
'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same
thing. The references only mention 'x-ibm874'.
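The claim that cp874 and iso8859_11 differ in a few places can be checked against the codecs that ship with CPython. The following is only a sketch, not part of the original thread; the helper name `decode_byte` is made up for illustration, and it compares single-byte decoding only.

```python
def decode_byte(value, encoding):
    """Decode one byte; return None where the codec leaves it undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None

# Byte values that the two codecs decode differently (or that only one defines).
differing = [
    value for value in range(256)
    if decode_byte(value, "cp874") != decode_byte(value, "iso8859_11")
]

for value in differing:
    print(hex(value),
          repr(decode_byte(value, "cp874")),
          repr(decode_byte(value, "iso8859_11")))
```

Any byte listed here is a reason why one name cannot simply be an alias for the other.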
msg182528 - Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 16:11
> Sent: Wednesday, February 20, 2013 4:25 PM
> Subject: [issue17254] add thai encoding aliases to encodings.aliases
> 
> Thanks.
> 
> Something is wrong with your request, though:
> 
> * we already have an iso8859_11 codec, so aliasing that name to some
>   other codec is not possible
> 
> * we already have a cp874 codec, so aliasing that name to some
>   other codec is not possible
> 
> * cp874 differs from iso8859_11 in a few places, so aliasing
>   iso8859_11 to cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

Sorry about that.
 
> What we could do is add aliases 'x-ibm874' and 'windows_874' to
> 'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same
> thing. The references only mention 'x-ibm874'.

These two pages list the following names as aliases of one another: x-IBM874, cp874, ibm874, ibm-874, 874
http://www.java2s.com/Tutorial/Java/0180__File/DisplaysAvailableCharsetsandaliases.htm
http://www.fileformat.info/info/charset/x-IBM874/index.htm
In addition, it seems that 'windows_874' is used (that's the one that raised this issue for me), but I've also seen references to windows-874, windows874 and WIN874:
http://doxygen.postgresql.org/encnames_8c_source.html
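For local experimentation, the two aliases suggested earlier in the thread ('x-ibm874' and 'windows_874') could be registered at runtime as sketched below. This is an assumption-laden illustration, not a patch: the dictionary name `candidate_aliases` is made up, and the keys use the normalized spelling (lowercase, with "_" in place of "-") that the `encodings` search function looks up. Whether 'ibm874' really is the same encoding is left open here, as it is in the thread.

```python
import codecs
import encodings.aliases

# Hypothetical additions: normalized alias name -> existing codec module name.
candidate_aliases = {
    "x_ibm874": "cp874",
    "windows_874": "cp874",
}
encodings.aliases.aliases.update(candidate_aliases)

# Lookups are case-insensitive and treat "-" and "_" alike.
for name in ("x-IBM874", "windows-874", "windows_874"):
    print(name, "->", codecs.lookup(name).name)
```

The update has to happen before the first lookup of a given name, because the `encodings` package remembers failed lookups.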
History
Date User Action Args
2013-02-22 23:03:29 ezio.melotti set versions: + Python 3.4
    nosy: + ezio.melotti
    components: + Unicode
    type: enhancement
    stage: needs patch
2013-02-20 16:11:59 fomcl@yahoo.com set messages: + msg182529
2013-02-20 16:11:59 fomcl@yahoo.com set messages: + msg182528
2013-02-20 15:25:43 lemburg set messages: + msg182522
2013-02-20 14:40:37 fomcl@yahoo.com set messages: + msg182513
2013-02-20 12:22:21 lemburg set nosy: + lemburg
    messages: + msg182493
2013-02-20 11:48:44 fomcl@yahoo.com create