This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: add thai encoding aliases to encodings.aliases
Type: enhancement
Stage: patch review
Components: Unicode
Versions: Python 3.9

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Benjamin Wood, era, ezio.melotti, fomcl@yahoo.com, lemburg
Priority: normal
Keywords: patch

Created on 2013-02-20 11:48 by fomcl@yahoo.com, last changed 2022-04-11 14:57 by admin.

Pull Requests
URL Status Linked Edit
PR 15079 open python-dev, 2019-08-02 14:43
Messages (10)
msg182489 - (view) Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 11:48
This is almost identical to: http://bugs.python.org/issue854511
However, tis602, which is mentioned in the original bug report, is not an alias to cp874. Therefore, I propose the following:

import encodings

aliases = encodings.aliases.aliases
more_aliases = {'ibm874'     : 'cp874',
                'iso_8859_11': 'cp874',
                'iso8859_11' : 'cp874',
                'windows_874': 'cp874',
               }
aliases.update(more_aliases)
msg182493 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013-02-20 12:22
On 20.02.2013 12:48, albertjan wrote:
> 
> New submission from albertjan:
> 
> This is almost identical to: http://bugs.python.org/issue854511
> However, tis602, which is mentioned in the original bug report, is not an alias to cp874. Therefore, I propose the following:
> 
> import encodings
> 
> aliases = encodings.aliases.aliases
> more_aliases = {'ibm874'     : 'cp874',
>                 'iso_8859_11': 'cp874',
>                 'iso8859_11' : 'cp874',
>                 'windows_874': 'cp874',
>                }
> aliases.update(more_aliases)

Please provide evidence that those encodings are indeed the same.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com
msg182513 - (view) Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 14:40
Hi,
 
I found this report that includes your name:
http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html
 
Other relevant websites:
http://en.wikipedia.org/wiki/ISO/IEC_8859-11  # is wikipedia 'proof'?
http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
http://www.iso.org/iso/catalogue_detail?csnumber=28263  # non-free

Regards,
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a 
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  

msg182522 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2013-02-20 15:25
On 20.02.2013 15:40, albertjan wrote:
> 
> albertjan added the comment:
> 
> Hi,
>  
> I found this report that includes your name:
> http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html
>  
> Other relevant websites:
> http://en.wikipedia.org/wiki/ISO/IEC_8859-11  # is wikipedia 'proof'?
> http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
> http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
> http://www.iso.org/iso/catalogue_detail?csnumber=28263  # non-free

Thanks.

Something is wrong with your request, though:

* we already have an iso8859_11 codec, so aliasing it to some
  other name is not possible

* we already have a cp874 codec, so aliasing it to some
  other name is not possible

* cp874 differs from iso8859_11 in a few places, so aliasing
  cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

What we could do is add aliases 'x-ibm874' and 'windows_874' to
'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same
thing. The references only mention 'x-ibm874'.
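
The third point can be checked locally. A minimal sketch, assuming the stock cp874 and iso8859_11 codecs shipped with CPython; the helper name differing_bytes is hypothetical and only used for illustration:

```python
def differing_bytes(enc_a, enc_b):
    """Return {byte: (char_a, char_b)} for bytes the two codecs decode differently.

    None marks a byte that the codec leaves undefined.
    """
    def decode_one(enc, value):
        try:
            return bytes([value]).decode(enc)
        except UnicodeDecodeError:
            return None

    result = {}
    for value in range(256):
        a = decode_one(enc_a, value)
        b = decode_one(enc_b, value)
        if a != b:
            result[value] = (a, b)
    return result


# Compare every single-byte value under both codecs and list the mismatches.
diff = differing_bytes("cp874", "iso8859_11")
for value, (a, b) in sorted(diff.items()):
    print(f"0x{value:02X}: cp874={a!r} iso8859_11={b!r}")
```

Any byte listed by this script is one where the two codecs disagree, which is why one name cannot simply be aliased to the other.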
msg182528 - (view) Author: albertjan (fomcl@yahoo.com) Date: 2013-02-20 16:11
> Sent: Wednesday, February 20, 2013 4:25 PM
> Subject: [issue17254] add thai encoding aliases to encodings.aliases
> 
> Thanks.
> 
> Something is wrong with your request, though:
> 
> * we already have an iso8859_11 codec, so aliasing it to some
>   other name is not possible
> 
> * we already have a cp874 codec, so aliasing it to some
>   other name is not possible
> 
> * cp874 differs from iso8859_11 in a few places, so aliasing
>   cp874 is not possible (see 
> http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

Sorry about that.
 
> What we could do is add aliases 'x-ibm874' and 'windows_874' to
> 'cp874'. I'm not sure whether 'ibm874' and 
> 'x-ibm874' are the same
> thing. The references only mention 'x-ibm874'.

The following pages list these names as aliases of one another: x-IBM874, cp874, ibm874, ibm-874, 874
http://www.java2s.com/Tutorial/Java/0180__File/DisplaysAvailableCharsetsandaliases.htm
http://www.fileformat.info/info/charset/x-IBM874/index.htm
In addition, it seems that 'windows_874' is in use (that's the one that raised this issue for me), and I've also seen references to windows-874, windows874, and WIN874:
http://doxygen.postgresql.org/encnames_8c_source.html
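
Which of these spellings a given interpreter already accepts can be checked with codecs.lookup. A rough sketch, assuming only the names listed above; the CANDIDATES list and the resolve helper are hypothetical, and the result depends on the Python version in use:

```python
import codecs

# Spellings mentioned in this thread; none of them is guaranteed to resolve.
CANDIDATES = [
    "cp874", "x-IBM874", "ibm874", "ibm-874", "874",
    "windows_874", "windows-874", "windows874", "WIN874",
]


def resolve(name):
    """Return the canonical codec name for `name`, or None if it is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


resolved = {name: resolve(name) for name in CANDIDATES}
for name, target in resolved.items():
    print(f"{name:12} -> {target or 'not recognised'}")
```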
msg348902 - (view) Author: Benjamin Wood (Benjamin Wood) * Date: 2019-08-02 14:02
From what I can tell

cp874 != ibm_874 != iso_8859_11

What I can say is that the current cp874 is the implementation of the windows_874 code page. The cp874.py source itself references the Microsoft code page, and it also contains the appropriate characters (like EURO SIGN).

https://github.com/python/cpython/blob/master/Lib/encodings/cp874.py
""" Python Character Mapping Codec cp874 generated from 'MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT' with gencodec.py.

https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT

It seems appropriate to at least alias windows_874 with cp874. They are provably the same.

If someone needs the IBM standard, they may have to write a different code page.
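
Until such an alias exists in the standard library, the name can be mapped locally. A minimal sketch, assuming the documented codecs.register hook rather than any change to encodings.aliases; LOCAL_ALIASES and thai_alias_search are hypothetical names, and this is not the patch proposed in the PR:

```python
import codecs

# Hypothetical local alias table; keys are stored in normalised form.
LOCAL_ALIASES = {
    "windows_874": "cp874",
    "windows874": "cp874",
}


def thai_alias_search(name):
    """Codec search function: map a local alias to the stock cp874 codec."""
    # Normalise defensively, since hyphen handling differs between versions.
    normalised = name.lower().replace("-", "_")
    target = LOCAL_ALIASES.get(normalised)
    return codecs.lookup(target) if target is not None else None


# Registered search functions are consulted when the built-in lookup fails.
codecs.register(thai_alias_search)

print("\u20ac".encode("windows_874"))
```

A search function is used here because the encodings package caches failed lookups, so adding an entry to encodings.aliases.aliases after a name has already failed to resolve may have no effect.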
msg354287 - (view) Author: Benjamin Wood (Benjamin Wood) * Date: 2019-10-09 16:51
I've created the codepage alias 874.

It is now only pending a merge into the mainline.

Thanks.
msg368192 - (view) Author: Benjamin Wood (Benjamin Wood) * Date: 2020-05-05 18:23
This is an easy alias to a valid codepage. I supplied proof that they are the same.

I don't understand why this has been allowed to languish for 9 months.

Did I miss something? Is there more work I need to do?

Thanks
msg376689 - (view) Author: Benjamin Wood (Benjamin Wood) * Date: 2020-09-10 17:33
Bumping this again.

I'd like to understand why this change cannot be, or has not been, approved.

I added the technical info from here to the GitHub PR.

Is there a shortage of reviewers? What can I do to help speed up the process?
History
Date  User  Action  Args
2022-04-11 14:57:42  admin  set  github: 61456
2020-09-10 17:33:40  Benjamin Wood  set  messages: + msg376689
2020-05-05 18:23:45  Benjamin Wood  set  messages: + msg368192
2020-01-10 22:47:21  cheryl.sabella  set  versions: + Python 3.9, - Python 3.4
2019-10-09 16:51:42  Benjamin Wood  set  messages: + msg354287
2019-08-02 14:43:18  python-dev  set  keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request14825
2019-08-02 14:02:04  Benjamin Wood  set  nosy: + Benjamin Wood; messages: + msg348902
2015-01-23 05:14:38  era  set  nosy: + era
2013-02-22 23:03:29  ezio.melotti  set  versions: + Python 3.4; nosy: + ezio.melotti; components: + Unicode; type: enhancement; stage: needs patch
2013-02-20 16:11:59  fomcl@yahoo.com  set  messages: + msg182529
2013-02-20 16:11:59  fomcl@yahoo.com  set  messages: + msg182528
2013-02-20 15:25:43  lemburg  set  messages: + msg182522
2013-02-20 14:40:37  fomcl@yahoo.com  set  messages: + msg182513
2013-02-20 12:22:21  lemburg  set  nosy: + lemburg; messages: + msg182493
2013-02-20 11:48:44  fomcl@yahoo.com  create