Issue17254
Created on 2013-02-20 11:48 by fomcl@yahoo.com, last changed 2022-04-11 14:57 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 15079 | open | python-dev, 2019-08-02 14:43 |
Messages (10) | |||
---|---|---|---|
msg182489 - (view) | Author: albertjan (fomcl@yahoo.com) | Date: 2013-02-20 11:48 | |
This is almost identical to: http://bugs.python.org/issue854511
However, tis602, which is mentioned in the original bug report, is not an alias to cp874. Therefore, I propose the following:

    import encodings

    aliases = encodings.aliases.aliases
    more_aliases = {'ibm874'     : 'cp874',
                    'iso_8859_11': 'cp874',
                    'iso8859_11' : 'cp874',
                    'windows_874': 'cp874',
                    }
    aliases.update(more_aliases) |
|||
msg182493 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2013-02-20 12:22 | |
On 20.02.2013 12:48, albertjan wrote:
> New submission from albertjan:
>
> This is almost identical to: http://bugs.python.org/issue854511
> However, tis602, which is mentioned in the original bug report, is not an alias to cp874. Therefore, I propose the following:
>
> import encodings
>
> aliases = encodings.aliases.aliases
> more_aliases = {'ibm874' : 'cp874',
>                 'iso_8859_11': 'cp874',
>                 'iso8859_11' : 'cp874',
>                 'windows_874': 'cp874',
>                 }
> aliases.update(more_aliases)

Please provide evidence that those encodings are indeed the same.

Thanks,
--
Marc-Andre Lemburg
eGenix.com |
|||
msg182513 - (view) | Author: albertjan (fomcl@yahoo.com) | Date: 2013-02-20 14:40 | |
Hi,

I found this report that includes your name:
http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html

Other relevant websites:
http://en.wikipedia.org/wiki/ISO/IEC_8859-11 # is wikipedia 'proof'?
http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
http://www.iso.org/iso/catalogue_detail?csnumber=28263 # non-free

Regards,
Albert-Jan |
|||
msg182522 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2013-02-20 15:25 | |
On 20.02.2013 15:40, albertjan wrote:
> I found this report that includes your name:
> http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html
>
> Other relevant websites:
> http://en.wikipedia.org/wiki/ISO/IEC_8859-11 # is wikipedia 'proof'?
> http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQ&cid=YD58Y-grdtE&s=&browser=Default
> http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
> http://www.iso.org/iso/catalogue_detail?csnumber=28263 # non-free

Thanks.

Something is wrong with your request, though:

* we already have an iso8859_11 codec, so aliasing it to some other name is not possible
* we already have a cp874 codec, so aliasing it to some other name is not possible
* cp874 differs from iso8859_11 in a few places, so aliasing iso8859_11 to cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

What we could do is add aliases 'x-ibm874' and 'windows_874' to 'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same thing. The references only mention 'x-ibm874'. |
|||
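The claim that cp874 and iso8859_11 differ "in a few places" can be checked directly against the codecs that ship with Python. The following is a minimal sketch, not part of the original discussion; the helper names `decode_table` and `differing_bytes` are made up for illustration.

```python
def decode_table(codec):
    """Map every byte value 0..255 to its decoded character, or None if undefined."""
    table = {}
    for byte in range(256):
        try:
            table[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            # The codec leaves this byte value unassigned.
            table[byte] = None
    return table


def differing_bytes(codec_a, codec_b):
    """Return the byte values that the two single-byte codecs decode differently."""
    table_a = decode_table(codec_a)
    table_b = decode_table(codec_b)
    return [byte for byte in range(256) if table_a[byte] != table_b[byte]]


if __name__ == "__main__":
    for byte in differing_bytes("cp874", "iso8859_11"):
        print(hex(byte))
```

Any byte value printed by this script is one that the two codecs map differently, so a single alias cannot serve both names.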
msg182528 - (view) | Author: albertjan (fomcl@yahoo.com) | Date: 2013-02-20 16:11 | |
> Sent: Wednesday, February 20, 2013 4:25 PM
> Subject: [issue17254] add thai encoding aliases to encodings.aliases
>
> Thanks.
>
> Something is wrong with your request, though:
>
> * we already have an iso8859_11 code, so aliasing it to some other name is not possible
> * we already have an cp874 code, so aliasing it to some other name is not possible
> * cp874 differs from iso8859_11 in a few places, so aliasing cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

Sorry about that.

> What we could do is add aliases 'x-ibm874' and 'windows_874' to 'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same thing. The references only mention 'x-ibm874'.

The following documents list these names as aliases: x-IBM874, cp874, ibm874, ibm-874, 874
http://www.java2s.com/Tutorial/Java/0180__File/DisplaysAvailableCharsetsandaliases.htm
http://www.fileformat.info/info/charset/x-IBM874/index.htm

In addition, it seems that 'windows_874' is used (that's the one that raised this issue for me), but I've also seen references to windows-874, windows874, WIN874:
http://doxygen.postgresql.org/encnames_8c_source.html |
|||
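To see which of the names mentioned in this thread a given Python installation already resolves, one can probe them with `codecs.lookup`. This is only an illustrative sketch: `resolve`, `report` and the `CANDIDATES` list are hypothetical names, and the result depends on the Python version in use, so no particular output is assumed here.

```python
import codecs

# Names collected from the messages above; whether each one resolves
# depends on the interpreter's alias table.
CANDIDATES = [
    "cp874", "x-IBM874", "ibm874", "ibm-874", "874",
    "windows_874", "windows-874", "windows874", "WIN874",
]


def resolve(name):
    """Return the canonical codec name for `name`, or None if it is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def report(names=CANDIDATES):
    """Map each candidate name to the codec it resolves to (or None)."""
    return {name: resolve(name) for name in names}


if __name__ == "__main__":
    for name, target in report().items():
        print(f"{name!r:16} -> {target!r}")
```

A name that maps to `None` is one the interpreter does not recognize, which is the situation that prompted this issue for 'windows_874'.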
msg348902 - (view) | Author: Benjamin Wood (Benjamin Wood) * | Date: 2019-08-02 14:02 | |
From what I can tell, cp874 != ibm_874 != iso_8859_11.

What I can say is that the current cp874 is the implementation of the windows_874 code page. The module itself references the Microsoft code page, and also contains the appropriate characters (like EURO SIGN).

https://github.com/python/cpython/blob/master/Lib/encodings/cp874.py

""" Python Character Mapping Codec cp874 generated from 'MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT' with gencodec.py. """

https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT

It seems appropriate to at least alias windows_874 with cp874. They are provably the same. If someone needs the IBM standard, they may have to write a different code page. |
|||
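Until such an alias is available in the standard library, an application can make 'windows_874' resolve to cp874 on its own. The sketch below is one possible workaround, not the change proposed in the pull request: it registers an extra codec search function through `codecs.register` instead of mutating `encodings.aliases.aliases`, since names that have already failed to resolve may be cached by the `encodings` package. The names `EXTRA_ALIASES` and `thai_alias_search` are hypothetical.

```python
import codecs

# Hypothetical application-level alias table. Only windows_874 is listed,
# because it is the one name this thread shows to match cp874.
EXTRA_ALIASES = {"windows_874": "cp874"}


def thai_alias_search(name):
    """Codec search function: resolve extra aliases to an existing codec."""
    # Normalize hyphens ourselves, since older Python versions pass them through.
    key = name.lower().replace("-", "_")
    target = EXTRA_ALIASES.get(key)
    if target is None:
        return None  # Not ours; let other search functions handle it.
    return codecs.lookup(target)


codecs.register(thai_alias_search)

if __name__ == "__main__":
    # EURO SIGN round-trips through the aliased name.
    print("\u20ac".encode("windows_874"))
```

The search function is only consulted when the built-in lookup fails, so it does nothing on a Python version that already knows the alias.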
msg354287 - (view) | Author: Benjamin Wood (Benjamin Wood) * | Date: 2019-10-09 16:51 | |
I've created the codepage alias 874. It is now only pending a merge into the mainline. Thanks. |
|||
msg368192 - (view) | Author: Benjamin Wood (Benjamin Wood) * | Date: 2020-05-05 18:23 | |
This is an easy alias to a valid codepage. I supplied proof that they are the same. I don't understand why this has been allowed to languish for 9 months. Did I miss something? Is there more work I need to do? Thanks |
|||
msg376689 - (view) | Author: Benjamin Wood (Benjamin Wood) * | Date: 2020-09-10 17:33 | |
Bumping this again. I'd like to understand why this change cannot be or has not been approved. I added the technical info from here to the GitHub PR. Is there a shortage of reviewers? What can I do to help speed up the process? |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:42 | admin | set | github: 61456 |
2020-09-10 17:33:40 | Benjamin Wood | set | messages: + msg376689 |
2020-05-05 18:23:45 | Benjamin Wood | set | messages: + msg368192 |
2020-01-10 22:47:21 | cheryl.sabella | set | versions: + Python 3.9, - Python 3.4 |
2019-10-09 16:51:42 | Benjamin Wood | set | messages: + msg354287 |
2019-08-02 14:43:18 | python-dev | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request14825 |
2019-08-02 14:02:04 | Benjamin Wood | set | nosy: + Benjamin Wood; messages: + msg348902 |
2015-01-23 05:14:38 | era | set | nosy: + era |
2013-02-22 23:03:29 | ezio.melotti | set | versions: + Python 3.4; nosy: + ezio.melotti; components: + Unicode; type: enhancement; stage: needs patch |
2013-02-20 16:11:59 | fomcl@yahoo.com | set | messages: + msg182529 |
2013-02-20 16:11:59 | fomcl@yahoo.com | set | messages: + msg182528 |
2013-02-20 15:25:43 | lemburg | set | messages: + msg182522 |
2013-02-20 14:40:37 | fomcl@yahoo.com | set | messages: + msg182513 |
2013-02-20 12:22:21 | lemburg | set | nosy: + lemburg; messages: + msg182493 |
2013-02-20 11:48:44 | fomcl@yahoo.com | create |