cp874 encoding almost empty #67118

Closed
era mannequin opened this issue Nov 24, 2014 · 4 comments
Labels
topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@era
Mannequin

era mannequin commented Nov 24, 2014

BPO 22929
Nosy @malemburg, @vstinner, @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2014-11-24.11:47:09.758>
created_at = <Date 2014-11-24.10:39:37.843>
labels = ['type-bug', 'invalid', 'expert-unicode']
title = 'cp874 encoding almost empty'
updated_at = <Date 2014-11-24.14:39:36.858>
user = 'https://bugs.python.org/era'

bugs.python.org fields:

activity = <Date 2014-11-24.14:39:36.858>
actor = 'r.david.murray'
assignee = 'none'
closed = True
closed_date = <Date 2014-11-24.11:47:09.758>
closer = 'era'
components = ['Unicode']
creation = <Date 2014-11-24.10:39:37.843>
creator = 'era'
dependencies = []
files = []
hgrepos = []
issue_num = 22929
keywords = []
message_count = 4.0
messages = ['231596', '231598', '231599', '231600']
nosy_count = 4.0
nosy_names = ['lemburg', 'vstinner', 'ezio.melotti', 'era']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue22929'
versions = ['Python 2.7']

@era
Mannequin Author

era mannequin commented Nov 24, 2014

I created a simple script to map character codes in the 8bit range to Unicode for simple lookup:

https://github.com/tripleee/8bit

In the generated output, on Python 2.6.6 (but corroborated on Python 2.7.6), almost all character codes come up as "undefined" in CP874.

According to http://en.wikipedia.org/wiki/ISO/IEC_8859-11, CP874 should be a superset of ISO-8859-11, with a few character codes *added* in the ISO control range.
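
The superset claim can be checked directly against the codecs that ship with Python. The following is a minimal sketch, assuming Python 3 (the report itself concerns 2.6/2.7) and assuming the standard codec names `cp874` and `iso8859_11`; the helper `decode_byte` is a hypothetical name introduced only for illustration.

```python
def decode_byte(value, codec):
    """Return the decoded character, or None if the codec leaves the byte undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None

# Upper half (0xA0-0xFF): the range where ISO-8859-11 places the Thai characters.
# A byte counts as a mismatch if the two codecs disagree, including the case
# where one defines it and the other does not.
mismatches = [
    value for value in range(0xA0, 0x100)
    if decode_byte(value, "cp874") != decode_byte(value, "iso8859_11")
]

# ISO control range (0x80-0x9F): collect the byte values that cp874 defines.
extras = {
    value: decode_byte(value, "cp874")
    for value in range(0x80, 0xA0)
    if decode_byte(value, "cp874") is not None
}

print("bytes that differ in 0xA0-0xFF:", [hex(v) for v in mismatches])
print("cp874 additions in 0x80-0x9F:", {hex(k): v for k, v in extras.items()})
```

If `mismatches` comes back empty, the two encodings agree on the Thai range, and `extras` lists the handful of characters CP874 adds in the control range.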

@era era mannequin added the topic-unicode and type-bug labels Nov 24, 2014

@malemburg
Member

I'm not sure I understand the bug report. What's the problem? :-)

The codec is a charmap codec generated from the file MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT (http://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP874.TXT)

This mapping does have quite a few undefined characters.
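
To see which byte values the codec leaves undefined, one option is to try decoding each of the 256 values and record the failures. This is a rough sketch, assuming Python 3 and the built-in `cp874` codec; the variable name `undefined` is illustrative and not taken from the reporter's script.

```python
# Collect every single-byte value that the cp874 codec refuses to decode.
undefined = []
for value in range(256):
    try:
        bytes([value]).decode("cp874")
    except UnicodeDecodeError:
        # Charmap codecs raise here for entries the mapping file marks as undefined.
        undefined.append(value)

print(f"{len(undefined)} of 256 byte values are undefined in cp874")
print(" ".join(f"{value:02X}" for value in undefined))
```

The undefined values should be confined to gaps in the 0x80-0xFF half, since the lower half is plain ASCII.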

@malemburg
Member

BTW: The table on the wiki page shows the same undefined chars.

@era
Mannequin Author

era mannequin commented Nov 24, 2014

My apologies -- I already attempted to close this as a mistake on my part, but apparently, that failed too. )-: Sorry.

@era era mannequin closed this as completed Nov 24, 2014
@era era mannequin added the invalid label Nov 24, 2014
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022