Add support of KOI8-T encoding #66871

Closed
serhiy-storchaka opened this issue Oct 20, 2014 · 10 comments
Labels: stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@serhiy-storchaka
Member

BPO 22681
Nosy @amauryfa, @jwilk, @ned-deily, @serhiy-storchaka
Files
  • encoding_koi8_t.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-05-12.20:28:32.820>
    created_at = <Date 2014-10-20.17:49:30.945>
    labels = ['type-feature', 'library']
    title = 'Add support of KOI8-T encoding'
    updated_at = <Date 2015-05-12.22:13:32.308>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2015-05-12.22:13:32.308>
    actor = 'ned.deily'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-05-12.20:28:32.820>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2014-10-20.17:49:30.945>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['36983']
    hgrepos = []
    issue_num = 22681
    keywords = ['patch']
    message_count = 10.0
    messages = ['229739', '229740', '242964', '242978', '243006', '243016', '243017', '243020', '243021', '243026']
    nosy_count = 5.0
    nosy_names = ['amaury.forgeotdarc', 'jwilk', 'ned.deily', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue22681'
    versions = ['Python 3.5']

    @serhiy-storchaka
    Member Author

    KOI8-T is a Tajik encoding partially compatible with KOI8-R. It is the default encoding of the Tajik locale tg_TJ in glibc (but in the X11 locale.alias file it is KOI8-C, bpo-20087).

    The proposed patch adds support for this encoding. I have not found an official mapping of KOI8-T and have used a table from Apple's implementation of libiconv [1]. It matches the table in Wikipedia [2] and in GNU iconv.

    [1] http://www.opensource.apple.com/source/libiconv/libiconv-4/libiconv/tests/KOI8-T.TXT
    [2] https://ru.wikipedia.org/wiki/КОИ-8 (Russian)
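A minimal sketch of what "partially compatible with KOI8-R" means in practice, assuming a Python version that ships the `koi8_t` codec (3.5 or later, per this issue). The sample strings and variable names are illustrative and not taken from the patch.

```python
# Sketch: compare the koi8_t and koi8_r codecs on Russian and Tajik text.
# Assumes Python 3.5+, where the koi8_t codec is available.

russian = "привет"   # Cyrillic letters shared by both encodings
tajik = "тоҷикӣ"     # contains the Tajik-specific letters ҷ and ӣ

# Shared Cyrillic letters encode to the same bytes in both encodings.
same_bytes = russian.encode("koi8_t") == russian.encode("koi8_r")

# KOI8-T is a single-byte encoding, so Tajik text round-trips
# with one byte per character.
tajik_bytes = tajik.encode("koi8_t")
round_trip = tajik_bytes.decode("koi8_t")

# KOI8-R has no code points for the Tajik-specific letters.
try:
    tajik.encode("koi8_r")
    koi8_r_can_encode_tajik = True
except UnicodeEncodeError:
    koi8_r_can_encode_tajik = False

print(same_bytes, round_trip == tajik, koi8_r_can_encode_tajik)
```

The byte values themselves are deliberately not shown here; the mapping table referenced in [1] is the place to check them.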

    @serhiy-storchaka added the stdlib (Python modules in the Lib dir) and type-feature (A feature request or enhancement) labels Oct 20, 2014
    @serhiy-storchaka
    Member Author

    Ah, actually Apple uses (a fork of) GNU libiconv, so I should correct the links.

    @serhiy-storchaka
    Member Author

    Ping.

    @amauryfa
    Member

    Looks good to me.

    @python-dev
    Mannequin

    python-dev mannequin commented May 12, 2015

    New changeset 78de5d040492 by Serhiy Storchaka in branch 'default':
    Issue bpo-22681: Added support for the koi8_t encoding.
    https://hg.python.org/cpython/rev/78de5d040492

    @ned-deily
    Member

    Lots of "LookupError: unknown encoding: koi8_t" test failures (on OS X 10.10) after this commit, for example, in test_codecs:

    ======================================================================
    ERROR: test_basics (test.test_codecs.BasicUnicodeTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/py/dev/3x/source/Lib/test/test_codecs.py", line 1869, in test_basics
        name = codecs.lookup(encoding).name
    LookupError: unknown encoding: koi8_t

    ======================================================================
    ERROR: test_decoder_state (test.test_codecs.BasicUnicodeTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/py/dev/3x/source/Lib/test/test_codecs.py", line 2024, in test_decoder_state
        self.check_state_handling_decode(encoding, u, u.encode(encoding))
    LookupError: unknown encoding: koi8_t

    ======================================================================
    ERROR: test_seek (test.test_codecs.BasicUnicodeTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/py/dev/3x/source/Lib/test/test_codecs.py", line 1992, in test_seek
        reader = codecs.getreader(encoding)(io.BytesIO(s.encode(encoding)))
      File "/py/dev/3x/blds/uxd/../../source/Lib/codecs.py", line 998, in getreader
        return lookup(encoding).streamreader
    LookupError: unknown encoding: koi8_t

    Ran 211 tests in 5.970s

    FAILED (errors=5, skipped=17)
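The failures above all come from `codecs.lookup()` raising `LookupError` when the encoding module is missing from the build. A small sketch of how one might probe for a codec before using it; the helper name `codec_available` is hypothetical and is not part of the test suite or the patch.

```python
import codecs


def codec_available(name):
    """Return True if codecs.lookup() can find a codec called `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Same exception type as in the tracebacks above.
        return False
    return True


# On a build that includes Lib/encodings/koi8_t.py this reports True;
# on a build missing that file it would report False instead of raising.
print("koi8_t available:", codec_available("koi8_t"))
```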

    @ned-deily
    Member

    @python-dev
    Mannequin

    python-dev mannequin commented May 12, 2015

    New changeset def3bab79c8f by Serhiy Storchaka in branch 'default':
    Added forgotten new files for issues bpo-22681 and bpo-22682.
    https://hg.python.org/cpython/rev/def3bab79c8f

    @serhiy-storchaka
    Member Author

    Thanks, Ned. I just forgot to add the new encoding files.

    @ned-deily
    Member

    All better, thanks!

    @ezio-melotti transferred this issue from another repository Apr 10, 2022