Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distutils generates the wrong export symbol for unicode module names #83613

Closed
da-woods mannequin opened this issue Jan 23, 2020 · 12 comments
Closed

Distutils generates the wrong export symbol for unicode module names #83613

da-woods mannequin opened this issue Jan 23, 2020 · 12 comments
Labels
3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@da-woods
Copy link
Mannequin

da-woods mannequin commented Jan 23, 2020

BPO 39432
Nosy @scoder, @ned-deily, @merwok, @encukou, @ambv, @zooba, @dstufft, @miss-islington, @da-woods
PRs
  • bpo-39432: Implement PEP-489 algorithm for non-ascii "PyInit_*" symbol names in distutils #18150
  • [3.8] bpo-39432: Implement PEP-489 algorithm for non-ascii "PyInit_*" symbol names in distutils (GH-18150) #18546
  • [3.8] bpo-39555: Fix distutils test to handle _d suffix on Windows debug build (GH-18357) #18548
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-02-18.13:17:31.345>
    created_at = <Date 2020-01-23.13:48:15.580>
    labels = ['3.8', 'type-bug', 'library', '3.9']
    title = 'Distutils generates the wrong export symbol for unicode module names'
    updated_at = <Date 2021-04-20.09:24:11.553>
    user = 'https://github.com/da-woods'

    bugs.python.org fields:

    activity = <Date 2021-04-20.09:24:11.553>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-02-18.13:17:31.345>
    closer = 'scoder'
    components = ['Distutils']
    creation = <Date 2020-01-23.13:48:15.580>
    creator = 'da-woods'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39432
    keywords = ['patch']
    message_count = 12.0
    messages = ['360555', '361352', '361416', '361434', '361435', '361436', '361437', '361460', '361608', '361800', '362206', '391314']
    nosy_count = 9.0
    nosy_names = ['scoder', 'ned.deily', 'eric.araujo', 'petr.viktorin', 'lukasz.langa', 'steve.dower', 'dstufft', 'miss-islington', 'da-woods']
    pr_nums = ['18150', '18546', '18548']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue39432'
    versions = ['Python 3.8', 'Python 3.9']

    @da-woods
    Copy link
    Mannequin Author

    da-woods mannequin commented Jan 23, 2020

    Distuitls generates "export symbols" for extension modules to help ensure that they have have the correct linkage on Windows.

    initfunc_name = "PyInit_" + ext.name.split('.')[-1]

    It generates the correct symbol in most causes, but if the filename contains unicode characters then it creates the wrong symbol, causing linkage errors.

    The behaviour should be updated to reflect PEP-489: https://www.python.org/dev/peps/pep-0489/#export-hook-name

    @da-woods da-woods mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 23, 2020
    @miss-islington
    Copy link
    Contributor

    New changeset 9538bc9 by Stefan Behnel in branch 'master':
    bpo-39432: Implement PEP-489 algorithm for non-ascii "PyInit_*" symbol names in distutils (GH-18150)
    9538bc9

    @scoder
    Copy link
    Contributor

    scoder commented Feb 5, 2020

    Ok, this is merged into 3.9. To which versions should we backport it?
    Definitely 3.8, definitely not 3.5, probably not 3.6 (since it's not a security issue). Ned, what about 3.7?

    @vstinner
    Copy link
    Member

    vstinner commented Feb 5, 2020

    The test fails on Windows. Example on AMD64 Windows8.1 Refleaks 3.x:
    https://buildbot.python.org/all/#/builders/157/builds/76

    ======================================================================
    FAIL: test_unicode_module_names (distutils.tests.test_build_ext.BuildExtTestCase)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\distutils\tests\test_build_ext.py", line 315, in test_unicode_module_names
        self.assertRegex(cmd.get_ext_filename(modules[0].name), r'foo\..*')
    AssertionError: Regex didn't match: 'foo\\..*' not found in 'foo_d.cp39-win_amd64.pyd'

    ======================================================================
    FAIL: test_unicode_module_names (distutils.tests.test_build_ext.ParallelBuildExtTestCase)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "D:\buildarea\3.x.ware-win81-release.refleak\build\lib\distutils\tests\test_build_ext.py", line 315, in test_unicode_module_names
        self.assertRegex(cmd.get_ext_filename(modules[0].name), r'foo\..*')
    AssertionError: Regex didn't match: 'foo\\..*' not found in 'foo_d.cp39-win_amd64.pyd'

    @vstinner
    Copy link
    Member

    vstinner commented Feb 5, 2020

    Python 3.6 doesn't accept bugfixes anymore:
    https://devguide.python.org/#status-of-python-branches

    The bugfix can go into 3.7 and 3.8.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 5, 2020

    I put a breakpoint before the error:

    test_unicode_module_names (distutils.tests.test_build_ext.BuildExtTestCase) ... > c:\vstinner\python\master\lib\distutils\tests
    \test_build_ext.py(316)test_unicode_module_names()
    -> self.assertRegex(cmd.get_ext_filename(modules[0].name), r'foo\..*')
    (Pdb) p modules
    [<distutils.extension.Extension('foo') at 0x274003035f0>, <distutils.extension.Extension('föö') at 0x27400303730>]
    (Pdb) p modules[0].name
    'foo'
    (Pdb) p modules[1].name
    'föö'

    @vstinner
    Copy link
    Member

    vstinner commented Feb 5, 2020

    On Windows, names get a "_d" suffix for debug. Extract of build_ext.py:

        def get_libraries(self, ext):
            """Return the list of libraries to link against when building a
            shared extension.  On most platforms, this is just 'ext.libraries';
            on Windows, we add the Python library (eg. python20.dll).
            """
            # The python library is always needed on Windows.  For MSVC, this
            # is redundant, since the library is mentioned in a pragma in
            # pyconfig.h that MSVC groks.  The other Windows compilers all seem
            # to need it mentioned explicitly, though, so that's what we do.
            # Append '_d' to the python import library on debug builds.
            if sys.platform == "win32":
                from distutils._msvccompiler import MSVCCompiler
                if not isinstance(self.compiler, MSVCCompiler):
                    template = "python%d%d"
                    if self.debug:
                        template = template + '_d'
    (...)

    @zooba
    Copy link
    Member

    zooba commented Feb 5, 2020

    bpo-39555 and PR 18357 have the fix for the buildbot.

    @ned-deily
    Copy link
    Member

    We should not be changing Distutils 3.7 behavior at this late point in its life cycle, particularly since AFAIK this issue has not come up before. Let's see what Łukasz thinks for 3.8.

    @ned-deily ned-deily removed 3.7 (EOL) end of life labels Feb 7, 2020
    @encukou
    Copy link
    Member

    encukou commented Feb 11, 2020

    I'm with Stefan on "Definitely 3.8". It's a bug fix (for a rarely used feature).

    @scoder
    Copy link
    Contributor

    scoder commented Feb 18, 2020

    New changeset 5bf58ce by Miss Islington (bot) in branch '3.8':
    bpo-39432: Implement PEP-489 algorithm for non-ascii "PyInit_*" symbol names in distutils (GH-18150) (GH-18546)
    5bf58ce

    @scoder scoder closed this as completed Feb 18, 2020
    @scoder scoder closed this as completed Feb 18, 2020
    @da-woods
    Copy link
    Mannequin Author

    da-woods mannequin commented Apr 17, 2021

    It looks like this wasn't quite fixed by the patch: the patch encoded _<module_name> when it should have encoded <module_name>.

    I've submitted a revised version to setuptools pypa/setuptools#2646. My impression is that distutils is no longer updated and so there's no value in submitting this patch to Python too. However, I can do so if it would be used.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants