Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctypes: Speed up find_library() on Linux by 500% #55467

Closed
jonashaag mannequin opened this issue Feb 20, 2011 · 17 comments
Closed

ctypes: Speed up find_library() on Linux by 500% #55467

jonashaag mannequin opened this issue Feb 20, 2011 · 17 comments
Labels
performance Performance or resource usage topic-ctypes

Comments

@jonashaag
Copy link
Mannequin

jonashaag mannequin commented Feb 20, 2011

BPO 11258
Nosy @theller, @pitrou, @vapier, @jonashaag
Files
  • faster-find-library1.diff: does all filtering in Python (more I/O)
  • faster-find-library2.diff: Pre-filters ldconfig's output using grep (less I/O but requires grep)
  • faster-find-library1-py3k.diff
  • faster-find-library1-py3k-with-escaped-name.diff: with re.escape(name) as suggested by Antoine Pitrou
  • faster-find-library1-py3k-with-escaped-name-try2.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-04-24.14:20:47.375>
    created_at = <Date 2011-02-20.16:03:58.470>
    labels = ['ctypes', 'performance']
    title = 'ctypes: Speed up find_library() on Linux by 500%'
    updated_at = <Date 2011-04-24.22:18:57.382>
    user = 'https://github.com/jonashaag'

    bugs.python.org fields:

    activity = <Date 2011-04-24.22:18:57.382>
    actor = 'Arfrever'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-04-24.14:20:47.375>
    closer = 'pitrou'
    components = ['ctypes']
    creation = <Date 2011-02-20.16:03:58.470>
    creator = 'jonash'
    dependencies = []
    files = ['20808', '20809', '20810', '20874', '20935']
    hgrepos = []
    issue_num = 11258
    keywords = ['patch']
    message_count = 17.0
    messages = ['128910', '128911', '128912', '128913', '129313', '129315', '129512', '129514', '129627', '129630', '129638', '134025', '134309', '134310', '134352', '134354', '134355']
    nosy_count = 6.0
    nosy_names = ['theller', 'pitrou', 'vapier', 'Arfrever', 'jonash', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue11258'
    versions = ['Python 3.3']

    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Feb 20, 2011

    (This applies to all versions of Python I investigated, although the attached patch is for Python 2.7)

    I wondered why import uuid took so long, so I did some profiling.

    It turns out that find_library wastes at lot of time because of this crazy regular expression in _findSoname_ldconfig.

    A quick look at the ldconfig source (namely, the print_cache routine which is invoked when you call ldconfig -p, http://sourceware.org/git/?p=glibc.git;a=blob;f=elf/cache.c#l127) confirmed my suspicion that the ldconfig's output could easily be parsed without such a regex monster.

    I attached two patches that fix this problem. Choose one! ;-)

    The ctypes tests pass with my fixes, and here comes some benchmarking:

    $ cat benchmark_ctypes.py 
    from ctypes.util import find_library
    for i in xrange(10):
      for lib in ['mm', 'c', 'bz2', 'uuid']:
        find_library(lib)

    # Current implementation
    $ time python benchmark_ctypes.py
    real 0m11.813s
    ...
    $ time python -c 'import uuid'
    real 0m0.625s
    ...

    # With my patch applied
    $ cp /tmp/ctypesutil.py ctypes/util.py
    $ time python benchmark_ctypes.py
    real 0m1.785s
    ...
    $ time python -c 'import uuid'
    real 0m0.182s
    ...

    @jonashaag jonashaag mannequin assigned theller Feb 20, 2011
    @jonashaag jonashaag mannequin added topic-ctypes performance Performance or resource usage labels Feb 20, 2011
    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Feb 20, 2011

    (might also be related to http://bugs.python.org/issue11063)

    @pitrou
    Copy link
    Member

    pitrou commented Feb 20, 2011

    Here is the first patch adapted for py3k.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 20, 2011

    Actually, re.escape is probably still needed around name.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 24, 2011

    Thanks for the new patch. Looking again, I wonder if there's a reason the original regexp was so complicated. ldconfig output here has lines such as:

    libBrokenLocale.so.1 (libc6,x86-64, OS ABI: Linux 2.6.9) =\> /lib64/libBrokenLocale.so.1
    libBrokenLocale.so.1 (libc6, OS ABI: Linux 2.6.9) =\> /lib/libBrokenLocale.so.1
    libBrokenLocale.so (libc6,x86-64, OS ABI: Linux 2.6.9) =\> /usr/lib64/libBrokenLocale.so
    libBrokenLocale.so (libc6, OS ABI: Linux 2.6.9) =\> /usr/lib/libBrokenLocale.so
    

    Ideally we would factor out the parsing to a separate private function, and have tests for it.

    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Feb 24, 2011

    As far as I can tell, it doesn't matter.

    We're looking for the part after the => in any case - ignoring the ABI/architecture information - so the regex would chose the first of those entries.

    @pitrou
    Copy link
    Member

    pitrou commented Feb 26, 2011

    Ok, I think you're right.
    I've committed the patch to 3.3 in r88639 after having added a minimal test. Is there a full name I should credit?
    Thank you for contributing!

    @pitrou
    Copy link
    Member

    pitrou commented Feb 26, 2011

    Reopening and reverted the commit in r88640. The patch changes behaviour by turning the previous unrooted filename ('libc.so.6') into a full path ('/lib64/libc.so.6'). This breaks builds where multiple versions of a library are available and only one is loadable, e.g. 32-bit builds on 64-bit machines:

    ======================================================================
    ERROR: test_load (ctypes.test.test_loading.LoaderTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home2/buildbot2/slave/hg-3.x.loewis-parallel/build/Lib/ctypes/test/test_loading.py", line 26, in test_load
        CDLL(libc_name)
      File "/home2/buildbot2/slave/hg-3.x.loewis-parallel/build/Lib/ctypes/__init__.py", line 340, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: /lib64/libc.so.6: wrong ELF class: ELFCLASS64

    @pitrou pitrou reopened this Feb 26, 2011
    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Feb 27, 2011

    Humm. Would be great to have the ldconfig -p output of such a machine... I can't get ldconfig to recognize 64-bit libraries on my 32-bit machines, so I have no output to test against...

    @pitrou
    Copy link
    Member

    pitrou commented Feb 27, 2011

    Here is an excerpt:

    libc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.9) =\> /lib64/libc.so.6
    libc.so.6 (libc6, OS ABI: Linux 2.6.9) =\> /lib/libc.so.6
    

    The "OS ABI" thing is not always there:

    libdrm.so.2 (libc6,x86-64) =\> /usr/lib64/libdrm.so.2
    libdrm.so.2 (libc6) =\> /usr/lib/libdrm.so.2
    

    As you see, there are two of them with the same name but in a different path. If you return the absolute path, there is a 50% possibility that you are returning the wrong one ;)

    There seem to be two key differences between the original implementation and yours:

    • the orig impl matches the abi_type at the beginning of the parentheses, yours simply ignores the abi_type (that should have caught my eye, but that regex looked so much like magic that I didn't try to make sense of it :-))
    • the orig impl returns the file name from the beginning of the matched line, yours returns the full path from the end of the line

    I guess it should be doable to retain the speed benefit while implementing a matching algorithm closer to the original one.

    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Feb 27, 2011

    the orig impl matches the abi_type at the beginning of the parentheses,
    yours simply ignores the abi_type (that should have caught my eye, but that
    regex looked so much like magic that I didn't try to make sense of it :-))

    Same here. :)

    The version I attached seems to work for me. It's some kind of compromise -- basically it's the original regex but with the unneccessary, performance-decreasing cruft stripped away.

    btw, "Jonas H." is perfectly fine - I don't care about being honored, I just want to import uuid without waiting forever. :-)

    @jonashaag
    Copy link
    Mannequin Author

    jonashaag mannequin commented Apr 19, 2011

    *push* Any way to get this into the codebase?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 23, 2011

    New changeset 19d9f0a177de by Antoine Pitrou in branch 'default':
    Issue bpo-11258: Speed up ctypes.util.find_library() under Linux by a factor
    http://hg.python.org/cpython/rev/19d9f0a177de

    @pitrou
    Copy link
    Member

    pitrou commented Apr 23, 2011

    I committed a modified patch. Hopefully the buildbots won't break this time :)

    @pitrou pitrou closed this as completed Apr 24, 2011
    @Arfrever
    Copy link
    Mannequin

    Arfrever mannequin commented Apr 24, 2011

    19d9f0a177de causes that test_ctypes hangs when test suite is run in Gentoo sandbox. Please reopen this issue.

    $ sandbox python3.3 -B -m test.regrtest --timeout=10 -v test_ctypes
    == CPython 3.3a0 (default:020ebe0be33e+, Apr 24 2011, 17:52:58) [GCC 4.5.2]
    ==   Linux-${version}
    ==   /tmp/test_python_23902
    Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0)
    [1/1] test_ctypes
    Timeout (0:00:10)!
    Thread 0x00007fc205f54700:
      File "/usr/lib64/python3.3/subprocess.py", line 466 in _eintr_retry_call
      File "/usr/lib64/python3.3/subprocess.py", line 1412 in _execute_child
      File "/usr/lib64/python3.3/subprocess.py", line 766 in __init__
      File "/usr/lib64/python3.3/ctypes/util.py", line 198 in _findSoname_ldconfig
      File "/usr/lib64/python3.3/ctypes/util.py", line 206 in find_library
      File "/usr/lib64/python3.3/ctypes/test/test_find.py", line 15 in <module>
      File "/usr/lib64/python3.3/ctypes/test/__init__.py", line 64 in get_tests
      File "/usr/lib64/python3.3/test/test_ctypes.py", line 11 in test_main
      File "/usr/lib64/python3.3/test/regrtest.py", line 1094 in runtest_inner
      File "/usr/lib64/python3.3/test/regrtest.py", line 887 in runtest
      File "/usr/lib64/python3.3/test/regrtest.py", line 587 in _runtest
      File "/usr/lib64/python3.3/test/regrtest.py", line 711 in main
      File "/usr/lib64/python3.3/test/regrtest.py", line 1672 in <module>
      File "/usr/lib64/python3.3/runpy.py", line 73 in _run_code
      File "/usr/lib64/python3.3/runpy.py", line 160 in _run_module_as_main

    @pitrou
    Copy link
    Member

    pitrou commented Apr 24, 2011

    19d9f0a177de causes that test_ctypes hangs when test suite is run in
    Gentoo sandbox. Please reopen this issue.

    I'd prefer having a separate issue (which you already opened :-)).
    The fact that all buildbots work fine after the change suggests to me
    that the issue is not really the patch I committed.

    @Arfrever
    Copy link
    Mannequin

    Arfrever mannequin commented Apr 24, 2011

    OK. We will use issue bpo-11915.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage topic-ctypes
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants