Issue33351
Created on 2018-04-25 04:36 by ethan smith, last changed 2022-04-11 14:58 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 6761 | closed | ethan smith, 2018-05-11 00:29 | |
PR 7680 | closed | ethan smith, 2018-06-13 10:29 | |
PR 18371 | open | isuruf, 2020-02-06 07:21 |
Messages (12) | |||
---|---|---|---|
msg315721 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-04-25 04:36 | |
The clang folks have been hard at work making an ABI-compatible backend to clang for Windows. Additionally, they have created a cl-compatible driver for clang, which can be used in lieu of cl itself. Clang-cl has been adopted to build Chrome on Windows (http://blog.llvm.org/2018/03/clang-is-now-used-to-build-chrome-for.html), so I think it is stable enough to be considered for use. Clang-cl has several advantages, such as computed goto support and many other optimizations, which would make Python faster on Windows. I would be happy to start contributing patches to further this goal; I already have a couple of small patches. |
|||
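For readers unfamiliar with the computed goto support mentioned above, here is a minimal sketch in C. It is not CPython's interpreter loop: the three opcodes and the names `run_switch` and `run_goto` are made up for illustration. It only contrasts ordinary `switch` dispatch with the "labels as values" extension that GCC and clang provide.

```c
#include <assert.h>

/* Hypothetical three-instruction bytecode, for illustration only. */
enum { OP_INCR = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Portable dispatch: every instruction goes back through one switch. */
long run_switch(const unsigned char *code)
{
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INCR:   acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        assert(!"unknown opcode"); return -1;
        }
    }
}

#if defined(__GNUC__) || defined(__clang__)
/* Computed-goto dispatch ("labels as values", a GNU C extension that
 * clang also implements). Each handler jumps straight to the next one.
 * The program must contain only valid opcodes: there is no bounds check. */
long run_goto(const unsigned char *code)
{
    static void *const targets[] = { &&do_incr, &&do_double, &&do_halt };
    long acc = 0;

#define DISPATCH() goto *targets[*code++]
    DISPATCH();
do_incr:
    acc += 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
#else
/* Compilers without the extension fall back to the switch version. */
long run_goto(const unsigned char *code) { return run_switch(code); }
#endif
```

Both functions return 5 for the program INCR, INCR, DOUBLE, INCR, HALT. Whether the second form is actually faster depends on the compiler and hardware, which is why the thread asks for benchmarks.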
msg315723 - (view) | Author: Alex Walters (tritium) * | Date: 2018-04-25 11:06 | |
Is this the same as the clang/llvm C1 that you can enable from inside Visual Studio? |
|||
msg315725 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-04-25 11:15 | |
No, this is provided from llvm.org. You can find it as e.g. "Clang for Windows (64-bit)" here: http://releases.llvm.org/download.html#6.0.0 The Clang/C2 in Visual Studio is very different, and deprecated anyway. |
|||
msg315726 - (view) | Author: Alex Walters (tritium) * | Date: 2018-04-25 12:57 | |
When supporting platforms comes up, there's a usual list of questions, especially for windows. I can remember two of them off the top of my head: * Are you suggesting that CPython's build system move away from MSVC as the platform compiler for Windows? * Are you able to provide a machine to run buildbots on? |
|||
msg315749 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-04-25 17:29 | |
> * Are you suggesting that CPython's build system move away from MSVC as the platform compiler for Windows?

Not immediately, I don't think we should give up on the stability that currently exists with the cl based compilation. However, I think once CPython on clang-cl becomes stable, it will be compelling to switch. Clang-cl has the benefit of backwards compatibility with existing MSVC compiled c extensions, while generating a faster interpreter (perhaps 30% faster or more!).

> * Are you able to provide a machine to run buildbots on?

I'm afraid not, I am just a college student :) |
|||
msg315920 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-04-29 21:58 | |
Feel free to start creating patches so we can get an idea of what the changes would look like. Hopefully it's not that dramatic. Be very careful making performance claims without benchmarks to back it up, and ideally against multiple sets of hardware (MSVC is designed and tested to perform well across a range of processors, often by engineers who work for the manufacturer - intuition would suggest that an open source compiler is probably not 30% better all the time). Don't focus on the number right now, but do try to collect the justification before you expect or encourage others to do the work. Since the ABI is compatible, there should be no problem enabling extensions to be built using this compiler (assuming someone is willing to become a distutils maintainer, as there are currently none). You don't need to ask here to create a third-party library that enables this. I haven't heard any complaints about access to the compilers being an issue recently, so the only reasons to switch the interpreter itself would be source compatibility (essentially, the clang clib is better than our custom Win32 code) or performance. But we need a positive reason to switch support, not just the ability. |
|||
msg315924 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-04-30 02:16 | |
> Feel free to start creating patches so we can get an idea of what the changes would look like. Hopefully it's not that dramatic.

Okay, will do. I have a few smaller patches to start with. Clang-cl tries to be as compatible as possible with cl, so I don't expect drastic changes. I'm currently trying to figure out an include issue with timeval, but so far the patches have been few and small.

> Be very careful making performance claims without benchmarks to back it up, and ideally against multiple sets of hardware (MSVC is designed and tested to perform well across a range of processors, often by engineers who work for the manufacturer - intuition would suggest that an open source compiler is probably not 30% better all the time). Don't focus on the number right now, but do try to collect the justification before you expect or encourage others to do the work.

I did not mean to say it would make Python 30% faster in all cases, I meant "up to 30% faster". This number is based on benchmarks of CPython with and without computed goto, and my own experiments of benchmarks comparing CPython in the WSL, and native Windows CPython releases on x86_64. But your point is well taken, and I will of course benchmark Python compiled with clang-cl once I have a complete working version.

> Since the ABI is compatible, there should be no problem enabling extensions to be built using this compiler (assuming someone is willing to become a distutils maintainer, as there are currently none). You don't need to ask here to create a third-party library that enables this.

When you say "someone to become a distutils maintainer" you mean for clang-cl specifically? If that is the case, I'm happy to add support and commit to continuing to work on clang-cl support in distutils, as I expect to use it a fair amount.

> I haven't heard any complaints about access to the compilers being an issue recently, so the only reasons to switch the interpreter itself would be source compatibility (essentially, the clang clib is better than our custom Win32 code) or performance. But we need a positive reason to switch support, not just the ability.

I agree there should be a good reason to move away from the MSVC compiler. The decision to move can be re-evaluated when there is a good argument to warrant it. |
|||
msg316250 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2018-05-07 04:34 | |
FWIW, I would _love_ to see this. But I don't wrangle Windows myself so I can't usefully offer anything other than being happy to volunteer to run a Clang on Windows buildbot VM once there is something to actually be run. |
|||
msg317656 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-05-25 04:23 | |
After wrangling with some missing compiler intrinsics, I've been able to get CPython to build with an almost vanilla clang-cl! I plan on upstreaming the patches to the LLVM project once I clean them up a bit. After that I will clean up the CPython patches and send a PR. I also ran performance with master built on MSVC compared to my branch on clang-cl with computed-goto enabled (I wasn't sure if there are other things that may be possible to turn on, computed goto seemed an obvious win). The results are decent, but some things, like json loads, are much slower (not sure why that is). The full report:

msvc.json
=========
Performance version: 0.6.1
Report on Windows-10-10.0.17672-SP0
Number of logical CPUs: 12
Start date: 2018-05-24 03:40:09.082701
End date: 2018-05-24 04:08:57.993717

clang2goto.json
===============
Performance version: 0.6.1
Report on Windows-10-10.0.17672-SP0
Number of logical CPUs: 12
Start date: 2018-05-24 04:29:01.214005
End date: 2018-05-24 04:57:08.774299

Benchmark | msvc.json (mean +- std dev) | clang2goto.json (mean +- std dev) | Change | Significance |
---|---|---|---|---|
2to3 | 675 ms +- 31 ms | 655 ms +- 32 ms | 1.03x faster | Significant (t=3.55) |
chameleon | 19.5 ms +- 0.5 ms | 18.1 ms +- 0.7 ms | 1.08x faster | Significant (t=13.19) |
chaos | 230 ms +- 6 ms | 209 ms +- 8 ms | 1.10x faster | Significant (t=16.39) |
crypto_pyaes | 212 ms +- 8 ms | 197 ms +- 8 ms | 1.07x faster | Significant (t=9.72) |
deltablue | 15.2 ms +- 0.6 ms | 14.2 ms +- 0.5 ms | 1.07x faster | Significant (t=10.23) |
django_template | 222 ms +- 9 ms | 210 ms +- 8 ms | 1.06x faster | Significant (t=8.10) |
dulwich_log | 235 ms +- 13 ms | 230 ms +- 12 ms | 1.02x faster | Significant (t=2.18) |
fannkuch | 905 ms +- 11 ms | 802 ms +- 15 ms | 1.13x faster | Significant (t=42.95) |
float | 226 ms +- 9 ms | 197 ms +- 8 ms | 1.15x faster | Significant (t=18.71) |
go | 485 ms +- 10 ms | 445 ms +- 8 ms | 1.09x faster | Significant (t=24.60) |
hexiom | 19.9 ms +- 0.9 ms | 18.3 ms +- 0.8 ms | 1.08x faster | Significant (t=9.51) |
html5lib | 156 ms +- 9 ms | 149 ms +- 9 ms | 1.05x faster | Significant (t=4.31) |
json_dumps | 23.4 ms +- 1.2 ms | 23.0 ms +- 1.1 ms | 1.02x faster | Not significant |
json_loads | 49.3 us +- 2.2 us | 93.2 us +- 8.7 us | 1.89x slower | Significant (t=-37.79) |
logging_format | 25.3 us +- 1.3 us | 23.4 us +- 1.2 us | 1.08x faster | Significant (t=8.48) |
logging_silent | 368 ns +- 14 ns | 340 ns +- 21 ns | 1.08x faster | Significant (t=8.69) |
logging_simple | 23.1 us +- 1.4 us | 20.6 us +- 0.9 us | 1.12x faster | Significant (t=11.66) |
mako | 36.7 ms +- 1.8 ms | 36.0 ms +- 1.7 ms | 1.02x faster | Not significant |
meteor_contest | 189 ms +- 9 ms | 175 ms +- 9 ms | 1.08x faster | Significant (t=9.09) |
nbody | 274 ms +- 12 ms | 222 ms +- 8 ms | 1.24x faster | Significant (t=28.22) |
nqueens | 198 ms +- 8 ms | 174 ms +- 8 ms | 1.14x faster | Significant (t=16.67) |
pathlib | 343 ms +- 19 ms | 338 ms +- 18 ms | 1.02x faster | Not significant |
pickle | 20.9 us +- 0.8 us | 19.9 us +- 0.5 us | 1.05x faster | Significant (t=8.91) |
pickle_dict | 50.0 us +- 1.9 us | 51.2 us +- 3.0 us | 1.02x slower | Significant (t=-2.62) |
pickle_list | 7.61 us +- 0.32 us | 7.06 us +- 0.36 us | 1.08x faster | Significant (t=8.92) |
pickle_pure_python | 964 us +- 52 us | 879 us +- 43 us | 1.10x faster | Significant (t=9.72) |
pidigits | 257 ms +- 5 ms | 254 ms +- 9 ms | 1.01x faster | Not significant |
python_startup | 69.6 ms +- 8.3 ms | 69.5 ms +- 6.3 ms | 1.00x faster | Not significant |
python_startup_no_site | 57.7 ms +- 6.6 ms | 58.2 ms +- 6.0 ms | 1.01x slower | Not significant |
raytrace | 1.00 sec +- 0.02 sec | 0.94 sec +- 0.02 sec | 1.07x faster | Significant (t=21.49) |
regex_compile | 335 ms +- 5 ms | 306 ms +- 10 ms | 1.10x faster | Significant (t=20.75) |
regex_dna | 237 ms +- 7 ms | 266 ms +- 7 ms | 1.13x slower | Significant (t=-23.71) |
regex_effbot | 4.42 ms +- 0.17 ms | 4.82 ms +- 0.20 ms | 1.09x slower | Significant (t=-12.07) |
regex_v8 | 45.2 ms +- 15.5 ms | 39.7 ms +- 2.8 ms | 1.14x faster | Significant (t=2.74) |
richards | 152 ms +- 8 ms | 142 ms +- 9 ms | 1.07x faster | Significant (t=6.19) |
scimark_fft | 665 ms +- 12 ms | 593 ms +- 12 ms | 1.12x faster | Significant (t=32.36) |
scimark_lu | 327 ms +- 11 ms | 324 ms +- 11 ms | 1.01x faster | Not significant |
scimark_monte_carlo | 205 ms +- 7 ms | 192 ms +- 8 ms | 1.07x faster | Significant (t=9.22) |
scimark_sor | 386 ms +- 11 ms | 351 ms +- 11 ms | 1.10x faster | Significant (t=18.36) |
scimark_sparse_mat_mult | 8.39 ms +- 0.31 ms | 7.19 ms +- 0.40 ms | 1.17x faster | Significant (t=18.44) |
spectral_norm | 279 ms +- 8 ms | 238 ms +- 7 ms | 1.17x faster | Significant (t=29.08) |
sqlalchemy_declarative | 250 ms +- 12 ms | 245 ms +- 10 ms | 1.02x faster | Significant (t=2.73) |
sqlalchemy_imperative | 47.2 ms +- 2.4 ms | 47.4 ms +- 2.5 ms | 1.00x slower | Not significant |
sqlite_synth | 5.37 us +- 0.25 us | 5.22 us +- 0.21 us | 1.03x faster | Significant (t=3.60) |
sympy_expand | 710 ms +- 16 ms | 671 ms +- 20 ms | 1.06x faster | Significant (t=11.92) |
sympy_integrate | 31.9 ms +- 1.3 ms | 30.5 ms +- 1.4 ms | 1.05x faster | Significant (t=5.51) |
sympy_str | 313 ms +- 9 ms | 297 ms +- 8 ms | 1.05x faster | Significant (t=10.39) |
sympy_sum | 151 ms +- 6 ms | 143 ms +- 6 ms | 1.05x faster | Significant (t=6.70) |
telco | 11.2 ms +- 0.4 ms | 10.8 ms +- 0.4 ms | 1.04x faster | Significant (t=5.45) |
unpack_sequence | 87.1 ns +- 3.8 ns | 67.3 ns +- 4.3 ns | 1.29x faster | Significant (t=26.90) |
unpickle | 27.4 us +- 1.9 us | 25.3 us +- 1.5 us | 1.08x faster | Significant (t=6.46) |
unpickle_list | 6.81 us +- 0.29 us | 6.15 us +- 0.30 us | 1.11x faster | Significant (t=12.38) |
unpickle_pure_python | 740 us +- 31 us | 696 us +- 38 us | 1.06x faster | Significant (t=6.82) |
xml_etree_generate | 197 ms +- 13 ms | 190 ms +- 11 ms | 1.04x faster | Significant (t=3.34) |
xml_etree_iterparse | 177 ms +- 12 ms | 174 ms +- 11 ms | 1.02x faster | Not significant |
xml_etree_parse | 229 ms +- 13 ms | 229 ms +- 13 ms | 1.00x faster | Not significant |
xml_etree_process | 161 ms +- 11 ms | 154 ms +- 10 ms | 1.04x faster | Significant (t=3.45) |
|||
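To read the report above: each entry gives the MSVC mean, then the clang-cl mean, then a ratio. A small sketch of that arithmetic in C, assuming the ratio is simply one mean divided by the other (the helper name `speedup` is hypothetical and not part of the benchmark tool):

```c
#include <assert.h>

/* Ratio of the baseline mean to the changed mean, both in the same unit.
 * A result above 1 reads as "N.NNx faster"; below 1, its inverse reads
 * as "N.NNx slower". Assumes both means are positive. */
double speedup(double baseline_mean, double changed_mean)
{
    assert(baseline_mean > 0.0 && changed_mean > 0.0);
    return baseline_mean / changed_mean;
}
```

For 2to3, `speedup(675.0, 655.0)` is about 1.03. For json_loads, `1.0 / speedup(49.3, 93.2)` is about 1.89, matching "1.89x slower". Because the printed means are themselves rounded, a ratio recomputed this way can differ from the report in the last digit: nbody gives about 1.23 against the reported 1.24x.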
msg318890 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-06-07 04:55 | |
I sent my patches to clang-cl upstream [1]. It seems they want to implement Hardware Lock Elision (which is used by some MSVC compiler intrinsics in pyatomic.h) before implementing the needed intrinsics. I have found temporary replacements that do not elide locks, but have effectively the same functional purpose as those intrinsics, so I should have a full PR for CPython ready soon. [1] https://reviews.llvm.org/D47672 |
|||
msg321353 - (view) | Author: Ethan Smith (ethan smith) * | Date: 2018-07-09 23:52 | |
I just updated the PR with some more information after trying this on every VS project. It seems that clang-cl still fails on some projects/tests, but I don't think that is a big problem. I was mostly interested in getting Python core to build with clang-cl, which it does (and passes all tests with it). I will keep iterating on this as time allows. I also think it would be helpful to have an idea of the expectation for review/merge. |
|||
msg360048 - (view) | Author: Gisle Vanem (gvanem) | Date: 2020-01-15 12:39 | |
I will add to this issue my *only* compile problem I had using clang-cl ver 9. It has an issue with parsing the `frame_t` structure in Modules/_tracemalloc.c:

Modules/_tracemalloc.c(64,11): error: declaration of anonymous struct must be a definition
typedef struct
^
Modules/_tracemalloc.c(64,3): warning: typedef requires a name [-Wmissing-declarations]
typedef struct
^~~~~~~
Modules/_tracemalloc.c(77,11): warning: #pragma pack(pop, ...) failed: stack empty [-Wignored-pragmas]
#pragma pack(pop)
^
Modules/_tracemalloc.c(110,5): error: unknown type name 'frame_t'
frame_t frames[1];
^

------------------

I commented on and suggested a fix for _tracemalloc.c here: https://github.com/python/cpython/commit/8d59eb1b66c51b2b918da9881c57d07d08df43b7#r36794938 |
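A minimal sketch of one way to declare a packed struct so that no pragma sits inside the declaration itself. It assumes the errors above come from a `#pragma pack` placed between `typedef struct` and its opening brace, which the thread does not confirm. The type name `example_frame_t`, its fields, and the 4-byte packing are illustrative assumptions, not the actual fix proposed in the linked comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packed frame record. The field names and
 * the 4-byte packing are assumptions, not copied from _tracemalloc.c. */
#pragma pack(push, 4)   /* pragma comes before the declaration starts */
typedef struct {
    const void *filename;   /* pointer-sized member */
    unsigned int lineno;
} example_frame_t;
#pragma pack(pop)       /* matching pop after the declaration ends */

/* Size of the packed record. With 4-byte packing the struct's alignment
 * is capped at 4, so no padding is added after lineno. */
size_t example_frame_size(void)
{
    assert(sizeof(example_frame_t) % 4 == 0);
    return sizeof(example_frame_t);
}
```

On a typical 64-bit target this yields 12 bytes instead of the 16 an unpacked struct would occupy. The push and pop stay balanced because both directives sit outside the `typedef`.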
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:59 | admin | set | github: 77532 |
2020-02-06 07:21:40 | isuruf | set | pull_requests: + pull_request17747 |
2020-01-15 12:39:22 | gvanem | set | versions: + Python 3.9, - Python 3.8; nosy: + gvanem; messages: + msg360048; type: enhancement -> compile error |
2018-07-09 23:52:23 | ethan smith | set | messages: + msg321353 |
2018-06-13 10:29:58 | ethan smith | set | pull_requests: + pull_request7292 |
2018-06-07 04:55:45 | ethan smith | set | messages: + msg318890 |
2018-05-25 04:23:37 | ethan smith | set | messages: + msg317656 |
2018-05-11 00:29:12 | ethan smith | set | keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request6448 |
2018-05-07 04:34:33 | gregory.p.smith | set | nosy: + gregory.p.smith; messages: + msg316250 |
2018-04-30 02:16:02 | ethan smith | set | messages: + msg315924 |
2018-04-29 21:58:15 | steve.dower | set | messages: + msg315920 |
2018-04-27 19:01:50 | terry.reedy | set | nosy: + paul.moore, tim.golden, zach.ware, steve.dower; components: + Windows; stage: test needed |
2018-04-25 17:29:59 | ethan smith | set | messages: + msg315749 |
2018-04-25 12:57:26 | tritium | set | messages: + msg315726 |
2018-04-25 11:15:13 | ethan smith | set | messages: + msg315725 |
2018-04-25 11:06:04 | tritium | set | nosy: + tritium; messages: + msg315723 |
2018-04-25 04:57:10 | pmpp | set | nosy: + pmpp |
2018-04-25 04:36:44 | ethan smith | create |