Message145205
Marc-Andre: gcc normally does not unroll loops unless -funroll-loops is given on the command line. With that option, it unrolls many loops, putting 8 iterations of the original loop into each iteration of the unrolled loop. This typically causes significant code bloat, which is why unrolling is disabled by default and left to the programmer.
For those who want to experiment with this, I attach a C file with just the code in question. Compile it with your favorite compiler settings and see what the compiler generates. clang, on an x64 system, compiles the original loop into
LBB0_2: ## =>This Inner Loop Header: Depth=1
movzbl (%rdi), %eax
movw %ax, (%rdx)
incq %rdi
addq $2, %rdx
decq %rsi
jne LBB0_2
and the unrolled loop into
LBB1_2: ## %.lr.ph6
## =>This Inner Loop Header: Depth=1
movzbl (%rdi,%rcx), %r8d
movw %r8w, (%rdx)
movzbl 1(%rdi,%rcx), %r8d
movw %r8w, 2(%rdx)
movzbl 2(%rdi,%rcx), %r8d
movw %r8w, 4(%rdx)
movzbl 3(%rdi,%rcx), %r8d
movw %r8w, 6(%rdx)
addq $8, %rdx
addq $4, %rcx
cmpq %rax, %rcx
jl LBB1_2
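The attached file is not reproduced in this message. As a rough sketch of the kind of C that could produce listings like the two above, here is a byte-to-16-bit widening copy in a plain form and in a form manually unrolled by 4. The function names, the parameter order (source, count, destination), and the tail loop are assumptions made for illustration; they are not taken from the attachment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plain loop: widen one 8-bit unit to 16 bits per iteration. */
void widen_simple(const uint8_t *src, size_t n, uint16_t *dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical manual unrolling by 4: four load/store pairs per
 * iteration, followed by a tail loop for the remaining 0-3 units. */
void widen_unrolled(const uint8_t *src, size_t n, uint16_t *dst)
{
    size_t i = 0;
    size_t limit = n - (n % 4);   /* largest multiple of 4 that is <= n */

    for (; i < limit; i += 4) {
        dst[i]     = src[i];
        dst[i + 1] = src[i + 1];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 3];
    }
    for (; i < n; i++)            /* leftover units */
        dst[i] = src[i];
}
```

To compare compiler output, one could compile this with -O2 -S, once without and once with -funroll-loops, and diff the generated assembly. The exact instructions will depend on the compiler version and target.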
Date | User | Action | Args
2011-10-09 03:06:10 | loewis | set | recipients: + loewis, lemburg, pitrou, meador.inge
2011-10-09 03:06:10 | loewis | set | messageid: <1318129570.86.0.821593112149.issue13136@psf.upfronthosting.co.za>
2011-10-09 03:06:10 | loewis | link | issue13136 messages
2011-10-09 03:06:09 | loewis | create |