Factbites
 Where results make sense

Topic: Loop unrolling



  
  Loop nest optimization - Wikipedia, the free encyclopedia
Loop nest optimization (LNO) is a special case of loop transformation, dealing with nested loops, that makes possible large reductions in the cache bandwidth necessary for some pervasive algorithms.
The loop requires registers to hold both the accumulators and the loaded and reused A and B values.
That trick is reducing the size of the stripe of the B matrix by blocking the k loop, so that the stripe is of size ib x kb.
en.wikipedia.org /wiki/Loop_nest_optimization   (1619 words)
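The blocking idea in the entry above can be sketched in C. This is a minimal illustration rather than the code from the cited article: the function name matmul_blocked_k, the row-major n x n layout, and the choice to block only the k loop (and not i) are assumptions made for the example, and kb is assumed to be greater than zero.

```c
#include <stddef.h>

/* Hypothetical helper: C += A * B for n x n row-major matrices.
   The k loop is split into blocks of at most kb iterations so that
   only kb rows of B are live while one block is processed. */
void matmul_blocked_k(size_t n, size_t kb,
                      const double *A, const double *B, double *C)
{
    for (size_t kk = 0; kk < n; kk += kb) {
        /* Clamp the final block when n is not a multiple of kb. */
        size_t kend = (kk + kb < n) ? kk + kb : n;
        for (size_t i = 0; i < n; i++) {
            for (size_t j = 0; j < n; j++) {
                double acc = C[i * n + j];   /* accumulator kept in a register */
                for (size_t k = kk; k < kend; k++)
                    acc += A[i * n + k] * B[k * n + j];
                C[i * n + j] = acc;
            }
        }
    }
}
```

Choosing kb so that the active block of B fits in cache is the tuning knob; the right value depends on the machine and has to be measured.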

  
 Real World Technologies - Compilers - Code Optimization
Unrolling means copying out the body of the loop several times and changing the loop index correspondingly.
More often the range is variable; if so, another loop must follow the unrolled loop to take account of the possibility that the range ends on an odd number (or, more generally, a number not divisible by the number of unrolls).
Loop fission is a partial solution: where possible, the loop is split into two loops, one containing the dependence and the other containing the rest of the work of the loop. This second loop can be optimized well.
www.realworldtech.com /page.cfm?ArticleID=RWT110302111309&p=5   (850 words)
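As a sketch of the remainder loop described in the entry above, the following C function unrolls a summation by four and then finishes any left-over iterations. The function name sum_unrolled4 and the unroll factor are illustrative assumptions, not taken from the cited page.

```c
#include <stddef.h>

/* Sums n ints. The main loop handles four elements per pass; the
   second loop picks up the 0 to 3 elements left when n is not a
   multiple of 4. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 3 < n; i += 4) {   /* unrolled body: four copies */
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)            /* cleanup loop for the remainder */
        s += a[i];

    return s;
}
```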

  
 Loop Unrolling
Loop unrolling is a standard manual optimization that creates larger loops by replication of the original loop body.
Loop unrolling is done automatically by KAP to speed up some loops by reducing the number of times the loop control overhead is encountered.
Unrolling a loop involves duplicating the loop body one or more times within the loop, adding an increment, or changing the increment that was already in the loop, and possibly inserting cleanup code before the loop to execute any left-over iterations of the loop.
www.mun.ca /hpc/kapc/kapc_osf_ug0125.html   (317 words)

  
 GNU MP 4.1   (Site not responding. Last check: 2007-10-24)
Loop unrolling consists of replicating code so that several limbs are processed in each loop.
Unrolling is commonly done to a power of 2 multiple so the number of unrolled loops and the number of remaining limbs can be calculated with a shift and mask.
One way to write the setups and finishups for a pipelined unrolled loop is simply to duplicate the loop at the start and the end, then delete instructions at the start which have no valid antecedents, and delete instructions at the end whose results are unwanted.
www.gnu.org /software/gmp/manual/html_node/Assembler-Loop-Unrolling.html   (395 words)
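The shift-and-mask calculation mentioned in the entry above can be illustrated in C, although GMP itself does this in assembler. The function below is a hypothetical example (it is not a GMP routine): with an unroll factor of 4, the number of unrolled passes is n >> 2 and the number of remaining limbs is n & 3.

```c
#include <stddef.h>

/* Hypothetical example: add 1 to each of n limbs, four per unrolled pass. */
void add_one_unrolled4(unsigned long *p, size_t n)
{
    size_t blocks = n >> 2;   /* n / 4: number of unrolled passes */
    size_t rest   = n & 3;    /* n % 4: limbs left over */

    while (blocks-- > 0) {
        p[0]++;
        p[1]++;
        p[2]++;
        p[3]++;
        p += 4;
    }
    while (rest-- > 0) {
        (*p)++;
        p++;
    }
}
```

The shift and mask only replace division and modulo because the unroll factor is a power of 2; for other factors a real division would be needed.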

  
 Loop Unrolling   (Site not responding. Last check: 2007-10-24)
DO loop unrolling is a standard "manual" optimization that creates more statements in a small loop by replication of the original statement.
Unrolling a loop involves duplicating the loop body one or more times within the loop, adding an increment (or changing the increment that was already on the DO statement), and possibly inserting code before or after the loop to execute the excess iterations of the loop (the cleanup code).
If the loop bounds are constant and the iteration count of the loop is small, the loop may be entirely deleted and replaced by copies of the loop body.
www.mun.ca /hpc/kapf90/kapf90_ug0210.html   (420 words)

  
 PGI User's Guide - 3 Optimization Features
Loops that are candidates for the vectorizer are countable; that is, the number of loop iterations is determined prior to the loop's execution and the loop counter is incremented or decremented by a fixed amount at each iteration.
An expandable scalar is a scalar appearing in an innermost loop for which every use is reached by a single assignment to the scalar, or where all paths from the beginning of a loop to the scalar use contain a definition of that scalar.
When the address of a user-defined loop index is passed as an argument to a subprogram, the vectorizer must create yet another temporary to hold the address of the expression that yields the same value as the original loop index.
www.unc.edu /depts/case/pgi/pgiws_ug/pgi31u04.htm   (5603 words)

  
 347805.faq   (Site not responding. Last check: 2007-10-24)
For unroll factor other than 1, this directive serves as a suggestion to the compiler that the specified loop should be unrolled by the given factor.
When the unroll factor value is 1, this directive serves as a command which specifies to the compiler that the loop is not to be unrolled.
The optimizer in the Forte 6.0 C compiler does determine whether unrolling a loop is beneficial from a performance standpoint.
access1.sun.com /cgi-bin/rinfo2html?347805.faq   (566 words)

  
 flipcode - Unrolling a Loop with Duff's Device
Loop unrolling may be ok for vector algebra or the like.
However, loop unrolling is a very common and viable optimization technique; this macro just makes it easier to do, and easier to change if you do do it.
In this case, the compiler can't unroll the loop at compile time, because the number of iterations is a variable.
www.flipcode.com /cgi-bin/fcarticles.cgi?show=63889   (1105 words)

  
 QC#7299: Optimizer should perform loop unrolling
I covered the case where unrolled loops were huge (code size), and that the compiler shouldn't unroll loops that would severely bloat the code (and possibly cause the main loop to spill out of the cache). What other loop situations would be hurt by unrolling?
An unrolled loop is faster if there are no calls or jumps.
The compiler would need to handle the odd loops (before or after plowing through the unrolled loops) as a special case (unless jumping into the middle of the unrolled loops was workable).
qc.borland.com /wc/qcmain.aspx?d=7299   (845 words)

  
 Chapter 7. Using Loop Nest Optimization
Outer loop unrolling is one optimization that the LNO performs; it chooses the proper amount of unrolling for loop nests such as this matrix multiply kernel.
Loop interchange and outer loop unrolling can be combined to solve some performance problems that neither technique can solve on its own.
Loop peeling is the technique of removing iterations from the beginning and/or ending of a loop so that the index range will match that of another loop.
www.cs.wfu.edu /~torgerse/Kokua/More_SGI/007-3430-003/sgi_html/ch07.html   (6141 words)
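Loop peeling, as described in the entry above, can be sketched in C. The example is an assumption-laden illustration, not code from the SGI manual: the function name, the arrays, and the decision to fuse the two loops after peeling are all invented here to show one common reason for wanting matching index ranges.

```c
#include <stddef.h>

/* Before peeling, the two loops have different ranges:
       for (i = 0; i < n; i++) a[i] = 2 * (int)i;
       for (i = 1; i < n; i++) b[i] = a[i] - a[i - 1];
   Peeling iteration i = 0 off the first loop makes both ranges
   1..n-1, so the bodies can share a single loop. */
void peel_then_fuse(size_t n, int *a, int *b)
{
    if (n == 0)
        return;

    a[0] = 0;                         /* peeled iteration i = 0 */
    for (size_t i = 1; i < n; i++) {
        a[i] = 2 * (int)i;            /* body of the first loop */
        b[i] = a[i] - a[i - 1];       /* body of the second loop */
    }
}
```

The fusion is legal here because each b[i] only reads a[i] and a[i - 1], both of which have already been written by the time they are used.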

  
 Program Optimisation Catalogue - Loop Unrolling   (Site not responding. Last check: 2007-10-24)
Loop unrolling is the process of expanding a loop so each new iteration contains several of what used to be an iteration.
This might not be possible in general, but it is still possible to do loop unrolling using prefix code to handle the final cases.
The example shows that if the loop variable is referenced in the body, each unrolled copy must use the value the variable would have had on its own iteration.
users.chariot.net.au /~matty/optcat/loop_unrolling.html   (202 words)

  
 5.1 Loop Transformations
Unroll requests a loop label and an unrolling factor from the user.
Unrolling can be applied interactively, and the user is asked for a loop label.
The first one (at the beginning of the loop body) assigns it its value as a function of the new index, and the second one (after the end of the loop) assigns it its final value.
www.cri.ensmp.fr /people/pips/pipsmake-rc/node31.html   (578 words)

  
 LabVIEW insights - Loops optimization technique: unrolling
Unrolling a loop reduces the cost of loop overhead by decreasing the number of times you check the loop condition.
To unroll a loop, you perform two or more of the same statements for each iteration, and increment the counter accordingly.
The syntax for this directive is: #pragma unroll, #pragma unroll (n), or #pragma nounroll, where n is an integer constant from 0 through 255.
vi-lib.com /vi.lab/Loopsoptimizationtechniqu.html   (699 words)

  
 +O2 level optimizations
The register is initialized outside the loop to the loop-invariant portion of the virtual memory address expression, and the register is incremented or decremented within the loop by the loop-variant portion of the virtual memory address expression.
This register pointer variable is initialized outside the innermost loop and auto-incremented within the innermost loop as a side-effect of the pointer dereference.
Total unrolling involves eliminating the loop structure completely by replicating the loop body a number of times equal to the iteration count and replacing the iteration variable with constants.
docs.hp.com /en/B6056-96002/ch03s05.html   (1427 words)
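Total unrolling, as defined in the entry above, can be shown on a loop with a constant trip count of 3. The two functions below are an illustrative sketch with invented names; a compiler would perform this rewrite itself, and the hand-written version is only meant to show the result.

```c
/* Rolled form: the loop structure and the index variable i exist at run time. */
int dot3_rolled(const int a[3], const int b[3])
{
    int s = 0;
    for (int i = 0; i < 3; i++)
        s += a[i] * b[i];
    return s;
}

/* Totally unrolled form: the loop is gone and each use of the
   iteration variable has been replaced by a constant index. */
int dot3_unrolled(const int a[3], const int b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```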

  
 Branch Prediction Schemes
3) Show the loop unrolled (scheduled and unscheduled) so that there are 4 copies of the loop body, assuming R1 is initially a multiple of 32, which means that the number of loop iterations is a multiple of 4.
Determine that unrolling the loop would be useful by finding that the loop iterations were independent, except for loop maintenance code.
Determine that the loads and stores in the unrolled loop can be interchanged by observing that the loads and stores from different iterations are independent.
www.cs.iastate.edu /~prabhu/Tutorial/PIPELINE/loopUnrolling.html   (318 words)

  
 User's Guide - Optimizing Loops and Array Language
Loop unrolling involves expanding the loop body to do the work of two, three, or more iterations, and reducing the iteration count proportionately.
Loop unrolling also increases code sizes in the new loop body, which can increase register allocation and possibly cause register spilling.
The loop transformations performed by the -qhot option are controlled by a set of assumptions about the characteristics of typical loops and the costs (in terms of registers used and potential delays introduced) of performing particular transformations.
www.ncsa.uiuc.edu /UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/xlf/html/UG53.HTM   (1056 words)

  
 Code Maestro - Loop Unrolling with Duff's Device
Loop unrolling, or unwinding, is simply reducing the number of overhead instructions that the CPU has to execute in a loop, thus improving the cache hit rate and the loop's run time.
If loop unrolling is to be used as an optimization, it should be used very carefully.
Duff's device implements loop unrolling by interlacing the structures of a switch and a loop.
www.codemaestro.com /reviews/review00000102.html   (549 words)
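The interlacing of a switch and a loop described in the entry above looks roughly like this in C. This is a sketch of the familiar memory-to-memory variant with an unroll factor of 8; Tom Duff's original wrote to a single output register rather than advancing the destination pointer, and the name duff_copy and the early return for a zero count are choices made for this example.

```c
#include <stddef.h>

/* Copies count bytes from 'from' to 'to'. The switch jumps into the
   middle of the unrolled do-while body to handle count % 8 bytes on
   the first pass; every later pass copies a full 8 bytes. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;                       /* the device itself assumes count > 0 */

    size_t n = (count + 7) / 8;       /* number of passes through the body */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The case labels rely on deliberate fall-through, so some compilers will warn about it; the code is nonetheless valid C.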

  
 Loop Unrolling with -unroll[n]
To disable loop unrolling, specify n as 0.
Unrolling enables you to aggressively schedule (or pipeline) the loop to hide latencies if you have enough free registers to keep variables live.
The Intel® Pentium® 4 or Intel® Xeon(TM) processors can correctly predict the exit branch for an inner loop that has 16 or fewer iterations, if that number of iterations is predictable and there are no conditional branches in the loop.
www.intel.com /software/products/compilers/flin/docs/for_ug2/high_unr.htm   (247 words)

  
 Code for loop unrolling(medium level) - Dev Shed
Code for loop unrolling (medium level): C programming forum discussing all C derivatives, including C#, C++, Objective-C, and even plain old vanilla C. These languages are low level languages, and used on projects such as device drivers, compilers, and even whole computer operating systems.
Loop unrolling is a code optimization in which the loop is unrolled so that other work can fill the gap between two dependent instructions, for example when the result of one instruction is a source operand of another.
Compilers unroll loops automatically when they are told to optimize.
forums.devshed.com /c-programming-42/code-for-loop-unrolling-medium-level-332444.html   (525 words)

  
 Loop Unrolling   (Site not responding. Last check: 2007-10-24)
Loop overhead can be reduced by reducing the number of iterations and replicating the body of the loop.
In the code fragment below, the body of the loop can be replicated once and the number of iterations can be reduced from 100 to 50.
Below is the code fragment after loop unrolling.
www.nullstone.com /htmls/category/unroll.htm   (78 words)
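The code fragments the entry above refers to are not reproduced in the snippet. A minimal sketch of the transformation it describes, with a hypothetical loop body g() that only counts its own calls, might look like this in C:

```c
static int calls;                     /* hypothetical: counts calls to g() */

static void g(void) { calls++; }

/* Original loop: 100 iterations, one copy of the body. */
int run_rolled(void)
{
    calls = 0;
    for (int i = 0; i < 100; i++)
        g();
    return calls;
}

/* After unrolling once: 50 iterations, two copies of the body. */
int run_unrolled(void)
{
    calls = 0;
    for (int i = 0; i < 100; i += 2) {
        g();
        g();
    }
    return calls;
}
```

Both versions call g() 100 times; the unrolled one executes the loop test and increment only 50 times. No cleanup loop is needed because 100 is divisible by 2.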

  
 CS 6/75101 - Chapter 4
Loop Unrolling and Software Pipelining are two techniques that a compiler can use to exploit the ILP in loops.
The unrolled loop has 8 instructions of which 2 are overhead - unrolling the loop has reduced the overhead from 40 % to 25 %.
This example unrolled the loop by doubling the instructions required for the job-at-hand in each iteration and cutting the number of iterations in half - it used a loop-unrolling factor of 2.
www.cs.kent.edu /~batcher/CS675101/c4.html   (4797 words)
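The percentages in the entry above can be checked with a short calculation. The symbols are introduced here for illustration: w is the number of useful instructions per original iteration, c the number of overhead instructions per pass through the loop, and u the unrolling factor. The values w = 3 and c = 2 are inferred from the snippet's figures (8 instructions with 2 overhead at factor 2) on the assumption that the overhead count is unchanged by unrolling.

```latex
\[
\text{overhead fraction} = \frac{c}{u\,w + c}, \qquad
u = 1:\ \frac{2}{1 \cdot 3 + 2} = \frac{2}{5} = 40\%, \qquad
u = 2:\ \frac{2}{2 \cdot 3 + 2} = \frac{2}{8} = 25\%.
\]
```

Under the same assumptions, a factor of 4 would give 2/14, or about 14%, which shows the diminishing return from each further doubling.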

  
 Optimizing code   (Site not responding. Last check: 2007-10-24)
If the code inside the loop is not much more complex than the logic for looping itself, then loop unrolling will help.
If it is much more complex, then loop unrolling won't do much, and you simply have to optimize the code within the loop somehow.
Look to see if there's logic repeated within the loop that need not be repeated, and check to see if there are ways to reduce the number of loop iterations required to get the job done.
www.codecomments.com /message198146.html   (1337 words)

  
 Bentley's Rules from Writing Efficient Programs   (Site not responding. Last check: 2007-10-24)
However, loop unrolling is a touchy, error-prone operation to carry out by hand, and the resulting code is hard to read and to maintain.
A benefit of compiler-controlled loop unrolling is that the compiler applies the optimizations of common subexpression elimination, constant propagation, code motion out of loops, and function call inlining both before and after unrolling the loop.
Loop fission -- when two inner loops are controlled by a single outer loop, duplicating the outer loop and creating two simple loop nests that can then be interchanged, blocked, or unrolled.
www.hipecc.twsu.edu /bentley.htm   (4215 words)

  
 From the stacks: TCP/IP checksum vectorization using AltiVec, Part 2   (Site not responding. Last check: 2007-10-24)
Vectorizing the particular reduction pattern that appears in the checksum loop is further complicated by the presence of multiple data types.
Unrolling the loop to use independent accumulators does nothing to alleviate the problem.
Further discussion of loop unrolling and vector accumulators may be found in Advanced Compiler Design and Implementation, by Steven S. Muchnick.
www-106.ibm.com /developerworks/library/pa-altivec2   (1935 words)

  
 Unrolling AltiVec, Part 3: Down and dirty loop optimization   (Site not responding. Last check: 2007-10-24)
The Unrolling AltiVec series has looked at the history and usage of the SIMD components of the PowerPC architecture; these are called by various names, including Velocity Engine and VMX, but Freescale's name for it is AltiVec, which is how the previous installments refer to it.
It could be unrolled further, but when you're performing the same operation on a 32-bit value four times in a row, that suggests that maybe it'd be more efficient to use AltiVec.
The time to run this version of the loop is only about a million microseconds -- not exactly the huge speedup AltiVec should offer when well-tuned, but it's nice to note that, even without a serious effort at further unrolling, performance has improved.
www.ibm.com /developerworks/power/library/pa-unrollav3/index.html?ca=drs-tp1405   (2921 words)

  
 Branch and Loop Reorganization to Prevent Mispredicts
Now that the history for this loop is in the BTB, whenever this code is executed again, it will not cause any branch mispredicts and the accompanying penalty.
Another benefit of loop unrolling is that dependence chains are stretched, allowing for deep pipelines to get more utilization.
Branches within a loop can place a heavy demand on the BTB, since they are now multiplied the same number of times the loop is unrolled.
www.intel.com /cd/ids/developer/asmo-na/eng/dc/digitalmedia/66779.htm?page=4   (604 words)

  
 Rico Mariani's Performance Tidbits : Loop unrolling for speed
Remember the purpose of loop unrolling is to reduce the cost of testing the loop variables on each iteration and the associated control flow.
Generally when you unroll a loop you take several iterations and put them directly into the body.
I find it unlikely that more than 10 unrolls would ever really be worth it, but of course you'd have to measure to be sure for your case.
blogs.msdn.com /ricom/archive/2003/12/02/40777.aspx   (391 words)

  
 Loop unroll optimization - MSDN Forums
The VC++ compiler does unroll loops in cases where the change benefits the overall performance of the application.
Yes, I see; VC does unroll loops in some cases, and I was able to see it in a real-life app.
My question then is what the threshold for this is: how does the compiler decide whether a loop should be unrolled, beyond the obvious requirement that the loop body has to fit into the CPU instruction cache?
forums.microsoft.com /msdn/showpost.aspx?postid=13848&siteid=1   (713 words)

  
 [No title]   (Site not responding. Last check: 2007-10-24)
Loop unrolling: The compiler apparently didn't do a good job of unrolling loops, even with the flag -funroll-all-loops.
Optimizations attempted but unsuccessful. 2.1 Software pipelining with k-way loop unrolling: Inside each iteration of the innermost loop that computes a dot product, there are three data dependent operations that take place -- l, m and a (load, multiply and accumulate).
Sec 1.3's loop unrolling was the best we could find, even though we got comparable results for 2(lm, a) and 3(lm, a).
www.cs.berkeley.edu /~boonloo/classes/cs267/a1/writeup_new.doc   (1922 words)


Copyright © 2005-2007 www.factbites.com Usage implies agreement with terms.