1 This is cloog.info, produced by makeinfo version 4.11 from cloog.texi.
3 INFO-DIR-SECTION Software libraries
5 * cloog: (cloog). A loop generator for scanning polyhedra
8 This manual is for CLooG version , a software which generates loops
9 for scanning Z-polyhedra. That is, CLooG produces a code visiting each
10 integral point of a union of parametrized polyhedra. CLooG is designed
11 to avoid control overhead and to produce a very efficient code.
13 It would be quite kind to refer the following paper in any
14 publication that results from the use of the CLooG software or its
18 author = {C. Bastoul},
19 title = {Code Generation in the Polyhedral Model
20 Is Easier Than You Think},
21 booktitle = {PACT'13 IEEE International Conference on
22 Parallel Architecture and Compilation Techniques},
26 address = {Juan-les-Pins}
29 Copyright (C) 2002-2005 Ce'dric Bastoul.
31 Permission is granted to copy, distribute and/or modify this
32 document under the terms of the GNU Free Documentation License, Version
33 1.2 published by the Free Software Foundation. To receive a copy of the
34 GNU Free Documentation License, write to the Free Software Foundation,
35 Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
38 File: cloog.info, Node: Top, Next: Introduction, Up: (dir)
43 This manual is for CLooG version , a software which generates loops for
44 scanning Z-polyhedra. That is, CLooG produces a code visiting each
45 integral point of a union of parametrized polyhedra. CLooG is designed
46 to avoid control overhead and to produce a very efficient code.
48 It would be quite kind to refer the following paper in any
49 publication that results from the use of the CLooG software or its
53 author = {C. Bastoul},
54 title = {Code Generation in the Polyhedral Model
55 Is Easier Than You Think},
56 booktitle = {PACT'13 IEEE International Conference on
57 Parallel Architecture and Compilation Techniques},
61 address = {Juan-les-Pins}
64 Copyright (C) 2002-2005 Ce'dric Bastoul.
66 Permission is granted to copy, distribute and/or modify this
67 document under the terms of the GNU Free Documentation License, Version
68 1.2 published by the Free Software Foundation. To receive a copy of the
69 GNU Free Documentation License, write to the Free Software Foundation,
70 Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
82 File: cloog.info, Node: Introduction, Next: CLooG Software, Prev: Top, Up: Top
87 CLooG is a free software and library generating loops for scanning
88 Z-polyhedra. That is, it finds a code (e.g. in C, FORTRAN...) that
89 reaches each integral point of one or more parameterized polyhedra.
90 CLooG has been originally written to solve the code generation problem
91 for optimizing compilers based on the polytope model. Nevertheless it
92 is used now in various area, e.g., to build control automata for
93 high-level synthesis or to find the best polynomial approximation of a
94 function. CLooG may help in any situation where scanning polyhedra
95 matters. It uses the best state-of-the-art code generation algorithm
96 known as the Quillere' et al. algorithm (*note Qui00::) with our own
97 improvements and extensions (*note Bas04::). The user has full control
98 on generated code quality. On one hand, generated code size has to be
99 tuned for sake of readability or instruction cache use. On the other
100 hand, we must ensure that a bad control management does not hamper
101 performance of the generated code, for instance by producing redundant
102 guards or complex loop bounds. CLooG is specially designed to avoid
103 control overhead and to produce a very efficient code.
105 CLooG stands for _Chunky Loop Generator_: it is a part of the Chunky
106 project, a research tool for data locality improvement (*note Bas03a::).
107 It is designed also to be the back-end of automatic parallelizers like
108 LooPo (*note Gri04::). Thus it is very compilable code oriented and
109 provides powerful program transformation facilities. Mainly, it allows
110 the user to specify very general schedules where, e.g., unimodularity
111 or invertibility of the transformation doesn't matter.
113 The current version is still under evaluation, and there is no
114 guarantee that the upward compatibility will be respected (but the
115 previous API has been stable for two years, we hope this one will be as
116 successful -and we believe it-). A lot of reports are necessary to
117 freeze the library API and the input file shape. Most API changes from
118 0.12.x to 0.14.x have been requested by the users themselves. Thus you
119 are very welcome and encouraged to post reports on bugs, wishes,
120 critics, comments, suggestions or successful experiences in the forum
121 of `http://www.CLooG.org' or to send them to cedric.bastoul@inria.fr
130 File: cloog.info, Node: Basics, Next: Scattering, Up: Introduction
132 1.1 Basically, what's the point ?
133 =================================
135 If you want to use CLooG, this is because you want to scan or to find
136 something inside the integral points of a set of polyhedra. There are
137 many reasons for that. Maybe you need the generated code itself because
138 it actually implements a very smart program transformation you found.
139 Maybe you want to use the generated code because you know that the
140 solution of your problem belongs to the integral points of those damned
141 polyhedra and you don't know which one. Maybe you just want to know if
142 a polyhedron has integral points depending on some parameters, which is
143 the lexicographic minimum, maximum, the third on the basis of the left
144 etc. Probably you have your own reasons to use CLooG.
146 Let us illustrate a basic use of CLooG. Suppose we have a set of
147 affine constraints that describes a part of a whatever-dimensional
148 space, called a *domain*, and we want to scan it. Let us consider for
149 instance the following set of constraints where `i' and `j' are the
150 unknown (the two dimensions of the space) and `m' and `n' are the
151 parameters (some symbolic constants):
155 Let us also consider that we have a partial knowledge of the parameter
156 values, called the *context*, expressed as affine constraints as well,
160 Note that using parameters is optional, if you are not comfortable with
161 parameter manipulation, just replace them with any scalar value that
162 fits `m>=2' and `n>=2'. A graphical representation of this part of the
163 2-dimensional space, where the integral points are represented using
164 heavy dots would be for instance:
166 \0\b[image src="images/basic.jpg" text=" j^ i>=2
181 The affine constraints of both the domain and the context are what
182 we will provide to CLooG as input (in a particular shape that will be
183 described later). The output of CLooG is a pseudo-code to scan the
184 integral points of the input domain according to the context:
186 for (j=2;j<=min(m,-i+n+2);j++) {
190 If you felt such a basic example is yet interesting, there is a good
191 chance that CLooG is appropriate for you. CLooG can do much more:
192 scanning several polyhedra or unions of polyhedra at the same time,
193 applying general affine transformations to the polyhedra, generate
194 compilable code etc. Welcome to the CLooG's user's guide !
197 File: cloog.info, Node: Scattering, Prev: Basics, Up: Introduction
199 1.2 Defining a Scanning Order: Scattering Functions
200 ===================================================
202 In CLooG, domains only define the set of integral points to scan and
203 their coordinates. In particular, CLooG is free to choose the scanning
204 order for generating the most efficient code. This means, for
205 optimizing/parallelizing compiler people, that CLooG doesn't make any
206 speculation on dependences on and between statements (by the way, it's
207 not its job !). For instance, if an user give to CLooG only two
208 domains `S1:1<=i<=n', `S2:1<=i<=n' and the context `n>=1', the
209 following pseudo-codes are considered to be equivalent:
211 /* A convenient target pseudo-code. */
219 /* Another convenient target pseudo-code. */
225 The default behaviour of CLooG is to generate the second one, since
226 it is optimized in control. It is right if there are no data
227 dependences between `S1' and `S2', but wrong otherwise.
229 Thus it is often useful to force scanning to respect a given order.
230 This can be done in CLooG by using *scattering functions*. Scattering
231 is a shortcut for scheduling, allocation, chunking functions and the
232 like we can find in the restructuring compilation litterature. There
233 are a lot of reasons to scatter the integral points of the domains
234 (i.e. the statement instances of a program, for compilation people),
235 parallelization or optimization are good examples. For instance, if the
236 user wants for any reason to set some precedence constraints between
237 the statements of our example above in order to force the generation of
238 the first code, he can do it easily by setting (for example) the
239 following scheduling functions:
244 This scattering means that each integral point of the domain `S1' is
245 scanned at logical date `1' while each integral point of the domain
246 `S2' is scanned at logical date `2'. As a result, the whole domain `S1'
247 is scanned before domain `S2' and the first code in our example is
250 The user can set every kind of affine scanning order thanks to the
251 scattering functions. Each domain has its own scattering function and
252 each scattering function may be multi-dimensional. A multi-dimentional
253 logical date may be seen as classical date
254 (year,month,day,hour,minute,etc.) where the first dimensions are the
255 most significant. Each scattering dimension may depend linearly on the
256 original dimensions (e.g., `i'), the parameters (e.g., `n') ans scalars
259 A very useful example of multi-dimensional scattering functions is,
260 for compilation people, the scheduling of the original program. The
261 basic data to use for code generation are statement iteration domains.
262 As we saw, these data are not sufficient to rebuild the original
263 program (what is the ordering between instances of different statements
264 ?). The missing data can be put in the scattering functions as the
265 original scheduling. The method to compute it is quite simple (*note
266 Fea92::). The idea is to build an abstract syntax tree of the program
267 and to read the scheduling for each statement. For instance, let us
268 consider the following implementation of a Cholesky factorization:
270 /* A Cholesky factorization kernel. */
272 for (j=1;j<=i-1;j++) {
273 a[i][i] -= a[i][j] ; /* S1 */
275 a[i][i] = sqrt(a[i][i]) ; /* S2 */
276 for (j=i+1;j<=N;j++) {
277 for (k=1;k<=i-1;k++) {
278 a[j][i] -= a[j][k]*a[i][k] ; /* S3 */
280 a[j][i] /= a[i][i] ; /* S4 */
285 The corresponding abstract syntax tree is given in the following
286 figure. It directly gives the scattering functions (schedules) for all
287 the statements of the program.
289 \0\b[image src="images/tree.jpg" text=" *
316 T_S1(i,j)^T = (0,i,0,j,0)^T
318 T_S3(i,j,k)^T = (0,i,2,j,0,k,0)^T
319 T_S4(i,j)^T = (0,i,2,j,1)^T
321 These schedules depend on the iterators and give for each instance
322 of each statement a unique execution date. Using such scattering
323 functions allow CLooG to re-generate the input code.
326 File: cloog.info, Node: CLooG Software, Next: CLooG Library, Prev: Introduction, Up: Top
328 2 Using the CLooG Software
329 **************************
334 * Writing The Input File::
340 File: cloog.info, Node: A First Example, Next: Writing The Input File, Up: CLooG Software
345 CLooG takes as input a file that must be written accordingly to a
346 grammar described in depth in a further section (*note Writing The
347 Input File::). Moreover it supports many options to tune the target
348 code presentation or quality as discussed in a dedicated section (*note
349 Calling CLooG::). However, a basic use of CLooG is not very complex
350 and we present in this section how to generate the code corresponding
351 to a basic example discussed earlier (*note Basics::).
353 The problem is to find the code that scans a 2-dimensional polyhedron
354 where `i' and `j' are the unknown (the two dimensions of the space) and
355 `m' and `n' are the parameters (the symbolic constants), defined by the
356 following set of constraints:
360 We also consider a partial knowledge of the parameter values, expressed
361 thanks to the following affine constraints:
365 An input file that corresponds to this problem, and asks for a
366 generated code in C, may be the following. Note that we do not describe
367 here precisely the structure and the components of this file (*note
368 Writing The Input File:: for such information, if you feel it
371 # ---------------------- CONTEXT ----------------------
374 # Context (constraints on two parameters)
375 2 4 # 2 lines and 4 columns
376 # eq/in m n 1 eq/in: 1 for inequality >=0, 0 for equality =0
377 1 1 0 -2 # 1*m + 0*n -2*1 >= 0, i.e. m>=2
378 1 0 1 -2 # 0*m + 1*n -2*1 >= 0, i.e. n>=2
380 1 # We want to set manually the parameter names
381 m n # parameter names
383 # --------------------- STATEMENTS --------------------
384 1 # Number of statements
386 1 # First statement: one domain
388 5 6 # 5 lines and 6 columns
390 1 1 0 0 0 -2 # i >= 2
391 1 -1 0 0 1 0 # i <= n
392 1 0 1 0 0 -2 # j >= 2
393 1 0 -1 1 0 0 # j <= m
394 1 -1 -1 0 1 2 # n+2-i>=j
395 0 0 0 # for future options
397 1 # We want to set manually the iterator names
400 # --------------------- SCATTERING --------------------
401 0 # No scattering functions
403 This file may be called `basic.cloog' (this example is provided in
404 the CLooG distribution as `test/manual_basic.cloog') and we can ask
405 CLooG to process it and to generate the code by a simple calling to
406 CLooG with this file as input: `cloog basic.cloog'. By default, CLooG
407 will print the generated code in the standard output:
409 /* Generated by CLooG v in 0.00s. */
411 for (j=2;j<=min(m,-i+n+2);j++) {
417 File: cloog.info, Node: Writing The Input File, Next: Calling CLooG, Prev: A First Example, Up: CLooG Software
419 2.2 Writing The Input File
420 ==========================
422 The input text file contains a problem description, i.e. the context,
423 the domains and the scattering functions. Because CLooG is very
424 'compilable code generation oriented', we can associate some additional
425 informations to each domain. We call this association a _statement_.
426 The set of all informations is called a _program_. The input file
427 respects the grammar below (terminals are preceeded by "_"):
430 Program ::= Context Statements Scattering
431 Context ::= Language Domain Naming
432 Statements ::= Nb_statements Statement_list Naming
433 Scattering ::= Nb_functions Domain_list Naming
434 Naming ::= Option Name_list
435 Name_list ::= _String Name_list | (void)
436 Statement_list ::= Statement Statement_list | (void)
437 Domain_list ::= _Domain Domain_list | (void)
438 Statement ::= Iteration_domain 0 0 0
439 Iteration_domain ::= Domain_union
440 Domain_union ::= Nb_domains Domain_list
443 Nb_statements ::= _Integer
444 Nb_domains ::= _Integer
445 Nb_functions ::= _Integer
447 * `Context' represents the informations that are shared by
448 all the statements. It consists on the language used (which
449 can be `c' for C or `f' for FORTRAN 90) and the global
450 constraints on parameters. These constraints are essential
451 since they give to CLooG the number of parameters. If there
452 is no parameter or no constraints on parameters, just give
453 a constraint always satisfied like 1 \geq 0. `Naming' sets
454 the parameter names. If the naming option `Option'
455 is 1, parameter names will be read on the next line. There
456 must be exactly as many names as parameters. If the naming
457 option `Option' is 0, parameter names are automatically
458 generated. The name of the first parameter will be `M', and
459 the name of the (n+1)^th parameter directly follows the
460 name of the n^th parameter in ASCII code. It is the user
461 responsibility to ensure that parameter names, iterators
462 and scattering dimension names are different.
464 * `Statements' represents the informations on the statements.
465 `Nb_statements' is the number of statements in the program,
466 i.e. the number of `Statement' items in the `Statement_list'.
467 `Statement' represents the informations on a given statement.
468 To each statement is associated a domain (the statement
469 iteration domain: `Iteration_domain') and three zeroes that
470 represents future options. `Naming' sets the iterator
471 names. If the naming option `Option' is 1, the iterator
472 names will be read on the next line. There must be exactly
473 as many names as nesting level in the deepest iteration
474 domain. If the naming option `Option' is 0, iterator names
475 are automatically generated. The iterator name of the
476 outermost loop will be `i', and the iterator name of the
477 loop at level n+1 directly follows the iterator name of the
478 loop at level n in ASCII code.
480 * `Scattering' represents the informations on scattering functions.
481 `Nb_functions' is the number of functions (it must be
482 equal to the number of statements or 0 if there is no scattering
483 function). The function themselves are represented through
484 `Domain_list'. `Naming' sets the scattering dimension
485 names. If the naming option `Option' is 1, the scattering
486 dimension names will be read on the next line.
487 There must be exactly as many names as scattering dimensions. If
488 the naming option `Option' is 0, scattering dimension names
489 are automatically generated. The name of the n^th
490 scattering dimention will be `cn'.
494 * Domain Representation::
495 * Scattering Representation::
498 File: cloog.info, Node: Domain Representation, Next: Scattering Representation, Up: Writing The Input File
500 2.2.1 Domain Representation
501 ---------------------------
503 As shown by the grammar, the input file describes the various
504 informations thanks to characters, integers and domains. Each domain is
505 defined by a set of constraints in the PolyLib format (*note Wil93::).
506 They have the following syntax:
507 1. some optional comment lines beginning with `#',
509 2. the row and column numbers, possibly followed by comments,
511 3. the constraint rows, each row corresponds to a constraint the
512 domain have to satisfy. Each row must be on a single line and is
513 possibly followed by comments. The constraint is an equality
514 p(x) = 0 if the first element is 0, an inequality p(x) \geq
515 0 if the first element is 1. The next elements are the
516 unknown coefficients, followed by the parameter
517 coefficients. The last element is the constant factor.
518 For instance, assuming that `i', `j' and `k' are iterators and
519 `m' and `n' are parameters, the domain defined by the following
526 can be written in the input file as follows :
529 3 7 # 3 lines and 7 columns
531 1 -1 0 0 1 0 0 # -i + m >= 0
532 1 0 -1 0 0 1 0 # -j + n >= 0
533 1 1 1 -1 0 0 0 # i + j - k >= 0
535 Each iteration domain `Iteration_domain' of a given statement is a
536 union of polyhedra `Domain_union'. A union is defined by its number of
537 elements `Nb_domains' and the elements themselves `Domain_list'. For
538 instance, let us consider the following pseudo-code:
541 if ((i >= m) || (i <= 2*m))
547 The iteration domain of `S1' can be divided into two polyhedra and
548 written in the input file as follows:
550 2 # Number of polyhedra in the union
552 3 5 # 3 lines and 5 columns
558 3 5 # 3 lines and 5 columns
562 1 -1 2 0 0 # i <= 2*m
565 File: cloog.info, Node: Scattering Representation, Prev: Domain Representation, Up: Writing The Input File
567 2.2.2 Scattering Function Representation
568 ----------------------------------------
570 Scattering functions are depicted in the input file thanks a
571 representation very close to the domain one. An integer gives the
572 number of functions `Nb_functions' and each function is represented by
573 a domain. Each line of the domain corresponds to an equality defining a
574 dimension of the function. Note that at present (CLooG ) *all functions
575 must have the same scattering dimension number*. If a user wants to set
576 scattering functions with different dimensionality, he has to complete
577 the smaller one with zeroes to reach the maximum dimensionality. For
578 instance, let us consider the following code and scheduling functions:
581 if ((i >= m) || (i <= 2*m))
588 T_S2(i,j)^T = (n,i+j)^T
590 This scheduling can be written in the input file as follows:
592 2 # Number of scattering functions
594 2 7 # 2 lines and 7 columns
595 # eq/in c1 c2 i m n 1
596 0 1 0 -1 0 0 0 # c1 = i
597 0 0 1 0 0 0 0 # c2 = 0
599 2 8 # 2 lines and 8 columns
600 # eq/in c1 c2 i j m n 1
601 0 1 0 0 0 0 -1 0 # c1 = n
602 0 0 1 -1 -1 0 0 0 # c2 = i+j
603 The complete input file for the user who wants to generate the code for
604 this example with the preceding scheduling would be (this file is
605 provided in the CLooG distribution as `test/manual_scattering.cloog':
607 # ---------------------- CONTEXT ----------------------
610 # Context (no constraints on two parameters)
611 1 4 # 1 lines and 4 columns
613 1 0 0 0 # 0 >= 0, always true
615 1 # We want to set manually the parameter names
616 m n # parameter names
618 # --------------------- STATEMENTS --------------------
619 2 # Number of statements
621 2 # First statement: two domains
623 3 5 # 3 lines and 5 columns
629 3 5 # 3 lines and 5 columns
633 1 -1 2 0 0 # i <= 2*m
634 0 0 0 # for future options
636 1 # Second statement: one domain
637 4 6 # 4 lines and 6 columns
639 1 1 0 0 0 -1 # i >= 1
640 1 -1 0 0 1 0 # i <= n
641 1 -1 1 0 0 -1 # j >= i+1
642 1 0 -1 1 0 0 # j <= m
643 0 0 0 # for future options
645 1 # We want to set manually the iterator names
648 # --------------------- SCATTERING --------------------
649 2 # Scattering functions
651 2 7 # 2 lines and 7 columns
652 # eq/in p1 p2 i m n 1
653 0 1 0 -1 0 0 0 # p1 = i
654 0 0 1 0 0 0 0 # p2 = 0
656 2 8 # 2 lines and 8 columns
657 # eq/in p1 p2 i j m n 1
658 0 1 0 0 0 0 -1 0 # p1 = n
659 0 0 1 -1 -1 0 0 0 # p2 = i+j
661 1 # We want to set manually the scattering dimension names
662 p1 p2 # scattering dimension names
665 File: cloog.info, Node: Calling CLooG, Next: CLooG Options, Prev: Writing The Input File, Up: CLooG Software
670 CLooG is called by the following command:
671 cloog [ options | file ]
672 The default behavior of CLooG is to read the input informations from
673 a file and to print the generated code or pseudo-code on the standard
674 output. CLooG's behavior and the output code shape is under the user
675 control thanks to many options which are detailed a further section
676 (*note CLooG Options::). `file' is the input file. `stdin' is a
677 special value: when used, input is standard input. For instance, we can
678 call CLooG to treat the input file `basic.cloog' with default options
679 by typing: `cloog basic.cloog' or `more basic.cloog | cloog stdin'.
682 File: cloog.info, Node: CLooG Options, Next: Full Example, Prev: Calling CLooG, Up: CLooG Software
689 * Last Depth to Optimize Control::
690 * First Depth to Optimize Control::
691 * Simplify Convex Hull::
692 * Once Time Loop Elimination::
693 * Equality Spreading::
694 * Constant Spreading::
695 * First Level for Spreading::
696 * C PreProcessor Friendly::
705 File: cloog.info, Node: Last Depth to Optimize Control, Next: First Depth to Optimize Control, Up: CLooG Options
707 2.4.1 Last Depth to Optimize Control `-l <depth>'
708 -------------------------------------------------
710 `-l <depth>': this option sets the last loop depth to be optimized in
711 control. The higher this depth, the less control overhead. For
712 instance, with some input file, a user can generate different
713 pseudo-codes with different `depth' values as shown below.
714 /* Generated using a given input file and *option -l 1* */
726 /* Generated using the same input file but *option -l 2* */
735 In this example we can see that this option can change the
736 operation execution order between statements. Let us remind that
737 CLooG does not make any speculation on dependences between
738 statements (*note Scattering::). Thus if nothing (i.e. scattering
739 functions) forbids this, CLooG considers the above codes to be
740 equivalent. If there is no scattering functions, the minimum
741 value for `depth' is 1 (in the case of 0, the user doesn't really
742 need a loop generator !), and the number of scattering dimensions
743 otherwise (CLooG will warn the user if he doesn't respect such
744 constraint). The maximum value for depth is -1 (infinity).
745 Default value is infinity.
748 File: cloog.info, Node: First Depth to Optimize Control, Next: Simplify Convex Hull, Prev: Last Depth to Optimize Control, Up: CLooG Options
750 2.4.2 First Depth to Optimize Control `-f <depth>'
751 --------------------------------------------------
753 `-f <depth>': this option sets the first loop depth to be optimized
754 in control. The lower this depth, the less control overhead (and the
755 longer the generated code). For instance, with some input file, a
756 user can generate different pseudo-codes with different `depth'
757 values as shown below. The minimum value for `depth' is 1,
758 and the maximum value is -1 (infinity). Default value is 1.
759 /* Generated using a given input file and *option -f 3* */
769 /* Generated using the same input file but *option -f 2* */
774 for (j=10;j<=M;j++) {
781 File: cloog.info, Node: Simplify Convex Hull, Next: Once Time Loop Elimination, Prev: First Depth to Optimize Control, Up: CLooG Options
783 2.4.3 Simplify Convex Hull `-sh <boolean>'
784 ------------------------------------------
786 `-sh <boolean>': this option enables (`boolean=1') or forbids
787 (`boolean=0') a simplification step that may simplify some
788 constraints. This option works only for generated code without
789 code duplication (it means, you have to tune `-f' and `-l'
790 options first to generate only a loop nest with internal guards).
791 For instance, with the input file `test/union.cloog', a user can
792 generate different pseudo-codes as shown below. Default value is
794 /* Generated using test/union.cloog and *option -f -1 -l 2 -override* */
795 for (i=0;i<=11;i++) {
796 for (j=max(0,5*i-50);j<=min(15,5*i+10);j++) {
797 if ((i <= 10) && (j <= 10)) {
800 if ((i >= 1) && (j >= 5)) {
806 /* Generated using the same input file but *option -sh 1 -f -1 -l 2 -override* */
807 for (i=0;i<=11;i++) {
808 for (j=0;j<=15;j++) {
809 if ((i <= 10) && (j <= 10)) {
812 if ((i >= 1) && (j >= 5)) {
819 File: cloog.info, Node: Once Time Loop Elimination, Next: Equality Spreading, Prev: Simplify Convex Hull, Up: CLooG Options
821 2.4.4 Once Time Loop Elimination `-otl <boolean>'
822 -------------------------------------------------
824 `-otl <boolean>': this option allows (`boolean=1') or forbids
825 (`boolean=0') the simplification of loops running once. Default
827 /* Generated using a given input file and *option -otl 0* */
828 for (j=i+1;j<=i+1;j++) {
832 /* Generated using the same input file but *option -otl 1* */
837 File: cloog.info, Node: Equality Spreading, Next: Constant Spreading, Prev: Once Time Loop Elimination, Up: CLooG Options
839 2.4.5 Equality Spreading `-esp <boolean>'
840 -----------------------------------------
842 `-esp <boolean>': this option allows (`boolean=1') or forbids
843 (`boolean=0') values spreading when there are equalities. Default
845 /* Generated using a given input file and *option -esp 0* */
848 for (k=i;k<=j+M;k++) {
852 /* Generated using the same input file but *option -esp 1* */
853 for (k=M+2;k<=N+M;k++) {
858 File: cloog.info, Node: Constant Spreading, Next: First Level for Spreading, Prev: Equality Spreading, Up: CLooG Options
860 2.4.6 Constant Spreading `-csp <boolean>'
861 -----------------------------------------
863 `-csp <boolean>': this option allows (`boolean=1') or forbids
864 (`boolean=0') values spreading when there are _constant_
865 equalities. That is, when the right member of the equality is a
866 constant term. Default value is 1.
867 /* Generated using a given input file and *option -csp 0* */
870 for (k=i;j<=j+M;j++) {
874 /* Generated using the same input file but *option -csp 1* */
876 for (k=i;k<=N+M;k++) {
881 File: cloog.info, Node: First Level for Spreading, Next: C PreProcessor Friendly, Prev: Constant Spreading, Up: CLooG Options
883 2.4.7 First Level for Spreading `-fsp <level>'
884 ----------------------------------------------
886 `-fsp <level>': it can be useful to set a first level to begin
887 equality spreading. Particularly when using scattering functions,
888 the user may want to see the scattering dimension values instead
889 of spreading or hiding them. If user has set a spreading, `level'
890 is the first level to start it. Default value is 1.
891 /* Generated using a given input file and *option -fsp 1* */
892 for (j=0;j<=N+M;j++) {
895 for (j=0;j<=N+M;j++) {
899 /* Generated using the same input file but *option -fsp 2* */
901 for (j=0;j<=c1+M;j++) {
905 for (j=0;j<=N+c1;j++) {
910 File: cloog.info, Node: C PreProcessor Friendly, Next: Statement Block, Prev: First Level for Spreading, Up: CLooG Options
912 2.4.8 C PreProcessor Friendly `-cpp <boolean>'
913 ----------------------------------------------
915 `-cpp <boolean>': this option ask CLooG for printing a less
916 human-readable but compilable code by using the C preprocessor
917 (`boolean=1'). In this case each statement is written as a
918 function of the iterators corresponding to its domain dimensions:
919 `Si(value_of_iterator_1,...,value_of_iterator_n)'. It follows that
920 the user can easily add preprocessor macros to define each
921 statement and use the generated textual code directly for compilation.
922 When `boolean' is set to 0, the pretty printer has the default
923 behaviour. Default value is 0.
924 /* Generated using a given input file and *option -cpp 0* */
925 for (j=0;j<=N+M;j++) {
929 /* Generated using the same input file but *option -cpp 1* */
930 /* and a preprocessor macro set by the user */
932 #define S1(i,j) A[(j)]=3*(i)
934 for (j=0;j<=N+M;j++) {
939 File: cloog.info, Node: Statement Block, Next: Loop Strides, Prev: C PreProcessor Friendly, Up: CLooG Options
941 2.4.9 Statement Block `-block <boolean>'
942 ----------------------------------------
944 `-block <boolean>': this option allows (`boolean=1') to create a
945 statement block for each new iterator, even if there is only an
946 equality. This can be useful in order to parse the generated
947 pseudo-code. When `boolean' is set to 0 or when the generation
948 language is FORTRAN, this feature is disabled. Default value is 0.
949 /* Generated using a given input file and *option -block 0* */
954 /* Generated using the same input file but *option -block 1* */
962 File: cloog.info, Node: Loop Strides, Next: Compilable Code, Prev: Statement Block, Up: CLooG Options
964 2.4.10 Loop Strides `-strides <boolean>'
965 ----------------------------------------
967 `-strides <boolean>': this options allows (`boolean=1') to handle
968 non-unit strides for loop increments. This can remove a lot of
969 guards and make the generated code more efficient. Default value is 0.
970 /* Generated using a given input file and *option -strides 0* */
980 /* Generated using the same input file but *option -strides 1* */
981 for (i=2;i<=n;i+=2) {
989 File: cloog.info, Node: Compilable Code, Next: Output, Prev: Loop Strides, Up: CLooG Options
991 2.4.11 Compilable Code `-compilable <value>'
992 --------------------------------------------
994 `-compilable <value>': this options allows (`value' is not 0) to
995 generate a compilable code where all parameters have the integral value
996 `value'. This option creates a macro for each statement. Since
997 CLooG do not know anything about the statement sources, it fills the
998 macros with a basic increment that computes the total number of
999 scanned integral points. The user may change easily the macros according
1000 to his own needs. This option is possible only if the generated
1001 code is in C. Default value is 0.
1002 /* Generated using a given input file and *option -compilable 0* */
1003 for (i=0;i<=n;i++) {
1004 for (j=0;j<=n;j++) {
1011 /* Generated using the same input file but *option -compilable 10* */
1012 /* DON'T FORGET TO USE -lm OPTION TO COMPILE. */
1014 /* Useful headers. */
1019 /* Parameter value. */
1022 /* Statement macros (please set). */
1023 #define S1(i,j) {total++;}
1024 #define S2(i,j) {total++;}
1025 #define S3(i) {total++;}
1028 /* Original iterators. */
1031 int n=PARVAL, total=0 ;
1033 for (i=0;i<=n;i++) {
1034 for (j=0;j<=n;j++) {
1041 printf("Number of integral points: %d.\n",total) ;
1046 File: cloog.info, Node: Output, Next: Help, Prev: Compilable Code, Up: CLooG Options
1048 2.4.12 Output `-o <output>'
1049 ---------------------------
1051 `-o <output>': this option sets the output file. `stdout' is a
1052 special value: when used, output is standard output. Default
1056 File: cloog.info, Node: Help, Next: Version, Prev: Output, Up: CLooG Options
1058 2.4.13 Help `--help' or `-h'
1059 ----------------------------
1061 `--help' or `-h': this option ask CLooG to print a short help.
1064 File: cloog.info, Node: Version, Prev: Help, Up: CLooG Options
1066 2.4.14 Version `--version' or `-v'
1067 ----------------------------------
1069 `--version' or `-v': this option ask CLooG to print some version
1073 File: cloog.info, Node: Full Example, Prev: CLooG Options, Up: CLooG Software
1078 Let us consider the allocation problem of a Gaussian elimination, i.e.
1079 we want to distribute the various statement instances of the compute
1080 kernel onto different processors. The original code is the following:
1081 for (i=1;j<=N-1;i++) {
1082 for (j=i+1;j<=N;j++) {
1083 c[i][j] = a[j][i]/a[i][i] ; /* S1 */
1084 for (k=i+1;k<=N;k++) {
1085 a[j][k] -= c[i][j]*a[i][k] ; /* S2 */
1090 The best affine allocation functions can be found by any good automatic
1091 parallelizer like LooPo (*note Gri04::):
1096 To ensure that on each processor, the set of statement instances is
1097 executed according to the original ordering, we add as minor scattering
1098 dimensions the original scheduling (*note Scattering::):
1100 T_S1(i,j)^T = (i,0,i,0,j,0)^T
1101 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1103 To ensure that the scattering functions have the same dimensionality, we
1104 complete the first function with zeroes (this is a CLooG and previous
1105 versions requirement, it should be removed in a future version, don't
1106 worry it's absolutly legal !):
1108 T_S1(i,j)^T = (i,0,i,0,j,0,0,0)^T
1109 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1111 The input file corresponding to this code generation problem could be
1112 (this file is provided in the CLooG distribution as
1113 `test/manual_gauss.cloog':
1115 # ---------------------- CONTEXT ----------------------
1118 # Context (no constraints on one parameter)
1119 1 3 # 1 line and 3 columns
1121 1 0 0 # 0 >= 0, always true
1123 1 # We want to set manually the parameter name
1126 # --------------------- STATEMENTS --------------------
1127 2 # Number of statements
1129 1 # First statement: one domain
1130 4 5 # 4 lines and 3 columns
1133 1 -1 0 1 -1 # i <= n-1
1134 1 -1 1 0 -1 # j >= i+1
1136 0 0 0 # for future options
1139 # Second statement: one domain
1140 6 6 # 6 lines and 3 columns
1142 1 1 0 0 0 -1 # i >= 1
1143 1 -1 0 0 1 -1 # i <= n-1
1144 1 -1 1 0 0 -1 # j >= i+1
1145 1 0 -1 0 1 0 # j <= n
1146 1 -1 0 1 0 -1 # k >= i+1
1147 1 0 0 -1 1 0 # k <= n
1148 0 0 0 # for future options
1150 0 # We let CLooG set the iterator names
1152 # --------------------- SCATTERING --------------------
1153 2 # Scattering functions
1155 8 13 # 3 lines and 3 columns
1156 # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j n 1
1157 0 1 0 0 0 0 0 0 0 -1 0 0 0 # p1 = i
1158 0 0 1 0 0 0 0 0 0 0 0 0 0 # p2 = 0
1159 0 0 0 1 0 0 0 0 0 -1 0 0 0 # p3 = i
1160 0 0 0 0 1 0 0 0 0 0 0 0 0 # p4 = 0
1161 0 0 0 0 0 1 0 0 0 0 -1 0 0 # p5 = j
1162 0 0 0 0 0 0 1 0 0 0 0 0 0 # p6 = 0
1163 0 0 0 0 0 0 0 1 0 0 0 0 0 # p7 = 0
1164 0 0 0 0 0 0 0 0 1 0 0 0 0 # p8 = 0
1166 8 14 # 3 lines and 3 columns
1167 # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j k n 1
1168 0 1 0 0 0 0 0 0 0 0 0 -1 0 0 # p1 = k
1169 0 0 1 0 0 0 0 0 0 0 0 0 0 0 # p2 = 0
1170 0 0 0 1 0 0 0 0 0 -1 0 0 0 0 # p3 = i
1171 0 0 0 0 1 0 0 0 0 0 0 0 0 0 # p4 = 0
1172 0 0 0 0 0 1 0 0 0 0 -1 0 0 0 # p5 = j
1173 0 0 0 0 0 0 1 0 0 0 0 0 0 -1 # p6 = 1
1174 0 0 0 0 0 0 0 1 0 0 0 -1 0 0 # p7 = k
1175 0 0 0 0 0 0 0 0 1 0 0 0 0 0 # p8 = 0
1177 1 # We want to set manually the scattering dimension names
1178 p1 p2 p3 p4 p5 p6 p7 p8 # scattering dimension names
1180 Calling CLooG, with for instance the command line `cloog -fsp 2
1181 gauss.cloog' for a better view of the allocation (the processor number
1182 is given by `p1'), will result on the following target code that
1183 actually implements the transformation. A minor processing on the
1184 dimension `p1' to implement, e.g., MPI calls, which is not shown here
1185 may result in dramatic speedups !
1189 for (p5=2;p5<=n;p5++) {
1193 for (p1=2;p1<=n-1;p1++) {
1194 for (p3=1;p3<=p1-1;p3++) {
1195 for (p5=p3+1;p5<=n;p5++) {
1196 S2(i = p3,j = p5,k = p1) ;
1199 for (p5=p1+1;p5<=n;p5++) {
1205 for (p3=1;p3<=n-1;p3++) {
1206 for (p5=p3+1;p5<=n;p5++) {
1207 S2(i = p3,j = p5,k = n) ;
1213 File: cloog.info, Node: CLooG Library, Next: Installing, Prev: CLooG Software, Up: Top
1215 3 Using the CLooG Library
1216 *************************
1218 The CLooG Library was implemented to allow the user to call CLooG
1219 directly from his programs, without file accesses or system calls. The
1220 user only needs to link his programs with C libraries. The CLooG
1221 library mainly provides one function (`cloog_program_generate') which
1222 takes as input the problem description with some options, and returns
1223 the data structure corresponding to the generated code (a
1224 `CloogProgram' structure) which is more or less an abstract syntax tree.
1225 The user can work with this data structure and/or use our pretty
1226 printing function to write the final code in either C or FORTRAN. Some
1227 other functions are provided for convenience reasons. These functions
1228 as well as the data structures are described in this section.
1232 * CLooG Data Structures::
1234 * Example of Library Utilization::
1237 File: cloog.info, Node: CLooG Data Structures, Next: CLooG Functions, Up: CLooG Library
1239 3.1 CLooG Data Structures Description
1240 =====================================
1242 In this section, we describe the data structures used by the loop
1243 generator to represent and to process a code generation problem.
1259 File: cloog.info, Node: CloogMatrix, Next: CloogDomain, Up: CLooG Data Structures
1264 #define CloogMatrix Matrix
1266 The `CloogMatrix' structure is directly the PolyLib `Matrix' data
1267 structure (*note Wil93::). This structure is devoted to represent a set
1268 of constraints. It is defined in `polylib/types.h' as the following:
1271 { unsigned NbRows ; /* Number of rows. */
1272 unsigned NbColumns ; /* Number of columns. */
1273 Value ** p ; /* Array of pointers to the matrix rows. */
1274 Value * p_Init ; /* Matrix rows contiguously in memory. */
1275 int p_Init_size ; /* For internal use. */
1277 typedef struct matrix Matrix;
1279 The whole matrix is stored in memory row after row at the `p_Init'
1280 address. `p' is an array of pointers where `p[i]' points to the first
1281 element of the i^th row. `NbRows' and `NbColumns' are respectively the
1282 number of rows and columns of the matrix. Each row corresponds to a
1283 constraint. The first element of each row is an equality/inequality
1284 tag. The constraint is an equality p(x) = 0 if the first element is 0,
1285 but it is an inequality p(x) \geq 0 if the first element is 1. The
1286 next elements are the unknown coefficients, followed by the parameter
1287 coefficients, then the scalar coefficient. For instance, the following
1294 would be represented by the following rows:
1296 # eq/in i j k m n cst
1301 To be able to provide different precision version (CLooG supports 32
1302 bits, 64 bits and arbitrary precision through the GMP library), the
1303 `Value' type depends on the configuration options (it may be `long int'
1304 for 32 bits version, `long long int' for 64 bits version, and `mpz_t'
1305 for multiple precision version). The `p_Init_size' field is needed by
1306 the PolyLib to free the memory allocated by `mpz_init' in the multiple
1307 precision release. Set this field to 0 if you are _not_ using multiple
1308 precision. Set this field to the size of the `p_Init' array if you
1309 initialized it yourself and if you are using the multiple precision
1313 File: cloog.info, Node: CloogDomain, Next: CloogDomainList, Prev: CloogMatrix, Up: CLooG Data Structures
1319 { Polyhedron * polyhedron ; /* The polyhedral domain. */
1321 typedef struct cloogdomain CloogDomain ;
1323 The `CloogDomain' structure contains a PolyLib `Polyhedron' data
1324 structure which represents a polyhedral domain (a union of polyhedra)
1325 in both constraint representation and its dual ray representation
1326 (*note Wil93::). It is defined in `polylib/types.h' as the following:
1329 { unsigned Dimension, /* Number of dimensions. */
1330 NbConstraints, /* Number of constraints. */
1331 NbRays, /* Number of rays. */
1332 NbEq, /* Number of vertices (?). */
1333 NbBid ; /* Number of extremal rays (?). */
1334 Value ** Constraint ; /* Pointers to constraints. */
1335 Value ** Ray ; /* Pointers to rays. */
1336 Value * p_Init ; /* Constraints and rays contiguously. */
1337 int p_Init_size ; /* For internal use. */
1338 struct polyhedron * next ; /* Next component of the union. */
1340 typedef struct polyhedron Polyhedron;
1342 The constraint representation is quite the same as in the `Matrix' data
1343 structure (*note CloogMatrix::). The number of rows is `NbConstraints'
1344 and the number of columns in the `Polyhedron' structure is
1345 `Dimension+2' (the +2 comes from the equality/inequality tag and the
1346 scalar coefficient). As in the `Matrix' structure, The data are stored
1347 in memory contiguously at the `p_Init' address and the `p_Init_size'
1348 field is used for memory deallocation in the multiple precision case
1349 (*note CloogMatrix::). For a better understanding of the dual ray
1350 representation, the user may refer to the PolyLib documentation.
1353 File: cloog.info, Node: CloogDomainList, Next: CloogStatement, Prev: CloogDomain, Up: CLooG Data Structures
1355 3.1.3 CloogDomainList
1356 ---------------------
1358 struct cloogdomainlist
1359 { CloogDomain * domain ;
1360 struct cloogdomainlist * next ;
1362 typedef struct cloogdomainlist CloogDomainList ;
1364 The CloogDomainList structure represents a `NULL' terminated linked list
1368 File: cloog.info, Node: CloogStatement, Next: CloogBlock, Prev: CloogDomainList, Up: CLooG Data Structures
1370 3.1.4 CloogStatement
1371 --------------------
1373 struct cloogstatement
1374 { int number ; /* The statement unique number. */
1375 void * usr ; /* Pointer for user's convenience. */
1376 struct cloogstatement * next ;/* Next element of the linked list. */
1378 typedef struct cloogstatement CloogStatement ;
1380 The `CloogStatement' structure represents a `NULL' terminated linked
1381 list of statements. In CLooG, a statement is only defined by its unique
1382 number (`number'). The user can use the pointer `usr' for his own
1383 convenience to link his own statement representation to the
1384 corresponding `CloogStatement' structure. The whole management of the
1385 `usr' pointer is under the responsibility of the user, in particular,
1386 CLooG never tries to print, to allocate or to free a memory block
1390 File: cloog.info, Node: CloogBlock, Next: CloogBlockList, Prev: CloogStatement, Up: CLooG Data Structures
1396 { CloogStatement * statement ; /* Statement list of the block. */
1397 CloogMatrix * scattering ; /* Scattering function of the block. */
1398 int depth ; /* Original block depth.*/
1399 void * usr; /* Pointer for user's convenience. */
1401 typedef struct cloogblock CloogBlock ;
1403 The `CloogBlock' structure represents a statement block. In a
1404 statement block, every statements have the same iteration domain and
1405 the same scattering function (actually, the scattering functions may
1406 differ only by a scalar coefficient if it just precises the ordering of
1407 the statements within the block). `statement' is the statement list
1408 where the statement order matters, `scattering' is one of the statement
1409 scattering functions and `depth' is the number of dimensions of the
1410 iteration domain (only the unknown, not the tag/parameters/scalar).
1411 `usr' is a pointer for library user's convenience. Note this pointer is
1412 never allocated, freed or printed by CLooG.
1415 File: cloog.info, Node: CloogBlockList, Next: CloogLoop, Prev: CloogBlock, Up: CLooG Data Structures
1417 3.1.6 CloogBlockList
1418 --------------------
1420 struct cloogdblocklist
1421 { CloogBlock * block ;
1422 struct cloogblocklist * next ;
1424 typedef struct cloogblocklist CloogBlockList ;
1426 The CloogBlockList structure represents a `NULL' terminated linked list
1430 File: cloog.info, Node: CloogLoop, Next: CloogNames, Prev: CloogBlockList, Up: CLooG Data Structures
1436 { CloogDomain * domain; /* Iteration domain. */
1437 Value stride ; /* Loop stride. */
1438 CloogBlock * block ; /* Included statement block.*/
1439 void * usr; /* Pointer for user's convenience. */
1440 struct cloogloop * inner ; /* Loop at the next level. */
1441 struct cloogloop * next ; /* Next loop at the same level. */
1443 typedef struct cloogloop CloogLoop ;
1445 The `CloogLoop' structure represents a loop. First of all, a loop has
1446 an iteration domain (`domain'). The iterator's stride for loop
1447 increment is `stride'. The loop can include a statement block in the
1448 field `block'. If there is no included statement block, `block' is set
1449 to `NULL'. `usr' is a pointer for library user's convenience. Note that
1450 this pointer is never allocated, freed or printed by CLooG. `inner' is
1451 a pointer to the inner loop, and `next' a pointer to the next loop in
1452 the textual order. If there are no inner loop or no next loop, the
1453 corresponding pointer is set to `NULL'.
1456 File: cloog.info, Node: CloogNames, Next: CloogProgram, Prev: CloogLoop, Up: CLooG Data Structures
1462 { int nb_scattering ; /* Scattering dimension number. */
1463 int nb_iterators ; /* Iterator number. */
1464 int nb_parameters ; /* Parameter number. */
1465 char ** scattering ; /* The scattering dimension names. */
1466 char ** iterators ; /* The iterator names. */
1467 char ** parameters ; /* The parameter names. */
1469 typedef struct cloognames CloogNames ;
1471 The `CloogNames' structure represents the scattering dimension, the
1472 iterator and the parameter names in the final program. `nb_scattering'
1473 (respectively `nb_iterators' and `nb_parameters') is the number of
1474 scattering dimensions number (respectively the iterator and parameter
1475 number) and of elements in the corresponding array of strings
1476 `scattering' (respectively `iterators' and `parameters'). The i^th
1477 scattering dimension name will be associated with the to the dimension
1478 i of the scattering function. The i^th iterator name will be
1479 associated with the dimension i of the iteration domain. The i^th
1480 parameter name will be associated with the dimension i of the context
1481 polyhedron. The user has to ensure that there are enough scattering
1482 dimension, iterator and parameter names.
1485 File: cloog.info, Node: CloogProgram, Next: CloogOptions, Prev: CloogNames, Up: CLooG Data Structures
1491 { char language ; /* The language of the program. */
1492 int nb_scattdims ; /* Scattering dimension number. */
1493 CloogNames * names ; /* Iterators and parameters names. */
1494 CloogDomain * context ; /* The context of the program. */
1495 CloogLoop * loop ; /* The loops of the program. */
1496 CloogBlockList * blocklist ; /* The statement block list. */
1497 void * usr; /* For library user's convenience. */
1499 typedef struct cloogprogram CloogProgram ;
1501 The `CloogProgram' structure represents a static control program kernel.
1502 `language' precises the language (`c' for C or `f' for FORTRAN).
1503 `nb_scattdims' gives the number of scattering dimensions. `context' is
1504 a pointer to the constraints on the program parameters, it can't be the
1505 `NULL' pointer even if there are no constraints on parameters. In such a
1506 case, set a polyhedron with as many dimensions as there are parameters,
1507 with an _always true_ constraint like 1 \geq 0 (this is necessary since
1508 the number of parameters is deduced from the dimension number of the
1509 context constraints). `loop' is a pointer to the first loop of the
1510 program. `names' is a pointer to the various element names (scattering
1511 dimension, iterators, parameters) of the final program. `names' can be
1512 the `NULL' pointer if the user do not want to use our pretty printing
1513 function. `blocklist' is the linked list of all the statement block
1514 structures. `usr' is a pointer for library user's convenience. Note
1515 that this pointer is never allocated, freed or printed by CLooG. As an
1516 example, let us consider the following loop nest:
1517 for (i=0; i<=n; i++) {
1518 for (j=0; j<=n; j++) {
1522 for (j=n+1; j<=2*n; j++) {
1526 The next figure gives a possible representation in memory for this
1527 program thanks to the CLooG data structures (it has been actually
1528 printed by the `cloog_program_print' function). In this figure, `+--
1529 CloogLoop' denotes an `inner' loop, while a `CloogLoop' on the same
1530 column pointed by an arrow denotes a `next' loop:
1536 | Scattering dimension number: 0
1540 | | Scattering dimension number: 0
1542 | | +-- No scattering string
1544 | | Iterator number -----------: 2
1546 | | +-- Iterator strings ------: i j
1548 | | Parameter number ----------: 1
1550 | | +-- Parameter strings -----: n
1565 | | +-- Null CloogBlock
1569 | | | +-- CloogDomain
1570 | | | | [ 1 0 1 0 0 ]
1571 | | | | [ 1 0 -1 1 0 ]
1572 | | | | [ 1 0 0 0 1 ]
1576 | | | +-- Null CloogBlock
1580 | | | | +-- CloogDomain
1581 | | | | | [ 1 0 0 0 1 ]
1585 | | | | +-- CloogBlock
1587 | | | | | +-- CloogStatement 1
1590 | | | | | | CloogStatement 2
1592 | | | | | +-- Null scattering function
1600 | | | +-- CloogDomain
1601 | | | | [ 1 0 -1 2 0 ]
1602 | | | | [ 1 0 1 -1 -1 ]
1603 | | | | [ 1 0 0 0 1 ]
1607 | | | +-- Null CloogBlock
1611 | | | | +-- CloogDomain
1612 | | | | | [ 1 0 0 0 1 ]
1616 | | | | +-- CloogBlock
1618 | | | | | +-- CloogStatement 3
1620 | | | | | +-- Null scattering function
1630 File: cloog.info, Node: CloogOptions, Prev: CloogProgram, Up: CLooG Data Structures
1636 { int l ; /* -l option. */
1637 int f ; /* -f option. */
1638 int strides ; /* -strides option. */
1639 int sh ; /* -sh option. */
1640 int esp ; /* -esp option. */
1641 int csp ; /* -csp option. */
1642 int fsp ; /* -fsp option. */
1643 int otl ; /* -otl option. */
1644 int block ; /* -block option. */
1645 int cpp ; /* -cpp option. */
1646 int compilable ; /* -compilable option. */
1648 typedef struct cloogoptions CloogOptions ;
1650 The `CloogOptions' structure contains all the possible options to rule
1651 CLooG's behaviour (*note Calling CLooG::). As a reminder, the default
1653 * l = -1 (optimize control until the innermost loops),
1655 * f = 1 (optimize control from the outermost loops),
1657 * strides = 0 (use only unit strides),
1659 * sh = 0 (do not simplify convex hulls),
1661 * esp = 0 (do not spread complex equalities),
1663 * csp = 1 (spread constant values),
1665 * fsp = 1 (start to spread from the first iterators),
1667 * otl = 1 (simplify loops running only once).
1669 * block = 0 (do not make statement blocks when not necessary).
1671 * cpp = 0 (do not generate a compilable part of code using
1674 * compilable = 0 (do not generate a compilable code).
1677 File: cloog.info, Node: CLooG Functions, Next: Example of Library Utilization, Prev: CLooG Data Structures, Up: CLooG Library
1679 3.2 CLooG Functions Description
1680 ===============================
1684 * cloog_program_generate::
1685 * cloog_program_scatter::
1686 * cloog_program_pprint::
1687 * cloog_program_read::
1688 * From Matrices to Domains and Conversely::
1689 * Allocation and Initialization Functions::
1690 * Memory Deallocation Functions::
1691 * Printing Functions::
1694 File: cloog.info, Node: cloog_program_generate, Next: cloog_program_scatter, Up: CLooG Functions
1696 3.2.1 cloog_program_generate
1697 ----------------------------
1699 CloogProgram * cloog_program_generate
1700 ( CloogProgram * program, /* Input program. */
1701 CloogOptions * options /* Options. */
1704 The `cloog_program_generate' function generates the data structure of
1705 the source code that scans the input polyhedra pointed by `program'
1706 according to the options pointed by `options'. The process is made
1707 directly on the input structure pointed by `program', thus the original
1708 structure is no longer available after a call to this function. It
1709 returns a pointer to a `CloogProgram' structure containing the solution
1710 in CLooG structures.
1712 The input `CloogProgram' structure must have only one loop level (no
1713 inner loops): there is one loop per statement block. For a given block,
1714 the corresponding loop carries the iteration domain, the statement
1715 block, and a loop stride initialized to 1. For instance, the input
1716 `CloogProgram' structure that have been sent to
1717 `cloog_program_generate' to achieve the final structure and code shown
1718 as example in the `CloogProgram' structure description (*note
1719 CloogProgram::) was the following one:
1725 | Scattering dimension number: 0
1729 | | Scattering dimension number: 0
1731 | | +-- No scattering string
1733 | | Iterator number -----------: 2
1735 | | +-- Iterator strings ------: i j
1737 | | Parameter number ----------: 1
1739 | | +-- Parameter strings -----: n
1748 | | | [ 1 -1 0 1 0 ]
1750 | | | [ 1 0 -1 1 0 ]
1756 | | | +-- CloogStatement 1
1759 | | | | CloogStatement 2
1761 | | | +-- Null scattering function
1770 | | | [ 1 -1 0 1 0 ]
1771 | | | [ 1 0 1 -1 -1 ]
1772 | | | [ 1 0 -1 2 0 ]
1778 | | | +-- CloogStatement 3
1780 | | | +-- Null scattering function
1788 File: cloog.info, Node: cloog_program_pprint, Next: cloog_program_read, Prev: cloog_program_scatter, Up: CLooG Functions
1790 3.2.2 cloog_program_pprint
1791 --------------------------
1793 void cloog_program_pprint
1794 ( FILE * file, /* Output file. */
1795 CloogProgram * program, /* Program to print. */
1796 CloogOptions * options /* Options. */
1799 The function `cloog_program_pprint' is a pretty printer for
1800 `CloogProgram' structures when it is a solution provided by the
1801 `cloog_program_generate' function. It prints the code or pseudo-code in
1802 the file pointed by `file' (possibly `stdout') with regards to the
1803 options pointed by `options'.
1806 File: cloog.info, Node: cloog_program_scatter, Next: cloog_program_pprint, Prev: cloog_program_generate, Up: CLooG Functions
1808 3.2.3 cloog_program_scatter
1809 ---------------------------
1811 void cloog_program_scatter
1812 ( CloogProgram * program, /* Input program. */
1813 CloogDomainList * scattering, /* Additional scattering functions. */
1814 char ** names ; /* Additional dimension names. */
1817 The function `cloog_program_scatter' applies scattering functions to
1818 the `CloogProgram' structure pointed by `program'. Original domains of
1819 `program' are freed. Scattering functions are inside the
1820 `CloogDomainList' structure pointed by `scattering'. There must be as
1821 many scattering functions in the `CloogDomainList' structure as loops
1822 (i.e. iteration domains) in the `CloogProgram' structure. The first
1823 scattering function of the list will be applied to the iteration domain
1824 of the first loop in the program, and so on. `names' gives the
1825 scattering dimension names as an array of strings. If `names' is
1826 `NULL', names are automatically generated: the name of the n^th
1827 scattering dimension will be `cn'.
1830 File: cloog.info, Node: cloog_program_read, Next: From Matrices to Domains and Conversely, Prev: cloog_program_pprint, Up: CLooG Functions
1832 3.2.4 cloog_program_read
1833 ------------------------
1835 CloogProgram * cloog_program_read(FILE *) ;
1836 The `cloog_program_read' function reads the program data from a CLooG
1837 input file (*note Writing The Input File::). It takes as input a
1838 pointer to the file it has to read (possibly `stdin'), and return a
1839 pointer to the read `CloogProgram' structure.
1842 File: cloog.info, Node: From Matrices to Domains and Conversely, Next: Allocation and Initialization Functions, Prev: cloog_program_read, Up: CLooG Functions
1844 3.2.5 From Matrices to Domains and Conversely
1845 ---------------------------------------------
1847 CloogMatrix * cloog_domain_domain2matrix(CloogDomain *) ;
1848 CloogDomain * cloog_domain_matrix2domain(CloogMatrix *) ;
1849 Two functions are provided to translate a `CloogDomain' data structure
1850 to a `CloogMatrix' data structure and conversely. Each function takes
1851 as input a pointer to the data structure to translate and returns as
1852 output a pointer to the translated data structure. The input data
1853 structure if neither modified nor freed. They may be quite useful for
1854 e.g. pretty printing since it is more convenient in constraint (matrix)
1858 File: cloog.info, Node: Allocation and Initialization Functions, Next: Memory Deallocation Functions, Prev: From Matrices to Domains and Conversely, Up: CLooG Functions
1860 3.2.6 Allocation and Initialization Functions
1861 ---------------------------------------------
1863 CloogStructure * cloog_structure_malloc() ;
1864 Each CLooG data structure has an allocation and initialization function
1865 as shown above, where `Structure' and `structure' have to be replaced
1866 by the name of the convenient structure (without `Cloog' prefix) for
1867 instance `CloogLoop * cloog_loop_malloc() ;'. These functions return
1868 pointers to an allocated structure with fields set to convenient default
1869 values. *Using those functions is mandatory* to support internal
1870 management fields and to avoid upward compatibility problems if new
1871 fields appear. An exception is `cloog_matrix_malloc' since the
1872 `CloogMatrix' comes directly from the PolyLib. It takes two parameters:
1873 the number of rows and columns of the matrix we want to allocate:
1874 CloogMatrix * cloog_matrix_malloc(unsigned nbrows, unsigned nbcolumns);
1877 File: cloog.info, Node: Memory Deallocation Functions, Next: Printing Functions, Prev: Allocation and Initialization Functions, Up: CLooG Functions
1879 3.2.7 Memory Deallocation Functions
1880 -----------------------------------
1882 void cloog_structure_free(CloogStructure *) ;
1883 Each CLooG data structure has a deallocation function as shown above,
1884 where `Structure' and `structure' have to be replaced by the name of
1885 the convenient structure (without `Cloog' prefix) for instance `void
1886 cloog_loop_free(CloogLoop *) ;'. These functions free the allocated
1887 memory for the structure provided as input. They free memory
1888 recursively, i.e. they also free the allocated memory for the internal
1889 structures. *Using those functions is mandatory* to avoid memory leaks
1890 on internal management fields and to avoid upward compatibility
1891 problems if new fields appear.
1894 File: cloog.info, Node: Printing Functions, Prev: Memory Deallocation Functions, Up: CLooG Functions
1896 3.2.8 Printing Functions
1897 ------------------------
1899 void cloog_structure_print(FILE *, CloogStructure *) ;
1900 Each CLooG data structure has a printing function as shown above,
1901 where `Structure' and `structure' have to be replaced by the name of
1902 the convenient structure (without `Cloog' prefix) for instance `void
1903 cloog_loop_print(FILE *, CloogLoop *) ;'. These functions print the
1904 pointed structure (and its fields recursively) to the file provided as
1905 input (possibly `stdout').
1908 File: cloog.info, Node: Example of Library Utilization, Prev: CLooG Functions, Up: CLooG Library
1910 3.3 Example of Library Utilization
1911 ==================================
1913 Here is a basic example showing how it is possible to use the CLooG
1914 library, assuming that a standard installation has been done. The
1915 following C program reads a CLooG input file on the standard input,
1916 then prints the solution on the standard output. Options are
1917 preselected to the default values of the CLooG software. This example
1918 is provided in the `example' directory of the CLooG distribution.
1921 # include <cloog/cloog.h>
1924 { CloogProgram * program ;
1925 CloogOptions * options ;
1927 /* Setting options and reading program informations. */
1928 options = cloog_options_malloc() ;
1929 program = cloog_program_read(stdin,options) ;
1931 /* Generating and printing the code. */
1932 program = cloog_program_generate(program,options) ;
1933 cloog_program_pprint(stdout,program,options) ;
1935 cloog_options_free(options) ;
1936 cloog_program_free(program) ;
1940 The compilation command could be:
1941 gcc example.c -lcloog -o example
1942 A calling command with the input file test.cloog could be:
1943 more test.cloog | ./example
1946 File: cloog.info, Node: Installing, Next: Documentation, Prev: CLooG Library, Up: Top
1955 * Basic Installation::
1956 * Optional Features::
1960 File: cloog.info, Node: License, Next: Requirements, Up: Installing
1965 First of all, it would be very kind to refer the following paper in any
1966 publication that result from the use of the CLooG software or its
1967 library, *note Bas04:: (a bibtex entry is provided behind the title
1968 page of this manual, along with copyright notice, and in the CLooG home
1969 `http://www.CLooG.org'.
1971 This program is free software; you can redistribute it and/or modify
1972 it under the terms of the GNU General Public License version 2 as
1973 published by the Free Software Foundation. This program is distributed
1974 in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
1975 even the implied warranty of MERCHANTABILITY or FITNESS FOR A
1976 PARTICULAR PURPOSE. See the GNU General Public License for more
1977 details. `http://www.gnu.org/copyleft/gpl.html'
1980 File: cloog.info, Node: Requirements, Next: Basic Installation, Prev: License, Up: Installing
1991 File: cloog.info, Node: PolyLib, Next: GMP Library, Up: Requirements
1993 4.2.1 PolyLib (mandatory)
1994 -------------------------
1996 To successfully install CLooG, the user need firstly to install PolyLib
1997 version 5.22.1 or above (default 64 bits version is satisfying as well
1998 as 32 bits or GMP multiple precision version). Polylib can be
1999 downloaded freely at `http://icps.u-strasbg.fr/PolyLib/' or
2000 `http://www.irisa.fr/polylib/'. Once downloaded and unpacked (e.g.
2001 using the `tar -zxvf polylib-5.22.1.tar.gz' command), the user can
2002 compile it by typing the following commands on the PolyLib's root
2009 * And as root: `make install'
2011 The PolyLib default installation is `/usr/local'. This directory may
2012 not be inside your library path. To fix the problem, the user should set
2013 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
2014 if your shell is, e.g., bash or
2015 setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib
2016 if your shell is, e.g., tcsh. Add the line to your .bashrc or
2017 .tcshrc (or whatever convenient file) to make this change permanent.
2018 Another solution is to ask PolyLib to install in the standard path by
2019 using the prefix option of the configure script: `./configure
2022 CLooG makes intensive calls to polyhedral operations, and PolyLib
2023 functions do the job. Polylib is a free library written in C for the
2024 manipulation of polyhedra. The library is operating on objects like
2025 vectors, matrices, lattices, polyhedra, Z-polyhedra, unions of
2026 polyhedra and a lot of other intermediary structures. It provides
2027 functions for all the important operations on these structures.
2030 File: cloog.info, Node: GMP Library, Prev: PolyLib, Up: Requirements
2032 4.2.2 GMP Library (optional)
2033 ----------------------------
2035 To be able to deal with insanely large coefficient, the user will need
2036 to install the GNU Multiple Precision Library (GMP for short) version
2037 4.1.4 or above. It can be freely downloaded from
2038 `http://www.swox.com/gmp'. The user can compile it by typing the
2039 following commands on the GMP root directory:
2045 * And as root: `make install'
2047 The GMP default installation is `/usr/local', the same method to fix
2048 a library path problem applies as with PolyLib (*note PolyLib::).
2050 The PolyLib has to be built using the GMP library by specifying the
2051 option `--with-libgmp=PATH_TO_GMP' to the PolyLib configure script
2052 (where `PATH_TO_GMP' is `/usr/local' if you did not change the GMP
2053 installation directory). Then you have to set the convenient CLooG
2054 configure script options to buid the GMP version (*note Optional
2058 File: cloog.info, Node: Basic Installation, Next: Optional Features, Prev: Requirements, Up: Installing
2060 4.3 CLooG Basic Installation
2061 ============================
2063 Once downloaded and unpacked (e.g. using the `tar -zxvf cloog-.tar.gz'
2064 command), you can compile CLooG by typing the following commands on the
2065 CLooG's root directory:
2071 * And as root: `make install'
2073 Depending on the location of the PolyLib, you may need to set the
2074 option `--with-polylib' of the configure script (e.g. `./configure
2075 --with-polylib=/usr/local' with a default PolyLib installation).
2077 The program binaries and object files can be removed from the source
2078 code directory by typing `make clean'. To also remove the files that
2079 the `configure' script created (so you can compile the package for a
2080 different kind of computer) type `make distclean'.
2082 Both the CLooG software and library have been successfully compiled
2083 on the following systems:
2084 * PC's under Linux, with the `gcc' compiler,
2086 * PC's under Windows (Cygwin), with the `gcc' compiler,
2088 * Sparc and UltraSparc Stations, with the `gcc' compiler.
2091 File: cloog.info, Node: Optional Features, Next: Uninstallation, Prev: Basic Installation, Up: Installing
2093 4.4 Optional Features
2094 =====================
2096 The `configure' shell script attempts to guess correct values for
2097 various system-dependent variables and user options used during
2098 compilation. It uses those values to create the `Makefile'. Various
2099 user options are provided by the CLooG's configure script. They are
2100 summarized in the following list and may be printed by typing
2101 `./configure --help' in the CLooG top-level directory.
2103 * By default, the installation directory is `/usr/local': `make
2104 install' will install the package's files in `/usr/local/bin',
2105 `/usr/local/lib' and `/usr/local/include'. The user can specify
2106 an installation prefix other than `/usr/local' by giving
2107 `configure' the option `--prefix=PATH'.
2109 * By default, `configure' will look for the PolyLib in standard
2110 locations. If necessary, the user can specify the PolyLib path by
2111 giving `configure' the option `--with-polylib=PATH'.
2113 * By default, both CLooG software and library are compiled and
2114 installed. By giving `configure' the option `--without-cloog'
2115 the user disable the compilation and installation of the CLooG
2116 software. By giving `configure' the option `--without-lib' the
2117 user disable the compilation and installation of the CLooG library.
2119 * By default, CLooG is built in 64bits version if such version of the
2120 PolyLib is found by `configure'. If the only existing version of
2121 the PolyLib is the 32bits or if the user give to `configure' the
2122 option `--with-bits=32', the 32bits version of CLooG will be
2123 compiled. In the same way, the option `--with-bits=gmp' have to be
2124 used to build the multiple precision version.
2126 * By default, `configure' will look for the GMP library (necessary
2127 to build the multiple precision version) in standard locations. If
2128 necessary, the user can specify the GMP path by giving `configure'
2129 the option `--with-gmp=PATH'.
2132 File: cloog.info, Node: Uninstallation, Prev: Optional Features, Up: Installing
2137 The user can easily remove the CLooG software and library from his
2138 system by typing (as root if necessary) from the CLooG top-level
2139 directory `make uninstall'.
2142 File: cloog.info, Node: Documentation, Next: References, Prev: Installing, Up: Top
2147 The CLooG distribution provides several documentation sources. First,
2148 the source code itself is as documented as possible. The code comments
2149 use a Doxygen-compatible presentation (something similar to what
2150 JavaDoc does for JAVA). The user may install Doxygen (see
2151 `http://www.stack.nl/~dimitri/doxygen') to automatically generate a
2152 technical documentation by typing `make doc' or `doxygen
2153 ./autoconf/Doxyfile' at the CLooG top-level directory after running the
2154 configure script (*note Installing::). Doxygen will generate
2155 documentation sources (in HTML, LaTeX and man) in the `doc/source'
2156 directory of the CLooG distribution.
2158 The Texinfo sources of the present document are also provided in the
2159 `doc' directory. You can build it in either DVI format (by typing
2160 `texi2dvi cloog.texi') or PDF format (by typing `texi2pdf cloog.texi')
2161 or HTML format (by typing `makeinfo --html cloog.texi', using
2162 `--no-split' option to generate a single HTML file) or info format (by
2163 typing `makeinfo cloog.texi').
2166 File: cloog.info, Node: References, Prev: Documentation, Up: Top
2171 * [Bas03a] C. Bastoul, P. Feautrier. Improving data locality by
2172 chunking. CC'12 International Conference on Compiler Construction,
2173 LNCS 2622, pages 320-335, Warsaw, april 2003.
2175 * [Bas03b] C. Bastoul. Efficient code generation for automatic
2176 parallelization and optimization. ISPDC'03 IEEE International
2177 Symposium on Parallel and Distributed Computing, pages 23-30,
2178 Ljubljana, october 2003.
2180 * [Bas04] C. Bastoul. Code Generation in the Polyhedral Model Is
2181 Easier Than You Think. PACT'13 IEEE International Conference on
2182 Parallel Architecture and Compilation Techniques, pages 7-16,
2183 Juan-les-Pins, september 2004.
2185 * [Fea92] P. Feautrier Some efficient solutions to the affine
2186 scheduling problem, part II: multidimensional time. International
2187 Journal of Parallel Programming, 21(6):389-420, December 1992.
2189 * [Gri04] M. Griebl. Automatic parallelization of loop programs for
2190 distributed memory architectures. Habilitation Thesis. Faculta"t
2191 fu"r Mathematik und Informatik, Universita"t Passau, 2004.
2192 _http://www.infosun.fmi.uni-passau.de/cl/loopo/_
2194 * [Qui00] F. Quillere', S. Rajopadhye, and D. Wilde. Generation of
2195 efficient nested loops from polyhedra. International Journal of
2196 Parallel Programming, 28(5):469-498, october 2000.
2198 * [Wil93] Doran K. Wilde. A library for doing polyhedral operations.
2199 Technical Report 785, IRISA, Rennes, France, 1993.
2206 Node: Introduction
\x7f2857
2207 Node: Basics
\x7f5283
2208 Node: Scattering
\x7f7961
2209 Node: CLooG Software
\x7f12628
2210 Node: A First Example
\x7f12889
2211 Node: Writing The Input File
\x7f15805
2212 Node: Domain Representation
\x7f20084
2213 Node: Scattering Representation
\x7f22516
2214 Node: Calling CLooG
\x7f26188
2215 Node: CLooG Options
\x7f26976
2216 Node: Last Depth to Optimize Control
\x7f27450
2217 Node: First Depth to Optimize Control
\x7f29043
2218 Node: Simplify Convex Hull
\x7f30128
2219 Node: Once Time Loop Elimination
\x7f31435
2220 Node: Equality Spreading
\x7f32024
2221 Node: Constant Spreading
\x7f32663
2222 Node: First Level for Spreading
\x7f33388
2223 Node: C PreProcessor Friendly
\x7f34336
2224 Node: Statement Block
\x7f35438
2225 Node: Loop Strides
\x7f36209
2226 Node: Compilable Code
\x7f37002
2227 Node: Output
\x7f38669
2229 Node: Version
\x7f39175
2230 Node: Full Example
\x7f39395
2231 Node: CLooG Library
\x7f44437
2232 Node: CLooG Data Structures
\x7f45419
2233 Node: CloogMatrix
\x7f45900
2234 Node: CloogDomain
\x7f48170
2235 Node: CloogDomainList
\x7f50061
2236 Node: CloogStatement
\x7f50468
2237 Node: CloogBlock
\x7f51429
2238 Node: CloogBlockList
\x7f52604
2239 Node: CloogLoop
\x7f52995
2240 Node: CloogNames
\x7f54197
2241 Node: CloogProgram
\x7f55586
2242 Node: CloogOptions
\x7f61853
2243 Node: CLooG Functions
\x7f63502
2244 Node: cloog_program_generate
\x7f63956
2245 Node: cloog_program_pprint
\x7f67537
2246 Node: cloog_program_scatter
\x7f68224
2247 Node: cloog_program_read
\x7f69368
2248 Node: From Matrices to Domains and Conversely
\x7f69864
2249 Node: Allocation and Initialization Functions
\x7f70684
2250 Node: Memory Deallocation Functions
\x7f71767
2251 Node: Printing Functions
\x7f72622
2252 Node: Example of Library Utilization
\x7f73211
2253 Node: Installing
\x7f74533
2254 Node: License
\x7f74768
2255 Node: Requirements
\x7f75627
2256 Node: PolyLib
\x7f75801
2257 Node: GMP Library
\x7f77457
2258 Node: Basic Installation
\x7f78451
2259 Node: Optional Features
\x7f79588
2260 Node: Uninstallation
\x7f81640
2261 Node: Documentation
\x7f81926
2262 Node: References
\x7f83050
2263 Ref: Bas03a
\x7f83153
2264 Ref: Bas03b
\x7f83344