3 @c % /**-----------------------------------------------------------------**
5 @c % **-----------------------------------------------------------------**
7 @c % **-----------------------------------------------------------------**
8 @c % ** First version: july 6th 2002 **
9 @c % **-----------------------------------------------------------------**/
11 @c % release 1.0: september 17th 2002
12 @c % release 1.1: december 5th 2002
13 @c % release 1.2: april 22th 2003
14 @c % release 2.0: november 21th 2005 (and now in texinfo instead of LaTeX)
16 @c %/**************************************************************************
17 @c % * CLooG : the Chunky Loop Generator (experimental) *
18 @c % **************************************************************************/
19 @c %/* CAUTION: the english used is probably the worst you ever read, please
20 @c % * feel free to correct and improve it !
23 @c %\textit{"I found the ultimate transformation functions, optimization for
24 @c %static control programs is now a closed problem, I have \textnormal{just}
25 @c %to generate the target code !"}
29 @c % /*************************************************************************
30 @c % * PART I: HEADER *
31 @c % *************************************************************************/
33 @setfilename cloog.info
34 @settitle CLooG - a loop generator for scanning polyhedra
38 @set UPDATED November 17th 2005
39 @setchapternewpage odd
43 @c % /*************************************************************************
44 @c % * PART II: SUMMARY DESCRIPTION AND COPYRIGHT *
45 @c % *************************************************************************/
48 This manual is for CLooG version @value{VERSION}, a software
49 which generates loops for scanning Z-polyhedra. That is, CLooG produces a
50 code visiting each integral point of a union of parametrized
51 polyhedra. CLooG is designed to avoid control overhead and to produce a very
54 It would be quite kind to refer the following paper in any publication that
55 results from the use of the CLooG software or its library:
58 @@InProceedings@{Bas04,
59 @ @ author =@ @ @ @ @{C. Bastoul@},
60 @ @ title =@ @ @ @ @ @{Code Generation in the Polyhedral Model
61 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Is Easier Than You Think@},
62 @ @ booktitle = @{PACT'13 IEEE International Conference on
63 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Parallel Architecture and Compilation Techniques@},
64 @ @ year =@ @ @ @ @ @ 2004,
65 @ @ pages =@ @ @ @ @ @{7--16@},
66 @ @ month =@ @ @ @ @ @{september@},
67 @ @ address =@ @ @ @{Juan-les-Pins@}
71 Copyright @copyright{} 2002-2005 C@'edric Bastoul.
74 Permission is granted to copy, distribute and/or modify this document under
75 the terms of the GNU Free Documentation License, Version 1.2
76 published by the Free Software Foundation. To receive a copy of the
77 GNU Free Documentation License, write to the Free Software Foundation, Inc.,
78 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
82 @c % /*************************************************************************
83 @c % * PART III: TITLEPAGE, CONTENTS, COPYRIGHT *
84 @c % *************************************************************************/
87 @subtitle A Loop Generator For Scanning Polyhedra
88 @subtitle Edition @value{EDITION}, for CLooG @value{VERSION}
89 @subtitle @value{UPDATED}
90 @author C@'edric Bastoul
92 @c The following two commands start the copyright page.
94 @noindent (September 2001)
96 @item C@'edric Bastoul
97 SCHEDULES GENERATE !!! I just need to apply them now, where can I find
98 a good code generator ?!
101 Hmmm. I fear that if you want something powerful enough, you'll have to
105 @vskip 0pt plus 1filll
109 @c Output the table of contents at the beginning.
112 @c % /*************************************************************************
113 @c % * PART IV: TOP NODE AND MASTER MENU *
114 @c % *************************************************************************/
134 @c % /*************************************************************************
135 @c % * PART V: BODY OF THE DOCUMENT *
136 @c % *************************************************************************/
138 @c % ****************************** INTRODUCTION ******************************
140 @chapter Introduction
141 CLooG is a free software and library generating loops for scanning Z-polyhedra.
142 That is, it finds a code (e.g. in C, FORTRAN...) that reaches each integral
143 point of one or more parameterized polyhedra. CLooG has been originally
144 written to solve the code generation problem for optimizing compilers based on
145 the polytope model. Nevertheless it is used now in various area, e.g., to build
146 control automata for high-level synthesis or to find the best polynomial
147 approximation of a function. CLooG may help in any situation where scanning
148 polyhedra matters. It uses the best state-of-the-art code generation
149 algorithm known as the Quiller@'e et al. algorithm (@pxref{Qui00})
150 with our own improvements and extensions (@pxref{Bas04}).
151 The user has full control on generated code quality.
152 On one hand, generated code size has to be tuned for sake of
153 readability or instruction cache use. On the other hand, we must ensure that
154 a bad control management does not hamper performance of the generated code,
155 for instance by producing redundant guards or complex loop bounds.
156 CLooG is specially designed to avoid control overhead and to produce a very
159 CLooG stands for @emph{Chunky Loop Generator}: it is a part of the Chunky
160 project, a research tool for data locality improvement (@pxref{Bas03a}).
162 also to be the back-end of automatic parallelizers like LooPo (@pxref{Gri04}).
164 compilable code oriented and provides powerful program transformation
165 facilities. Mainly, it allows the user to specify very general schedules where,
166 e.g., unimodularity or invertibility of the transformation doesn't matter.
168 The current version is still under
169 evaluation, and there is no guarantee that the upward compatibility
170 will be respected (but the previous API has been stable for two years,
171 we hope this one will be as successful -and we believe it-).
172 A lot of reports are necessary to freeze the library
173 API and the input file shape. Most API changes from 0.12.x to 0.14.x
174 have been requested by the users themselves.
175 Thus you are very welcome and encouraged
176 to post reports on bugs, wishes, critics, comments, suggestions or
177 successful experiences in the forum of @code{http://www.CLooG.org}
178 (preferably) or to send them to cedric.bastoul@@inria.fr directly.
186 @section Basically, what's the point ?
187 If you want to use CLooG, this is because you want to scan or to find
188 something inside the integral points of a set of polyhedra. There are many
189 reasons for that. Maybe you need the generated code itself because it
190 actually implements a very smart program transformation you found.
191 Maybe you want to use the generated code
192 because you know that the solution of your problem belongs to the integral
193 points of those damned polyhedra and you don't know which one. Maybe you just
194 want to know if a polyhedron has integral points depending on some parameters,
195 which is the lexicographic minimum, maximum, the third on the basis of the
196 left etc. Probably you have your own reasons to use CLooG.
198 Let us illustrate a basic use of CLooG. Suppose we have a set of affine
199 constraints that describes a part of a whatever-dimensional space,
200 called a @strong{domain}, and we
201 want to scan it. Let us consider for instance the following set of constraints
203 and @samp{j} are the unknown (the two dimensions of the space) and
204 @samp{m} and @samp{n} are the parameters (some symbolic constants):
212 Let us also consider that we have a partial knowledge of the parameter values,
213 called the @strong{context}, expressed as affine constraints as well,
221 Note that using parameters is optional, if you are not comfortable with
222 parameter manipulation, just replace them with any scalar value that fits
223 @code{m>=2} and @code{n>=2}.
224 A graphical representation of this part of the 2-dimensional space, where
225 the integral points are represented using heavy dots would be for instance:
227 @image{./images/basic,6cm}
229 The affine constraints of both the domain and the context are what we will
230 provide to CLooG as input (in a particular shape that will be described later).
231 The output of CLooG is a pseudo-code to scan the integral points of the
232 input domain according to the context:
235 for (i=2;i<=n;i++) @{
236 for (j=2;j<=min(m,-i+n+2);j++) @{
242 If you felt such a basic example is yet interesting, there is a good chance
243 that CLooG is appropriate for you. CLooG can do much more: scanning several
244 polyhedra or unions of polyhedra at the same time, applying general affine
245 transformations to the polyhedra, generate compilable code etc. Welcome
246 to the CLooG's user's guide !
249 @section Defining a Scanning Order: Scattering Functions
250 In CLooG, domains only define the set of integral points to scan and their
251 coordinates. In particular, CLooG is free to choose the scanning order for
252 generating the most efficient code. This means, for optimizing/parallelizing
253 compiler people, that CLooG doesn't make any speculation on dependences on and
254 between statements (by the way, it's not its job !).
255 For instance, if an user give to
256 CLooG only two domains @code{S1:1<=i<=n}, @code{S2:1<=i<=n} and the context
257 @code{n>=1}, the following pseudo-codes are considered to be equivalent:
261 /* A convenient target pseudo-code. */
262 for (i=1;i<=N;i++) @{
265 for (i=1;i<=N;i++) @{
273 /* Another convenient target pseudo-code. */
274 for (i=1;i<=N;i++) @{
281 The default behaviour
282 of CLooG is to generate the second one, since it is optimized in control.
283 It is right if there are no data dependences
284 between @code{S1} and @code{S2}, but wrong otherwise.
286 Thus it is often useful to force scanning to respect a given order. This can be
287 done in CLooG by using @strong{scattering functions}. Scattering is a
288 shortcut for scheduling, allocation, chunking functions and the like we can
289 find in the restructuring compilation litterature. There are a lot of reasons
290 to scatter the integral points of the domains (i.e. the statement instances
291 of a program, for compilation people), parallelization or optimization are good
292 examples. For instance, if the user wants for any reason to set some
293 precedence constraints between the statements of our example above
294 in order to force the generation of the
295 first code, he can do it easily by setting (for example) the following
296 scheduling functions:
299 $$\theta _{S1}(i) = (1)$$
300 $$\theta _{S2}(j) = (2)$$
312 This scattering means that each integral point of the domain @code{S1}
313 is scanned at logical date @code{1} while each integral point of the domain
314 @code{S2} is scanned at logical date @code{2}. As a result, the whole
315 domain @code{S1} is scanned before domain @code{S2} and the first code in our
316 example is generated.
318 The user can set every kind of affine scanning order thanks to the
319 scattering functions. Each domain has its own scattering function and
320 each scattering function may be multi-dimensional. A multi-dimentional logical
321 date may be seen as classical date (year,month,day,hour,minute,etc.) where
322 the first dimensions are the most significant. Each scattering dimension
323 may depend linearly on the original dimensions (e.g., @code{i}), the
324 parameters (e.g., @code{n}) ans scalars (e.g., @code{2}).
326 A very useful example of multi-dimensional scattering functions is, for
327 compilation people, the scheduling of the original program.
328 The basic data to use for code generation are statement iteration domains.
329 As we saw, these data are not sufficient to rebuild the original
330 program (what is the ordering between instances of different statements ?).
331 The missing data can be put in the scattering functions as the original
332 scheduling. The method to compute it is quite simple (@pxref{Fea92}). The idea is to
333 build an abstract syntax tree of the program and to read the scheduling for
334 each statement. For instance, let us consider the following implementation of
335 a Cholesky factorization:
339 /* A Cholesky factorization kernel. */
340 for (i=1;i<=N;i++) @{
341 for (j=1;j<=i-1;j++) @{
342 a[i][i] -= a[i][j] ; /* S1 */
344 a[i][i] = sqrt(a[i][i]) ; /* S2 */
345 for (j=i+1;j<=N;j++) @{
346 for (k=1;k<=i-1;k++) @{
347 a[j][i] -= a[j][k]*a[i][k] ; /* S3 */
349 a[j][i] /= a[i][i] ; /* S4 */
356 The corresponding abstract syntax tree is given in the following figure.
357 It directly gives the scattering functions (schedules) for all the
358 statements of the program.
360 @image{./images/tree,6cm}
364 \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (0,i,0,j,0)^T$\cr
365 \theta _{S2}(i) &$= (0,i,1)^T$\cr
366 \theta _{S3}(i,j,k)^T &$= (0,i,2,j,0,k,0)^T$\cr
367 \theta _{S4}(i,j)^T &$= (0,i,2,j,1)^T$}$}
374 T_S1(i,j)^T = (0,i,0,j,0)^T
376 T_S3(i,j,k)^T = (0,i,2,j,0,k,0)^T
377 T_S4(i,j)^T = (0,i,2,j,1)^T
382 These schedules depend on the iterators and give for each instance of each
383 statement a unique execution date. Using such scattering functions allow
384 CLooG to re-generate the input code.
390 @c % ***********************Using the CLooG Software **************************
392 @chapter Using the CLooG Software
397 * Writing The Input File::
403 @c %/*************************************************************************
404 @c % * A FIRST EXAMPLE *
405 @c % *************************************************************************/
406 @node A First Example
407 @section A First Example
408 CLooG takes as input a file that must be written accordingly to a grammar
409 described in depth in a further section (@pxref{Writing The Input File}).
410 Moreover it supports many options to tune the target code presentation or
411 quality as discussed in a dedicated section (@pxref{Calling CLooG}).
413 of CLooG is not very complex and we present in this section how to generate the
414 code corresponding to a basic example discussed earlier (@pxref{Basics}).
416 The problem is to find the code that scans a 2-dimensional polyhedron
417 where @samp{i} and @samp{j} are the unknown (the two dimensions of the space)
418 and @samp{m} and @samp{n} are the parameters (the symbolic constants),
419 defined by the following set of constraints:
427 @noindent We also consider a partial knowledge of the parameter values,
428 expressed thanks to the following affine constraints:
436 An input file that corresponds to this problem, and asks for a generated
437 code in C, may be the following. Note that we do not describe here precisely
438 the structure and the components of this file (@pxref{Writing The Input File}
439 for such information, if you feel it necessary):
442 # ---------------------- CONTEXT ----------------------
445 # Context (constraints on two parameters)
446 2 4 # 2 lines and 4 columns
447 # eq/in m n 1 eq/in: 1 for inequality >=0, 0 for equality =0
448 1 1 0 -2 # 1*m + 0*n -2*1 >= 0, i.e. m>=2
449 1 0 1 -2 # 0*m + 1*n -2*1 >= 0, i.e. n>=2
451 1 # We want to set manually the parameter names
452 m n # parameter names
454 # --------------------- STATEMENTS --------------------
455 1 # Number of statements
457 1 # First statement: one domain
459 5 6 # 5 lines and 6 columns
461 1 1 0 0 0 -2 # i >= 2
462 1 -1 0 0 1 0 # i <= n
463 1 0 1 0 0 -2 # j >= 2
464 1 0 -1 1 0 0 # j <= m
465 1 -1 -1 0 1 2 # n+2-i>=j
466 0 0 0 # for future options
468 1 # We want to set manually the iterator names
471 # --------------------- SCATTERING --------------------
472 0 # No scattering functions
475 This file may be called @samp{basic.cloog}
476 (this example is provided in the CLooG distribution as
477 @code{test/manual_basic.cloog}) and we can ask CLooG to process it
478 and to generate the code by a simple calling to CLooG with this file as input:
479 @samp{cloog basic.cloog}. By default, CLooG will print the generated code in
484 /* Generated by CLooG v@value{VERSION} in 0.00s. */
485 for (i=2;i<=n;i++) @{
486 for (j=2;j<=min(m,-i+n+2);j++) @{
493 @c %/*************************************************************************
495 @c % *************************************************************************/
496 @node Writing The Input File
497 @section Writing The Input File
498 The input text file contains a problem description, i.e. the context,
499 the domains and the scattering functions.
500 Because CLooG is very 'compilable code generation oriented', we can associate
501 some additional informations to each domain. We call this association a
502 @emph{statement}. The set of all informations is
503 called a @emph{program}. The input file respects the grammar below
504 (terminals are preceeded by "_"):
508 Program ::= Context Statements Scattering
509 Context ::= Language Domain Naming
510 Statements ::= Nb_statements Statement_list Naming
511 Scattering ::= Nb_functions Domain_list Naming
512 Naming ::= Option Name_list
513 Name_list ::= _String Name_list | (void)
514 Statement_list ::= Statement Statement_list | (void)
515 Domain_list ::= _Domain Domain_list | (void)
516 Statement ::= Iteration_domain 0 0 0
517 Iteration_domain ::= Domain_union
518 Domain_union ::= Nb_domains Domain_list
521 Nb_statements ::= _Integer
522 Nb_domains ::= _Integer
523 Nb_functions ::= _Integer
527 @item @samp{Context} represents the informations that are
528 shared by all the statements. It consists on
529 the language used (which can be @samp{c} for C or @samp{f} for FORTRAN 90)
530 and the global constraints on parameters.
531 These constraints are essential
532 since they give to CLooG the number of parameters. If there is no
533 parameter or no constraints on parameters, just give a constraint
534 always satisfied like @math{1 \geq 0}. @samp{Naming} sets the parameter
536 If the naming option @samp{Option} is 1, parameter names will be read
537 on the next line. There must be exactly as many names as parameters.
538 If the naming option @samp{Option} is 0, parameter names are
539 automatically generated. The name of the first parameter will
540 be @samp{M}, and the name of the @math{(n+1)^{th}} parameter directly
541 follows the name of the @math{n^{th}} parameter in ASCII code.
542 It is the user responsibility to ensure that parameter names,
543 iterators and scattering dimension names are different.
544 @item @samp{Statements} represents the informations on the statements.
545 @samp{Nb_statements} is the number of statements in the program,
546 i.e. the number of @samp{Statement} items in the @samp{Statement_list}.
547 @samp{Statement} represents the informations on a given statement.
548 To each statement is associated a domain
549 (the statement iteration domain: @samp{Iteration_domain}) and three
550 zeroes that represents future options.
551 @samp{Naming} sets the iterator names. If the naming option
552 @samp{Option} is 1, the iterator names
553 will be read on the next line. There must be exactly as many names as
554 nesting level in the deepest iteration domain. If the naming option
555 @samp{Option} is 0, iterator names are automatically generated.
556 The iterator name of the outermost loop will be @samp{i}, and the
557 iterator name of the loop at level @math{n+1} directly follows the
558 iterator name of the loop at level @math{n} in ASCII code.
559 @item @samp{Scattering} represents the informations on scattering functions.
560 @samp{Nb_functions} is the number of functions (it must be
561 equal to the number of statements or 0 if there is no scattering
562 function). The function themselves are represented through
564 @samp{Naming} sets the scattering dimension names. If the naming option
565 @samp{Option} is 1, the scattering dimension names will be read on the
567 There must be exactly as many names as scattering dimensions. If the
568 naming option @samp{Option} is 0, scattering dimension names are automatically
569 generated. The name of the @math{n^{th}} scattering dimention
574 * Domain Representation::
575 * Scattering Representation::
578 @node Domain Representation
579 @subsection Domain Representation
580 As shown by the grammar, the input file describes the various informations
581 thanks to characters, integers and domains. Each domain is defined by a set of
582 constraints in the PolyLib format (@pxref{Wil93}). They have the
585 @item some optional comment lines beginning with @samp{#},
586 @item the row and column numbers, possibly followed by comments,
587 @item the constraint rows, each row corresponds to a constraint the
588 domain have to satisfy. Each row must be on a single line and is possibly
589 followed by comments. The constraint is an equality @math{p(x) = 0} if the
590 first element is 0, an inequality @math{p(x) \geq 0} if the first element
591 is 1. The next elements are the unknown coefficients, followed by
592 the parameter coefficients. The last element is the constant factor.
594 For instance, assuming that @samp{i}, @samp{j} and @samp{k} are iterators and
595 @samp{m} and @samp{n} are parameters, the domain defined by the following
600 \hbox{$ \cases{ -i + m &$\geq 0$\cr
602 i + j - k &$\geq 0$}$}
616 @noindent can be written in the input file as follows :
621 3 7 # 3 lines and 7 columns
623 1 -1 0 0 1 0 0 # -i + m >= 0
624 1 0 -1 0 0 1 0 # -j + n >= 0
625 1 1 1 -1 0 0 0 # i + j - k >= 0
629 Each iteration domain @samp{Iteration_domain} of a given statement
630 is a union of polyhedra
631 @samp{Domain_union}. A union is defined by its number of elements
632 @samp{Nb_domains} and the elements themselves @samp{Domain_list}.
633 For instance, let us consider the following pseudo-code:
637 for (i=1;i<=n;i++) @{
638 if ((i >= m) || (i <= 2*m))
646 @noindent The iteration domain of @samp{S1} can be divided into two
647 polyhedra and written in the input file as follows:
651 2 # Number of polyhedra in the union
653 3 5 # 3 lines and 5 columns
659 3 5 # 3 lines and 5 columns
663 1 -1 2 0 0 # i <= 2*m
667 @node Scattering Representation
668 @subsection Scattering Function Representation
669 Scattering functions are depicted in the input file thanks a representation
670 very close to the domain one.
671 An integer gives the number of functions @samp{Nb_functions} and each function
672 is represented by a domain. Each line of the domain corresponds to an equality
673 defining a dimension of the function. Note that at present
674 (CLooG @value{VERSION})
675 @strong{all functions must have the same scattering dimension number}. If a
676 user wants to set scattering functions with different dimensionality, he has
677 to complete the smaller one with zeroes to reach the maximum dimensionality.
678 For instance, let us consider the following code and
679 scheduling functions:
683 for (i=1;i<=n;i++) @{
684 if ((i >= m) || (i <= 2*m))
694 \hbox{$ \cases{ \theta _{S1}(i) &$= (i,0)^T$\cr
695 \theta _{S2}(i,j)^T &$= (n,i+j)^T$}$}
703 T_S2(i,j)^T = (n,i+j)^T
709 @noindent This scheduling can be written in the input file as follows:
713 2 # Number of scattering functions
715 2 7 # 2 lines and 7 columns
716 # eq/in c1 c2 i m n 1
717 0 1 0 -1 0 0 0 # c1 = i
718 0 0 1 0 0 0 0 # c2 = 0
720 2 8 # 2 lines and 8 columns
721 # eq/in c1 c2 i j m n 1
722 0 1 0 0 0 0 -1 0 # c1 = n
723 0 0 1 -1 -1 0 0 0 # c2 = i+j
726 The complete input file for the user who wants to generate the code for this
727 example with the preceding scheduling would be
728 (this file is provided in the CLooG distribution
729 as @code{test/manual_scattering.cloog}:
732 # ---------------------- CONTEXT ----------------------
735 # Context (no constraints on two parameters)
736 1 4 # 1 lines and 4 columns
738 1 0 0 0 # 0 >= 0, always true
740 1 # We want to set manually the parameter names
741 m n # parameter names
743 # --------------------- STATEMENTS --------------------
744 2 # Number of statements
746 2 # First statement: two domains
748 3 5 # 3 lines and 5 columns
754 3 5 # 3 lines and 5 columns
758 1 -1 2 0 0 # i <= 2*m
759 0 0 0 # for future options
761 1 # Second statement: one domain
762 4 6 # 4 lines and 6 columns
764 1 1 0 0 0 -1 # i >= 1
765 1 -1 0 0 1 0 # i <= n
766 1 -1 1 0 0 -1 # j >= i+1
767 1 0 -1 1 0 0 # j <= m
768 0 0 0 # for future options
770 1 # We want to set manually the iterator names
773 # --------------------- SCATTERING --------------------
774 2 # Scattering functions
776 2 7 # 2 lines and 7 columns
777 # eq/in p1 p2 i m n 1
778 0 1 0 -1 0 0 0 # p1 = i
779 0 0 1 0 0 0 0 # p2 = 0
781 2 8 # 2 lines and 8 columns
782 # eq/in p1 p2 i j m n 1
783 0 1 0 0 0 0 -1 0 # p1 = n
784 0 0 1 -1 -1 0 0 0 # p2 = i+j
786 1 # We want to set manually the scattering dimension names
787 p1 p2 # scattering dimension names
791 @c %/*************************************************************************
792 @c % * Calling CLooG *
793 @c % *************************************************************************/
795 @section Calling CLooG
796 CLooG is called by the following command:
798 cloog [ options | file ]
800 The default behavior of CLooG is to read the input informations from a file and
801 to print the generated code or pseudo-code on the standard output.
802 CLooG's behavior and the output code shape is under the user control thanks
803 to many options which are detailed a further section (@pxref{CLooG Options}).
804 @code{file} is the input file. @code{stdin} is a special value: when used,
805 input is standard input. For instance, we can call CLooG to treat the
806 input file @code{basic.cloog} with default options by typing:
807 @code{cloog basic.cloog} or @code{more basic.cloog | cloog stdin}.
809 @c %/*************************************************************************
810 @c % * CLooG Options *
811 @c % *************************************************************************/
813 @section CLooG Options
816 * Last Depth to Optimize Control::
817 * First Depth to Optimize Control::
818 * Once Time Loop Elimination::
819 * Equality Spreading::
820 * Constant Spreading::
821 * First Level for Spreading::
822 * C PreProcessor Friendly::
831 @node Last Depth to Optimize Control
832 @subsection Last Depth to Optimize Control @code{-l <depth>}
834 @code{-l <depth>}: this option sets the last loop depth to be optimized in
835 control. The higher this depth, the less control overhead.
836 For instance, with some input file, a user can generate
837 different pseudo-codes with different @code{depth} values as shown below.
840 /* Generated using a given input file and @strong{option -l 1} */
841 for (i=0;i<=M;i++) @{
843 for (j=0;j<=N;j++) @{
846 for (j=0;j<=N;j++) @{
855 /* Generated using the same input file but @strong{option -l 2} */
856 for (i=0;i<=M;i++) @{
858 for (j=0;j<=N;j++) @{
866 In this example we can see that this option can change the operation
867 execution order between statements. Let us remind that CLooG does not
868 make any speculation on dependences between statements
869 (@pxref{Scattering}). Thus if nothing (i.e. scattering functions)
870 forbids this, CLooG considers the above codes to be equivalent.
871 If there is no scattering functions, the minimum value for @code{depth}
872 is 1 (in the case of 0, the user doesn't really need a loop generator !),
873 and the number of scattering dimensions otherwise (CLooG will warn the
874 user if he doesn't respect such constraint).
875 The maximum value for depth is -1 (infinity).
876 Default value is infinity.
878 @node First Depth to Optimize Control
879 @subsection First Depth to Optimize Control @code{-f <depth>}
881 @code{-f <depth>}: this option sets the first loop depth to be optimized
882 in control. The lower this depth, the less control overhead (and the longer
883 the generated code). For instance, with some input file, a user
884 can generate different pseudo-codes with different @code{depth} values
886 The minimum value for @code{depth} is 1, and the
887 maximum value is -1 (infinity).
891 /* Generated using a given input file and @strong{option -f 3} */
892 for (i=1;i<=N;i++) @{
893 for (j=1;j<=M;j++) @{
904 /* Generated using the same input file but @strong{option -f 2} */
905 for (i=1;i<=N;i++) @{
906 for (j=1;j<=9;j++) @{
909 for (j=10;j<=M;j++) @{
918 @node Once Time Loop Elimination
919 @subsection Once Time Loop Elimination @code{-otl <boolean>}
921 @code{-otl <boolean>}: this option allows (@code{boolean=1}) or
922 forbids (@code{boolean=0}) the simplification of loops running
923 once. Default value is 1.
926 /* Generated using a given input file and @strong{option -otl 0} */
927 for (j=i+1;j<=i+1;j++) @{
934 /* Generated using the same input file but @strong{option -otl 1} */
941 @node Equality Spreading
942 @subsection Equality Spreading @code{-esp <boolean>}
944 @code{-esp <boolean>}: this option allows (@code{boolean=1}) or
945 forbids (@code{boolean=0}) values spreading when there
946 are equalities. Default value is 0.
949 /* Generated using a given input file and @strong{option -esp 0} */
952 for (k=i;k<=j+M;k++) @{
959 /* Generated using the same input file but @strong{option -esp 1} */
960 for (k=M+2;k<=N+M;k++) @{
967 @node Constant Spreading
968 @subsection Constant Spreading @code{-csp <boolean>}
970 @code{-csp <boolean>}: this option allows (@code{boolean=1}) or
971 forbids (@code{boolean=0}) values spreading when
972 there are @emph{constant} equalities. That is, when the right member
973 of the equality is a constant term. Default value is 1.
976 /* Generated using a given input file and @strong{option -csp 0} */
979 for (k=i;j<=j+M;j++) @{
986 /* Generated using the same input file but @strong{option -csp 1} */
988 for (k=i;k<=N+M;k++) @{
995 @node First Level for Spreading
996 @subsection First Level for Spreading @code{-fsp <level>}
998 @code{-fsp <level>}: it can be useful to set a
999 first level to begin equality spreading. Particularly when using
1000 scattering functions, the user may want to see the scattering dimension
1001 values instead of spreading or hiding them. If user has set a
1002 spreading, @code{level} is
1003 the first level to start it. Default value is 1.
1006 /* Generated using a given input file and @strong{option -fsp 1} */
1007 for (j=0;j<=N+M;j++) @{
1010 for (j=0;j<=N+M;j++) @{
1017 /* Generated using the same input file but @strong{option -fsp 2} */
1019 for (j=0;j<=c1+M;j++) @{
1023 for (j=0;j<=N+c1;j++) @{
1030 @node C PreProcessor Friendly
1031 @subsection C PreProcessor Friendly @code{-cpp <boolean>}
1033 @code{-cpp <boolean>}: this option ask CLooG for printing a less
1034 human-readable but compilable code by using the C preprocessor
1035 (@code{boolean=1}). In this case each statement is written as a
1036 function of the iterators corresponding to its domain dimensions:
1037 @code{Si(value_of_iterator_1,...,value_of_iterator_n)}. It follows
1038 that the user can easily add preprocessor macros to define each
1039 statement and use the generated textual code directly for compilation.
1040 When @code{boolean} is set to 0, the pretty printer has the default
1041 behaviour. Default value is 0.
1044 /* Generated using a given input file and @strong{option -cpp 0} */
1045 for (j=0;j<=N+M;j++) @{
1052 /* Generated using the same input file but @strong{option -cpp 1} */
1053 /* and a preprocessor macro set by the user */
1055 #define S1(i,j) A[(j)]=3*(i)
1057 for (j=0;j<=N+M;j++) @{
1063 @node Statement Block
1064 @subsection Statement Block @code{-block <boolean>}
1066 @code{-block <boolean>}: this option allows (@code{boolean=1}) to
1067 create a statement block for each new iterator, even if there is only
1068 an equality. This can be useful in order to parse the generated
1069 pseudo-code. When @code{boolean} is set to 0 or when the generation
1070 language is FORTRAN, this feature is disabled. Default value is 0.
1073 /* Generated using a given input file and @strong{option -block 0} */
1081 /* Generated using the same input file but @strong{option -block 1} */
1092 @subsection Loop Strides @code{-strides <boolean>}
1094 @code{-strides <boolean>}: this options allows (@code{boolean=1}) to
1095 handle non-unit strides for loop increments. This can remove a lot of
1096 guards and make the generated code more efficient. Default value is 0.
1099 /* Generated using a given input file and @strong{option -strides 0} */
1100 for (i=1;i<=n;i++) @{
1112 /* Generated using the same input file but @strong{option -strides 1} */
1113 for (i=2;i<=n;i+=2) @{
1122 @node Compilable Code
1123 @subsection Compilable Code @code{-compilable <value>}
1125 @code{-compilable <value>}: this options allows (@code{value} is not 0)
1126 to generate a compilable code where all parameters have the integral value
1127 @code{value}. This option creates a macro for each statement. Since
1128 CLooG do not know anything about the statement sources, it fills the
1129 macros with a basic increment that computes the total number of
1130 scanned integral points. The user may change easily the macros according
1131 to his own needs. This option is possible only if the generated code is
1132 in C. Default value is 0.
1135 /* Generated using a given input file and @strong{option -compilable 0} */
1136 for (i=0;i<=n;i++) @{
1137 for (j=0;j<=n;j++) @{
1146 /* Generated using the same input file but @strong{option -compilable 10} */
1147 /* DON'T FORGET TO USE -lm OPTION TO COMPILE. */
1149 /* Useful headers. */
1154 /* Parameter value. */
1157 /* Statement macros (please set). */
1158 #define S1(i,j) @{total++;@}
1159 #define S2(i,j) @{total++;@}
1160 #define S3(i) @{total++;@}
1163 /* Original iterators. */
1166 int n=PARVAL, total=0 ;
1168 for (i=0;i<=n;i++) @{
1169 for (j=0;j<=n;j++) @{
1176 printf("Number of integral points: %d.\n",total) ;
1182 @subsection Output @code{-o <output>}
1184 @code{-o <output>}: this option sets the output file. @code{stdout} is a
1185 special value: when used, output is standard output.
1186 Default value is @code{stdout}.
1189 @subsection Help @code{--help} or @code{-h}
1191 @code{--help} or @code{-h}: this option ask CLooG to print a short help.
1194 @subsection Version @code{--version} or @code{-v}
1196 @code{--version} or @code{-v}: this option ask CLooG to print some version
1200 @c %/*************************************************************************
1201 @c % * A Full Example *
1202 @c % *************************************************************************/
1204 @section A Full Example
1206 Let us consider the allocation problem of a Gaussian elimination, i.e. we want
1207 to distribute the various statement instances of the compute kernel onto
1208 different processors. The original code is the following:
1211 for (i=1;j<=N-1;i++) @{
1212 for (j=i+1;j<=N;j++) @{
1213 c[i][j] = a[j][i]/a[i][i] ; /* S1 */
1214 for (k=i+1;k<=N;k++) @{
1215 a[j][k] -= c[i][j]*a[i][k] ; /* S2 */
1222 @noindent The best affine allocation functions can be found by any good automatic
1223 parallelizer like LooPo (@pxref{Gri04}):
1227 \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i)$\cr
1228 \theta _{S2}(i,j,k)^T &$= (k)$}$}
1241 @noindent To ensure that on each processor, the set of statement instances is
1242 executed according to the original ordering, we add as minor scattering
1243 dimensions the original scheduling (@pxref{Scattering}):
1247 \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i,0,i,0,j,0)^T$\cr
1248 \theta _{S2}(i,j,k)^T &$= (k,0,i,0,j,1,k,0)^T$}$}
1255 T_S1(i,j)^T = (i,0,i,0,j,0)^T
1256 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1261 @noindent To ensure that the scattering functions have the same dimensionality, we
1262 complete the first function with zeroes
1263 (this is a CLooG @value{VERSION} and previous versions requirement,
1264 it should be removed in a future version, don't worry it's absolutly legal !):
1268 \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i,0,i,0,j,0,0,0)^T$\cr
1269 \theta _{S2}(i,j,k)^T &$= (k,0,i,0,j,1,k,0)^T$}$}
1276 T_S1(i,j)^T = (i,0,i,0,j,0,0,0)^T
1277 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1282 @noindent The input file corresponding to this code generation problem
1283 could be (this file is provided in the CLooG distribution
1284 as @code{test/manual_gauss.cloog}:
1287 # ---------------------- CONTEXT ----------------------
1290 # Context (no constraints on one parameter)
1291 1 3 # 1 line and 3 columns
1293 1 0 0 # 0 >= 0, always true
1295 1 # We want to set manually the parameter name
1298 # --------------------- STATEMENTS --------------------
1299 2 # Number of statements
1301 1 # First statement: one domain
1302 4 5 # 4 lines and 3 columns
1305 1 -1 0 1 -1 # i <= n-1
1306 1 -1 1 0 -1 # j >= i+1
1308 0 0 0 # for future options
1311 # Second statement: one domain
1312 6 6 # 6 lines and 3 columns
1314 1 1 0 0 0 -1 # i >= 1
1315 1 -1 0 0 1 -1 # i <= n-1
1316 1 -1 1 0 0 -1 # j >= i+1
1317 1 0 -1 0 1 0 # j <= n
1318 1 -1 0 1 0 -1 # k >= i+1
1319 1 0 0 -1 1 0 # k <= n
1320 0 0 0 # for future options
1322 0 # We let CLooG set the iterator names
1324 # --------------------- SCATTERING --------------------
1325 2 # Scattering functions
1327 8 13 # 3 lines and 3 columns
1328 # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j n 1
1329 0 1 0 0 0 0 0 0 0 -1 0 0 0 # p1 = i
1330 0 0 1 0 0 0 0 0 0 0 0 0 0 # p2 = 0
1331 0 0 0 1 0 0 0 0 0 -1 0 0 0 # p3 = i
1332 0 0 0 0 1 0 0 0 0 0 0 0 0 # p4 = 0
1333 0 0 0 0 0 1 0 0 0 0 -1 0 0 # p5 = j
1334 0 0 0 0 0 0 1 0 0 0 0 0 0 # p6 = 0
1335 0 0 0 0 0 0 0 1 0 0 0 0 0 # p7 = 0
1336 0 0 0 0 0 0 0 0 1 0 0 0 0 # p8 = 0
1338 8 14 # 3 lines and 3 columns
1339 # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j k n 1
1340 0 1 0 0 0 0 0 0 0 0 0 -1 0 0 # p1 = k
1341 0 0 1 0 0 0 0 0 0 0 0 0 0 0 # p2 = 0
1342 0 0 0 1 0 0 0 0 0 -1 0 0 0 0 # p3 = i
1343 0 0 0 0 1 0 0 0 0 0 0 0 0 0 # p4 = 0
1344 0 0 0 0 0 1 0 0 0 0 -1 0 0 0 # p5 = j
1345 0 0 0 0 0 0 1 0 0 0 0 0 0 -1 # p6 = 1
1346 0 0 0 0 0 0 0 1 0 0 0 -1 0 0 # p7 = k
1347 0 0 0 0 0 0 0 0 1 0 0 0 0 0 # p8 = 0
1349 1 # We want to set manually the scattering dimension names
1350 p1 p2 p3 p4 p5 p6 p7 p8 # scattering dimension names
1353 Calling CLooG, with for instance the command line
1354 @code{cloog -fsp 2 gauss.cloog} for a better view
1355 of the allocation (the processor number is given by @code{p1}),
1356 will result on the following target code that actually implements
1357 the transformation. A minor processing on the dimension @code{p1}
1358 to implement, e.g., MPI calls, which is not shown here may
1359 result in dramatic speedups !
1364 for (p5=2;p5<=n;p5++) @{
1368 for (p1=2;p1<=n-1;p1++) @{
1369 for (p3=1;p3<=p1-1;p3++) @{
1370 for (p5=p3+1;p5<=n;p5++) @{
1371 S2(i = p3,j = p5,k = p1) ;
1374 for (p5=p1+1;p5<=n;p5++) @{
1380 for (p3=1;p3<=n-1;p3++) @{
1381 for (p5=p3+1;p5<=n;p5++) @{
1382 S2(i = p3,j = p5,k = n) ;
1389 @c %/*************************************************************************
1390 @c % * A Full Example *
1391 @c % *************************************************************************/
1393 @chapter Using the CLooG Library
1394 The CLooG Library was implemented to allow the user to call CLooG
1395 directly from his programs, without file accesses or system calls. The
1396 user only needs to link his programs with C libraries. The CLooG
1397 library mainly provides one function (@code{cloog_program_generate})
1398 which takes as input the problem
1399 description with some options, and returns the data structure corresponding
1400 to the generated code (a @code{CloogProgram} structure) which is more or less
1401 an abstract syntax tree.
1402 The user can work with this data structure and/or use
1403 our pretty printing function to write the final code in either C or FORTRAN.
1404 Some other functions are provided for convenience reasons.
1405 These functions as well as the data structures are described in this section.
1408 * CLooG Data Structures::
1410 * Example of Library Utilization::
1414 @node CLooG Data Structures
1415 @section CLooG Data Structures Description
1416 In this section, we describe the data structures used by the loop
1417 generator to represent and to process a code generation problem.
1434 @subsection CloogMatrix
1437 #define CloogMatrix Matrix
1441 @noindent The @code{CloogMatrix} structure is directly the PolyLib
1442 @code{Matrix} data structure (@pxref{Wil93}). This structure is devoted to
1443 represent a set of constraints. It is
1444 defined in @code{polylib/types.h} as the following:
1449 @{ unsigned NbRows ; /* Number of rows. */
1450 unsigned NbColumns ; /* Number of columns. */
1451 Value ** p ; /* Array of pointers to the matrix rows. */
1452 Value * p_Init ; /* Matrix rows contiguously in memory. */
1453 int p_Init_size ; /* For internal use. */
1455 typedef struct matrix Matrix;
1459 @noindent The whole matrix is stored in memory row after row at the
1460 @code{p_Init} address. @code{p} is an array of pointers where
1461 @code{p[i]} points to the first element of the @math{i^{th}} row.
1462 @code{NbRows} and @code{NbColumns} are respectively the number of
1463 rows and columns of the matrix.
1464 Each row corresponds to a constraint. The first element of each row is an
1465 equality/inequality tag. The
1466 constraint is an equality @math{p(x) = 0} if the first element is 0, but it is
1467 an inequality @math{p(x) \geq 0} if the first element is 1.
1468 The next elements are the unknown coefficients, followed by the parameter
1469 coefficients, then the scalar coefficient.
1470 For instance, the following three constraints:
1474 \hbox{$ \cases{ -i + m &$= 0$\cr
1476 j + i - k &$\geq 0$}$}
1490 @noindent would be represented by the following rows:
1494 # eq/in i j k m n cst
1501 @noindent To be able to provide different precision version (CLooG
1502 supports 32 bits, 64 bits and arbitrary precision through the GMP library),
1503 the @code{Value} type depends on the configuration options (it may be
1504 @code{long int} for 32 bits version, @code{long long int} for 64 bits version,
1505 and @code{mpz_t} for multiple precision version).
1506 The @code{p_Init_size} field is needed by the PolyLib to free
1507 the memory allocated by @code{mpz_init} in the multiple precision release.
1508 Set this field to 0 if you are @emph{not} using multiple precision.
1509 Set this field to the size of the @code{p_Init} array if you
1510 initialized it yourself and if you are using the multiple precision version.
1514 @subsection CloogDomain
1518 @{ Polyhedron * polyhedron ; /* The polyhedral domain. */
1520 typedef struct cloogdomain CloogDomain ;
1524 @noindent The @code{CloogDomain} structure contains a PolyLib
1525 @code{Polyhedron} data structure which represents a polyhedral domain
1526 (a union of polyhedra) in both constraint representation and its dual
1527 ray representation (@pxref{Wil93}).
1528 It is defined in @code{polylib/types.h} as the following:
1533 @{ unsigned Dimension, /* Number of dimensions. */
1534 NbConstraints, /* Number of constraints. */
1535 NbRays, /* Number of rays. */
1536 NbEq, /* Number of vertices (?). */
1537 NbBid ; /* Number of extremal rays (?). */
1538 Value ** Constraint ; /* Pointers to constraints. */
1539 Value ** Ray ; /* Pointers to rays. */
1540 Value * p_Init ; /* Constraints and rays contiguously. */
1541 int p_Init_size ; /* For internal use. */
1542 struct polyhedron * next ; /* Next component of the union. */
1544 typedef struct polyhedron Polyhedron;
1548 @noindent The constraint representation is quite the same as in
1549 the @code{Matrix} data structure (@pxref{CloogMatrix}). The number of
1550 rows is @code{NbConstraints} and the
1551 number of columns in the @code{Polyhedron} structure is
1552 @code{Dimension+2} (the @math{+2} comes from the equality/inequality
1553 tag and the scalar coefficient). As in the @code{Matrix} structure,
1554 The data are stored in memory contiguously at the
1555 @code{p_Init} address and the @code{p_Init_size} field is used for
1556 memory deallocation in the multiple precision case (@pxref{CloogMatrix}).
1557 For a better understanding of the
1558 dual ray representation, the user may refer to the PolyLib documentation.
1561 @node CloogDomainList
1562 @subsection CloogDomainList
1565 struct cloogdomainlist
1566 @{ CloogDomain * domain ;
1567 struct cloogdomainlist * next ;
1569 typedef struct cloogdomainlist CloogDomainList ;
1573 @noindent The CloogDomainList structure represents a @code{NULL} terminated linked list
1577 @node CloogStatement
1578 @subsection CloogStatement
1581 struct cloogstatement
1582 @{ int number ; /* The statement unique number. */
1583 void * usr ; /* Pointer for user's convenience. */
1584 struct cloogstatement * next ;/* Next element of the linked list. */
1586 typedef struct cloogstatement CloogStatement ;
1590 @noindent The @code{CloogStatement} structure represents a @code{NULL}
1592 list of statements. In CLooG, a statement is only defined by its unique
1593 number (@code{number}). The user can use the pointer @code{usr} for his
1594 own convenience to link his own statement representation to the
1595 corresponding @code{CloogStatement} structure. The whole management of the
1596 @code{usr} pointer is under the responsibility of the user, in particular,
1597 CLooG never tries to print, to allocate or to free a memory block pointed
1602 @subsection CloogBlock
1606 @{ CloogStatement * statement ; /* Statement list of the block. */
1607 CloogMatrix * scattering ; /* Scattering function of the block. */
1608 int depth ; /* Original block depth.*/
1610 typedef struct cloogblock CloogBlock ;
1614 @noindent The @code{CloogBlock} structure represents a statement block.
1615 In a statement block, every statements have the same iteration
1616 domain and the same scattering function (actually, the scattering
1617 functions may differ only by a scalar
1618 coefficient if it just precises the ordering of the statements within
1619 the block). @code{statement} is the statement list where the
1620 statement order matters, @code{scattering} is one of
1621 the statement scattering functions and
1622 @code{depth} is the number of dimensions of the
1623 iteration domain (only the unknown, not the tag/parameters/scalar).
1626 @node CloogBlockList
1627 @subsection CloogBlockList
1630 struct cloogdblocklist
1631 @{ CloogBlock * block ;
1632 struct cloogblocklist * next ;
1634 typedef struct cloogblocklist CloogBlockList ;
1638 @noindent The CloogBlockList structure represents a @code{NULL} terminated linked list
1643 @subsection CloogLoop
1647 @{ CloogDomain * domain ; /* Iteration domain. */
1648 int stride ; /* Loop stride. */
1649 CloogBlock * block ; /* Included statement block.*/
1650 struct cloogloop * inner ; /* Loop at the next level. */
1651 struct cloogloop * next ; /* Next loop at the same level. */
1653 typedef struct cloogloop CloogLoop ;
1657 @noindent The @code{CloogLoop} structure represents a loop.
1659 loop has an iteration domain (@code{domain}). The iterator's stride for loop
1660 increment is @code{stride}. The loop can include a statement block
1661 in the field @code{block}. If there is no included statement block,
1662 @code{block} is set to @code{NULL}. @code{inner} is a pointer to the inner
1663 loop, and @code{next} a pointer to the next loop in the textual order. If
1664 there are no inner loop or no next loop, the corresponding pointer is set
1669 @subsection CloogNames
1673 @{ int nb_scattering ; /* Scattering dimension number. */
1674 int nb_iterators ; /* Iterator number. */
1675 int nb_parameters ; /* Parameter number. */
1676 char ** scattering ; /* The scattering dimension names. */
1677 char ** iterators ; /* The iterator names. */
1678 char ** parameters ; /* The parameter names. */
1680 typedef struct cloognames CloogNames ;
1684 @noindent The @code{CloogNames} structure represents the scattering dimension,
1685 the iterator and the parameter names in the final program.
1686 @code{nb_scattering}
1687 (respectively @code{nb_iterators} and @code{nb_parameters})
1688 is the number of scattering dimensions number
1689 (respectively the iterator and parameter number)
1690 and of elements in the corresponding array of strings
1692 (respectively @code{iterators} and @code{parameters}).
1693 The @math{i^{th}} scattering dimension name will be associated with the
1694 to the dimension @math{i} of the scattering function.
1695 The @math{i^{th}} iterator name will be associated with the
1696 dimension @math{i} of the iteration domain.
1697 The @math{i^{th}} parameter name will be associated with the
1698 dimension @math{i} of the context polyhedron.
1699 The user has to ensure that there are
1700 enough scattering dimension, iterator and parameter names.
1704 @subsection CloogProgram
1708 @{ char language ; /* The language of the program. */
1709 int scattdims ; /* Scattering dimension number. */
1710 CloogNames * names ; /* Iterators and parameters names. */
1711 CloogDomain * context ; /* The context of the program. */
1712 CloogLoop * loop ; /* The loops of the program. */
1713 CloogBlockList * blocklist ; /* The statement block list. */
1715 typedef struct cloogprogram CloogProgram ;
1719 @noindent The @code{CloogProgram} structure represents a static control program kernel.
1720 @code{language} precises the language (@code{c} for C or @code{f} for FORTRAN).
1721 @code{scattdims} gives the number of scattering dimensions.
1722 @code{context} is a pointer to the constraints on the program parameters,
1724 @code{NULL} pointer even if there are no constraints on parameters. In such a
1725 case, set a polyhedron with as many dimensions as there are parameters, with
1726 an @emph{always true} constraint like @math{1 \geq 0} (this is necessary
1727 since the number of parameters is deduced from the dimension number of
1728 the context constraints). @code{loop} is a pointer
1729 to the first loop of the program. @code{names} is a pointer to the various
1730 element names (scattering dimension, iterators, parameters)
1731 of the final program. @code{names} can be the @code{NULL}
1732 pointer if the user do not want to use our pretty printing function.
1733 @code{blocklist} is the linked list of all the statement block structures.
1734 As an example, let us consider the following loop nest:
1737 for (i=0; i<=n; i++) @{
1738 for (j=0; j<=n; j++) @{
1742 for (j=n+1; j<=2*n; j++) @{
1748 @noindent The next figure gives a possible representation in memory for this
1749 program thanks to the CLooG data structures (it has been actually printed
1750 by the @code{cloog_program_print} function). In this figure,
1751 @samp{+-- CloogLoop} denotes an @samp{inner} loop, while a @samp{CloogLoop}
1752 on the same column pointed by an arrow denotes a @samp{next} loop:
1759 | Scattering dimension number: 0
1763 | | Scattering dimension number: 0
1765 | | +-- No scattering string
1767 | | Iterator number -----------: 2
1769 | | +-- Iterator strings ------: i j
1771 | | Parameter number ----------: 1
1773 | | +-- Parameter strings -----: n
1788 | | +-- Null CloogBlock
1792 | | | +-- CloogDomain
1793 | | | | [ 1 0 1 0 0 ]
1794 | | | | [ 1 0 -1 1 0 ]
1795 | | | | [ 1 0 0 0 1 ]
1799 | | | +-- Null CloogBlock
1803 | | | | +-- CloogDomain
1804 | | | | | [ 1 0 0 0 1 ]
1808 | | | | +-- CloogBlock
1810 | | | | | +-- CloogStatement 1
1813 | | | | | | CloogStatement 2
1815 | | | | | +-- Null scattering function
1823 | | | +-- CloogDomain
1824 | | | | [ 1 0 -1 2 0 ]
1825 | | | | [ 1 0 1 -1 -1 ]
1826 | | | | [ 1 0 0 0 1 ]
1830 | | | +-- Null CloogBlock
1834 | | | | +-- CloogDomain
1835 | | | | | [ 1 0 0 0 1 ]
1839 | | | | +-- CloogBlock
1841 | | | | | +-- CloogStatement 3
1843 | | | | | +-- Null scattering function
1855 @subsection CloogOptions
1859 @{ int l ; /* -l option. */
1860 int f ; /* -f option. */
1861 int strides ; /* -strides option. */
1862 int esp ; /* -esp option. */
1863 int csp ; /* -csp option. */
1864 int fsp ; /* -fsp option. */
1865 int otl ; /* -otl option. */
1866 int block ; /* -block option. */
1867 int cpp ; /* -cpp option. */
1868 int compilable ; /* -compilable option. */
1870 typedef struct cloogoptions CloogOptions ;
1874 @noindent The @code{CloogOptions} structure contains all the possible options to
1875 rule CLooG's behaviour (@pxref{Calling CLooG}).
1876 As a reminder, the default values are:
1878 @item @math{l = -1} (optimize control until the innermost loops),
1879 @item @math{f = 1} (optimize control from the outermost loops),
1880 @item @math{strides = 0} (use only unit strides),
1881 @item @math{esp = 0} (do not spread complex equalities),
1882 @item @math{csp = 1} (spread constant values),
1883 @item @math{fsp = 1} (start to spread from the first iterators),
1884 @item @math{otl = 1} (simplify loops running only once).
1885 @item @math{block = 0} (do not make statement blocks when not necessary).
1886 @item @math{cpp = 0} (do not generate a compilable part of code using preprocessor).
1887 @item @math{compilable = 0} (do not generate a compilable code).
1891 @node CLooG Functions
1892 @section CLooG Functions Description
1895 * cloog_program_generate::
1896 * cloog_program_scatter::
1897 * cloog_program_pprint::
1898 * cloog_program_read::
1899 * From Matrices to Domains and Conversely::
1900 * Allocation and Initialization Functions::
1901 * Memory Deallocation Functions::
1902 * Printing Functions::
1906 @node cloog_program_generate
1907 @subsection cloog_program_generate
1910 CloogProgram * cloog_program_generate
1911 ( CloogProgram * program, /* Input program. */
1912 CloogOptions * options /* Options. */
1917 @noindent The @code{cloog_program_generate} function generates
1918 the data structure of the source code that scans the input
1919 polyhedra pointed by @code{program}
1920 according to the options pointed by @code{options}.
1921 The process is made directly on the input structure pointed by
1922 @code{program}, thus the original structure is no longer available
1923 after a call to this function. It returns a pointer to a
1924 @code{CloogProgram} structure containing the
1925 solution in CLooG structures.
1927 The input @code{CloogProgram} structure must have only one loop level
1928 (no inner loops): there is one loop per statement block. For a given block,
1929 the corresponding loop carries the iteration domain, the statement block,
1930 and a loop stride initialized to 1. For instance, the input @code{CloogProgram} structure
1931 that have been sent to @code{cloog_program_generate} to achieve the final
1932 structure and code shown as example in the @code{CloogProgram} structure
1933 description (@pxref{CloogProgram}) was the following one:
1940 | Scattering dimension number: 0
1944 | | Scattering dimension number: 0
1946 | | +-- No scattering string
1948 | | Iterator number -----------: 2
1950 | | +-- Iterator strings ------: i j
1952 | | Parameter number ----------: 1
1954 | | +-- Parameter strings -----: n
1963 | | | [ 1 -1 0 1 0 ]
1965 | | | [ 1 0 -1 1 0 ]
1971 | | | +-- CloogStatement 1
1974 | | | | CloogStatement 2
1976 | | | +-- Null scattering function
1985 | | | [ 1 -1 0 1 0 ]
1986 | | | [ 1 0 1 -1 -1 ]
1987 | | | [ 1 0 -1 2 0 ]
1993 | | | +-- CloogStatement 3
1995 | | | +-- Null scattering function
2004 @node cloog_program_pprint
2005 @subsection cloog_program_pprint
2008 void cloog_program_pprint
2009 ( FILE * file, /* Output file. */
2010 CloogProgram * program, /* Program to print. */
2011 CloogOptions * options /* Options. */
2016 @noindent The function @code{cloog_program_pprint} is a pretty printer for
2017 @code{CloogProgram} structures when it is a solution provided by
2018 the @code{cloog_program_generate} function. It prints the code or pseudo-code in the
2019 file pointed by @code{file} (possibly @code{stdout}) with regards to the
2020 options pointed by @code{options}.
2023 @node cloog_program_scatter
2024 @subsection cloog_program_scatter
2027 void cloog_program_scatter
2028 ( CloogProgram * program, /* Input program. */
2029 CloogDomainList * scattering, /* Additional scattering functions. */
2030 char ** names ; /* Additional dimension names. */
2035 @noindent The function @code{cloog_program_scatter} applies scattering
2036 functions to the @code{CloogProgram} structure pointed by @code{program}.
2037 Original domains of @code{program} are freed. Scattering functions
2038 are inside the @code{CloogDomainList} structure pointed by @code{scattering}.
2039 There must be as many scattering functions in the @code{CloogDomainList}
2040 structure as loops (i.e. iteration domains) in the @code{CloogProgram}
2041 structure. The first scattering function of the list will be applied to the
2042 iteration domain of the first loop in the program, and so on.
2043 @code{names} gives the scattering dimension names as an array of strings. If
2044 @code{names} is @code{NULL}, names are automatically generated: the name of
2045 the @math{n^{th}} scattering dimension will be @code{cn}.
2048 @node cloog_program_read
2049 @subsection cloog_program_read
2051 CloogProgram * cloog_program_read(FILE *) ;
2053 @noindent The @code{cloog_program_read} function
2054 reads the program data from a CLooG input file
2055 (@pxref{Writing The Input File}). It takes
2056 as input a pointer to the file it has to read (possibly @code{stdin}), and
2057 return a pointer to the read @code{CloogProgram} structure.
2060 @node From Matrices to Domains and Conversely
2061 @subsection From Matrices to Domains and Conversely
2063 CloogMatrix * cloog_domain_domain2matrix(CloogDomain *) ;
2064 CloogDomain * cloog_domain_matrix2domain(CloogMatrix *) ;
2066 @noindent Two functions are provided to translate a @code{CloogDomain}
2067 data structure to a @code{CloogMatrix} data structure and conversely.
2068 Each function takes as input a pointer to the data structure to translate
2069 and returns as output a pointer to the translated data structure. The
2070 input data structure if neither modified nor freed. They
2071 may be quite useful for e.g. pretty printing since it is more convenient
2072 in constraint (matrix) representation.
2075 @node Allocation and Initialization Functions
2076 @subsection Allocation and Initialization Functions
2078 CloogStructure * cloog_structure_malloc() ;
2080 @noindent Each CLooG data structure has an allocation and initialization
2081 function as shown above, where @code{Structure} and @code{structure} have to
2082 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2083 instance @code{CloogLoop * cloog_loop_malloc() ;}. These functions return
2084 pointers to an allocated structure with fields set to convenient default
2085 values. @strong{Using those functions is mandatory} to support internal
2086 management fields and to avoid upward compatibility problems if
2087 new fields appear. An exception is @code{cloog_matrix_malloc} since the
2088 @code{CloogMatrix} comes directly from the PolyLib. It takes two parameters:
2089 the number of rows and columns of the matrix we want to allocate:
2091 CloogMatrix * cloog_matrix_malloc(unsigned nbrows, unsigned nbcolumns);
2095 @node Memory Deallocation Functions
2096 @subsection Memory Deallocation Functions
2098 void cloog_structure_free(CloogStructure *) ;
2100 @noindent Each CLooG data structure has a deallocation function as shown above,
2101 where @code{Structure} and @code{structure} have to
2102 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2103 instance @code{void cloog_loop_free(CloogLoop *) ;}. These functions
2104 free the allocated memory for the structure provided as input. They free
2105 memory recursively, i.e. they also free the allocated memory for the internal
2107 @strong{Using those functions is mandatory} to avoid memory leaks on internal
2108 management fields and to avoid upward compatibility problems if
2112 @node Printing Functions
2113 @subsection Printing Functions
2115 void cloog_structure_print(FILE *, CloogStructure *) ;
2117 @noindent Each CLooG data structure has a printing function as shown above,
2118 where @code{Structure} and @code{structure} have to
2119 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2120 instance @code{void cloog_loop_print(FILE *, CloogLoop *) ;}. These functions
2121 print the pointed structure (and its fields recursively) to the file provided
2122 as input (possibly @code{stdout}).
2125 @node Example of Library Utilization
2126 @section Example of Library Utilization
2127 Here is a basic example showing how it is possible to use the CLooG library,
2128 assuming that a standard installation has been done.
2129 The following C program reads a CLooG input file on the standard input,
2130 then prints the solution on the standard output.
2131 Options are preselected to the default values of the CLooG software.
2132 This example is provided in the @code{example} directory of the
2137 # include <cloog/cloog.h>
2140 @{ CloogProgram * program ;
2141 CloogOptions * options ;
2143 /* Setting options and reading program informations. */
2144 options = cloog_options_malloc() ;
2145 program = cloog_program_read(stdin,options) ;
2147 /* Generating and printing the code. */
2148 program = cloog_program_generate(program,options) ;
2149 cloog_program_pprint(stdout,program,options) ;
2151 cloog_options_free(options) ;
2152 cloog_program_free(program) ;
2157 @noindent The compilation command could be:
2159 gcc example.c -lcloog -o example
2161 @noindent A calling command with the input file test.cloog could be:
2163 more test.cloog | ./example
2167 @c % ******************************** HACKING *********************************
2169 @c @chapter Hacking CLooG
2172 @c * Program organization::
2173 @c * Special Options::
2174 @c * CLooG Coding Standards::
2177 @c @node Program organization
2178 @c @section Program organization
2180 @c @node Special Options
2181 @c @section Special Options
2183 @c @node CLooG Coding Standards
2184 @c @section CLooG Coding Standards
2187 @c % ****************************** INSTALLING ********************************
2189 @chapter Installing CLooG
2194 * Basic Installation::
2195 * Optional Features::
2201 First of all, it would be very kind to refer the following paper in any
2202 publication that result from the use of the CLooG software or its library,
2203 @pxref{Bas04} (a bibtex entry is provided behind the title page of this
2204 manual, along with copyright notice, and in the CLooG home
2205 @code{http://www.CLooG.org}.
2207 This program is free software; you can redistribute it and/or
2208 modify it under the terms of the GNU General Public License version 2
2209 as published by the Free Software Foundation.
2210 This program is distributed in the hope that it will be useful,
2211 but WITHOUT ANY WARRANTY; without even the implied warranty of
2212 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
2213 GNU General Public License for more details.
2214 @code{http://www.gnu.org/copyleft/gpl.html}
2218 @section Requirements
2227 @subsection PolyLib (mandatory)
2228 To successfully install CLooG, the user need firstly to install PolyLib
2229 version 5.22.1 or above (default 64 bits version is satisfying
2230 as well as 32 bits or GMP multiple precision version).
2231 Polylib can be downloaded freely
2232 at @code{http://icps.u-strasbg.fr/PolyLib/} or
2233 @code{http://www.irisa.fr/polylib/}. Once downloaded and unpacked
2234 (e.g. using the @samp{tar -zxvf polylib-5.22.1.tar.gz} command),
2235 the user can compile
2236 it by typing the following commands on the PolyLib's root directory:
2239 @item @code{./configure}
2241 @item And as root: @code{make install}
2244 The PolyLib default installation is @code{/usr/local}. This directory may
2245 not be inside your library path. To fix the problem, the user should set
2247 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
2249 @noindent if your shell is, e.g., bash or
2251 setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib
2253 @noindent if your shell is, e.g., tcsh. Add the line to your .bashrc or .tcshrc (or
2254 whatever convenient file) to make this change permanent. Another solution
2255 is to ask PolyLib to install in the standard path by using the prefix
2256 option of the configure script:
2257 @samp{./configure --prefix=/usr}.
2259 CLooG makes intensive calls to polyhedral operations, and PolyLib
2260 functions do the job. Polylib is a free library written in C for the
2261 manipulation of polyhedra. The library is operating on objects like
2262 vectors, matrices, lattices, polyhedra, Z-polyhedra, unions of
2263 polyhedra and a lot of other intermediary structures. It provides
2264 functions for all the important operations on these structures.
2267 @subsection GMP Library (optional)
2269 To be able to deal with insanely large coefficient, the user will need to
2270 install the GNU Multiple Precision Library (GMP for short) version 4.1.4
2271 or above. It can be freely downloaded from @code{http://www.swox.com/gmp}.
2272 The user can compile it by typing the following commands on the GMP root
2276 @item @code{./configure}
2278 @item And as root: @code{make install}
2281 The GMP default installation is @code{/usr/local}, the same method to
2282 fix a library path problem applies as with PolyLib (@pxref{PolyLib}).
2284 The PolyLib has to be built using the GMP library by specifying the option
2285 @samp{--with-libgmp=PATH_TO_GMP} to the PolyLib configure script
2286 (where @code{PATH_TO_GMP} is @code{/usr/local} if you did not change the GMP
2287 installation directory). Then you have to set the convenient CLooG configure
2288 script options to buid the GMP version (@pxref{Optional Features}).
2291 @node Basic Installation
2292 @section CLooG Basic Installation
2294 Once downloaded and unpacked
2295 (e.g. using the @samp{tar -zxvf cloog-@value{VERSION}.tar.gz} command),
2296 you can compile CLooG by typing the following commands on the CLooG's root
2300 @item @code{./configure}
2302 @item And as root: @code{make install}
2305 Depending on the location of the PolyLib, you may need to set the
2306 option @code{--with-polylib} of the configure script
2307 (e.g. @samp{./configure --with-polylib=/usr/local} with a default PolyLib
2310 The program binaries and object files can be removed from the
2311 source code directory by typing @code{make clean}. To also remove the
2312 files that the @code{configure} script created (so you can compile the
2313 package for a different kind of computer) type @code{make distclean}.
2315 Both the CLooG software and library have been successfully compiled
2316 on the following systems:
2318 @item PC's under Linux, with the @code{gcc} compiler,
2319 @item PC's under Windows (Cygwin), with the @code{gcc} compiler,
2320 @item Sparc and UltraSparc Stations, with the @code{gcc} compiler.
2323 @node Optional Features
2324 @section Optional Features
2325 The @code{configure} shell script attempts to guess correct values for
2326 various system-dependent variables and user options used during compilation.
2327 It uses those values to create the @code{Makefile}. Various user options
2328 are provided by the CLooG's configure script. They are summarized in the
2329 following list and may be printed by typing @code{./configure --help} in the
2330 CLooG top-level directory.
2333 @item By default, the installation directory is @code{/usr/local}:
2334 @code{make install} will install the package's files in
2335 @code{/usr/local/bin}, @code{/usr/local/lib} and @code{/usr/local/include}.
2336 The user can specify an installation prefix other than @code{/usr/local} by
2337 giving @code{configure} the option @code{--prefix=PATH}.
2339 @item By default, @code{configure} will look for the PolyLib in standard
2340 locations. If necessary, the user can specify the PolyLib path by giving
2341 @code{configure} the option @code{--with-polylib=PATH}.
2343 @item By default, both CLooG software and library are compiled and installed.
2344 By giving @code{configure} the option @code{--without-cloog} the user
2345 disable the compilation and installation of the CLooG software.
2346 By giving @code{configure} the option
2347 @code{--without-lib} the user disable the compilation and installation of the
2350 @item By default, CLooG is built in 64bits version if such version of the
2351 PolyLib is found by @code{configure}. If the only existing version of the
2352 PolyLib is the 32bits or if the user give to @code{configure} the option
2353 @code{--with-bits=32}, the 32bits version of CLooG will be compiled. In the
2354 same way, the option @code{--with-bits=gmp} have to be used to build
2355 the multiple precision version.
2357 @item By default, @code{configure} will look for the GMP library
2358 (necessary to build the multiple precision version) in standard
2359 locations. If necessary, the user can specify the GMP path by giving
2360 @code{configure} the option @code{--with-gmp=PATH}.
2363 @node Uninstallation
2364 @section Uninstallation
2365 The user can easily remove the CLooG software and library from his system
2366 by typing (as root if necessary) from the CLooG top-level directory
2367 @code{make uninstall}.
2369 @c % **************************** DOCUMENTATION ******************************
2371 @chapter Documentation
2372 The CLooG distribution provides several documentation sources. First, the
2373 source code itself is as documented as possible. The code comments use a
2374 Doxygen-compatible presentation (something similar to what JavaDoc does for
2375 JAVA). The user may install Doxygen
2376 (see @code{http://www.stack.nl/~dimitri/doxygen}) to automatically
2377 generate a technical documentation by typing @code{make doc} or
2378 @code{doxygen ./autoconf/Doxyfile} at the CLooG top-level directory after
2379 running the configure script (@pxref{Installing}). Doxygen will generate
2380 documentation sources (in HTML, LaTeX and man) in the @code{doc/source}
2381 directory of the CLooG distribution.
2383 The Texinfo sources of the present document are also provided in the @code{doc}
2384 directory. You can build it in either DVI format (by typing
2385 @code{texi2dvi cloog.texi}) or PDF format
2386 (by typing @code{texi2pdf cloog.texi}) or HTML format
2387 (by typing @code{makeinfo --html cloog.texi}, using @code{--no-split}
2388 option to generate a single HTML file) or info format
2389 (by typing @code{makeinfo cloog.texi}).
2391 @c % ****************************** REFERENCES ********************************
2397 @anchor{Bas03a}[Bas03a] C. Bastoul, P. Feautrier. Improving data locality
2398 by chunking. CC'12 International Conference on Compiler Construction,
2399 LNCS 2622, pages 320-335, Warsaw, april 2003.
2402 @anchor{Bas03b}[Bas03b] C. Bastoul. Efficient code generation for automatic
2403 parallelization and optimization. ISPDC'03 IEEE International Symposium on
2404 Parallel and Distributed Computing, pages 23-30, Ljubljana, october 2003.
2407 @anchor{Bas04}[Bas04] C. Bastoul. Code Generation in the Polyhedral Model
2408 Is Easier Than You Think. PACT'13 IEEE International Conference on Parallel
2409 Architecture and Compilation Techniques, pages 7-16, Juan-les-Pins,
2413 @anchor{Fea92}[Fea92] P. Feautrier Some efficient solutions to the affine
2414 scheduling problem, part II: multidimensional time.
2415 International Journal of Parallel Programming, 21(6):389--420, December 1992.
2418 @anchor{Gri04}[Gri04] M. Griebl. Automatic parallelization of loop programs
2419 for distributed memory architectures. Habilitation Thesis. Facult@"at f@"ur
2420 Mathematik und Informatik, Universit@"at Passau, 2004.
2421 @emph{http://www.infosun.fmi.uni-passau.de/cl/loopo/}
2424 @anchor{Qui00}[Qui00] F. Quiller@'e, S. Rajopadhye, and D. Wilde.
2425 Generation of efficient nested loops from polyhedra.
2426 International Journal of Parallel Programming, 28(5):469-498,
2430 @anchor{Wil93}[Wil93] Doran K. Wilde.
2431 A library for doing polyhedral operations.
2432 Technical Report 785, IRISA, Rennes, France, 1993.
2439 @c % /*************************************************************************
2440 @c % * PART VI: END OF THE DOCUMENT *
2441 @c % *************************************************************************/
2442 @c @unnumbered Index