doc/cloog.texi

   1 \input texinfo
   2 @c %
   3 @c %  /**-----------------------------------------------------------------**
   4 @c %   **                              CLooG                              **
   5 @c %   **-----------------------------------------------------------------**
   6 @c %   **                            cloog.texi                           **
   7 @c %   **-----------------------------------------------------------------**
   8 @c %   **                   First version: july 6th 2002                  **
   9 @c %   **-----------------------------------------------------------------**/
  10 @c %
  11 @c % release 1.0: september 17th 2002
  12 @c % release 1.1: december   5th 2002
  13 @c % release 1.2: april     22th 2003
  14 @c % release 2.0: november  21th 2005 (and now in texinfo instead of LaTeX)
  15 @c %
  16 @c %/**************************************************************************
  17 @c % *               CLooG : the Chunky Loop Generator (experimental)         *
  18 @c % **************************************************************************/
  19 @c %/* CAUTION: the english used is probably the worst you ever read, please
  20 @c % *          feel free to correct and improve it !
  21 @c % */
  22
  23 @c %\textit{"I found the ultimate transformation functions, optimization for
  24 @c %static control programs is now a closed problem, I have \textnormal{just}
  25 @c %to generate the target code !"}
  26
  27
  28
  29 @c % /*************************************************************************
  30 @c %  *                              PART I: HEADER                           *
  31 @c %  *************************************************************************/
  32 @c %**start of header
  33 @setfilename cloog.info
  34 @settitle CLooG - a loop generator for scanning polyhedra
  35
  36 @set EDITION 2.0
  37 @set VERSION 0.14.0
  38 @set UPDATED November 17th 2005
  39 @setchapternewpage odd
  40
  41 @c %**end of header
  42
  43 @c % /*************************************************************************
  44 @c %  *                 PART II: SUMMARY DESCRIPTION AND COPYRIGHT            *
  45 @c %  *************************************************************************/
  46
  47 @copying
  48 This manual is for CLooG version @value{VERSION}, a software
  49 which generates loops for scanning Z-polyhedra. That is, CLooG produces a
  50 code visiting each integral point of a union of parametrized
  51 polyhedra. CLooG is designed to avoid control overhead and to produce a very
  52 effective code.
  53
  54 It would be quite kind to refer the following paper in any publication that
  55 results from the use of the CLooG software or its library:
  56
  57 @example
  58 @@InProceedings@{Bas04,
  59 @ @ author =@ @ @ @ @{C. Bastoul@},
  60 @ @ title =@ @ @ @ @ @{Code Generation in the Polyhedral Model
  61 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Is Easier Than You Think@},
  62 @ @ booktitle = @{PACT'13 IEEE International Conference on
  63 @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Parallel Architecture and Compilation Techniques@},
  64 @ @ year =@ @ @ @ @ @ 2004,
  65 @ @ pages =@ @ @ @ @ @{7--16@},
  66 @ @ month =@ @ @ @ @ @{september@},
  67 @ @ address =@ @ @ @{Juan-les-Pins@}
  68 @}
  69 @end example
  70
  71 Copyright @copyright{} 2002-2005 C@'edric Bastoul.
  72
  73 @c quotation
  74 Permission is granted to copy, distribute and/or modify this document under
  75 the terms of the GNU Free Documentation License, Version 1.2
  76 published by the Free Software Foundation. To receive a copy of the
  77 GNU Free Documentation License, write to the Free Software Foundation, Inc.,
  78 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA.
  79 @c end quotation
  80 @end copying
  81
  82 @c % /*************************************************************************
  83 @c %  *                 PART III: TITLEPAGE, CONTENTS, COPYRIGHT              *
  84 @c %  *************************************************************************/
  85 @titlepage
  86 @title CLooG
  87 @subtitle A Loop Generator For Scanning Polyhedra
  88 @subtitle Edition @value{EDITION}, for CLooG @value{VERSION}
  89 @subtitle @value{UPDATED}
  90 @author C@'edric Bastoul
  91
  92 @c The following two commands start the copyright page.
  93 @page
  94 @noindent (September 2001)
  95 @table @code
  96 @item C@'edric Bastoul
  97 SCHEDULES GENERATE !!! I just need to apply them now, where can I find
  98 a good code generator ?!
  99
 100 @item Paul Feautrier
 101 Hmmm. I fear that if you want something powerful enough, you'll have to
 102 write it yourself !
 103 @end table
 104
 105 @vskip 0pt plus 1filll
 106 @insertcopying
 107 @end titlepage
 108
 109 @c Output the table of contents at the beginning.
 110 @contents
 111
 112 @c % /*************************************************************************
 113 @c %  *                     PART IV: TOP NODE AND MASTER MENU                 *
 114 @c %  *************************************************************************/
 115 @ifnottex
 116 @node Top
 117 @top CLooG
 118
 119 @insertcopying
 120 @end ifnottex
 121
 122 @menu
 123 * Introduction::
 124 * CLooG Software::
 125 * CLooG Library::
 126 @c * Hacking::
 127 * Installing::
 128 * Documentation::
 129 * References::
 130 @end menu
 131
 132
 133
 134 @c % /*************************************************************************
 135 @c %  *                       PART V: BODY OF THE DOCUMENT                    *
 136 @c %  *************************************************************************/
 137
 138 @c %  ****************************** INTRODUCTION ******************************
 139 @node Introduction
 140 @chapter Introduction
 141 CLooG is a free software and library generating loops for scanning Z-polyhedra.
 142 That is, it finds a code (e.g. in C, FORTRAN...) that reaches each integral
 143 point of one or more parameterized polyhedra. CLooG has been originally
 144 written to solve the code generation problem for optimizing compilers based on
 145 the polytope model. Nevertheless it is used now in various area, e.g., to build
 146 control automata for high-level synthesis or to find the best polynomial
 147 approximation of a function. CLooG may help in any situation where scanning
 148 polyhedra matters. It uses the best state-of-the-art code generation
 149 algorithm known as the Quiller@'e et al. algorithm (@pxref{Qui00})
 150 with our own improvements and extensions (@pxref{Bas04}).
 151 The user has full control on generated code quality.
 152 On one hand, generated code size has to be tuned for sake of
 153 readability or instruction cache use. On the other hand, we must ensure that
 154 a bad control management does not hamper performance of the generated code,
 155 for instance by producing redundant guards or complex loop bounds.
 156 CLooG is specially designed to avoid control overhead and to produce a very
 157 efficient code.
 158
 159 CLooG stands for @emph{Chunky Loop Generator}: it is a part of the Chunky
 160 project, a research tool for data locality improvement (@pxref{Bas03a}).
 161 It is designed
 162 also to be the back-end of automatic parallelizers like LooPo (@pxref{Gri04}).
 163 Thus it is very
 164 compilable code oriented and provides powerful program transformation
 165 facilities. Mainly, it allows the user to specify very general schedules where,
 166 e.g., unimodularity or invertibility of the transformation doesn't matter.
 167
 168 The current version is still under
 169 evaluation, and there is no guarantee that the upward compatibility
 170 will be respected (but the previous API has been stable for two years,
 171 we hope this one will be as successful -and we believe it-).
 172 A lot of reports are necessary to freeze the library
 173 API and the input file shape. Most API changes from 0.12.x to 0.14.x
 174 have been requested by the users themselves.
 175 Thus you are very welcome and encouraged
 176 to post reports on bugs, wishes, critics, comments, suggestions or
 177 successful experiences in the forum of @code{http://www.CLooG.org}
 178 (preferably) or to send them to cedric.bastoul@@inria.fr directly.
 179
 180 @menu
 181 * Basics::
 182 * Scattering::
 183 @end menu
 184
 185 @node Basics
 186 @section Basically, what's the point ?
 187 If you want to use CLooG, this is because you want to scan or to find
 188 something inside the integral points of a set of polyhedra. There are many
 189 reasons for that. Maybe you need the generated code itself because it
 190 actually implements a very smart program transformation you found.
 191 Maybe you want to use the generated code
 192 because you know that the solution of your problem belongs to the integral
 193 points of those damned polyhedra and you don't know which one. Maybe you just
 194 want to know if a polyhedron has integral points depending on some parameters,
 195 which is the lexicographic minimum, maximum, the third on the basis of the
 196 left etc. Probably you have your own reasons to use CLooG.
 197
 198 Let us illustrate a basic use of CLooG. Suppose we have a set of affine
 199 constraints that describes a part of a whatever-dimensional space,
 200 called a @strong{domain}, and we
 201 want to scan it. Let us consider for instance the following set of constraints
 202 where @samp{i}
 203 and @samp{j} are the unknown (the two dimensions of the space) and
 204 @samp{m} and @samp{n} are the parameters (some symbolic constants):
 205 @example
 206 @group
 207 2<=i<=n
 208 2<=j<=m
 209 j<=n+2-i
 210 @end group
 211 @end example
 212 Let us also consider that we have a partial knowledge of the parameter values,
 213 called the @strong{context}, expressed as affine constraints as well,
 214 for instance:
 215 @example
 216 @group
 217 m>=2
 218 n>=2
 219 @end group
 220 @end example
 221 Note that using parameters is optional, if you are not comfortable with
 222 parameter manipulation, just replace them with any scalar value that fits
 223 @code{m>=2} and @code{n>=2}.
 224 A graphical representation of this part of the 2-dimensional space, where
 225 the integral points are represented using heavy dots would be for instance:
 226
 227 @image{./images/basic,6cm}
 228
 229 The affine constraints of both the domain and the context are what we will
 230 provide to CLooG as input (in a particular shape that will be described later).
 231 The output of CLooG is a pseudo-code to scan the integral points of the
 232 input domain according to the context:
 233 @example
 234 @group
 235 for (i=2;i<=n;i++) @{
 236   for (j=2;j<=min(m,-i+n+2);j++) @{
 237     S1(i,j) ;
 238   @}
 239 @}
 240 @end group
 241 @end example
 242 If you felt such a basic example is yet interesting, there is a good chance
 243 that CLooG is appropriate for you. CLooG can do much more: scanning several
 244 polyhedra or unions of polyhedra at the same time, applying general affine
 245 transformations to the polyhedra, generate compilable code etc. Welcome
 246 to the CLooG's user's guide !
 247
 248 @node Scattering
 249 @section Defining a Scanning Order: Scattering Functions
 250 In CLooG, domains only define the set of integral points to scan and their
 251 coordinates. In particular, CLooG is free to choose the scanning order for
 252 generating the most efficient code. This means, for optimizing/parallelizing
 253 compiler people, that CLooG doesn't make any speculation on dependences on and
 254 between statements (by the way, it's not its job !).
 255 For instance, if an user give to
 256 CLooG only two domains @code{S1:1<=i<=n}, @code{S2:1<=i<=n} and the context
 257 @code{n>=1}, the following pseudo-codes are considered to be equivalent:
 258
 259 @example
 260 @group
 261 /* A convenient target pseudo-code. */
 262 for (i=1;i<=N;i++) @{
 263  S1(i) ;
 264 @}
 265 for (i=1;i<=N;i++) @{
 266  S2(i) ;
 267 @}
 268 @end group
 269 @end example
 270
 271 @example
 272 @group
 273 /* Another convenient target pseudo-code. */
 274 for (i=1;i<=N;i++) @{
 275  S1(i) ;
 276  S2(i) ;
 277 @}
 278 @end group
 279 @end example
 280
 281 The default behaviour
 282 of CLooG is to generate the second one, since it is optimized in control.
 283 It is right if there are no data dependences
 284 between @code{S1} and @code{S2}, but wrong otherwise.
 285
 286 Thus it is often useful to force scanning to respect a given order. This can be
 287 done in CLooG by using @strong{scattering functions}. Scattering is a
 288 shortcut for scheduling, allocation, chunking functions and the like we can
 289 find in the restructuring compilation litterature. There are a lot of reasons
 290 to scatter the integral points of the domains (i.e. the statement instances
 291 of a program, for compilation people), parallelization or optimization are good
 292 examples. For instance, if the user wants for any reason to set some
 293 precedence constraints between the statements of our example above
 294 in order to force the generation of the
 295 first code, he can do it easily by setting (for example) the following
 296 scheduling functions:
 297
 298 @tex
 299 $$\theta _{S1}(i) =  (1)$$
 300 $$\theta _{S2}(j) =  (2)$$
 301 @end tex
 302
 303 @ifnottex
 304 @example
 305 @group
 306 T_S1(i) = (1)
 307 T_S2(j) = (2)
 308 @end group
 309 @end example
 310 @end ifnottex
 311
 312 This scattering means that each integral point of the domain @code{S1}
 313 is scanned at logical date @code{1} while each integral point of the domain
 314 @code{S2} is scanned at logical date @code{2}. As a result, the whole
 315 domain @code{S1} is scanned before domain @code{S2} and the first code in our
 316 example is generated.
 317
 318 The user can set every kind of affine scanning order thanks to the
 319 scattering functions. Each domain has its own scattering function and
 320 each scattering function may be multi-dimensional. A multi-dimentional logical
 321 date may be seen as classical date (year,month,day,hour,minute,etc.) where
 322 the first dimensions are the most significant. Each scattering dimension
 323 may depend linearly on the original dimensions (e.g., @code{i}), the
 324 parameters (e.g., @code{n}) ans scalars (e.g., @code{2}).
 325
 326 A very useful example of multi-dimensional scattering functions is, for
 327 compilation people, the scheduling of the original program.
 328 The basic data to use for code generation are statement iteration domains.
 329 As we saw, these data are not sufficient to rebuild the original
 330 program (what is the ordering between instances of different statements ?).
 331 The missing data can be put in the scattering functions as the original
 332 scheduling. The method to compute it is quite simple (@pxref{Fea92}). The idea is to
 333 build an abstract syntax tree of the program and to read the scheduling for
 334 each statement. For instance, let us consider the following implementation of
 335 a Cholesky factorization:
 336
 337 @example
 338 @group
 339 /* A Cholesky factorization kernel. */
 340 for (i=1;i<=N;i++) @{
 341   for (j=1;j<=i-1;j++) @{
 342     a[i][i] -= a[i][j] ;           /* S1 */
 343   @}
 344   a[i][i] = sqrt(a[i][i]) ;        /* S2 */
 345   for (j=i+1;j<=N;j++) @{
 346     for (k=1;k<=i-1;k++) @{
 347       a[j][i] -= a[j][k]*a[i][k] ; /* S3 */
 348     @}
 349     a[j][i] /= a[i][i] ;           /* S4 */
 350     @}
 351   @}
 352 @}
 353 @end group
 354 @end example
 355
 356 The corresponding abstract syntax tree is given in the following figure.
 357 It directly gives the scattering functions (schedules) for all the
 358 statements of the program.
 359
 360 @image{./images/tree,6cm}
 361
 362 @tex
 363 $$
 364 \hbox{$ \cases{ \theta _{S1}(i,j)^T    &$=  (0,i,0,j,0)^T$\cr
 365                 \theta _{S2}(i)        &$=  (0,i,1)^T$\cr
 366                 \theta _{S3}(i,j,k)^T  &$=  (0,i,2,j,0,k,0)^T$\cr
 367                 \theta _{S4}(i,j)^T    &$=  (0,i,2,j,1)^T$}$}
 368 $$
 369 @end tex
 370
 371 @ifnottex
 372 @example
 373 @group
 374 T_S1(i,j)^T   = (0,i,0,j,0)^T
 375 T_S2(i)       = (0,i,1)^T
 376 T_S3(i,j,k)^T = (0,i,2,j,0,k,0)^T
 377 T_S4(i,j)^T   = (0,i,2,j,1)^T
 378 @end group
 379 @end example
 380 @end ifnottex
 381
 382 These schedules depend on the iterators and give for each instance of each
 383 statement a unique execution date. Using such scattering functions allow
 384 CLooG to re-generate the input code.
 385
 386
 387
 388
 389
 390 @c %  ***********************Using the CLooG Software **************************
 391 @node CLooG Software
 392 @chapter Using the CLooG Software
 393
 394
 395 @menu
 396 * A First Example::
 397 * Writing The Input File::
 398 * Calling CLooG::
 399 * CLooG Options::
 400 * Full Example::
 401 @end menu
 402
 403 @c %/*************************************************************************
 404 @c % *                              A FIRST EXAMPLE                          *
 405 @c % *************************************************************************/
 406 @node A First Example
 407 @section A First Example
 408 CLooG takes as input a file that must be written accordingly to a grammar
 409 described in depth in a further section (@pxref{Writing The Input File}).
 410 Moreover it supports many options to tune the target code presentation or
 411 quality as discussed in a dedicated section (@pxref{Calling CLooG}).
 412 However, a basic use
 413 of CLooG is not very complex and we present in this section how to generate the
 414 code corresponding to a basic example discussed earlier (@pxref{Basics}).
 415
 416 The problem is to find the code that scans a 2-dimensional polyhedron
 417 where @samp{i} and @samp{j} are the unknown (the two dimensions of the space)
 418 and @samp{m} and @samp{n} are the parameters (the symbolic constants),
 419 defined by the following set of constraints:
 420 @example
 421 @group
 422 2<=i<=n
 423 2<=j<=m
 424 j<=n+2-i
 425 @end group
 426 @end example
 427 @noindent We also consider a partial knowledge of the parameter values,
 428 expressed thanks to the following affine constraints:
 429 @example
 430 @group
 431 m>=2
 432 n>=2
 433 @end group
 434 @end example
 435
 436 An input file that corresponds to this problem, and asks for a generated
 437 code in C, may be the following. Note that we do not describe here precisely
 438 the structure and the components of this file (@pxref{Writing The Input File}
 439  for such information, if you feel it necessary):
 440
 441 @example
 442 # ---------------------- CONTEXT ----------------------
 443 c # language is C
 444
 445 # Context (constraints on two parameters)
 446 2 4                   # 2 lines and 4 columns
 447 # eq/in m  n  1         eq/in: 1 for inequality >=0, 0 for equality =0
 448     1   1  0 -2       # 1*m + 0*n -2*1 >= 0, i.e. m>=2
 449     1   0  1 -2       # 0*m + 1*n -2*1 >= 0, i.e. n>=2
 450
 451 1 # We want to set manually the parameter names
 452 m n                   # parameter names
 453
 454 # --------------------- STATEMENTS --------------------
 455 1 # Number of statements
 456
 457 1 # First statement: one domain
 458 # First domain
 459 5 6                   # 5 lines and 6 columns
 460 # eq/in i  j  m  n  1
 461     1   1  0  0  0 -2 # i >= 2
 462     1  -1  0  0  1  0 # i <= n
 463     1   0  1  0  0 -2 # j >= 2
 464     1   0 -1  1  0  0 # j <= m
 465     1  -1 -1  0  1  2 # n+2-i>=j
 466 0  0  0               # for future options
 467
 468 1 # We want to set manually the iterator names
 469 i j                   # iterator names
 470
 471 # --------------------- SCATTERING --------------------
 472 0 # No scattering functions
 473 @end example
 474
 475 This file may be called @samp{basic.cloog}
 476 (this example is provided in the CLooG distribution as
 477 @code{test/manual_basic.cloog}) and we can ask CLooG to process it
 478 and to generate the code by a simple calling to CLooG with this file as input:
 479 @samp{cloog basic.cloog}. By default, CLooG will print the generated code in
 480 the standard output:
 481
 482 @example
 483 @group
 484 /* Generated by CLooG v@value{VERSION} in 0.00s. */
 485 for (i=2;i<=n;i++) @{
 486   for (j=2;j<=min(m,-i+n+2);j++) @{
 487     S1(i,j) ;
 488   @}
 489 @}
 490 @end group
 491 @end example
 492
 493 @c %/*************************************************************************
 494 @c % *                                Input file                             *
 495 @c % *************************************************************************/
 496 @node Writing The Input File
 497 @section Writing The Input File
 498 The input text file contains a problem description, i.e. the context,
 499 the domains and the scattering functions.
 500 Because CLooG is very 'compilable code generation oriented', we can associate
 501 some additional informations to each domain. We call this association a
 502 @emph{statement}. The set of all informations is
 503 called a @emph{program}. The input file respects the grammar below
 504 (terminals are preceeded by "_"):
 505
 506 @example
 507 File             ::= Program
 508 Program          ::= Context Statements Scattering
 509 Context          ::= Language      Domain         Naming
 510 Statements       ::= Nb_statements Statement_list Naming
 511 Scattering       ::= Nb_functions  Domain_list    Naming
 512 Naming           ::= Option Name_list
 513 Name_list        ::= _String   Name_list      | (void)
 514 Statement_list   ::= Statement Statement_list | (void)
 515 Domain_list      ::= _Domain   Domain_list    | (void)
 516 Statement        ::= Iteration_domain 0 0 0
 517 Iteration_domain ::= Domain_union
 518 Domain_union     ::= Nb_domains Domain_list
 519 Option           ::= 0 | 1
 520 Language         ::= c | f
 521 Nb_statements    ::= _Integer
 522 Nb_domains       ::= _Integer
 523 Nb_functions     ::= _Integer
 524 @end example
 525
 526 @itemize @bullet
 527 @item  @samp{Context} represents the informations that are
 528        shared by all the statements. It consists on
 529        the language used (which can be @samp{c} for C or @samp{f} for FORTRAN 90)
 530        and the global constraints on parameters.
 531        These constraints are essential
 532        since they give to CLooG the number of parameters. If there is no
 533        parameter or no constraints on parameters, just give a constraint
 534        always satisfied like @math{1 \geq 0}. @samp{Naming} sets the parameter
 535        names.
 536        If the naming option @samp{Option} is 1, parameter names will be read
 537        on the next line. There must be exactly as many names as parameters.
 538        If the naming option @samp{Option} is 0, parameter names are
 539        automatically generated. The name of the first parameter will
 540        be @samp{M}, and the name of the @math{(n+1)^{th}} parameter directly
 541        follows the name of the @math{n^{th}} parameter in ASCII code.
 542        It is the user responsibility to ensure that parameter names,
 543        iterators and scattering dimension names are different.
 544 @item  @samp{Statements} represents the informations on the statements.
 545        @samp{Nb_statements} is the number of statements in the program,
 546        i.e. the number of @samp{Statement} items in the @samp{Statement_list}.
 547        @samp{Statement} represents the informations on a given statement.
 548        To each statement is associated a domain
 549        (the statement iteration domain: @samp{Iteration_domain}) and three
 550        zeroes that represents future options.
 551        @samp{Naming} sets the iterator names. If the naming option
 552        @samp{Option} is 1, the iterator names
 553        will be read on the next line. There must be exactly as many names as
 554        nesting level in the deepest iteration domain. If the naming option
 555        @samp{Option} is 0, iterator names are automatically generated.
 556        The iterator name of the outermost loop will be @samp{i}, and the
 557        iterator name of the loop at level @math{n+1} directly follows the
 558        iterator name of the loop at level @math{n} in ASCII code.
 559 @item  @samp{Scattering} represents the informations on scattering functions.
 560        @samp{Nb_functions} is the number of functions (it must be
 561        equal to the number of statements or 0 if there is no scattering
 562        function). The function themselves are represented through
 563        @samp{Domain_list}.
 564        @samp{Naming} sets the scattering dimension names. If the naming option
 565        @samp{Option} is 1, the scattering dimension names will be read on the
 566        next line.
 567        There must be exactly as many names as scattering dimensions. If the
 568        naming option @samp{Option} is 0, scattering dimension names are automatically
 569        generated. The name of the @math{n^{th}} scattering dimention
 570        will be @samp{cn}.
 571 @end itemize
 572
 573 @menu
 574 * Domain Representation::
 575 * Scattering Representation::
 576 @end menu
 577
 578 @node Domain Representation
 579 @subsection Domain Representation
 580 As shown by the grammar, the input file describes the various informations
 581 thanks to characters, integers and domains. Each domain is defined by a set of
 582 constraints in the PolyLib format (@pxref{Wil93}). They have the
 583 following syntax:
 584 @enumerate
 585 @item some optional comment lines beginning with @samp{#},
 586 @item the row and column numbers, possibly followed by comments,
 587 @item the constraint rows, each row corresponds to a constraint the
 588       domain have to satisfy. Each row must be on a single line and is possibly
 589       followed by comments. The constraint is an equality @math{p(x) = 0} if the
 590       first element is 0, an inequality  @math{p(x) \geq 0} if the first element
 591       is 1. The next elements are the unknown coefficients, followed by
 592       the parameter coefficients. The last element is the constant factor.
 593 @end enumerate
 594 For instance, assuming that @samp{i}, @samp{j} and @samp{k} are iterators and
 595 @samp{m} and @samp{n} are parameters, the domain defined by the following
 596 constraints :
 597
 598 @tex
 599 $$
 600 \hbox{$ \cases{ -i     + m &$\geq 0$\cr
 601                     -j + n &$\geq 0$\cr
 602                  i + j - k &$\geq 0$}$}
 603 $$
 604 @end tex
 605
 606 @ifnottex
 607 @example
 608 @group
 609    -i + m >= 0
 610    -j + n >= 0
 611 i + j - k >= 0
 612 @end group
 613 @end example
 614 @end ifnottex
 615
 616 @noindent can be written in the input file as follows :
 617
 618 @example
 619 @group
 620 # This is the domain
 621 3 7                      # 3 lines and 7 columns
 622 # eq/in i  j  k  m  n  1
 623     1  -1  0  0  1  0  0 #    -i + m >= 0
 624     1   0 -1  0  0  1  0 #    -j + n >= 0
 625     1   1  1 -1  0  0  0 # i + j - k >= 0
 626 @end group
 627 @end example
 628
 629 Each iteration domain @samp{Iteration_domain} of a given statement
 630 is a union of polyhedra
 631 @samp{Domain_union}. A union is defined by its number of elements
 632 @samp{Nb_domains} and the elements themselves @samp{Domain_list}.
 633 For instance, let us consider the following pseudo-code:
 634
 635 @example
 636 @group
 637 for (i=1;i<=n;i++) @{
 638   if ((i >= m) || (i <= 2*m))
 639     S1 ;
 640   for (j=i+1;j<=m;j++)
 641     S2 ;
 642 @}
 643 @end group
 644 @end example
 645
 646 @noindent The iteration domain of @samp{S1} can be divided into two
 647 polyhedra and written in the input file as follows:
 648
 649 @example
 650 @group
 651 2 # Number of polyhedra in the union
 652 # First domain
 653 3 5                # 3 lines and 5 columns
 654 # eq/in i  m  n  1
 655     1   1  0  0 -1 #  i >= 1
 656     1  -1  0  1  0 #  i <= n
 657     1   1 -1  0  0 #  i >= m
 658 # Second domain
 659 3 5                # 3 lines and 5 columns
 660 # eq/in i  m  n  1
 661     1   1  0  0 -1 #  i >= 1
 662     1  -1  0  1  0 #  i <= n
 663     1  -1  2  0  0 #  i <= 2*m
 664 @end group
 665 @end example
 666
 667 @node Scattering Representation
 668 @subsection Scattering Function Representation
 669 Scattering functions are depicted in the input file thanks a representation
 670 very close to the domain one.
 671 An integer gives the number of functions @samp{Nb_functions} and each function
 672 is represented by a domain. Each line of the domain corresponds to an equality
 673 defining a dimension of the function. Note that at present
 674 (CLooG @value{VERSION})
 675 @strong{all functions must have the same scattering dimension number}. If a
 676 user wants to set scattering functions with different dimensionality, he has
 677 to complete the smaller one with zeroes to reach the maximum dimensionality.
 678 For instance, let us consider the following code and
 679 scheduling functions:
 680
 681 @example
 682 @group
 683 for (i=1;i<=n;i++) @{
 684   if ((i >= m) || (i <= 2*m))
 685     S1 ;
 686   for (j=i+1;j<=m;j++)
 687     S2 ;
 688 @}
 689 @end group
 690 @end example
 691
 692 @tex
 693 $$
 694 \hbox{$ \cases{ \theta _{S1}(i)      &$=  (i,0)^T$\cr
 695                 \theta _{S2}(i,j)^T  &$=  (n,i+j)^T$}$}
 696 $$
 697 @end tex
 698
 699 @ifnottex
 700 @example
 701 @group
 702 T_S1(i)     = (i,0)^T
 703 T_S2(i,j)^T = (n,i+j)^T
 704 @end group
 705 @end example
 706 @end ifnottex
 707
 708
 709 @noindent This scheduling can be written in the input file as follows:
 710
 711 @example
 712 @group
 713 2 # Number of scattering functions
 714 # First function
 715 2 7                          # 2 lines and 7 columns
 716 # eq/in c1 c2  i  m  n  1
 717     0    1  0 -1  0  0  0    #  c1 = i
 718     0    0  1  0  0  0  0    #  c2 = 0
 719 # Second function
 720 2 8                          # 2 lines and 8 columns
 721 # eq/in c1 c2  i  j  m  n  1
 722     0    1  0  0  0  0 -1  0 #  c1 = n
 723     0    0  1 -1 -1  0  0  0 #  c2 = i+j
 724 @end group
 725 @end example
 726 The complete input file for the user who wants to generate the code for this
 727 example with the preceding scheduling would be
 728 (this file is provided in the CLooG distribution
 729 as @code{test/manual_scattering.cloog}:
 730
 731 @example
 732 # ---------------------- CONTEXT ----------------------
 733 c # language is C
 734
 735 # Context (no constraints on two parameters)
 736 1 4                   # 1 lines and 4 columns
 737 # eq/in m  n  1
 738     1   0  0  0       # 0 >= 0, always true
 739
 740 1 # We want to set manually the parameter names
 741 m n                   # parameter names
 742
 743 # --------------------- STATEMENTS --------------------
 744 2 # Number of statements
 745
 746 2 # First statement: two domains
 747 # First domain
 748 3 5                   # 3 lines and 5 columns
 749 # eq/in i  m  n  1
 750     1   1  0  0 -1    # i >= 1
 751     1  -1  0  1  0    # i <= n
 752     1   1 -1  0  0    # i >= m
 753 # Second domain
 754 3 5                   # 3 lines and 5 columns
 755 # eq/in i  m  n  1
 756     1   1  0  0 -1    # i >= 1
 757     1  -1  0  1  0    # i <= n
 758     1  -1  2  0  0    # i <= 2*m
 759 0  0  0               # for future options
 760
 761 1 # Second statement: one domain
 762 4 6                   # 4 lines and 6 columns
 763 # eq/in i  j  m  n  1
 764     1   1  0  0  0 -1 # i >= 1
 765     1  -1  0  0  1  0 # i <= n
 766     1  -1  1  0  0 -1 # j >= i+1
 767     1   0 -1  1  0  0 # j <= m
 768 0  0  0               # for future options
 769
 770 1 # We want to set manually the iterator names
 771 i j                   # iterator names
 772
 773 # --------------------- SCATTERING --------------------
 774 2 # Scattering functions
 775 # First function
 776 2 7                   # 2 lines and 7 columns
 777 # eq/in p1 p2  i  m  n  1
 778     0    1  0 -1  0  0  0    # p1 = i
 779     0    0  1  0  0  0  0    # p2 = 0
 780 # Second function
 781 2 8                   # 2 lines and 8 columns
 782 # eq/in p1 p2  i  j  m  n  1
 783     0    1  0  0  0  0 -1  0 # p1 = n
 784     0    0  1 -1 -1  0  0  0 # p2 = i+j
 785
 786 1 # We want to set manually the scattering dimension names
 787 p1 p2                 # scattering dimension names
 788 @end example
 789
 790
 791 @c %/*************************************************************************
 792 @c % *                             Calling CLooG                             *
 793 @c % *************************************************************************/
 794 @node Calling CLooG
 795 @section Calling CLooG
 796 CLooG is called by the following command:
 797 @example
 798        cloog [ options | file ]
 799 @end example
 800 The default behavior of CLooG is to read the input informations from a file and
 801 to print the generated code or pseudo-code on the standard output.
 802 CLooG's behavior and the output code shape is under the user control thanks
 803 to many options which are detailed a further section (@pxref{CLooG Options}).
 804 @code{file} is the input file. @code{stdin} is a special value: when used,
 805 input is standard input. For instance, we can call CLooG to treat the
 806 input file @code{basic.cloog} with default options by typing:
 807 @code{cloog basic.cloog} or @code{more basic.cloog | cloog stdin}.
 808
 809 @c %/*************************************************************************
 810 @c % *                             CLooG Options                             *
 811 @c % *************************************************************************/
 812 @node CLooG Options
 813 @section CLooG Options
 814
 815 @menu
 816 * Last Depth to Optimize Control::
 817 * First Depth to Optimize Control::
 818 * Once Time Loop Elimination::
 819 * Equality Spreading::
 820 * Constant Spreading::
 821 * First Level for Spreading::
 822 * C PreProcessor Friendly::
 823 * Statement Block::
 824 * Loop Strides::
 825 * Compilable Code::
 826 * Output::
 827 * Help::
 828 * Version ::
 829 @end menu
 830
 831 @node Last Depth to Optimize Control
 832 @subsection Last Depth to Optimize Control @code{-l <depth>}
 833
 834 @code{-l <depth>}: this option sets the last loop depth to be optimized in
 835 control. The higher this depth, the less control overhead.
 836 For instance, with some input file, a user can generate
 837 different pseudo-codes with different @code{depth} values as shown below.
 838 @example
 839 @group
 840 /* Generated using a given input file and @strong{option -l 1} */
 841 for (i=0;i<=M;i++) @{
 842   S1 ;
 843   for (j=0;j<=N;j++) @{
 844     S2 ;
 845   @}
 846   for (j=0;j<=N;j++) @{
 847     S3 ;
 848   @}
 849   S4 ;
 850 @}
 851 @end group
 852 @end example
 853 @example
 854 @group
 855 /* Generated using the same input file but @strong{option -l 2} */
 856 for (i=0;i<=M;i++) @{
 857   S1 ;
 858   for (j=0;j<=N;j++) @{
 859     S2 ;
 860     S3 ;
 861   @}
 862   S4 ;
 863 @}
 864 @end group
 865 @end example
 866      In this example we can see that this option can change the operation
 867      execution order between statements. Let us remind that CLooG does not
 868      make any speculation on dependences between statements
 869      (@pxref{Scattering}). Thus if nothing (i.e. scattering functions)
 870      forbids this, CLooG considers the above codes to be equivalent.
 871      If there is no scattering functions, the minimum value for @code{depth}
 872      is 1 (in the case of 0, the user doesn't really need a loop generator !),
 873      and the number of scattering dimensions otherwise (CLooG will warn the
 874      user if he doesn't respect such constraint).
 875      The maximum value for depth is -1 (infinity).
 876      Default value is infinity.
 877
 878 @node First Depth to Optimize Control
 879 @subsection First Depth to Optimize Control @code{-f <depth>}
 880
 881      @code{-f <depth>}: this option sets the first loop depth to be optimized
 882      in control. The lower this depth, the less control overhead (and the longer
 883      the generated code). For instance, with some input file, a user
 884      can generate different pseudo-codes with different @code{depth} values
 885      as shown below.
 886      The minimum value for @code{depth} is 1, and the
 887      maximum value is -1 (infinity).
 888      Default value is 1.
 889 @example
 890 @group
 891 /* Generated using a given input file and @strong{option -f 3} */
 892 for (i=1;i<=N;i++) @{
 893   for (j=1;j<=M;j++) @{
 894     S1 ;
 895     if (j >= 10) @{
 896       S2 ;
 897     @}
 898   @}
 899 @}
 900 @end group
 901 @end example
 902 @example
 903 @group
 904 /* Generated using the same input file but @strong{option -f 2} */
 905 for (i=1;i<=N;i++) @{
 906   for (j=1;j<=9;j++) @{
 907     S1 ;
 908   @}
 909   for (j=10;j<=M;j++) @{
 910     S1 ;
 911     S2 ;
 912   @}
 913 @}
 914 @end group
 915 @end example
 916
 917
 918 @node Once Time Loop Elimination
 919 @subsection Once Time Loop Elimination @code{-otl <boolean>}
 920
 921      @code{-otl <boolean>}: this option allows (@code{boolean=1}) or
 922      forbids (@code{boolean=0}) the simplification of loops running
 923      once. Default value is 1.
 924 @example
 925 @group
 926 /* Generated using a given input file and @strong{option -otl 0} */
 927 for (j=i+1;j<=i+1;j++) @{
 928   S1 ;
 929 @}
 930 @end group
 931 @end example
 932 @example
 933 @group
 934 /* Generated using the same input file but @strong{option -otl 1} */
 935 j = i+1 ;
 936 S1 ;
 937 @end group
 938 @end example
 939
 940
 941 @node Equality Spreading
 942 @subsection Equality Spreading @code{-esp <boolean>}
 943
 944      @code{-esp <boolean>}: this option allows (@code{boolean=1}) or
 945      forbids (@code{boolean=0}) values spreading when there
 946      are equalities. Default value is 0.
 947 @example
 948 @group
 949 /* Generated using a given input file and @strong{option -esp 0} */
 950 i = M+2 ;
 951 j = N ;
 952 for (k=i;k<=j+M;k++) @{
 953   S1 ;
 954 @}
 955 @end group
 956 @end example
 957 @example
 958 @group
 959 /* Generated using the same input file but @strong{option -esp 1} */
 960 for (k=M+2;k<=N+M;k++) @{
 961   S1(i = M+2, j = N) ;
 962 @}
 963 @end group
 964 @end example
 965
 966
 967 @node Constant Spreading
 968 @subsection Constant Spreading @code{-csp <boolean>}
 969
 970      @code{-csp <boolean>}: this option allows (@code{boolean=1}) or
 971      forbids (@code{boolean=0}) values spreading when
 972      there are @emph{constant} equalities. That is, when the right member
 973      of the equality is a constant term. Default value is 1.
 974 @example
 975 @group
 976 /* Generated using a given input file and @strong{option -csp 0} */
 977 i = M+2 ;
 978 j = N ;
 979 for (k=i;j<=j+M;j++) @{
 980   S1 ;
 981 @}
 982 @end group
 983 @end example
 984 @example
 985 @group
 986 /* Generated using the same input file but @strong{option -csp 1} */
 987 i = M+2 ;
 988 for (k=i;k<=N+M;k++) @{
 989   S1(j = N) ;
 990 @}
 991 @end group
 992 @end example
 993
 994
 995 @node First Level for Spreading
 996 @subsection First Level for Spreading @code{-fsp <level>}
 997
 998      @code{-fsp <level>}: it can be useful to set a
 999      first level to begin equality spreading. Particularly when using
1000      scattering functions, the user may want to see the scattering dimension
1001      values instead of spreading or hiding them. If user has set a
1002      spreading, @code{level} is
1003      the first level to start it. Default value is 1.
1004 @example
1005 @group
1006 /* Generated using a given input file and @strong{option -fsp 1} */
1007 for (j=0;j<=N+M;j++) @{
1008   S1(i = N) ;
1009 @}
1010 for (j=0;j<=N+M;j++) @{
1011   S1(i = M) ;
1012 @}
1013 @end group
1014 @end example
1015 @example
1016 @group
1017 /* Generated using the same input file but @strong{option -fsp 2} */
1018 c1 = N ;
1019 for (j=0;j<=c1+M;j++) @{
1020   S1(i = c1) ;
1021 @}
1022 c1 = M ;
1023 for (j=0;j<=N+c1;j++) @{
1024   S1(i = c1) ;
1025 @}
1026 @end group
1027 @end example
1028
1029
1030 @node C PreProcessor Friendly
1031 @subsection C PreProcessor Friendly @code{-cpp <boolean>}
1032
1033      @code{-cpp <boolean>}: this option ask CLooG for printing a less
1034      human-readable but compilable code by using the C preprocessor
1035      (@code{boolean=1}). In this case each statement is written as a
1036      function of the iterators corresponding to its domain dimensions:
1037      @code{Si(value_of_iterator_1,...,value_of_iterator_n)}. It follows
1038      that the user can easily add preprocessor macros to define each
1039      statement and use the generated textual code directly for compilation.
1040      When @code{boolean} is set to 0, the pretty printer has the default
1041      behaviour. Default value is 0.
1042 @example
1043 @group
1044 /* Generated using a given input file and @strong{option -cpp 0} */
1045 for (j=0;j<=N+M;j++) @{
1046   S1(i = N) ;
1047 @}
1048 @end group
1049 @end example
1050 @example
1051 @group
1052 /* Generated using the same input file but @strong{option -cpp 1} */
1053 /* and a preprocessor macro set by the user */
1054
1055 #define S1(i,j) A[(j)]=3*(i)
1056
1057 for (j=0;j<=N+M;j++) @{
1058   S1(N,j) ;
1059 @}
1060 @end group
1061 @end example
1062
1063 @node Statement Block
1064 @subsection Statement Block @code{-block <boolean>}
1065
1066      @code{-block <boolean>}: this option allows (@code{boolean=1}) to
1067      create a statement block for each new iterator, even if there is only
1068      an equality. This can be useful in order to parse the generated
1069      pseudo-code. When @code{boolean} is set to 0 or when the generation
1070      language is FORTRAN, this feature is disabled. Default value is 0.
1071 @example
1072 @group
1073 /* Generated using a given input file and @strong{option -block 0} */
1074 i = M+2 ;
1075 j = N ;
1076 S1 ;
1077 @end group
1078 @end example
1079 @example
1080 @group
1081 /* Generated using the same input file but @strong{option -block 1} */
1082 @{ i = M+2 ;
1083   @{ j = N ;
1084     S1 ;
1085   @}
1086 @}
1087 @end group
1088 @end example
1089
1090
1091 @node Loop Strides
1092 @subsection Loop Strides @code{-strides <boolean>}
1093
1094      @code{-strides <boolean>}: this options allows (@code{boolean=1}) to
1095      handle non-unit strides for loop increments. This can remove a lot of
1096      guards and make the generated code more efficient. Default value is 0.
1097 @example
1098 @group
1099 /* Generated using a given input file and @strong{option -strides 0} */
1100 for (i=1;i<=n;i++) @{
1101   if (i%2 == 0) @{
1102     S1(j = i/2) ;
1103   @}
1104   if (i%4 == 0) @{
1105     S2(j = i/4) ;
1106   @}
1107 @}
1108 @end group
1109 @end example
1110 @example
1111 @group
1112 /* Generated using the same input file but @strong{option -strides 1} */
1113 for (i=2;i<=n;i+=2) @{
1114   S1(j = i/2) ;
1115   if (i%4 == 0) @{
1116     S2(j = i/4) ;
1117   @}
1118 @}
1119 @end group
1120 @end example
1121
1122 @node Compilable Code
1123 @subsection Compilable Code @code{-compilable <value>}
1124
1125      @code{-compilable <value>}: this options allows (@code{value} is not 0)
1126      to generate a compilable code where all parameters have the integral value
1127      @code{value}. This option creates a macro for each statement. Since
1128      CLooG do not know anything about the statement sources, it fills the
1129      macros with a basic increment that computes the total number of
1130      scanned integral points. The user may change easily the macros according
1131      to his own needs. This option is possible only if the generated code is
1132      in C. Default value is 0.
1133 @example
1134 @group
1135 /* Generated using a given input file and @strong{option -compilable 0} */
1136 for (i=0;i<=n;i++) @{
1137   for (j=0;j<=n;j++) @{
1138     S1 ;
1139     S2 ;
1140   @}
1141   S3 ;
1142 @}
1143 @end group
1144 @end example
1145 @example
1146 /* Generated using the same input file but @strong{option -compilable 10} */
1147 /* DON'T FORGET TO USE -lm OPTION TO COMPILE. */
1148
1149 /* Useful headers. */
1150 #include <stdio.h>
1151 #include <stdlib.h>
1152 #include <math.h>
1153
1154 /* Parameter value. */
1155 #define PARVAL 10
1156
1157 /* Statement macros (please set). */
1158 #define S1(i,j) @{total++;@}
1159 #define S2(i,j) @{total++;@}
1160 #define S3(i)   @{total++;@}
1161
1162 int main() @{
1163   /* Original iterators. */
1164   int i, j ;
1165   /* Parameters. */
1166   int n=PARVAL, total=0 ;
1167
1168   for (i=0;i<=n;i++) @{
1169     for (j=0;j<=n;j++) @{
1170       S1(i,j) ;
1171       S2(i,j) ;
1172     @}
1173     S3(i) ;
1174   @}
1175
1176   printf("Number of integral points: %d.\n",total) ;
1177   return 0 ;
1178 @}
1179 @end example
1180
1181 @node Output
1182 @subsection Output @code{-o <output>}
1183
1184      @code{-o <output>}: this option sets the output file. @code{stdout} is a
1185      special value: when used, output is standard output.
1186      Default value is @code{stdout}.
1187
1188 @node Help
1189 @subsection Help @code{--help} or @code{-h}
1190
1191      @code{--help} or @code{-h}: this option ask CLooG to print a short help.
1192
1193 @node Version
1194 @subsection Version @code{--version} or @code{-v}
1195
1196      @code{--version} or @code{-v}: this option ask CLooG to print some version
1197      informations.
1198
1199
1200 @c %/*************************************************************************
1201 @c % *                           A Full Example                              *
1202 @c % *************************************************************************/
1203 @node Full Example
1204 @section A Full Example
1205
1206 Let us consider the allocation problem of a Gaussian elimination, i.e. we want
1207 to distribute the various statement instances of the compute kernel onto
1208 different processors. The original code is the following:
1209 @example
1210 @group
1211 for (i=1;j<=N-1;i++) @{
1212   for (j=i+1;j<=N;j++) @{
1213     c[i][j] = a[j][i]/a[i][i] ;    /* S1 */
1214     for (k=i+1;k<=N;k++) @{
1215       a[j][k] -= c[i][j]*a[i][k] ; /* S2 */
1216     @}
1217   @}
1218 @}
1219 @end group
1220 @end example
1221
1222 @noindent The best affine allocation functions can be found by any good automatic
1223 parallelizer like LooPo (@pxref{Gri04}):
1224
1225 @tex
1226 $$
1227 \hbox{$ \cases{ \theta _{S1}(i,j)^T    &$=  (i)$\cr
1228                 \theta _{S2}(i,j,k)^T  &$=  (k)$}$}
1229 $$
1230 @end tex
1231
1232 @ifnottex
1233 @example
1234 @group
1235 T_S1(i,j)^T   = (i)
1236 T_S2(i,j,k)^T = (k)
1237 @end group
1238 @end example
1239 @end ifnottex
1240
1241 @noindent To ensure that on each processor, the set of statement instances is
1242 executed according to the original ordering, we add as minor scattering
1243 dimensions the original scheduling (@pxref{Scattering}):
1244
1245 @tex
1246 $$
1247 \hbox{$ \cases{ \theta _{S1}(i,j)^T    &$=  (i,0,i,0,j,0)^T$\cr
1248                 \theta _{S2}(i,j,k)^T  &$=  (k,0,i,0,j,1,k,0)^T$}$}
1249 $$
1250 @end tex
1251
1252 @ifnottex
1253 @example
1254 @group
1255 T_S1(i,j)^T   = (i,0,i,0,j,0)^T
1256 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1257 @end group
1258 @end example
1259 @end ifnottex
1260
1261 @noindent To ensure that the scattering functions have the same dimensionality, we
1262 complete the first function with zeroes
1263 (this is a CLooG @value{VERSION} and previous versions requirement,
1264 it should be removed in a future version, don't worry it's absolutly legal !):
1265
1266 @tex
1267 $$
1268 \hbox{$ \cases{ \theta _{S1}(i,j)^T    &$=  (i,0,i,0,j,0,0,0)^T$\cr
1269                 \theta _{S2}(i,j,k)^T  &$=  (k,0,i,0,j,1,k,0)^T$}$}
1270 $$
1271 @end tex
1272
1273 @ifnottex
1274 @example
1275 @group
1276 T_S1(i,j)^T   = (i,0,i,0,j,0,0,0)^T
1277 T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T
1278 @end group
1279 @end example
1280 @end ifnottex
1281
1282 @noindent The input file corresponding to this code generation problem
1283 could be (this file is provided in the CLooG distribution
1284 as @code{test/manual_gauss.cloog}:
1285
1286 @example
1287 # ---------------------- CONTEXT ----------------------
1288 c # language is C
1289
1290 # Context (no constraints on one parameter)
1291 1 3                     # 1 line and 3 columns
1292 # eq/in n  1
1293     1   0  0            # 0 >= 0, always true
1294
1295 1 # We want to set manually the parameter name
1296 n                       # parameter name
1297
1298 # --------------------- STATEMENTS --------------------
1299 2 # Number of statements
1300
1301 1 # First statement: one domain
1302 4 5                     # 4 lines and 3 columns
1303 # eq/in i  j  n  1
1304     1   1  0  0 -1      # i >= 1
1305     1  -1  0  1 -1      # i <= n-1
1306     1  -1  1  0 -1      # j >= i+1
1307     1   0 -1  1  0      # j <= n
1308 0  0  0                 # for future options
1309
1310 1
1311 # Second statement: one domain
1312 6 6                     # 6 lines and 3 columns
1313 # eq/in i  j  k  n  1
1314     1   1  0  0  0 -1   # i >= 1
1315     1  -1  0  0  1 -1   # i <= n-1
1316     1  -1  1  0  0 -1   # j >= i+1
1317     1   0 -1  0  1  0   # j <= n
1318     1  -1  0  1  0 -1   # k >= i+1
1319     1   0  0 -1  1  0   # k <= n
1320 0  0  0                 # for future options
1321
1322 0 # We let CLooG set the iterator names
1323
1324 # --------------------- SCATTERING --------------------
1325 2 # Scattering functions
1326 # First function
1327 8 13                    # 3 lines and 3 columns
1328 # eq/in p1 p2 p3 p4 p5 p6 p7 p8  i  j  n  1
1329     0    1  0  0  0  0  0  0  0 -1  0  0  0     # p1 = i
1330     0    0  1  0  0  0  0  0  0  0  0  0  0     # p2 = 0
1331     0    0  0  1  0  0  0  0  0 -1  0  0  0     # p3 = i
1332     0    0  0  0  1  0  0  0  0  0  0  0  0     # p4 = 0
1333     0    0  0  0  0  1  0  0  0  0 -1  0  0     # p5 = j
1334     0    0  0  0  0  0  1  0  0  0  0  0  0     # p6 = 0
1335     0    0  0  0  0  0  0  1  0  0  0  0  0     # p7 = 0
1336     0    0  0  0  0  0  0  0  1  0  0  0  0     # p8 = 0
1337 # Second function
1338 8 14                    # 3 lines and 3 columns
1339 # eq/in p1 p2 p3 p4 p5 p6 p7 p8  i  j  k  n  1
1340     0    1  0  0  0  0  0  0  0  0  0 -1  0  0  # p1 = k
1341     0    0  1  0  0  0  0  0  0  0  0  0  0  0  # p2 = 0
1342     0    0  0  1  0  0  0  0  0 -1  0  0  0  0  # p3 = i
1343     0    0  0  0  1  0  0  0  0  0  0  0  0  0  # p4 = 0
1344     0    0  0  0  0  1  0  0  0  0 -1  0  0  0  # p5 = j
1345     0    0  0  0  0  0  1  0  0  0  0  0  0 -1  # p6 = 1
1346     0    0  0  0  0  0  0  1  0  0  0 -1  0  0  # p7 = k
1347     0    0  0  0  0  0  0  0  1  0  0  0  0  0  # p8 = 0
1348
1349 1 # We want to set manually the scattering dimension names
1350 p1 p2 p3 p4 p5 p6 p7 p8 # scattering dimension names
1351 @end example
1352
1353 Calling CLooG, with for instance the command line
1354 @code{cloog -fsp 2 gauss.cloog} for a better view
1355 of the allocation (the processor number is given by @code{p1}),
1356 will result on the following target code that actually implements
1357 the transformation. A minor processing on the dimension @code{p1}
1358 to implement, e.g., MPI calls, which is not shown here may
1359 result in dramatic speedups !
1360
1361 @example
1362 if (n >= 2) @{
1363   p1 = 1 ;
1364   for (p5=2;p5<=n;p5++) @{
1365     S1(i = 1,j = p5) ;
1366   @}
1367 @}
1368 for (p1=2;p1<=n-1;p1++) @{
1369   for (p3=1;p3<=p1-1;p3++) @{
1370     for (p5=p3+1;p5<=n;p5++) @{
1371       S2(i = p3,j = p5,k = p1) ;
1372     @}
1373   @}
1374   for (p5=p1+1;p5<=n;p5++) @{
1375     S1(i = p1,j = p5) ;
1376   @}
1377 @}
1378 if (n >= 2) @{
1379   p1 = n ;
1380   for (p3=1;p3<=n-1;p3++) @{
1381     for (p5=p3+1;p5<=n;p5++) @{
1382       S2(i = p3,j = p5,k = n) ;
1383     @}
1384   @}
1385 @}
1386 @end example
1387
1388
1389 @c %/*************************************************************************
1390 @c % *                           A Full Example                              *
1391 @c % *************************************************************************/
1392 @node CLooG Library
1393 @chapter Using the CLooG Library
1394 The CLooG Library was implemented to allow the user to call CLooG
1395 directly from his programs, without file accesses or system calls. The
1396 user only needs to link his programs with C libraries. The CLooG
1397 library mainly provides one function (@code{cloog_program_generate})
1398 which takes as input the problem
1399 description with some options, and returns the data structure corresponding
1400 to the generated code (a @code{CloogProgram} structure) which is more or less
1401 an abstract syntax tree.
1402 The user can work with this data structure and/or use
1403 our pretty printing function to write the final code in either C or FORTRAN.
1404 Some other functions are provided for convenience reasons.
1405 These functions as well as the data structures are described in this section.
1406
1407 @menu
1408 * CLooG Data Structures::
1409 * CLooG Functions::
1410 * Example of Library Utilization::
1411 @end menu
1412
1413
1414 @node CLooG Data Structures
1415 @section CLooG Data Structures Description
1416 In this section, we describe the data structures used by the loop
1417 generator to represent and to process a code generation problem.
1418
1419 @menu
1420 * CloogMatrix::
1421 * CloogDomain::
1422 * CloogDomainList::
1423 * CloogStatement::
1424 * CloogBlock::
1425 * CloogBlockList::
1426 * CloogLoop::
1427 * CloogNames::
1428 * CloogProgram::
1429 * CloogOptions::
1430 @end menu
1431
1432
1433 @node CloogMatrix
1434 @subsection CloogMatrix
1435 @example
1436 @group
1437 #define CloogMatrix Matrix
1438 @end group
1439 @end example
1440
1441 @noindent The @code{CloogMatrix} structure is directly the PolyLib
1442 @code{Matrix} data structure (@pxref{Wil93}). This structure is devoted to
1443 represent a set of constraints. It is
1444 defined in @code{polylib/types.h} as the following:
1445
1446 @example
1447 @group
1448 struct matrix
1449 @{ unsigned NbRows ;    /* Number of rows. */
1450   unsigned NbColumns ; /* Number of columns. */
1451   Value ** p ;         /* Array of pointers to the matrix rows. */
1452   Value * p_Init ;     /* Matrix rows contiguously in memory. */
1453   int p_Init_size ;    /* For internal use. */
1454 @}
1455 typedef struct matrix Matrix;
1456 @end group
1457 @end example
1458
1459 @noindent The whole matrix is stored in memory row after row at the
1460 @code{p_Init} address. @code{p} is an array of pointers where
1461 @code{p[i]} points to the first element of the @math{i^{th}} row.
1462 @code{NbRows} and @code{NbColumns} are respectively the number of
1463 rows and columns of the matrix.
1464 Each row corresponds to a constraint. The first element of each row is an
1465 equality/inequality tag. The
1466 constraint is an equality @math{p(x) = 0} if the first element is 0, but it is
1467 an inequality @math{p(x) \geq 0} if the first element is 1.
1468 The next elements are the unknown coefficients, followed by the parameter
1469 coefficients, then the scalar coefficient.
1470 For instance, the following three constraints:
1471
1472 @tex
1473 $$
1474 \hbox{$ \cases{ -i + m       &$= 0$\cr
1475                 -j + n       &$\geq 0$\cr
1476                  j + i - k   &$\geq 0$}$}
1477 $$
1478 @end tex
1479
1480 @ifnottex
1481 @example
1482 @group
1483     -i + m  = 0
1484     -j + n >= 0
1485  i + j - k >= 0
1486 @end group
1487 @end example
1488 @end ifnottex
1489
1490 @noindent would be represented by the following rows:
1491
1492 @example
1493 @group
1494 # eq/in  i   j   k   m   n   cst
1495     0    0  -1   0   1   0    0
1496     1   -1   0   0   0   1    0
1497     1    1   1  -1   0   0    0
1498 @end group
1499 @end example
1500
1501 @noindent To be able to provide different precision version (CLooG
1502 supports 32 bits, 64 bits and arbitrary precision through the GMP library),
1503 the @code{Value} type depends on the configuration options (it may be
1504 @code{long int} for 32 bits version, @code{long long int} for 64 bits version,
1505 and @code{mpz_t} for multiple precision version).
1506 The @code{p_Init_size} field is needed by the PolyLib to free
1507 the memory allocated by @code{mpz_init} in the multiple precision release.
1508 Set this field to 0 if you are @emph{not} using multiple precision.
1509 Set this field to the size of the @code{p_Init} array if you
1510 initialized it yourself and if you are using the multiple precision version.
1511
1512
1513 @node CloogDomain
1514 @subsection CloogDomain
1515 @example
1516 @group
1517 struct cloogdomain
1518 @{ Polyhedron * polyhedron ;  /* The polyhedral domain. */
1519 @} ;
1520 typedef struct cloogdomain CloogDomain ;
1521 @end group
1522 @end example
1523
1524 @noindent The @code{CloogDomain} structure contains a PolyLib
1525 @code{Polyhedron} data structure which represents a polyhedral domain
1526 (a union of polyhedra) in both constraint representation and its dual
1527 ray representation (@pxref{Wil93}).
1528 It is defined in @code{polylib/types.h} as the following:
1529
1530 @example
1531 @group
1532 struct polyhedron
1533 @{ unsigned Dimension,        /* Number of dimensions. */
1534            NbConstraints,    /* Number of constraints. */
1535            NbRays,           /* Number of rays. */
1536            NbEq,             /* Number of vertices (?). */
1537            NbBid ;           /* Number of extremal rays (?). */
1538   Value ** Constraint ;      /* Pointers to constraints. */
1539   Value ** Ray ;             /* Pointers to rays. */
1540   Value * p_Init ;           /* Constraints and rays contiguously. */
1541   int p_Init_size ;          /* For internal use. */
1542   struct polyhedron * next ; /* Next component of the union. */
1543 @}
1544 typedef struct polyhedron Polyhedron;
1545 @end group
1546 @end example
1547
1548 @noindent The constraint representation is quite the same as in
1549 the @code{Matrix} data structure (@pxref{CloogMatrix}). The number of
1550 rows is @code{NbConstraints} and the
1551 number of columns in the @code{Polyhedron} structure is
1552 @code{Dimension+2} (the @math{+2} comes from the equality/inequality
1553 tag and the scalar coefficient). As in the @code{Matrix} structure,
1554 The data are stored in memory contiguously at the
1555 @code{p_Init} address and the @code{p_Init_size} field is used for
1556 memory deallocation in the multiple precision case (@pxref{CloogMatrix}).
1557 For a better understanding of the
1558 dual ray representation, the user may refer to the PolyLib documentation.
1559
1560
1561 @node CloogDomainList
1562 @subsection CloogDomainList
1563 @example
1564 @group
1565 struct cloogdomainlist
1566 @{ CloogDomain * domain ;
1567   struct cloogdomainlist * next ;
1568 @} ;
1569 typedef struct cloogdomainlist CloogDomainList ;
1570 @end group
1571 @end example
1572
1573 @noindent The CloogDomainList structure represents a @code{NULL} terminated linked list
1574 of domains.
1575
1576
1577 @node CloogStatement
1578 @subsection CloogStatement
1579 @example
1580 @group
1581 struct cloogstatement
1582 @{ int number ;                  /* The statement unique number. */
1583   void * usr ;                  /* Pointer for user's convenience. */
1584   struct cloogstatement * next ;/* Next element of the linked list. */
1585 @} ;
1586 typedef struct cloogstatement CloogStatement ;
1587 @end group
1588 @end example
1589
1590 @noindent The @code{CloogStatement} structure represents a @code{NULL}
1591 terminated linked
1592 list of statements. In CLooG, a statement is only defined by its unique
1593 number (@code{number}). The user can use the pointer @code{usr} for his
1594 own convenience to link his own statement representation to the
1595 corresponding @code{CloogStatement} structure. The whole management of the
1596 @code{usr} pointer is under the responsibility of the user, in particular,
1597 CLooG never tries to print, to allocate or to free a memory block pointed
1598 by @code{usr}.
1599
1600
1601 @node CloogBlock
1602 @subsection CloogBlock
1603 @example
1604 @group
1605 struct cloogblock
1606 @{ CloogStatement * statement ; /* Statement list of the block. */
1607   CloogMatrix * scattering ;   /* Scattering function of the block. */
1608   int depth ;                  /* Original block depth.*/
1609 @} ;
1610 typedef struct cloogblock CloogBlock ;
1611 @end group
1612 @end example
1613
1614 @noindent The @code{CloogBlock} structure represents a statement block.
1615 In a statement block, every statements have the same iteration
1616 domain and the same scattering function (actually, the scattering
1617 functions may differ only by a scalar
1618 coefficient if it just precises the ordering of the statements within
1619 the block). @code{statement} is the statement list where the
1620 statement order matters, @code{scattering} is one of
1621 the statement scattering functions and
1622 @code{depth} is the number of dimensions of the
1623 iteration domain (only the unknown, not the tag/parameters/scalar).
1624
1625
1626 @node CloogBlockList
1627 @subsection CloogBlockList
1628 @example
1629 @group
1630 struct cloogdblocklist
1631 @{ CloogBlock * block ;
1632   struct cloogblocklist * next ;
1633 @} ;
1634 typedef struct cloogblocklist CloogBlockList ;
1635 @end group
1636 @end example
1637
1638 @noindent The CloogBlockList structure represents a @code{NULL} terminated linked list
1639 of blocks.
1640
1641
1642 @node CloogLoop
1643 @subsection CloogLoop
1644 @example
1645 @group
1646 struct cloogloop
1647 @{ CloogDomain * domain ;       /* Iteration domain. */
1648   int stride ;                 /* Loop stride. */
1649   CloogBlock * block ;         /* Included statement block.*/
1650   struct cloogloop * inner ;   /* Loop at the next level. */
1651   struct cloogloop * next ;    /* Next loop at the same level. */
1652 @} ;
1653 typedef struct cloogloop CloogLoop ;
1654 @end group
1655 @end example
1656
1657 @noindent The @code{CloogLoop} structure represents a loop.
1658 First of all, a
1659 loop has an iteration domain (@code{domain}). The iterator's stride for loop
1660 increment is @code{stride}. The loop can include a statement block
1661 in the field @code{block}. If there is no included statement block,
1662 @code{block} is set to @code{NULL}. @code{inner} is a pointer to the inner
1663 loop, and @code{next} a pointer to the next loop in the textual order. If
1664 there are no inner loop or no next loop, the corresponding pointer is set
1665 to @code{NULL}.
1666
1667
1668 @node CloogNames
1669 @subsection CloogNames
1670 @example
1671 @group
1672 struct cloognames
1673 @{ int nb_scattering ;         /* Scattering dimension number. */
1674   int nb_iterators ;          /* Iterator number. */
1675   int nb_parameters ;         /* Parameter number. */
1676   char ** scattering ;        /* The scattering dimension names. */
1677   char ** iterators ;         /* The iterator names. */
1678   char ** parameters ;        /* The parameter names. */
1679 @} ;
1680 typedef struct cloognames CloogNames ;
1681 @end group
1682 @end example
1683
1684 @noindent The @code{CloogNames} structure represents the scattering dimension,
1685 the iterator and the parameter names in the final program.
1686 @code{nb_scattering}
1687 (respectively @code{nb_iterators} and @code{nb_parameters})
1688 is the number of scattering dimensions number
1689 (respectively the iterator and parameter number)
1690 and of elements in the corresponding array of strings
1691 @code{scattering}
1692 (respectively @code{iterators} and @code{parameters}).
1693 The @math{i^{th}} scattering dimension name will be associated with the
1694 to the dimension @math{i} of the scattering function.
1695 The @math{i^{th}} iterator name will be associated with the
1696 dimension @math{i} of the iteration domain.
1697 The @math{i^{th}} parameter name will be associated with the
1698 dimension @math{i} of the context polyhedron.
1699 The user has to ensure that there are
1700 enough scattering dimension, iterator and parameter names.
1701
1702
1703 @node CloogProgram
1704 @subsection CloogProgram
1705 @example
1706 @group
1707 struct cloogprogram
1708 @{ char language ;              /* The language of the program. */
1709   int  scattdims ;             /* Scattering dimension number. */
1710   CloogNames  * names ;        /* Iterators and parameters names. */
1711   CloogDomain * context ;      /* The context of the program. */
1712   CloogLoop   * loop ;         /* The loops of the program. */
1713   CloogBlockList * blocklist ; /* The statement block list. */
1714 @} ;
1715 typedef struct cloogprogram CloogProgram ;
1716 @end group
1717 @end example
1718
1719 @noindent The @code{CloogProgram} structure represents a static control program kernel.
1720 @code{language} precises the language (@code{c} for C or @code{f} for FORTRAN).
1721 @code{scattdims} gives the number of scattering dimensions.
1722 @code{context} is a pointer to the constraints on the program parameters,
1723 it can't be the
1724 @code{NULL} pointer even if there are no constraints on parameters. In such a
1725 case, set a polyhedron with as many dimensions as there are parameters, with
1726 an @emph{always true} constraint like @math{1 \geq 0} (this is necessary
1727 since the number of parameters is deduced from the dimension number of
1728 the context constraints). @code{loop} is a pointer
1729 to the first loop of the program. @code{names} is a pointer to the various
1730 element names (scattering dimension, iterators, parameters)
1731 of the final program. @code{names} can be the @code{NULL}
1732 pointer if the user do not want to use our pretty printing function.
1733 @code{blocklist} is the linked list of all the statement block structures.
1734 As an example, let us consider the following loop nest:
1735 @example
1736 @group
1737 for (i=0; i<=n; i++) @{
1738   for (j=0; j<=n; j++) @{
1739     S1 ;
1740     S2 ;
1741   @}
1742   for (j=n+1; j<=2*n; j++) @{
1743     S3 ;
1744   @}
1745 @}
1746 @end group
1747 @end example
1748 @noindent The next figure gives a possible representation in memory for this
1749 program thanks to the CLooG data structures (it has been actually printed
1750 by the @code{cloog_program_print} function). In this figure,
1751 @samp{+-- CloogLoop} denotes an @samp{inner} loop, while a @samp{CloogLoop}
1752 on the same column pointed by an arrow denotes a @samp{next} loop:
1753
1754 @smallexample
1755 +-- CloogProgram
1756 |       |
1757 |       Language: c
1758 |       |
1759 |       Scattering dimension number: 0
1760 |       |
1761 |       +-- CloogNames
1762 |       |       |
1763 |       |       Scattering dimension number: 0
1764 |       |       |
1765 |       |       +-- No scattering string
1766 |       |       |
1767 |       |       Iterator number -----------: 2
1768 |       |       |
1769 |       |       +-- Iterator strings ------: i j
1770 |       |       |
1771 |       |       Parameter number ----------: 1
1772 |       |       |
1773 |       |       +-- Parameter strings -----: n
1774 |       |
1775 |       +-- Context
1776 |       |       [   1    1   -2  ]
1777 |       |       [   1    0    1  ]
1778 |       |
1779 |       +-- CloogLoop
1780 |       |       |
1781 |       |       +-- CloogDomain
1782 |       |       |       [   1   -1    1    0  ]
1783 |       |       |       [   1    1    0    0  ]
1784 |       |       |       [   1    0    0    1  ]
1785 |       |       |
1786 |       |       Stride: 1
1787 |       |       |
1788 |       |       +-- Null CloogBlock
1789 |       |       |
1790 |       |       +-- CloogLoop
1791 |       |       |       |
1792 |       |       |       +-- CloogDomain
1793 |       |       |       |       [   1    0    1    0    0  ]
1794 |       |       |       |       [   1    0   -1    1    0  ]
1795 |       |       |       |       [   1    0    0    0    1  ]
1796 |       |       |       |
1797 |       |       |       Stride: 1
1798 |       |       |       |
1799 |       |       |       +-- Null CloogBlock
1800 |       |       |       |
1801 |       |       |       +-- CloogLoop
1802 |       |       |       |       |
1803 |       |       |       |       +-- CloogDomain
1804 |       |       |       |       |       [   1    0    0    0    1  ]
1805 |       |       |       |       |
1806 |       |       |       |       Stride: 1
1807 |       |       |       |       |
1808 |       |       |       |       +-- CloogBlock
1809 |       |       |       |       |       |
1810 |       |       |       |       |       +-- CloogStatement 1
1811 |       |       |       |       |       |          |
1812 |       |       |       |       |       |          V
1813 |       |       |       |       |       |   CloogStatement 2
1814 |       |       |       |       |       |
1815 |       |       |       |       |       +-- Null scattering function
1816 |       |       |       |       |       |
1817 |       |       |       |       |       Depth: 2
1818 |       |       |       |       |
1819 |       |       |       |
1820 |       |       |       V
1821 |       |       |   CloogLoop
1822 |       |       |       |
1823 |       |       |       +-- CloogDomain
1824 |       |       |       |       [   1    0   -1    2    0  ]
1825 |       |       |       |       [   1    0    1   -1   -1  ]
1826 |       |       |       |       [   1    0    0    0    1  ]
1827 |       |       |       |
1828 |       |       |       Stride: 1
1829 |       |       |       |
1830 |       |       |       +-- Null CloogBlock
1831 |       |       |       |
1832 |       |       |       +-- CloogLoop
1833 |       |       |       |       |
1834 |       |       |       |       +-- CloogDomain
1835 |       |       |       |       |       [   1    0    0    0    1  ]
1836 |       |       |       |       |
1837 |       |       |       |       Stride: 1
1838 |       |       |       |       |
1839 |       |       |       |       +-- CloogBlock
1840 |       |       |       |       |       |
1841 |       |       |       |       |       +-- CloogStatement 3
1842 |       |       |       |       |       |
1843 |       |       |       |       |       +-- Null scattering function
1844 |       |       |       |       |       |
1845 |       |       |       |       |       Depth: 2
1846 |       |       |       |       |
1847 |       |       |       |
1848 |       |       |
1849 |       |
1850 |
1851 @end smallexample
1852
1853
1854 @node CloogOptions
1855 @subsection CloogOptions
1856 @example
1857 @group
1858 struct cloogoptions
1859 @{ int l ;                     /* -l option.          */
1860   int f ;                     /* -f option.          */
1861   int strides ;               /* -strides option.    */
1862   int esp ;                   /* -esp option.        */
1863   int csp ;                   /* -csp option.        */
1864   int fsp ;                   /* -fsp option.        */
1865   int otl ;                   /* -otl option.        */
1866   int block ;                 /* -block option.      */
1867   int cpp ;                   /* -cpp option.        */
1868   int compilable ;            /* -compilable option. */
1869 @} ;
1870 typedef struct cloogoptions CloogOptions ;
1871 @end group
1872 @end example
1873
1874 @noindent The @code{CloogOptions} structure contains all the possible options to
1875 rule CLooG's behaviour (@pxref{Calling CLooG}).
1876 As a reminder, the default values are:
1877 @itemize @bullet
1878 @item @math{l = -1} (optimize control until the innermost loops),
1879 @item @math{f = 1} (optimize control from the outermost loops),
1880 @item @math{strides = 0} (use only unit strides),
1881 @item @math{esp = 0} (do not spread complex equalities),
1882 @item @math{csp = 1} (spread constant values),
1883 @item @math{fsp = 1} (start to spread from the first iterators),
1884 @item @math{otl = 1} (simplify loops running only once).
1885 @item @math{block = 0} (do not make statement blocks when not necessary).
1886 @item @math{cpp = 0} (do not generate a compilable part of code using preprocessor).
1887 @item @math{compilable = 0} (do not generate a compilable code).
1888 @end itemize
1889
1890
1891 @node CLooG Functions
1892 @section CLooG Functions Description
1893
1894 @menu
1895 * cloog_program_generate::
1896 * cloog_program_scatter::
1897 * cloog_program_pprint::
1898 * cloog_program_read::
1899 * From Matrices to Domains and Conversely::
1900 * Allocation and Initialization Functions::
1901 * Memory Deallocation Functions::
1902 * Printing Functions::
1903 @end menu
1904
1905
1906 @node cloog_program_generate
1907 @subsection cloog_program_generate
1908 @example
1909 @group
1910 CloogProgram   * cloog_program_generate
1911 ( CloogProgram * program,       /* Input program. */
1912   CloogOptions * options        /* Options. */
1913 ) ;
1914 @end group
1915 @end example
1916
1917 @noindent The @code{cloog_program_generate} function generates
1918 the data structure of the source code that scans the input
1919 polyhedra pointed by @code{program}
1920 according to the options pointed by @code{options}.
1921 The process is made directly on the input structure pointed by
1922 @code{program}, thus the original structure is no longer available
1923 after a call to this function. It returns a pointer to a
1924 @code{CloogProgram} structure containing the
1925 solution in CLooG structures.
1926
1927 The input @code{CloogProgram} structure must have only one loop level
1928 (no inner loops): there is one loop per statement block. For a given block,
1929 the corresponding loop carries the iteration domain, the statement block,
1930 and a loop stride initialized to 1. For instance, the input @code{CloogProgram} structure
1931 that have been sent to @code{cloog_program_generate} to achieve the final
1932 structure and code shown as example in the @code{CloogProgram} structure
1933 description (@pxref{CloogProgram}) was the following one:
1934
1935 @smallexample
1936 +-- CloogProgram
1937 |       |
1938 |       Language: c
1939 |       |
1940 |       Scattering dimension number: 0
1941 |       |
1942 |       +-- CloogNames
1943 |       |       |
1944 |       |       Scattering dimension number: 0
1945 |       |       |
1946 |       |       +-- No scattering string
1947 |       |       |
1948 |       |       Iterator number -----------: 2
1949 |       |       |
1950 |       |       +-- Iterator strings ------: i j
1951 |       |       |
1952 |       |       Parameter number ----------: 1
1953 |       |       |
1954 |       |       +-- Parameter strings -----: n
1955 |       |
1956 |       +-- Context
1957 |       |       [    1     1    -2  ]
1958 |       |
1959 |       +-- CloogLoop
1960 |       |       |
1961 |       |       +-- CloogDomain
1962 |       |       |       [    1     1     0     0     0  ]
1963 |       |       |       [    1    -1     0     1     0  ]
1964 |       |       |       [    1     0     1     0     0  ]
1965 |       |       |       [    1     0    -1     1     0  ]
1966 |       |       |
1967 |       |       Stride: 1
1968 |       |       |
1969 |       |       +-- CloogBlock
1970 |       |       |       |
1971 |       |       |       +-- CloogStatement 1
1972 |       |       |       |          |
1973 |       |       |       |          V
1974 |       |       |       |   CloogStatement 2
1975 |       |       |       |
1976 |       |       |       +-- Null scattering function
1977 |       |       |       |
1978 |       |       |       Depth: 2
1979 |       |       |
1980 |       |       V
1981 |       |   CloogLoop
1982 |       |       |
1983 |       |       +-- CloogDomain
1984 |       |       |       [    1     1     0     0     0  ]
1985 |       |       |       [    1    -1     0     1     0  ]
1986 |       |       |       [    1     0     1    -1    -1  ]
1987 |       |       |       [    1     0    -1     2     0  ]
1988 |       |       |
1989 |       |       Stride: 1
1990 |       |       |
1991 |       |       +-- CloogBlock
1992 |       |       |       |
1993 |       |       |       +-- CloogStatement 3
1994 |       |       |       |
1995 |       |       |       +-- Null scattering function
1996 |       |       |       |
1997 |       |       |       Depth: 2
1998 |       |       |
1999 |       |
2000 |
2001 @end smallexample
2002
2003
2004 @node cloog_program_pprint
2005 @subsection cloog_program_pprint
2006 @example
2007 @group
2008 void cloog_program_pprint
2009 ( FILE * file,                  /* Output file. */
2010   CloogProgram * program,       /* Program to print. */
2011   CloogOptions * options        /* Options. */
2012 ) ;
2013 @end group
2014 @end example
2015
2016 @noindent The function @code{cloog_program_pprint} is a pretty printer for
2017 @code{CloogProgram} structures when it is a solution provided by
2018 the @code{cloog_program_generate} function. It prints the code or pseudo-code in the
2019 file pointed by @code{file} (possibly @code{stdout}) with regards to the
2020 options pointed by @code{options}.
2021
2022
2023 @node cloog_program_scatter
2024 @subsection cloog_program_scatter
2025 @example
2026 @group
2027 void cloog_program_scatter
2028 ( CloogProgram * program,       /* Input program. */
2029   CloogDomainList * scattering, /* Additional scattering functions. */
2030   char ** names ;               /* Additional dimension names. */
2031 ) ;
2032 @end group
2033 @end example
2034
2035 @noindent The function @code{cloog_program_scatter} applies scattering
2036 functions to the @code{CloogProgram} structure pointed by @code{program}.
2037 Original domains of @code{program} are freed. Scattering functions
2038 are inside the @code{CloogDomainList} structure pointed by @code{scattering}.
2039 There must be as many scattering functions in the @code{CloogDomainList}
2040 structure as loops (i.e. iteration domains) in the @code{CloogProgram}
2041 structure. The first scattering function of the list will be applied to the
2042 iteration domain of the first loop in the program, and so on.
2043 @code{names} gives the scattering dimension names as an array of strings. If
2044 @code{names} is @code{NULL}, names are automatically generated: the name of
2045 the @math{n^{th}} scattering dimension will be @code{cn}.
2046
2047
2048 @node cloog_program_read
2049 @subsection cloog_program_read
2050 @example
2051 CloogProgram * cloog_program_read(FILE *) ;
2052 @end example
2053 @noindent The @code{cloog_program_read} function
2054 reads the program data from a CLooG input file
2055 (@pxref{Writing The Input File}). It takes
2056 as input a pointer to the file it has to read (possibly @code{stdin}), and
2057 return a pointer to the read @code{CloogProgram} structure.
2058
2059
2060 @node From Matrices to Domains and Conversely
2061 @subsection From Matrices to Domains and Conversely
2062 @example
2063 CloogMatrix * cloog_domain_domain2matrix(CloogDomain *) ;
2064 CloogDomain * cloog_domain_matrix2domain(CloogMatrix *) ;
2065 @end example
2066 @noindent Two functions are provided to translate a @code{CloogDomain}
2067 data structure to a @code{CloogMatrix} data structure and conversely.
2068 Each function takes as input a pointer to the data structure to translate
2069 and returns as output a pointer to the translated data structure. The
2070 input data structure if neither modified nor freed. They
2071 may be quite useful for e.g. pretty printing since it is more convenient
2072 in constraint (matrix) representation.
2073
2074
2075 @node Allocation and Initialization Functions
2076 @subsection Allocation and Initialization Functions
2077 @example
2078 CloogStructure * cloog_structure_malloc() ;
2079 @end example
2080 @noindent Each CLooG data structure has an allocation and initialization
2081 function as shown above, where @code{Structure} and @code{structure} have to
2082 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2083 instance @code{CloogLoop * cloog_loop_malloc() ;}. These functions return
2084 pointers to an allocated structure with fields set to convenient default
2085 values. @strong{Using those functions is mandatory} to support internal
2086 management fields and to avoid upward compatibility problems if
2087 new fields appear. An exception is @code{cloog_matrix_malloc} since the
2088 @code{CloogMatrix} comes directly from the PolyLib. It takes two parameters:
2089 the number of rows and columns of the matrix we want to allocate:
2090 @example
2091 CloogMatrix * cloog_matrix_malloc(unsigned nbrows, unsigned nbcolumns);
2092 @end example
2093
2094
2095 @node Memory Deallocation Functions
2096 @subsection Memory Deallocation Functions
2097 @example
2098 void cloog_structure_free(CloogStructure *) ;
2099 @end example
2100 @noindent Each CLooG data structure has a deallocation function as shown above,
2101  where @code{Structure} and @code{structure} have to
2102 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2103 instance @code{void cloog_loop_free(CloogLoop *) ;}. These functions
2104 free the allocated memory for the structure provided as input. They free
2105 memory recursively, i.e. they also free the allocated memory for the internal
2106 structures.
2107 @strong{Using those functions is mandatory} to avoid memory leaks on internal
2108 management fields and to avoid upward compatibility problems if
2109 new fields appear.
2110
2111
2112 @node Printing Functions
2113 @subsection Printing Functions
2114 @example
2115 void cloog_structure_print(FILE *, CloogStructure *) ;
2116 @end example
2117 @noindent Each CLooG data structure has a printing function as shown above,
2118  where @code{Structure} and @code{structure} have to
2119 be replaced by the name of the convenient structure (without @samp{Cloog} prefix) for
2120 instance @code{void cloog_loop_print(FILE *, CloogLoop *) ;}. These functions
2121 print the pointed structure (and its fields recursively) to the file provided
2122 as input (possibly @code{stdout}).
2123
2124
2125 @node Example of Library Utilization
2126 @section Example of Library Utilization
2127 Here is a basic example showing how it is possible to use the CLooG library,
2128 assuming that a standard installation has been done.
2129 The following C program reads a CLooG input file on the standard input,
2130 then prints the solution on the standard output.
2131 Options are preselected to the default values of the CLooG software.
2132 This example is provided in the @code{example} directory of the
2133 CLooG distribution.
2134 @example
2135 /* example.c */
2136 # include <stdio.h>
2137 # include <cloog/cloog.h>
2138
2139 int main()
2140 @{ CloogProgram * program ;
2141   CloogOptions * options ;
2142
2143   /* Setting options and reading program informations. */
2144   options = cloog_options_malloc() ;
2145   program = cloog_program_read(stdin,options) ;
2146
2147   /* Generating and printing the code. */
2148   program = cloog_program_generate(program,options) ;
2149   cloog_program_pprint(stdout,program,options) ;
2150
2151   cloog_options_free(options) ;
2152   cloog_program_free(program) ;
2153   return 0;
2154 @}
2155 @end example
2156
2157 @noindent The compilation command could be:
2158 @example
2159 gcc example.c -lcloog -o example
2160 @end example
2161 @noindent A calling command with the input file test.cloog could be:
2162 @example
2163 more test.cloog | ./example
2164 @end example
2165
2166
2167 @c %  ******************************** HACKING *********************************
2168 @c @node Hacking
2169 @c @chapter Hacking CLooG
2170
2171 @c @menu
2172 @c * Program organization::
2173 @c * Special Options::
2174 @c * CLooG Coding Standards::
2175 @c @end menu
2176
2177 @c @node Program organization
2178 @c @section Program organization
2179
2180 @c @node Special Options
2181 @c @section Special Options
2182
2183 @c @node CLooG Coding Standards
2184 @c @section CLooG Coding Standards
2185
2186
2187 @c %  ****************************** INSTALLING ********************************
2188 @node Installing
2189 @chapter Installing CLooG
2190
2191 @menu
2192 * License::
2193 * Requirements::
2194 * Basic Installation::
2195 * Optional Features::
2196 * Uninstallation::
2197 @end menu
2198
2199 @node License
2200 @section License
2201 First of all, it would be very kind to refer the following paper in any
2202 publication that result from the use of the CLooG software or its library,
2203 @pxref{Bas04} (a bibtex entry is provided behind the title page of this
2204 manual, along with copyright notice, and in the CLooG home
2205 @code{http://www.CLooG.org}.
2206
2207 This program is free software; you can redistribute it and/or
2208 modify it under the terms of the GNU General Public License version 2
2209 as published by the Free Software Foundation.
2210 This program is distributed in the hope that it will be useful,
2211 but WITHOUT ANY WARRANTY; without even the implied warranty of
2212 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
2213 GNU General Public License for more details.
2214 @code{http://www.gnu.org/copyleft/gpl.html}
2215
2216
2217 @node Requirements
2218 @section Requirements
2219
2220 @menu
2221 * PolyLib::
2222 * GMP Library::
2223 @end menu
2224
2225
2226 @node PolyLib
2227 @subsection PolyLib (mandatory)
2228 To successfully install CLooG, the user need firstly to install PolyLib
2229 version 5.22.1 or above (default 64 bits version is satisfying
2230 as well as 32 bits or GMP multiple precision version).
2231 Polylib can be downloaded freely
2232 at @code{http://icps.u-strasbg.fr/PolyLib/} or
2233 @code{http://www.irisa.fr/polylib/}. Once downloaded and unpacked
2234 (e.g. using the @samp{tar -zxvf polylib-5.22.1.tar.gz} command),
2235 the user can compile
2236 it by typing the following commands on the PolyLib's root directory:
2237
2238 @itemize @bullet
2239 @item @code{./configure}
2240 @item @code{make}
2241 @item And as root: @code{make install}
2242 @end itemize
2243
2244 The PolyLib default installation is @code{/usr/local}. This directory may
2245 not be inside your library path. To fix the problem, the user should set
2246 @example
2247 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
2248 @end example
2249 @noindent if your shell is, e.g., bash or
2250 @example
2251 setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib
2252 @end example
2253 @noindent if your shell is, e.g., tcsh. Add the line to your .bashrc or .tcshrc (or
2254 whatever convenient file) to make this change permanent. Another solution
2255 is to ask PolyLib to install in the standard path by using the prefix
2256 option of the configure script:
2257 @samp{./configure --prefix=/usr}.
2258
2259 CLooG makes intensive calls to polyhedral operations, and PolyLib
2260 functions do the job. Polylib is a free library written in C for the
2261 manipulation of polyhedra. The library is operating on objects like
2262 vectors, matrices, lattices, polyhedra, Z-polyhedra, unions of
2263 polyhedra and a lot of other intermediary structures. It provides
2264 functions for all the important operations on these structures.
2265
2266 @node GMP Library
2267 @subsection GMP Library (optional)
2268
2269 To be able to deal with insanely large coefficient, the user will need to
2270 install the GNU Multiple Precision Library (GMP for short) version 4.1.4
2271 or above. It can be freely downloaded from @code{http://www.swox.com/gmp}.
2272 The user can compile it by typing the following commands on the GMP root
2273 directory:
2274
2275 @itemize @bullet
2276 @item @code{./configure}
2277 @item @code{make}
2278 @item And as root: @code{make install}
2279 @end itemize
2280
2281 The GMP default installation is @code{/usr/local}, the same method to
2282 fix a library path problem applies as with PolyLib (@pxref{PolyLib}).
2283
2284 The PolyLib has to be built using the GMP library by specifying the option
2285 @samp{--with-libgmp=PATH_TO_GMP} to the PolyLib configure script
2286 (where @code{PATH_TO_GMP} is @code{/usr/local} if you did not change the GMP
2287 installation directory). Then you have to set the convenient CLooG configure
2288 script options to buid the GMP version (@pxref{Optional Features}).
2289
2290
2291 @node Basic Installation
2292 @section CLooG Basic Installation
2293
2294 Once downloaded and unpacked
2295 (e.g. using the @samp{tar -zxvf cloog-@value{VERSION}.tar.gz} command),
2296 you can compile CLooG by typing the following commands on the CLooG's root
2297 directory:
2298
2299 @itemize @bullet
2300 @item @code{./configure}
2301 @item @code{make}
2302 @item And as root: @code{make install}
2303 @end itemize
2304
2305 Depending on the location of the PolyLib, you may need to set the
2306 option @code{--with-polylib} of the configure script
2307 (e.g. @samp{./configure --with-polylib=/usr/local} with a default PolyLib
2308 installation).
2309
2310 The program binaries and object files can be removed from the
2311 source code directory by typing @code{make clean}. To also remove the
2312 files that the @code{configure} script created (so you can compile the
2313 package for a different kind of computer) type @code{make distclean}.
2314
2315 Both the CLooG software and library have been successfully compiled
2316 on the following systems:
2317 @itemize @bullet
2318 @item PC's under Linux, with the @code{gcc} compiler,
2319 @item PC's under Windows (Cygwin), with the @code{gcc} compiler,
2320 @item Sparc and UltraSparc Stations, with the @code{gcc} compiler.
2321 @end itemize
2322
2323 @node Optional Features
2324 @section Optional Features
2325 The @code{configure} shell script attempts to guess correct values for
2326 various system-dependent variables and user options used during compilation.
2327 It uses those values to create the @code{Makefile}. Various user options
2328 are provided by the CLooG's configure script. They are summarized in the
2329 following list and may be printed by typing @code{./configure --help} in the
2330 CLooG top-level directory.
2331
2332 @itemize @bullet
2333 @item By default, the installation directory is @code{/usr/local}:
2334 @code{make install} will install the package's files in
2335 @code{/usr/local/bin}, @code{/usr/local/lib} and @code{/usr/local/include}.
2336 The user can specify an installation prefix other than @code{/usr/local} by
2337 giving @code{configure} the option @code{--prefix=PATH}.
2338
2339 @item By default, @code{configure} will look for the PolyLib in standard
2340 locations. If necessary, the user can specify the PolyLib path by giving
2341 @code{configure} the option @code{--with-polylib=PATH}.
2342
2343 @item By default, both CLooG software and library are compiled and installed.
2344 By giving  @code{configure} the option @code{--without-cloog} the user
2345 disable the compilation and installation of the CLooG software.
2346 By giving @code{configure} the option
2347 @code{--without-lib} the user disable the compilation and installation of the
2348 CLooG library.
2349
2350 @item By default, CLooG is built in 64bits version if such version of the
2351 PolyLib is found by @code{configure}. If the only existing version of the
2352 PolyLib is the 32bits or if the user give to @code{configure} the option
2353 @code{--with-bits=32}, the 32bits version of CLooG will be compiled. In the
2354 same way, the option @code{--with-bits=gmp} have to be used to build
2355 the multiple precision version.
2356
2357 @item By default, @code{configure} will look for the GMP library
2358 (necessary to build the multiple precision version) in standard
2359 locations. If necessary, the user can specify the GMP path by giving
2360 @code{configure} the option @code{--with-gmp=PATH}.
2361 @end itemize
2362
2363 @node Uninstallation
2364 @section Uninstallation
2365 The user can easily remove the CLooG software and library from his system
2366 by typing (as root if necessary) from the CLooG top-level directory
2367 @code{make uninstall}.
2368
2369 @c %  **************************** DOCUMENTATION ******************************
2370 @node Documentation
2371 @chapter Documentation
2372 The CLooG distribution provides several documentation sources. First, the
2373 source code itself is as documented as possible. The code comments use a
2374 Doxygen-compatible presentation (something similar to what JavaDoc does for
2375 JAVA). The user may install Doxygen
2376 (see @code{http://www.stack.nl/~dimitri/doxygen}) to automatically
2377 generate a technical documentation by typing @code{make doc} or
2378 @code{doxygen ./autoconf/Doxyfile} at the CLooG top-level directory after
2379 running the configure script (@pxref{Installing}). Doxygen will generate
2380 documentation sources (in HTML, LaTeX and man) in the @code{doc/source}
2381 directory of the CLooG distribution.
2382
2383 The Texinfo sources of the present document are also provided in the @code{doc}
2384 directory. You can build it in either DVI format (by typing
2385 @code{texi2dvi cloog.texi}) or PDF format
2386 (by typing @code{texi2pdf cloog.texi}) or HTML format
2387 (by typing @code{makeinfo --html cloog.texi}, using @code{--no-split}
2388 option to generate a single HTML file) or info format
2389 (by typing @code{makeinfo cloog.texi}).
2390
2391 @c %  ****************************** REFERENCES ********************************
2392 @node References
2393 @chapter References
2394
2395 @itemize
2396 @item
2397 @anchor{Bas03a}[Bas03a] C. Bastoul, P. Feautrier. Improving data locality
2398 by chunking. CC'12 International Conference on Compiler Construction,
2399 LNCS 2622, pages 320-335, Warsaw, april 2003.
2400
2401 @item
2402 @anchor{Bas03b}[Bas03b] C. Bastoul. Efficient code generation for automatic
2403 parallelization and optimization. ISPDC'03 IEEE International Symposium on
2404 Parallel and Distributed Computing, pages 23-30, Ljubljana, october 2003.
2405
2406 @item
2407 @anchor{Bas04}[Bas04] C. Bastoul. Code Generation in the Polyhedral Model
2408 Is Easier Than You Think. PACT'13 IEEE International Conference on Parallel
2409 Architecture and Compilation Techniques, pages 7-16, Juan-les-Pins,
2410 september 2004.
2411
2412 @item
2413 @anchor{Fea92}[Fea92] P. Feautrier Some efficient solutions to the affine
2414 scheduling problem, part II: multidimensional time.
2415 International Journal of Parallel Programming, 21(6):389--420, December 1992.
2416
2417 @item
2418 @anchor{Gri04}[Gri04] M. Griebl. Automatic parallelization of loop programs
2419 for distributed memory architectures. Habilitation Thesis. Facult@"at f@"ur
2420 Mathematik und Informatik, Universit@"at Passau, 2004.
2421 @emph{http://www.infosun.fmi.uni-passau.de/cl/loopo/}
2422
2423 @item
2424 @anchor{Qui00}[Qui00] F. Quiller@'e, S. Rajopadhye, and D. Wilde.
2425 Generation of efficient nested loops from polyhedra.
2426 International Journal of Parallel Programming, 28(5):469-498,
2427 october 2000.
2428
2429 @item
2430 @anchor{Wil93}[Wil93] Doran K. Wilde.
2431 A library for doing polyhedral operations.
2432 Technical Report 785, IRISA, Rennes, France, 1993.
2433
2434 @end itemize
2435
2436
2437
2438
2439 @c % /*************************************************************************
2440 @c %  *                       PART VI: END OF THE DOCUMENT                    *
2441 @c %  *************************************************************************/
2442 @c @unnumbered Index
2443
2444 @c @printindex cp
2445
2446 @bye