2015-06-11 Paul Thomas <pault@gcc.gnu.org>
[official-gcc.git] / gcc / tree-predcom.c
blob4415fe3ab4d676e4d975bfeb4c04e848a6603085
1 /* Predictive commoning.
2 Copyright (C) 2005-2015 Free Software Foundation, Inc.
4 This file is part of GCC.
6 GCC is free software; you can redistribute it and/or modify it
7 under the terms of the GNU General Public License as published by the
8 Free Software Foundation; either version 3, or (at your option) any
9 later version.
11 GCC is distributed in the hope that it will be useful, but WITHOUT
12 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
13 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
14 for more details.
16 You should have received a copy of the GNU General Public License
17 along with GCC; see the file COPYING3. If not see
18 <http://www.gnu.org/licenses/>. */
20 /* This file implements the predictive commoning optimization. Predictive
21 commoning can be viewed as CSE around a loop, and with some improvements,
22 as generalized strength reduction-- i.e., reusing values computed in
23 earlier iterations of a loop in the later ones. So far, the pass only
24 handles the most useful case, that is, reusing values of memory references.
25 If you think this is all just a special case of PRE, you are sort of right;
26 however, concentrating on loops is simpler, and makes it possible to
27 incorporate data dependence analysis to detect the opportunities, perform
28 loop unrolling to avoid copies together with renaming immediately,
29 and if needed, we could also take register pressure into account.
31 Let us demonstrate what is done on an example:
33 for (i = 0; i < 100; i++)
35 a[i+2] = a[i] + a[i+1];
36 b[10] = b[10] + i;
37 c[i] = c[99 - i];
38 d[i] = d[i + 1];
41 1) We find data references in the loop, and split them to mutually
42 independent groups (i.e., we find components of a data dependence
43 graph). We ignore read-read dependences whose distance is not constant.
44 (TODO -- we could also ignore antidependences). In this example, we
45 find the following groups:
47 a[i]{read}, a[i+1]{read}, a[i+2]{write}
48 b[10]{read}, b[10]{write}
49 c[99 - i]{read}, c[i]{write}
50 d[i + 1]{read}, d[i]{write}
52 2) Inside each of the group, we verify several conditions:
53 a) all the references must differ in indices only, and the indices
54 must all have the same step
55 b) the references must dominate loop latch (and thus, they must be
56 ordered by dominance relation).
57 c) the distance of the indices must be a small multiple of the step
58 We are then able to compute the difference of the references (# of
59 iterations before they point to the same place as the first of them).
60 Also, in case there are writes in the loop, we split the groups into
61 chains whose head is the write whose values are used by the reads in
62 the same chain. The chains are then processed independently,
63 making the further transformations simpler. Also, the shorter chains
64 need the same number of registers, but may require lower unrolling
65 factor in order to get rid of the copies on the loop latch.
67 In our example, we get the following chains (the chain for c is invalid).
69 a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
70 b[10]{read,+0}, b[10]{write,+0}
71 d[i + 1]{read,+0}, d[i]{write,+1}
73 3) For each read, we determine the read or write whose value it reuses,
74 together with the distance of this reuse. I.e. we take the last
75 reference before it with distance 0, or the last of the references
76 with the smallest positive distance to the read. Then, we remove
77 the references that are not used in any of these chains, discard the
78 empty groups, and propagate all the links so that they point to the
79 single root reference of the chain (adjusting their distance
80 appropriately). Some extra care needs to be taken for references with
81 step 0. In our example (the numbers indicate the distance of the
82 reuse),
84 a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
85 b[10] --> (*) 1, b[10] (*)
87 4) The chains are combined together if possible. If the corresponding
88 elements of two chains are always combined together with the same
89 operator, we remember just the result of this combination, instead
90 of remembering the values separately. We may need to perform
91 reassociation to enable combining, for example
93 e[i] + f[i+1] + e[i+1] + f[i]
95 can be reassociated as
97 (e[i] + f[i]) + (e[i+1] + f[i+1])
99 and we can combine the chains for e and f into one chain.
101 5) For each root reference (end of the chain) R, let N be maximum distance
102 of a reference reusing its value. Variables R0 up to RN are created,
103 together with phi nodes that transfer values from R1 .. RN to
104 R0 .. R(N-1).
105 Initial values are loaded to R0..R(N-1) (in case not all references
106 must necessarily be accessed and they may trap, we may fail here;
107 TODO sometimes, the loads could be guarded by a check for the number
108 of iterations). Values loaded/stored in roots are also copied to
109 RN. Other reads are replaced with the appropriate variable Ri.
110 Everything is put to SSA form.
112 As a small improvement, if R0 is dead after the root (i.e., all uses of
113 the value with the maximum distance dominate the root), we can avoid
114 creating RN and use R0 instead of it.
116 In our example, we get (only the parts concerning a and b are shown):
117 for (i = 0; i < 100; i++)
119 f = phi (a[0], s);
120 s = phi (a[1], f);
121 x = phi (b[10], x);
123 f = f + s;
124 a[i+2] = f;
125 x = x + i;
126 b[10] = x;
129 6) Factor F for unrolling is determined as the smallest common multiple of
130 (N + 1) for each root reference (N for references for that we avoided
131 creating RN). If F and the loop is small enough, loop is unrolled F
132 times. The stores to RN (R0) in the copies of the loop body are
133 periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
134 be coalesced and the copies can be eliminated.
136 TODO -- copy propagation and other optimizations may change the live
137 ranges of the temporary registers and prevent them from being coalesced;
138 this may increase the register pressure.
140 In our case, F = 2 and the (main loop of the) result is
142 for (i = 0; i < ...; i += 2)
144 f = phi (a[0], f);
145 s = phi (a[1], s);
146 x = phi (b[10], x);
148 f = f + s;
149 a[i+2] = f;
150 x = x + i;
151 b[10] = x;
153 s = s + f;
154 a[i+3] = s;
155 x = x + i;
156 b[10] = x;
159 TODO -- stores killing other stores can be taken into account, e.g.,
160 for (i = 0; i < n; i++)
162 a[i] = 1;
163 a[i+2] = 2;
166 can be replaced with
168 t0 = a[0];
169 t1 = a[1];
170 for (i = 0; i < n; i++)
172 a[i] = 1;
173 t2 = 2;
174 t0 = t1;
175 t1 = t2;
177 a[n] = t0;
178 a[n+1] = t1;
180 The interesting part is that this would generalize store motion; still, since
181 sm is performed elsewhere, it does not seem that important.
183 Predictive commoning can be generalized for arbitrary computations (not
184 just memory loads), and also nontrivial transfer functions (e.g., replacing
185 i * i with ii_last + 2 * i + 1), to generalize strength reduction. */
187 #include "config.h"
188 #include "system.h"
189 #include "coretypes.h"
190 #include "tm.h"
191 #include "input.h"
192 #include "alias.h"
193 #include "symtab.h"
194 #include "tree.h"
195 #include "fold-const.h"
196 #include "tm_p.h"
197 #include "cfgloop.h"
198 #include "predict.h"
199 #include "hard-reg-set.h"
200 #include "function.h"
201 #include "dominance.h"
202 #include "cfg.h"
203 #include "basic-block.h"
204 #include "tree-ssa-alias.h"
205 #include "internal-fn.h"
206 #include "tree-eh.h"
207 #include "gimple-expr.h"
208 #include "is-a.h"
209 #include "gimple.h"
210 #include "gimplify.h"
211 #include "gimple-iterator.h"
212 #include "gimplify-me.h"
213 #include "gimple-ssa.h"
214 #include "tree-phinodes.h"
215 #include "ssa-iterators.h"
216 #include "stringpool.h"
217 #include "tree-ssanames.h"
218 #include "tree-ssa-loop-ivopts.h"
219 #include "tree-ssa-loop-manip.h"
220 #include "tree-ssa-loop-niter.h"
221 #include "tree-ssa-loop.h"
222 #include "tree-into-ssa.h"
223 #include "rtl.h"
224 #include "flags.h"
225 #include "insn-config.h"
226 #include "expmed.h"
227 #include "dojump.h"
228 #include "explow.h"
229 #include "calls.h"
230 #include "emit-rtl.h"
231 #include "varasm.h"
232 #include "stmt.h"
233 #include "expr.h"
234 #include "tree-dfa.h"
235 #include "tree-ssa.h"
236 #include "tree-data-ref.h"
237 #include "tree-scalar-evolution.h"
238 #include "tree-chrec.h"
239 #include "params.h"
240 #include "gimple-pretty-print.h"
241 #include "tree-pass.h"
242 #include "tree-affine.h"
243 #include "tree-inline.h"
244 #include "wide-int-print.h"
246 /* The maximum number of iterations between the considered memory
247 references. */
249 #define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
251 /* Data references (or phi nodes that carry data reference values across
252 loop iterations). */
254 typedef struct dref_d
256 /* The reference itself. */
257 struct data_reference *ref;
259 /* The statement in that the reference appears. */
260 gimple stmt;
262 /* In case that STMT is a phi node, this field is set to the SSA name
263 defined by it in replace_phis_by_defined_names (in order to avoid
264 pointing to phi node that got reallocated in the meantime). */
265 tree name_defined_by_phi;
267 /* Distance of the reference from the root of the chain (in number of
268 iterations of the loop). */
269 unsigned distance;
271 /* Number of iterations offset from the first reference in the component. */
272 widest_int offset;
274 /* Number of the reference in a component, in dominance ordering. */
275 unsigned pos;
277 /* True if the memory reference is always accessed when the loop is
278 entered. */
279 unsigned always_accessed : 1;
280 } *dref;
283 /* Type of the chain of the references. */
285 enum chain_type
287 /* The addresses of the references in the chain are constant. */
288 CT_INVARIANT,
290 /* There are only loads in the chain. */
291 CT_LOAD,
293 /* Root of the chain is store, the rest are loads. */
294 CT_STORE_LOAD,
296 /* A combination of two chains. */
297 CT_COMBINATION
300 /* Chains of data references. */
302 typedef struct chain
304 /* Type of the chain. */
305 enum chain_type type;
307 /* For combination chains, the operator and the two chains that are
308 combined, and the type of the result. */
309 enum tree_code op;
310 tree rslt_type;
311 struct chain *ch1, *ch2;
313 /* The references in the chain. */
314 vec<dref> refs;
316 /* The maximum distance of the reference in the chain from the root. */
317 unsigned length;
319 /* The variables used to copy the value throughout iterations. */
320 vec<tree> vars;
322 /* Initializers for the variables. */
323 vec<tree> inits;
325 /* True if there is a use of a variable with the maximal distance
326 that comes after the root in the loop. */
327 unsigned has_max_use_after : 1;
329 /* True if all the memory references in the chain are always accessed. */
330 unsigned all_always_accessed : 1;
332 /* True if this chain was combined together with some other chain. */
333 unsigned combined : 1;
334 } *chain_p;
337 /* Describes the knowledge about the step of the memory references in
338 the component. */
340 enum ref_step_type
342 /* The step is zero. */
343 RS_INVARIANT,
345 /* The step is nonzero. */
346 RS_NONZERO,
348 /* The step may or may not be nonzero. */
349 RS_ANY
352 /* Components of the data dependence graph. */
354 struct component
356 /* The references in the component. */
357 vec<dref> refs;
359 /* What we know about the step of the references in the component. */
360 enum ref_step_type comp_step;
362 /* Next component in the list. */
363 struct component *next;
366 /* Bitmap of ssa names defined by looparound phi nodes covered by chains. */
368 static bitmap looparound_phis;
370 /* Cache used by tree_to_aff_combination_expand. */
372 static hash_map<tree, name_expansion *> *name_expansions;
374 /* Dumps data reference REF to FILE. */
376 extern void dump_dref (FILE *, dref);
377 void
378 dump_dref (FILE *file, dref ref)
380 if (ref->ref)
382 fprintf (file, " ");
383 print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
384 fprintf (file, " (id %u%s)\n", ref->pos,
385 DR_IS_READ (ref->ref) ? "" : ", write");
387 fprintf (file, " offset ");
388 print_decs (ref->offset, file);
389 fprintf (file, "\n");
391 fprintf (file, " distance %u\n", ref->distance);
393 else
395 if (gimple_code (ref->stmt) == GIMPLE_PHI)
396 fprintf (file, " looparound ref\n");
397 else
398 fprintf (file, " combination ref\n");
399 fprintf (file, " in statement ");
400 print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
401 fprintf (file, "\n");
402 fprintf (file, " distance %u\n", ref->distance);
407 /* Dumps CHAIN to FILE. */
409 extern void dump_chain (FILE *, chain_p);
410 void
411 dump_chain (FILE *file, chain_p chain)
413 dref a;
414 const char *chain_type;
415 unsigned i;
416 tree var;
418 switch (chain->type)
420 case CT_INVARIANT:
421 chain_type = "Load motion";
422 break;
424 case CT_LOAD:
425 chain_type = "Loads-only";
426 break;
428 case CT_STORE_LOAD:
429 chain_type = "Store-loads";
430 break;
432 case CT_COMBINATION:
433 chain_type = "Combination";
434 break;
436 default:
437 gcc_unreachable ();
440 fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
441 chain->combined ? " (combined)" : "");
442 if (chain->type != CT_INVARIANT)
443 fprintf (file, " max distance %u%s\n", chain->length,
444 chain->has_max_use_after ? "" : ", may reuse first");
446 if (chain->type == CT_COMBINATION)
448 fprintf (file, " equal to %p %s %p in type ",
449 (void *) chain->ch1, op_symbol_code (chain->op),
450 (void *) chain->ch2);
451 print_generic_expr (file, chain->rslt_type, TDF_SLIM);
452 fprintf (file, "\n");
455 if (chain->vars.exists ())
457 fprintf (file, " vars");
458 FOR_EACH_VEC_ELT (chain->vars, i, var)
460 fprintf (file, " ");
461 print_generic_expr (file, var, TDF_SLIM);
463 fprintf (file, "\n");
466 if (chain->inits.exists ())
468 fprintf (file, " inits");
469 FOR_EACH_VEC_ELT (chain->inits, i, var)
471 fprintf (file, " ");
472 print_generic_expr (file, var, TDF_SLIM);
474 fprintf (file, "\n");
477 fprintf (file, " references:\n");
478 FOR_EACH_VEC_ELT (chain->refs, i, a)
479 dump_dref (file, a);
481 fprintf (file, "\n");
484 /* Dumps CHAINS to FILE. */
486 extern void dump_chains (FILE *, vec<chain_p> );
487 void
488 dump_chains (FILE *file, vec<chain_p> chains)
490 chain_p chain;
491 unsigned i;
493 FOR_EACH_VEC_ELT (chains, i, chain)
494 dump_chain (file, chain);
497 /* Dumps COMP to FILE. */
499 extern void dump_component (FILE *, struct component *);
500 void
501 dump_component (FILE *file, struct component *comp)
503 dref a;
504 unsigned i;
506 fprintf (file, "Component%s:\n",
507 comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
508 FOR_EACH_VEC_ELT (comp->refs, i, a)
509 dump_dref (file, a);
510 fprintf (file, "\n");
513 /* Dumps COMPS to FILE. */
515 extern void dump_components (FILE *, struct component *);
516 void
517 dump_components (FILE *file, struct component *comps)
519 struct component *comp;
521 for (comp = comps; comp; comp = comp->next)
522 dump_component (file, comp);
525 /* Frees a chain CHAIN. */
527 static void
528 release_chain (chain_p chain)
530 dref ref;
531 unsigned i;
533 if (chain == NULL)
534 return;
536 FOR_EACH_VEC_ELT (chain->refs, i, ref)
537 free (ref);
539 chain->refs.release ();
540 chain->vars.release ();
541 chain->inits.release ();
543 free (chain);
546 /* Frees CHAINS. */
548 static void
549 release_chains (vec<chain_p> chains)
551 unsigned i;
552 chain_p chain;
554 FOR_EACH_VEC_ELT (chains, i, chain)
555 release_chain (chain);
556 chains.release ();
559 /* Frees a component COMP. */
561 static void
562 release_component (struct component *comp)
564 comp->refs.release ();
565 free (comp);
568 /* Frees list of components COMPS. */
570 static void
571 release_components (struct component *comps)
573 struct component *act, *next;
575 for (act = comps; act; act = next)
577 next = act->next;
578 release_component (act);
582 /* Finds a root of tree given by FATHERS containing A, and performs path
583 shortening. */
585 static unsigned
586 component_of (unsigned fathers[], unsigned a)
588 unsigned root, n;
590 for (root = a; root != fathers[root]; root = fathers[root])
591 continue;
593 for (; a != root; a = n)
595 n = fathers[a];
596 fathers[a] = root;
599 return root;
602 /* Join operation for DFU. FATHERS gives the tree, SIZES are sizes of the
603 components, A and B are components to merge. */
605 static void
606 merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
608 unsigned ca = component_of (fathers, a);
609 unsigned cb = component_of (fathers, b);
611 if (ca == cb)
612 return;
614 if (sizes[ca] < sizes[cb])
616 sizes[cb] += sizes[ca];
617 fathers[ca] = cb;
619 else
621 sizes[ca] += sizes[cb];
622 fathers[cb] = ca;
626 /* Returns true if A is a reference that is suitable for predictive commoning
627 in the innermost loop that contains it. REF_STEP is set according to the
628 step of the reference A. */
630 static bool
631 suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
633 tree ref = DR_REF (a), step = DR_STEP (a);
635 if (!step
636 || TREE_THIS_VOLATILE (ref)
637 || !is_gimple_reg_type (TREE_TYPE (ref))
638 || tree_could_throw_p (ref))
639 return false;
641 if (integer_zerop (step))
642 *ref_step = RS_INVARIANT;
643 else if (integer_nonzerop (step))
644 *ref_step = RS_NONZERO;
645 else
646 *ref_step = RS_ANY;
648 return true;
651 /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */
653 static void
654 aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
656 tree type = TREE_TYPE (DR_OFFSET (dr));
657 aff_tree delta;
659 tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
660 &name_expansions);
661 aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
662 aff_combination_add (offset, &delta);
665 /* Determines number of iterations of the innermost enclosing loop before B
666 refers to exactly the same location as A and stores it to OFF. If A and
667 B do not have the same step, they never meet, or anything else fails,
668 returns false, otherwise returns true. Both A and B are assumed to
669 satisfy suitable_reference_p. */
671 static bool
672 determine_offset (struct data_reference *a, struct data_reference *b,
673 widest_int *off)
675 aff_tree diff, baseb, step;
676 tree typea, typeb;
678 /* Check that both the references access the location in the same type. */
679 typea = TREE_TYPE (DR_REF (a));
680 typeb = TREE_TYPE (DR_REF (b));
681 if (!useless_type_conversion_p (typeb, typea))
682 return false;
684 /* Check whether the base address and the step of both references is the
685 same. */
686 if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
687 || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
688 return false;
690 if (integer_zerop (DR_STEP (a)))
692 /* If the references have loop invariant address, check that they access
693 exactly the same location. */
694 *off = 0;
695 return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
696 && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
699 /* Compare the offsets of the addresses, and check whether the difference
700 is a multiple of step. */
701 aff_combination_dr_offset (a, &diff);
702 aff_combination_dr_offset (b, &baseb);
703 aff_combination_scale (&baseb, -1);
704 aff_combination_add (&diff, &baseb);
706 tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
707 &step, &name_expansions);
708 return aff_combination_constant_multiple_p (&diff, &step, off);
711 /* Returns the last basic block in LOOP for that we are sure that
712 it is executed whenever the loop is entered. */
714 static basic_block
715 last_always_executed_block (struct loop *loop)
717 unsigned i;
718 vec<edge> exits = get_loop_exit_edges (loop);
719 edge ex;
720 basic_block last = loop->latch;
722 FOR_EACH_VEC_ELT (exits, i, ex)
723 last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
724 exits.release ();
726 return last;
729 /* Splits dependence graph on DATAREFS described by DEPENDS to components. */
731 static struct component *
732 split_data_refs_to_components (struct loop *loop,
733 vec<data_reference_p> datarefs,
734 vec<ddr_p> depends)
736 unsigned i, n = datarefs.length ();
737 unsigned ca, ia, ib, bad;
738 unsigned *comp_father = XNEWVEC (unsigned, n + 1);
739 unsigned *comp_size = XNEWVEC (unsigned, n + 1);
740 struct component **comps;
741 struct data_reference *dr, *dra, *drb;
742 struct data_dependence_relation *ddr;
743 struct component *comp_list = NULL, *comp;
744 dref dataref;
745 basic_block last_always_executed = last_always_executed_block (loop);
747 FOR_EACH_VEC_ELT (datarefs, i, dr)
749 if (!DR_REF (dr))
751 /* A fake reference for call or asm_expr that may clobber memory;
752 just fail. */
753 goto end;
755 /* predcom pass isn't prepared to handle calls with data references. */
756 if (is_gimple_call (DR_STMT (dr)))
757 goto end;
758 dr->aux = (void *) (size_t) i;
759 comp_father[i] = i;
760 comp_size[i] = 1;
763 /* A component reserved for the "bad" data references. */
764 comp_father[n] = n;
765 comp_size[n] = 1;
767 FOR_EACH_VEC_ELT (datarefs, i, dr)
769 enum ref_step_type dummy;
771 if (!suitable_reference_p (dr, &dummy))
773 ia = (unsigned) (size_t) dr->aux;
774 merge_comps (comp_father, comp_size, n, ia);
778 FOR_EACH_VEC_ELT (depends, i, ddr)
780 widest_int dummy_off;
782 if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
783 continue;
785 dra = DDR_A (ddr);
786 drb = DDR_B (ddr);
787 ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
788 ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
789 if (ia == ib)
790 continue;
792 bad = component_of (comp_father, n);
794 /* If both A and B are reads, we may ignore unsuitable dependences. */
795 if (DR_IS_READ (dra) && DR_IS_READ (drb))
797 if (ia == bad || ib == bad
798 || !determine_offset (dra, drb, &dummy_off))
799 continue;
801 /* If A is read and B write or vice versa and there is unsuitable
802 dependence, instead of merging both components into a component
803 that will certainly not pass suitable_component_p, just put the
804 read into bad component, perhaps at least the write together with
805 all the other data refs in it's component will be optimizable. */
806 else if (DR_IS_READ (dra) && ib != bad)
808 if (ia == bad)
809 continue;
810 else if (!determine_offset (dra, drb, &dummy_off))
812 merge_comps (comp_father, comp_size, bad, ia);
813 continue;
816 else if (DR_IS_READ (drb) && ia != bad)
818 if (ib == bad)
819 continue;
820 else if (!determine_offset (dra, drb, &dummy_off))
822 merge_comps (comp_father, comp_size, bad, ib);
823 continue;
827 merge_comps (comp_father, comp_size, ia, ib);
830 comps = XCNEWVEC (struct component *, n);
831 bad = component_of (comp_father, n);
832 FOR_EACH_VEC_ELT (datarefs, i, dr)
834 ia = (unsigned) (size_t) dr->aux;
835 ca = component_of (comp_father, ia);
836 if (ca == bad)
837 continue;
839 comp = comps[ca];
840 if (!comp)
842 comp = XCNEW (struct component);
843 comp->refs.create (comp_size[ca]);
844 comps[ca] = comp;
847 dataref = XCNEW (struct dref_d);
848 dataref->ref = dr;
849 dataref->stmt = DR_STMT (dr);
850 dataref->offset = 0;
851 dataref->distance = 0;
853 dataref->always_accessed
854 = dominated_by_p (CDI_DOMINATORS, last_always_executed,
855 gimple_bb (dataref->stmt));
856 dataref->pos = comp->refs.length ();
857 comp->refs.quick_push (dataref);
860 for (i = 0; i < n; i++)
862 comp = comps[i];
863 if (comp)
865 comp->next = comp_list;
866 comp_list = comp;
869 free (comps);
871 end:
872 free (comp_father);
873 free (comp_size);
874 return comp_list;
877 /* Returns true if the component COMP satisfies the conditions
878 described in 2) at the beginning of this file. LOOP is the current
879 loop. */
881 static bool
882 suitable_component_p (struct loop *loop, struct component *comp)
884 unsigned i;
885 dref a, first;
886 basic_block ba, bp = loop->header;
887 bool ok, has_write = false;
889 FOR_EACH_VEC_ELT (comp->refs, i, a)
891 ba = gimple_bb (a->stmt);
893 if (!just_once_each_iteration_p (loop, ba))
894 return false;
896 gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
897 bp = ba;
899 if (DR_IS_WRITE (a->ref))
900 has_write = true;
903 first = comp->refs[0];
904 ok = suitable_reference_p (first->ref, &comp->comp_step);
905 gcc_assert (ok);
906 first->offset = 0;
908 for (i = 1; comp->refs.iterate (i, &a); i++)
910 if (!determine_offset (first->ref, a->ref, &a->offset))
911 return false;
913 #ifdef ENABLE_CHECKING
915 enum ref_step_type a_step;
916 ok = suitable_reference_p (a->ref, &a_step);
917 gcc_assert (ok && a_step == comp->comp_step);
919 #endif
922 /* If there is a write inside the component, we must know whether the
923 step is nonzero or not -- we would not otherwise be able to recognize
924 whether the value accessed by reads comes from the OFFSET-th iteration
925 or the previous one. */
926 if (has_write && comp->comp_step == RS_ANY)
927 return false;
929 return true;
932 /* Check the conditions on references inside each of components COMPS,
933 and remove the unsuitable components from the list. The new list
934 of components is returned. The conditions are described in 2) at
935 the beginning of this file. LOOP is the current loop. */
937 static struct component *
938 filter_suitable_components (struct loop *loop, struct component *comps)
940 struct component **comp, *act;
942 for (comp = &comps; *comp; )
944 act = *comp;
945 if (suitable_component_p (loop, act))
946 comp = &act->next;
947 else
949 dref ref;
950 unsigned i;
952 *comp = act->next;
953 FOR_EACH_VEC_ELT (act->refs, i, ref)
954 free (ref);
955 release_component (act);
959 return comps;
962 /* Compares two drefs A and B by their offset and position. Callback for
963 qsort. */
965 static int
966 order_drefs (const void *a, const void *b)
968 const dref *const da = (const dref *) a;
969 const dref *const db = (const dref *) b;
970 int offcmp = wi::cmps ((*da)->offset, (*db)->offset);
972 if (offcmp != 0)
973 return offcmp;
975 return (*da)->pos - (*db)->pos;
978 /* Returns root of the CHAIN. */
980 static inline dref
981 get_chain_root (chain_p chain)
983 return chain->refs[0];
986 /* Adds REF to the chain CHAIN. */
988 static void
989 add_ref_to_chain (chain_p chain, dref ref)
991 dref root = get_chain_root (chain);
993 gcc_assert (wi::les_p (root->offset, ref->offset));
994 widest_int dist = ref->offset - root->offset;
995 if (wi::leu_p (MAX_DISTANCE, dist))
997 free (ref);
998 return;
1000 gcc_assert (wi::fits_uhwi_p (dist));
1002 chain->refs.safe_push (ref);
1004 ref->distance = dist.to_uhwi ();
1006 if (ref->distance >= chain->length)
1008 chain->length = ref->distance;
1009 chain->has_max_use_after = false;
1012 if (ref->distance == chain->length
1013 && ref->pos > root->pos)
1014 chain->has_max_use_after = true;
1016 chain->all_always_accessed &= ref->always_accessed;
1019 /* Returns the chain for invariant component COMP. */
1021 static chain_p
1022 make_invariant_chain (struct component *comp)
1024 chain_p chain = XCNEW (struct chain);
1025 unsigned i;
1026 dref ref;
1028 chain->type = CT_INVARIANT;
1030 chain->all_always_accessed = true;
1032 FOR_EACH_VEC_ELT (comp->refs, i, ref)
1034 chain->refs.safe_push (ref);
1035 chain->all_always_accessed &= ref->always_accessed;
1038 return chain;
1041 /* Make a new chain rooted at REF. */
1043 static chain_p
1044 make_rooted_chain (dref ref)
1046 chain_p chain = XCNEW (struct chain);
1048 chain->type = DR_IS_READ (ref->ref) ? CT_LOAD : CT_STORE_LOAD;
1050 chain->refs.safe_push (ref);
1051 chain->all_always_accessed = ref->always_accessed;
1053 ref->distance = 0;
1055 return chain;
1058 /* Returns true if CHAIN is not trivial. */
1060 static bool
1061 nontrivial_chain_p (chain_p chain)
1063 return chain != NULL && chain->refs.length () > 1;
1066 /* Returns the ssa name that contains the value of REF, or NULL_TREE if there
1067 is no such name. */
1069 static tree
1070 name_for_ref (dref ref)
1072 tree name;
1074 if (is_gimple_assign (ref->stmt))
1076 if (!ref->ref || DR_IS_READ (ref->ref))
1077 name = gimple_assign_lhs (ref->stmt);
1078 else
1079 name = gimple_assign_rhs1 (ref->stmt);
1081 else
1082 name = PHI_RESULT (ref->stmt);
1084 return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
1087 /* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
1088 iterations of the innermost enclosing loop). */
1090 static bool
1091 valid_initializer_p (struct data_reference *ref,
1092 unsigned distance, struct data_reference *root)
1094 aff_tree diff, base, step;
1095 widest_int off;
1097 /* Both REF and ROOT must be accessing the same object. */
1098 if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
1099 return false;
1101 /* The initializer is defined outside of loop, hence its address must be
1102 invariant inside the loop. */
1103 gcc_assert (integer_zerop (DR_STEP (ref)));
1105 /* If the address of the reference is invariant, initializer must access
1106 exactly the same location. */
1107 if (integer_zerop (DR_STEP (root)))
1108 return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
1109 && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));
1111 /* Verify that this index of REF is equal to the root's index at
1112 -DISTANCE-th iteration. */
1113 aff_combination_dr_offset (root, &diff);
1114 aff_combination_dr_offset (ref, &base);
1115 aff_combination_scale (&base, -1);
1116 aff_combination_add (&diff, &base);
1118 tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
1119 &step, &name_expansions);
1120 if (!aff_combination_constant_multiple_p (&diff, &step, &off))
1121 return false;
1123 if (off != distance)
1124 return false;
1126 return true;
1129 /* Finds looparound phi node of LOOP that copies the value of REF, and if its
1130 initial value is correct (equal to initial value of REF shifted by one
1131 iteration), returns the phi node. Otherwise, NULL_TREE is returned. ROOT
1132 is the root of the current chain. */
1134 static gphi *
1135 find_looparound_phi (struct loop *loop, dref ref, dref root)
1137 tree name, init, init_ref;
1138 gphi *phi = NULL;
1139 gimple init_stmt;
1140 edge latch = loop_latch_edge (loop);
1141 struct data_reference init_dr;
1142 gphi_iterator psi;
1144 if (is_gimple_assign (ref->stmt))
1146 if (DR_IS_READ (ref->ref))
1147 name = gimple_assign_lhs (ref->stmt);
1148 else
1149 name = gimple_assign_rhs1 (ref->stmt);
1151 else
1152 name = PHI_RESULT (ref->stmt);
1153 if (!name)
1154 return NULL;
1156 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1158 phi = psi.phi ();
1159 if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
1160 break;
1163 if (gsi_end_p (psi))
1164 return NULL;
1166 init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
1167 if (TREE_CODE (init) != SSA_NAME)
1168 return NULL;
1169 init_stmt = SSA_NAME_DEF_STMT (init);
1170 if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
1171 return NULL;
1172 gcc_assert (gimple_assign_lhs (init_stmt) == init);
1174 init_ref = gimple_assign_rhs1 (init_stmt);
1175 if (!REFERENCE_CLASS_P (init_ref)
1176 && !DECL_P (init_ref))
1177 return NULL;
1179 /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
1180 loop enclosing PHI). */
1181 memset (&init_dr, 0, sizeof (struct data_reference));
1182 DR_REF (&init_dr) = init_ref;
1183 DR_STMT (&init_dr) = phi;
1184 if (!dr_analyze_innermost (&init_dr, loop))
1185 return NULL;
1187 if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
1188 return NULL;
1190 return phi;
1193 /* Adds a reference for the looparound copy of REF in PHI to CHAIN. */
1195 static void
1196 insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
1198 dref nw = XCNEW (struct dref_d), aref;
1199 unsigned i;
1201 nw->stmt = phi;
1202 nw->distance = ref->distance + 1;
1203 nw->always_accessed = 1;
1205 FOR_EACH_VEC_ELT (chain->refs, i, aref)
1206 if (aref->distance >= nw->distance)
1207 break;
1208 chain->refs.safe_insert (i, nw);
1210 if (nw->distance > chain->length)
1212 chain->length = nw->distance;
1213 chain->has_max_use_after = false;
1217 /* For references in CHAIN that are copied around the LOOP (created previously
1218 by PRE, or by user), add the results of such copies to the chain. This
1219 enables us to remove the copies by unrolling, and may need less registers
1220 (also, it may allow us to combine chains together). */
1222 static void
1223 add_looparound_copies (struct loop *loop, chain_p chain)
1225 unsigned i;
1226 dref ref, root = get_chain_root (chain);
1227 gphi *phi;
1229 FOR_EACH_VEC_ELT (chain->refs, i, ref)
1231 phi = find_looparound_phi (loop, ref, root);
1232 if (!phi)
1233 continue;
1235 bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
1236 insert_looparound_copy (chain, ref, phi);
1240 /* Find roots of the values and determine distances in the component COMP.
1241 The references are redistributed into CHAINS. LOOP is the current
1242 loop. */
1244 static void
1245 determine_roots_comp (struct loop *loop,
1246 struct component *comp,
1247 vec<chain_p> *chains)
1249 unsigned i;
1250 dref a;
1251 chain_p chain = NULL;
1252 widest_int last_ofs = 0;
1254 /* Invariants are handled specially. */
1255 if (comp->comp_step == RS_INVARIANT)
1257 chain = make_invariant_chain (comp);
1258 chains->safe_push (chain);
1259 return;
1262 comp->refs.qsort (order_drefs);
1264 FOR_EACH_VEC_ELT (comp->refs, i, a)
1266 if (!chain || DR_IS_WRITE (a->ref)
1267 || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
1269 if (nontrivial_chain_p (chain))
1271 add_looparound_copies (loop, chain);
1272 chains->safe_push (chain);
1274 else
1275 release_chain (chain);
1276 chain = make_rooted_chain (a);
1277 last_ofs = a->offset;
1278 continue;
1281 add_ref_to_chain (chain, a);
1284 if (nontrivial_chain_p (chain))
1286 add_looparound_copies (loop, chain);
1287 chains->safe_push (chain);
1289 else
1290 release_chain (chain);
1293 /* Find roots of the values and determine distances in components COMPS, and
1294 separates the references to CHAINS. LOOP is the current loop. */
1296 static void
1297 determine_roots (struct loop *loop,
1298 struct component *comps, vec<chain_p> *chains)
1300 struct component *comp;
1302 for (comp = comps; comp; comp = comp->next)
1303 determine_roots_comp (loop, comp, chains);
1306 /* Replace the reference in statement STMT with temporary variable
1307 NEW_TREE. If SET is true, NEW_TREE is instead initialized to the value of
1308 the reference in the statement. IN_LHS is true if the reference
1309 is in the lhs of STMT, false if it is in rhs. */
1311 static void
1312 replace_ref_with (gimple stmt, tree new_tree, bool set, bool in_lhs)
1314 tree val;
1315 gassign *new_stmt;
1316 gimple_stmt_iterator bsi, psi;
1318 if (gimple_code (stmt) == GIMPLE_PHI)
1320 gcc_assert (!in_lhs && !set);
1322 val = PHI_RESULT (stmt);
1323 bsi = gsi_after_labels (gimple_bb (stmt));
1324 psi = gsi_for_stmt (stmt);
1325 remove_phi_node (&psi, false);
1327 /* Turn the phi node into GIMPLE_ASSIGN. */
1328 new_stmt = gimple_build_assign (val, new_tree);
1329 gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
1330 return;
1333 /* Since the reference is of gimple_reg type, it should only
1334 appear as lhs or rhs of modify statement. */
1335 gcc_assert (is_gimple_assign (stmt));
1337 bsi = gsi_for_stmt (stmt);
1339 /* If we do not need to initialize NEW_TREE, just replace the use of OLD. */
1340 if (!set)
1342 gcc_assert (!in_lhs);
1343 gimple_assign_set_rhs_from_tree (&bsi, new_tree);
1344 stmt = gsi_stmt (bsi);
1345 update_stmt (stmt);
1346 return;
1349 if (in_lhs)
1351 /* We have statement
1353 OLD = VAL
1355 If OLD is a memory reference, then VAL is gimple_val, and we transform
1356 this to
1358 OLD = VAL
1359 NEW = VAL
1361 Otherwise, we are replacing a combination chain,
1362 VAL is the expression that performs the combination, and OLD is an
1363 SSA name. In this case, we transform the assignment to
1365 OLD = VAL
1366 NEW = OLD
1370 val = gimple_assign_lhs (stmt);
1371 if (TREE_CODE (val) != SSA_NAME)
1373 val = gimple_assign_rhs1 (stmt);
1374 gcc_assert (gimple_assign_single_p (stmt));
1375 if (TREE_CLOBBER_P (val))
1376 val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
1377 else
1378 gcc_assert (gimple_assign_copy_p (stmt));
1381 else
1383 /* VAL = OLD
1385 is transformed to
1387 VAL = OLD
1388 NEW = VAL */
1390 val = gimple_assign_lhs (stmt);
1393 new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
1394 gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
1397 /* Returns a memory reference to DR in the ITER-th iteration of
1398 the loop it was analyzed in. Append init stmts to STMTS. */
1400 static tree
1401 ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
1403 tree off = DR_OFFSET (dr);
1404 tree coff = DR_INIT (dr);
1405 if (iter == 0)
1407 else if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST)
1408 coff = size_binop (PLUS_EXPR, coff,
1409 size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
1410 else
1411 off = size_binop (PLUS_EXPR, off,
1412 size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
1413 tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
1414 addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
1415 is_gimple_mem_ref_addr, NULL_TREE);
1416 tree alias_ptr = fold_convert (reference_alias_ptr_type (DR_REF (dr)), coff);
1417 /* While data-ref analysis punts on bit offsets it still handles
1418 bitfield accesses at byte boundaries. Cope with that. Note that
1419 we cannot simply re-apply the outer COMPONENT_REF because the
1420 byte-granular portion of it is already applied via DR_INIT and
1421 DR_OFFSET, so simply build a BIT_FIELD_REF knowing that the bits
1422 start at offset zero. */
1423 if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
1424 && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
1426 tree field = TREE_OPERAND (DR_REF (dr), 1);
1427 return build3 (BIT_FIELD_REF, TREE_TYPE (DR_REF (dr)),
1428 build2 (MEM_REF, DECL_BIT_FIELD_TYPE (field),
1429 addr, alias_ptr),
1430 DECL_SIZE (field), bitsize_zero_node);
1432 else
1433 return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr, alias_ptr);
1436 /* Get the initialization expression for the INDEX-th temporary variable
1437 of CHAIN. */
1439 static tree
1440 get_init_expr (chain_p chain, unsigned index)
1442 if (chain->type == CT_COMBINATION)
1444 tree e1 = get_init_expr (chain->ch1, index);
1445 tree e2 = get_init_expr (chain->ch2, index);
1447 return fold_build2 (chain->op, chain->rslt_type, e1, e2);
1449 else
1450 return chain->inits[index];
1453 /* Returns a new temporary variable used for the I-th variable carrying
1454 value of REF. The variable's uid is marked in TMP_VARS. */
1456 static tree
1457 predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
1459 tree type = TREE_TYPE (ref);
1460 /* We never access the components of the temporary variable in predictive
1461 commoning. */
1462 tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
1463 bitmap_set_bit (tmp_vars, DECL_UID (var));
1464 return var;
1467 /* Creates the variables for CHAIN, as well as phi nodes for them and
1468 initialization on entry to LOOP. Uids of the newly created
1469 temporary variables are marked in TMP_VARS. */
1471 static void
1472 initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
1474 unsigned i;
1475 unsigned n = chain->length;
1476 dref root = get_chain_root (chain);
1477 bool reuse_first = !chain->has_max_use_after;
1478 tree ref, init, var, next;
1479 gphi *phi;
1480 gimple_seq stmts;
1481 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1483 /* If N == 0, then all the references are within the single iteration. And
1484 since this is an nonempty chain, reuse_first cannot be true. */
1485 gcc_assert (n > 0 || !reuse_first);
1487 chain->vars.create (n + 1);
1489 if (chain->type == CT_COMBINATION)
1490 ref = gimple_assign_lhs (root->stmt);
1491 else
1492 ref = DR_REF (root->ref);
1494 for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
1496 var = predcom_tmp_var (ref, i, tmp_vars);
1497 chain->vars.quick_push (var);
1499 if (reuse_first)
1500 chain->vars.quick_push (chain->vars[0]);
1502 FOR_EACH_VEC_ELT (chain->vars, i, var)
1503 chain->vars[i] = make_ssa_name (var);
1505 for (i = 0; i < n; i++)
1507 var = chain->vars[i];
1508 next = chain->vars[i + 1];
1509 init = get_init_expr (chain, i);
1511 init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1512 if (stmts)
1513 gsi_insert_seq_on_edge_immediate (entry, stmts);
1515 phi = create_phi_node (var, loop->header);
1516 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1517 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1521 /* Create the variables and initialization statement for root of chain
1522 CHAIN. Uids of the newly created temporary variables are marked
1523 in TMP_VARS. */
1525 static void
1526 initialize_root (struct loop *loop, chain_p chain, bitmap tmp_vars)
1528 dref root = get_chain_root (chain);
1529 bool in_lhs = (chain->type == CT_STORE_LOAD
1530 || chain->type == CT_COMBINATION);
1532 initialize_root_vars (loop, chain, tmp_vars);
1533 replace_ref_with (root->stmt,
1534 chain->vars[chain->length],
1535 true, in_lhs);
1538 /* Initializes a variable for load motion for ROOT and prepares phi nodes and
1539 initialization on entry to LOOP if necessary. The ssa name for the variable
1540 is stored in VARS. If WRITTEN is true, also a phi node to copy its value
1541 around the loop is created. Uid of the newly created temporary variable
1542 is marked in TMP_VARS. INITS is the list containing the (single)
1543 initializer. */
1545 static void
1546 initialize_root_vars_lm (struct loop *loop, dref root, bool written,
1547 vec<tree> *vars, vec<tree> inits,
1548 bitmap tmp_vars)
1550 unsigned i;
1551 tree ref = DR_REF (root->ref), init, var, next;
1552 gimple_seq stmts;
1553 gphi *phi;
1554 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1556 /* Find the initializer for the variable, and check that it cannot
1557 trap. */
1558 init = inits[0];
1560 vars->create (written ? 2 : 1);
1561 var = predcom_tmp_var (ref, 0, tmp_vars);
1562 vars->quick_push (var);
1563 if (written)
1564 vars->quick_push ((*vars)[0]);
1566 FOR_EACH_VEC_ELT (*vars, i, var)
1567 (*vars)[i] = make_ssa_name (var);
1569 var = (*vars)[0];
1571 init = force_gimple_operand (init, &stmts, written, NULL_TREE);
1572 if (stmts)
1573 gsi_insert_seq_on_edge_immediate (entry, stmts);
1575 if (written)
1577 next = (*vars)[1];
1578 phi = create_phi_node (var, loop->header);
1579 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1580 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1582 else
1584 gassign *init_stmt = gimple_build_assign (var, init);
1585 gsi_insert_on_edge_immediate (entry, init_stmt);
1590 /* Execute load motion for references in chain CHAIN. Uids of the newly
1591 created temporary variables are marked in TMP_VARS. */
1593 static void
1594 execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
1596 auto_vec<tree> vars;
1597 dref a;
1598 unsigned n_writes = 0, ridx, i;
1599 tree var;
1601 gcc_assert (chain->type == CT_INVARIANT);
1602 gcc_assert (!chain->combined);
1603 FOR_EACH_VEC_ELT (chain->refs, i, a)
1604 if (DR_IS_WRITE (a->ref))
1605 n_writes++;
1607 /* If there are no reads in the loop, there is nothing to do. */
1608 if (n_writes == chain->refs.length ())
1609 return;
1611 initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
1612 &vars, chain->inits, tmp_vars);
1614 ridx = 0;
1615 FOR_EACH_VEC_ELT (chain->refs, i, a)
1617 bool is_read = DR_IS_READ (a->ref);
1619 if (DR_IS_WRITE (a->ref))
1621 n_writes--;
1622 if (n_writes)
1624 var = vars[0];
1625 var = make_ssa_name (SSA_NAME_VAR (var));
1626 vars[0] = var;
1628 else
1629 ridx = 1;
1632 replace_ref_with (a->stmt, vars[ridx],
1633 !is_read, !is_read);
1637 /* Returns the single statement in that NAME is used, excepting
1638 the looparound phi nodes contained in one of the chains. If there is no
1639 such statement, or more statements, NULL is returned. */
1641 static gimple
1642 single_nonlooparound_use (tree name)
1644 use_operand_p use;
1645 imm_use_iterator it;
1646 gimple stmt, ret = NULL;
1648 FOR_EACH_IMM_USE_FAST (use, it, name)
1650 stmt = USE_STMT (use);
1652 if (gimple_code (stmt) == GIMPLE_PHI)
1654 /* Ignore uses in looparound phi nodes. Uses in other phi nodes
1655 could not be processed anyway, so just fail for them. */
1656 if (bitmap_bit_p (looparound_phis,
1657 SSA_NAME_VERSION (PHI_RESULT (stmt))))
1658 continue;
1660 return NULL;
1662 else if (is_gimple_debug (stmt))
1663 continue;
1664 else if (ret != NULL)
1665 return NULL;
1666 else
1667 ret = stmt;
1670 return ret;
1673 /* Remove statement STMT, as well as the chain of assignments in that it is
1674 used. */
1676 static void
1677 remove_stmt (gimple stmt)
1679 tree name;
1680 gimple next;
1681 gimple_stmt_iterator psi;
1683 if (gimple_code (stmt) == GIMPLE_PHI)
1685 name = PHI_RESULT (stmt);
1686 next = single_nonlooparound_use (name);
1687 reset_debug_uses (stmt);
1688 psi = gsi_for_stmt (stmt);
1689 remove_phi_node (&psi, true);
1691 if (!next
1692 || !gimple_assign_ssa_name_copy_p (next)
1693 || gimple_assign_rhs1 (next) != name)
1694 return;
1696 stmt = next;
1699 while (1)
1701 gimple_stmt_iterator bsi;
1703 bsi = gsi_for_stmt (stmt);
1705 name = gimple_assign_lhs (stmt);
1706 gcc_assert (TREE_CODE (name) == SSA_NAME);
1708 next = single_nonlooparound_use (name);
1709 reset_debug_uses (stmt);
1711 unlink_stmt_vdef (stmt);
1712 gsi_remove (&bsi, true);
1713 release_defs (stmt);
1715 if (!next
1716 || !gimple_assign_ssa_name_copy_p (next)
1717 || gimple_assign_rhs1 (next) != name)
1718 return;
1720 stmt = next;
1724 /* Perform the predictive commoning optimization for a chain CHAIN.
1725 Uids of the newly created temporary variables are marked in TMP_VARS.*/
1727 static void
1728 execute_pred_commoning_chain (struct loop *loop, chain_p chain,
1729 bitmap tmp_vars)
1731 unsigned i;
1732 dref a;
1733 tree var;
1735 if (chain->combined)
1737 /* For combined chains, just remove the statements that are used to
1738 compute the values of the expression (except for the root one).
1739 We delay this until after all chains are processed. */
1741 else
1743 /* For non-combined chains, set up the variables that hold its value,
1744 and replace the uses of the original references by these
1745 variables. */
1746 initialize_root (loop, chain, tmp_vars);
1747 for (i = 1; chain->refs.iterate (i, &a); i++)
1749 var = chain->vars[chain->length - a->distance];
1750 replace_ref_with (a->stmt, var, false, false);
1755 /* Determines the unroll factor necessary to remove as many temporary variable
1756 copies as possible. CHAINS is the list of chains that will be
1757 optimized. */
1759 static unsigned
1760 determine_unroll_factor (vec<chain_p> chains)
1762 chain_p chain;
1763 unsigned factor = 1, af, nfactor, i;
1764 unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
1766 FOR_EACH_VEC_ELT (chains, i, chain)
1768 if (chain->type == CT_INVARIANT)
1769 continue;
1771 if (chain->combined)
1773 /* For combined chains, we can't handle unrolling if we replace
1774 looparound PHIs. */
1775 dref a;
1776 unsigned j;
1777 for (j = 1; chain->refs.iterate (j, &a); j++)
1778 if (gimple_code (a->stmt) == GIMPLE_PHI)
1779 return 1;
1780 continue;
1783 /* The best unroll factor for this chain is equal to the number of
1784 temporary variables that we create for it. */
1785 af = chain->length;
1786 if (chain->has_max_use_after)
1787 af++;
1789 nfactor = factor * af / gcd (factor, af);
1790 if (nfactor <= max)
1791 factor = nfactor;
1794 return factor;
1797 /* Perform the predictive commoning optimization for CHAINS.
1798 Uids of the newly created temporary variables are marked in TMP_VARS. */
1800 static void
1801 execute_pred_commoning (struct loop *loop, vec<chain_p> chains,
1802 bitmap tmp_vars)
1804 chain_p chain;
1805 unsigned i;
1807 FOR_EACH_VEC_ELT (chains, i, chain)
1809 if (chain->type == CT_INVARIANT)
1810 execute_load_motion (loop, chain, tmp_vars);
1811 else
1812 execute_pred_commoning_chain (loop, chain, tmp_vars);
1815 FOR_EACH_VEC_ELT (chains, i, chain)
1817 if (chain->type == CT_INVARIANT)
1819 else if (chain->combined)
1821 /* For combined chains, just remove the statements that are used to
1822 compute the values of the expression (except for the root one). */
1823 dref a;
1824 unsigned j;
1825 for (j = 1; chain->refs.iterate (j, &a); j++)
1826 remove_stmt (a->stmt);
1830 update_ssa (TODO_update_ssa_only_virtuals);
1833 /* For each reference in CHAINS, if its defining statement is
1834 phi node, record the ssa name that is defined by it. */
1836 static void
1837 replace_phis_by_defined_names (vec<chain_p> chains)
1839 chain_p chain;
1840 dref a;
1841 unsigned i, j;
1843 FOR_EACH_VEC_ELT (chains, i, chain)
1844 FOR_EACH_VEC_ELT (chain->refs, j, a)
1846 if (gimple_code (a->stmt) == GIMPLE_PHI)
1848 a->name_defined_by_phi = PHI_RESULT (a->stmt);
1849 a->stmt = NULL;
1854 /* For each reference in CHAINS, if name_defined_by_phi is not
1855 NULL, use it to set the stmt field. */
1857 static void
1858 replace_names_by_phis (vec<chain_p> chains)
1860 chain_p chain;
1861 dref a;
1862 unsigned i, j;
1864 FOR_EACH_VEC_ELT (chains, i, chain)
1865 FOR_EACH_VEC_ELT (chain->refs, j, a)
1866 if (a->stmt == NULL)
1868 a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
1869 gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
1870 a->name_defined_by_phi = NULL_TREE;
1874 /* Wrapper over execute_pred_commoning, to pass it as a callback
1875 to tree_transform_and_unroll_loop. */
1877 struct epcc_data
1879 vec<chain_p> chains;
1880 bitmap tmp_vars;
1883 static void
1884 execute_pred_commoning_cbck (struct loop *loop, void *data)
1886 struct epcc_data *const dta = (struct epcc_data *) data;
1888 /* Restore phi nodes that were replaced by ssa names before
1889 tree_transform_and_unroll_loop (see detailed description in
1890 tree_predictive_commoning_loop). */
1891 replace_names_by_phis (dta->chains);
1892 execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
1895 /* Base NAME and all the names in the chain of phi nodes that use it
1896 on variable VAR. The phi nodes are recognized by being in the copies of
1897 the header of the LOOP. */
1899 static void
1900 base_names_in_chain_on (struct loop *loop, tree name, tree var)
1902 gimple stmt, phi;
1903 imm_use_iterator iter;
1905 replace_ssa_name_symbol (name, var);
1907 while (1)
1909 phi = NULL;
1910 FOR_EACH_IMM_USE_STMT (stmt, iter, name)
1912 if (gimple_code (stmt) == GIMPLE_PHI
1913 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
1915 phi = stmt;
1916 BREAK_FROM_IMM_USE_STMT (iter);
1919 if (!phi)
1920 return;
1922 name = PHI_RESULT (phi);
1923 replace_ssa_name_symbol (name, var);
1927 /* Given an unrolled LOOP after predictive commoning, remove the
1928 register copies arising from phi nodes by changing the base
1929 variables of SSA names. TMP_VARS is the set of the temporary variables
1930 for those we want to perform this. */
1932 static void
1933 eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
1935 edge e;
1936 gphi *phi;
1937 gimple stmt;
1938 tree name, use, var;
1939 gphi_iterator psi;
1941 e = loop_latch_edge (loop);
1942 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1944 phi = psi.phi ();
1945 name = PHI_RESULT (phi);
1946 var = SSA_NAME_VAR (name);
1947 if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
1948 continue;
1949 use = PHI_ARG_DEF_FROM_EDGE (phi, e);
1950 gcc_assert (TREE_CODE (use) == SSA_NAME);
1952 /* Base all the ssa names in the ud and du chain of NAME on VAR. */
1953 stmt = SSA_NAME_DEF_STMT (use);
1954 while (gimple_code (stmt) == GIMPLE_PHI
1955 /* In case we could not unroll the loop enough to eliminate
1956 all copies, we may reach the loop header before the defining
1957 statement (in that case, some register copies will be present
1958 in loop latch in the final code, corresponding to the newly
1959 created looparound phi nodes). */
1960 && gimple_bb (stmt) != loop->header)
1962 gcc_assert (single_pred_p (gimple_bb (stmt)));
1963 use = PHI_ARG_DEF (stmt, 0);
1964 stmt = SSA_NAME_DEF_STMT (use);
1967 base_names_in_chain_on (loop, use, var);
1971 /* Returns true if CHAIN is suitable to be combined. */
1973 static bool
1974 chain_can_be_combined_p (chain_p chain)
1976 return (!chain->combined
1977 && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
1980 /* Returns the modify statement that uses NAME. Skips over assignment
1981 statements, NAME is replaced with the actual name used in the returned
1982 statement. */
1984 static gimple
1985 find_use_stmt (tree *name)
1987 gimple stmt;
1988 tree rhs, lhs;
1990 /* Skip over assignments. */
1991 while (1)
1993 stmt = single_nonlooparound_use (*name);
1994 if (!stmt)
1995 return NULL;
1997 if (gimple_code (stmt) != GIMPLE_ASSIGN)
1998 return NULL;
2000 lhs = gimple_assign_lhs (stmt);
2001 if (TREE_CODE (lhs) != SSA_NAME)
2002 return NULL;
2004 if (gimple_assign_copy_p (stmt))
2006 rhs = gimple_assign_rhs1 (stmt);
2007 if (rhs != *name)
2008 return NULL;
2010 *name = lhs;
2012 else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
2013 == GIMPLE_BINARY_RHS)
2014 return stmt;
2015 else
2016 return NULL;
2020 /* Returns true if we may perform reassociation for operation CODE in TYPE. */
2022 static bool
2023 may_reassociate_p (tree type, enum tree_code code)
2025 if (FLOAT_TYPE_P (type)
2026 && !flag_unsafe_math_optimizations)
2027 return false;
2029 return (commutative_tree_code (code)
2030 && associative_tree_code (code));
2033 /* If the operation used in STMT is associative and commutative, go through the
2034 tree of the same operations and returns its root. Distance to the root
2035 is stored in DISTANCE. */
2037 static gimple
2038 find_associative_operation_root (gimple stmt, unsigned *distance)
2040 tree lhs;
2041 gimple next;
2042 enum tree_code code = gimple_assign_rhs_code (stmt);
2043 tree type = TREE_TYPE (gimple_assign_lhs (stmt));
2044 unsigned dist = 0;
2046 if (!may_reassociate_p (type, code))
2047 return NULL;
2049 while (1)
2051 lhs = gimple_assign_lhs (stmt);
2052 gcc_assert (TREE_CODE (lhs) == SSA_NAME);
2054 next = find_use_stmt (&lhs);
2055 if (!next
2056 || gimple_assign_rhs_code (next) != code)
2057 break;
2059 stmt = next;
2060 dist++;
2063 if (distance)
2064 *distance = dist;
2065 return stmt;
2068 /* Returns the common statement in that NAME1 and NAME2 have a use. If there
2069 is no such statement, returns NULL_TREE. In case the operation used on
2070 NAME1 and NAME2 is associative and commutative, returns the root of the
2071 tree formed by this operation instead of the statement that uses NAME1 or
2072 NAME2. */
2074 static gimple
2075 find_common_use_stmt (tree *name1, tree *name2)
2077 gimple stmt1, stmt2;
2079 stmt1 = find_use_stmt (name1);
2080 if (!stmt1)
2081 return NULL;
2083 stmt2 = find_use_stmt (name2);
2084 if (!stmt2)
2085 return NULL;
2087 if (stmt1 == stmt2)
2088 return stmt1;
2090 stmt1 = find_associative_operation_root (stmt1, NULL);
2091 if (!stmt1)
2092 return NULL;
2093 stmt2 = find_associative_operation_root (stmt2, NULL);
2094 if (!stmt2)
2095 return NULL;
2097 return (stmt1 == stmt2 ? stmt1 : NULL);
2100 /* Checks whether R1 and R2 are combined together using CODE, with the result
2101 in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
2102 if it is true. If CODE is ERROR_MARK, set these values instead. */
2104 static bool
2105 combinable_refs_p (dref r1, dref r2,
2106 enum tree_code *code, bool *swap, tree *rslt_type)
2108 enum tree_code acode;
2109 bool aswap;
2110 tree atype;
2111 tree name1, name2;
2112 gimple stmt;
2114 name1 = name_for_ref (r1);
2115 name2 = name_for_ref (r2);
2116 gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);
2118 stmt = find_common_use_stmt (&name1, &name2);
2120 if (!stmt
2121 /* A simple post-dominance check - make sure the combination
2122 is executed under the same condition as the references. */
2123 || (gimple_bb (stmt) != gimple_bb (r1->stmt)
2124 && gimple_bb (stmt) != gimple_bb (r2->stmt)))
2125 return false;
2127 acode = gimple_assign_rhs_code (stmt);
2128 aswap = (!commutative_tree_code (acode)
2129 && gimple_assign_rhs1 (stmt) != name1);
2130 atype = TREE_TYPE (gimple_assign_lhs (stmt));
2132 if (*code == ERROR_MARK)
2134 *code = acode;
2135 *swap = aswap;
2136 *rslt_type = atype;
2137 return true;
2140 return (*code == acode
2141 && *swap == aswap
2142 && *rslt_type == atype);
2145 /* Remove OP from the operation on rhs of STMT, and replace STMT with
2146 an assignment of the remaining operand. */
2148 static void
2149 remove_name_from_operation (gimple stmt, tree op)
2151 tree other_op;
2152 gimple_stmt_iterator si;
2154 gcc_assert (is_gimple_assign (stmt));
2156 if (gimple_assign_rhs1 (stmt) == op)
2157 other_op = gimple_assign_rhs2 (stmt);
2158 else
2159 other_op = gimple_assign_rhs1 (stmt);
2161 si = gsi_for_stmt (stmt);
2162 gimple_assign_set_rhs_from_tree (&si, other_op);
2164 /* We should not have reallocated STMT. */
2165 gcc_assert (gsi_stmt (si) == stmt);
2167 update_stmt (stmt);
2170 /* Reassociates the expression in that NAME1 and NAME2 are used so that they
2171 are combined in a single statement, and returns this statement. */
2173 static gimple
2174 reassociate_to_the_same_stmt (tree name1, tree name2)
2176 gimple stmt1, stmt2, root1, root2, s1, s2;
2177 gassign *new_stmt, *tmp_stmt;
2178 tree new_name, tmp_name, var, r1, r2;
2179 unsigned dist1, dist2;
2180 enum tree_code code;
2181 tree type = TREE_TYPE (name1);
2182 gimple_stmt_iterator bsi;
2184 stmt1 = find_use_stmt (&name1);
2185 stmt2 = find_use_stmt (&name2);
2186 root1 = find_associative_operation_root (stmt1, &dist1);
2187 root2 = find_associative_operation_root (stmt2, &dist2);
2188 code = gimple_assign_rhs_code (stmt1);
2190 gcc_assert (root1 && root2 && root1 == root2
2191 && code == gimple_assign_rhs_code (stmt2));
2193 /* Find the root of the nearest expression in that both NAME1 and NAME2
2194 are used. */
2195 r1 = name1;
2196 s1 = stmt1;
2197 r2 = name2;
2198 s2 = stmt2;
2200 while (dist1 > dist2)
2202 s1 = find_use_stmt (&r1);
2203 r1 = gimple_assign_lhs (s1);
2204 dist1--;
2206 while (dist2 > dist1)
2208 s2 = find_use_stmt (&r2);
2209 r2 = gimple_assign_lhs (s2);
2210 dist2--;
2213 while (s1 != s2)
2215 s1 = find_use_stmt (&r1);
2216 r1 = gimple_assign_lhs (s1);
2217 s2 = find_use_stmt (&r2);
2218 r2 = gimple_assign_lhs (s2);
2221 /* Remove NAME1 and NAME2 from the statements in that they are used
2222 currently. */
2223 remove_name_from_operation (stmt1, name1);
2224 remove_name_from_operation (stmt2, name2);
2226 /* Insert the new statement combining NAME1 and NAME2 before S1, and
2227 combine it with the rhs of S1. */
2228 var = create_tmp_reg (type, "predreastmp");
2229 new_name = make_ssa_name (var);
2230 new_stmt = gimple_build_assign (new_name, code, name1, name2);
2232 var = create_tmp_reg (type, "predreastmp");
2233 tmp_name = make_ssa_name (var);
2235 /* Rhs of S1 may now be either a binary expression with operation
2236 CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
2237 so that name1 or name2 was removed from it). */
2238 tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
2239 gimple_assign_rhs1 (s1),
2240 gimple_assign_rhs2 (s1));
2242 bsi = gsi_for_stmt (s1);
2243 gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
2244 s1 = gsi_stmt (bsi);
2245 update_stmt (s1);
2247 gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
2248 gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);
2250 return new_stmt;
2253 /* Returns the statement that combines references R1 and R2. In case R1
2254 and R2 are not used in the same statement, but they are used with an
2255 associative and commutative operation in the same expression, reassociate
2256 the expression so that they are used in the same statement. */
2258 static gimple
2259 stmt_combining_refs (dref r1, dref r2)
2261 gimple stmt1, stmt2;
2262 tree name1 = name_for_ref (r1);
2263 tree name2 = name_for_ref (r2);
2265 stmt1 = find_use_stmt (&name1);
2266 stmt2 = find_use_stmt (&name2);
2267 if (stmt1 == stmt2)
2268 return stmt1;
2270 return reassociate_to_the_same_stmt (name1, name2);
2273 /* Tries to combine chains CH1 and CH2 together. If this succeeds, the
2274 description of the new chain is returned, otherwise we return NULL. */
2276 static chain_p
2277 combine_chains (chain_p ch1, chain_p ch2)
2279 dref r1, r2, nw;
2280 enum tree_code op = ERROR_MARK;
2281 bool swap = false;
2282 chain_p new_chain;
2283 unsigned i;
2284 gimple root_stmt;
2285 tree rslt_type = NULL_TREE;
2287 if (ch1 == ch2)
2288 return NULL;
2289 if (ch1->length != ch2->length)
2290 return NULL;
2292 if (ch1->refs.length () != ch2->refs.length ())
2293 return NULL;
2295 for (i = 0; (ch1->refs.iterate (i, &r1)
2296 && ch2->refs.iterate (i, &r2)); i++)
2298 if (r1->distance != r2->distance)
2299 return NULL;
2301 if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
2302 return NULL;
2305 if (swap)
2307 chain_p tmp = ch1;
2308 ch1 = ch2;
2309 ch2 = tmp;
2312 new_chain = XCNEW (struct chain);
2313 new_chain->type = CT_COMBINATION;
2314 new_chain->op = op;
2315 new_chain->ch1 = ch1;
2316 new_chain->ch2 = ch2;
2317 new_chain->rslt_type = rslt_type;
2318 new_chain->length = ch1->length;
2320 for (i = 0; (ch1->refs.iterate (i, &r1)
2321 && ch2->refs.iterate (i, &r2)); i++)
2323 nw = XCNEW (struct dref_d);
2324 nw->stmt = stmt_combining_refs (r1, r2);
2325 nw->distance = r1->distance;
2327 new_chain->refs.safe_push (nw);
2330 new_chain->has_max_use_after = false;
2331 root_stmt = get_chain_root (new_chain)->stmt;
2332 for (i = 1; new_chain->refs.iterate (i, &nw); i++)
2334 if (nw->distance == new_chain->length
2335 && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
2337 new_chain->has_max_use_after = true;
2338 break;
2342 ch1->combined = true;
2343 ch2->combined = true;
2344 return new_chain;
2347 /* Try to combine the CHAINS. */
2349 static void
2350 try_combine_chains (vec<chain_p> *chains)
2352 unsigned i, j;
2353 chain_p ch1, ch2, cch;
2354 auto_vec<chain_p> worklist;
2356 FOR_EACH_VEC_ELT (*chains, i, ch1)
2357 if (chain_can_be_combined_p (ch1))
2358 worklist.safe_push (ch1);
2360 while (!worklist.is_empty ())
2362 ch1 = worklist.pop ();
2363 if (!chain_can_be_combined_p (ch1))
2364 continue;
2366 FOR_EACH_VEC_ELT (*chains, j, ch2)
2368 if (!chain_can_be_combined_p (ch2))
2369 continue;
2371 cch = combine_chains (ch1, ch2);
2372 if (cch)
2374 worklist.safe_push (cch);
2375 chains->safe_push (cch);
2376 break;
2382 /* Prepare initializers for CHAIN in LOOP. Returns false if this is
2383 impossible because one of these initializers may trap, true otherwise. */
2385 static bool
2386 prepare_initializers_chain (struct loop *loop, chain_p chain)
2388 unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
2389 struct data_reference *dr = get_chain_root (chain)->ref;
2390 tree init;
2391 dref laref;
2392 edge entry = loop_preheader_edge (loop);
2394 /* Find the initializers for the variables, and check that they cannot
2395 trap. */
2396 chain->inits.create (n);
2397 for (i = 0; i < n; i++)
2398 chain->inits.quick_push (NULL_TREE);
2400 /* If we have replaced some looparound phi nodes, use their initializers
2401 instead of creating our own. */
2402 FOR_EACH_VEC_ELT (chain->refs, i, laref)
2404 if (gimple_code (laref->stmt) != GIMPLE_PHI)
2405 continue;
2407 gcc_assert (laref->distance > 0);
2408 chain->inits[n - laref->distance]
2409 = PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
2412 for (i = 0; i < n; i++)
2414 gimple_seq stmts = NULL;
2416 if (chain->inits[i] != NULL_TREE)
2417 continue;
2419 init = ref_at_iteration (dr, (int) i - n, &stmts);
2420 if (!chain->all_always_accessed && tree_could_trap_p (init))
2422 gimple_seq_discard (stmts);
2423 return false;
2426 if (stmts)
2427 gsi_insert_seq_on_edge_immediate (entry, stmts);
2429 chain->inits[i] = init;
2432 return true;
2435 /* Prepare initializers for CHAINS in LOOP, and free chains that cannot
2436 be used because the initializers might trap. */
2438 static void
2439 prepare_initializers (struct loop *loop, vec<chain_p> chains)
2441 chain_p chain;
2442 unsigned i;
2444 for (i = 0; i < chains.length (); )
2446 chain = chains[i];
2447 if (prepare_initializers_chain (loop, chain))
2448 i++;
2449 else
2451 release_chain (chain);
2452 chains.unordered_remove (i);
2457 /* Performs predictive commoning for LOOP. Returns true if LOOP was
2458 unrolled. */
2460 static bool
2461 tree_predictive_commoning_loop (struct loop *loop)
2463 vec<data_reference_p> datarefs;
2464 vec<ddr_p> dependences;
2465 struct component *components;
2466 vec<chain_p> chains = vNULL;
2467 unsigned unroll_factor;
2468 struct tree_niter_desc desc;
2469 bool unroll = false;
2470 edge exit;
2471 bitmap tmp_vars;
2473 if (dump_file && (dump_flags & TDF_DETAILS))
2474 fprintf (dump_file, "Processing loop %d\n", loop->num);
2476 /* Find the data references and split them into components according to their
2477 dependence relations. */
2478 auto_vec<loop_p, 3> loop_nest;
2479 dependences.create (10);
2480 datarefs.create (10);
2481 if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
2482 &dependences))
2484 if (dump_file && (dump_flags & TDF_DETAILS))
2485 fprintf (dump_file, "Cannot analyze data dependencies\n");
2486 free_data_refs (datarefs);
2487 free_dependence_relations (dependences);
2488 return false;
2491 if (dump_file && (dump_flags & TDF_DETAILS))
2492 dump_data_dependence_relations (dump_file, dependences);
2494 components = split_data_refs_to_components (loop, datarefs, dependences);
2495 loop_nest.release ();
2496 free_dependence_relations (dependences);
2497 if (!components)
2499 free_data_refs (datarefs);
2500 free_affine_expand_cache (&name_expansions);
2501 return false;
2504 if (dump_file && (dump_flags & TDF_DETAILS))
2506 fprintf (dump_file, "Initial state:\n\n");
2507 dump_components (dump_file, components);
2510 /* Find the suitable components and split them into chains. */
2511 components = filter_suitable_components (loop, components);
2513 tmp_vars = BITMAP_ALLOC (NULL);
2514 looparound_phis = BITMAP_ALLOC (NULL);
2515 determine_roots (loop, components, &chains);
2516 release_components (components);
2518 if (!chains.exists ())
2520 if (dump_file && (dump_flags & TDF_DETAILS))
2521 fprintf (dump_file,
2522 "Predictive commoning failed: no suitable chains\n");
2523 goto end;
2525 prepare_initializers (loop, chains);
2527 /* Try to combine the chains that are always worked with together. */
2528 try_combine_chains (&chains);
2530 if (dump_file && (dump_flags & TDF_DETAILS))
2532 fprintf (dump_file, "Before commoning:\n\n");
2533 dump_chains (dump_file, chains);
2536 /* Determine the unroll factor, and if the loop should be unrolled, ensure
2537 that its number of iterations is divisible by the factor. */
2538 unroll_factor = determine_unroll_factor (chains);
2539 scev_reset ();
2540 unroll = (unroll_factor > 1
2541 && can_unroll_loop_p (loop, unroll_factor, &desc));
2542 exit = single_dom_exit (loop);
2544 /* Execute the predictive commoning transformations, and possibly unroll the
2545 loop. */
2546 if (unroll)
2548 struct epcc_data dta;
2550 if (dump_file && (dump_flags & TDF_DETAILS))
2551 fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);
2553 dta.chains = chains;
2554 dta.tmp_vars = tmp_vars;
2556 update_ssa (TODO_update_ssa_only_virtuals);
2558 /* Cfg manipulations performed in tree_transform_and_unroll_loop before
2559 execute_pred_commoning_cbck is called may cause phi nodes to be
2560 reallocated, which is a problem since CHAINS may point to these
2561 statements. To fix this, we store the ssa names defined by the
2562 phi nodes here instead of the phi nodes themselves, and restore
2563 the phi nodes in execute_pred_commoning_cbck. A bit hacky. */
2564 replace_phis_by_defined_names (chains);
2566 tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
2567 execute_pred_commoning_cbck, &dta);
2568 eliminate_temp_copies (loop, tmp_vars);
2570 else
2572 if (dump_file && (dump_flags & TDF_DETAILS))
2573 fprintf (dump_file,
2574 "Executing predictive commoning without unrolling.\n");
2575 execute_pred_commoning (loop, chains, tmp_vars);
2578 end: ;
2579 release_chains (chains);
2580 free_data_refs (datarefs);
2581 BITMAP_FREE (tmp_vars);
2582 BITMAP_FREE (looparound_phis);
2584 free_affine_expand_cache (&name_expansions);
2586 return unroll;
2589 /* Runs predictive commoning. */
2591 unsigned
2592 tree_predictive_commoning (void)
2594 bool unrolled = false;
2595 struct loop *loop;
2596 unsigned ret = 0;
2598 initialize_original_copy_tables ();
2599 FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
2600 if (optimize_loop_for_speed_p (loop))
2602 unrolled |= tree_predictive_commoning_loop (loop);
2605 if (unrolled)
2607 scev_reset ();
2608 ret = TODO_cleanup_cfg;
2610 free_original_copy_tables ();
2612 return ret;
2615 /* Predictive commoning Pass. */
2617 static unsigned
2618 run_tree_predictive_commoning (struct function *fun)
2620 if (number_of_loops (fun) <= 1)
2621 return 0;
2623 return tree_predictive_commoning ();
2626 namespace {
2628 const pass_data pass_data_predcom =
2630 GIMPLE_PASS, /* type */
2631 "pcom", /* name */
2632 OPTGROUP_LOOP, /* optinfo_flags */
2633 TV_PREDCOM, /* tv_id */
2634 PROP_cfg, /* properties_required */
2635 0, /* properties_provided */
2636 0, /* properties_destroyed */
2637 0, /* todo_flags_start */
2638 TODO_update_ssa_only_virtuals, /* todo_flags_finish */
2641 class pass_predcom : public gimple_opt_pass
2643 public:
2644 pass_predcom (gcc::context *ctxt)
2645 : gimple_opt_pass (pass_data_predcom, ctxt)
2648 /* opt_pass methods: */
2649 virtual bool gate (function *) { return flag_predictive_commoning != 0; }
2650 virtual unsigned int execute (function *fun)
2652 return run_tree_predictive_commoning (fun);
2655 }; // class pass_predcom
2657 } // anon namespace
2659 gimple_opt_pass *
2660 make_pass_predcom (gcc::context *ctxt)
2662 return new pass_predcom (ctxt);