PR sanitizer/65081
[official-gcc.git] / gcc / tree-predcom.c
blob8dac1ba5afd5520368e20596d643a716b1982a5a
1 /* Predictive commoning.
2 Copyright (C) 2005-2015 Free Software Foundation, Inc.
4 This file is part of GCC.
6 GCC is free software; you can redistribute it and/or modify it
7 under the terms of the GNU General Public License as published by the
8 Free Software Foundation; either version 3, or (at your option) any
9 later version.
11 GCC is distributed in the hope that it will be useful, but WITHOUT
12 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
13 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
14 for more details.
16 You should have received a copy of the GNU General Public License
17 along with GCC; see the file COPYING3. If not see
18 <http://www.gnu.org/licenses/>. */
20 /* This file implements the predictive commoning optimization. Predictive
21 commoning can be viewed as CSE around a loop, and with some improvements,
22 as generalized strength reduction-- i.e., reusing values computed in
23 earlier iterations of a loop in the later ones. So far, the pass only
24 handles the most useful case, that is, reusing values of memory references.
25 If you think this is all just a special case of PRE, you are sort of right;
26 however, concentrating on loops is simpler, and makes it possible to
27 incorporate data dependence analysis to detect the opportunities, perform
28 loop unrolling to avoid copies together with renaming immediately,
29 and if needed, we could also take register pressure into account.
31 Let us demonstrate what is done on an example:
33 for (i = 0; i < 100; i++)
35 a[i+2] = a[i] + a[i+1];
36 b[10] = b[10] + i;
37 c[i] = c[99 - i];
38 d[i] = d[i + 1];
41 1) We find data references in the loop, and split them to mutually
42 independent groups (i.e., we find components of a data dependence
43 graph). We ignore read-read dependences whose distance is not constant.
44 (TODO -- we could also ignore antidependences). In this example, we
45 find the following groups:
47 a[i]{read}, a[i+1]{read}, a[i+2]{write}
48 b[10]{read}, b[10]{write}
49 c[99 - i]{read}, c[i]{write}
50 d[i + 1]{read}, d[i]{write}
52 2) Inside each of the group, we verify several conditions:
53 a) all the references must differ in indices only, and the indices
54 must all have the same step
55 b) the references must dominate loop latch (and thus, they must be
56 ordered by dominance relation).
57 c) the distance of the indices must be a small multiple of the step
58 We are then able to compute the difference of the references (# of
59 iterations before they point to the same place as the first of them).
60 Also, in case there are writes in the loop, we split the groups into
61 chains whose head is the write whose values are used by the reads in
62 the same chain. The chains are then processed independently,
63 making the further transformations simpler. Also, the shorter chains
64 need the same number of registers, but may require lower unrolling
65 factor in order to get rid of the copies on the loop latch.
67 In our example, we get the following chains (the chain for c is invalid).
69 a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
70 b[10]{read,+0}, b[10]{write,+0}
71 d[i + 1]{read,+0}, d[i]{write,+1}
73 3) For each read, we determine the read or write whose value it reuses,
74 together with the distance of this reuse. I.e. we take the last
75 reference before it with distance 0, or the last of the references
76 with the smallest positive distance to the read. Then, we remove
77 the references that are not used in any of these chains, discard the
78 empty groups, and propagate all the links so that they point to the
79 single root reference of the chain (adjusting their distance
80 appropriately). Some extra care needs to be taken for references with
81 step 0. In our example (the numbers indicate the distance of the
82 reuse),
84 a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
85 b[10] --> (*) 1, b[10] (*)
87 4) The chains are combined together if possible. If the corresponding
88 elements of two chains are always combined together with the same
89 operator, we remember just the result of this combination, instead
90 of remembering the values separately. We may need to perform
91 reassociation to enable combining, for example
93 e[i] + f[i+1] + e[i+1] + f[i]
95 can be reassociated as
97 (e[i] + f[i]) + (e[i+1] + f[i+1])
99 and we can combine the chains for e and f into one chain.
101 5) For each root reference (end of the chain) R, let N be maximum distance
102 of a reference reusing its value. Variables R0 up to RN are created,
103 together with phi nodes that transfer values from R1 .. RN to
104 R0 .. R(N-1).
105 Initial values are loaded to R0..R(N-1) (in case not all references
106 must necessarily be accessed and they may trap, we may fail here;
107 TODO sometimes, the loads could be guarded by a check for the number
108 of iterations). Values loaded/stored in roots are also copied to
109 RN. Other reads are replaced with the appropriate variable Ri.
110 Everything is put to SSA form.
112 As a small improvement, if R0 is dead after the root (i.e., all uses of
113 the value with the maximum distance dominate the root), we can avoid
114 creating RN and use R0 instead of it.
116 In our example, we get (only the parts concerning a and b are shown):
117 for (i = 0; i < 100; i++)
119 f = phi (a[0], s);
120 s = phi (a[1], f);
121 x = phi (b[10], x);
123 f = f + s;
124 a[i+2] = f;
125 x = x + i;
126 b[10] = x;
129 6) Factor F for unrolling is determined as the smallest common multiple of
130 (N + 1) for each root reference (N for references for that we avoided
131 creating RN). If F and the loop is small enough, loop is unrolled F
132 times. The stores to RN (R0) in the copies of the loop body are
133 periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
134 be coalesced and the copies can be eliminated.
136 TODO -- copy propagation and other optimizations may change the live
137 ranges of the temporary registers and prevent them from being coalesced;
138 this may increase the register pressure.
140 In our case, F = 2 and the (main loop of the) result is
142 for (i = 0; i < ...; i += 2)
144 f = phi (a[0], f);
145 s = phi (a[1], s);
146 x = phi (b[10], x);
148 f = f + s;
149 a[i+2] = f;
150 x = x + i;
151 b[10] = x;
153 s = s + f;
154 a[i+3] = s;
155 x = x + i;
156 b[10] = x;
159 TODO -- stores killing other stores can be taken into account, e.g.,
160 for (i = 0; i < n; i++)
162 a[i] = 1;
163 a[i+2] = 2;
166 can be replaced with
168 t0 = a[0];
169 t1 = a[1];
170 for (i = 0; i < n; i++)
172 a[i] = 1;
173 t2 = 2;
174 t0 = t1;
175 t1 = t2;
177 a[n] = t0;
178 a[n+1] = t1;
180 The interesting part is that this would generalize store motion; still, since
181 sm is performed elsewhere, it does not seem that important.
183 Predictive commoning can be generalized for arbitrary computations (not
184 just memory loads), and also nontrivial transfer functions (e.g., replacing
185 i * i with ii_last + 2 * i + 1), to generalize strength reduction. */
187 #include "config.h"
188 #include "system.h"
189 #include "coretypes.h"
190 #include "tm.h"
191 #include "hash-set.h"
192 #include "machmode.h"
193 #include "vec.h"
194 #include "double-int.h"
195 #include "input.h"
196 #include "alias.h"
197 #include "symtab.h"
198 #include "wide-int.h"
199 #include "inchash.h"
200 #include "tree.h"
201 #include "fold-const.h"
202 #include "tm_p.h"
203 #include "cfgloop.h"
204 #include "predict.h"
205 #include "hard-reg-set.h"
206 #include "function.h"
207 #include "dominance.h"
208 #include "cfg.h"
209 #include "basic-block.h"
210 #include "tree-ssa-alias.h"
211 #include "internal-fn.h"
212 #include "tree-eh.h"
213 #include "gimple-expr.h"
214 #include "is-a.h"
215 #include "gimple.h"
216 #include "gimplify.h"
217 #include "gimple-iterator.h"
218 #include "gimplify-me.h"
219 #include "gimple-ssa.h"
220 #include "tree-phinodes.h"
221 #include "ssa-iterators.h"
222 #include "stringpool.h"
223 #include "tree-ssanames.h"
224 #include "tree-ssa-loop-ivopts.h"
225 #include "tree-ssa-loop-manip.h"
226 #include "tree-ssa-loop-niter.h"
227 #include "tree-ssa-loop.h"
228 #include "tree-into-ssa.h"
229 #include "hashtab.h"
230 #include "rtl.h"
231 #include "flags.h"
232 #include "statistics.h"
233 #include "real.h"
234 #include "fixed-value.h"
235 #include "insn-config.h"
236 #include "expmed.h"
237 #include "dojump.h"
238 #include "explow.h"
239 #include "calls.h"
240 #include "emit-rtl.h"
241 #include "varasm.h"
242 #include "stmt.h"
243 #include "expr.h"
244 #include "tree-dfa.h"
245 #include "tree-ssa.h"
246 #include "tree-data-ref.h"
247 #include "tree-scalar-evolution.h"
248 #include "tree-chrec.h"
249 #include "params.h"
250 #include "gimple-pretty-print.h"
251 #include "tree-pass.h"
252 #include "tree-affine.h"
253 #include "tree-inline.h"
254 #include "wide-int-print.h"
256 /* The maximum number of iterations between the considered memory
257 references. */
259 #define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
261 /* Data references (or phi nodes that carry data reference values across
262 loop iterations). */
264 typedef struct dref_d
266 /* The reference itself. */
267 struct data_reference *ref;
269 /* The statement in that the reference appears. */
270 gimple stmt;
272 /* In case that STMT is a phi node, this field is set to the SSA name
273 defined by it in replace_phis_by_defined_names (in order to avoid
274 pointing to phi node that got reallocated in the meantime). */
275 tree name_defined_by_phi;
277 /* Distance of the reference from the root of the chain (in number of
278 iterations of the loop). */
279 unsigned distance;
281 /* Number of iterations offset from the first reference in the component. */
282 widest_int offset;
284 /* Number of the reference in a component, in dominance ordering. */
285 unsigned pos;
287 /* True if the memory reference is always accessed when the loop is
288 entered. */
289 unsigned always_accessed : 1;
290 } *dref;
293 /* Type of the chain of the references. */
295 enum chain_type
297 /* The addresses of the references in the chain are constant. */
298 CT_INVARIANT,
300 /* There are only loads in the chain. */
301 CT_LOAD,
303 /* Root of the chain is store, the rest are loads. */
304 CT_STORE_LOAD,
306 /* A combination of two chains. */
307 CT_COMBINATION
310 /* Chains of data references. */
312 typedef struct chain
314 /* Type of the chain. */
315 enum chain_type type;
317 /* For combination chains, the operator and the two chains that are
318 combined, and the type of the result. */
319 enum tree_code op;
320 tree rslt_type;
321 struct chain *ch1, *ch2;
323 /* The references in the chain. */
324 vec<dref> refs;
326 /* The maximum distance of the reference in the chain from the root. */
327 unsigned length;
329 /* The variables used to copy the value throughout iterations. */
330 vec<tree> vars;
332 /* Initializers for the variables. */
333 vec<tree> inits;
335 /* True if there is a use of a variable with the maximal distance
336 that comes after the root in the loop. */
337 unsigned has_max_use_after : 1;
339 /* True if all the memory references in the chain are always accessed. */
340 unsigned all_always_accessed : 1;
342 /* True if this chain was combined together with some other chain. */
343 unsigned combined : 1;
344 } *chain_p;
347 /* Describes the knowledge about the step of the memory references in
348 the component. */
350 enum ref_step_type
352 /* The step is zero. */
353 RS_INVARIANT,
355 /* The step is nonzero. */
356 RS_NONZERO,
358 /* The step may or may not be nonzero. */
359 RS_ANY
362 /* Components of the data dependence graph. */
364 struct component
366 /* The references in the component. */
367 vec<dref> refs;
369 /* What we know about the step of the references in the component. */
370 enum ref_step_type comp_step;
372 /* Next component in the list. */
373 struct component *next;
376 /* Bitmap of ssa names defined by looparound phi nodes covered by chains. */
378 static bitmap looparound_phis;
380 /* Cache used by tree_to_aff_combination_expand. */
382 static hash_map<tree, name_expansion *> *name_expansions;
384 /* Dumps data reference REF to FILE. */
386 extern void dump_dref (FILE *, dref);
387 void
388 dump_dref (FILE *file, dref ref)
390 if (ref->ref)
392 fprintf (file, " ");
393 print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
394 fprintf (file, " (id %u%s)\n", ref->pos,
395 DR_IS_READ (ref->ref) ? "" : ", write");
397 fprintf (file, " offset ");
398 print_decs (ref->offset, file);
399 fprintf (file, "\n");
401 fprintf (file, " distance %u\n", ref->distance);
403 else
405 if (gimple_code (ref->stmt) == GIMPLE_PHI)
406 fprintf (file, " looparound ref\n");
407 else
408 fprintf (file, " combination ref\n");
409 fprintf (file, " in statement ");
410 print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
411 fprintf (file, "\n");
412 fprintf (file, " distance %u\n", ref->distance);
417 /* Dumps CHAIN to FILE. */
419 extern void dump_chain (FILE *, chain_p);
420 void
421 dump_chain (FILE *file, chain_p chain)
423 dref a;
424 const char *chain_type;
425 unsigned i;
426 tree var;
428 switch (chain->type)
430 case CT_INVARIANT:
431 chain_type = "Load motion";
432 break;
434 case CT_LOAD:
435 chain_type = "Loads-only";
436 break;
438 case CT_STORE_LOAD:
439 chain_type = "Store-loads";
440 break;
442 case CT_COMBINATION:
443 chain_type = "Combination";
444 break;
446 default:
447 gcc_unreachable ();
450 fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
451 chain->combined ? " (combined)" : "");
452 if (chain->type != CT_INVARIANT)
453 fprintf (file, " max distance %u%s\n", chain->length,
454 chain->has_max_use_after ? "" : ", may reuse first");
456 if (chain->type == CT_COMBINATION)
458 fprintf (file, " equal to %p %s %p in type ",
459 (void *) chain->ch1, op_symbol_code (chain->op),
460 (void *) chain->ch2);
461 print_generic_expr (file, chain->rslt_type, TDF_SLIM);
462 fprintf (file, "\n");
465 if (chain->vars.exists ())
467 fprintf (file, " vars");
468 FOR_EACH_VEC_ELT (chain->vars, i, var)
470 fprintf (file, " ");
471 print_generic_expr (file, var, TDF_SLIM);
473 fprintf (file, "\n");
476 if (chain->inits.exists ())
478 fprintf (file, " inits");
479 FOR_EACH_VEC_ELT (chain->inits, i, var)
481 fprintf (file, " ");
482 print_generic_expr (file, var, TDF_SLIM);
484 fprintf (file, "\n");
487 fprintf (file, " references:\n");
488 FOR_EACH_VEC_ELT (chain->refs, i, a)
489 dump_dref (file, a);
491 fprintf (file, "\n");
494 /* Dumps CHAINS to FILE. */
496 extern void dump_chains (FILE *, vec<chain_p> );
497 void
498 dump_chains (FILE *file, vec<chain_p> chains)
500 chain_p chain;
501 unsigned i;
503 FOR_EACH_VEC_ELT (chains, i, chain)
504 dump_chain (file, chain);
507 /* Dumps COMP to FILE. */
509 extern void dump_component (FILE *, struct component *);
510 void
511 dump_component (FILE *file, struct component *comp)
513 dref a;
514 unsigned i;
516 fprintf (file, "Component%s:\n",
517 comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
518 FOR_EACH_VEC_ELT (comp->refs, i, a)
519 dump_dref (file, a);
520 fprintf (file, "\n");
523 /* Dumps COMPS to FILE. */
525 extern void dump_components (FILE *, struct component *);
526 void
527 dump_components (FILE *file, struct component *comps)
529 struct component *comp;
531 for (comp = comps; comp; comp = comp->next)
532 dump_component (file, comp);
535 /* Frees a chain CHAIN. */
537 static void
538 release_chain (chain_p chain)
540 dref ref;
541 unsigned i;
543 if (chain == NULL)
544 return;
546 FOR_EACH_VEC_ELT (chain->refs, i, ref)
547 free (ref);
549 chain->refs.release ();
550 chain->vars.release ();
551 chain->inits.release ();
553 free (chain);
556 /* Frees CHAINS. */
558 static void
559 release_chains (vec<chain_p> chains)
561 unsigned i;
562 chain_p chain;
564 FOR_EACH_VEC_ELT (chains, i, chain)
565 release_chain (chain);
566 chains.release ();
569 /* Frees a component COMP. */
571 static void
572 release_component (struct component *comp)
574 comp->refs.release ();
575 free (comp);
578 /* Frees list of components COMPS. */
580 static void
581 release_components (struct component *comps)
583 struct component *act, *next;
585 for (act = comps; act; act = next)
587 next = act->next;
588 release_component (act);
592 /* Finds a root of tree given by FATHERS containing A, and performs path
593 shortening. */
595 static unsigned
596 component_of (unsigned fathers[], unsigned a)
598 unsigned root, n;
600 for (root = a; root != fathers[root]; root = fathers[root])
601 continue;
603 for (; a != root; a = n)
605 n = fathers[a];
606 fathers[a] = root;
609 return root;
612 /* Join operation for DFU. FATHERS gives the tree, SIZES are sizes of the
613 components, A and B are components to merge. */
615 static void
616 merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
618 unsigned ca = component_of (fathers, a);
619 unsigned cb = component_of (fathers, b);
621 if (ca == cb)
622 return;
624 if (sizes[ca] < sizes[cb])
626 sizes[cb] += sizes[ca];
627 fathers[ca] = cb;
629 else
631 sizes[ca] += sizes[cb];
632 fathers[cb] = ca;
636 /* Returns true if A is a reference that is suitable for predictive commoning
637 in the innermost loop that contains it. REF_STEP is set according to the
638 step of the reference A. */
640 static bool
641 suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
643 tree ref = DR_REF (a), step = DR_STEP (a);
645 if (!step
646 || TREE_THIS_VOLATILE (ref)
647 || !is_gimple_reg_type (TREE_TYPE (ref))
648 || tree_could_throw_p (ref))
649 return false;
651 if (integer_zerop (step))
652 *ref_step = RS_INVARIANT;
653 else if (integer_nonzerop (step))
654 *ref_step = RS_NONZERO;
655 else
656 *ref_step = RS_ANY;
658 return true;
661 /* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET. */
663 static void
664 aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
666 tree type = TREE_TYPE (DR_OFFSET (dr));
667 aff_tree delta;
669 tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
670 &name_expansions);
671 aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
672 aff_combination_add (offset, &delta);
675 /* Determines number of iterations of the innermost enclosing loop before B
676 refers to exactly the same location as A and stores it to OFF. If A and
677 B do not have the same step, they never meet, or anything else fails,
678 returns false, otherwise returns true. Both A and B are assumed to
679 satisfy suitable_reference_p. */
681 static bool
682 determine_offset (struct data_reference *a, struct data_reference *b,
683 widest_int *off)
685 aff_tree diff, baseb, step;
686 tree typea, typeb;
688 /* Check that both the references access the location in the same type. */
689 typea = TREE_TYPE (DR_REF (a));
690 typeb = TREE_TYPE (DR_REF (b));
691 if (!useless_type_conversion_p (typeb, typea))
692 return false;
694 /* Check whether the base address and the step of both references is the
695 same. */
696 if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
697 || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
698 return false;
700 if (integer_zerop (DR_STEP (a)))
702 /* If the references have loop invariant address, check that they access
703 exactly the same location. */
704 *off = 0;
705 return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
706 && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
709 /* Compare the offsets of the addresses, and check whether the difference
710 is a multiple of step. */
711 aff_combination_dr_offset (a, &diff);
712 aff_combination_dr_offset (b, &baseb);
713 aff_combination_scale (&baseb, -1);
714 aff_combination_add (&diff, &baseb);
716 tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
717 &step, &name_expansions);
718 return aff_combination_constant_multiple_p (&diff, &step, off);
721 /* Returns the last basic block in LOOP for that we are sure that
722 it is executed whenever the loop is entered. */
724 static basic_block
725 last_always_executed_block (struct loop *loop)
727 unsigned i;
728 vec<edge> exits = get_loop_exit_edges (loop);
729 edge ex;
730 basic_block last = loop->latch;
732 FOR_EACH_VEC_ELT (exits, i, ex)
733 last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
734 exits.release ();
736 return last;
739 /* Splits dependence graph on DATAREFS described by DEPENDS to components. */
741 static struct component *
742 split_data_refs_to_components (struct loop *loop,
743 vec<data_reference_p> datarefs,
744 vec<ddr_p> depends)
746 unsigned i, n = datarefs.length ();
747 unsigned ca, ia, ib, bad;
748 unsigned *comp_father = XNEWVEC (unsigned, n + 1);
749 unsigned *comp_size = XNEWVEC (unsigned, n + 1);
750 struct component **comps;
751 struct data_reference *dr, *dra, *drb;
752 struct data_dependence_relation *ddr;
753 struct component *comp_list = NULL, *comp;
754 dref dataref;
755 basic_block last_always_executed = last_always_executed_block (loop);
757 FOR_EACH_VEC_ELT (datarefs, i, dr)
759 if (!DR_REF (dr))
761 /* A fake reference for call or asm_expr that may clobber memory;
762 just fail. */
763 goto end;
765 /* predcom pass isn't prepared to handle calls with data references. */
766 if (is_gimple_call (DR_STMT (dr)))
767 goto end;
768 dr->aux = (void *) (size_t) i;
769 comp_father[i] = i;
770 comp_size[i] = 1;
773 /* A component reserved for the "bad" data references. */
774 comp_father[n] = n;
775 comp_size[n] = 1;
777 FOR_EACH_VEC_ELT (datarefs, i, dr)
779 enum ref_step_type dummy;
781 if (!suitable_reference_p (dr, &dummy))
783 ia = (unsigned) (size_t) dr->aux;
784 merge_comps (comp_father, comp_size, n, ia);
788 FOR_EACH_VEC_ELT (depends, i, ddr)
790 widest_int dummy_off;
792 if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
793 continue;
795 dra = DDR_A (ddr);
796 drb = DDR_B (ddr);
797 ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
798 ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
799 if (ia == ib)
800 continue;
802 bad = component_of (comp_father, n);
804 /* If both A and B are reads, we may ignore unsuitable dependences. */
805 if (DR_IS_READ (dra) && DR_IS_READ (drb))
807 if (ia == bad || ib == bad
808 || !determine_offset (dra, drb, &dummy_off))
809 continue;
811 /* If A is read and B write or vice versa and there is unsuitable
812 dependence, instead of merging both components into a component
813 that will certainly not pass suitable_component_p, just put the
814 read into bad component, perhaps at least the write together with
815 all the other data refs in it's component will be optimizable. */
816 else if (DR_IS_READ (dra) && ib != bad)
818 if (ia == bad)
819 continue;
820 else if (!determine_offset (dra, drb, &dummy_off))
822 merge_comps (comp_father, comp_size, bad, ia);
823 continue;
826 else if (DR_IS_READ (drb) && ia != bad)
828 if (ib == bad)
829 continue;
830 else if (!determine_offset (dra, drb, &dummy_off))
832 merge_comps (comp_father, comp_size, bad, ib);
833 continue;
837 merge_comps (comp_father, comp_size, ia, ib);
840 comps = XCNEWVEC (struct component *, n);
841 bad = component_of (comp_father, n);
842 FOR_EACH_VEC_ELT (datarefs, i, dr)
844 ia = (unsigned) (size_t) dr->aux;
845 ca = component_of (comp_father, ia);
846 if (ca == bad)
847 continue;
849 comp = comps[ca];
850 if (!comp)
852 comp = XCNEW (struct component);
853 comp->refs.create (comp_size[ca]);
854 comps[ca] = comp;
857 dataref = XCNEW (struct dref_d);
858 dataref->ref = dr;
859 dataref->stmt = DR_STMT (dr);
860 dataref->offset = 0;
861 dataref->distance = 0;
863 dataref->always_accessed
864 = dominated_by_p (CDI_DOMINATORS, last_always_executed,
865 gimple_bb (dataref->stmt));
866 dataref->pos = comp->refs.length ();
867 comp->refs.quick_push (dataref);
870 for (i = 0; i < n; i++)
872 comp = comps[i];
873 if (comp)
875 comp->next = comp_list;
876 comp_list = comp;
879 free (comps);
881 end:
882 free (comp_father);
883 free (comp_size);
884 return comp_list;
887 /* Returns true if the component COMP satisfies the conditions
888 described in 2) at the beginning of this file. LOOP is the current
889 loop. */
891 static bool
892 suitable_component_p (struct loop *loop, struct component *comp)
894 unsigned i;
895 dref a, first;
896 basic_block ba, bp = loop->header;
897 bool ok, has_write = false;
899 FOR_EACH_VEC_ELT (comp->refs, i, a)
901 ba = gimple_bb (a->stmt);
903 if (!just_once_each_iteration_p (loop, ba))
904 return false;
906 gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
907 bp = ba;
909 if (DR_IS_WRITE (a->ref))
910 has_write = true;
913 first = comp->refs[0];
914 ok = suitable_reference_p (first->ref, &comp->comp_step);
915 gcc_assert (ok);
916 first->offset = 0;
918 for (i = 1; comp->refs.iterate (i, &a); i++)
920 if (!determine_offset (first->ref, a->ref, &a->offset))
921 return false;
923 #ifdef ENABLE_CHECKING
925 enum ref_step_type a_step;
926 ok = suitable_reference_p (a->ref, &a_step);
927 gcc_assert (ok && a_step == comp->comp_step);
929 #endif
932 /* If there is a write inside the component, we must know whether the
933 step is nonzero or not -- we would not otherwise be able to recognize
934 whether the value accessed by reads comes from the OFFSET-th iteration
935 or the previous one. */
936 if (has_write && comp->comp_step == RS_ANY)
937 return false;
939 return true;
942 /* Check the conditions on references inside each of components COMPS,
943 and remove the unsuitable components from the list. The new list
944 of components is returned. The conditions are described in 2) at
945 the beginning of this file. LOOP is the current loop. */
947 static struct component *
948 filter_suitable_components (struct loop *loop, struct component *comps)
950 struct component **comp, *act;
952 for (comp = &comps; *comp; )
954 act = *comp;
955 if (suitable_component_p (loop, act))
956 comp = &act->next;
957 else
959 dref ref;
960 unsigned i;
962 *comp = act->next;
963 FOR_EACH_VEC_ELT (act->refs, i, ref)
964 free (ref);
965 release_component (act);
969 return comps;
972 /* Compares two drefs A and B by their offset and position. Callback for
973 qsort. */
975 static int
976 order_drefs (const void *a, const void *b)
978 const dref *const da = (const dref *) a;
979 const dref *const db = (const dref *) b;
980 int offcmp = wi::cmps ((*da)->offset, (*db)->offset);
982 if (offcmp != 0)
983 return offcmp;
985 return (*da)->pos - (*db)->pos;
988 /* Returns root of the CHAIN. */
990 static inline dref
991 get_chain_root (chain_p chain)
993 return chain->refs[0];
996 /* Adds REF to the chain CHAIN. */
998 static void
999 add_ref_to_chain (chain_p chain, dref ref)
1001 dref root = get_chain_root (chain);
1003 gcc_assert (wi::les_p (root->offset, ref->offset));
1004 widest_int dist = ref->offset - root->offset;
1005 if (wi::leu_p (MAX_DISTANCE, dist))
1007 free (ref);
1008 return;
1010 gcc_assert (wi::fits_uhwi_p (dist));
1012 chain->refs.safe_push (ref);
1014 ref->distance = dist.to_uhwi ();
1016 if (ref->distance >= chain->length)
1018 chain->length = ref->distance;
1019 chain->has_max_use_after = false;
1022 if (ref->distance == chain->length
1023 && ref->pos > root->pos)
1024 chain->has_max_use_after = true;
1026 chain->all_always_accessed &= ref->always_accessed;
1029 /* Returns the chain for invariant component COMP. */
1031 static chain_p
1032 make_invariant_chain (struct component *comp)
1034 chain_p chain = XCNEW (struct chain);
1035 unsigned i;
1036 dref ref;
1038 chain->type = CT_INVARIANT;
1040 chain->all_always_accessed = true;
1042 FOR_EACH_VEC_ELT (comp->refs, i, ref)
1044 chain->refs.safe_push (ref);
1045 chain->all_always_accessed &= ref->always_accessed;
1048 return chain;
1051 /* Make a new chain rooted at REF. */
1053 static chain_p
1054 make_rooted_chain (dref ref)
1056 chain_p chain = XCNEW (struct chain);
1058 chain->type = DR_IS_READ (ref->ref) ? CT_LOAD : CT_STORE_LOAD;
1060 chain->refs.safe_push (ref);
1061 chain->all_always_accessed = ref->always_accessed;
1063 ref->distance = 0;
1065 return chain;
1068 /* Returns true if CHAIN is not trivial. */
1070 static bool
1071 nontrivial_chain_p (chain_p chain)
1073 return chain != NULL && chain->refs.length () > 1;
1076 /* Returns the ssa name that contains the value of REF, or NULL_TREE if there
1077 is no such name. */
1079 static tree
1080 name_for_ref (dref ref)
1082 tree name;
1084 if (is_gimple_assign (ref->stmt))
1086 if (!ref->ref || DR_IS_READ (ref->ref))
1087 name = gimple_assign_lhs (ref->stmt);
1088 else
1089 name = gimple_assign_rhs1 (ref->stmt);
1091 else
1092 name = PHI_RESULT (ref->stmt);
1094 return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
1097 /* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
1098 iterations of the innermost enclosing loop). */
1100 static bool
1101 valid_initializer_p (struct data_reference *ref,
1102 unsigned distance, struct data_reference *root)
1104 aff_tree diff, base, step;
1105 widest_int off;
1107 /* Both REF and ROOT must be accessing the same object. */
1108 if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
1109 return false;
1111 /* The initializer is defined outside of loop, hence its address must be
1112 invariant inside the loop. */
1113 gcc_assert (integer_zerop (DR_STEP (ref)));
1115 /* If the address of the reference is invariant, initializer must access
1116 exactly the same location. */
1117 if (integer_zerop (DR_STEP (root)))
1118 return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
1119 && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));
1121 /* Verify that this index of REF is equal to the root's index at
1122 -DISTANCE-th iteration. */
1123 aff_combination_dr_offset (root, &diff);
1124 aff_combination_dr_offset (ref, &base);
1125 aff_combination_scale (&base, -1);
1126 aff_combination_add (&diff, &base);
1128 tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
1129 &step, &name_expansions);
1130 if (!aff_combination_constant_multiple_p (&diff, &step, &off))
1131 return false;
1133 if (off != distance)
1134 return false;
1136 return true;
1139 /* Finds looparound phi node of LOOP that copies the value of REF, and if its
1140 initial value is correct (equal to initial value of REF shifted by one
1141 iteration), returns the phi node. Otherwise, NULL_TREE is returned. ROOT
1142 is the root of the current chain. */
1144 static gphi *
1145 find_looparound_phi (struct loop *loop, dref ref, dref root)
1147 tree name, init, init_ref;
1148 gphi *phi = NULL;
1149 gimple init_stmt;
1150 edge latch = loop_latch_edge (loop);
1151 struct data_reference init_dr;
1152 gphi_iterator psi;
1154 if (is_gimple_assign (ref->stmt))
1156 if (DR_IS_READ (ref->ref))
1157 name = gimple_assign_lhs (ref->stmt);
1158 else
1159 name = gimple_assign_rhs1 (ref->stmt);
1161 else
1162 name = PHI_RESULT (ref->stmt);
1163 if (!name)
1164 return NULL;
1166 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1168 phi = psi.phi ();
1169 if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
1170 break;
1173 if (gsi_end_p (psi))
1174 return NULL;
1176 init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
1177 if (TREE_CODE (init) != SSA_NAME)
1178 return NULL;
1179 init_stmt = SSA_NAME_DEF_STMT (init);
1180 if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
1181 return NULL;
1182 gcc_assert (gimple_assign_lhs (init_stmt) == init);
1184 init_ref = gimple_assign_rhs1 (init_stmt);
1185 if (!REFERENCE_CLASS_P (init_ref)
1186 && !DECL_P (init_ref))
1187 return NULL;
1189 /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
1190 loop enclosing PHI). */
1191 memset (&init_dr, 0, sizeof (struct data_reference));
1192 DR_REF (&init_dr) = init_ref;
1193 DR_STMT (&init_dr) = phi;
1194 if (!dr_analyze_innermost (&init_dr, loop))
1195 return NULL;
1197 if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
1198 return NULL;
1200 return phi;
1203 /* Adds a reference for the looparound copy of REF in PHI to CHAIN. */
1205 static void
1206 insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
1208 dref nw = XCNEW (struct dref_d), aref;
1209 unsigned i;
1211 nw->stmt = phi;
1212 nw->distance = ref->distance + 1;
1213 nw->always_accessed = 1;
1215 FOR_EACH_VEC_ELT (chain->refs, i, aref)
1216 if (aref->distance >= nw->distance)
1217 break;
1218 chain->refs.safe_insert (i, nw);
1220 if (nw->distance > chain->length)
1222 chain->length = nw->distance;
1223 chain->has_max_use_after = false;
1227 /* For references in CHAIN that are copied around the LOOP (created previously
1228 by PRE, or by user), add the results of such copies to the chain. This
1229 enables us to remove the copies by unrolling, and may need less registers
1230 (also, it may allow us to combine chains together). */
1232 static void
1233 add_looparound_copies (struct loop *loop, chain_p chain)
1235 unsigned i;
1236 dref ref, root = get_chain_root (chain);
1237 gphi *phi;
1239 FOR_EACH_VEC_ELT (chain->refs, i, ref)
1241 phi = find_looparound_phi (loop, ref, root);
1242 if (!phi)
1243 continue;
1245 bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
1246 insert_looparound_copy (chain, ref, phi);
1250 /* Find roots of the values and determine distances in the component COMP.
1251 The references are redistributed into CHAINS. LOOP is the current
1252 loop. */
1254 static void
1255 determine_roots_comp (struct loop *loop,
1256 struct component *comp,
1257 vec<chain_p> *chains)
1259 unsigned i;
1260 dref a;
1261 chain_p chain = NULL;
1262 widest_int last_ofs = 0;
1264 /* Invariants are handled specially. */
1265 if (comp->comp_step == RS_INVARIANT)
1267 chain = make_invariant_chain (comp);
1268 chains->safe_push (chain);
1269 return;
1272 comp->refs.qsort (order_drefs);
1274 FOR_EACH_VEC_ELT (comp->refs, i, a)
1276 if (!chain || DR_IS_WRITE (a->ref)
1277 || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
1279 if (nontrivial_chain_p (chain))
1281 add_looparound_copies (loop, chain);
1282 chains->safe_push (chain);
1284 else
1285 release_chain (chain);
1286 chain = make_rooted_chain (a);
1287 last_ofs = a->offset;
1288 continue;
1291 add_ref_to_chain (chain, a);
1294 if (nontrivial_chain_p (chain))
1296 add_looparound_copies (loop, chain);
1297 chains->safe_push (chain);
1299 else
1300 release_chain (chain);
1303 /* Find roots of the values and determine distances in components COMPS, and
1304 separates the references to CHAINS. LOOP is the current loop. */
1306 static void
1307 determine_roots (struct loop *loop,
1308 struct component *comps, vec<chain_p> *chains)
1310 struct component *comp;
1312 for (comp = comps; comp; comp = comp->next)
1313 determine_roots_comp (loop, comp, chains);
1316 /* Replace the reference in statement STMT with temporary variable
1317 NEW_TREE. If SET is true, NEW_TREE is instead initialized to the value of
1318 the reference in the statement. IN_LHS is true if the reference
1319 is in the lhs of STMT, false if it is in rhs. */
1321 static void
1322 replace_ref_with (gimple stmt, tree new_tree, bool set, bool in_lhs)
1324 tree val;
1325 gassign *new_stmt;
1326 gimple_stmt_iterator bsi, psi;
1328 if (gimple_code (stmt) == GIMPLE_PHI)
1330 gcc_assert (!in_lhs && !set);
1332 val = PHI_RESULT (stmt);
1333 bsi = gsi_after_labels (gimple_bb (stmt));
1334 psi = gsi_for_stmt (stmt);
1335 remove_phi_node (&psi, false);
1337 /* Turn the phi node into GIMPLE_ASSIGN. */
1338 new_stmt = gimple_build_assign (val, new_tree);
1339 gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
1340 return;
1343 /* Since the reference is of gimple_reg type, it should only
1344 appear as lhs or rhs of modify statement. */
1345 gcc_assert (is_gimple_assign (stmt));
1347 bsi = gsi_for_stmt (stmt);
1349 /* If we do not need to initialize NEW_TREE, just replace the use of OLD. */
1350 if (!set)
1352 gcc_assert (!in_lhs);
1353 gimple_assign_set_rhs_from_tree (&bsi, new_tree);
1354 stmt = gsi_stmt (bsi);
1355 update_stmt (stmt);
1356 return;
1359 if (in_lhs)
1361 /* We have statement
1363 OLD = VAL
1365 If OLD is a memory reference, then VAL is gimple_val, and we transform
1366 this to
1368 OLD = VAL
1369 NEW = VAL
1371 Otherwise, we are replacing a combination chain,
1372 VAL is the expression that performs the combination, and OLD is an
1373 SSA name. In this case, we transform the assignment to
1375 OLD = VAL
1376 NEW = OLD
1380 val = gimple_assign_lhs (stmt);
1381 if (TREE_CODE (val) != SSA_NAME)
1383 val = gimple_assign_rhs1 (stmt);
1384 gcc_assert (gimple_assign_single_p (stmt));
1385 if (TREE_CLOBBER_P (val))
1386 val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
1387 else
1388 gcc_assert (gimple_assign_copy_p (stmt));
1391 else
1393 /* VAL = OLD
1395 is transformed to
1397 VAL = OLD
1398 NEW = VAL */
1400 val = gimple_assign_lhs (stmt);
1403 new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
1404 gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
1407 /* Returns a memory reference to DR in the ITER-th iteration of
1408 the loop it was analyzed in. Append init stmts to STMTS. */
1410 static tree
1411 ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
1413 tree off = DR_OFFSET (dr);
1414 tree coff = DR_INIT (dr);
1415 if (iter == 0)
1417 else if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST)
1418 coff = size_binop (PLUS_EXPR, coff,
1419 size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
1420 else
1421 off = size_binop (PLUS_EXPR, off,
1422 size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
1423 tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
1424 addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
1425 is_gimple_mem_ref_addr, NULL_TREE);
1426 tree alias_ptr = fold_convert (reference_alias_ptr_type (DR_REF (dr)), coff);
1427 /* While data-ref analysis punts on bit offsets it still handles
1428 bitfield accesses at byte boundaries. Cope with that. Note that
1429 we cannot simply re-apply the outer COMPONENT_REF because the
1430 byte-granular portion of it is already applied via DR_INIT and
1431 DR_OFFSET, so simply build a BIT_FIELD_REF knowing that the bits
1432 start at offset zero. */
1433 if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
1434 && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
1436 tree field = TREE_OPERAND (DR_REF (dr), 1);
1437 return build3 (BIT_FIELD_REF, TREE_TYPE (DR_REF (dr)),
1438 build2 (MEM_REF, DECL_BIT_FIELD_TYPE (field),
1439 addr, alias_ptr),
1440 DECL_SIZE (field), bitsize_zero_node);
1442 else
1443 return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr, alias_ptr);
1446 /* Get the initialization expression for the INDEX-th temporary variable
1447 of CHAIN. */
1449 static tree
1450 get_init_expr (chain_p chain, unsigned index)
1452 if (chain->type == CT_COMBINATION)
1454 tree e1 = get_init_expr (chain->ch1, index);
1455 tree e2 = get_init_expr (chain->ch2, index);
1457 return fold_build2 (chain->op, chain->rslt_type, e1, e2);
1459 else
1460 return chain->inits[index];
1463 /* Returns a new temporary variable used for the I-th variable carrying
1464 value of REF. The variable's uid is marked in TMP_VARS. */
1466 static tree
1467 predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
1469 tree type = TREE_TYPE (ref);
1470 /* We never access the components of the temporary variable in predictive
1471 commoning. */
1472 tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
1473 bitmap_set_bit (tmp_vars, DECL_UID (var));
1474 return var;
1477 /* Creates the variables for CHAIN, as well as phi nodes for them and
1478 initialization on entry to LOOP. Uids of the newly created
1479 temporary variables are marked in TMP_VARS. */
1481 static void
1482 initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
1484 unsigned i;
1485 unsigned n = chain->length;
1486 dref root = get_chain_root (chain);
1487 bool reuse_first = !chain->has_max_use_after;
1488 tree ref, init, var, next;
1489 gphi *phi;
1490 gimple_seq stmts;
1491 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1493 /* If N == 0, then all the references are within the single iteration. And
1494 since this is an nonempty chain, reuse_first cannot be true. */
1495 gcc_assert (n > 0 || !reuse_first);
1497 chain->vars.create (n + 1);
1499 if (chain->type == CT_COMBINATION)
1500 ref = gimple_assign_lhs (root->stmt);
1501 else
1502 ref = DR_REF (root->ref);
1504 for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
1506 var = predcom_tmp_var (ref, i, tmp_vars);
1507 chain->vars.quick_push (var);
1509 if (reuse_first)
1510 chain->vars.quick_push (chain->vars[0]);
1512 FOR_EACH_VEC_ELT (chain->vars, i, var)
1513 chain->vars[i] = make_ssa_name (var);
1515 for (i = 0; i < n; i++)
1517 var = chain->vars[i];
1518 next = chain->vars[i + 1];
1519 init = get_init_expr (chain, i);
1521 init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1522 if (stmts)
1523 gsi_insert_seq_on_edge_immediate (entry, stmts);
1525 phi = create_phi_node (var, loop->header);
1526 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1527 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1531 /* Create the variables and initialization statement for root of chain
1532 CHAIN. Uids of the newly created temporary variables are marked
1533 in TMP_VARS. */
1535 static void
1536 initialize_root (struct loop *loop, chain_p chain, bitmap tmp_vars)
1538 dref root = get_chain_root (chain);
1539 bool in_lhs = (chain->type == CT_STORE_LOAD
1540 || chain->type == CT_COMBINATION);
1542 initialize_root_vars (loop, chain, tmp_vars);
1543 replace_ref_with (root->stmt,
1544 chain->vars[chain->length],
1545 true, in_lhs);
1548 /* Initializes a variable for load motion for ROOT and prepares phi nodes and
1549 initialization on entry to LOOP if necessary. The ssa name for the variable
1550 is stored in VARS. If WRITTEN is true, also a phi node to copy its value
1551 around the loop is created. Uid of the newly created temporary variable
1552 is marked in TMP_VARS. INITS is the list containing the (single)
1553 initializer. */
1555 static void
1556 initialize_root_vars_lm (struct loop *loop, dref root, bool written,
1557 vec<tree> *vars, vec<tree> inits,
1558 bitmap tmp_vars)
1560 unsigned i;
1561 tree ref = DR_REF (root->ref), init, var, next;
1562 gimple_seq stmts;
1563 gphi *phi;
1564 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1566 /* Find the initializer for the variable, and check that it cannot
1567 trap. */
1568 init = inits[0];
1570 vars->create (written ? 2 : 1);
1571 var = predcom_tmp_var (ref, 0, tmp_vars);
1572 vars->quick_push (var);
1573 if (written)
1574 vars->quick_push ((*vars)[0]);
1576 FOR_EACH_VEC_ELT (*vars, i, var)
1577 (*vars)[i] = make_ssa_name (var);
1579 var = (*vars)[0];
1581 init = force_gimple_operand (init, &stmts, written, NULL_TREE);
1582 if (stmts)
1583 gsi_insert_seq_on_edge_immediate (entry, stmts);
1585 if (written)
1587 next = (*vars)[1];
1588 phi = create_phi_node (var, loop->header);
1589 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1590 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1592 else
1594 gassign *init_stmt = gimple_build_assign (var, init);
1595 gsi_insert_on_edge_immediate (entry, init_stmt);
1600 /* Execute load motion for references in chain CHAIN. Uids of the newly
1601 created temporary variables are marked in TMP_VARS. */
1603 static void
1604 execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
1606 auto_vec<tree> vars;
1607 dref a;
1608 unsigned n_writes = 0, ridx, i;
1609 tree var;
1611 gcc_assert (chain->type == CT_INVARIANT);
1612 gcc_assert (!chain->combined);
1613 FOR_EACH_VEC_ELT (chain->refs, i, a)
1614 if (DR_IS_WRITE (a->ref))
1615 n_writes++;
1617 /* If there are no reads in the loop, there is nothing to do. */
1618 if (n_writes == chain->refs.length ())
1619 return;
1621 initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
1622 &vars, chain->inits, tmp_vars);
1624 ridx = 0;
1625 FOR_EACH_VEC_ELT (chain->refs, i, a)
1627 bool is_read = DR_IS_READ (a->ref);
1629 if (DR_IS_WRITE (a->ref))
1631 n_writes--;
1632 if (n_writes)
1634 var = vars[0];
1635 var = make_ssa_name (SSA_NAME_VAR (var));
1636 vars[0] = var;
1638 else
1639 ridx = 1;
1642 replace_ref_with (a->stmt, vars[ridx],
1643 !is_read, !is_read);
1647 /* Returns the single statement in that NAME is used, excepting
1648 the looparound phi nodes contained in one of the chains. If there is no
1649 such statement, or more statements, NULL is returned. */
1651 static gimple
1652 single_nonlooparound_use (tree name)
1654 use_operand_p use;
1655 imm_use_iterator it;
1656 gimple stmt, ret = NULL;
1658 FOR_EACH_IMM_USE_FAST (use, it, name)
1660 stmt = USE_STMT (use);
1662 if (gimple_code (stmt) == GIMPLE_PHI)
1664 /* Ignore uses in looparound phi nodes. Uses in other phi nodes
1665 could not be processed anyway, so just fail for them. */
1666 if (bitmap_bit_p (looparound_phis,
1667 SSA_NAME_VERSION (PHI_RESULT (stmt))))
1668 continue;
1670 return NULL;
1672 else if (is_gimple_debug (stmt))
1673 continue;
1674 else if (ret != NULL)
1675 return NULL;
1676 else
1677 ret = stmt;
1680 return ret;
1683 /* Remove statement STMT, as well as the chain of assignments in that it is
1684 used. */
1686 static void
1687 remove_stmt (gimple stmt)
1689 tree name;
1690 gimple next;
1691 gimple_stmt_iterator psi;
1693 if (gimple_code (stmt) == GIMPLE_PHI)
1695 name = PHI_RESULT (stmt);
1696 next = single_nonlooparound_use (name);
1697 reset_debug_uses (stmt);
1698 psi = gsi_for_stmt (stmt);
1699 remove_phi_node (&psi, true);
1701 if (!next
1702 || !gimple_assign_ssa_name_copy_p (next)
1703 || gimple_assign_rhs1 (next) != name)
1704 return;
1706 stmt = next;
1709 while (1)
1711 gimple_stmt_iterator bsi;
1713 bsi = gsi_for_stmt (stmt);
1715 name = gimple_assign_lhs (stmt);
1716 gcc_assert (TREE_CODE (name) == SSA_NAME);
1718 next = single_nonlooparound_use (name);
1719 reset_debug_uses (stmt);
1721 unlink_stmt_vdef (stmt);
1722 gsi_remove (&bsi, true);
1723 release_defs (stmt);
1725 if (!next
1726 || !gimple_assign_ssa_name_copy_p (next)
1727 || gimple_assign_rhs1 (next) != name)
1728 return;
1730 stmt = next;
1734 /* Perform the predictive commoning optimization for a chain CHAIN.
1735 Uids of the newly created temporary variables are marked in TMP_VARS.*/
1737 static void
1738 execute_pred_commoning_chain (struct loop *loop, chain_p chain,
1739 bitmap tmp_vars)
1741 unsigned i;
1742 dref a;
1743 tree var;
1745 if (chain->combined)
1747 /* For combined chains, just remove the statements that are used to
1748 compute the values of the expression (except for the root one).
1749 We delay this until after all chains are processed. */
1751 else
1753 /* For non-combined chains, set up the variables that hold its value,
1754 and replace the uses of the original references by these
1755 variables. */
1756 initialize_root (loop, chain, tmp_vars);
1757 for (i = 1; chain->refs.iterate (i, &a); i++)
1759 var = chain->vars[chain->length - a->distance];
1760 replace_ref_with (a->stmt, var, false, false);
1765 /* Determines the unroll factor necessary to remove as many temporary variable
1766 copies as possible. CHAINS is the list of chains that will be
1767 optimized. */
1769 static unsigned
1770 determine_unroll_factor (vec<chain_p> chains)
1772 chain_p chain;
1773 unsigned factor = 1, af, nfactor, i;
1774 unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
1776 FOR_EACH_VEC_ELT (chains, i, chain)
1778 if (chain->type == CT_INVARIANT || chain->combined)
1779 continue;
1781 /* The best unroll factor for this chain is equal to the number of
1782 temporary variables that we create for it. */
1783 af = chain->length;
1784 if (chain->has_max_use_after)
1785 af++;
1787 nfactor = factor * af / gcd (factor, af);
1788 if (nfactor <= max)
1789 factor = nfactor;
1792 return factor;
1795 /* Perform the predictive commoning optimization for CHAINS.
1796 Uids of the newly created temporary variables are marked in TMP_VARS. */
1798 static void
1799 execute_pred_commoning (struct loop *loop, vec<chain_p> chains,
1800 bitmap tmp_vars)
1802 chain_p chain;
1803 unsigned i;
1805 FOR_EACH_VEC_ELT (chains, i, chain)
1807 if (chain->type == CT_INVARIANT)
1808 execute_load_motion (loop, chain, tmp_vars);
1809 else
1810 execute_pred_commoning_chain (loop, chain, tmp_vars);
1813 FOR_EACH_VEC_ELT (chains, i, chain)
1815 if (chain->type == CT_INVARIANT)
1817 else if (chain->combined)
1819 /* For combined chains, just remove the statements that are used to
1820 compute the values of the expression (except for the root one). */
1821 dref a;
1822 unsigned j;
1823 for (j = 1; chain->refs.iterate (j, &a); j++)
1824 remove_stmt (a->stmt);
1828 update_ssa (TODO_update_ssa_only_virtuals);
1831 /* For each reference in CHAINS, if its defining statement is
1832 phi node, record the ssa name that is defined by it. */
1834 static void
1835 replace_phis_by_defined_names (vec<chain_p> chains)
1837 chain_p chain;
1838 dref a;
1839 unsigned i, j;
1841 FOR_EACH_VEC_ELT (chains, i, chain)
1842 FOR_EACH_VEC_ELT (chain->refs, j, a)
1844 if (gimple_code (a->stmt) == GIMPLE_PHI)
1846 a->name_defined_by_phi = PHI_RESULT (a->stmt);
1847 a->stmt = NULL;
1852 /* For each reference in CHAINS, if name_defined_by_phi is not
1853 NULL, use it to set the stmt field. */
1855 static void
1856 replace_names_by_phis (vec<chain_p> chains)
1858 chain_p chain;
1859 dref a;
1860 unsigned i, j;
1862 FOR_EACH_VEC_ELT (chains, i, chain)
1863 FOR_EACH_VEC_ELT (chain->refs, j, a)
1864 if (a->stmt == NULL)
1866 a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
1867 gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
1868 a->name_defined_by_phi = NULL_TREE;
1872 /* Wrapper over execute_pred_commoning, to pass it as a callback
1873 to tree_transform_and_unroll_loop. */
1875 struct epcc_data
1877 vec<chain_p> chains;
1878 bitmap tmp_vars;
1881 static void
1882 execute_pred_commoning_cbck (struct loop *loop, void *data)
1884 struct epcc_data *const dta = (struct epcc_data *) data;
1886 /* Restore phi nodes that were replaced by ssa names before
1887 tree_transform_and_unroll_loop (see detailed description in
1888 tree_predictive_commoning_loop). */
1889 replace_names_by_phis (dta->chains);
1890 execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
1893 /* Base NAME and all the names in the chain of phi nodes that use it
1894 on variable VAR. The phi nodes are recognized by being in the copies of
1895 the header of the LOOP. */
1897 static void
1898 base_names_in_chain_on (struct loop *loop, tree name, tree var)
1900 gimple stmt, phi;
1901 imm_use_iterator iter;
1903 replace_ssa_name_symbol (name, var);
1905 while (1)
1907 phi = NULL;
1908 FOR_EACH_IMM_USE_STMT (stmt, iter, name)
1910 if (gimple_code (stmt) == GIMPLE_PHI
1911 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
1913 phi = stmt;
1914 BREAK_FROM_IMM_USE_STMT (iter);
1917 if (!phi)
1918 return;
1920 name = PHI_RESULT (phi);
1921 replace_ssa_name_symbol (name, var);
1925 /* Given an unrolled LOOP after predictive commoning, remove the
1926 register copies arising from phi nodes by changing the base
1927 variables of SSA names. TMP_VARS is the set of the temporary variables
1928 for those we want to perform this. */
1930 static void
1931 eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
1933 edge e;
1934 gphi *phi;
1935 gimple stmt;
1936 tree name, use, var;
1937 gphi_iterator psi;
1939 e = loop_latch_edge (loop);
1940 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1942 phi = psi.phi ();
1943 name = PHI_RESULT (phi);
1944 var = SSA_NAME_VAR (name);
1945 if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
1946 continue;
1947 use = PHI_ARG_DEF_FROM_EDGE (phi, e);
1948 gcc_assert (TREE_CODE (use) == SSA_NAME);
1950 /* Base all the ssa names in the ud and du chain of NAME on VAR. */
1951 stmt = SSA_NAME_DEF_STMT (use);
1952 while (gimple_code (stmt) == GIMPLE_PHI
1953 /* In case we could not unroll the loop enough to eliminate
1954 all copies, we may reach the loop header before the defining
1955 statement (in that case, some register copies will be present
1956 in loop latch in the final code, corresponding to the newly
1957 created looparound phi nodes). */
1958 && gimple_bb (stmt) != loop->header)
1960 gcc_assert (single_pred_p (gimple_bb (stmt)));
1961 use = PHI_ARG_DEF (stmt, 0);
1962 stmt = SSA_NAME_DEF_STMT (use);
1965 base_names_in_chain_on (loop, use, var);
1969 /* Returns true if CHAIN is suitable to be combined. */
1971 static bool
1972 chain_can_be_combined_p (chain_p chain)
1974 return (!chain->combined
1975 && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
1978 /* Returns the modify statement that uses NAME. Skips over assignment
1979 statements, NAME is replaced with the actual name used in the returned
1980 statement. */
1982 static gimple
1983 find_use_stmt (tree *name)
1985 gimple stmt;
1986 tree rhs, lhs;
1988 /* Skip over assignments. */
1989 while (1)
1991 stmt = single_nonlooparound_use (*name);
1992 if (!stmt)
1993 return NULL;
1995 if (gimple_code (stmt) != GIMPLE_ASSIGN)
1996 return NULL;
1998 lhs = gimple_assign_lhs (stmt);
1999 if (TREE_CODE (lhs) != SSA_NAME)
2000 return NULL;
2002 if (gimple_assign_copy_p (stmt))
2004 rhs = gimple_assign_rhs1 (stmt);
2005 if (rhs != *name)
2006 return NULL;
2008 *name = lhs;
2010 else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
2011 == GIMPLE_BINARY_RHS)
2012 return stmt;
2013 else
2014 return NULL;
2018 /* Returns true if we may perform reassociation for operation CODE in TYPE. */
2020 static bool
2021 may_reassociate_p (tree type, enum tree_code code)
2023 if (FLOAT_TYPE_P (type)
2024 && !flag_unsafe_math_optimizations)
2025 return false;
2027 return (commutative_tree_code (code)
2028 && associative_tree_code (code));
2031 /* If the operation used in STMT is associative and commutative, go through the
2032 tree of the same operations and returns its root. Distance to the root
2033 is stored in DISTANCE. */
2035 static gimple
2036 find_associative_operation_root (gimple stmt, unsigned *distance)
2038 tree lhs;
2039 gimple next;
2040 enum tree_code code = gimple_assign_rhs_code (stmt);
2041 tree type = TREE_TYPE (gimple_assign_lhs (stmt));
2042 unsigned dist = 0;
2044 if (!may_reassociate_p (type, code))
2045 return NULL;
2047 while (1)
2049 lhs = gimple_assign_lhs (stmt);
2050 gcc_assert (TREE_CODE (lhs) == SSA_NAME);
2052 next = find_use_stmt (&lhs);
2053 if (!next
2054 || gimple_assign_rhs_code (next) != code)
2055 break;
2057 stmt = next;
2058 dist++;
2061 if (distance)
2062 *distance = dist;
2063 return stmt;
2066 /* Returns the common statement in that NAME1 and NAME2 have a use. If there
2067 is no such statement, returns NULL_TREE. In case the operation used on
2068 NAME1 and NAME2 is associative and commutative, returns the root of the
2069 tree formed by this operation instead of the statement that uses NAME1 or
2070 NAME2. */
2072 static gimple
2073 find_common_use_stmt (tree *name1, tree *name2)
2075 gimple stmt1, stmt2;
2077 stmt1 = find_use_stmt (name1);
2078 if (!stmt1)
2079 return NULL;
2081 stmt2 = find_use_stmt (name2);
2082 if (!stmt2)
2083 return NULL;
2085 if (stmt1 == stmt2)
2086 return stmt1;
2088 stmt1 = find_associative_operation_root (stmt1, NULL);
2089 if (!stmt1)
2090 return NULL;
2091 stmt2 = find_associative_operation_root (stmt2, NULL);
2092 if (!stmt2)
2093 return NULL;
2095 return (stmt1 == stmt2 ? stmt1 : NULL);
2098 /* Checks whether R1 and R2 are combined together using CODE, with the result
2099 in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
2100 if it is true. If CODE is ERROR_MARK, set these values instead. */
2102 static bool
2103 combinable_refs_p (dref r1, dref r2,
2104 enum tree_code *code, bool *swap, tree *rslt_type)
2106 enum tree_code acode;
2107 bool aswap;
2108 tree atype;
2109 tree name1, name2;
2110 gimple stmt;
2112 name1 = name_for_ref (r1);
2113 name2 = name_for_ref (r2);
2114 gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);
2116 stmt = find_common_use_stmt (&name1, &name2);
2118 if (!stmt
2119 /* A simple post-dominance check - make sure the combination
2120 is executed under the same condition as the references. */
2121 || (gimple_bb (stmt) != gimple_bb (r1->stmt)
2122 && gimple_bb (stmt) != gimple_bb (r2->stmt)))
2123 return false;
2125 acode = gimple_assign_rhs_code (stmt);
2126 aswap = (!commutative_tree_code (acode)
2127 && gimple_assign_rhs1 (stmt) != name1);
2128 atype = TREE_TYPE (gimple_assign_lhs (stmt));
2130 if (*code == ERROR_MARK)
2132 *code = acode;
2133 *swap = aswap;
2134 *rslt_type = atype;
2135 return true;
2138 return (*code == acode
2139 && *swap == aswap
2140 && *rslt_type == atype);
2143 /* Remove OP from the operation on rhs of STMT, and replace STMT with
2144 an assignment of the remaining operand. */
2146 static void
2147 remove_name_from_operation (gimple stmt, tree op)
2149 tree other_op;
2150 gimple_stmt_iterator si;
2152 gcc_assert (is_gimple_assign (stmt));
2154 if (gimple_assign_rhs1 (stmt) == op)
2155 other_op = gimple_assign_rhs2 (stmt);
2156 else
2157 other_op = gimple_assign_rhs1 (stmt);
2159 si = gsi_for_stmt (stmt);
2160 gimple_assign_set_rhs_from_tree (&si, other_op);
2162 /* We should not have reallocated STMT. */
2163 gcc_assert (gsi_stmt (si) == stmt);
2165 update_stmt (stmt);
2168 /* Reassociates the expression in that NAME1 and NAME2 are used so that they
2169 are combined in a single statement, and returns this statement. */
2171 static gimple
2172 reassociate_to_the_same_stmt (tree name1, tree name2)
2174 gimple stmt1, stmt2, root1, root2, s1, s2;
2175 gassign *new_stmt, *tmp_stmt;
2176 tree new_name, tmp_name, var, r1, r2;
2177 unsigned dist1, dist2;
2178 enum tree_code code;
2179 tree type = TREE_TYPE (name1);
2180 gimple_stmt_iterator bsi;
2182 stmt1 = find_use_stmt (&name1);
2183 stmt2 = find_use_stmt (&name2);
2184 root1 = find_associative_operation_root (stmt1, &dist1);
2185 root2 = find_associative_operation_root (stmt2, &dist2);
2186 code = gimple_assign_rhs_code (stmt1);
2188 gcc_assert (root1 && root2 && root1 == root2
2189 && code == gimple_assign_rhs_code (stmt2));
2191 /* Find the root of the nearest expression in that both NAME1 and NAME2
2192 are used. */
2193 r1 = name1;
2194 s1 = stmt1;
2195 r2 = name2;
2196 s2 = stmt2;
2198 while (dist1 > dist2)
2200 s1 = find_use_stmt (&r1);
2201 r1 = gimple_assign_lhs (s1);
2202 dist1--;
2204 while (dist2 > dist1)
2206 s2 = find_use_stmt (&r2);
2207 r2 = gimple_assign_lhs (s2);
2208 dist2--;
2211 while (s1 != s2)
2213 s1 = find_use_stmt (&r1);
2214 r1 = gimple_assign_lhs (s1);
2215 s2 = find_use_stmt (&r2);
2216 r2 = gimple_assign_lhs (s2);
2219 /* Remove NAME1 and NAME2 from the statements in that they are used
2220 currently. */
2221 remove_name_from_operation (stmt1, name1);
2222 remove_name_from_operation (stmt2, name2);
2224 /* Insert the new statement combining NAME1 and NAME2 before S1, and
2225 combine it with the rhs of S1. */
2226 var = create_tmp_reg (type, "predreastmp");
2227 new_name = make_ssa_name (var);
2228 new_stmt = gimple_build_assign (new_name, code, name1, name2);
2230 var = create_tmp_reg (type, "predreastmp");
2231 tmp_name = make_ssa_name (var);
2233 /* Rhs of S1 may now be either a binary expression with operation
2234 CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
2235 so that name1 or name2 was removed from it). */
2236 tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
2237 gimple_assign_rhs1 (s1),
2238 gimple_assign_rhs2 (s1));
2240 bsi = gsi_for_stmt (s1);
2241 gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
2242 s1 = gsi_stmt (bsi);
2243 update_stmt (s1);
2245 gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
2246 gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);
2248 return new_stmt;
2251 /* Returns the statement that combines references R1 and R2. In case R1
2252 and R2 are not used in the same statement, but they are used with an
2253 associative and commutative operation in the same expression, reassociate
2254 the expression so that they are used in the same statement. */
2256 static gimple
2257 stmt_combining_refs (dref r1, dref r2)
2259 gimple stmt1, stmt2;
2260 tree name1 = name_for_ref (r1);
2261 tree name2 = name_for_ref (r2);
2263 stmt1 = find_use_stmt (&name1);
2264 stmt2 = find_use_stmt (&name2);
2265 if (stmt1 == stmt2)
2266 return stmt1;
2268 return reassociate_to_the_same_stmt (name1, name2);
2271 /* Tries to combine chains CH1 and CH2 together. If this succeeds, the
2272 description of the new chain is returned, otherwise we return NULL. */
2274 static chain_p
2275 combine_chains (chain_p ch1, chain_p ch2)
2277 dref r1, r2, nw;
2278 enum tree_code op = ERROR_MARK;
2279 bool swap = false;
2280 chain_p new_chain;
2281 unsigned i;
2282 gimple root_stmt;
2283 tree rslt_type = NULL_TREE;
2285 if (ch1 == ch2)
2286 return NULL;
2287 if (ch1->length != ch2->length)
2288 return NULL;
2290 if (ch1->refs.length () != ch2->refs.length ())
2291 return NULL;
2293 for (i = 0; (ch1->refs.iterate (i, &r1)
2294 && ch2->refs.iterate (i, &r2)); i++)
2296 if (r1->distance != r2->distance)
2297 return NULL;
2299 if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
2300 return NULL;
2303 if (swap)
2305 chain_p tmp = ch1;
2306 ch1 = ch2;
2307 ch2 = tmp;
2310 new_chain = XCNEW (struct chain);
2311 new_chain->type = CT_COMBINATION;
2312 new_chain->op = op;
2313 new_chain->ch1 = ch1;
2314 new_chain->ch2 = ch2;
2315 new_chain->rslt_type = rslt_type;
2316 new_chain->length = ch1->length;
2318 for (i = 0; (ch1->refs.iterate (i, &r1)
2319 && ch2->refs.iterate (i, &r2)); i++)
2321 nw = XCNEW (struct dref_d);
2322 nw->stmt = stmt_combining_refs (r1, r2);
2323 nw->distance = r1->distance;
2325 new_chain->refs.safe_push (nw);
2328 new_chain->has_max_use_after = false;
2329 root_stmt = get_chain_root (new_chain)->stmt;
2330 for (i = 1; new_chain->refs.iterate (i, &nw); i++)
2332 if (nw->distance == new_chain->length
2333 && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
2335 new_chain->has_max_use_after = true;
2336 break;
2340 ch1->combined = true;
2341 ch2->combined = true;
2342 return new_chain;
2345 /* Try to combine the CHAINS. */
2347 static void
2348 try_combine_chains (vec<chain_p> *chains)
2350 unsigned i, j;
2351 chain_p ch1, ch2, cch;
2352 auto_vec<chain_p> worklist;
2354 FOR_EACH_VEC_ELT (*chains, i, ch1)
2355 if (chain_can_be_combined_p (ch1))
2356 worklist.safe_push (ch1);
2358 while (!worklist.is_empty ())
2360 ch1 = worklist.pop ();
2361 if (!chain_can_be_combined_p (ch1))
2362 continue;
2364 FOR_EACH_VEC_ELT (*chains, j, ch2)
2366 if (!chain_can_be_combined_p (ch2))
2367 continue;
2369 cch = combine_chains (ch1, ch2);
2370 if (cch)
2372 worklist.safe_push (cch);
2373 chains->safe_push (cch);
2374 break;
2380 /* Prepare initializers for CHAIN in LOOP. Returns false if this is
2381 impossible because one of these initializers may trap, true otherwise. */
2383 static bool
2384 prepare_initializers_chain (struct loop *loop, chain_p chain)
2386 unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
2387 struct data_reference *dr = get_chain_root (chain)->ref;
2388 tree init;
2389 dref laref;
2390 edge entry = loop_preheader_edge (loop);
2392 /* Find the initializers for the variables, and check that they cannot
2393 trap. */
2394 chain->inits.create (n);
2395 for (i = 0; i < n; i++)
2396 chain->inits.quick_push (NULL_TREE);
2398 /* If we have replaced some looparound phi nodes, use their initializers
2399 instead of creating our own. */
2400 FOR_EACH_VEC_ELT (chain->refs, i, laref)
2402 if (gimple_code (laref->stmt) != GIMPLE_PHI)
2403 continue;
2405 gcc_assert (laref->distance > 0);
2406 chain->inits[n - laref->distance]
2407 = PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
2410 for (i = 0; i < n; i++)
2412 gimple_seq stmts = NULL;
2414 if (chain->inits[i] != NULL_TREE)
2415 continue;
2417 init = ref_at_iteration (dr, (int) i - n, &stmts);
2418 if (!chain->all_always_accessed && tree_could_trap_p (init))
2420 gimple_seq_discard (stmts);
2421 return false;
2424 if (stmts)
2425 gsi_insert_seq_on_edge_immediate (entry, stmts);
2427 chain->inits[i] = init;
2430 return true;
2433 /* Prepare initializers for CHAINS in LOOP, and free chains that cannot
2434 be used because the initializers might trap. */
2436 static void
2437 prepare_initializers (struct loop *loop, vec<chain_p> chains)
2439 chain_p chain;
2440 unsigned i;
2442 for (i = 0; i < chains.length (); )
2444 chain = chains[i];
2445 if (prepare_initializers_chain (loop, chain))
2446 i++;
2447 else
2449 release_chain (chain);
2450 chains.unordered_remove (i);
2455 /* Performs predictive commoning for LOOP. Returns true if LOOP was
2456 unrolled. */
2458 static bool
2459 tree_predictive_commoning_loop (struct loop *loop)
2461 vec<data_reference_p> datarefs;
2462 vec<ddr_p> dependences;
2463 struct component *components;
2464 vec<chain_p> chains = vNULL;
2465 unsigned unroll_factor;
2466 struct tree_niter_desc desc;
2467 bool unroll = false;
2468 edge exit;
2469 bitmap tmp_vars;
2471 if (dump_file && (dump_flags & TDF_DETAILS))
2472 fprintf (dump_file, "Processing loop %d\n", loop->num);
2474 /* Find the data references and split them into components according to their
2475 dependence relations. */
2476 auto_vec<loop_p, 3> loop_nest;
2477 dependences.create (10);
2478 datarefs.create (10);
2479 if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
2480 &dependences))
2482 if (dump_file && (dump_flags & TDF_DETAILS))
2483 fprintf (dump_file, "Cannot analyze data dependencies\n");
2484 free_data_refs (datarefs);
2485 free_dependence_relations (dependences);
2486 return false;
2489 if (dump_file && (dump_flags & TDF_DETAILS))
2490 dump_data_dependence_relations (dump_file, dependences);
2492 components = split_data_refs_to_components (loop, datarefs, dependences);
2493 loop_nest.release ();
2494 free_dependence_relations (dependences);
2495 if (!components)
2497 free_data_refs (datarefs);
2498 free_affine_expand_cache (&name_expansions);
2499 return false;
2502 if (dump_file && (dump_flags & TDF_DETAILS))
2504 fprintf (dump_file, "Initial state:\n\n");
2505 dump_components (dump_file, components);
2508 /* Find the suitable components and split them into chains. */
2509 components = filter_suitable_components (loop, components);
2511 tmp_vars = BITMAP_ALLOC (NULL);
2512 looparound_phis = BITMAP_ALLOC (NULL);
2513 determine_roots (loop, components, &chains);
2514 release_components (components);
2516 if (!chains.exists ())
2518 if (dump_file && (dump_flags & TDF_DETAILS))
2519 fprintf (dump_file,
2520 "Predictive commoning failed: no suitable chains\n");
2521 goto end;
2523 prepare_initializers (loop, chains);
2525 /* Try to combine the chains that are always worked with together. */
2526 try_combine_chains (&chains);
2528 if (dump_file && (dump_flags & TDF_DETAILS))
2530 fprintf (dump_file, "Before commoning:\n\n");
2531 dump_chains (dump_file, chains);
2534 /* Determine the unroll factor, and if the loop should be unrolled, ensure
2535 that its number of iterations is divisible by the factor. */
2536 unroll_factor = determine_unroll_factor (chains);
2537 scev_reset ();
2538 unroll = (unroll_factor > 1
2539 && can_unroll_loop_p (loop, unroll_factor, &desc));
2540 exit = single_dom_exit (loop);
2542 /* Execute the predictive commoning transformations, and possibly unroll the
2543 loop. */
2544 if (unroll)
2546 struct epcc_data dta;
2548 if (dump_file && (dump_flags & TDF_DETAILS))
2549 fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);
2551 dta.chains = chains;
2552 dta.tmp_vars = tmp_vars;
2554 update_ssa (TODO_update_ssa_only_virtuals);
2556 /* Cfg manipulations performed in tree_transform_and_unroll_loop before
2557 execute_pred_commoning_cbck is called may cause phi nodes to be
2558 reallocated, which is a problem since CHAINS may point to these
2559 statements. To fix this, we store the ssa names defined by the
2560 phi nodes here instead of the phi nodes themselves, and restore
2561 the phi nodes in execute_pred_commoning_cbck. A bit hacky. */
2562 replace_phis_by_defined_names (chains);
2564 tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
2565 execute_pred_commoning_cbck, &dta);
2566 eliminate_temp_copies (loop, tmp_vars);
2568 else
2570 if (dump_file && (dump_flags & TDF_DETAILS))
2571 fprintf (dump_file,
2572 "Executing predictive commoning without unrolling.\n");
2573 execute_pred_commoning (loop, chains, tmp_vars);
2576 end: ;
2577 release_chains (chains);
2578 free_data_refs (datarefs);
2579 BITMAP_FREE (tmp_vars);
2580 BITMAP_FREE (looparound_phis);
2582 free_affine_expand_cache (&name_expansions);
2584 return unroll;
2587 /* Runs predictive commoning. */
2589 unsigned
2590 tree_predictive_commoning (void)
2592 bool unrolled = false;
2593 struct loop *loop;
2594 unsigned ret = 0;
2596 initialize_original_copy_tables ();
2597 FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
2598 if (optimize_loop_for_speed_p (loop))
2600 unrolled |= tree_predictive_commoning_loop (loop);
2603 if (unrolled)
2605 scev_reset ();
2606 ret = TODO_cleanup_cfg;
2608 free_original_copy_tables ();
2610 return ret;
2613 /* Predictive commoning Pass. */
2615 static unsigned
2616 run_tree_predictive_commoning (struct function *fun)
2618 if (number_of_loops (fun) <= 1)
2619 return 0;
2621 return tree_predictive_commoning ();
2624 namespace {
2626 const pass_data pass_data_predcom =
2628 GIMPLE_PASS, /* type */
2629 "pcom", /* name */
2630 OPTGROUP_LOOP, /* optinfo_flags */
2631 TV_PREDCOM, /* tv_id */
2632 PROP_cfg, /* properties_required */
2633 0, /* properties_provided */
2634 0, /* properties_destroyed */
2635 0, /* todo_flags_start */
2636 TODO_update_ssa_only_virtuals, /* todo_flags_finish */
2639 class pass_predcom : public gimple_opt_pass
2641 public:
2642 pass_predcom (gcc::context *ctxt)
2643 : gimple_opt_pass (pass_data_predcom, ctxt)
2646 /* opt_pass methods: */
2647 virtual bool gate (function *) { return flag_predictive_commoning != 0; }
2648 virtual unsigned int execute (function *fun)
2650 return run_tree_predictive_commoning (fun);
2653 }; // class pass_predcom
2655 } // anon namespace
2657 gimple_opt_pass *
2658 make_pass_predcom (gcc::context *ctxt)
2660 return new pass_predcom (ctxt);