[official-gcc.git] / gcc / tree-predcom.c
/* Predictive commoning.
   Copyright (C) 2005-2016 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}

   2) Inside each of the group, we verify several conditions:
      a) all the references must differ in indices only, and the indices
	 must all have the same step
      b) the references must dominate loop latch (and thus, they must be
	 ordered by dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.

   5) For each root reference (end of the chain) R, let N be maximum distance
      of a reference reusing its value.  Variables R0 up to RN are created,
      together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put to SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):

      for (i = 0; i < 100; i++)
	{
	  f = phi (a[0], s);
	  s = phi (a[1], f);
	  x = phi (b[10], x);

	  f = f + s;
	  a[i+2] = f;
	  x = x + i;
	  b[10] = x;
	}

   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for that we avoided
      creating RN).  If F and the loop is small enough, loop is unrolled F
      times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is

      for (i = 0; i < ...; i += 2)
	{
	  f = phi (a[0], f);
	  s = phi (a[1], s);
	  x = phi (b[10], x);

	  f = f + s;
	  a[i+2] = f;
	  x = x + i;
	  b[10] = x;

	  s = s + f;
	  a[i+3] = s;
	  x = x + i;
	  b[10] = x;
	}

   TODO -- stores killing other stores can be taken into account, e.g.,

   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       a[i+2] = 2;
     }

   can be replaced with

   t0 = a[0];
   t1 = a[1];
   for (i = 0; i < n; i++)
     {
       a[i] = 1;
       t2 = 2;
       t0 = t1;
       t1 = t2;
     }
   a[n] = t0;
   a[n+1] = t1;

   The interesting part is that this would generalize store motion; still, since
   sm is performed elsewhere, it does not seem that important.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */
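/* As an illustration of that last generalization (a sketch only; this pass
   does not implement it), the recurrence i * i could be carried across
   iterations instead of being recomputed:

   for (i = 0; i < n; i++)
     use (i * i);

   could become

   ii = 0;
   for (i = 0; i < n; i++)
     {
       use (ii);
       ii = ii + 2 * i + 1;
     }

   using the transfer function (i + 1) * (i + 1) == i * i + 2 * i + 1.  */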
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "predict.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "alias.h"
#include "fold-const.h"
#include "cfgloop.h"
#include "tree-eh.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "params.h"
#include "tree-affine.h"
#include "builtins.h"
/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef struct dref_d
{
  /* The reference itself.  */
  struct data_reference *ref;

  /* The statement in that the reference appears.  */
  gimple *stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  widest_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;
/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* Root of the chain is store, the rest are loads.  */
  CT_STORE_LOAD,

  /* A combination of two chains.  */
  CT_COMBINATION
};
/* Chains of data references.  */

typedef struct chain
{
  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  vec<dref> refs;

  /* The maximum distance of the reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  vec<tree> vars;

  /* Initializers for the variables.  */
  vec<tree> inits;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.  */
  unsigned combined : 1;
} *chain_p;
/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};
/* Components of the data dependence graph.  */

struct component
{
  /* The references in the component.  */
  vec<dref> refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* Next component in the list.  */
  struct component *next;
};
/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */

static bitmap looparound_phis;

/* Cache used by tree_to_aff_combination_expand.  */

static hash_map<tree, name_expansion *> *name_expansions;
/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, "    ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
	       DR_IS_READ (ref->ref) ? "" : ", write");

      fprintf (file, "      offset ");
      print_decs (ref->offset, file);
      fprintf (file, "\n");

      fprintf (file, "      distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
	fprintf (file, "    looparound ref\n");
      else
	fprintf (file, "    combination ref\n");
      fprintf (file, "      in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, "      distance %u\n", ref->distance);
    }
}
/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
	   chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, "  max distance %u%s\n", chain->length,
	     chain->has_max_use_after ? "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, "  equal to %p %s %p in type ",
	       (void *) chain->ch1, op_symbol_code (chain->op),
	       (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars.exists ())
    {
      fprintf (file, "  vars");
      FOR_EACH_VEC_ELT (chain->vars, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  if (chain->inits.exists ())
    {
      fprintf (file, "  inits");
      FOR_EACH_VEC_ELT (chain->inits, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  fprintf (file, "  references:\n");
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}
/* Dumps CHAINS to FILE.  */

extern void dump_chains (FILE *, vec<chain_p> );
void
dump_chains (FILE *file, vec<chain_p> chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
	   comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}
/* Frees a chain CHAIN.  */

static void
release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    free (ref);

  chain->refs.release ();
  chain->vars.release ();
  chain->inits.release ();

  free (chain);
}

/* Frees CHAINS.  */

static void
release_chains (vec<chain_p> chains)
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (chains, i, chain)
    release_chain (chain);
  chains.release ();
}

/* Frees a component COMP.  */

static void
release_component (struct component *comp)
{
  comp->refs.release ();
  free (comp);
}

/* Frees list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      release_component (act);
    }
}
/* Finds a root of tree given by FATHERS containing A, and performs path
   shortening.  */

static unsigned
component_of (unsigned fathers[], unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for DFU.  FATHERS gives the tree, SIZES are sizes of the
   components, A and B are components to merge.  */

static void
merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}
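/* For example, starting from FATHERS = {0, 1, 2, 3} and SIZES = {1, 1, 1, 1},
   merge_comps (fathers, sizes, 0, 1) followed by
   merge_comps (fathers, sizes, 2, 0) leaves the components {0, 1, 2} and {3};
   component_of then returns the same root for 0, 1 and 2, and as a side
   effect makes each of them point directly to that root.  */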
/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.  */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}
/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

static void
aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
				  &name_expansions);
  aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}
/* Determines number of iterations of the innermost enclosing loop before B
   refers to exactly the same location as A and stores it to OFF.  If A and
   B do not have the same step, they never meet, or anything else fails,
   returns false, otherwise returns true.  Both A and B are assumed to
   satisfy suitable_reference_p.  */

static bool
determine_offset (struct data_reference *a, struct data_reference *b,
		  widest_int *off)
{
  aff_tree diff, baseb, step;
  tree typea, typeb;

  /* Check that both the references access the location in the same type.  */
  typea = TREE_TYPE (DR_REF (a));
  typeb = TREE_TYPE (DR_REF (b));
  if (!useless_type_conversion_p (typeb, typea))
    return false;

  /* Check whether the base address and the step of both references is the
     same.  */
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

  if (integer_zerop (DR_STEP (a)))
    {
      /* If the references have loop invariant address, check that they access
	 exactly the same location.  */
      *off = 0;
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
	      && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
    }

  /* Compare the offsets of the addresses, and check whether the difference
     is a multiple of step.  */
  aff_combination_dr_offset (a, &diff);
  aff_combination_dr_offset (b, &baseb);
  aff_combination_scale (&baseb, -1);
  aff_combination_add (&diff, &baseb);

  tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
				  &step, &name_expansions);
  return aff_combination_constant_multiple_p (&diff, &step, off);
}
/* Returns the last basic block in LOOP for that we are sure that
   it is executed whenever the loop is entered.  */

static basic_block
last_always_executed_block (struct loop *loop)
{
  unsigned i;
  vec<edge> exits = get_loop_exit_edges (loop);
  edge ex;
  basic_block last = loop->latch;

  FOR_EACH_VEC_ELT (exits, i, ex)
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
  exits.release ();

  return last;
}
/* Splits dependence graph on DATAREFS described by DEPENDS to components.  */

static struct component *
split_data_refs_to_components (struct loop *loop,
			       vec<data_reference_p> datarefs,
			       vec<ddr_p> depends)
{
  unsigned i, n = datarefs.length ();
  unsigned ca, ia, ib, bad;
  unsigned *comp_father = XNEWVEC (unsigned, n + 1);
  unsigned *comp_size = XNEWVEC (unsigned, n + 1);
  struct component **comps;
  struct data_reference *dr, *dra, *drb;
  struct data_dependence_relation *ddr;
  struct component *comp_list = NULL, *comp;
  dref dataref;
  basic_block last_always_executed = last_always_executed_block (loop);

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      if (!DR_REF (dr))
	{
	  /* A fake reference for call or asm_expr that may clobber memory;
	     just fail.  */
	  goto end;
	}
      /* predcom pass isn't prepared to handle calls with data references.  */
      if (is_gimple_call (DR_STMT (dr)))
	goto end;
      dr->aux = (void *) (size_t) i;
      comp_father[i] = i;
      comp_size[i] = 1;
    }

  /* A component reserved for the "bad" data references.  */
  comp_father[n] = n;
  comp_size[n] = 1;

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      enum ref_step_type dummy;

      if (!suitable_reference_p (dr, &dummy))
	{
	  ia = (unsigned) (size_t) dr->aux;
	  merge_comps (comp_father, comp_size, n, ia);
	}
    }

  FOR_EACH_VEC_ELT (depends, i, ddr)
    {
      widest_int dummy_off;

      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
	continue;

      dra = DDR_A (ddr);
      drb = DDR_B (ddr);
      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
      if (ia == ib)
	continue;

      bad = component_of (comp_father, n);

      /* If both A and B are reads, we may ignore unsuitable dependences.  */
      if (DR_IS_READ (dra) && DR_IS_READ (drb))
	{
	  if (ia == bad || ib == bad
	      || !determine_offset (dra, drb, &dummy_off))
	    continue;
	}
      /* If A is read and B write or vice versa and there is unsuitable
	 dependence, instead of merging both components into a component
	 that will certainly not pass suitable_component_p, just put the
	 read into bad component, perhaps at least the write together with
	 all the other data refs in its component will be optimizable.  */
      else if (DR_IS_READ (dra) && ib != bad)
	{
	  if (ia == bad)
	    continue;
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      merge_comps (comp_father, comp_size, bad, ia);
	      continue;
	    }
	}
      else if (DR_IS_READ (drb) && ia != bad)
	{
	  if (ib == bad)
	    continue;
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      merge_comps (comp_father, comp_size, bad, ib);
	      continue;
	    }
	}

      merge_comps (comp_father, comp_size, ia, ib);
    }

  comps = XCNEWVEC (struct component *, n);
  bad = component_of (comp_father, n);
  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      ia = (unsigned) (size_t) dr->aux;
      ca = component_of (comp_father, ia);
      if (ca == bad)
	continue;

      comp = comps[ca];
      if (!comp)
	{
	  comp = XCNEW (struct component);
	  comp->refs.create (comp_size[ca]);
	  comps[ca] = comp;
	}

      dataref = XCNEW (struct dref_d);
      dataref->ref = dr;
      dataref->stmt = DR_STMT (dr);
      dataref->offset = 0;
      dataref->distance = 0;

      dataref->always_accessed
	      = dominated_by_p (CDI_DOMINATORS, last_always_executed,
				gimple_bb (dataref->stmt));
      dataref->pos = comp->refs.length ();
      comp->refs.quick_push (dataref);
    }

  for (i = 0; i < n; i++)
    {
      comp = comps[i];
      if (comp)
	{
	  comp->next = comp_list;
	  comp_list = comp;
	}
    }
  free (comps);

end:
  free (comp_father);
  free (comp_size);
  return comp_list;
}
/* Returns true if the component COMP satisfies the conditions
   described in 2) at the beginning of this file.  LOOP is the current
   loop.  */

static bool
suitable_component_p (struct loop *loop, struct component *comp)
{
  unsigned i;
  dref a, first;
  basic_block ba, bp = loop->header;
  bool ok, has_write = false;

  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      ba = gimple_bb (a->stmt);

      if (!just_once_each_iteration_p (loop, ba))
	return false;

      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
      bp = ba;

      if (DR_IS_WRITE (a->ref))
	has_write = true;
    }

  first = comp->refs[0];
  ok = suitable_reference_p (first->ref, &comp->comp_step);
  gcc_assert (ok);
  first->offset = 0;

  for (i = 1; comp->refs.iterate (i, &a); i++)
    {
      if (!determine_offset (first->ref, a->ref, &a->offset))
	return false;

      enum ref_step_type a_step;
      gcc_checking_assert (suitable_reference_p (a->ref, &a_step)
			   && a_step == comp->comp_step);
    }

  /* If there is a write inside the component, we must know whether the
     step is nonzero or not -- we would not otherwise be able to recognize
     whether the value accessed by reads comes from the OFFSET-th iteration
     or the previous one.  */
  if (has_write && comp->comp_step == RS_ANY)
    return false;

  return true;
}
/* Check the conditions on references inside each of components COMPS,
   and remove the unsuitable components from the list.  The new list
   of components is returned.  The conditions are described in 2) at
   the beginning of this file.  LOOP is the current loop.  */

static struct component *
filter_suitable_components (struct loop *loop, struct component *comps)
{
  struct component **comp, *act;

  for (comp = &comps; *comp; )
    {
      act = *comp;
      if (suitable_component_p (loop, act))
	comp = &act->next;
      else
	{
	  dref ref;
	  unsigned i;

	  *comp = act->next;
	  FOR_EACH_VEC_ELT (act->refs, i, ref)
	    free (ref);
	  release_component (act);
	}
    }

  return comps;
}
/* Compares two drefs A and B by their offset and position.  Callback for
   qsort.  */

static int
order_drefs (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;
  int offcmp = wi::cmps ((*da)->offset, (*db)->offset);

  if (offcmp != 0)
    return offcmp;

  return (*da)->pos - (*db)->pos;
}

/* Returns root of the CHAIN.  */

static inline dref
get_chain_root (chain_p chain)
{
  return chain->refs[0];
}
/* Adds REF to the chain CHAIN.  */

static void
add_ref_to_chain (chain_p chain, dref ref)
{
  dref root = get_chain_root (chain);

  gcc_assert (wi::les_p (root->offset, ref->offset));
  widest_int dist = ref->offset - root->offset;
  if (wi::leu_p (MAX_DISTANCE, dist))
    {
      free (ref);
      return;
    }
  gcc_assert (wi::fits_uhwi_p (dist));

  chain->refs.safe_push (ref);

  ref->distance = dist.to_uhwi ();

  if (ref->distance >= chain->length)
    {
      chain->length = ref->distance;
      chain->has_max_use_after = false;
    }

  if (ref->distance == chain->length
      && ref->pos > root->pos)
    chain->has_max_use_after = true;

  chain->all_always_accessed &= ref->always_accessed;
}
/* Returns the chain for invariant component COMP.  */

static chain_p
make_invariant_chain (struct component *comp)
{
  chain_p chain = XCNEW (struct chain);
  unsigned i;
  dref ref;

  chain->type = CT_INVARIANT;

  chain->all_always_accessed = true;

  FOR_EACH_VEC_ELT (comp->refs, i, ref)
    {
      chain->refs.safe_push (ref);
      chain->all_always_accessed &= ref->always_accessed;
    }

  return chain;
}

/* Make a new chain rooted at REF.  */

static chain_p
make_rooted_chain (dref ref)
{
  chain_p chain = XCNEW (struct chain);

  chain->type = DR_IS_READ (ref->ref) ? CT_LOAD : CT_STORE_LOAD;

  chain->refs.safe_push (ref);
  chain->all_always_accessed = ref->always_accessed;

  ref->distance = 0;

  return chain;
}

/* Returns true if CHAIN is not trivial.  */

static bool
nontrivial_chain_p (chain_p chain)
{
  return chain != NULL && chain->refs.length () > 1;
}
/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
   is no such name.  */

static tree
name_for_ref (dref ref)
{
  tree name;

  if (is_gimple_assign (ref->stmt))
    {
      if (!ref->ref || DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);

  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
}
/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
   iterations of the innermost enclosing loop).  */

static bool
valid_initializer_p (struct data_reference *ref,
		     unsigned distance, struct data_reference *root)
{
  aff_tree diff, base, step;
  widest_int off;

  /* Both REF and ROOT must be accessing the same object.  */
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
    return false;

  /* The initializer is defined outside of loop, hence its address must be
     invariant inside the loop.  */
  gcc_assert (integer_zerop (DR_STEP (ref)));

  /* If the address of the reference is invariant, initializer must access
     exactly the same location.  */
  if (integer_zerop (DR_STEP (root)))
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
	    && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));

  /* Verify that this index of REF is equal to the root's index at
     -DISTANCE-th iteration.  */
  aff_combination_dr_offset (root, &diff);
  aff_combination_dr_offset (ref, &base);
  aff_combination_scale (&base, -1);
  aff_combination_add (&diff, &base);

  tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
				  &step, &name_expansions);
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
    return false;

  if (off != distance)
    return false;

  return true;
}
/* Finds looparound phi node of LOOP that copies the value of REF, and if its
   initial value is correct (equal to initial value of REF shifted by one
   iteration), returns the phi node.  Otherwise, NULL_TREE is returned.  ROOT
   is the root of the current chain.  */

static gphi *
find_looparound_phi (struct loop *loop, dref ref, dref root)
{
  tree name, init, init_ref;
  gphi *phi = NULL;
  gimple *init_stmt;
  edge latch = loop_latch_edge (loop);
  struct data_reference init_dr;
  gphi_iterator psi;

  if (is_gimple_assign (ref->stmt))
    {
      if (DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);
  if (!name)
    return NULL;

  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = psi.phi ();
      if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
	break;
    }

  if (gsi_end_p (psi))
    return NULL;

  init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
  if (TREE_CODE (init) != SSA_NAME)
    return NULL;
  init_stmt = SSA_NAME_DEF_STMT (init);
  if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
    return NULL;
  gcc_assert (gimple_assign_lhs (init_stmt) == init);

  init_ref = gimple_assign_rhs1 (init_stmt);
  if (!REFERENCE_CLASS_P (init_ref)
      && !DECL_P (init_ref))
    return NULL;

  /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
     loop enclosing PHI).  */
  memset (&init_dr, 0, sizeof (struct data_reference));
  DR_REF (&init_dr) = init_ref;
  DR_STMT (&init_dr) = phi;
  if (!dr_analyze_innermost (&init_dr, loop))
    return NULL;

  if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
    return NULL;

  return phi;
}
/* Adds a reference for the looparound copy of REF in PHI to CHAIN.  */

static void
insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
{
  dref nw = XCNEW (struct dref_d), aref;
  unsigned i;

  nw->stmt = phi;
  nw->distance = ref->distance + 1;
  nw->always_accessed = 1;

  FOR_EACH_VEC_ELT (chain->refs, i, aref)
    if (aref->distance >= nw->distance)
      break;
  chain->refs.safe_insert (i, nw);

  if (nw->distance > chain->length)
    {
      chain->length = nw->distance;
      chain->has_max_use_after = false;
    }
}
/* For references in CHAIN that are copied around the LOOP (created previously
   by PRE, or by user), add the results of such copies to the chain.  This
   enables us to remove the copies by unrolling, and may need fewer registers
   (also, it may allow us to combine chains together).  */

static void
add_looparound_copies (struct loop *loop, chain_p chain)
{
  unsigned i;
  dref ref, root = get_chain_root (chain);
  gphi *phi;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    {
      phi = find_looparound_phi (loop, ref, root);
      if (!phi)
	continue;

      bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
      insert_looparound_copy (chain, ref, phi);
    }
}
/* Find roots of the values and determine distances in the component COMP.
   The references are redistributed into CHAINS.  LOOP is the current
   loop.  */

static void
determine_roots_comp (struct loop *loop,
		      struct component *comp,
		      vec<chain_p> *chains)
{
  unsigned i;
  dref a;
  chain_p chain = NULL;
  widest_int last_ofs = 0;

  /* Invariants are handled specially.  */
  if (comp->comp_step == RS_INVARIANT)
    {
      chain = make_invariant_chain (comp);
      chains->safe_push (chain);
      return;
    }

  comp->refs.qsort (order_drefs);

  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      if (!chain || DR_IS_WRITE (a->ref)
	  || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
	{
	  if (nontrivial_chain_p (chain))
	    {
	      add_looparound_copies (loop, chain);
	      chains->safe_push (chain);
	    }
	  else
	    release_chain (chain);
	  chain = make_rooted_chain (a);
	  last_ofs = a->offset;
	  continue;
	}

      add_ref_to_chain (chain, a);
    }

  if (nontrivial_chain_p (chain))
    {
      add_looparound_copies (loop, chain);
      chains->safe_push (chain);
    }
  else
    release_chain (chain);
}
/* Find roots of the values and determine distances in components COMPS, and
   separate the references into CHAINS.  LOOP is the current loop.  */

static void
determine_roots (struct loop *loop,
		 struct component *comps, vec<chain_p> *chains)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    determine_roots_comp (loop, comp, chains);
}
/* Replace the reference in statement STMT with temporary variable
   NEW_TREE.  If SET is true, NEW_TREE is instead initialized to the value of
   the reference in the statement.  IN_LHS is true if the reference
   is in the lhs of STMT, false if it is in rhs.  */

static void
replace_ref_with (gimple *stmt, tree new_tree, bool set, bool in_lhs)
{
  tree val;
  gassign *new_stmt;
  gimple_stmt_iterator bsi, psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      gcc_assert (!in_lhs && !set);

      val = PHI_RESULT (stmt);
      bsi = gsi_after_labels (gimple_bb (stmt));
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, false);

      /* Turn the phi node into GIMPLE_ASSIGN.  */
      new_stmt = gimple_build_assign (val, new_tree);
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
      return;
    }

  /* Since the reference is of gimple_reg type, it should only
     appear as lhs or rhs of a modify statement.  */
  gcc_assert (is_gimple_assign (stmt));

  bsi = gsi_for_stmt (stmt);

  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
  if (!set)
    {
      gcc_assert (!in_lhs);
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
      stmt = gsi_stmt (bsi);
      update_stmt (stmt);
      return;
    }

  if (in_lhs)
    {
      /* We have statement

	 OLD = VAL

	 If OLD is a memory reference, then VAL is gimple_val, and we transform
	 this to

	 OLD = VAL
	 NEW = VAL

	 Otherwise, we are replacing a combination chain,
	 VAL is the expression that performs the combination, and OLD is an
	 SSA name.  In this case, we transform the assignment to

	 OLD = VAL
	 NEW = OLD  */

      val = gimple_assign_lhs (stmt);
      if (TREE_CODE (val) != SSA_NAME)
	{
	  val = gimple_assign_rhs1 (stmt);
	  gcc_assert (gimple_assign_single_p (stmt));
	  if (TREE_CLOBBER_P (val))
	    val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
	  else
	    gcc_assert (gimple_assign_copy_p (stmt));
	}
    }
  else
    {
      /* VAL = OLD

	 is transformed to

	 VAL = OLD
	 NEW = VAL  */

      val = gimple_assign_lhs (stmt);
    }

  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
}
/* Returns a memory reference to DR in the ITER-th iteration of
   the loop it was analyzed in.  Append init stmts to STMTS.  */

static tree
ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
{
  tree off = DR_OFFSET (dr);
  tree coff = DR_INIT (dr);
  if (iter == 0)
    ;
  else if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST)
    coff = size_binop (PLUS_EXPR, coff,
		       size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
  else
    off = size_binop (PLUS_EXPR, off,
		      size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
  tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
  addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
				 is_gimple_mem_ref_addr, NULL_TREE);
  tree alias_ptr = fold_convert (reference_alias_ptr_type (DR_REF (dr)), coff);
  tree type = build_aligned_type (TREE_TYPE (DR_REF (dr)),
				  get_object_alignment (DR_REF (dr)));
  /* While data-ref analysis punts on bit offsets it still handles
     bitfield accesses at byte boundaries.  Cope with that.  Note that
     we cannot simply re-apply the outer COMPONENT_REF because the
     byte-granular portion of it is already applied via DR_INIT and
     DR_OFFSET, so simply build a BIT_FIELD_REF knowing that the bits
     start at offset zero.  */
  if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
      && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1)))
    {
      tree field = TREE_OPERAND (DR_REF (dr), 1);
      return build3 (BIT_FIELD_REF, TREE_TYPE (DR_REF (dr)),
		     build2 (MEM_REF, type, addr, alias_ptr),
		     DECL_SIZE (field), bitsize_zero_node);
    }
  else
    return fold_build2 (MEM_REF, type, addr, alias_ptr);
}
/* Get the initialization expression for the INDEX-th temporary variable
   of CHAIN.  */

static tree
get_init_expr (chain_p chain, unsigned index)
{
  if (chain->type == CT_COMBINATION)
    {
      tree e1 = get_init_expr (chain->ch1, index);
      tree e2 = get_init_expr (chain->ch2, index);

      return fold_build2 (chain->op, chain->rslt_type, e1, e2);
    }
  else
    return chain->inits[index];
}
/* Returns a new temporary variable used for the I-th variable carrying
   the value of REF.  The variable's uid is marked in TMP_VARS.  */

static tree
predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
{
  tree type = TREE_TYPE (ref);
  /* We never access the components of the temporary variable in predictive
     commoning.  */
  tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
  bitmap_set_bit (tmp_vars, DECL_UID (var));
  return var;
}
/* Creates the variables for CHAIN, as well as phi nodes for them and
   initialization on entry to LOOP.  Uids of the newly created
   temporary variables are marked in TMP_VARS.  */

static void
initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  unsigned i;
  unsigned n = chain->length;
  dref root = get_chain_root (chain);
  bool reuse_first = !chain->has_max_use_after;
  tree ref, init, var, next;
  gphi *phi;
  gimple_seq stmts;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* If N == 0, then all the references are within the single iteration.  And
     since this is a nonempty chain, reuse_first cannot be true.  */
  gcc_assert (n > 0 || !reuse_first);

  chain->vars.create (n + 1);

  if (chain->type == CT_COMBINATION)
    ref = gimple_assign_lhs (root->stmt);
  else
    ref = DR_REF (root->ref);

  for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
    {
      var = predcom_tmp_var (ref, i, tmp_vars);
      chain->vars.quick_push (var);
    }
  if (reuse_first)
    chain->vars.quick_push (chain->vars[0]);

  FOR_EACH_VEC_ELT (chain->vars, i, var)
    chain->vars[i] = make_ssa_name (var);

  for (i = 0; i < n; i++)
    {
      var = chain->vars[i];
      next = chain->vars[i + 1];
      init = get_init_expr (chain, i);

      init = force_gimple_operand (init, &stmts, true, NULL_TREE);
      if (stmts)
	gsi_insert_seq_on_edge_immediate (entry, stmts);

      phi = create_phi_node (var, loop->header);
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
}
/* Create the variables and initialization statement for the root of chain
   CHAIN.  Uids of the newly created temporary variables are marked
   in TMP_VARS.  */

static void
initialize_root (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  dref root = get_chain_root (chain);
  bool in_lhs = (chain->type == CT_STORE_LOAD
		 || chain->type == CT_COMBINATION);

  initialize_root_vars (loop, chain, tmp_vars);
  replace_ref_with (root->stmt,
		    chain->vars[chain->length],
		    true, in_lhs);
}
/* Initializes a variable for load motion for ROOT and prepares phi nodes and
   initialization on entry to LOOP if necessary.  The ssa name for the variable
   is stored in VARS.  If WRITTEN is true, also a phi node to copy its value
   around the loop is created.  Uid of the newly created temporary variable
   is marked in TMP_VARS.  INITS is the list containing the (single)
   initializer.  */

static void
initialize_root_vars_lm (struct loop *loop, dref root, bool written,
			 vec<tree> *vars, vec<tree> inits,
			 bitmap tmp_vars)
{
  unsigned i;
  tree ref = DR_REF (root->ref), init, var, next;
  gimple_seq stmts;
  gphi *phi;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* Find the initializer for the variable, and check that it cannot
     trap.  */
  init = inits[0];

  vars->create (written ? 2 : 1);
  var = predcom_tmp_var (ref, 0, tmp_vars);
  vars->quick_push (var);
  if (written)
    vars->quick_push ((*vars)[0]);

  FOR_EACH_VEC_ELT (*vars, i, var)
    (*vars)[i] = make_ssa_name (var);

  var = (*vars)[0];

  init = force_gimple_operand (init, &stmts, written, NULL_TREE);
  if (stmts)
    gsi_insert_seq_on_edge_immediate (entry, stmts);

  if (written)
    {
      next = (*vars)[1];
      phi = create_phi_node (var, loop->header);
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
  else
    {
      gassign *init_stmt = gimple_build_assign (var, init);
      gsi_insert_on_edge_immediate (entry, init_stmt);
    }
}
/* Execute load motion for references in chain CHAIN.  Uids of the newly
   created temporary variables are marked in TMP_VARS.  */

static void
execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  auto_vec<tree> vars;
  dref a;
  unsigned n_writes = 0, ridx, i;
  tree var;

  gcc_assert (chain->type == CT_INVARIANT);
  gcc_assert (!chain->combined);
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    if (DR_IS_WRITE (a->ref))
      n_writes++;

  /* If there are no reads in the loop, there is nothing to do.  */
  if (n_writes == chain->refs.length ())
    return;

  initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
			   &vars, chain->inits, tmp_vars);

  ridx = 0;
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    {
      bool is_read = DR_IS_READ (a->ref);

      if (DR_IS_WRITE (a->ref))
	{
	  n_writes--;
	  if (n_writes)
	    {
	      var = vars[0];
	      var = make_ssa_name (SSA_NAME_VAR (var));
	      vars[0] = var;
	    }
	  else
	    ridx = 1;
	}

      replace_ref_with (a->stmt, vars[ridx],
			!is_read, !is_read);
    }
}
/* Returns the single statement in which NAME is used, excluding
   the looparound phi nodes contained in one of the chains.  If there is no
   such statement, or more than one such statement, NULL is returned.  */

static gimple *
single_nonlooparound_use (tree name)
{
  use_operand_p use;
  imm_use_iterator it;
  gimple *stmt, *ret = NULL;

  FOR_EACH_IMM_USE_FAST (use, it, name)
    {
      stmt = USE_STMT (use);

      if (gimple_code (stmt) == GIMPLE_PHI)
	{
	  /* Ignore uses in looparound phi nodes.  Uses in other phi nodes
	     could not be processed anyway, so just fail for them.  */
	  if (bitmap_bit_p (looparound_phis,
			    SSA_NAME_VERSION (PHI_RESULT (stmt))))
	    continue;

	  return NULL;
	}
      else if (is_gimple_debug (stmt))
	continue;
      else if (ret != NULL)
	return NULL;
      else
	ret = stmt;
    }

  return ret;
}
/* Remove statement STMT, as well as the chain of assignments in which it is
   used.  */

static void
remove_stmt (gimple *stmt)
{
  tree name;
  gimple *next;
  gimple_stmt_iterator psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      name = PHI_RESULT (stmt);
      next = single_nonlooparound_use (name);
      reset_debug_uses (stmt);
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, true);

      if (!next
	  || !gimple_assign_ssa_name_copy_p (next)
	  || gimple_assign_rhs1 (next) != name)
	return;

      stmt = next;
    }

  while (1)
    {
      gimple_stmt_iterator bsi;

      bsi = gsi_for_stmt (stmt);

      name = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (name) == SSA_NAME);

      next = single_nonlooparound_use (name);
      reset_debug_uses (stmt);

      unlink_stmt_vdef (stmt);
      gsi_remove (&bsi, true);
      release_defs (stmt);

      if (!next
	  || !gimple_assign_ssa_name_copy_p (next)
	  || gimple_assign_rhs1 (next) != name)
	return;

      stmt = next;
    }
}
/* Perform the predictive commoning optimization for a chain CHAIN.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning_chain (struct loop *loop, chain_p chain,
			      bitmap tmp_vars)
{
  unsigned i;
  dref a;
  tree var;

  if (chain->combined)
    {
      /* For combined chains, just remove the statements that are used to
	 compute the values of the expression (except for the root one).
	 We delay this until after all chains are processed.  */
    }
  else
    {
      /* For non-combined chains, set up the variables that hold its value,
	 and replace the uses of the original references by these
	 variables.  */
      initialize_root (loop, chain, tmp_vars);
      for (i = 1; chain->refs.iterate (i, &a); i++)
	{
	  var = chain->vars[chain->length - a->distance];
	  replace_ref_with (a->stmt, var, false, false);
	}
    }
}
/* Determines the unroll factor necessary to remove as many temporary variable
   copies as possible.  CHAINS is the list of chains that will be
   optimized.  */

static unsigned
determine_unroll_factor (vec<chain_p> chains)
{
  chain_p chain;
  unsigned factor = 1, af, nfactor, i;
  unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);

  FOR_EACH_VEC_ELT (chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
	continue;

      if (chain->combined)
	{
	  /* For combined chains, we can't handle unrolling if we replace
	     looparound PHIs.  */
	  dref a;
	  unsigned j;
	  for (j = 1; chain->refs.iterate (j, &a); j++)
	    if (gimple_code (a->stmt) == GIMPLE_PHI)
	      return 1;
	  continue;
	}

      /* The best unroll factor for this chain is equal to the number of
	 temporary variables that we create for it.  */
      af = chain->length;
      if (chain->has_max_use_after)
	af++;

      nfactor = factor * af / gcd (factor, af);
      if (nfactor <= max)
	factor = nfactor;
    }

  return factor;
}
/* Perform the predictive commoning optimization for CHAINS.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */

static void
execute_pred_commoning (struct loop *loop, vec<chain_p> chains,
			bitmap tmp_vars)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
	execute_load_motion (loop, chain, tmp_vars);
      else
	execute_pred_commoning_chain (loop, chain, tmp_vars);
    }

  FOR_EACH_VEC_ELT (chains, i, chain)
    {
      if (chain->type == CT_INVARIANT)
	;
      else if (chain->combined)
	{
	  /* For combined chains, just remove the statements that are used to
	     compute the values of the expression (except for the root one).  */
	  dref a;
	  unsigned j;
	  for (j = 1; chain->refs.iterate (j, &a); j++)
	    remove_stmt (a->stmt);
	}
    }

  update_ssa (TODO_update_ssa_only_virtuals);
}
/* For each reference in CHAINS, if its defining statement is
   a phi node, record the ssa name that is defined by it.  */

static void
replace_phis_by_defined_names (vec<chain_p> chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chains, i, chain)
    FOR_EACH_VEC_ELT (chain->refs, j, a)
      {
	if (gimple_code (a->stmt) == GIMPLE_PHI)
	  {
	    a->name_defined_by_phi = PHI_RESULT (a->stmt);
	    a->stmt = NULL;
	  }
      }
}
/* For each reference in CHAINS, if name_defined_by_phi is not
   NULL, use it to set the stmt field.  */

static void
replace_names_by_phis (vec<chain_p> chains)
{
  chain_p chain;
  dref a;
  unsigned i, j;

  FOR_EACH_VEC_ELT (chains, i, chain)
    FOR_EACH_VEC_ELT (chain->refs, j, a)
      if (a->stmt == NULL)
	{
	  a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
	  gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
	  a->name_defined_by_phi = NULL_TREE;
	}
}
/* Wrapper over execute_pred_commoning, to pass it as a callback
   to tree_transform_and_unroll_loop.  */

struct epcc_data
{
  vec<chain_p> chains;
  bitmap tmp_vars;
};

static void
execute_pred_commoning_cbck (struct loop *loop, void *data)
{
  struct epcc_data *const dta = (struct epcc_data *) data;

  /* Restore phi nodes that were replaced by ssa names before
     tree_transform_and_unroll_loop (see detailed description in
     tree_predictive_commoning_loop).  */
  replace_names_by_phis (dta->chains);
  execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
}
/* Base NAME and all the names in the chain of phi nodes that use it
   on variable VAR.  The phi nodes are recognized by being in the copies of
   the header of the LOOP.  */

static void
base_names_in_chain_on (struct loop *loop, tree name, tree var)
{
  gimple *stmt, *phi;
  imm_use_iterator iter;

  replace_ssa_name_symbol (name, var);

  while (1)
    {
      phi = NULL;
      FOR_EACH_IMM_USE_STMT (stmt, iter, name)
	{
	  if (gimple_code (stmt) == GIMPLE_PHI
	      && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
	    {
	      phi = stmt;
	      BREAK_FROM_IMM_USE_STMT (iter);
	    }
	}
      if (!phi)
	return;

      name = PHI_RESULT (phi);
      replace_ssa_name_symbol (name, var);
    }
}
/* Given an unrolled LOOP after predictive commoning, remove the
   register copies arising from phi nodes by changing the base
   variables of SSA names.  TMP_VARS is the set of the temporary variables
   for which we want to perform this.  */

static void
eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
{
  edge e;
  gphi *phi;
  gimple *stmt;
  tree name, use, var;
  gphi_iterator psi;

  e = loop_latch_edge (loop);
  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = psi.phi ();
      name = PHI_RESULT (phi);
      var = SSA_NAME_VAR (name);
      if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
	continue;
      use = PHI_ARG_DEF_FROM_EDGE (phi, e);
      gcc_assert (TREE_CODE (use) == SSA_NAME);

      /* Base all the ssa names in the ud and du chain of NAME on VAR.  */
      stmt = SSA_NAME_DEF_STMT (use);
      while (gimple_code (stmt) == GIMPLE_PHI
	     /* In case we could not unroll the loop enough to eliminate
		all copies, we may reach the loop header before the defining
		statement (in that case, some register copies will be present
		in the loop latch in the final code, corresponding to the newly
		created looparound phi nodes).  */
	     && gimple_bb (stmt) != loop->header)
	{
	  gcc_assert (single_pred_p (gimple_bb (stmt)));
	  use = PHI_ARG_DEF (stmt, 0);
	  stmt = SSA_NAME_DEF_STMT (use);
	}

      base_names_in_chain_on (loop, use, var);
    }
}
/* Returns true if CHAIN is suitable to be combined.  */

static bool
chain_can_be_combined_p (chain_p chain)
{
  return (!chain->combined
	  && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
}
/* Returns the modify statement that uses NAME.  Skips over assignment
   statements; NAME is replaced with the actual name used in the returned
   statement.  */

static gimple *
find_use_stmt (tree *name)
{
  gimple *stmt;
  tree rhs, lhs;

  /* Skip over assignments.  */
  while (1)
    {
      stmt = single_nonlooparound_use (*name);
      if (!stmt)
	return NULL;

      if (gimple_code (stmt) != GIMPLE_ASSIGN)
	return NULL;

      lhs = gimple_assign_lhs (stmt);
      if (TREE_CODE (lhs) != SSA_NAME)
	return NULL;

      if (gimple_assign_copy_p (stmt))
	{
	  rhs = gimple_assign_rhs1 (stmt);
	  if (rhs != *name)
	    return NULL;

	  *name = lhs;
	}
      else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
	       == GIMPLE_BINARY_RHS)
	return stmt;
      else
	return NULL;
    }
}
/* Returns true if we may perform reassociation for operation CODE in TYPE.  */

static bool
may_reassociate_p (tree type, enum tree_code code)
{
  if (FLOAT_TYPE_P (type)
      && !flag_unsafe_math_optimizations)
    return false;

  return (commutative_tree_code (code)
	  && associative_tree_code (code));
}
/* If the operation used in STMT is associative and commutative, go through the
   tree of the same operations and return its root.  Distance to the root
   is stored in DISTANCE.  */

static gimple *
find_associative_operation_root (gimple *stmt, unsigned *distance)
{
  tree lhs;
  gimple *next;
  enum tree_code code = gimple_assign_rhs_code (stmt);
  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
  unsigned dist = 0;

  if (!may_reassociate_p (type, code))
    return NULL;

  while (1)
    {
      lhs = gimple_assign_lhs (stmt);
      gcc_assert (TREE_CODE (lhs) == SSA_NAME);

      next = find_use_stmt (&lhs);
      if (!next
	  || gimple_assign_rhs_code (next) != code)
	break;

      stmt = next;
      dist++;
    }

  if (distance)
    *distance = dist;
  return stmt;
}
/* Returns the common statement in which NAME1 and NAME2 have a use.  If there
   is no such statement, returns NULL_TREE.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.  */

static gimple *
find_common_use_stmt (tree *name1, tree *name2)
{
  gimple *stmt1, *stmt2;

  stmt1 = find_use_stmt (name1);
  if (!stmt1)
    return NULL;

  stmt2 = find_use_stmt (name2);
  if (!stmt2)
    return NULL;

  if (stmt1 == stmt2)
    return stmt1;

  stmt1 = find_associative_operation_root (stmt1, NULL);
  if (!stmt1)
    return NULL;
  stmt2 = find_associative_operation_root (stmt2, NULL);
  if (!stmt2)
    return NULL;

  return (stmt1 == stmt2 ? stmt1 : NULL);
}
/* Checks whether R1 and R2 are combined together using CODE, with the result
   in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
   if it is true.  If CODE is ERROR_MARK, set these values instead.  */

static bool
combinable_refs_p (dref r1, dref r2,
		   enum tree_code *code, bool *swap, tree *rslt_type)
{
  enum tree_code acode;
  bool aswap;
  tree atype;
  tree name1, name2;
  gimple *stmt;

  name1 = name_for_ref (r1);
  name2 = name_for_ref (r2);
  gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);

  stmt = find_common_use_stmt (&name1, &name2);

  if (!stmt
      /* A simple post-dominance check - make sure the combination
	 is executed under the same condition as the references.  */
      || (gimple_bb (stmt) != gimple_bb (r1->stmt)
	  && gimple_bb (stmt) != gimple_bb (r2->stmt)))
    return false;

  acode = gimple_assign_rhs_code (stmt);
  aswap = (!commutative_tree_code (acode)
	   && gimple_assign_rhs1 (stmt) != name1);
  atype = TREE_TYPE (gimple_assign_lhs (stmt));

  if (*code == ERROR_MARK)
    {
      *code = acode;
      *swap = aswap;
      *rslt_type = atype;
      return true;
    }

  return (*code == acode
	  && *swap == aswap
	  && *rslt_type == atype);
}
/* Remove OP from the operation on the rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */

static void
remove_name_from_operation (gimple *stmt, tree op)
{
  tree other_op;
  gimple_stmt_iterator si;

  gcc_assert (is_gimple_assign (stmt));

  if (gimple_assign_rhs1 (stmt) == op)
    other_op = gimple_assign_rhs2 (stmt);
  else
    other_op = gimple_assign_rhs1 (stmt);

  si = gsi_for_stmt (stmt);
  gimple_assign_set_rhs_from_tree (&si, other_op);

  /* We should not have reallocated STMT.  */
  gcc_assert (gsi_stmt (si) == stmt);

  update_stmt (stmt);
}
/* Reassociates the expression in which NAME1 and NAME2 are used so that they
   are combined in a single statement, and returns this statement.  */

static gimple *
reassociate_to_the_same_stmt (tree name1, tree name2)
{
  gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
  gassign *new_stmt, *tmp_stmt;
  tree new_name, tmp_name, var, r1, r2;
  unsigned dist1, dist2;
  enum tree_code code;
  tree type = TREE_TYPE (name1);
  gimple_stmt_iterator bsi;

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  root1 = find_associative_operation_root (stmt1, &dist1);
  root2 = find_associative_operation_root (stmt2, &dist2);
  code = gimple_assign_rhs_code (stmt1);

  gcc_assert (root1 && root2 && root1 == root2
	      && code == gimple_assign_rhs_code (stmt2));

  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
  r1 = name1;
  s1 = stmt1;
  r2 = name2;
  s2 = stmt2;

  while (dist1 > dist2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      dist1--;
    }
  while (dist2 > dist1)
    {
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
      dist2--;
    }

  while (s1 != s2)
    {
      s1 = find_use_stmt (&r1);
      r1 = gimple_assign_lhs (s1);
      s2 = find_use_stmt (&r2);
      r2 = gimple_assign_lhs (s2);
    }

  /* Remove NAME1 and NAME2 from the statements in which they are currently
     used.  */
  remove_name_from_operation (stmt1, name1);
  remove_name_from_operation (stmt2, name2);

  /* Insert the new statement combining NAME1 and NAME2 before S1, and
     combine it with the rhs of S1.  */
  var = create_tmp_reg (type, "predreastmp");
  new_name = make_ssa_name (var);
  new_stmt = gimple_build_assign (new_name, code, name1, name2);

  var = create_tmp_reg (type, "predreastmp");
  tmp_name = make_ssa_name (var);

  /* Rhs of S1 may now be either a binary expression with operation
     CODE, or gimple_val (in case that stmt1 == s1 or stmt2 == s1,
     so that name1 or name2 was removed from it).  */
  tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
				  gimple_assign_rhs1 (s1),
				  gimple_assign_rhs2 (s1));

  bsi = gsi_for_stmt (s1);
  gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
  s1 = gsi_stmt (bsi);
  update_stmt (s1);

  gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
  gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);

  return new_stmt;
}
/* Returns the statement that combines references R1 and R2.  In case R1
   and R2 are not used in the same statement, but they are used with an
   associative and commutative operation in the same expression, reassociate
   the expression so that they are used in the same statement.  */

static gimple *
stmt_combining_refs (dref r1, dref r2)
{
  gimple *stmt1, *stmt2;
  tree name1 = name_for_ref (r1);
  tree name2 = name_for_ref (r2);

  stmt1 = find_use_stmt (&name1);
  stmt2 = find_use_stmt (&name2);
  if (stmt1 == stmt2)
    return stmt1;

  return reassociate_to_the_same_stmt (name1, name2);
}
/* Tries to combine chains CH1 and CH2 together.  If this succeeds, the
   description of the new chain is returned, otherwise we return NULL.  */

static chain_p
combine_chains (chain_p ch1, chain_p ch2)
{
  dref r1, r2, nw;
  enum tree_code op = ERROR_MARK;
  bool swap = false;
  chain_p new_chain;
  unsigned i;
  gimple *root_stmt;
  tree rslt_type = NULL_TREE;

  if (ch1 == ch2)
    return NULL;
  if (ch1->length != ch2->length)
    return NULL;

  if (ch1->refs.length () != ch2->refs.length ())
    return NULL;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       && ch2->refs.iterate (i, &r2)); i++)
    {
      if (r1->distance != r2->distance)
	return NULL;

      if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
	return NULL;
    }

  if (swap)
    std::swap (ch1, ch2);

  new_chain = XCNEW (struct chain);
  new_chain->type = CT_COMBINATION;
  new_chain->op = op;
  new_chain->ch1 = ch1;
  new_chain->ch2 = ch2;
  new_chain->rslt_type = rslt_type;
  new_chain->length = ch1->length;

  for (i = 0; (ch1->refs.iterate (i, &r1)
	       && ch2->refs.iterate (i, &r2)); i++)
    {
      nw = XCNEW (struct dref_d);
      nw->stmt = stmt_combining_refs (r1, r2);
      nw->distance = r1->distance;

      new_chain->refs.safe_push (nw);
    }

  new_chain->has_max_use_after = false;
  root_stmt = get_chain_root (new_chain)->stmt;
  for (i = 1; new_chain->refs.iterate (i, &nw); i++)
    {
      if (nw->distance == new_chain->length
	  && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
	{
	  new_chain->has_max_use_after = true;
	  break;
	}
    }

  ch1->combined = true;
  ch2->combined = true;
  return new_chain;
}
/* Try to combine the CHAINS.  */

static void
try_combine_chains (vec<chain_p> *chains)
{
  unsigned i, j;
  chain_p ch1, ch2, cch;
  auto_vec<chain_p> worklist;

  FOR_EACH_VEC_ELT (*chains, i, ch1)
    if (chain_can_be_combined_p (ch1))
      worklist.safe_push (ch1);

  while (!worklist.is_empty ())
    {
      ch1 = worklist.pop ();
      if (!chain_can_be_combined_p (ch1))
	continue;

      FOR_EACH_VEC_ELT (*chains, j, ch2)
	{
	  if (!chain_can_be_combined_p (ch2))
	    continue;

	  cch = combine_chains (ch1, ch2);
	  if (cch)
	    {
	      worklist.safe_push (cch);
	      chains->safe_push (cch);
	      break;
	    }
	}
    }
}
/* Prepare initializers for CHAIN in LOOP.  Returns false if this is
   impossible because one of these initializers may trap, true otherwise.  */

static bool
prepare_initializers_chain (struct loop *loop, chain_p chain)
{
  unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
  struct data_reference *dr = get_chain_root (chain)->ref;
  tree init;
  dref laref;
  edge entry = loop_preheader_edge (loop);

  /* Find the initializers for the variables, and check that they cannot
     trap.  */
  chain->inits.create (n);
  for (i = 0; i < n; i++)
    chain->inits.quick_push (NULL_TREE);

  /* If we have replaced some looparound phi nodes, use their initializers
     instead of creating our own.  */
  FOR_EACH_VEC_ELT (chain->refs, i, laref)
    {
      if (gimple_code (laref->stmt) != GIMPLE_PHI)
	continue;

      gcc_assert (laref->distance > 0);
      chain->inits[n - laref->distance]
	= PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
    }

  for (i = 0; i < n; i++)
    {
      gimple_seq stmts = NULL;

      if (chain->inits[i] != NULL_TREE)
	continue;

      init = ref_at_iteration (dr, (int) i - n, &stmts);
      if (!chain->all_always_accessed && tree_could_trap_p (init))
	{
	  gimple_seq_discard (stmts);
	  return false;
	}

      if (stmts)
	gsi_insert_seq_on_edge_immediate (entry, stmts);

      chain->inits[i] = init;
    }

  return true;
}
/* Prepare initializers for CHAINS in LOOP, and free chains that cannot
   be used because the initializers might trap.  */

static void
prepare_initializers (struct loop *loop, vec<chain_p> chains)
{
  chain_p chain;
  unsigned i;

  for (i = 0; i < chains.length (); )
    {
      chain = chains[i];
      if (prepare_initializers_chain (loop, chain))
	i++;
      else
	{
	  release_chain (chain);
	  chains.unordered_remove (i);
	}
    }
}
/* Performs predictive commoning for LOOP.  Returns true if LOOP was
   unrolled.  */

static bool
tree_predictive_commoning_loop (struct loop *loop)
{
  vec<data_reference_p> datarefs;
  vec<ddr_p> dependences;
  struct component *components;
  vec<chain_p> chains = vNULL;
  unsigned unroll_factor;
  struct tree_niter_desc desc;
  bool unroll = false;
  edge exit;
  bitmap tmp_vars;

  if (dump_file && (dump_flags & TDF_DETAILS))
    fprintf (dump_file, "Processing loop %d\n", loop->num);

  /* Find the data references and split them into components according to their
     dependence relations.  */
  auto_vec<loop_p, 3> loop_nest;
  dependences.create (10);
  datarefs.create (10);
  if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
					   &dependences))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Cannot analyze data dependencies\n");
      free_data_refs (datarefs);
      free_dependence_relations (dependences);
      return false;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_data_dependence_relations (dump_file, dependences);

  components = split_data_refs_to_components (loop, datarefs, dependences);
  loop_nest.release ();
  free_dependence_relations (dependences);
  if (!components)
    {
      free_data_refs (datarefs);
      free_affine_expand_cache (&name_expansions);
      return false;
    }

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Initial state:\n\n");
      dump_components (dump_file, components);
    }

  /* Find the suitable components and split them into chains.  */
  components = filter_suitable_components (loop, components);

  tmp_vars = BITMAP_ALLOC (NULL);
  looparound_phis = BITMAP_ALLOC (NULL);
  determine_roots (loop, components, &chains);
  release_components (components);

  if (!chains.exists ())
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Predictive commoning failed: no suitable chains\n");
      goto end;
    }
  prepare_initializers (loop, chains);

  /* Try to combine the chains that are always worked with together.  */
  try_combine_chains (&chains);

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "Before commoning:\n\n");
      dump_chains (dump_file, chains);
    }

  /* Determine the unroll factor, and if the loop should be unrolled, ensure
     that its number of iterations is divisible by the factor.  */
  unroll_factor = determine_unroll_factor (chains);
  scev_reset ();
  unroll = (unroll_factor > 1
	    && can_unroll_loop_p (loop, unroll_factor, &desc));
  exit = single_dom_exit (loop);

  /* Execute the predictive commoning transformations, and possibly unroll the
     loop.  */
  if (unroll)
    {
      struct epcc_data dta;

      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);

      dta.chains = chains;
      dta.tmp_vars = tmp_vars;

      update_ssa (TODO_update_ssa_only_virtuals);

      /* Cfg manipulations performed in tree_transform_and_unroll_loop before
	 execute_pred_commoning_cbck is called may cause phi nodes to be
	 reallocated, which is a problem since CHAINS may point to these
	 statements.  To fix this, we store the ssa names defined by the
	 phi nodes here instead of the phi nodes themselves, and restore
	 the phi nodes in execute_pred_commoning_cbck.  A bit hacky.  */
      replace_phis_by_defined_names (chains);

      tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
				      execute_pred_commoning_cbck, &dta);
      eliminate_temp_copies (loop, tmp_vars);
    }
  else
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	fprintf (dump_file,
		 "Executing predictive commoning without unrolling.\n");
      execute_pred_commoning (loop, chains, tmp_vars);
    }

end: ;
  release_chains (chains);
  free_data_refs (datarefs);
  BITMAP_FREE (tmp_vars);
  BITMAP_FREE (looparound_phis);

  free_affine_expand_cache (&name_expansions);

  return unroll;
}
/* Runs predictive commoning.  */

unsigned
tree_predictive_commoning (void)
{
  bool unrolled = false;
  struct loop *loop;
  unsigned ret = 0;

  initialize_original_copy_tables ();
  FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
    if (optimize_loop_for_speed_p (loop))
      {
	unrolled |= tree_predictive_commoning_loop (loop);
      }

  if (unrolled)
    {
      scev_reset ();
      ret = TODO_cleanup_cfg;
    }
  free_original_copy_tables ();

  return ret;
}
/* Predictive commoning pass.  */

static unsigned
run_tree_predictive_commoning (struct function *fun)
{
  if (number_of_loops (fun) <= 1)
    return 0;

  return tree_predictive_commoning ();
}
namespace {

const pass_data pass_data_predcom =
{
  GIMPLE_PASS, /* type */
  "pcom", /* name */
  OPTGROUP_LOOP, /* optinfo_flags */
  TV_PREDCOM, /* tv_id */
  PROP_cfg, /* properties_required */
  0, /* properties_provided */
  0, /* properties_destroyed */
  0, /* todo_flags_start */
  TODO_update_ssa_only_virtuals, /* todo_flags_finish */
};

class pass_predcom : public gimple_opt_pass
{
public:
  pass_predcom (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_predcom, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool gate (function *) { return flag_predictive_commoning != 0; }
  virtual unsigned int execute (function *fun)
    {
      return run_tree_predictive_commoning (fun);
    }

}; // class pass_predcom

} // anon namespace

gimple_opt_pass *
make_pass_predcom (gcc::context *ctxt)
{
  return new pass_predcom (ctxt);
}