[official-gcc.git] / gcc / tree-predcom.c
/* Predictive commoning.
   Copyright (C) 2005-2017 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */
/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}
   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
         must all have the same step,
      b) the references must dominate the loop latch (and thus, they must be
         ordered by dominance relation),
      c) the distance of the indices must be a small multiple of the step.
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require a lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}
   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.
   5) For each root reference (end of the chain) R, let N be maximum distance
      of a reference reusing its value.  Variables R0 up to RN are created,
      together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put to SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
        {
          f = phi (a[0], s);
          s = phi (a[1], f);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;
        }
   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for which we avoided
      creating RN).  If F and the loop are small enough, the loop is unrolled
      F times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is

      for (i = 0; i < ...; i += 2)
        {
          f = phi (a[0], f);
          s = phi (a[1], s);
          x = phi (b[10], x);

          f = f + s;
          a[i+2] = f;
          x = x + i;
          b[10] = x;

          s = s + f;
          a[i+3] = s;
          x = x + i;
          b[10] = x;
        }
   Apart from predictive commoning on Load-Load and Store-Load chains, we
   also support Store-Store chains -- stores killed by another store can be
   eliminated.  Given the example below:

     for (i = 0; i < n; i++)
       {
         a[i] = 1;
         a[i+2] = 2;
       }

   It can be replaced with:

     t0 = a[0];
     t1 = a[1];
     for (i = 0; i < n; i++)
       {
         a[i] = 1;
         t2 = 2;
         t0 = t1;
         t1 = t2;
       }
     a[n] = t0;
     a[n+1] = t1;

   If the loop runs more than 1 iteration, it can be further simplified into:

     for (i = 0; i < n; i++)
       {
         a[i] = 1;
       }
     a[n] = 2;
     a[n+1] = 2;

   The interesting part is that this can be viewed either as general store
   motion or as general dead store elimination working both intra- and
   inter-iteration.

   TODO: For now, we don't support store-store chains in multi-exit loops.
   We also force the loop not to be unrolled when a store-store chain is
   present, even if other chains might ask for unrolling.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "predict.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "alias.h"
#include "fold-const.h"
#include "cfgloop.h"
#include "tree-eh.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "params.h"
#include "tree-affine.h"
#include "builtins.h"

/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef struct dref_d
{
  /* The reference itself.  */
  struct data_reference *ref;

  /* The statement in that the reference appears.  */
  gimple *stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  widest_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;
/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* Root of the chain is store, the rest are loads.  */
  CT_STORE_LOAD,

  /* There are only stores in the chain.  */
  CT_STORE_STORE,

  /* A combination of two chains.  */
  CT_COMBINATION
};
/* Chains of data references.  */

typedef struct chain
{
  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  vec<dref> refs;

  /* The maximum distance of the reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  vec<tree> vars;

  /* Initializers for the variables.  */
  vec<tree> inits;

  /* Finalizers for the eliminated stores.  */
  vec<tree> finis;

  /* gimple stmts initializing the initial variables of the chain.  */
  gimple_seq init_seq;

  /* gimple stmts finalizing the eliminated stores of the chain.  */
  gimple_seq fini_seq;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.  */
  unsigned combined : 1;

  /* True if this is a store elimination chain and the eliminated stores
     store a loop invariant value into memory.  */
  unsigned inv_store_elimination : 1;
} *chain_p;
/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};
/* Components of the data dependence graph.  */

struct component
{
  /* The references in the component.  */
  vec<dref> refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* True if all references in component are stores and we try to do
     intra/inter loop iteration dead store elimination.  */
  bool eliminate_store_p;

  /* Next component in the list.  */
  struct component *next;
};

/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */

static bitmap looparound_phis;

/* Cache used by tree_to_aff_combination_expand.  */

static hash_map<tree, name_expansion *> *name_expansions;
/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, "    ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
               DR_IS_READ (ref->ref) ? "" : ", write");

      fprintf (file, "      offset ");
      print_decs (ref->offset, file);
      fprintf (file, "\n");

      fprintf (file, "      distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
        fprintf (file, "    looparound ref\n");
      else
        fprintf (file, "    combination ref\n");
      fprintf (file, "      in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, "      distance %u\n", ref->distance);
    }
}
/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_STORE_STORE:
      chain_type = "Store-stores";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
           chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, "  max distance %u%s\n", chain->length,
             chain->has_max_use_after ? "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, "  equal to %p %s %p in type ",
               (void *) chain->ch1, op_symbol_code (chain->op),
               (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars.exists ())
    {
      fprintf (file, "  vars");
      FOR_EACH_VEC_ELT (chain->vars, i, var)
        {
          fprintf (file, " ");
          print_generic_expr (file, var, TDF_SLIM);
        }
      fprintf (file, "\n");
    }

  if (chain->inits.exists ())
    {
      fprintf (file, "  inits");
      FOR_EACH_VEC_ELT (chain->inits, i, var)
        {
          fprintf (file, " ");
          print_generic_expr (file, var, TDF_SLIM);
        }
      fprintf (file, "\n");
    }

  fprintf (file, "  references:\n");
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}
/* Dumps CHAINS to FILE.  */

extern void dump_chains (FILE *, vec<chain_p>);
void
dump_chains (FILE *file, vec<chain_p> chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
           comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}
/* Frees a chain CHAIN.  */

static void
release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    free (ref);

  chain->refs.release ();
  chain->vars.release ();
  chain->inits.release ();
  if (chain->init_seq)
    gimple_seq_discard (chain->init_seq);

  chain->finis.release ();
  if (chain->fini_seq)
    gimple_seq_discard (chain->fini_seq);

  free (chain);
}

/* Frees CHAINS.  */

static void
release_chains (vec<chain_p> chains)
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (chains, i, chain)
    release_chain (chain);
  chains.release ();
}

/* Frees a component COMP.  */

static void
release_component (struct component *comp)
{
  comp->refs.release ();
  free (comp);
}

/* Frees list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      release_component (act);
    }
}
/* Finds a root of tree given by FATHERS containing A, and performs path
   shortening.  */

static unsigned
component_of (unsigned fathers[], unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for DFU.  FATHERS gives the tree, SIZES are sizes of the
   components, A and B are components to merge.  */

static void
merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}
/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.  */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}

/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

static void
aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
                                  &name_expansions);
  aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}
/* Determines number of iterations of the innermost enclosing loop before B
   refers to exactly the same location as A and stores it to OFF.  If A and
   B do not have the same step, they never meet, or anything else fails,
   returns false, otherwise returns true.  Both A and B are assumed to
   satisfy suitable_reference_p.  */

static bool
determine_offset (struct data_reference *a, struct data_reference *b,
                  widest_int *off)
{
  aff_tree diff, baseb, step;
  tree typea, typeb;

  /* Check that both the references access the location in the same type.  */
  typea = TREE_TYPE (DR_REF (a));
  typeb = TREE_TYPE (DR_REF (b));
  if (!useless_type_conversion_p (typeb, typea))
    return false;

  /* Check whether the base address and the step of both references is the
     same.  */
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

  if (integer_zerop (DR_STEP (a)))
    {
      /* If the references have loop invariant address, check that they access
         exactly the same location.  */
      *off = 0;
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
              && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
    }

  /* Compare the offsets of the addresses, and check whether the difference
     is a multiple of step.  */
  aff_combination_dr_offset (a, &diff);
  aff_combination_dr_offset (b, &baseb);
  aff_combination_scale (&baseb, -1);
  aff_combination_add (&diff, &baseb);

  tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
                                  &step, &name_expansions);
  return aff_combination_constant_multiple_p (&diff, &step, off);
}
/* Returns the last basic block in LOOP for that we are sure that
   it is executed whenever the loop is entered.  */

static basic_block
last_always_executed_block (struct loop *loop)
{
  unsigned i;
  vec<edge> exits = get_loop_exit_edges (loop);
  edge ex;
  basic_block last = loop->latch;

  FOR_EACH_VEC_ELT (exits, i, ex)
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
  exits.release ();

  return last;
}
/* Splits dependence graph on DATAREFS described by DEPENDS to components.  */

static struct component *
split_data_refs_to_components (struct loop *loop,
                               vec<data_reference_p> datarefs,
                               vec<ddr_p> depends)
{
  unsigned i, n = datarefs.length ();
  unsigned ca, ia, ib, bad;
  unsigned *comp_father = XNEWVEC (unsigned, n + 1);
  unsigned *comp_size = XNEWVEC (unsigned, n + 1);
  struct component **comps;
  struct data_reference *dr, *dra, *drb;
  struct data_dependence_relation *ddr;
  struct component *comp_list = NULL, *comp;
  dref dataref;
  /* Don't do store elimination if loop has multiple exit edges.  */
  bool eliminate_store_p = single_exit (loop) != NULL;
  basic_block last_always_executed = last_always_executed_block (loop);

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      if (!DR_REF (dr))
        {
          /* A fake reference for call or asm_expr that may clobber memory;
             just fail.  */
          goto end;
        }
      /* predcom pass isn't prepared to handle calls with data references.  */
      if (is_gimple_call (DR_STMT (dr)))
        goto end;
      dr->aux = (void *) (size_t) i;
      comp_father[i] = i;
      comp_size[i] = 1;
    }

  /* A component reserved for the "bad" data references.  */
  comp_father[n] = n;
  comp_size[n] = 1;

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      enum ref_step_type dummy;

      if (!suitable_reference_p (dr, &dummy))
        {
          ia = (unsigned) (size_t) dr->aux;
          merge_comps (comp_father, comp_size, n, ia);
        }
    }

  FOR_EACH_VEC_ELT (depends, i, ddr)
    {
      widest_int dummy_off;

      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
        continue;

      dra = DDR_A (ddr);
      drb = DDR_B (ddr);

      /* Don't do store elimination if there is any unknown dependence for
         any store data reference.  */
      if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
          && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
              || DDR_NUM_DIST_VECTS (ddr) == 0))
        eliminate_store_p = false;

      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
      if (ia == ib)
        continue;

      bad = component_of (comp_father, n);

      /* If both A and B are reads, we may ignore unsuitable dependences.  */
      if (DR_IS_READ (dra) && DR_IS_READ (drb))
        {
          if (ia == bad || ib == bad
              || !determine_offset (dra, drb, &dummy_off))
            continue;
        }
      /* If A is read and B write or vice versa and there is unsuitable
         dependence, instead of merging both components into a component
         that will certainly not pass suitable_component_p, just put the
         read into bad component, perhaps at least the write together with
         all the other data refs in its component will be optimizable.  */
      else if (DR_IS_READ (dra) && ib != bad)
        {
          if (ia == bad)
            continue;
          else if (!determine_offset (dra, drb, &dummy_off))
            {
              merge_comps (comp_father, comp_size, bad, ia);
              continue;
            }
        }
      else if (DR_IS_READ (drb) && ia != bad)
        {
          if (ib == bad)
            continue;
          else if (!determine_offset (dra, drb, &dummy_off))
            {
              merge_comps (comp_father, comp_size, bad, ib);
              continue;
            }
        }
      else if (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)
               && ia != bad && ib != bad
               && !determine_offset (dra, drb, &dummy_off))
        {
          merge_comps (comp_father, comp_size, bad, ia);
          merge_comps (comp_father, comp_size, bad, ib);
          continue;
        }

      merge_comps (comp_father, comp_size, ia, ib);
    }

  if (eliminate_store_p)
    {
      tree niters = number_of_latch_executions (loop);

      /* Don't do store elimination if niters info is unknown because stores
         in the last iteration can't be eliminated and we need to recover it
         after loop.  */
      eliminate_store_p = (niters != NULL_TREE && niters != chrec_dont_know);
    }

  comps = XCNEWVEC (struct component *, n);
  bad = component_of (comp_father, n);
  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      ia = (unsigned) (size_t) dr->aux;
      ca = component_of (comp_father, ia);
      if (ca == bad)
        continue;

      comp = comps[ca];
      if (!comp)
        {
          comp = XCNEW (struct component);
          comp->refs.create (comp_size[ca]);
          comp->eliminate_store_p = eliminate_store_p;
          comps[ca] = comp;
        }

      dataref = XCNEW (struct dref_d);
      dataref->ref = dr;
      dataref->stmt = DR_STMT (dr);
      dataref->offset = 0;
      dataref->distance = 0;

      dataref->always_accessed
              = dominated_by_p (CDI_DOMINATORS, last_always_executed,
                                gimple_bb (dataref->stmt));
      dataref->pos = comp->refs.length ();
      comp->refs.quick_push (dataref);
      if (DR_IS_READ (dr))
        comp->eliminate_store_p = false;
    }

  for (i = 0; i < n; i++)
    {
      comp = comps[i];
      if (comp)
        {
          comp->next = comp_list;
          comp_list = comp;
        }
    }
  free (comps);

end:
  free (comp_father);
  free (comp_size);
  return comp_list;
}
/* Returns true if the component COMP satisfies the conditions
   described in 2) at the beginning of this file.  LOOP is the current
   loop.  */

static bool
suitable_component_p (struct loop *loop, struct component *comp)
{
  unsigned i;
  dref a, first;
  basic_block ba, bp = loop->header;
  bool ok, has_write = false;

  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      ba = gimple_bb (a->stmt);

      if (!just_once_each_iteration_p (loop, ba))
        return false;

      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
      bp = ba;

      if (DR_IS_WRITE (a->ref))
        has_write = true;
    }

  first = comp->refs[0];
  ok = suitable_reference_p (first->ref, &comp->comp_step);
  gcc_assert (ok);
  first->offset = 0;

  for (i = 1; comp->refs.iterate (i, &a); i++)
    {
      if (!determine_offset (first->ref, a->ref, &a->offset))
        return false;

      enum ref_step_type a_step;
      gcc_checking_assert (suitable_reference_p (a->ref, &a_step)
                           && a_step == comp->comp_step);
    }

  /* If there is a write inside the component, we must know whether the
     step is nonzero or not -- we would not otherwise be able to recognize
     whether the value accessed by reads comes from the OFFSET-th iteration
     or the previous one.  */
  if (has_write && comp->comp_step == RS_ANY)
    return false;

  return true;
}
/* Check the conditions on references inside each of components COMPS,
   and remove the unsuitable components from the list.  The new list
   of components is returned.  The conditions are described in 2) at
   the beginning of this file.  LOOP is the current loop.  */

static struct component *
filter_suitable_components (struct loop *loop, struct component *comps)
{
  struct component **comp, *act;

  for (comp = &comps; *comp; )
    {
      act = *comp;
      if (suitable_component_p (loop, act))
        comp = &act->next;
      else
        {
          dref ref;
          unsigned i;

          *comp = act->next;
          FOR_EACH_VEC_ELT (act->refs, i, ref)
            free (ref);
          release_component (act);
        }
    }

  return comps;
}
/* Compares two drefs A and B by their offset and position.  Callback for
   qsort.  */

static int
order_drefs (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;
  int offcmp = wi::cmps ((*da)->offset, (*db)->offset);

  if (offcmp != 0)
    return offcmp;

  return (*da)->pos - (*db)->pos;
}

/* Compares two drefs A and B by their position.  Callback for qsort.  */

static int
order_drefs_by_pos (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;

  return (*da)->pos - (*db)->pos;
}
/* Returns root of the CHAIN.  */

static inline dref
get_chain_root (chain_p chain)
{
  return chain->refs[0];
}

/* Given CHAIN, returns the last ref at DISTANCE, or NULL if it doesn't
   exist.  */

static inline dref
get_chain_last_ref_at (chain_p chain, unsigned distance)
{
  unsigned i;

  for (i = chain->refs.length (); i > 0; i--)
    if (distance == chain->refs[i - 1]->distance)
      break;

  return (i > 0) ? chain->refs[i - 1] : NULL;
}
/* Adds REF to the chain CHAIN.  */

static void
add_ref_to_chain (chain_p chain, dref ref)
{
  dref root = get_chain_root (chain);

  gcc_assert (wi::les_p (root->offset, ref->offset));
  widest_int dist = ref->offset - root->offset;
  if (wi::leu_p (MAX_DISTANCE, dist))
    {
      free (ref);
      return;
    }
  gcc_assert (wi::fits_uhwi_p (dist));

  chain->refs.safe_push (ref);

  ref->distance = dist.to_uhwi ();

  if (ref->distance >= chain->length)
    {
      chain->length = ref->distance;
      chain->has_max_use_after = false;
    }

  /* Don't set the flag for store-store chain since there is no use.  */
  if (chain->type != CT_STORE_STORE
      && ref->distance == chain->length
      && ref->pos > root->pos)
    chain->has_max_use_after = true;

  chain->all_always_accessed &= ref->always_accessed;
}
/* Returns the chain for invariant component COMP.  */

static chain_p
make_invariant_chain (struct component *comp)
{
  chain_p chain = XCNEW (struct chain);
  unsigned i;
  dref ref;

  chain->type = CT_INVARIANT;

  chain->all_always_accessed = true;

  FOR_EACH_VEC_ELT (comp->refs, i, ref)
    {
      chain->refs.safe_push (ref);
      chain->all_always_accessed &= ref->always_accessed;
    }

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Make a new chain of type TYPE rooted at REF.  */

static chain_p
make_rooted_chain (dref ref, enum chain_type type)
{
  chain_p chain = XCNEW (struct chain);

  chain->type = type;
  chain->refs.safe_push (ref);
  chain->all_always_accessed = ref->always_accessed;
  ref->distance = 0;

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Returns true if CHAIN is not trivial.  */

static bool
nontrivial_chain_p (chain_p chain)
{
  return chain != NULL && chain->refs.length () > 1;
}
/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
   is no such name.  */

static tree
name_for_ref (dref ref)
{
  tree name;

  if (is_gimple_assign (ref->stmt))
    {
      if (!ref->ref || DR_IS_READ (ref->ref))
        name = gimple_assign_lhs (ref->stmt);
      else
        name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);

  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
}
/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
   iterations of the innermost enclosing loop).  */

static bool
valid_initializer_p (struct data_reference *ref,
                     unsigned distance, struct data_reference *root)
{
  aff_tree diff, base, step;
  widest_int off;

  /* Both REF and ROOT must be accessing the same object.  */
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
    return false;

  /* The initializer is defined outside of loop, hence its address must be
     invariant inside the loop.  */
  gcc_assert (integer_zerop (DR_STEP (ref)));

  /* If the address of the reference is invariant, initializer must access
     exactly the same location.  */
  if (integer_zerop (DR_STEP (root)))
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
            && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));

  /* Verify that this index of REF is equal to the root's index at
     -DISTANCE-th iteration.  */
  aff_combination_dr_offset (root, &diff);
  aff_combination_dr_offset (ref, &base);
  aff_combination_scale (&base, -1);
  aff_combination_add (&diff, &base);

  tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
                                  &step, &name_expansions);
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
    return false;

  if (off != distance)
    return false;

  return true;
}
1206 /* Finds looparound phi node of LOOP that copies the value of REF, and if its
1207 initial value is correct (equal to initial value of REF shifted by one
1208 iteration), returns the phi node. Otherwise, NULL is returned. ROOT
1209 is the root of the current chain. */
1211 static gphi *
1212 find_looparound_phi (struct loop *loop, dref ref, dref root)
1214 tree name, init, init_ref;
1215 gphi *phi = NULL;
1216 gimple *init_stmt;
1217 edge latch = loop_latch_edge (loop);
1218 struct data_reference init_dr;
1219 gphi_iterator psi;
1221 if (is_gimple_assign (ref->stmt))
1223 if (DR_IS_READ (ref->ref))
1224 name = gimple_assign_lhs (ref->stmt);
1225 else
1226 name = gimple_assign_rhs1 (ref->stmt);
1228 else
1229 name = PHI_RESULT (ref->stmt);
1230 if (!name)
1231 return NULL;
1233 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
1235 phi = psi.phi ();
1236 if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
1237 break;
1240 if (gsi_end_p (psi))
1241 return NULL;
1243 init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
1244 if (TREE_CODE (init) != SSA_NAME)
1245 return NULL;
1246 init_stmt = SSA_NAME_DEF_STMT (init);
1247 if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
1248 return NULL;
1249 gcc_assert (gimple_assign_lhs (init_stmt) == init);
1251 init_ref = gimple_assign_rhs1 (init_stmt);
1252 if (!REFERENCE_CLASS_P (init_ref)
1253 && !DECL_P (init_ref))
1254 return NULL;
1256 /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
1257 loop enclosing PHI). */
1258 memset (&init_dr, 0, sizeof (struct data_reference));
1259 DR_REF (&init_dr) = init_ref;
1260 DR_STMT (&init_dr) = phi;
1261 if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, loop))
1262 return NULL;
1264 if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
1265 return NULL;
1267 return phi;
1270 /* Adds a reference for the looparound copy of REF in PHI to CHAIN. */
1272 static void
1273 insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
1275 dref nw = XCNEW (struct dref_d), aref;
1276 unsigned i;
1278 nw->stmt = phi;
1279 nw->distance = ref->distance + 1;
1280 nw->always_accessed = 1;
1282 FOR_EACH_VEC_ELT (chain->refs, i, aref)
1283 if (aref->distance >= nw->distance)
1284 break;
1285 chain->refs.safe_insert (i, nw);
1287 if (nw->distance > chain->length)
1289 chain->length = nw->distance;
1290 chain->has_max_use_after = false;
1294 /* For references in CHAIN that are copied around the LOOP (created previously
1295 by PRE, or by user), add the results of such copies to the chain. This
1296 enables us to remove the copies by unrolling, and may require fewer registers
1297 (also, it may allow us to combine chains together). */
1299 static void
1300 add_looparound_copies (struct loop *loop, chain_p chain)
1302 unsigned i;
1303 dref ref, root = get_chain_root (chain);
1304 gphi *phi;
1306 if (chain->type == CT_STORE_STORE)
1307 return;
1309 FOR_EACH_VEC_ELT (chain->refs, i, ref)
1311 phi = find_looparound_phi (loop, ref, root);
1312 if (!phi)
1313 continue;
1315 bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
1316 insert_looparound_copy (chain, ref, phi);
1320 /* Find roots of the values and determine distances in the component COMP.
1321 The references are redistributed into CHAINS. LOOP is the current
1322 loop. */
1324 static void
1325 determine_roots_comp (struct loop *loop,
1326 struct component *comp,
1327 vec<chain_p> *chains)
1329 unsigned i;
1330 dref a;
1331 chain_p chain = NULL;
1332 widest_int last_ofs = 0;
1333 enum chain_type type;
1335 /* Invariants are handled specially. */
1336 if (comp->comp_step == RS_INVARIANT)
1338 chain = make_invariant_chain (comp);
1339 chains->safe_push (chain);
1340 return;
1343 /* Trivial component. */
1344 if (comp->refs.length () <= 1)
1346 if (comp->refs.length () == 1)
1348 free (comp->refs[0]);
1349 comp->refs.truncate (0);
1351 return;
1354 comp->refs.qsort (order_drefs);
1355 FOR_EACH_VEC_ELT (comp->refs, i, a)
1357 if (!chain
1358 || (!comp->eliminate_store_p && DR_IS_WRITE (a->ref))
1359 || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
1361 if (nontrivial_chain_p (chain))
1363 add_looparound_copies (loop, chain);
1364 chains->safe_push (chain);
1366 else
1367 release_chain (chain);
1369 if (DR_IS_READ (a->ref))
1370 type = CT_LOAD;
1371 else
1372 type = comp->eliminate_store_p ? CT_STORE_STORE : CT_STORE_LOAD;
1374 chain = make_rooted_chain (a, type);
1375 last_ofs = a->offset;
1376 continue;
1379 add_ref_to_chain (chain, a);
1382 if (nontrivial_chain_p (chain))
1384 add_looparound_copies (loop, chain);
1385 chains->safe_push (chain);
1387 else
1388 release_chain (chain);
1391 /* Find roots of the values and determine distances in components COMPS, and
1392 separate the references into CHAINS. LOOP is the current loop. */
1394 static void
1395 determine_roots (struct loop *loop,
1396 struct component *comps, vec<chain_p> *chains)
1398 struct component *comp;
1400 for (comp = comps; comp; comp = comp->next)
1401 determine_roots_comp (loop, comp, chains);
1404 /* Replace the reference in statement STMT with temporary variable
1405 NEW_TREE. If SET is true, NEW_TREE is instead initialized to the value of
1406 the reference in the statement. IN_LHS is true if the reference
1407 is in the lhs of STMT, false if it is in the rhs. */
1409 static void
1410 replace_ref_with (gimple *stmt, tree new_tree, bool set, bool in_lhs)
1412 tree val;
1413 gassign *new_stmt;
1414 gimple_stmt_iterator bsi, psi;
1416 if (gimple_code (stmt) == GIMPLE_PHI)
1418 gcc_assert (!in_lhs && !set);
1420 val = PHI_RESULT (stmt);
1421 bsi = gsi_after_labels (gimple_bb (stmt));
1422 psi = gsi_for_stmt (stmt);
1423 remove_phi_node (&psi, false);
1425 /* Turn the phi node into GIMPLE_ASSIGN. */
1426 new_stmt = gimple_build_assign (val, new_tree);
1427 gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
1428 return;
1431 /* Since the reference is of gimple_reg type, it should only
1432 appear as the lhs or rhs of a modify statement. */
1433 gcc_assert (is_gimple_assign (stmt));
1435 bsi = gsi_for_stmt (stmt);
1437 /* If we do not need to initialize NEW_TREE, just replace the use of OLD. */
1438 if (!set)
1440 gcc_assert (!in_lhs);
1441 gimple_assign_set_rhs_from_tree (&bsi, new_tree);
1442 stmt = gsi_stmt (bsi);
1443 update_stmt (stmt);
1444 return;
1447 if (in_lhs)
1449 /* We have statement
1451 OLD = VAL
1453 If OLD is a memory reference, then VAL is gimple_val, and we transform
1454 this to
1456 OLD = VAL
1457 NEW = VAL
1459 Otherwise, we are replacing a combination chain,
1460 VAL is the expression that performs the combination, and OLD is an
1461 SSA name. In this case, we transform the assignment to
1463 OLD = VAL
1464 NEW = OLD
1468 val = gimple_assign_lhs (stmt);
1469 if (TREE_CODE (val) != SSA_NAME)
1471 val = gimple_assign_rhs1 (stmt);
1472 gcc_assert (gimple_assign_single_p (stmt));
1473 if (TREE_CLOBBER_P (val))
1474 val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
1475 else
1476 gcc_assert (gimple_assign_copy_p (stmt));
1479 else
1481 /* VAL = OLD
1483 is transformed to
1485 VAL = OLD
1486 NEW = VAL */
1488 val = gimple_assign_lhs (stmt);
1491 new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
1492 gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
1495 /* Returns a memory reference to DR in the (NITERS + ITER)-th iteration
1496 of the loop it was analyzed in. Append init stmts to STMTS. */
1498 static tree
1499 ref_at_iteration (data_reference_p dr, int iter,
1500 gimple_seq *stmts, tree niters = NULL_TREE)
1502 tree off = DR_OFFSET (dr);
1503 tree coff = DR_INIT (dr);
1504 tree ref = DR_REF (dr);
1505 enum tree_code ref_code = ERROR_MARK;
1506 tree ref_type = NULL_TREE;
1507 tree ref_op1 = NULL_TREE;
1508 tree ref_op2 = NULL_TREE;
1509 tree new_offset;
1511 if (iter != 0)
1513 new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter));
1514 if (TREE_CODE (new_offset) == INTEGER_CST)
1515 coff = size_binop (PLUS_EXPR, coff, new_offset);
1516 else
1517 off = size_binop (PLUS_EXPR, off, new_offset);
1520 if (niters != NULL_TREE)
1522 niters = fold_convert (ssizetype, niters);
1523 new_offset = size_binop (MULT_EXPR, DR_STEP (dr), niters);
1524 if (TREE_CODE (niters) == INTEGER_CST)
1525 coff = size_binop (PLUS_EXPR, coff, new_offset);
1526 else
1527 off = size_binop (PLUS_EXPR, off, new_offset);
1530 /* While data-ref analysis punts on bit offsets it still handles
1531 bitfield accesses at byte boundaries. Cope with that. Note that
1532 if the bitfield object also starts at a byte-boundary we can simply
1533 replicate the COMPONENT_REF, but we have to subtract the component's
1534 byte-offset from the MEM_REF address first.
1535 Otherwise we simply build a BIT_FIELD_REF knowing that the bits
1536 start at offset zero. */
1537 if (TREE_CODE (ref) == COMPONENT_REF
1538 && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
1540 unsigned HOST_WIDE_INT boff;
1541 tree field = TREE_OPERAND (ref, 1);
1542 tree offset = component_ref_field_offset (ref);
1543 ref_type = TREE_TYPE (ref);
1544 boff = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field));
1545 /* This can occur in Ada. See the comment in get_bit_range. */
1546 if (boff % BITS_PER_UNIT != 0
1547 || !tree_fits_uhwi_p (offset))
1549 ref_code = BIT_FIELD_REF;
1550 ref_op1 = DECL_SIZE (field);
1551 ref_op2 = bitsize_zero_node;
1553 else
1555 boff >>= LOG2_BITS_PER_UNIT;
1556 boff += tree_to_uhwi (offset);
1557 coff = size_binop (MINUS_EXPR, coff, ssize_int (boff));
1558 ref_code = COMPONENT_REF;
1559 ref_op1 = field;
1560 ref_op2 = TREE_OPERAND (ref, 2);
1561 ref = TREE_OPERAND (ref, 0);
1564 tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
1565 addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
1566 is_gimple_mem_ref_addr, NULL_TREE);
1567 tree alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
1568 tree type = build_aligned_type (TREE_TYPE (ref),
1569 get_object_alignment (ref));
1570 ref = build2 (MEM_REF, type, addr, alias_ptr);
1571 if (ref_type)
1572 ref = build3 (ref_code, ref_type, ref, ref_op1, ref_op2);
1573 return ref;
1576 /* Get the initialization expression for the INDEX-th temporary variable
1577 of CHAIN. */
1579 static tree
1580 get_init_expr (chain_p chain, unsigned index)
1582 if (chain->type == CT_COMBINATION)
1584 tree e1 = get_init_expr (chain->ch1, index);
1585 tree e2 = get_init_expr (chain->ch2, index);
1587 return fold_build2 (chain->op, chain->rslt_type, e1, e2);
1589 else
1590 return chain->inits[index];
1593 /* Returns a new temporary variable used for the I-th variable carrying
1594 value of REF. The variable's uid is marked in TMP_VARS. */
1596 static tree
1597 predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
1599 tree type = TREE_TYPE (ref);
1600 /* We never access the components of the temporary variable in predictive
1601 commoning. */
1602 tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
1603 bitmap_set_bit (tmp_vars, DECL_UID (var));
1604 return var;
1607 /* Creates the variables for CHAIN, as well as phi nodes for them and
1608 initialization on entry to LOOP. Uids of the newly created
1609 temporary variables are marked in TMP_VARS. */
1611 static void
1612 initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
1614 unsigned i;
1615 unsigned n = chain->length;
1616 dref root = get_chain_root (chain);
1617 bool reuse_first = !chain->has_max_use_after;
1618 tree ref, init, var, next;
1619 gphi *phi;
1620 gimple_seq stmts;
1621 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1623 /* If N == 0, then all the references are within the single iteration. And
1624 since this is a nonempty chain, reuse_first cannot be true. */
1625 gcc_assert (n > 0 || !reuse_first);
1627 chain->vars.create (n + 1);
1629 if (chain->type == CT_COMBINATION)
1630 ref = gimple_assign_lhs (root->stmt);
1631 else
1632 ref = DR_REF (root->ref);
1634 for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
1636 var = predcom_tmp_var (ref, i, tmp_vars);
1637 chain->vars.quick_push (var);
1639 if (reuse_first)
1640 chain->vars.quick_push (chain->vars[0]);
1642 FOR_EACH_VEC_ELT (chain->vars, i, var)
1643 chain->vars[i] = make_ssa_name (var);
1645 for (i = 0; i < n; i++)
1647 var = chain->vars[i];
1648 next = chain->vars[i + 1];
1649 init = get_init_expr (chain, i);
1651 init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1652 if (stmts)
1653 gsi_insert_seq_on_edge_immediate (entry, stmts);
1655 phi = create_phi_node (var, loop->header);
1656 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1657 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1661 /* For inter-iteration store elimination CHAIN in LOOP, returns true if
1662 all stores to be eliminated store loop invariant values into memory.
1663 In this case, we can use these invariant values directly after LOOP. */
1665 static bool
1666 is_inv_store_elimination_chain (struct loop *loop, chain_p chain)
1668 if (chain->length == 0 || chain->type != CT_STORE_STORE)
1669 return false;
1671 gcc_assert (!chain->has_max_use_after);
1673 /* If the loop iterates an unknown number of times, or fewer times than
1674 chain->length, we still need to set up the root variable and propagate it with a PHI node. */
1675 tree niters = number_of_latch_executions (loop);
1676 if (TREE_CODE (niters) != INTEGER_CST
1677 || wi::leu_p (wi::to_wide (niters), chain->length))
1678 return false;
1680 /* Check stores in chain for elimination if they only store loop invariant
1681 values. */
1682 for (unsigned i = 0; i < chain->length; i++)
1684 dref a = get_chain_last_ref_at (chain, i);
1685 if (a == NULL)
1686 continue;
1688 gimple *def_stmt, *stmt = a->stmt;
1689 if (!gimple_assign_single_p (stmt))
1690 return false;
1692 tree val = gimple_assign_rhs1 (stmt);
1693 if (TREE_CLOBBER_P (val))
1694 return false;
1696 if (CONSTANT_CLASS_P (val))
1697 continue;
1699 if (TREE_CODE (val) != SSA_NAME)
1700 return false;
1702 def_stmt = SSA_NAME_DEF_STMT (val);
1703 if (gimple_nop_p (def_stmt))
1704 continue;
1706 if (flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
1707 return false;
1709 return true;
1712 /* Creates root variables for store elimination CHAIN in which stores for
1713 elimination only store loop invariant values. In this case, we neither
1714 need to load root variables before the loop nor propagate them with PHI nodes. */
1716 static void
1717 initialize_root_vars_store_elim_1 (chain_p chain)
1719 tree var;
1720 unsigned i, n = chain->length;
1722 chain->vars.create (n);
1723 chain->vars.safe_grow_cleared (n);
1725 /* Initialize root value for eliminated stores at each distance. */
1726 for (i = 0; i < n; i++)
1728 dref a = get_chain_last_ref_at (chain, i);
1729 if (a == NULL)
1730 continue;
1732 var = gimple_assign_rhs1 (a->stmt);
1733 chain->vars[a->distance] = var;
1736 /* We don't propagate values with PHI nodes, so manually propagate value
1737 to bubble positions. */
1738 var = chain->vars[0];
1739 for (i = 1; i < n; i++)
1741 if (chain->vars[i] != NULL_TREE)
1743 var = chain->vars[i];
1744 continue;
1746 chain->vars[i] = var;
1749 /* Reverse the vector. */
1750 for (i = 0; i < n / 2; i++)
1751 std::swap (chain->vars[i], chain->vars[n - i - 1]);
1754 /* Creates root variables for store elimination CHAIN in which stores for
1755 elimination store loop variant values. In this case, we may need to
1756 load root variables before LOOP and propagate them with PHI nodes. Uids
1757 of the newly created root variables are marked in TMP_VARS. */
1759 static void
1760 initialize_root_vars_store_elim_2 (struct loop *loop,
1761 chain_p chain, bitmap tmp_vars)
1763 unsigned i, n = chain->length;
1764 tree ref, init, var, next, val, phi_result;
1765 gimple *stmt;
1766 gimple_seq stmts;
1768 chain->vars.create (n);
1770 ref = DR_REF (get_chain_root (chain)->ref);
1771 for (i = 0; i < n; i++)
1773 var = predcom_tmp_var (ref, i, tmp_vars);
1774 chain->vars.quick_push (var);
1777 FOR_EACH_VEC_ELT (chain->vars, i, var)
1778 chain->vars[i] = make_ssa_name (var);
1780 /* Root values are either rhs operand of stores to be eliminated, or
1781 loaded from memory before loop. */
1782 auto_vec<tree> vtemps;
1783 vtemps.safe_grow_cleared (n);
1784 for (i = 0; i < n; i++)
1786 init = get_init_expr (chain, i);
1787 if (init == NULL_TREE)
1789 /* Root value is rhs operand of the store to be eliminated if
1790 it isn't loaded from memory before loop. */
1791 dref a = get_chain_last_ref_at (chain, i);
1792 val = gimple_assign_rhs1 (a->stmt);
1793 if (TREE_CLOBBER_P (val))
1794 val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (var));
1796 vtemps[n - i - 1] = val;
1798 else
1800 edge latch = loop_latch_edge (loop);
1801 edge entry = loop_preheader_edge (loop);
1803 /* Root value is loaded from memory before loop, we also need
1804 to add PHI nodes to propagate the value across iterations. */
1805 init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1806 if (stmts)
1807 gsi_insert_seq_on_edge_immediate (entry, stmts);
1809 next = chain->vars[n - i];
1810 phi_result = copy_ssa_name (next);
1811 gphi *phi = create_phi_node (phi_result, loop->header);
1812 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1813 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1814 vtemps[n - i - 1] = phi_result;
1818 /* Find the insertion position. */
1819 dref last = get_chain_root (chain);
1820 for (i = 0; i < chain->refs.length (); i++)
1822 if (chain->refs[i]->pos > last->pos)
1823 last = chain->refs[i];
1826 gimple_stmt_iterator gsi = gsi_for_stmt (last->stmt);
1828 /* Insert statements copying root value to root variable. */
1829 for (i = 0; i < n; i++)
1831 var = chain->vars[i];
1832 val = vtemps[i];
1833 stmt = gimple_build_assign (var, val);
1834 gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
1838 /* Generates stores for CHAIN's eliminated stores in LOOP's last
1839 (CHAIN->length - 1) iterations. */
1841 static void
1842 finalize_eliminated_stores (struct loop *loop, chain_p chain)
1844 unsigned i, n = chain->length;
1846 for (i = 0; i < n; i++)
1848 tree var = chain->vars[i];
1849 tree fini = chain->finis[n - i - 1];
1850 gimple *stmt = gimple_build_assign (fini, var);
1852 gimple_seq_add_stmt_without_update (&chain->fini_seq, stmt);
1855 if (chain->fini_seq)
1857 gsi_insert_seq_on_edge_immediate (single_exit (loop), chain->fini_seq);
1858 chain->fini_seq = NULL;
1862 /* Initializes a variable for load motion for ROOT and prepares phi nodes and
1863 initialization on entry to LOOP if necessary. The ssa name for the variable
1864 is stored in VARS. If WRITTEN is true, a phi node to copy its value
1865 around the loop is also created. Uid of the newly created temporary variable
1866 is marked in TMP_VARS. INITS is the list containing the (single)
1867 initializer. */
1869 static void
1870 initialize_root_vars_lm (struct loop *loop, dref root, bool written,
1871 vec<tree> *vars, vec<tree> inits,
1872 bitmap tmp_vars)
1874 unsigned i;
1875 tree ref = DR_REF (root->ref), init, var, next;
1876 gimple_seq stmts;
1877 gphi *phi;
1878 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1880 /* Find the initializer for the variable, and check that it cannot
1881 trap. */
1882 init = inits[0];
1884 vars->create (written ? 2 : 1);
1885 var = predcom_tmp_var (ref, 0, tmp_vars);
1886 vars->quick_push (var);
1887 if (written)
1888 vars->quick_push ((*vars)[0]);
1890 FOR_EACH_VEC_ELT (*vars, i, var)
1891 (*vars)[i] = make_ssa_name (var);
1893 var = (*vars)[0];
1895 init = force_gimple_operand (init, &stmts, written, NULL_TREE);
1896 if (stmts)
1897 gsi_insert_seq_on_edge_immediate (entry, stmts);
1899 if (written)
1901 next = (*vars)[1];
1902 phi = create_phi_node (var, loop->header);
1903 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1904 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1906 else
1908 gassign *init_stmt = gimple_build_assign (var, init);
1909 gsi_insert_on_edge_immediate (entry, init_stmt);
1914 /* Execute load motion for references in chain CHAIN. Uids of the newly
1915 created temporary variables are marked in TMP_VARS. */
1917 static void
1918 execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
1920 auto_vec<tree> vars;
1921 dref a;
1922 unsigned n_writes = 0, ridx, i;
1923 tree var;
1925 gcc_assert (chain->type == CT_INVARIANT);
1926 gcc_assert (!chain->combined);
1927 FOR_EACH_VEC_ELT (chain->refs, i, a)
1928 if (DR_IS_WRITE (a->ref))
1929 n_writes++;
1931 /* If there are no reads in the loop, there is nothing to do. */
1932 if (n_writes == chain->refs.length ())
1933 return;
1935 initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
1936 &vars, chain->inits, tmp_vars);
1938 ridx = 0;
1939 FOR_EACH_VEC_ELT (chain->refs, i, a)
1941 bool is_read = DR_IS_READ (a->ref);
1943 if (DR_IS_WRITE (a->ref))
1945 n_writes--;
1946 if (n_writes)
1948 var = vars[0];
1949 var = make_ssa_name (SSA_NAME_VAR (var));
1950 vars[0] = var;
1952 else
1953 ridx = 1;
1956 replace_ref_with (a->stmt, vars[ridx],
1957 !is_read, !is_read);
1961 /* Returns the single statement in which NAME is used, excepting
1962 the looparound phi nodes contained in one of the chains. If there is no
1963 such statement, or if there is more than one, NULL is returned. */
1965 static gimple *
1966 single_nonlooparound_use (tree name)
1968 use_operand_p use;
1969 imm_use_iterator it;
1970 gimple *stmt, *ret = NULL;
1972 FOR_EACH_IMM_USE_FAST (use, it, name)
1974 stmt = USE_STMT (use);
1976 if (gimple_code (stmt) == GIMPLE_PHI)
1978 /* Ignore uses in looparound phi nodes. Uses in other phi nodes
1979 could not be processed anyway, so just fail for them. */
1980 if (bitmap_bit_p (looparound_phis,
1981 SSA_NAME_VERSION (PHI_RESULT (stmt))))
1982 continue;
1984 return NULL;
1986 else if (is_gimple_debug (stmt))
1987 continue;
1988 else if (ret != NULL)
1989 return NULL;
1990 else
1991 ret = stmt;
1994 return ret;
1997 /* Remove statement STMT, as well as the chain of assignments in which it
1998 is used. */
2000 static void
2001 remove_stmt (gimple *stmt)
2003 tree name;
2004 gimple *next;
2005 gimple_stmt_iterator psi;
2007 if (gimple_code (stmt) == GIMPLE_PHI)
2009 name = PHI_RESULT (stmt);
2010 next = single_nonlooparound_use (name);
2011 reset_debug_uses (stmt);
2012 psi = gsi_for_stmt (stmt);
2013 remove_phi_node (&psi, true);
2015 if (!next
2016 || !gimple_assign_ssa_name_copy_p (next)
2017 || gimple_assign_rhs1 (next) != name)
2018 return;
2020 stmt = next;
2023 while (1)
2025 gimple_stmt_iterator bsi;
2027 bsi = gsi_for_stmt (stmt);
2029 name = gimple_assign_lhs (stmt);
2030 if (TREE_CODE (name) == SSA_NAME)
2032 next = single_nonlooparound_use (name);
2033 reset_debug_uses (stmt);
2035 else
2037 /* This is a store to be eliminated. */
2038 gcc_assert (gimple_vdef (stmt) != NULL);
2039 next = NULL;
2042 unlink_stmt_vdef (stmt);
2043 gsi_remove (&bsi, true);
2044 release_defs (stmt);
2046 if (!next
2047 || !gimple_assign_ssa_name_copy_p (next)
2048 || gimple_assign_rhs1 (next) != name)
2049 return;
2051 stmt = next;
2055 /* Perform the predictive commoning optimization for a chain CHAIN.
2056 Uids of the newly created temporary variables are marked in TMP_VARS. */
2058 static void
2059 execute_pred_commoning_chain (struct loop *loop, chain_p chain,
2060 bitmap tmp_vars)
2062 unsigned i, n;
2063 dref a;
2064 tree var;
2065 bool in_lhs;
2067 if (chain->combined)
2069 /* For combined chains, just remove the statements that are used to
2070 compute the values of the expression (except for the root one).
2071 We delay this until after all chains are processed. */
2073 else if (chain->type == CT_STORE_STORE)
2075 if (chain->length > 0)
2077 if (chain->inv_store_elimination)
2079 /* If dead stores in this chain only store loop invariant
2080 values, we can simply record the invariant value and use
2081 it directly after loop. */
2082 initialize_root_vars_store_elim_1 (chain);
2084 else
2086 /* If dead stores in this chain store loop variant values,
2087 we need to set up the variables by loading from memory
2088 before loop and propagating it with PHI nodes. */
2089 initialize_root_vars_store_elim_2 (loop, chain, tmp_vars);
2092 /* For inter-iteration store elimination chain, stores at each
2093 distance in loop's last (chain->length - 1) iterations can't
2094 be eliminated, because there is no following killing store.
2095 We need to generate these stores after loop. */
2096 finalize_eliminated_stores (loop, chain);
2099 /* Eliminate the stores killed by following store. */
2100 n = chain->refs.length ();
2101 for (i = 0; i < n - 1; i++)
2102 remove_stmt (chain->refs[i]->stmt);
2104 else
2106 /* For non-combined chains, set up the variables that hold its value. */
2107 initialize_root_vars (loop, chain, tmp_vars);
2108 a = get_chain_root (chain);
2109 in_lhs = (chain->type == CT_STORE_LOAD
2110 || chain->type == CT_COMBINATION);
2111 replace_ref_with (a->stmt, chain->vars[chain->length], true, in_lhs);
2113 /* Replace the uses of the original references by these variables. */
2114 for (i = 1; chain->refs.iterate (i, &a); i++)
2116 var = chain->vars[chain->length - a->distance];
2117 replace_ref_with (a->stmt, var, false, false);
2122 /* Determines the unroll factor necessary to remove as many temporary variable
2123 copies as possible. CHAINS is the list of chains that will be
2124 optimized. */
2126 static unsigned
2127 determine_unroll_factor (vec<chain_p> chains)
2129 chain_p chain;
2130 unsigned factor = 1, af, nfactor, i;
2131 unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
2133 FOR_EACH_VEC_ELT (chains, i, chain)
2135 if (chain->type == CT_INVARIANT)
2136 continue;
2137 /* For now we can't handle unrolling when eliminating stores. */
2138 else if (chain->type == CT_STORE_STORE)
2139 return 1;
2141 if (chain->combined)
2143 /* For combined chains, we can't handle unrolling if we replace
2144 looparound PHIs. */
2145 dref a;
2146 unsigned j;
2147 for (j = 1; chain->refs.iterate (j, &a); j++)
2148 if (gimple_code (a->stmt) == GIMPLE_PHI)
2149 return 1;
2150 continue;
2153 /* The best unroll factor for this chain is equal to the number of
2154 temporary variables that we create for it. */
2155 af = chain->length;
2156 if (chain->has_max_use_after)
2157 af++;
2159 nfactor = factor * af / gcd (factor, af);
2160 if (nfactor <= max)
2161 factor = nfactor;
2164 return factor;
2167 /* Perform the predictive commoning optimization for CHAINS.
2168 Uids of the newly created temporary variables are marked in TMP_VARS. */
2170 static void
2171 execute_pred_commoning (struct loop *loop, vec<chain_p> chains,
2172 bitmap tmp_vars)
2174 chain_p chain;
2175 unsigned i;
2177 FOR_EACH_VEC_ELT (chains, i, chain)
2179 if (chain->type == CT_INVARIANT)
2180 execute_load_motion (loop, chain, tmp_vars);
2181 else
2182 execute_pred_commoning_chain (loop, chain, tmp_vars);
2185 FOR_EACH_VEC_ELT (chains, i, chain)
2187 if (chain->type == CT_INVARIANT)
2189 else if (chain->combined)
2191 /* For combined chains, just remove the statements that are used to
2192 compute the values of the expression (except for the root one). */
2193 dref a;
2194 unsigned j;
2195 for (j = 1; chain->refs.iterate (j, &a); j++)
2196 remove_stmt (a->stmt);
2200 update_ssa (TODO_update_ssa_only_virtuals);
2203 /* For each reference in CHAINS, if its defining statement is a phi node,
2204 record the ssa name that is defined by it. */
2206 static void
2207 replace_phis_by_defined_names (vec<chain_p> chains)
2209 chain_p chain;
2210 dref a;
2211 unsigned i, j;
2213 FOR_EACH_VEC_ELT (chains, i, chain)
2214 FOR_EACH_VEC_ELT (chain->refs, j, a)
2216 if (gimple_code (a->stmt) == GIMPLE_PHI)
2218 a->name_defined_by_phi = PHI_RESULT (a->stmt);
2219 a->stmt = NULL;
2224 /* For each reference in CHAINS, if name_defined_by_phi is not
2225 NULL, use it to set the stmt field. */
2227 static void
2228 replace_names_by_phis (vec<chain_p> chains)
2230 chain_p chain;
2231 dref a;
2232 unsigned i, j;
2234 FOR_EACH_VEC_ELT (chains, i, chain)
2235 FOR_EACH_VEC_ELT (chain->refs, j, a)
2236 if (a->stmt == NULL)
2238 a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
2239 gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
2240 a->name_defined_by_phi = NULL_TREE;
2244 /* Wrapper over execute_pred_commoning, to pass it as a callback
2245 to tree_transform_and_unroll_loop. */
2247 struct epcc_data
2249 vec<chain_p> chains;
2250 bitmap tmp_vars;
2253 static void
2254 execute_pred_commoning_cbck (struct loop *loop, void *data)
2256 struct epcc_data *const dta = (struct epcc_data *) data;
2258 /* Restore phi nodes that were replaced by ssa names before
2259 tree_transform_and_unroll_loop (see detailed description in
2260 tree_predictive_commoning_loop). */
2261 replace_names_by_phis (dta->chains);
2262 execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
2265 /* Base NAME and all the names in the chain of phi nodes that use it
2266 on variable VAR. The phi nodes are recognized by being in the copies of
2267 the header of the LOOP. */
2269 static void
2270 base_names_in_chain_on (struct loop *loop, tree name, tree var)
2272 gimple *stmt, *phi;
2273 imm_use_iterator iter;
2275 replace_ssa_name_symbol (name, var);
2277 while (1)
2279 phi = NULL;
2280 FOR_EACH_IMM_USE_STMT (stmt, iter, name)
2282 if (gimple_code (stmt) == GIMPLE_PHI
2283 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
2285 phi = stmt;
2286 BREAK_FROM_IMM_USE_STMT (iter);
2289 if (!phi)
2290 return;
2292 name = PHI_RESULT (phi);
2293 replace_ssa_name_symbol (name, var);
2297 /* Given an unrolled LOOP after predictive commoning, remove the
2298 register copies arising from phi nodes by changing the base
2299 variables of SSA names. TMP_VARS is the set of the temporary variables
2300 for which we want to perform this. */
2302 static void
2303 eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
2305 edge e;
2306 gphi *phi;
2307 gimple *stmt;
2308 tree name, use, var;
2309 gphi_iterator psi;
2311 e = loop_latch_edge (loop);
2312 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
2314 phi = psi.phi ();
2315 name = PHI_RESULT (phi);
2316 var = SSA_NAME_VAR (name);
2317 if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
2318 continue;
2319 use = PHI_ARG_DEF_FROM_EDGE (phi, e);
2320 gcc_assert (TREE_CODE (use) == SSA_NAME);
2322 /* Base all the ssa names in the ud and du chain of NAME on VAR. */
2323 stmt = SSA_NAME_DEF_STMT (use);
2324 while (gimple_code (stmt) == GIMPLE_PHI
2325 /* In case we could not unroll the loop enough to eliminate
2326 all copies, we may reach the loop header before the defining
2327 statement (in that case, some register copies will be present
2328 in loop latch in the final code, corresponding to the newly
2329 created looparound phi nodes). */
2330 && gimple_bb (stmt) != loop->header)
2332 gcc_assert (single_pred_p (gimple_bb (stmt)));
2333 use = PHI_ARG_DEF (stmt, 0);
2334 stmt = SSA_NAME_DEF_STMT (use);
2337 base_names_in_chain_on (loop, use, var);
2341 /* Returns true if CHAIN is suitable to be combined. */
2343 static bool
2344 chain_can_be_combined_p (chain_p chain)
2346 return (!chain->combined
2347 && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
2350 /* Returns the modify statement that uses NAME. Skips over assignment
2351 statements; NAME is replaced with the actual name used in the returned
2352 statement. */
2354 static gimple *
2355 find_use_stmt (tree *name)
2357 gimple *stmt;
2358 tree rhs, lhs;
2360 /* Skip over assignments. */
2361 while (1)
2363 stmt = single_nonlooparound_use (*name);
2364 if (!stmt)
2365 return NULL;
2367 if (gimple_code (stmt) != GIMPLE_ASSIGN)
2368 return NULL;
2370 lhs = gimple_assign_lhs (stmt);
2371 if (TREE_CODE (lhs) != SSA_NAME)
2372 return NULL;
2374 if (gimple_assign_copy_p (stmt))
2376 rhs = gimple_assign_rhs1 (stmt);
2377 if (rhs != *name)
2378 return NULL;
2380 *name = lhs;
2382 else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
2383 == GIMPLE_BINARY_RHS)
2384 return stmt;
2385 else
2386 return NULL;
2390 /* Returns true if we may perform reassociation for operation CODE in TYPE. */
2392 static bool
2393 may_reassociate_p (tree type, enum tree_code code)
2395 if (FLOAT_TYPE_P (type)
2396 && !flag_unsafe_math_optimizations)
2397 return false;
2399 return (commutative_tree_code (code)
2400 && associative_tree_code (code));
2403 /* If the operation used in STMT is associative and commutative, walks the
2404 tree of the same operations and returns its root. The distance to the root
2405 is stored in DISTANCE. */
2407 static gimple *
2408 find_associative_operation_root (gimple *stmt, unsigned *distance)
2410 tree lhs;
2411 gimple *next;
2412 enum tree_code code = gimple_assign_rhs_code (stmt);
2413 tree type = TREE_TYPE (gimple_assign_lhs (stmt));
2414 unsigned dist = 0;
2416 if (!may_reassociate_p (type, code))
2417 return NULL;
2419 while (1)
2421 lhs = gimple_assign_lhs (stmt);
2422 gcc_assert (TREE_CODE (lhs) == SSA_NAME);
2424 next = find_use_stmt (&lhs);
2425 if (!next
2426 || gimple_assign_rhs_code (next) != code)
2427 break;
2429 stmt = next;
2430 dist++;
2433 if (distance)
2434 *distance = dist;
2435 return stmt;
2438 /* Returns the common statement in which NAME1 and NAME2 have a use. If there
2439 is no such statement, returns NULL. In case the operation used on
2440 NAME1 and NAME2 is associative and commutative, returns the root of the
2441 tree formed by this operation instead of the statement that uses NAME1 or
2442 NAME2. */
2444 static gimple *
2445 find_common_use_stmt (tree *name1, tree *name2)
2447 gimple *stmt1, *stmt2;
2449 stmt1 = find_use_stmt (name1);
2450 if (!stmt1)
2451 return NULL;
2453 stmt2 = find_use_stmt (name2);
2454 if (!stmt2)
2455 return NULL;
2457 if (stmt1 == stmt2)
2458 return stmt1;
2460 stmt1 = find_associative_operation_root (stmt1, NULL);
2461 if (!stmt1)
2462 return NULL;
2463 stmt2 = find_associative_operation_root (stmt2, NULL);
2464 if (!stmt2)
2465 return NULL;
2467 return (stmt1 == stmt2 ? stmt1 : NULL);
2470 /* Checks whether R1 and R2 are combined together using CODE, with the result
2471 in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
2472 if it is true. If CODE is ERROR_MARK, set these values instead. */
2474 static bool
2475 combinable_refs_p (dref r1, dref r2,
2476 enum tree_code *code, bool *swap, tree *rslt_type)
2478 enum tree_code acode;
2479 bool aswap;
2480 tree atype;
2481 tree name1, name2;
2482 gimple *stmt;
2484 name1 = name_for_ref (r1);
2485 name2 = name_for_ref (r2);
2486 gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);
2488 stmt = find_common_use_stmt (&name1, &name2);
2490 if (!stmt
2491 /* A simple post-dominance check - make sure the combination
2492 is executed under the same condition as the references. */
2493 || (gimple_bb (stmt) != gimple_bb (r1->stmt)
2494 && gimple_bb (stmt) != gimple_bb (r2->stmt)))
2495 return false;
2497 acode = gimple_assign_rhs_code (stmt);
2498 aswap = (!commutative_tree_code (acode)
2499 && gimple_assign_rhs1 (stmt) != name1);
2500 atype = TREE_TYPE (gimple_assign_lhs (stmt));
2502 if (*code == ERROR_MARK)
2504 *code = acode;
2505 *swap = aswap;
2506 *rslt_type = atype;
2507 return true;
2510 return (*code == acode
2511 && *swap == aswap
2512 && *rslt_type == atype);
2515 /* Remove OP from the operation on rhs of STMT, and replace STMT with
2516 an assignment of the remaining operand. */
2518 static void
2519 remove_name_from_operation (gimple *stmt, tree op)
2521 tree other_op;
2522 gimple_stmt_iterator si;
2524 gcc_assert (is_gimple_assign (stmt));
2526 if (gimple_assign_rhs1 (stmt) == op)
2527 other_op = gimple_assign_rhs2 (stmt);
2528 else
2529 other_op = gimple_assign_rhs1 (stmt);
2531 si = gsi_for_stmt (stmt);
2532 gimple_assign_set_rhs_from_tree (&si, other_op);
2534 /* We should not have reallocated STMT. */
2535 gcc_assert (gsi_stmt (si) == stmt);
2537 update_stmt (stmt);
2540 /* Reassociates the expression in which NAME1 and NAME2 are used so that they
2541 are combined in a single statement, and returns this statement. */
2543 static gimple *
2544 reassociate_to_the_same_stmt (tree name1, tree name2)
2546 gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
2547 gassign *new_stmt, *tmp_stmt;
2548 tree new_name, tmp_name, var, r1, r2;
2549 unsigned dist1, dist2;
2550 enum tree_code code;
2551 tree type = TREE_TYPE (name1);
2552 gimple_stmt_iterator bsi;
2554 stmt1 = find_use_stmt (&name1);
2555 stmt2 = find_use_stmt (&name2);
2556 root1 = find_associative_operation_root (stmt1, &dist1);
2557 root2 = find_associative_operation_root (stmt2, &dist2);
2558 code = gimple_assign_rhs_code (stmt1);
2560 gcc_assert (root1 && root2 && root1 == root2
2561 && code == gimple_assign_rhs_code (stmt2));
2563 /* Find the root of the nearest expression in which both NAME1 and NAME2
2564 are used. */
2565 r1 = name1;
2566 s1 = stmt1;
2567 r2 = name2;
2568 s2 = stmt2;
2570 while (dist1 > dist2)
2572 s1 = find_use_stmt (&r1);
2573 r1 = gimple_assign_lhs (s1);
2574 dist1--;
2576 while (dist2 > dist1)
2578 s2 = find_use_stmt (&r2);
2579 r2 = gimple_assign_lhs (s2);
2580 dist2--;
2583 while (s1 != s2)
2585 s1 = find_use_stmt (&r1);
2586 r1 = gimple_assign_lhs (s1);
2587 s2 = find_use_stmt (&r2);
2588 r2 = gimple_assign_lhs (s2);
2591 /* Remove NAME1 and NAME2 from the statements in which they are currently
2592 used. */
2593 remove_name_from_operation (stmt1, name1);
2594 remove_name_from_operation (stmt2, name2);
2596 /* Insert the new statement combining NAME1 and NAME2 before S1, and
2597 combine it with the rhs of S1. */
2598 var = create_tmp_reg (type, "predreastmp");
2599 new_name = make_ssa_name (var);
2600 new_stmt = gimple_build_assign (new_name, code, name1, name2);
2602 var = create_tmp_reg (type, "predreastmp");
2603 tmp_name = make_ssa_name (var);
2605 /* Rhs of S1 may now be either a binary expression with operation
2606 CODE, or gimple_val (in case stmt1 == s1 or stmt2 == s1,
2607 so that name1 or name2 was removed from it). */
2608 tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
2609 gimple_assign_rhs1 (s1),
2610 gimple_assign_rhs2 (s1));
2612 bsi = gsi_for_stmt (s1);
2613 gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
2614 s1 = gsi_stmt (bsi);
2615 update_stmt (s1);
2617 gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
2618 gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);
2620 return new_stmt;
2623 /* Returns the statement that combines references R1 and R2. In case R1
2624 and R2 are not used in the same statement, but they are used with an
2625 associative and commutative operation in the same expression, reassociate
2626 the expression so that they are used in the same statement. */
2628 static gimple *
2629 stmt_combining_refs (dref r1, dref r2)
2631 gimple *stmt1, *stmt2;
2632 tree name1 = name_for_ref (r1);
2633 tree name2 = name_for_ref (r2);
2635 stmt1 = find_use_stmt (&name1);
2636 stmt2 = find_use_stmt (&name2);
2637 if (stmt1 == stmt2)
2638 return stmt1;
2640 return reassociate_to_the_same_stmt (name1, name2);
2643 /* Tries to combine chains CH1 and CH2 together. If this succeeds, the
2644 description of the new chain is returned, otherwise we return NULL. */
2646 static chain_p
2647 combine_chains (chain_p ch1, chain_p ch2)
2649 dref r1, r2, nw;
2650 enum tree_code op = ERROR_MARK;
2651 bool swap = false;
2652 chain_p new_chain;
2653 unsigned i;
2654 tree rslt_type = NULL_TREE;
2656 if (ch1 == ch2)
2657 return NULL;
2658 if (ch1->length != ch2->length)
2659 return NULL;
2661 if (ch1->refs.length () != ch2->refs.length ())
2662 return NULL;
2664 for (i = 0; (ch1->refs.iterate (i, &r1)
2665 && ch2->refs.iterate (i, &r2)); i++)
2667 if (r1->distance != r2->distance)
2668 return NULL;
2670 if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
2671 return NULL;
2674 if (swap)
2675 std::swap (ch1, ch2);
2677 new_chain = XCNEW (struct chain);
2678 new_chain->type = CT_COMBINATION;
2679 new_chain->op = op;
2680 new_chain->ch1 = ch1;
2681 new_chain->ch2 = ch2;
2682 new_chain->rslt_type = rslt_type;
2683 new_chain->length = ch1->length;
2685 for (i = 0; (ch1->refs.iterate (i, &r1)
2686 && ch2->refs.iterate (i, &r2)); i++)
2688 nw = XCNEW (struct dref_d);
2689 nw->stmt = stmt_combining_refs (r1, r2);
2690 nw->distance = r1->distance;
2692 new_chain->refs.safe_push (nw);
2695 ch1->combined = true;
2696 ch2->combined = true;
2697 return new_chain;
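At source level, the effect of combining two chains can be sketched as follows (a hand-written illustration with made-up names, not the pass's output or its internal representation): the references a[i], a[i+1] of one chain and b[i], b[i+1] of the other are pairwise combined with +, and the combined values themselves form a chain of distance 1 whose previous value each iteration can reuse:

```c
#include <assert.h>

/* Two chains with matching distances, {a[i], a[i + 1]} and
   {b[i], b[i + 1]}, combined elementwise with +.  */
static void naive (const int *a, const int *b, int *c, int n)
{
  for (int i = 0; i < n; i++)
    c[i] = (a[i] + b[i]) + (a[i + 1] + b[i + 1]);
}

/* The combined value a[i] + b[i] was already computed in iteration
   i - 1 (as a[i + 1] + b[i + 1]); reusing it saves one addition per
   iteration.  Assumes n >= 1.  */
static void combined (const int *a, const int *b, int *c, int n)
{
  int t0 = a[0] + b[0];		/* initializer from the preheader */
  for (int i = 0; i < n; i++)
    {
      int t1 = a[i + 1] + b[i + 1];
      c[i] = t0 + t1;
      t0 = t1;			/* looparound copy */
    }
}
```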
2700 /* Recursively update position information of all offspring chains to ROOT
2701 chain's position information. */
2703 static void
2704 update_pos_for_combined_chains (chain_p root)
2706 chain_p ch1 = root->ch1, ch2 = root->ch2;
2707 dref ref, ref1, ref2;
2708 for (unsigned j = 0; (root->refs.iterate (j, &ref)
2709 && ch1->refs.iterate (j, &ref1)
2710 && ch2->refs.iterate (j, &ref2)); ++j)
2711 ref1->pos = ref2->pos = ref->pos;
2713 if (ch1->type == CT_COMBINATION)
2714 update_pos_for_combined_chains (ch1);
2715 if (ch2->type == CT_COMBINATION)
2716 update_pos_for_combined_chains (ch2);
2719 /* Returns true if statement S1 dominates statement S2. */
2721 static bool
2722 pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
2724 basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
2726 if (!bb1 || s1 == s2)
2727 return true;
2729 if (bb1 == bb2)
2730 return gimple_uid (s1) < gimple_uid (s2);
2732 return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
2735 /* Try to combine the CHAINS in LOOP. */
2737 static void
2738 try_combine_chains (struct loop *loop, vec<chain_p> *chains)
2740 unsigned i, j;
2741 chain_p ch1, ch2, cch;
2742 auto_vec<chain_p> worklist;
2743 bool combined_p = false;
2745 FOR_EACH_VEC_ELT (*chains, i, ch1)
2746 if (chain_can_be_combined_p (ch1))
2747 worklist.safe_push (ch1);
2749 while (!worklist.is_empty ())
2751 ch1 = worklist.pop ();
2752 if (!chain_can_be_combined_p (ch1))
2753 continue;
2755 FOR_EACH_VEC_ELT (*chains, j, ch2)
2757 if (!chain_can_be_combined_p (ch2))
2758 continue;
2760 cch = combine_chains (ch1, ch2);
2761 if (cch)
2763 worklist.safe_push (cch);
2764 chains->safe_push (cch);
2765 combined_p = true;
2766 break;
2770 if (!combined_p)
2771 return;
2773 /* Set up UIDs for all statements in dominance order. */
2774 basic_block *bbs = get_loop_body (loop);
2775 renumber_gimple_stmt_uids_in_blocks (bbs, loop->num_nodes);
2776 free (bbs);
2778 /* Re-association in combined chains may generate statements in a different
2779 order than the references of the original chain. We need to keep references
2780 of combined chain in dominance order so that all uses will be inserted
2781 after definitions. Note:
2782 A) This is necessary for all combined chains.
2783 B) This is only necessary for ZERO distance references because other
2784 references inherit their value from loop carried PHIs.
2786 We first update position information for all combined chains. */
2787 dref ref;
2788 for (i = 0; chains->iterate (i, &ch1); ++i)
2790 if (ch1->type != CT_COMBINATION || ch1->combined)
2791 continue;
2793 for (j = 0; ch1->refs.iterate (j, &ref); ++j)
2794 ref->pos = gimple_uid (ref->stmt);
2796 update_pos_for_combined_chains (ch1);
2798 /* Then sort references according to newly updated position information. */
2799 for (i = 0; chains->iterate (i, &ch1); ++i)
2801 if (ch1->type != CT_COMBINATION && !ch1->combined)
2802 continue;
2804 /* Find the first reference with non-ZERO distance. */
2805 if (ch1->length == 0)
2806 j = ch1->refs.length();
2807 else
2809 for (j = 0; ch1->refs.iterate (j, &ref); ++j)
2810 if (ref->distance != 0)
2811 break;
2814 /* Sort all ZERO distance references by position. */
2815 qsort (&ch1->refs[0], j, sizeof (ch1->refs[0]), order_drefs_by_pos);
2817 if (ch1->combined)
2818 continue;
2820 /* For a ZERO length chain, has_max_use_after must be true since the root
2821 combined stmt must dominate the others. */
2822 if (ch1->length == 0)
2824 ch1->has_max_use_after = true;
2825 continue;
2827 /* Check if there is a use at the max distance after the root for combined
2828 chains and set the flag accordingly. */
2829 ch1->has_max_use_after = false;
2830 gimple *root_stmt = get_chain_root (ch1)->stmt;
2831 for (j = 1; ch1->refs.iterate (j, &ref); ++j)
2833 if (ref->distance == ch1->length
2834 && !pcom_stmt_dominates_stmt_p (ref->stmt, root_stmt))
2836 ch1->has_max_use_after = true;
2837 break;
2843 /* Prepare initializers for store elimination CHAIN in LOOP. Returns false
2844 if this is impossible because one of these initializers may trap, true
2845 otherwise. */
2847 static bool
2848 prepare_initializers_chain_store_elim (struct loop *loop, chain_p chain)
2850 unsigned i, n = chain->length;
2852 /* For now we can't eliminate stores if some of them are conditionally
2853 executed. */
2854 if (!chain->all_always_accessed)
2855 return false;
2857 /* Nothing to initialize for intra-iteration store elimination. */
2858 if (n == 0 && chain->type == CT_STORE_STORE)
2859 return true;
2861 /* For store elimination chain, there is nothing to initialize if stores
2862 to be eliminated only store loop invariant values into memory. */
2863 if (chain->type == CT_STORE_STORE
2864 && is_inv_store_elimination_chain (loop, chain))
2866 chain->inv_store_elimination = true;
2867 return true;
2870 chain->inits.create (n);
2871 chain->inits.safe_grow_cleared (n);
2873 /* For a store elimination chain like the one below:
2875 for (i = 0; i < len; i++)
2877 a[i] = 1;
2878 // a[i + 1] = ...
2879 a[i + 2] = 3;
2882 the store to a[i + 1] is missing from the loop body; it acts like a
2883 bubble. The content of a[i + 1] remains the same if the loop iterates
2884 fewer times than chain->length. We need to set up root variables for
2885 such stores by loading from memory before the loop. Note we only need
2886 to load the bubble elements, because the loop body is guaranteed to be
2887 executed at least once after the loop's preheader edge. */
2888 auto_vec<bool> bubbles;
2889 bubbles.safe_grow_cleared (n + 1);
2890 for (i = 0; i < chain->refs.length (); i++)
2891 bubbles[chain->refs[i]->distance] = true;
2893 struct data_reference *dr = get_chain_root (chain)->ref;
2894 for (i = 0; i < n; i++)
2896 if (bubbles[i])
2897 continue;
2899 gimple_seq stmts = NULL;
2901 tree init = ref_at_iteration (dr, (int) 0 - i, &stmts);
2902 if (stmts)
2903 gimple_seq_add_seq_without_update (&chain->init_seq, stmts);
2905 chain->inits[i] = init;
2908 return true;
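The bubble situation can be illustrated at source level. The sketch below (plain C with made-up names, not the code the pass generates) shows why only the stores of the last chain->length iterations survive, and why a bubble slot simply keeps its old memory content:

```c
#include <assert.h>

/* The store-store chain a[i] = 1; a[i + 2] = 3; has length 2 with a
   bubble at distance 1.  Every a[i + 2] = 3 store except those of the
   last two iterations is killed by the a[i] = 1 store two iterations
   later.  */
static void naive (int *a, int len)
{
  for (int i = 0; i < len; i++)
    {
      a[i] = 1;
      a[i + 2] = 3;
    }
}

/* Equivalent form: the dead stores are gone and only the surviving
   stores are committed after the loop.  */
static void eliminated (int *a, int len)
{
  for (int i = 0; i < len; i++)
    a[i] = 1;
  if (len >= 1)
    a[len + 1] = 3;	/* surviving store of iteration len - 1 */
  if (len >= 2)
    a[len] = 3;		/* surviving store of iteration len - 2;
			   for len == 1 this slot keeps its old
			   memory content (the "bubble" effect) */
}
```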
2911 /* Prepare initializers for CHAIN in LOOP. Returns false if this is
2912 impossible because one of these initializers may trap, true otherwise. */
2914 static bool
2915 prepare_initializers_chain (struct loop *loop, chain_p chain)
2917 unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
2918 struct data_reference *dr = get_chain_root (chain)->ref;
2919 tree init;
2920 dref laref;
2921 edge entry = loop_preheader_edge (loop);
2923 if (chain->type == CT_STORE_STORE)
2924 return prepare_initializers_chain_store_elim (loop, chain);
2926 /* Find the initializers for the variables, and check that they cannot
2927 trap. */
2928 chain->inits.create (n);
2929 for (i = 0; i < n; i++)
2930 chain->inits.quick_push (NULL_TREE);
2932 /* If we have replaced some looparound phi nodes, use their initializers
2933 instead of creating our own. */
2934 FOR_EACH_VEC_ELT (chain->refs, i, laref)
2936 if (gimple_code (laref->stmt) != GIMPLE_PHI)
2937 continue;
2939 gcc_assert (laref->distance > 0);
2940 chain->inits[n - laref->distance]
2941 = PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
2944 for (i = 0; i < n; i++)
2946 gimple_seq stmts = NULL;
2948 if (chain->inits[i] != NULL_TREE)
2949 continue;
2951 init = ref_at_iteration (dr, (int) i - n, &stmts);
2952 if (!chain->all_always_accessed && tree_could_trap_p (init))
2954 gimple_seq_discard (stmts);
2955 return false;
2958 if (stmts)
2959 gimple_seq_add_seq_without_update (&chain->init_seq, stmts);
2961 chain->inits[i] = init;
2964 return true;
2967 /* Prepare initializers for CHAINS in LOOP, and free chains that cannot
2968 be used because the initializers might trap. */
2970 static void
2971 prepare_initializers (struct loop *loop, vec<chain_p> chains)
2973 chain_p chain;
2974 unsigned i;
2976 for (i = 0; i < chains.length (); )
2978 chain = chains[i];
2979 if (prepare_initializers_chain (loop, chain))
2980 i++;
2981 else
2983 release_chain (chain);
2984 chains.unordered_remove (i);
2989 /* Generates finalizer memory references for CHAIN in LOOP. Returns true
2990 if finalizer code for CHAIN can be generated, otherwise false. */
2992 static bool
2993 prepare_finalizers_chain (struct loop *loop, chain_p chain)
2995 unsigned i, n = chain->length;
2996 struct data_reference *dr = get_chain_root (chain)->ref;
2997 tree fini, niters = number_of_latch_executions (loop);
2999 /* For now we can't eliminate stores if some of them are conditionally
3000 executed. */
3001 if (!chain->all_always_accessed)
3002 return false;
3004 chain->finis.create (n);
3005 for (i = 0; i < n; i++)
3006 chain->finis.quick_push (NULL_TREE);
3008 /* We never use looparound phi nodes for store elimination chains. */
3010 /* Find the finalizers for the variables, and check that they cannot
3011 trap. */
3012 for (i = 0; i < n; i++)
3014 gimple_seq stmts = NULL;
3015 gcc_assert (chain->finis[i] == NULL_TREE);
3017 if (TREE_CODE (niters) != INTEGER_CST && TREE_CODE (niters) != SSA_NAME)
3019 niters = unshare_expr (niters);
3020 niters = force_gimple_operand (niters, &stmts, true, NULL);
3021 if (stmts)
3023 gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
3024 stmts = NULL;
3027 fini = ref_at_iteration (dr, (int) 0 - i, &stmts, niters);
3028 if (stmts)
3029 gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
3031 chain->finis[i] = fini;
3034 return true;
3037 /* Generates finalizer memory reference for CHAINS in LOOP. Returns true
3038 if finalizer code generation for CHAINS breaks loop closed ssa form. */
3040 static bool
3041 prepare_finalizers (struct loop *loop, vec<chain_p> chains)
3043 chain_p chain;
3044 unsigned i;
3045 bool loop_closed_ssa = false;
3047 for (i = 0; i < chains.length ();)
3049 chain = chains[i];
3051 /* Finalizer is only necessary for inter-iteration store elimination
3052 chains. */
3053 if (chain->length == 0 || chain->type != CT_STORE_STORE)
3055 i++;
3056 continue;
3059 if (prepare_finalizers_chain (loop, chain))
3061 i++;
3062 /* Be conservative and assume loop closed ssa form is corrupted
3063 by the store-store chain, though that's not always the case if
3064 the eliminated stores only store loop invariant values into
3065 memory. */
3066 loop_closed_ssa = true;
3068 else
3070 release_chain (chain);
3071 chains.unordered_remove (i);
3074 return loop_closed_ssa;
3077 /* Insert all initializing gimple stmts into loop's entry edge. */
3079 static void
3080 insert_init_seqs (struct loop *loop, vec<chain_p> chains)
3082 unsigned i;
3083 edge entry = loop_preheader_edge (loop);
3085 for (i = 0; i < chains.length (); ++i)
3086 if (chains[i]->init_seq)
3088 gsi_insert_seq_on_edge_immediate (entry, chains[i]->init_seq);
3089 chains[i]->init_seq = NULL;
3093 /* Performs predictive commoning for LOOP. Sets bit 1<<0 of return value
3094 if LOOP was unrolled; sets bit 1<<1 of return value if loop closed ssa
3095 form was corrupted. */
3097 static unsigned
3098 tree_predictive_commoning_loop (struct loop *loop)
3100 vec<data_reference_p> datarefs;
3101 vec<ddr_p> dependences;
3102 struct component *components;
3103 vec<chain_p> chains = vNULL;
3104 unsigned unroll_factor;
3105 struct tree_niter_desc desc;
3106 bool unroll = false, loop_closed_ssa = false;
3107 edge exit;
3109 if (dump_file && (dump_flags & TDF_DETAILS))
3110 fprintf (dump_file, "Processing loop %d\n", loop->num);
3112 /* Nothing to do for predictive commoning if the loop iterates only once. */
3113 if (get_max_loop_iterations_int (loop) == 0)
3115 if (dump_file && (dump_flags & TDF_DETAILS))
3116 fprintf (dump_file, "Loop iterates only 1 time, nothing to do.\n");
3118 return 0;
3121 /* Find the data references and split them into components according to their
3122 dependence relations. */
3123 auto_vec<loop_p, 3> loop_nest;
3124 dependences.create (10);
3125 datarefs.create (10);
3126 if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
3127 &dependences))
3129 if (dump_file && (dump_flags & TDF_DETAILS))
3130 fprintf (dump_file, "Cannot analyze data dependencies\n");
3131 free_data_refs (datarefs);
3132 free_dependence_relations (dependences);
3133 return 0;
3136 if (dump_file && (dump_flags & TDF_DETAILS))
3137 dump_data_dependence_relations (dump_file, dependences);
3139 components = split_data_refs_to_components (loop, datarefs, dependences);
3140 loop_nest.release ();
3141 free_dependence_relations (dependences);
3142 if (!components)
3144 free_data_refs (datarefs);
3145 free_affine_expand_cache (&name_expansions);
3146 return 0;
3149 if (dump_file && (dump_flags & TDF_DETAILS))
3151 fprintf (dump_file, "Initial state:\n\n");
3152 dump_components (dump_file, components);
3155 /* Find the suitable components and split them into chains. */
3156 components = filter_suitable_components (loop, components);
3158 auto_bitmap tmp_vars;
3159 looparound_phis = BITMAP_ALLOC (NULL);
3160 determine_roots (loop, components, &chains);
3161 release_components (components);
3163 if (!chains.exists ())
3165 if (dump_file && (dump_flags & TDF_DETAILS))
3166 fprintf (dump_file,
3167 "Predictive commoning failed: no suitable chains\n");
3168 goto end;
3170 prepare_initializers (loop, chains);
3171 loop_closed_ssa = prepare_finalizers (loop, chains);
3173 /* Try to combine the chains that are always used together. */
3174 try_combine_chains (loop, &chains);
3176 insert_init_seqs (loop, chains);
3178 if (dump_file && (dump_flags & TDF_DETAILS))
3180 fprintf (dump_file, "Before commoning:\n\n");
3181 dump_chains (dump_file, chains);
3184 /* Determine the unroll factor, and if the loop should be unrolled, ensure
3185 that its number of iterations is divisible by the factor. */
3186 unroll_factor = determine_unroll_factor (chains);
3187 scev_reset ();
3188 unroll = (unroll_factor > 1
3189 && can_unroll_loop_p (loop, unroll_factor, &desc));
3190 exit = single_dom_exit (loop);
3192 /* Execute the predictive commoning transformations, and possibly unroll the
3193 loop. */
3194 if (unroll)
3196 struct epcc_data dta;
3198 if (dump_file && (dump_flags & TDF_DETAILS))
3199 fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);
3201 dta.chains = chains;
3202 dta.tmp_vars = tmp_vars;
3204 update_ssa (TODO_update_ssa_only_virtuals);
3206 /* Cfg manipulations performed in tree_transform_and_unroll_loop before
3207 execute_pred_commoning_cbck is called may cause phi nodes to be
3208 reallocated, which is a problem since CHAINS may point to these
3209 statements. To fix this, we store the ssa names defined by the
3210 phi nodes here instead of the phi nodes themselves, and restore
3211 the phi nodes in execute_pred_commoning_cbck. A bit hacky. */
3212 replace_phis_by_defined_names (chains);
3214 tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
3215 execute_pred_commoning_cbck, &dta);
3216 eliminate_temp_copies (loop, tmp_vars);
3218 else
3220 if (dump_file && (dump_flags & TDF_DETAILS))
3221 fprintf (dump_file,
3222 "Executing predictive commoning without unrolling.\n");
3223 execute_pred_commoning (loop, chains, tmp_vars);
3226 end: ;
3227 release_chains (chains);
3228 free_data_refs (datarefs);
3229 BITMAP_FREE (looparound_phis);
3231 free_affine_expand_cache (&name_expansions);
3233 return (unroll ? 1 : 0) | (loop_closed_ssa ? 2 : 0);
3236 /* Runs predictive commoning. */
3238 unsigned
3239 tree_predictive_commoning (void)
3241 struct loop *loop;
3242 unsigned ret = 0, changed = 0;
3244 initialize_original_copy_tables ();
3245 FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
3246 if (optimize_loop_for_speed_p (loop))
3248 changed |= tree_predictive_commoning_loop (loop);
3250 free_original_copy_tables ();
3252 if (changed > 0)
3254 scev_reset ();
3256 if (changed > 1)
3257 rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
3259 ret = TODO_cleanup_cfg;
3262 return ret;
3265 /* Predictive commoning Pass. */
3267 static unsigned
3268 run_tree_predictive_commoning (struct function *fun)
3270 if (number_of_loops (fun) <= 1)
3271 return 0;
3273 return tree_predictive_commoning ();
3276 namespace {
3278 const pass_data pass_data_predcom =
3280 GIMPLE_PASS, /* type */
3281 "pcom", /* name */
3282 OPTGROUP_LOOP, /* optinfo_flags */
3283 TV_PREDCOM, /* tv_id */
3284 PROP_cfg, /* properties_required */
3285 0, /* properties_provided */
3286 0, /* properties_destroyed */
3287 0, /* todo_flags_start */
3288 TODO_update_ssa_only_virtuals, /* todo_flags_finish */
3291 class pass_predcom : public gimple_opt_pass
3293 public:
3294 pass_predcom (gcc::context *ctxt)
3295 : gimple_opt_pass (pass_data_predcom, ctxt)
3298 /* opt_pass methods: */
3299 virtual bool gate (function *) { return flag_predictive_commoning != 0; }
3300 virtual unsigned int execute (function *fun)
3302 return run_tree_predictive_commoning (fun);
3305 }; // class pass_predcom
3307 } // anon namespace
3309 gimple_opt_pass *
3310 make_pass_predcom (gcc::context *ctxt)
3312 return new pass_predcom (ctxt);