oops - omitted from previous delta fixing UNIQUE_SECTION
[official-gcc.git] / gcc / haifa-sched.c
blob2c9adf2815b717d5bd8797e1770a341b4793d90c
1 /* Instruction scheduling pass.
2 Copyright (C) 1992, 93-98, 1999, 2000 Free Software Foundation, Inc.
3 Contributed by Michael Tiemann (tiemann@cygnus.com) Enhanced by,
4 and currently maintained by, Jim Wilson (wilson@cygnus.com)
6 This file is part of GNU CC.
8 GNU CC is free software; you can redistribute it and/or modify it
9 under the terms of the GNU General Public License as published by
10 the Free Software Foundation; either version 2, or (at your option)
11 any later version.
13 GNU CC is distributed in the hope that it will be useful, but
14 WITHOUT ANY WARRANTY; without even the implied warranty of
15 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
16 General Public License for more details.
18 You should have received a copy of the GNU General Public License
19 along with GNU CC; see the file COPYING. If not, write to the Free
20 the Free Software Foundation, 59 Temple Place - Suite 330,
21 Boston, MA 02111-1307, USA. */
24 /* Instruction scheduling pass.
26 This pass implements list scheduling within basic blocks. It is
27 run twice: (1) after flow analysis, but before register allocation,
28 and (2) after register allocation.
30 The first run performs interblock scheduling, moving insns between
31 different blocks in the same "region", and the second runs only
32 basic block scheduling.
34 Interblock motions performed are useful motions and speculative
35 motions, including speculative loads. Motions requiring code
36 duplication are not supported. The identification of motion type
37 and the check for validity of speculative motions requires
38 construction and analysis of the function's control flow graph.
39 The scheduler works as follows:
41 We compute insn priorities based on data dependencies. Flow
42 analysis only creates a fraction of the data-dependencies we must
43 observe: namely, only those dependencies which the combiner can be
44 expected to use. For this pass, we must therefore create the
45 remaining dependencies we need to observe: register dependencies,
46 memory dependencies, dependencies to keep function calls in order,
47 and the dependence between a conditional branch and the setting of
48 condition codes are all dealt with here.
50 The scheduler first traverses the data flow graph, starting with
51 the last instruction, and proceeding to the first, assigning values
52 to insn_priority as it goes. This sorts the instructions
53 topologically by data dependence.
55 Once priorities have been established, we order the insns using
56 list scheduling. This works as follows: starting with a list of
57 all the ready insns, and sorted according to priority number, we
58 schedule the insn from the end of the list by placing its
59 predecessors in the list according to their priority order. We
60 consider this insn scheduled by setting the pointer to the "end" of
61 the list to point to the previous insn. When an insn has no
62 predecessors, we either queue it until sufficient time has elapsed
63 or add it to the ready list. As the instructions are scheduled or
64 when stalls are introduced, the queue advances and dumps insns into
65 the ready list. When all insns down to the lowest priority have
66 been scheduled, the critical path of the basic block has been made
67 as short as possible. The remaining insns are then scheduled in
68 remaining slots.
70 Function unit conflicts are resolved during forward list scheduling
71 by tracking the time when each insn is committed to the schedule
72 and from that, the time the function units it uses must be free.
73 As insns on the ready list are considered for scheduling, those
74 that would result in a blockage of the already committed insns are
75 queued until no blockage will result.
77 The following list shows the order in which we want to break ties
78 among insns in the ready list:
80 1. choose insn with the longest path to end of bb, ties
81 broken by
82 2. choose insn with least contribution to register pressure,
83 ties broken by
84 3. prefer in-block upon interblock motion, ties broken by
85 4. prefer useful upon speculative motion, ties broken by
86 5. choose insn with largest control flow probability, ties
87 broken by
88 6. choose insn with the least dependences upon the previously
89 scheduled insn, or finally
90 7 choose the insn which has the most insns dependent on it.
91 8. choose insn with lowest UID.
93 Memory references complicate matters. Only if we can be certain
94 that memory references are not part of the data dependency graph
95 (via true, anti, or output dependence), can we move operations past
96 memory references. To first approximation, reads can be done
97 independently, while writes introduce dependencies. Better
98 approximations will yield fewer dependencies.
100 Before reload, an extended analysis of interblock data dependences
101 is required for interblock scheduling. This is performed in
102 compute_block_backward_dependences ().
104 Dependencies set up by memory references are treated in exactly the
105 same way as other dependencies, by using LOG_LINKS backward
106 dependences. LOG_LINKS are translated into INSN_DEPEND forward
107 dependences for the purpose of forward list scheduling.
109 Having optimized the critical path, we may have also unduly
110 extended the lifetimes of some registers. If an operation requires
111 that constants be loaded into registers, it is certainly desirable
112 to load those constants as early as necessary, but no earlier.
113 I.e., it will not do to load up a bunch of registers at the
114 beginning of a basic block only to use them at the end, if they
115 could be loaded later, since this may result in excessive register
116 utilization.
118 Note that since branches are never in basic blocks, but only end
119 basic blocks, this pass will not move branches. But that is ok,
120 since we can use GNU's delayed branch scheduling pass to take care
121 of this case.
123 Also note that no further optimizations based on algebraic
124 identities are performed, so this pass would be a good one to
125 perform instruction splitting, such as breaking up a multiply
126 instruction into shifts and adds where that is profitable.
128 Given the memory aliasing analysis that this pass should perform,
129 it should be possible to remove redundant stores to memory, and to
130 load values from registers instead of hitting memory.
132 Before reload, speculative insns are moved only if a 'proof' exists
133 that no exception will be caused by this, and if no live registers
134 exist that inhibit the motion (live registers constraints are not
135 represented by data dependence edges).
137 This pass must update information that subsequent passes expect to
138 be correct. Namely: reg_n_refs, reg_n_sets, reg_n_deaths,
139 reg_n_calls_crossed, and reg_live_length. Also, BLOCK_HEAD,
140 BLOCK_END.
142 The information in the line number notes is carefully retained by
143 this pass. Notes that refer to the starting and ending of
144 exception regions are also carefully retained by this pass. All
145 other NOTE insns are grouped in their same relative order at the
146 beginning of basic blocks and regions that have been scheduled.
148 The main entry point for this pass is schedule_insns(), called for
149 each function. The work of the scheduler is organized in three
150 levels: (1) function level: insns are subject to splitting,
151 control-flow-graph is constructed, regions are computed (after
152 reload, each region is of one block), (2) region level: control
153 flow graph attributes required for interblock scheduling are
154 computed (dominators, reachability, etc.), data dependences and
155 priorities are computed, and (3) block level: insns in the block
156 are actually scheduled. */
158 #include "config.h"
159 #include "system.h"
160 #include "toplev.h"
161 #include "rtl.h"
162 #include "tm_p.h"
163 #include "basic-block.h"
164 #include "regs.h"
165 #include "function.h"
166 #include "hard-reg-set.h"
167 #include "flags.h"
168 #include "insn-config.h"
169 #include "insn-attr.h"
170 #include "except.h"
171 #include "toplev.h"
172 #include "recog.h"
174 extern char *reg_known_equiv_p;
175 extern rtx *reg_known_value;
177 #ifdef INSN_SCHEDULING
179 /* target_units bitmask has 1 for each unit in the cpu. It should be
180 possible to compute this variable from the machine description.
181 But currently it is computed by examining the insn list. Since
182 this is only needed for visualization, it seems an acceptable
183 solution. (For understanding the mapping of bits to units, see
184 definition of function_units[] in "insn-attrtab.c".) */
186 static int target_units = 0;
188 /* issue_rate is the number of insns that can be scheduled in the same
189 machine cycle. It can be defined in the config/mach/mach.h file,
190 otherwise we set it to 1. */
192 static int issue_rate;
194 #ifndef ISSUE_RATE
195 #define ISSUE_RATE 1
196 #endif
198 /* sched-verbose controls the amount of debugging output the
199 scheduler prints. It is controlled by -fsched-verbose-N:
200 N>0 and no -DSR : the output is directed to stderr.
201 N>=10 will direct the printouts to stderr (regardless of -dSR).
202 N=1: same as -dSR.
203 N=2: bb's probabilities, detailed ready list info, unit/insn info.
204 N=3: rtl at abort point, control-flow, regions info.
205 N=5: dependences info. */
207 #define MAX_RGN_BLOCKS 10
208 #define MAX_RGN_INSNS 100
210 static int sched_verbose_param = 0;
211 static int sched_verbose = 0;
213 /* nr_inter/spec counts interblock/speculative motion for the function. */
214 static int nr_inter, nr_spec;
217 /* Debugging file. All printouts are sent to dump, which is always set,
218 either to stderr, or to the dump listing file (-dRS). */
219 static FILE *dump = 0;
221 /* fix_sched_param() is called from toplev.c upon detection
222 of the -fsched-***-N options. */
224 void
225 fix_sched_param (param, val)
226 const char *param, *val;
228 if (!strcmp (param, "verbose"))
229 sched_verbose_param = atoi (val);
230 else
231 warning ("fix_sched_param: unknown param: %s", param);
234 /* Describe state of dependencies used during sched_analyze phase. */
235 struct deps
237 /* The *_insns and *_mems are paired lists. Each pending memory operation
238 will have a pointer to the MEM rtx on one list and a pointer to the
239 containing insn on the other list in the same place in the list. */
241 /* We can't use add_dependence like the old code did, because a single insn
242 may have multiple memory accesses, and hence needs to be on the list
243 once for each memory access. Add_dependence won't let you add an insn
244 to a list more than once. */
246 /* An INSN_LIST containing all insns with pending read operations. */
247 rtx pending_read_insns;
249 /* An EXPR_LIST containing all MEM rtx's which are pending reads. */
250 rtx pending_read_mems;
252 /* An INSN_LIST containing all insns with pending write operations. */
253 rtx pending_write_insns;
255 /* An EXPR_LIST containing all MEM rtx's which are pending writes. */
256 rtx pending_write_mems;
258 /* Indicates the combined length of the two pending lists. We must prevent
259 these lists from ever growing too large since the number of dependencies
260 produced is at least O(N*N), and execution time is at least O(4*N*N), as
261 a function of the length of these pending lists. */
262 int pending_lists_length;
264 /* The last insn upon which all memory references must depend.
265 This is an insn which flushed the pending lists, creating a dependency
266 between it and all previously pending memory references. This creates
267 a barrier (or a checkpoint) which no memory reference is allowed to cross.
269 This includes all non constant CALL_INSNs. When we do interprocedural
270 alias analysis, this restriction can be relaxed.
271 This may also be an INSN that writes memory if the pending lists grow
272 too large. */
273 rtx last_pending_memory_flush;
275 /* The last function call we have seen. All hard regs, and, of course,
276 the last function call, must depend on this. */
277 rtx last_function_call;
279 /* The LOG_LINKS field of this is a list of insns which use a pseudo register
280 that does not already cross a call. We create dependencies between each
281 of those insn and the next call insn, to ensure that they won't cross a call
282 after scheduling is done. */
283 rtx sched_before_next_call;
285 /* Element N is the next insn that sets (hard or pseudo) register
286 N within the current basic block; or zero, if there is no
287 such insn. Needed for new registers which may be introduced
288 by splitting insns. */
289 rtx *reg_last_uses;
290 rtx *reg_last_sets;
291 rtx *reg_last_clobbers;
294 static regset reg_pending_sets;
295 static regset reg_pending_clobbers;
296 static int reg_pending_sets_all;
298 /* To speed up the test for duplicate dependency links we keep a record
299 of true dependencies created by add_dependence when the average number
300 of instructions in a basic block is very large.
302 Studies have shown that there is typically around 5 instructions between
303 branches for typical C code. So we can make a guess that the average
304 basic block is approximately 5 instructions long; we will choose 100X
305 the average size as a very large basic block.
307 Each insn has an associated bitmap for its dependencies. Each bitmap
308 has enough entries to represent a dependency on any other insn in the
309 insn chain. */
310 static sbitmap *true_dependency_cache;
312 /* Indexed by INSN_UID, the collection of all data associated with
313 a single instruction. */
315 struct haifa_insn_data
317 /* A list of insns which depend on the instruction. Unlike LOG_LINKS,
318 it represents forward dependancies. */
319 rtx depend;
321 /* The line number note in effect for each insn. For line number
322 notes, this indicates whether the note may be reused. */
323 rtx line_note;
325 /* Logical uid gives the original ordering of the insns. */
326 int luid;
328 /* A priority for each insn. */
329 int priority;
331 /* The number of incoming edges in the forward dependency graph.
332 As scheduling proceds, counts are decreased. An insn moves to
333 the ready queue when its counter reaches zero. */
334 int dep_count;
336 /* An encoding of the blockage range function. Both unit and range
337 are coded. */
338 unsigned int blockage;
340 /* Number of instructions referring to this insn. */
341 int ref_count;
343 /* The minimum clock tick at which the insn becomes ready. This is
344 used to note timing constraints for the insns in the pending list. */
345 int tick;
347 short cost;
349 /* An encoding of the function units used. */
350 short units;
352 /* This weight is an estimation of the insn's contribution to
353 register pressure. */
354 short reg_weight;
356 /* Some insns (e.g. call) are not allowed to move across blocks. */
357 unsigned int cant_move : 1;
359 /* Set if there's DEF-USE dependance between some speculatively
360 moved load insn and this one. */
361 unsigned int fed_by_spec_load : 1;
362 unsigned int is_load_insn : 1;
365 static struct haifa_insn_data *h_i_d;
367 #define INSN_DEPEND(INSN) (h_i_d[INSN_UID (INSN)].depend)
368 #define INSN_LUID(INSN) (h_i_d[INSN_UID (INSN)].luid)
369 #define INSN_PRIORITY(INSN) (h_i_d[INSN_UID (INSN)].priority)
370 #define INSN_DEP_COUNT(INSN) (h_i_d[INSN_UID (INSN)].dep_count)
371 #define INSN_COST(INSN) (h_i_d[INSN_UID (INSN)].cost)
372 #define INSN_UNIT(INSN) (h_i_d[INSN_UID (INSN)].units)
373 #define INSN_REG_WEIGHT(INSN) (h_i_d[INSN_UID (INSN)].reg_weight)
375 #define INSN_BLOCKAGE(INSN) (h_i_d[INSN_UID (INSN)].blockage)
376 #define UNIT_BITS 5
377 #define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)
378 #define ENCODE_BLOCKAGE(U, R) \
379 (((U) << BLOCKAGE_BITS \
380 | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS \
381 | MAX_BLOCKAGE_COST (R))
382 #define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))
383 #define BLOCKAGE_RANGE(B) \
384 (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \
385 | ((B) & BLOCKAGE_MASK))
387 /* Encodings of the `<name>_unit_blockage_range' function. */
388 #define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))
389 #define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))
391 #define DONE_PRIORITY -1
392 #define MAX_PRIORITY 0x7fffffff
393 #define TAIL_PRIORITY 0x7ffffffe
394 #define LAUNCH_PRIORITY 0x7f000001
395 #define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)
396 #define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)
398 #define INSN_REF_COUNT(INSN) (h_i_d[INSN_UID (INSN)].ref_count)
399 #define LINE_NOTE(INSN) (h_i_d[INSN_UID (INSN)].line_note)
400 #define INSN_TICK(INSN) (h_i_d[INSN_UID (INSN)].tick)
401 #define CANT_MOVE(insn) (h_i_d[INSN_UID (insn)].cant_move)
402 #define FED_BY_SPEC_LOAD(insn) (h_i_d[INSN_UID (insn)].fed_by_spec_load)
403 #define IS_LOAD_INSN(insn) (h_i_d[INSN_UID (insn)].is_load_insn)
405 /* Vector indexed by basic block number giving the starting line-number
406 for each basic block. */
407 static rtx *line_note_head;
409 /* List of important notes we must keep around. This is a pointer to the
410 last element in the list. */
411 static rtx note_list;
413 /* Queues, etc. */
415 /* An instruction is ready to be scheduled when all insns preceding it
416 have already been scheduled. It is important to ensure that all
417 insns which use its result will not be executed until its result
418 has been computed. An insn is maintained in one of four structures:
420 (P) the "Pending" set of insns which cannot be scheduled until
421 their dependencies have been satisfied.
422 (Q) the "Queued" set of insns that can be scheduled when sufficient
423 time has passed.
424 (R) the "Ready" list of unscheduled, uncommitted insns.
425 (S) the "Scheduled" list of insns.
427 Initially, all insns are either "Pending" or "Ready" depending on
428 whether their dependencies are satisfied.
430 Insns move from the "Ready" list to the "Scheduled" list as they
431 are committed to the schedule. As this occurs, the insns in the
432 "Pending" list have their dependencies satisfied and move to either
433 the "Ready" list or the "Queued" set depending on whether
434 sufficient time has passed to make them ready. As time passes,
435 insns move from the "Queued" set to the "Ready" list. Insns may
436 move from the "Ready" list to the "Queued" set if they are blocked
437 due to a function unit conflict.
439 The "Pending" list (P) are the insns in the INSN_DEPEND of the unscheduled
440 insns, i.e., those that are ready, queued, and pending.
441 The "Queued" set (Q) is implemented by the variable `insn_queue'.
442 The "Ready" list (R) is implemented by the variables `ready' and
443 `n_ready'.
444 The "Scheduled" list (S) is the new insn chain built by this pass.
446 The transition (R->S) is implemented in the scheduling loop in
447 `schedule_block' when the best insn to schedule is chosen.
448 The transition (R->Q) is implemented in `queue_insn' when an
449 insn is found to have a function unit conflict with the already
450 committed insns.
451 The transitions (P->R and P->Q) are implemented in `schedule_insn' as
452 insns move from the ready list to the scheduled list.
453 The transition (Q->R) is implemented in 'queue_to_insn' as time
454 passes or stalls are introduced. */
456 /* Implement a circular buffer to delay instructions until sufficient
457 time has passed. INSN_QUEUE_SIZE is a power of two larger than
458 MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c. This is the
459 longest time an isnsn may be queued. */
460 static rtx insn_queue[INSN_QUEUE_SIZE];
461 static int q_ptr = 0;
462 static int q_size = 0;
463 #define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))
464 #define NEXT_Q_AFTER(X, C) (((X)+C) & (INSN_QUEUE_SIZE-1))
466 /* Forward declarations. */
467 static void add_dependence PARAMS ((rtx, rtx, enum reg_note));
468 #ifdef HAVE_cc0
469 static void remove_dependence PARAMS ((rtx, rtx));
470 #endif
471 static rtx find_insn_list PARAMS ((rtx, rtx));
472 static int insn_unit PARAMS ((rtx));
473 static unsigned int blockage_range PARAMS ((int, rtx));
474 static void clear_units PARAMS ((void));
475 static int actual_hazard_this_instance PARAMS ((int, int, rtx, int, int));
476 static void schedule_unit PARAMS ((int, rtx, int));
477 static int actual_hazard PARAMS ((int, rtx, int, int));
478 static int potential_hazard PARAMS ((int, rtx, int));
479 static int insn_cost PARAMS ((rtx, rtx, rtx));
480 static int priority PARAMS ((rtx));
481 static void free_pending_lists PARAMS ((void));
482 static void add_insn_mem_dependence PARAMS ((struct deps *, rtx *, rtx *, rtx,
483 rtx));
484 static void flush_pending_lists PARAMS ((struct deps *, rtx, int));
485 static void sched_analyze_1 PARAMS ((struct deps *, rtx, rtx));
486 static void sched_analyze_2 PARAMS ((struct deps *, rtx, rtx));
487 static void sched_analyze_insn PARAMS ((struct deps *, rtx, rtx, rtx));
488 static void sched_analyze PARAMS ((struct deps *, rtx, rtx));
489 static int rank_for_schedule PARAMS ((const PTR, const PTR));
490 static void swap_sort PARAMS ((rtx *, int));
491 static void queue_insn PARAMS ((rtx, int));
492 static int schedule_insn PARAMS ((rtx, rtx *, int, int));
493 static void find_insn_reg_weight PARAMS ((int));
494 static int schedule_block PARAMS ((int, int));
495 static char *safe_concat PARAMS ((char *, char *, const char *));
496 static int insn_issue_delay PARAMS ((rtx));
497 static void adjust_priority PARAMS ((rtx));
499 /* Control flow graph edges are kept in circular lists. */
500 typedef struct
502 int from_block;
503 int to_block;
504 int next_in;
505 int next_out;
507 haifa_edge;
508 static haifa_edge *edge_table;
510 #define NEXT_IN(edge) (edge_table[edge].next_in)
511 #define NEXT_OUT(edge) (edge_table[edge].next_out)
512 #define FROM_BLOCK(edge) (edge_table[edge].from_block)
513 #define TO_BLOCK(edge) (edge_table[edge].to_block)
515 /* Number of edges in the control flow graph. (In fact, larger than
516 that by 1, since edge 0 is unused.) */
517 static int nr_edges;
519 /* Circular list of incoming/outgoing edges of a block. */
520 static int *in_edges;
521 static int *out_edges;
523 #define IN_EDGES(block) (in_edges[block])
524 #define OUT_EDGES(block) (out_edges[block])
528 static int is_cfg_nonregular PARAMS ((void));
529 static int build_control_flow PARAMS ((struct edge_list *));
530 static void new_edge PARAMS ((int, int));
533 /* A region is the main entity for interblock scheduling: insns
534 are allowed to move between blocks in the same region, along
535 control flow graph edges, in the 'up' direction. */
536 typedef struct
538 int rgn_nr_blocks; /* Number of blocks in region. */
539 int rgn_blocks; /* cblocks in the region (actually index in rgn_bb_table). */
541 region;
543 /* Number of regions in the procedure. */
544 static int nr_regions;
546 /* Table of region descriptions. */
547 static region *rgn_table;
549 /* Array of lists of regions' blocks. */
550 static int *rgn_bb_table;
552 /* Topological order of blocks in the region (if b2 is reachable from
553 b1, block_to_bb[b2] > block_to_bb[b1]). Note: A basic block is
554 always referred to by either block or b, while its topological
555 order name (in the region) is refered to by bb. */
556 static int *block_to_bb;
558 /* The number of the region containing a block. */
559 static int *containing_rgn;
561 #define RGN_NR_BLOCKS(rgn) (rgn_table[rgn].rgn_nr_blocks)
562 #define RGN_BLOCKS(rgn) (rgn_table[rgn].rgn_blocks)
563 #define BLOCK_TO_BB(block) (block_to_bb[block])
564 #define CONTAINING_RGN(block) (containing_rgn[block])
566 void debug_regions PARAMS ((void));
567 static void find_single_block_region PARAMS ((void));
568 static void find_rgns PARAMS ((struct edge_list *, sbitmap *));
569 static int too_large PARAMS ((int, int *, int *));
571 extern void debug_live PARAMS ((int, int));
573 /* Blocks of the current region being scheduled. */
574 static int current_nr_blocks;
575 static int current_blocks;
577 /* The mapping from bb to block. */
578 #define BB_TO_BLOCK(bb) (rgn_bb_table[current_blocks + (bb)])
581 /* Bit vectors and bitset operations are needed for computations on
582 the control flow graph. */
584 typedef unsigned HOST_WIDE_INT *bitset;
585 typedef struct
587 int *first_member; /* Pointer to the list start in bitlst_table. */
588 int nr_members; /* The number of members of the bit list. */
590 bitlst;
592 static int bitlst_table_last;
593 static int bitlst_table_size;
594 static int *bitlst_table;
596 static char bitset_member PARAMS ((bitset, int, int));
597 static void extract_bitlst PARAMS ((bitset, int, int, bitlst *));
599 /* Target info declarations.
601 The block currently being scheduled is referred to as the "target" block,
602 while other blocks in the region from which insns can be moved to the
603 target are called "source" blocks. The candidate structure holds info
604 about such sources: are they valid? Speculative? Etc. */
605 typedef bitlst bblst;
606 typedef struct
608 char is_valid;
609 char is_speculative;
610 int src_prob;
611 bblst split_bbs;
612 bblst update_bbs;
614 candidate;
616 static candidate *candidate_table;
618 /* A speculative motion requires checking live information on the path
619 from 'source' to 'target'. The split blocks are those to be checked.
620 After a speculative motion, live information should be modified in
621 the 'update' blocks.
623 Lists of split and update blocks for each candidate of the current
624 target are in array bblst_table. */
625 static int *bblst_table, bblst_size, bblst_last;
627 #define IS_VALID(src) ( candidate_table[src].is_valid )
628 #define IS_SPECULATIVE(src) ( candidate_table[src].is_speculative )
629 #define SRC_PROB(src) ( candidate_table[src].src_prob )
631 /* The bb being currently scheduled. */
632 static int target_bb;
634 /* List of edges. */
635 typedef bitlst edgelst;
637 /* Target info functions. */
638 static void split_edges PARAMS ((int, int, edgelst *));
639 static void compute_trg_info PARAMS ((int));
640 void debug_candidate PARAMS ((int));
641 void debug_candidates PARAMS ((int));
644 /* Bit-set of bbs, where bit 'i' stands for bb 'i'. */
645 typedef bitset bbset;
647 /* Number of words of the bbset. */
648 static int bbset_size;
650 /* Dominators array: dom[i] contains the bbset of dominators of
651 bb i in the region. */
652 static bbset *dom;
654 /* bb 0 is the only region entry. */
655 #define IS_RGN_ENTRY(bb) (!bb)
657 /* Is bb_src dominated by bb_trg. */
658 #define IS_DOMINATED(bb_src, bb_trg) \
659 ( bitset_member (dom[bb_src], bb_trg, bbset_size) )
661 /* Probability: Prob[i] is a float in [0, 1] which is the probability
662 of bb i relative to the region entry. */
663 static float *prob;
665 /* The probability of bb_src, relative to bb_trg. Note, that while the
666 'prob[bb]' is a float in [0, 1], this macro returns an integer
667 in [0, 100]. */
668 #define GET_SRC_PROB(bb_src, bb_trg) ((int) (100.0 * (prob[bb_src] / \
669 prob[bb_trg])))
671 /* Bit-set of edges, where bit i stands for edge i. */
672 typedef bitset edgeset;
674 /* Number of edges in the region. */
675 static int rgn_nr_edges;
677 /* Array of size rgn_nr_edges. */
678 static int *rgn_edges;
680 /* Number of words in an edgeset. */
681 static int edgeset_size;
683 /* Number of bits in an edgeset. */
684 static int edgeset_bitsize;
686 /* Mapping from each edge in the graph to its number in the rgn. */
687 static int *edge_to_bit;
688 #define EDGE_TO_BIT(edge) (edge_to_bit[edge])
690 /* The split edges of a source bb is different for each target
691 bb. In order to compute this efficiently, the 'potential-split edges'
692 are computed for each bb prior to scheduling a region. This is actually
693 the split edges of each bb relative to the region entry.
695 pot_split[bb] is the set of potential split edges of bb. */
696 static edgeset *pot_split;
698 /* For every bb, a set of its ancestor edges. */
699 static edgeset *ancestor_edges;
701 static void compute_dom_prob_ps PARAMS ((int));
703 #define ABS_VALUE(x) (((x)<0)?(-(x)):(x))
704 #define INSN_PROBABILITY(INSN) (SRC_PROB (BLOCK_TO_BB (BLOCK_NUM (INSN))))
705 #define IS_SPECULATIVE_INSN(INSN) (IS_SPECULATIVE (BLOCK_TO_BB (BLOCK_NUM (INSN))))
706 #define INSN_BB(INSN) (BLOCK_TO_BB (BLOCK_NUM (INSN)))
708 /* Parameters affecting the decision of rank_for_schedule(). */
709 #define MIN_DIFF_PRIORITY 2
710 #define MIN_PROBABILITY 40
711 #define MIN_PROB_DIFF 10
713 /* Speculative scheduling functions. */
714 static int check_live_1 PARAMS ((int, rtx));
715 static void update_live_1 PARAMS ((int, rtx));
716 static int check_live PARAMS ((rtx, int));
717 static void update_live PARAMS ((rtx, int));
718 static void set_spec_fed PARAMS ((rtx));
719 static int is_pfree PARAMS ((rtx, int, int));
720 static int find_conditional_protection PARAMS ((rtx, int));
721 static int is_conditionally_protected PARAMS ((rtx, int, int));
722 static int may_trap_exp PARAMS ((rtx, int));
723 static int haifa_classify_insn PARAMS ((rtx));
724 static int is_prisky PARAMS ((rtx, int, int));
725 static int is_exception_free PARAMS ((rtx, int, int));
727 static char find_insn_mem_list PARAMS ((rtx, rtx, rtx, rtx));
728 static void compute_block_forward_dependences PARAMS ((int));
729 static void add_branch_dependences PARAMS ((rtx, rtx));
730 static void compute_block_backward_dependences PARAMS ((int));
731 void debug_dependencies PARAMS ((void));
733 /* Notes handling mechanism:
734 =========================
735 Generally, NOTES are saved before scheduling and restored after scheduling.
736 The scheduler distinguishes between three types of notes:
738 (1) LINE_NUMBER notes, generated and used for debugging. Here,
739 before scheduling a region, a pointer to the LINE_NUMBER note is
740 added to the insn following it (in save_line_notes()), and the note
741 is removed (in rm_line_notes() and unlink_line_notes()). After
742 scheduling the region, this pointer is used for regeneration of
743 the LINE_NUMBER note (in restore_line_notes()).
745 (2) LOOP_BEGIN, LOOP_END, SETJMP, EHREGION_BEG, EHREGION_END notes:
746 Before scheduling a region, a pointer to the note is added to the insn
747 that follows or precedes it. (This happens as part of the data dependence
748 computation). After scheduling an insn, the pointer contained in it is
749 used for regenerating the corresponding note (in reemit_notes).
751 (3) All other notes (e.g. INSN_DELETED): Before scheduling a block,
752 these notes are put in a list (in rm_other_notes() and
753 unlink_other_notes ()). After scheduling the block, these notes are
754 inserted at the beginning of the block (in schedule_block()). */
756 static rtx unlink_other_notes PARAMS ((rtx, rtx));
757 static rtx unlink_line_notes PARAMS ((rtx, rtx));
758 static void rm_line_notes PARAMS ((int));
759 static void save_line_notes PARAMS ((int));
760 static void restore_line_notes PARAMS ((int));
761 static void rm_redundant_line_notes PARAMS ((void));
762 static void rm_other_notes PARAMS ((rtx, rtx));
763 static rtx reemit_notes PARAMS ((rtx, rtx));
765 static void get_block_head_tail PARAMS ((int, rtx *, rtx *));
766 static void get_bb_head_tail PARAMS ((int, rtx *, rtx *));
768 static int queue_to_ready PARAMS ((rtx [], int));
770 static void debug_ready_list PARAMS ((rtx[], int));
771 static void init_target_units PARAMS ((void));
772 static void insn_print_units PARAMS ((rtx));
773 static int get_visual_tbl_length PARAMS ((void));
774 static void init_block_visualization PARAMS ((void));
775 static void print_block_visualization PARAMS ((int, const char *));
776 static void visualize_scheduled_insns PARAMS ((int, int));
777 static void visualize_no_unit PARAMS ((rtx));
778 static void visualize_stall_cycles PARAMS ((int, int));
779 static void print_exp PARAMS ((char *, rtx, int));
780 static void print_value PARAMS ((char *, rtx, int));
781 static void print_pattern PARAMS ((char *, rtx, int));
782 static void print_insn PARAMS ((char *, rtx, int));
783 void debug_reg_vector PARAMS ((regset));
785 static rtx move_insn1 PARAMS ((rtx, rtx));
786 static rtx move_insn PARAMS ((rtx, rtx));
787 static rtx group_leader PARAMS ((rtx));
788 static int set_priorities PARAMS ((int));
789 static void init_deps PARAMS ((struct deps *));
790 static void schedule_region PARAMS ((int));
792 #endif /* INSN_SCHEDULING */
794 #define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))
796 /* Add ELEM wrapped in an INSN_LIST with reg note kind DEP_TYPE to the
797 LOG_LINKS of INSN, if not already there. DEP_TYPE indicates the type
798 of dependence that this link represents. */
800 static void
801 add_dependence (insn, elem, dep_type)
802 rtx insn;
803 rtx elem;
804 enum reg_note dep_type;
806 rtx link, next;
808 /* Don't depend an insn on itself. */
809 if (insn == elem)
810 return;
812 /* We can get a dependency on deleted insns due to optimizations in
813 the register allocation and reloading or due to splitting. Any
814 such dependency is useless and can be ignored. */
815 if (GET_CODE (elem) == NOTE)
816 return;
818 /* If elem is part of a sequence that must be scheduled together, then
819 make the dependence point to the last insn of the sequence.
820 When HAVE_cc0, it is possible for NOTEs to exist between users and
821 setters of the condition codes, so we must skip past notes here.
822 Otherwise, NOTEs are impossible here. */
824 next = NEXT_INSN (elem);
826 #ifdef HAVE_cc0
827 while (next && GET_CODE (next) == NOTE)
828 next = NEXT_INSN (next);
829 #endif
831 if (next && SCHED_GROUP_P (next)
832 && GET_CODE (next) != CODE_LABEL)
834 /* Notes will never intervene here though, so don't bother checking
835 for them. */
836 /* We must reject CODE_LABELs, so that we don't get confused by one
837 that has LABEL_PRESERVE_P set, which is represented by the same
838 bit in the rtl as SCHED_GROUP_P. A CODE_LABEL can never be
839 SCHED_GROUP_P. */
840 while (NEXT_INSN (next) && SCHED_GROUP_P (NEXT_INSN (next))
841 && GET_CODE (NEXT_INSN (next)) != CODE_LABEL)
842 next = NEXT_INSN (next);
844 /* Again, don't depend an insn on itself. */
845 if (insn == next)
846 return;
848 /* Make the dependence to NEXT, the last insn of the group, instead
849 of the original ELEM. */
850 elem = next;
853 #ifdef INSN_SCHEDULING
854 /* (This code is guarded by INSN_SCHEDULING, otherwise INSN_BB is undefined.)
855 No need for interblock dependences with calls, since
856 calls are not moved between blocks. Note: the edge where
857 elem is a CALL is still required. */
858 if (GET_CODE (insn) == CALL_INSN
859 && (INSN_BB (elem) != INSN_BB (insn)))
860 return;
863 /* If we already have a true dependency for ELEM, then we do not
864 need to do anything. Avoiding the list walk below can cut
865 compile times dramatically for some code. */
866 if (true_dependency_cache
867 && TEST_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem)))
868 return;
869 #endif
871 /* Check that we don't already have this dependence. */
872 for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
873 if (XEXP (link, 0) == elem)
875 /* If this is a more restrictive type of dependence than the existing
876 one, then change the existing dependence to this type. */
877 if ((int) dep_type < (int) REG_NOTE_KIND (link))
878 PUT_REG_NOTE_KIND (link, dep_type);
880 #ifdef INSN_SCHEDULING
881 /* If we are adding a true dependency to INSN's LOG_LINKs, then
882 note that in the bitmap cache of true dependency information. */
883 if ((int)dep_type == 0 && true_dependency_cache)
884 SET_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem));
885 #endif
886 return;
888 /* Might want to check one level of transitivity to save conses. */
890 link = alloc_INSN_LIST (elem, LOG_LINKS (insn));
891 LOG_LINKS (insn) = link;
893 /* Insn dependency, not data dependency. */
894 PUT_REG_NOTE_KIND (link, dep_type);
896 #ifdef INSN_SCHEDULING
897 /* If we are adding a true dependency to INSN's LOG_LINKs, then
898 note that in the bitmap cache of true dependency information. */
899 if ((int)dep_type == 0 && true_dependency_cache)
900 SET_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem));
901 #endif
904 #ifdef HAVE_cc0
905 /* Remove ELEM wrapped in an INSN_LIST from the LOG_LINKS
906 of INSN. Abort if not found. */
908 static void
909 remove_dependence (insn, elem)
910 rtx insn;
911 rtx elem;
913 rtx prev, link, next;
914 int found = 0;
916 for (prev = 0, link = LOG_LINKS (insn); link; link = next)
918 next = XEXP (link, 1);
919 if (XEXP (link, 0) == elem)
921 if (prev)
922 XEXP (prev, 1) = next;
923 else
924 LOG_LINKS (insn) = next;
926 #ifdef INSN_SCHEDULING
927 /* If we are removing a true dependency from the LOG_LINKS list,
928 make sure to remove it from the cache too. */
929 if (REG_NOTE_KIND (link) == 0 && true_dependency_cache)
930 RESET_BIT (true_dependency_cache[INSN_LUID (insn)],
931 INSN_LUID (elem));
932 #endif
934 free_INSN_LIST_node (link);
936 found = 1;
938 else
939 prev = link;
942 if (!found)
943 abort ();
944 return;
946 #endif /* HAVE_cc0 */
948 #ifndef INSN_SCHEDULING
949 void
950 schedule_insns (dump_file)
951 FILE *dump_file ATTRIBUTE_UNUSED;
954 #else
955 #ifndef __GNUC__
956 #define __inline
957 #endif
959 #ifndef HAIFA_INLINE
960 #define HAIFA_INLINE __inline
961 #endif
963 /* Computation of memory dependencies. */
965 /* Data structures for the computation of data dependences in a regions. We
966 keep one mem_deps structure for every basic block. Before analyzing the
967 data dependences for a bb, its variables are initialized as a function of
968 the variables of its predecessors. When the analysis for a bb completes,
969 we save the contents to the corresponding bb_mem_deps[bb] variable. */
971 static struct deps *bb_deps;
973 /* Pointer to the last instruction scheduled. Used by rank_for_schedule,
974 so that insns independent of the last scheduled insn will be preferred
975 over dependent instructions. */
977 static rtx last_scheduled_insn;
979 /* Functions for construction of the control flow graph. */
981 /* Return 1 if control flow graph should not be constructed, 0 otherwise.
983 We decide not to build the control flow graph if there is possibly more
984 than one entry to the function, if computed branches exist, of if we
985 have nonlocal gotos. */
987 static int
988 is_cfg_nonregular ()
990 int b;
991 rtx insn;
992 RTX_CODE code;
994 /* If we have a label that could be the target of a nonlocal goto, then
995 the cfg is not well structured. */
996 if (nonlocal_goto_handler_labels)
997 return 1;
999 /* If we have any forced labels, then the cfg is not well structured. */
1000 if (forced_labels)
1001 return 1;
1003 /* If this function has a computed jump, then we consider the cfg
1004 not well structured. */
1005 if (current_function_has_computed_jump)
1006 return 1;
1008 /* If we have exception handlers, then we consider the cfg not well
1009 structured. ?!? We should be able to handle this now that flow.c
1010 computes an accurate cfg for EH. */
1011 if (exception_handler_labels)
1012 return 1;
1014 /* If we have non-jumping insns which refer to labels, then we consider
1015 the cfg not well structured. */
1016 /* Check for labels referred to other thn by jumps. */
1017 for (b = 0; b < n_basic_blocks; b++)
1018 for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
1020 code = GET_CODE (insn);
1021 if (GET_RTX_CLASS (code) == 'i')
1023 rtx note;
1025 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
1026 if (REG_NOTE_KIND (note) == REG_LABEL)
1027 return 1;
1030 if (insn == BLOCK_END (b))
1031 break;
1034 /* All the tests passed. Consider the cfg well structured. */
1035 return 0;
1038 /* Build the control flow graph and set nr_edges.
1040 Instead of trying to build a cfg ourselves, we rely on flow to
1041 do it for us. Stamp out useless code (and bug) duplication.
1043 Return nonzero if an irregularity in the cfg is found which would
1044 prevent cross block scheduling. */
1046 static int
1047 build_control_flow (edge_list)
1048 struct edge_list *edge_list;
1050 int i, unreachable, num_edges;
1052 /* This already accounts for entry/exit edges. */
1053 num_edges = NUM_EDGES (edge_list);
1055 /* Unreachable loops with more than one basic block are detected
1056 during the DFS traversal in find_rgns.
1058 Unreachable loops with a single block are detected here. This
1059 test is redundant with the one in find_rgns, but it's much
1060 cheaper to go ahead and catch the trivial case here. */
1061 unreachable = 0;
1062 for (i = 0; i < n_basic_blocks; i++)
1064 basic_block b = BASIC_BLOCK (i);
1066 if (b->pred == NULL
1067 || (b->pred->src == b
1068 && b->pred->pred_next == NULL))
1069 unreachable = 1;
1072 /* ??? We can kill these soon. */
1073 in_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1074 out_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1075 edge_table = (haifa_edge *) xcalloc (num_edges, sizeof (haifa_edge));
1077 nr_edges = 0;
1078 for (i = 0; i < num_edges; i++)
1080 edge e = INDEX_EDGE (edge_list, i);
1082 if (e->dest != EXIT_BLOCK_PTR
1083 && e->src != ENTRY_BLOCK_PTR)
1084 new_edge (e->src->index, e->dest->index);
1087 /* Increment by 1, since edge 0 is unused. */
1088 nr_edges++;
1090 return unreachable;
1094 /* Record an edge in the control flow graph from SOURCE to TARGET.
1096 In theory, this is redundant with the s_succs computed above, but
1097 we have not converted all of haifa to use information from the
1098 integer lists. */
1100 static void
1101 new_edge (source, target)
1102 int source, target;
1104 int e, next_edge;
1105 int curr_edge, fst_edge;
1107 /* Check for duplicates. */
1108 fst_edge = curr_edge = OUT_EDGES (source);
1109 while (curr_edge)
1111 if (FROM_BLOCK (curr_edge) == source
1112 && TO_BLOCK (curr_edge) == target)
1114 return;
1117 curr_edge = NEXT_OUT (curr_edge);
1119 if (fst_edge == curr_edge)
1120 break;
1123 e = ++nr_edges;
1125 FROM_BLOCK (e) = source;
1126 TO_BLOCK (e) = target;
1128 if (OUT_EDGES (source))
1130 next_edge = NEXT_OUT (OUT_EDGES (source));
1131 NEXT_OUT (OUT_EDGES (source)) = e;
1132 NEXT_OUT (e) = next_edge;
1134 else
1136 OUT_EDGES (source) = e;
1137 NEXT_OUT (e) = e;
1140 if (IN_EDGES (target))
1142 next_edge = NEXT_IN (IN_EDGES (target));
1143 NEXT_IN (IN_EDGES (target)) = e;
1144 NEXT_IN (e) = next_edge;
1146 else
1148 IN_EDGES (target) = e;
1149 NEXT_IN (e) = e;
1154 /* BITSET macros for operations on the control flow graph. */
1156 /* Compute bitwise union of two bitsets. */
1157 #define BITSET_UNION(set1, set2, len) \
1158 do { register bitset tp = set1, sp = set2; \
1159 register int i; \
1160 for (i = 0; i < len; i++) \
1161 *(tp++) |= *(sp++); } while (0)
1163 /* Compute bitwise intersection of two bitsets. */
1164 #define BITSET_INTER(set1, set2, len) \
1165 do { register bitset tp = set1, sp = set2; \
1166 register int i; \
1167 for (i = 0; i < len; i++) \
1168 *(tp++) &= *(sp++); } while (0)
1170 /* Compute bitwise difference of two bitsets. */
1171 #define BITSET_DIFFER(set1, set2, len) \
1172 do { register bitset tp = set1, sp = set2; \
1173 register int i; \
1174 for (i = 0; i < len; i++) \
1175 *(tp++) &= ~*(sp++); } while (0)
1177 /* Inverts every bit of bitset 'set'. */
1178 #define BITSET_INVERT(set, len) \
1179 do { register bitset tmpset = set; \
1180 register int i; \
1181 for (i = 0; i < len; i++, tmpset++) \
1182 *tmpset = ~*tmpset; } while (0)
1184 /* Turn on the index'th bit in bitset set. */
1185 #define BITSET_ADD(set, index, len) \
1187 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1188 abort (); \
1189 else \
1190 set[index/HOST_BITS_PER_WIDE_INT] |= \
1191 1 << (index % HOST_BITS_PER_WIDE_INT); \
1194 /* Turn off the index'th bit in set. */
1195 #define BITSET_REMOVE(set, index, len) \
1197 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1198 abort (); \
1199 else \
1200 set[index/HOST_BITS_PER_WIDE_INT] &= \
1201 ~(1 << (index%HOST_BITS_PER_WIDE_INT)); \
1205 /* Check if the index'th bit in bitset set is on. */
1207 static char
1208 bitset_member (set, index, len)
1209 bitset set;
1210 int index, len;
1212 if (index >= HOST_BITS_PER_WIDE_INT * len)
1213 abort ();
1214 return (set[index / HOST_BITS_PER_WIDE_INT] &
1215 1 << (index % HOST_BITS_PER_WIDE_INT)) ? 1 : 0;
1219 /* Translate a bit-set SET to a list BL of the bit-set members. */
1221 static void
1222 extract_bitlst (set, len, bitlen, bl)
1223 bitset set;
1224 int len;
1225 int bitlen;
1226 bitlst *bl;
1228 int i, j, offset;
1229 unsigned HOST_WIDE_INT word;
1231 /* bblst table space is reused in each call to extract_bitlst. */
1232 bitlst_table_last = 0;
1234 bl->first_member = &bitlst_table[bitlst_table_last];
1235 bl->nr_members = 0;
1237 /* Iterate over each word in the bitset. */
1238 for (i = 0; i < len; i++)
1240 word = set[i];
1241 offset = i * HOST_BITS_PER_WIDE_INT;
1243 /* Iterate over each bit in the word, but do not
1244 go beyond the end of the defined bits. */
1245 for (j = 0; offset < bitlen && word; j++)
1247 if (word & 1)
1249 bitlst_table[bitlst_table_last++] = offset;
1250 (bl->nr_members)++;
1252 word >>= 1;
1253 ++offset;
1260 /* Functions for the construction of regions. */
1262 /* Print the regions, for debugging purposes. Callable from debugger. */
1264 void
1265 debug_regions ()
1267 int rgn, bb;
1269 fprintf (dump, "\n;; ------------ REGIONS ----------\n\n");
1270 for (rgn = 0; rgn < nr_regions; rgn++)
1272 fprintf (dump, ";;\trgn %d nr_blocks %d:\n", rgn,
1273 rgn_table[rgn].rgn_nr_blocks);
1274 fprintf (dump, ";;\tbb/block: ");
1276 for (bb = 0; bb < rgn_table[rgn].rgn_nr_blocks; bb++)
1278 current_blocks = RGN_BLOCKS (rgn);
1280 if (bb != BLOCK_TO_BB (BB_TO_BLOCK (bb)))
1281 abort ();
1283 fprintf (dump, " %d/%d ", bb, BB_TO_BLOCK (bb));
1286 fprintf (dump, "\n\n");
1291 /* Build a single block region for each basic block in the function.
1292 This allows for using the same code for interblock and basic block
1293 scheduling. */
1295 static void
1296 find_single_block_region ()
1298 int i;
1300 for (i = 0; i < n_basic_blocks; i++)
1302 rgn_bb_table[i] = i;
1303 RGN_NR_BLOCKS (i) = 1;
1304 RGN_BLOCKS (i) = i;
1305 CONTAINING_RGN (i) = i;
1306 BLOCK_TO_BB (i) = 0;
1308 nr_regions = n_basic_blocks;
1312 /* Update number of blocks and the estimate for number of insns
1313 in the region. Return 1 if the region is "too large" for interblock
1314 scheduling (compile time considerations), otherwise return 0. */
1316 static int
1317 too_large (block, num_bbs, num_insns)
1318 int block, *num_bbs, *num_insns;
1320 (*num_bbs)++;
1321 (*num_insns) += (INSN_LUID (BLOCK_END (block)) -
1322 INSN_LUID (BLOCK_HEAD (block)));
1323 if ((*num_bbs > MAX_RGN_BLOCKS) || (*num_insns > MAX_RGN_INSNS))
1324 return 1;
1325 else
1326 return 0;
1330 /* Update_loop_relations(blk, hdr): Check if the loop headed by max_hdr[blk]
1331 is still an inner loop. Put in max_hdr[blk] the header of the most inner
1332 loop containing blk. */
1333 #define UPDATE_LOOP_RELATIONS(blk, hdr) \
1335 if (max_hdr[blk] == -1) \
1336 max_hdr[blk] = hdr; \
1337 else if (dfs_nr[max_hdr[blk]] > dfs_nr[hdr]) \
1338 RESET_BIT (inner, hdr); \
1339 else if (dfs_nr[max_hdr[blk]] < dfs_nr[hdr]) \
1341 RESET_BIT (inner,max_hdr[blk]); \
1342 max_hdr[blk] = hdr; \
1347 /* Find regions for interblock scheduling.
1349 A region for scheduling can be:
1351 * A loop-free procedure, or
1353 * A reducible inner loop, or
1355 * A basic block not contained in any other region.
1358 ?!? In theory we could build other regions based on extended basic
1359 blocks or reverse extended basic blocks. Is it worth the trouble?
1361 Loop blocks that form a region are put into the region's block list
1362 in topological order.
1364 This procedure stores its results into the following global (ick) variables
1366 * rgn_nr
1367 * rgn_table
1368 * rgn_bb_table
1369 * block_to_bb
1370 * containing region
1373 We use dominator relationships to avoid making regions out of non-reducible
1374 loops.
1376 This procedure needs to be converted to work on pred/succ lists instead
1377 of edge tables. That would simplify it somewhat. */
1379 static void
1380 find_rgns (edge_list, dom)
1381 struct edge_list *edge_list;
1382 sbitmap *dom;
1384 int *max_hdr, *dfs_nr, *stack, *degree;
1385 char no_loops = 1;
1386 int node, child, loop_head, i, head, tail;
1387 int count = 0, sp, idx = 0, current_edge = out_edges[0];
1388 int num_bbs, num_insns, unreachable;
1389 int too_large_failure;
1391 /* Note if an edge has been passed. */
1392 sbitmap passed;
1394 /* Note if a block is a natural loop header. */
1395 sbitmap header;
1397 /* Note if a block is an natural inner loop header. */
1398 sbitmap inner;
1400 /* Note if a block is in the block queue. */
1401 sbitmap in_queue;
1403 /* Note if a block is in the block queue. */
1404 sbitmap in_stack;
1406 int num_edges = NUM_EDGES (edge_list);
1408 /* Perform a DFS traversal of the cfg. Identify loop headers, inner loops
1409 and a mapping from block to its loop header (if the block is contained
1410 in a loop, else -1).
1412 Store results in HEADER, INNER, and MAX_HDR respectively, these will
1413 be used as inputs to the second traversal.
1415 STACK, SP and DFS_NR are only used during the first traversal. */
1417 /* Allocate and initialize variables for the first traversal. */
1418 max_hdr = (int *) xmalloc (n_basic_blocks * sizeof (int));
1419 dfs_nr = (int *) xcalloc (n_basic_blocks, sizeof (int));
1420 stack = (int *) xmalloc (nr_edges * sizeof (int));
1422 inner = sbitmap_alloc (n_basic_blocks);
1423 sbitmap_ones (inner);
1425 header = sbitmap_alloc (n_basic_blocks);
1426 sbitmap_zero (header);
1428 passed = sbitmap_alloc (nr_edges);
1429 sbitmap_zero (passed);
1431 in_queue = sbitmap_alloc (n_basic_blocks);
1432 sbitmap_zero (in_queue);
1434 in_stack = sbitmap_alloc (n_basic_blocks);
1435 sbitmap_zero (in_stack);
1437 for (i = 0; i < n_basic_blocks; i++)
1438 max_hdr[i] = -1;
1440 /* DFS traversal to find inner loops in the cfg. */
1442 sp = -1;
1443 while (1)
1445 if (current_edge == 0 || TEST_BIT (passed, current_edge))
1447 /* We have reached a leaf node or a node that was already
1448 processed. Pop edges off the stack until we find
1449 an edge that has not yet been processed. */
1450 while (sp >= 0
1451 && (current_edge == 0 || TEST_BIT (passed, current_edge)))
1453 /* Pop entry off the stack. */
1454 current_edge = stack[sp--];
1455 node = FROM_BLOCK (current_edge);
1456 child = TO_BLOCK (current_edge);
1457 RESET_BIT (in_stack, child);
1458 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1459 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1460 current_edge = NEXT_OUT (current_edge);
1463 /* See if have finished the DFS tree traversal. */
1464 if (sp < 0 && TEST_BIT (passed, current_edge))
1465 break;
1467 /* Nope, continue the traversal with the popped node. */
1468 continue;
1471 /* Process a node. */
1472 node = FROM_BLOCK (current_edge);
1473 child = TO_BLOCK (current_edge);
1474 SET_BIT (in_stack, node);
1475 dfs_nr[node] = ++count;
1477 /* If the successor is in the stack, then we've found a loop.
1478 Mark the loop, if it is not a natural loop, then it will
1479 be rejected during the second traversal. */
1480 if (TEST_BIT (in_stack, child))
1482 no_loops = 0;
1483 SET_BIT (header, child);
1484 UPDATE_LOOP_RELATIONS (node, child);
1485 SET_BIT (passed, current_edge);
1486 current_edge = NEXT_OUT (current_edge);
1487 continue;
1490 /* If the child was already visited, then there is no need to visit
1491 it again. Just update the loop relationships and restart
1492 with a new edge. */
1493 if (dfs_nr[child])
1495 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1496 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1497 SET_BIT (passed, current_edge);
1498 current_edge = NEXT_OUT (current_edge);
1499 continue;
1502 /* Push an entry on the stack and continue DFS traversal. */
1503 stack[++sp] = current_edge;
1504 SET_BIT (passed, current_edge);
1505 current_edge = OUT_EDGES (child);
1507 /* This is temporary until haifa is converted to use rth's new
1508 cfg routines which have true entry/exit blocks and the
1509 appropriate edges from/to those blocks.
1511 Generally we update dfs_nr for a node when we process its
1512 out edge. However, if the node has no out edge then we will
1513 not set dfs_nr for that node. This can confuse the scheduler
1514 into thinking that we have unreachable blocks, which in turn
1515 disables cross block scheduling.
1517 So, if we have a node with no out edges, go ahead and mark it
1518 as reachable now. */
1519 if (current_edge == 0)
1520 dfs_nr[child] = ++count;
1523 /* Another check for unreachable blocks. The earlier test in
1524 is_cfg_nonregular only finds unreachable blocks that do not
1525 form a loop.
1527 The DFS traversal will mark every block that is reachable from
1528 the entry node by placing a nonzero value in dfs_nr. Thus if
1529 dfs_nr is zero for any block, then it must be unreachable. */
1530 unreachable = 0;
1531 for (i = 0; i < n_basic_blocks; i++)
1532 if (dfs_nr[i] == 0)
1534 unreachable = 1;
1535 break;
1538 /* Gross. To avoid wasting memory, the second pass uses the dfs_nr array
1539 to hold degree counts. */
1540 degree = dfs_nr;
1542 for (i = 0; i < n_basic_blocks; i++)
1543 degree[i] = 0;
1544 for (i = 0; i < num_edges; i++)
1546 edge e = INDEX_EDGE (edge_list, i);
1548 if (e->dest != EXIT_BLOCK_PTR)
1549 degree[e->dest->index]++;
1552 /* Do not perform region scheduling if there are any unreachable
1553 blocks. */
1554 if (!unreachable)
1556 int *queue;
1558 if (no_loops)
1559 SET_BIT (header, 0);
1561 /* Second travsersal:find reducible inner loops and topologically sort
1562 block of each region. */
1564 queue = (int *) xmalloc (n_basic_blocks * sizeof (int));
1566 /* Find blocks which are inner loop headers. We still have non-reducible
1567 loops to consider at this point. */
1568 for (i = 0; i < n_basic_blocks; i++)
1570 if (TEST_BIT (header, i) && TEST_BIT (inner, i))
1572 edge e;
1573 int j;
1575 /* Now check that the loop is reducible. We do this separate
1576 from finding inner loops so that we do not find a reducible
1577 loop which contains an inner non-reducible loop.
1579 A simple way to find reducible/natural loops is to verify
1580 that each block in the loop is dominated by the loop
1581 header.
1583 If there exists a block that is not dominated by the loop
1584 header, then the block is reachable from outside the loop
1585 and thus the loop is not a natural loop. */
1586 for (j = 0; j < n_basic_blocks; j++)
1588 /* First identify blocks in the loop, except for the loop
1589 entry block. */
1590 if (i == max_hdr[j] && i != j)
1592 /* Now verify that the block is dominated by the loop
1593 header. */
1594 if (!TEST_BIT (dom[j], i))
1595 break;
1599 /* If we exited the loop early, then I is the header of
1600 a non-reducible loop and we should quit processing it
1601 now. */
1602 if (j != n_basic_blocks)
1603 continue;
1605 /* I is a header of an inner loop, or block 0 in a subroutine
1606 with no loops at all. */
1607 head = tail = -1;
1608 too_large_failure = 0;
1609 loop_head = max_hdr[i];
1611 /* Decrease degree of all I's successors for topological
1612 ordering. */
1613 for (e = BASIC_BLOCK (i)->succ; e; e = e->succ_next)
1614 if (e->dest != EXIT_BLOCK_PTR)
1615 --degree[e->dest->index];
1617 /* Estimate # insns, and count # blocks in the region. */
1618 num_bbs = 1;
1619 num_insns = (INSN_LUID (BLOCK_END (i))
1620 - INSN_LUID (BLOCK_HEAD (i)));
1623 /* Find all loop latches (blocks with back edges to the loop
1624 header) or all the leaf blocks in the cfg has no loops.
1626 Place those blocks into the queue. */
1627 if (no_loops)
1629 for (j = 0; j < n_basic_blocks; j++)
1630 /* Leaf nodes have only a single successor which must
1631 be EXIT_BLOCK. */
1632 if (BASIC_BLOCK (j)->succ
1633 && BASIC_BLOCK (j)->succ->dest == EXIT_BLOCK_PTR
1634 && BASIC_BLOCK (j)->succ->succ_next == NULL)
1636 queue[++tail] = j;
1637 SET_BIT (in_queue, j);
1639 if (too_large (j, &num_bbs, &num_insns))
1641 too_large_failure = 1;
1642 break;
1646 else
1648 edge e;
1650 for (e = BASIC_BLOCK (i)->pred; e; e = e->pred_next)
1652 if (e->src == ENTRY_BLOCK_PTR)
1653 continue;
1655 node = e->src->index;
1657 if (max_hdr[node] == loop_head && node != i)
1659 /* This is a loop latch. */
1660 queue[++tail] = node;
1661 SET_BIT (in_queue, node);
1663 if (too_large (node, &num_bbs, &num_insns))
1665 too_large_failure = 1;
1666 break;
1673 /* Now add all the blocks in the loop to the queue.
1675 We know the loop is a natural loop; however the algorithm
1676 above will not always mark certain blocks as being in the
1677 loop. Consider:
1678 node children
1679 a b,c
1681 c a,d
1685 The algorithm in the DFS traversal may not mark B & D as part
1686 of the loop (ie they will not have max_hdr set to A).
1688 We know they can not be loop latches (else they would have
1689 had max_hdr set since they'd have a backedge to a dominator
1690 block). So we don't need them on the initial queue.
1692 We know they are part of the loop because they are dominated
1693 by the loop header and can be reached by a backwards walk of
1694 the edges starting with nodes on the initial queue.
1696 It is safe and desirable to include those nodes in the
1697 loop/scheduling region. To do so we would need to decrease
1698 the degree of a node if it is the target of a backedge
1699 within the loop itself as the node is placed in the queue.
1701 We do not do this because I'm not sure that the actual
1702 scheduling code will properly handle this case. ?!? */
1704 while (head < tail && !too_large_failure)
1706 edge e;
1707 child = queue[++head];
1709 for (e = BASIC_BLOCK (child)->pred; e; e = e->pred_next)
1711 node = e->src->index;
1713 /* See discussion above about nodes not marked as in
1714 this loop during the initial DFS traversal. */
1715 if (e->src == ENTRY_BLOCK_PTR
1716 || max_hdr[node] != loop_head)
1718 tail = -1;
1719 break;
1721 else if (!TEST_BIT (in_queue, node) && node != i)
1723 queue[++tail] = node;
1724 SET_BIT (in_queue, node);
1726 if (too_large (node, &num_bbs, &num_insns))
1728 too_large_failure = 1;
1729 break;
1735 if (tail >= 0 && !too_large_failure)
1737 /* Place the loop header into list of region blocks. */
1738 degree[i] = -1;
1739 rgn_bb_table[idx] = i;
1740 RGN_NR_BLOCKS (nr_regions) = num_bbs;
1741 RGN_BLOCKS (nr_regions) = idx++;
1742 CONTAINING_RGN (i) = nr_regions;
1743 BLOCK_TO_BB (i) = count = 0;
1745 /* Remove blocks from queue[] when their in degree
1746 becomes zero. Repeat until no blocks are left on the
1747 list. This produces a topological list of blocks in
1748 the region. */
1749 while (tail >= 0)
1751 if (head < 0)
1752 head = tail;
1753 child = queue[head];
1754 if (degree[child] == 0)
1756 edge e;
1758 degree[child] = -1;
1759 rgn_bb_table[idx++] = child;
1760 BLOCK_TO_BB (child) = ++count;
1761 CONTAINING_RGN (child) = nr_regions;
1762 queue[head] = queue[tail--];
1764 for (e = BASIC_BLOCK (child)->succ;
1766 e = e->succ_next)
1767 if (e->dest != EXIT_BLOCK_PTR)
1768 --degree[e->dest->index];
1770 else
1771 --head;
1773 ++nr_regions;
1777 free (queue);
1780 /* Any block that did not end up in a region is placed into a region
1781 by itself. */
1782 for (i = 0; i < n_basic_blocks; i++)
1783 if (degree[i] >= 0)
1785 rgn_bb_table[idx] = i;
1786 RGN_NR_BLOCKS (nr_regions) = 1;
1787 RGN_BLOCKS (nr_regions) = idx++;
1788 CONTAINING_RGN (i) = nr_regions++;
1789 BLOCK_TO_BB (i) = 0;
1792 free (max_hdr);
1793 free (dfs_nr);
1794 free (stack);
1795 free (passed);
1796 free (header);
1797 free (inner);
1798 free (in_queue);
1799 free (in_stack);
1803 /* Functions for regions scheduling information. */
1805 /* Compute dominators, probability, and potential-split-edges of bb.
1806 Assume that these values were already computed for bb's predecessors. */
1808 static void
1809 compute_dom_prob_ps (bb)
1810 int bb;
1812 int nxt_in_edge, fst_in_edge, pred;
1813 int fst_out_edge, nxt_out_edge, nr_out_edges, nr_rgn_out_edges;
1815 prob[bb] = 0.0;
1816 if (IS_RGN_ENTRY (bb))
1818 BITSET_ADD (dom[bb], 0, bbset_size);
1819 prob[bb] = 1.0;
1820 return;
1823 fst_in_edge = nxt_in_edge = IN_EDGES (BB_TO_BLOCK (bb));
1825 /* Intialize dom[bb] to '111..1'. */
1826 BITSET_INVERT (dom[bb], bbset_size);
1830 pred = FROM_BLOCK (nxt_in_edge);
1831 BITSET_INTER (dom[bb], dom[BLOCK_TO_BB (pred)], bbset_size);
1833 BITSET_UNION (ancestor_edges[bb], ancestor_edges[BLOCK_TO_BB (pred)],
1834 edgeset_size);
1836 BITSET_ADD (ancestor_edges[bb], EDGE_TO_BIT (nxt_in_edge), edgeset_size);
1838 nr_out_edges = 1;
1839 nr_rgn_out_edges = 0;
1840 fst_out_edge = OUT_EDGES (pred);
1841 nxt_out_edge = NEXT_OUT (fst_out_edge);
1842 BITSET_UNION (pot_split[bb], pot_split[BLOCK_TO_BB (pred)],
1843 edgeset_size);
1845 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (fst_out_edge), edgeset_size);
1847 /* The successor doesn't belong in the region? */
1848 if (CONTAINING_RGN (TO_BLOCK (fst_out_edge)) !=
1849 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1850 ++nr_rgn_out_edges;
1852 while (fst_out_edge != nxt_out_edge)
1854 ++nr_out_edges;
1855 /* The successor doesn't belong in the region? */
1856 if (CONTAINING_RGN (TO_BLOCK (nxt_out_edge)) !=
1857 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1858 ++nr_rgn_out_edges;
1859 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (nxt_out_edge), edgeset_size);
1860 nxt_out_edge = NEXT_OUT (nxt_out_edge);
1864 /* Now nr_rgn_out_edges is the number of region-exit edges from
1865 pred, and nr_out_edges will be the number of pred out edges
1866 not leaving the region. */
1867 nr_out_edges -= nr_rgn_out_edges;
1868 if (nr_rgn_out_edges > 0)
1869 prob[bb] += 0.9 * prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1870 else
1871 prob[bb] += prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1872 nxt_in_edge = NEXT_IN (nxt_in_edge);
1874 while (fst_in_edge != nxt_in_edge);
1876 BITSET_ADD (dom[bb], bb, bbset_size);
1877 BITSET_DIFFER (pot_split[bb], ancestor_edges[bb], edgeset_size);
1879 if (sched_verbose >= 2)
1880 fprintf (dump, ";; bb_prob(%d, %d) = %3d\n", bb, BB_TO_BLOCK (bb), (int) (100.0 * prob[bb]));
1881 } /* compute_dom_prob_ps */
1883 /* Functions for target info. */
1885 /* Compute in BL the list of split-edges of bb_src relatively to bb_trg.
1886 Note that bb_trg dominates bb_src. */
1888 static void
1889 split_edges (bb_src, bb_trg, bl)
1890 int bb_src;
1891 int bb_trg;
1892 edgelst *bl;
1894 int es = edgeset_size;
1895 edgeset src = (edgeset) xcalloc (es, sizeof (HOST_WIDE_INT));
1897 while (es--)
1898 src[es] = (pot_split[bb_src])[es];
1899 BITSET_DIFFER (src, pot_split[bb_trg], edgeset_size);
1900 extract_bitlst (src, edgeset_size, edgeset_bitsize, bl);
1901 free (src);
1905 /* Find the valid candidate-source-blocks for the target block TRG, compute
1906 their probability, and check if they are speculative or not.
1907 For speculative sources, compute their update-blocks and split-blocks. */
1909 static void
1910 compute_trg_info (trg)
1911 int trg;
1913 register candidate *sp;
1914 edgelst el;
1915 int check_block, update_idx;
1916 int i, j, k, fst_edge, nxt_edge;
1918 /* Define some of the fields for the target bb as well. */
1919 sp = candidate_table + trg;
1920 sp->is_valid = 1;
1921 sp->is_speculative = 0;
1922 sp->src_prob = 100;
1924 for (i = trg + 1; i < current_nr_blocks; i++)
1926 sp = candidate_table + i;
1928 sp->is_valid = IS_DOMINATED (i, trg);
1929 if (sp->is_valid)
1931 sp->src_prob = GET_SRC_PROB (i, trg);
1932 sp->is_valid = (sp->src_prob >= MIN_PROBABILITY);
1935 if (sp->is_valid)
1937 split_edges (i, trg, &el);
1938 sp->is_speculative = (el.nr_members) ? 1 : 0;
1939 if (sp->is_speculative && !flag_schedule_speculative)
1940 sp->is_valid = 0;
1943 if (sp->is_valid)
1945 sp->split_bbs.first_member = &bblst_table[bblst_last];
1946 sp->split_bbs.nr_members = el.nr_members;
1947 for (j = 0; j < el.nr_members; bblst_last++, j++)
1948 bblst_table[bblst_last] =
1949 TO_BLOCK (rgn_edges[el.first_member[j]]);
1950 sp->update_bbs.first_member = &bblst_table[bblst_last];
1951 update_idx = 0;
1952 for (j = 0; j < el.nr_members; j++)
1954 check_block = FROM_BLOCK (rgn_edges[el.first_member[j]]);
1955 fst_edge = nxt_edge = OUT_EDGES (check_block);
1958 for (k = 0; k < el.nr_members; k++)
1959 if (EDGE_TO_BIT (nxt_edge) == el.first_member[k])
1960 break;
1962 if (k >= el.nr_members)
1964 bblst_table[bblst_last++] = TO_BLOCK (nxt_edge);
1965 update_idx++;
1968 nxt_edge = NEXT_OUT (nxt_edge);
1970 while (fst_edge != nxt_edge);
1972 sp->update_bbs.nr_members = update_idx;
1975 else
1977 sp->split_bbs.nr_members = sp->update_bbs.nr_members = 0;
1979 sp->is_speculative = 0;
1980 sp->src_prob = 0;
1983 } /* compute_trg_info */
1986 /* Print candidates info, for debugging purposes. Callable from debugger. */
1988 void
1989 debug_candidate (i)
1990 int i;
1992 if (!candidate_table[i].is_valid)
1993 return;
1995 if (candidate_table[i].is_speculative)
1997 int j;
1998 fprintf (dump, "src b %d bb %d speculative \n", BB_TO_BLOCK (i), i);
2000 fprintf (dump, "split path: ");
2001 for (j = 0; j < candidate_table[i].split_bbs.nr_members; j++)
2003 int b = candidate_table[i].split_bbs.first_member[j];
2005 fprintf (dump, " %d ", b);
2007 fprintf (dump, "\n");
2009 fprintf (dump, "update path: ");
2010 for (j = 0; j < candidate_table[i].update_bbs.nr_members; j++)
2012 int b = candidate_table[i].update_bbs.first_member[j];
2014 fprintf (dump, " %d ", b);
2016 fprintf (dump, "\n");
2018 else
2020 fprintf (dump, " src %d equivalent\n", BB_TO_BLOCK (i));
2025 /* Print candidates info, for debugging purposes. Callable from debugger. */
2027 void
2028 debug_candidates (trg)
2029 int trg;
2031 int i;
2033 fprintf (dump, "----------- candidate table: target: b=%d bb=%d ---\n",
2034 BB_TO_BLOCK (trg), trg);
2035 for (i = trg + 1; i < current_nr_blocks; i++)
2036 debug_candidate (i);
2040 /* Functions for speculative scheduing. */
2042 /* Return 0 if x is a set of a register alive in the beginning of one
2043 of the split-blocks of src, otherwise return 1. */
2045 static int
2046 check_live_1 (src, x)
2047 int src;
2048 rtx x;
2050 register int i;
2051 register int regno;
2052 register rtx reg = SET_DEST (x);
2054 if (reg == 0)
2055 return 1;
2057 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2058 || GET_CODE (reg) == SIGN_EXTRACT
2059 || GET_CODE (reg) == STRICT_LOW_PART)
2060 reg = XEXP (reg, 0);
2062 if (GET_CODE (reg) == PARALLEL
2063 && GET_MODE (reg) == BLKmode)
2065 register int i;
2066 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2067 if (check_live_1 (src, XVECEXP (reg, 0, i)))
2068 return 1;
2069 return 0;
2072 if (GET_CODE (reg) != REG)
2073 return 1;
2075 regno = REGNO (reg);
2077 if (regno < FIRST_PSEUDO_REGISTER && global_regs[regno])
2079 /* Global registers are assumed live. */
2080 return 0;
2082 else
2084 if (regno < FIRST_PSEUDO_REGISTER)
2086 /* Check for hard registers. */
2087 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2088 while (--j >= 0)
2090 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2092 int b = candidate_table[src].split_bbs.first_member[i];
2094 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start,
2095 regno + j))
2097 return 0;
2102 else
2104 /* Check for psuedo registers. */
2105 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2107 int b = candidate_table[src].split_bbs.first_member[i];
2109 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start, regno))
2111 return 0;
2117 return 1;
2121 /* If x is a set of a register R, mark that R is alive in the beginning
2122 of every update-block of src. */
2124 static void
2125 update_live_1 (src, x)
2126 int src;
2127 rtx x;
2129 register int i;
2130 register int regno;
2131 register rtx reg = SET_DEST (x);
2133 if (reg == 0)
2134 return;
2136 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2137 || GET_CODE (reg) == SIGN_EXTRACT
2138 || GET_CODE (reg) == STRICT_LOW_PART)
2139 reg = XEXP (reg, 0);
2141 if (GET_CODE (reg) == PARALLEL
2142 && GET_MODE (reg) == BLKmode)
2144 register int i;
2145 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2146 update_live_1 (src, XVECEXP (reg, 0, i));
2147 return;
2150 if (GET_CODE (reg) != REG)
2151 return;
2153 /* Global registers are always live, so the code below does not apply
2154 to them. */
2156 regno = REGNO (reg);
2158 if (regno >= FIRST_PSEUDO_REGISTER || !global_regs[regno])
2160 if (regno < FIRST_PSEUDO_REGISTER)
2162 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2163 while (--j >= 0)
2165 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2167 int b = candidate_table[src].update_bbs.first_member[i];
2169 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start,
2170 regno + j);
2174 else
2176 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2178 int b = candidate_table[src].update_bbs.first_member[i];
2180 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start, regno);
2187 /* Return 1 if insn can be speculatively moved from block src to trg,
2188 otherwise return 0. Called before first insertion of insn to
2189 ready-list or before the scheduling. */
2191 static int
2192 check_live (insn, src)
2193 rtx insn;
2194 int src;
2196 /* Find the registers set by instruction. */
2197 if (GET_CODE (PATTERN (insn)) == SET
2198 || GET_CODE (PATTERN (insn)) == CLOBBER)
2199 return check_live_1 (src, PATTERN (insn));
2200 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2202 int j;
2203 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2204 if ((GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2205 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2206 && !check_live_1 (src, XVECEXP (PATTERN (insn), 0, j)))
2207 return 0;
2209 return 1;
2212 return 1;
2216 /* Update the live registers info after insn was moved speculatively from
2217 block src to trg. */
2219 static void
2220 update_live (insn, src)
2221 rtx insn;
2222 int src;
2224 /* Find the registers set by instruction. */
2225 if (GET_CODE (PATTERN (insn)) == SET
2226 || GET_CODE (PATTERN (insn)) == CLOBBER)
2227 update_live_1 (src, PATTERN (insn));
2228 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2230 int j;
2231 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2232 if (GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2233 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2234 update_live_1 (src, XVECEXP (PATTERN (insn), 0, j));
2238 /* Exception Free Loads:
2240 We define five classes of speculative loads: IFREE, IRISKY,
2241 PFREE, PRISKY, and MFREE.
2243 IFREE loads are loads that are proved to be exception-free, just
2244 by examining the load insn. Examples for such loads are loads
2245 from TOC and loads of global data.
2247 IRISKY loads are loads that are proved to be exception-risky,
2248 just by examining the load insn. Examples for such loads are
2249 volatile loads and loads from shared memory.
2251 PFREE loads are loads for which we can prove, by examining other
2252 insns, that they are exception-free. Currently, this class consists
2253 of loads for which we are able to find a "similar load", either in
2254 the target block, or, if only one split-block exists, in that split
2255 block. Load2 is similar to load1 if both have same single base
2256 register. We identify only part of the similar loads, by finding
2257 an insn upon which both load1 and load2 have a DEF-USE dependence.
2259 PRISKY loads are loads for which we can prove, by examining other
2260 insns, that they are exception-risky. Currently we have two proofs for
2261 such loads. The first proof detects loads that are probably guarded by a
2262 test on the memory address. This proof is based on the
2263 backward and forward data dependence information for the region.
2264 Let load-insn be the examined load.
2265 Load-insn is PRISKY iff ALL the following hold:
2267 - insn1 is not in the same block as load-insn
2268 - there is a DEF-USE dependence chain (insn1, ..., load-insn)
2269 - test-insn is either a compare or a branch, not in the same block
2270 as load-insn
2271 - load-insn is reachable from test-insn
2272 - there is a DEF-USE dependence chain (insn1, ..., test-insn)
2274 This proof might fail when the compare and the load are fed
2275 by an insn not in the region. To solve this, we will add to this
2276 group all loads that have no input DEF-USE dependence.
2278 The second proof detects loads that are directly or indirectly
2279 fed by a speculative load. This proof is affected by the
2280 scheduling process. We will use the flag fed_by_spec_load.
2281 Initially, all insns have this flag reset. After a speculative
2282 motion of an insn, if insn is either a load, or marked as
2283 fed_by_spec_load, we will also mark as fed_by_spec_load every
2284 insn1 for which a DEF-USE dependence (insn, insn1) exists. A
2285 load which is fed_by_spec_load is also PRISKY.
2287 MFREE (maybe-free) loads are all the remaining loads. They may be
2288 exception-free, but we cannot prove it.
2290 Now, all loads in IFREE and PFREE classes are considered
2291 exception-free, while all loads in IRISKY and PRISKY classes are
2292 considered exception-risky. As for loads in the MFREE class,
2293 these are considered either exception-free or exception-risky,
2294 depending on whether we are pessimistic or optimistic. We have
2295 to take the pessimistic approach to assure the safety of
2296 speculative scheduling, but we can take the optimistic approach
2297 by invoking the -fsched_spec_load_dangerous option. */
2299 enum INSN_TRAP_CLASS
2301 TRAP_FREE = 0, IFREE = 1, PFREE_CANDIDATE = 2,
2302 PRISKY_CANDIDATE = 3, IRISKY = 4, TRAP_RISKY = 5
2305 #define WORST_CLASS(class1, class2) \
2306 ((class1 > class2) ? class1 : class2)
2308 /* Non-zero if block bb_to is equal to, or reachable from block bb_from. */
2309 #define IS_REACHABLE(bb_from, bb_to) \
2310 (bb_from == bb_to \
2311 || IS_RGN_ENTRY (bb_from) \
2312 || (bitset_member (ancestor_edges[bb_to], \
2313 EDGE_TO_BIT (IN_EDGES (BB_TO_BLOCK (bb_from))), \
2314 edgeset_size)))
2316 /* Non-zero iff the address is comprised from at most 1 register. */
2317 #define CONST_BASED_ADDRESS_P(x) \
2318 (GET_CODE (x) == REG \
2319 || ((GET_CODE (x) == PLUS || GET_CODE (x) == MINUS \
2320 || (GET_CODE (x) == LO_SUM)) \
2321 && (GET_CODE (XEXP (x, 0)) == CONST_INT \
2322 || GET_CODE (XEXP (x, 1)) == CONST_INT)))
2324 /* Turns on the fed_by_spec_load flag for insns fed by load_insn. */
2326 static void
2327 set_spec_fed (load_insn)
2328 rtx load_insn;
2330 rtx link;
2332 for (link = INSN_DEPEND (load_insn); link; link = XEXP (link, 1))
2333 if (GET_MODE (link) == VOIDmode)
2334 FED_BY_SPEC_LOAD (XEXP (link, 0)) = 1;
2335 } /* set_spec_fed */
2337 /* On the path from the insn to load_insn_bb, find a conditional
2338 branch depending on insn, that guards the speculative load. */
2340 static int
2341 find_conditional_protection (insn, load_insn_bb)
2342 rtx insn;
2343 int load_insn_bb;
2345 rtx link;
2347 /* Iterate through DEF-USE forward dependences. */
2348 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
2350 rtx next = XEXP (link, 0);
2351 if ((CONTAINING_RGN (BLOCK_NUM (next)) ==
2352 CONTAINING_RGN (BB_TO_BLOCK (load_insn_bb)))
2353 && IS_REACHABLE (INSN_BB (next), load_insn_bb)
2354 && load_insn_bb != INSN_BB (next)
2355 && GET_MODE (link) == VOIDmode
2356 && (GET_CODE (next) == JUMP_INSN
2357 || find_conditional_protection (next, load_insn_bb)))
2358 return 1;
2360 return 0;
2361 } /* find_conditional_protection */
2363 /* Returns 1 if the same insn1 that participates in the computation
2364 of load_insn's address is feeding a conditional branch that is
2365 guarding on load_insn. This is true if we find a the two DEF-USE
2366 chains:
2367 insn1 -> ... -> conditional-branch
2368 insn1 -> ... -> load_insn,
2369 and if a flow path exist:
2370 insn1 -> ... -> conditional-branch -> ... -> load_insn,
2371 and if insn1 is on the path
2372 region-entry -> ... -> bb_trg -> ... load_insn.
2374 Locate insn1 by climbing on LOG_LINKS from load_insn.
2375 Locate the branch by following INSN_DEPEND from insn1. */
2377 static int
2378 is_conditionally_protected (load_insn, bb_src, bb_trg)
2379 rtx load_insn;
2380 int bb_src, bb_trg;
2382 rtx link;
2384 for (link = LOG_LINKS (load_insn); link; link = XEXP (link, 1))
2386 rtx insn1 = XEXP (link, 0);
2388 /* Must be a DEF-USE dependence upon non-branch. */
2389 if (GET_MODE (link) != VOIDmode
2390 || GET_CODE (insn1) == JUMP_INSN)
2391 continue;
2393 /* Must exist a path: region-entry -> ... -> bb_trg -> ... load_insn. */
2394 if (INSN_BB (insn1) == bb_src
2395 || (CONTAINING_RGN (BLOCK_NUM (insn1))
2396 != CONTAINING_RGN (BB_TO_BLOCK (bb_src)))
2397 || (!IS_REACHABLE (bb_trg, INSN_BB (insn1))
2398 && !IS_REACHABLE (INSN_BB (insn1), bb_trg)))
2399 continue;
2401 /* Now search for the conditional-branch. */
2402 if (find_conditional_protection (insn1, bb_src))
2403 return 1;
2405 /* Recursive step: search another insn1, "above" current insn1. */
2406 return is_conditionally_protected (insn1, bb_src, bb_trg);
2409 /* The chain does not exist. */
2410 return 0;
2411 } /* is_conditionally_protected */
2413 /* Returns 1 if a clue for "similar load" 'insn2' is found, and hence
2414 load_insn can move speculatively from bb_src to bb_trg. All the
2415 following must hold:
2417 (1) both loads have 1 base register (PFREE_CANDIDATEs).
2418 (2) load_insn and load1 have a def-use dependence upon
2419 the same insn 'insn1'.
2420 (3) either load2 is in bb_trg, or:
2421 - there's only one split-block, and
2422 - load1 is on the escape path, and
2424 From all these we can conclude that the two loads access memory
2425 addresses that differ at most by a constant, and hence if moving
2426 load_insn would cause an exception, it would have been caused by
2427 load2 anyhow. */
2429 static int
2430 is_pfree (load_insn, bb_src, bb_trg)
2431 rtx load_insn;
2432 int bb_src, bb_trg;
2434 rtx back_link;
2435 register candidate *candp = candidate_table + bb_src;
2437 if (candp->split_bbs.nr_members != 1)
2438 /* Must have exactly one escape block. */
2439 return 0;
2441 for (back_link = LOG_LINKS (load_insn);
2442 back_link; back_link = XEXP (back_link, 1))
2444 rtx insn1 = XEXP (back_link, 0);
2446 if (GET_MODE (back_link) == VOIDmode)
2448 /* Found a DEF-USE dependence (insn1, load_insn). */
2449 rtx fore_link;
2451 for (fore_link = INSN_DEPEND (insn1);
2452 fore_link; fore_link = XEXP (fore_link, 1))
2454 rtx insn2 = XEXP (fore_link, 0);
2455 if (GET_MODE (fore_link) == VOIDmode)
2457 /* Found a DEF-USE dependence (insn1, insn2). */
2458 if (haifa_classify_insn (insn2) != PFREE_CANDIDATE)
2459 /* insn2 not guaranteed to be a 1 base reg load. */
2460 continue;
2462 if (INSN_BB (insn2) == bb_trg)
2463 /* insn2 is the similar load, in the target block. */
2464 return 1;
2466 if (*(candp->split_bbs.first_member) == BLOCK_NUM (insn2))
2467 /* insn2 is a similar load, in a split-block. */
2468 return 1;
2474 /* Couldn't find a similar load. */
2475 return 0;
2476 } /* is_pfree */
2478 /* Returns a class that insn with GET_DEST(insn)=x may belong to,
2479 as found by analyzing insn's expression. */
2481 static int
2482 may_trap_exp (x, is_store)
2483 rtx x;
2484 int is_store;
2486 enum rtx_code code;
2488 if (x == 0)
2489 return TRAP_FREE;
2490 code = GET_CODE (x);
2491 if (is_store)
2493 if (code == MEM)
2494 return TRAP_RISKY;
2495 else
2496 return TRAP_FREE;
2498 if (code == MEM)
2500 /* The insn uses memory: a volatile load. */
2501 if (MEM_VOLATILE_P (x))
2502 return IRISKY;
2503 /* An exception-free load. */
2504 if (!may_trap_p (x))
2505 return IFREE;
2506 /* A load with 1 base register, to be further checked. */
2507 if (CONST_BASED_ADDRESS_P (XEXP (x, 0)))
2508 return PFREE_CANDIDATE;
2509 /* No info on the load, to be further checked. */
2510 return PRISKY_CANDIDATE;
2512 else
2514 const char *fmt;
2515 int i, insn_class = TRAP_FREE;
2517 /* Neither store nor load, check if it may cause a trap. */
2518 if (may_trap_p (x))
2519 return TRAP_RISKY;
2520 /* Recursive step: walk the insn... */
2521 fmt = GET_RTX_FORMAT (code);
2522 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
2524 if (fmt[i] == 'e')
2526 int tmp_class = may_trap_exp (XEXP (x, i), is_store);
2527 insn_class = WORST_CLASS (insn_class, tmp_class);
2529 else if (fmt[i] == 'E')
2531 int j;
2532 for (j = 0; j < XVECLEN (x, i); j++)
2534 int tmp_class = may_trap_exp (XVECEXP (x, i, j), is_store);
2535 insn_class = WORST_CLASS (insn_class, tmp_class);
2536 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2537 break;
2540 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2541 break;
2543 return insn_class;
2545 } /* may_trap_exp */
2548 /* Classifies insn for the purpose of verifying that it can be
2549 moved speculatively, by examining it's patterns, returning:
2550 TRAP_RISKY: store, or risky non-load insn (e.g. division by variable).
2551 TRAP_FREE: non-load insn.
2552 IFREE: load from a globaly safe location.
2553 IRISKY: volatile load.
2554 PFREE_CANDIDATE, PRISKY_CANDIDATE: load that need to be checked for
2555 being either PFREE or PRISKY. */
2557 static int
2558 haifa_classify_insn (insn)
2559 rtx insn;
2561 rtx pat = PATTERN (insn);
2562 int tmp_class = TRAP_FREE;
2563 int insn_class = TRAP_FREE;
2564 enum rtx_code code;
2566 if (GET_CODE (pat) == PARALLEL)
2568 int i, len = XVECLEN (pat, 0);
2570 for (i = len - 1; i >= 0; i--)
2572 code = GET_CODE (XVECEXP (pat, 0, i));
2573 switch (code)
2575 case CLOBBER:
2576 /* Test if it is a 'store'. */
2577 tmp_class = may_trap_exp (XEXP (XVECEXP (pat, 0, i), 0), 1);
2578 break;
2579 case SET:
2580 /* Test if it is a store. */
2581 tmp_class = may_trap_exp (SET_DEST (XVECEXP (pat, 0, i)), 1);
2582 if (tmp_class == TRAP_RISKY)
2583 break;
2584 /* Test if it is a load. */
2585 tmp_class =
2586 WORST_CLASS (tmp_class,
2587 may_trap_exp (SET_SRC (XVECEXP (pat, 0, i)), 0));
2588 break;
2589 case TRAP_IF:
2590 tmp_class = TRAP_RISKY;
2591 break;
2592 default:;
2594 insn_class = WORST_CLASS (insn_class, tmp_class);
2595 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2596 break;
2599 else
2601 code = GET_CODE (pat);
2602 switch (code)
2604 case CLOBBER:
2605 /* Test if it is a 'store'. */
2606 tmp_class = may_trap_exp (XEXP (pat, 0), 1);
2607 break;
2608 case SET:
2609 /* Test if it is a store. */
2610 tmp_class = may_trap_exp (SET_DEST (pat), 1);
2611 if (tmp_class == TRAP_RISKY)
2612 break;
2613 /* Test if it is a load. */
2614 tmp_class =
2615 WORST_CLASS (tmp_class,
2616 may_trap_exp (SET_SRC (pat), 0));
2617 break;
2618 case TRAP_IF:
2619 tmp_class = TRAP_RISKY;
2620 break;
2621 default:;
2623 insn_class = tmp_class;
2626 return insn_class;
2628 } /* haifa_classify_insn */
2630 /* Return 1 if load_insn is prisky (i.e. if load_insn is fed by
2631 a load moved speculatively, or if load_insn is protected by
2632 a compare on load_insn's address). */
2634 static int
2635 is_prisky (load_insn, bb_src, bb_trg)
2636 rtx load_insn;
2637 int bb_src, bb_trg;
2639 if (FED_BY_SPEC_LOAD (load_insn))
2640 return 1;
2642 if (LOG_LINKS (load_insn) == NULL)
2643 /* Dependence may 'hide' out of the region. */
2644 return 1;
2646 if (is_conditionally_protected (load_insn, bb_src, bb_trg))
2647 return 1;
2649 return 0;
2650 } /* is_prisky */
2652 /* Insn is a candidate to be moved speculatively from bb_src to bb_trg.
2653 Return 1 if insn is exception-free (and the motion is valid)
2654 and 0 otherwise. */
2656 static int
2657 is_exception_free (insn, bb_src, bb_trg)
2658 rtx insn;
2659 int bb_src, bb_trg;
2661 int insn_class = haifa_classify_insn (insn);
2663 /* Handle non-load insns. */
2664 switch (insn_class)
2666 case TRAP_FREE:
2667 return 1;
2668 case TRAP_RISKY:
2669 return 0;
2670 default:;
2673 /* Handle loads. */
2674 if (!flag_schedule_speculative_load)
2675 return 0;
2676 IS_LOAD_INSN (insn) = 1;
2677 switch (insn_class)
2679 case IFREE:
2680 return (1);
2681 case IRISKY:
2682 return 0;
2683 case PFREE_CANDIDATE:
2684 if (is_pfree (insn, bb_src, bb_trg))
2685 return 1;
2686 /* Don't 'break' here: PFREE-candidate is also PRISKY-candidate. */
2687 case PRISKY_CANDIDATE:
2688 if (!flag_schedule_speculative_load_dangerous
2689 || is_prisky (insn, bb_src, bb_trg))
2690 return 0;
2691 break;
2692 default:;
2695 return flag_schedule_speculative_load_dangerous;
2696 } /* is_exception_free */
2699 /* Process an insn's memory dependencies. There are four kinds of
2700 dependencies:
2702 (0) read dependence: read follows read
2703 (1) true dependence: read follows write
2704 (2) anti dependence: write follows read
2705 (3) output dependence: write follows write
2707 We are careful to build only dependencies which actually exist, and
2708 use transitivity to avoid building too many links. */
2710 /* Return the INSN_LIST containing INSN in LIST, or NULL
2711 if LIST does not contain INSN. */
2713 HAIFA_INLINE static rtx
2714 find_insn_list (insn, list)
2715 rtx insn;
2716 rtx list;
2718 while (list)
2720 if (XEXP (list, 0) == insn)
2721 return list;
2722 list = XEXP (list, 1);
2724 return 0;
2728 /* Return 1 if the pair (insn, x) is found in (LIST, LIST1), or 0
2729 otherwise. */
2731 HAIFA_INLINE static char
2732 find_insn_mem_list (insn, x, list, list1)
2733 rtx insn, x;
2734 rtx list, list1;
2736 while (list)
2738 if (XEXP (list, 0) == insn
2739 && XEXP (list1, 0) == x)
2740 return 1;
2741 list = XEXP (list, 1);
2742 list1 = XEXP (list1, 1);
2744 return 0;
2748 /* Compute the function units used by INSN. This caches the value
2749 returned by function_units_used. A function unit is encoded as the
2750 unit number if the value is non-negative and the compliment of a
2751 mask if the value is negative. A function unit index is the
2752 non-negative encoding. */
2754 HAIFA_INLINE static int
2755 insn_unit (insn)
2756 rtx insn;
2758 register int unit = INSN_UNIT (insn);
2760 if (unit == 0)
2762 recog_memoized (insn);
2764 /* A USE insn, or something else we don't need to understand.
2765 We can't pass these directly to function_units_used because it will
2766 trigger a fatal error for unrecognizable insns. */
2767 if (INSN_CODE (insn) < 0)
2768 unit = -1;
2769 else
2771 unit = function_units_used (insn);
2772 /* Increment non-negative values so we can cache zero. */
2773 if (unit >= 0)
2774 unit++;
2776 /* We only cache 16 bits of the result, so if the value is out of
2777 range, don't cache it. */
2778 if (FUNCTION_UNITS_SIZE < HOST_BITS_PER_SHORT
2779 || unit >= 0
2780 || (unit & ~((1 << (HOST_BITS_PER_SHORT - 1)) - 1)) == 0)
2781 INSN_UNIT (insn) = unit;
2783 return (unit > 0 ? unit - 1 : unit);
2786 /* Compute the blockage range for executing INSN on UNIT. This caches
2787 the value returned by the blockage_range_function for the unit.
2788 These values are encoded in an int where the upper half gives the
2789 minimum value and the lower half gives the maximum value. */
2791 HAIFA_INLINE static unsigned int
2792 blockage_range (unit, insn)
2793 int unit;
2794 rtx insn;
2796 unsigned int blockage = INSN_BLOCKAGE (insn);
2797 unsigned int range;
2799 if ((int) UNIT_BLOCKED (blockage) != unit + 1)
2801 range = function_units[unit].blockage_range_function (insn);
2802 /* We only cache the blockage range for one unit and then only if
2803 the values fit. */
2804 if (HOST_BITS_PER_INT >= UNIT_BITS + 2 * BLOCKAGE_BITS)
2805 INSN_BLOCKAGE (insn) = ENCODE_BLOCKAGE (unit + 1, range);
2807 else
2808 range = BLOCKAGE_RANGE (blockage);
2810 return range;
2813 /* A vector indexed by function unit instance giving the last insn to use
2814 the unit. The value of the function unit instance index for unit U
2815 instance I is (U + I * FUNCTION_UNITS_SIZE). */
2816 static rtx unit_last_insn[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2818 /* A vector indexed by function unit instance giving the minimum time when
2819 the unit will unblock based on the maximum blockage cost. */
2820 static int unit_tick[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2822 /* A vector indexed by function unit number giving the number of insns
2823 that remain to use the unit. */
2824 static int unit_n_insns[FUNCTION_UNITS_SIZE];
2826 /* Reset the function unit state to the null state. */
2828 static void
2829 clear_units ()
2831 bzero ((char *) unit_last_insn, sizeof (unit_last_insn));
2832 bzero ((char *) unit_tick, sizeof (unit_tick));
2833 bzero ((char *) unit_n_insns, sizeof (unit_n_insns));
2836 /* Return the issue-delay of an insn. */
2838 HAIFA_INLINE static int
2839 insn_issue_delay (insn)
2840 rtx insn;
2842 int i, delay = 0;
2843 int unit = insn_unit (insn);
2845 /* Efficiency note: in fact, we are working 'hard' to compute a
2846 value that was available in md file, and is not available in
2847 function_units[] structure. It would be nice to have this
2848 value there, too. */
2849 if (unit >= 0)
2851 if (function_units[unit].blockage_range_function &&
2852 function_units[unit].blockage_function)
2853 delay = function_units[unit].blockage_function (insn, insn);
2855 else
2856 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2857 if ((unit & 1) != 0 && function_units[i].blockage_range_function
2858 && function_units[i].blockage_function)
2859 delay = MAX (delay, function_units[i].blockage_function (insn, insn));
2861 return delay;
2864 /* Return the actual hazard cost of executing INSN on the unit UNIT,
2865 instance INSTANCE at time CLOCK if the previous actual hazard cost
2866 was COST. */
2868 HAIFA_INLINE static int
2869 actual_hazard_this_instance (unit, instance, insn, clock, cost)
2870 int unit, instance, clock, cost;
2871 rtx insn;
2873 int tick = unit_tick[instance]; /* Issue time of the last issued insn. */
2875 if (tick - clock > cost)
2877 /* The scheduler is operating forward, so unit's last insn is the
2878 executing insn and INSN is the candidate insn. We want a
2879 more exact measure of the blockage if we execute INSN at CLOCK
2880 given when we committed the execution of the unit's last insn.
2882 The blockage value is given by either the unit's max blockage
2883 constant, blockage range function, or blockage function. Use
2884 the most exact form for the given unit. */
2886 if (function_units[unit].blockage_range_function)
2888 if (function_units[unit].blockage_function)
2889 tick += (function_units[unit].blockage_function
2890 (unit_last_insn[instance], insn)
2891 - function_units[unit].max_blockage);
2892 else
2893 tick += ((int) MAX_BLOCKAGE_COST (blockage_range (unit, insn))
2894 - function_units[unit].max_blockage);
2896 if (tick - clock > cost)
2897 cost = tick - clock;
2899 return cost;
2902 /* Record INSN as having begun execution on the units encoded by UNIT at
2903 time CLOCK. */
2905 HAIFA_INLINE static void
2906 schedule_unit (unit, insn, clock)
2907 int unit, clock;
2908 rtx insn;
2910 int i;
2912 if (unit >= 0)
2914 int instance = unit;
2915 #if MAX_MULTIPLICITY > 1
2916 /* Find the first free instance of the function unit and use that
2917 one. We assume that one is free. */
2918 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2920 if (!actual_hazard_this_instance (unit, instance, insn, clock, 0))
2921 break;
2922 instance += FUNCTION_UNITS_SIZE;
2924 #endif
2925 unit_last_insn[instance] = insn;
2926 unit_tick[instance] = (clock + function_units[unit].max_blockage);
2928 else
2929 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2930 if ((unit & 1) != 0)
2931 schedule_unit (i, insn, clock);
2934 /* Return the actual hazard cost of executing INSN on the units encoded by
2935 UNIT at time CLOCK if the previous actual hazard cost was COST. */
2937 HAIFA_INLINE static int
2938 actual_hazard (unit, insn, clock, cost)
2939 int unit, clock, cost;
2940 rtx insn;
2942 int i;
2944 if (unit >= 0)
2946 /* Find the instance of the function unit with the minimum hazard. */
2947 int instance = unit;
2948 int best_cost = actual_hazard_this_instance (unit, instance, insn,
2949 clock, cost);
2950 #if MAX_MULTIPLICITY > 1
2951 int this_cost;
2953 if (best_cost > cost)
2955 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2957 instance += FUNCTION_UNITS_SIZE;
2958 this_cost = actual_hazard_this_instance (unit, instance, insn,
2959 clock, cost);
2960 if (this_cost < best_cost)
2962 best_cost = this_cost;
2963 if (this_cost <= cost)
2964 break;
2968 #endif
2969 cost = MAX (cost, best_cost);
2971 else
2972 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2973 if ((unit & 1) != 0)
2974 cost = actual_hazard (i, insn, clock, cost);
2976 return cost;
2979 /* Return the potential hazard cost of executing an instruction on the
2980 units encoded by UNIT if the previous potential hazard cost was COST.
2981 An insn with a large blockage time is chosen in preference to one
2982 with a smaller time; an insn that uses a unit that is more likely
2983 to be used is chosen in preference to one with a unit that is less
2984 used. We are trying to minimize a subsequent actual hazard. */
2986 HAIFA_INLINE static int
2987 potential_hazard (unit, insn, cost)
2988 int unit, cost;
2989 rtx insn;
2991 int i, ncost;
2992 unsigned int minb, maxb;
2994 if (unit >= 0)
2996 minb = maxb = function_units[unit].max_blockage;
2997 if (maxb > 1)
2999 if (function_units[unit].blockage_range_function)
3001 maxb = minb = blockage_range (unit, insn);
3002 maxb = MAX_BLOCKAGE_COST (maxb);
3003 minb = MIN_BLOCKAGE_COST (minb);
3006 if (maxb > 1)
3008 /* Make the number of instructions left dominate. Make the
3009 minimum delay dominate the maximum delay. If all these
3010 are the same, use the unit number to add an arbitrary
3011 ordering. Other terms can be added. */
3012 ncost = minb * 0x40 + maxb;
3013 ncost *= (unit_n_insns[unit] - 1) * 0x1000 + unit;
3014 if (ncost > cost)
3015 cost = ncost;
3019 else
3020 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
3021 if ((unit & 1) != 0)
3022 cost = potential_hazard (i, insn, cost);
3024 return cost;
3027 /* Compute cost of executing INSN given the dependence LINK on the insn USED.
3028 This is the number of cycles between instruction issue and
3029 instruction results. */
3031 HAIFA_INLINE static int
3032 insn_cost (insn, link, used)
3033 rtx insn, link, used;
3035 register int cost = INSN_COST (insn);
3037 if (cost == 0)
3039 recog_memoized (insn);
3041 /* A USE insn, or something else we don't need to understand.
3042 We can't pass these directly to result_ready_cost because it will
3043 trigger a fatal error for unrecognizable insns. */
3044 if (INSN_CODE (insn) < 0)
3046 INSN_COST (insn) = 1;
3047 return 1;
3049 else
3051 cost = result_ready_cost (insn);
3053 if (cost < 1)
3054 cost = 1;
3056 INSN_COST (insn) = cost;
3060 /* In this case estimate cost without caring how insn is used. */
3061 if (link == 0 && used == 0)
3062 return cost;
3064 /* A USE insn should never require the value used to be computed. This
3065 allows the computation of a function's result and parameter values to
3066 overlap the return and call. */
3067 recog_memoized (used);
3068 if (INSN_CODE (used) < 0)
3069 LINK_COST_FREE (link) = 1;
3071 /* If some dependencies vary the cost, compute the adjustment. Most
3072 commonly, the adjustment is complete: either the cost is ignored
3073 (in the case of an output- or anti-dependence), or the cost is
3074 unchanged. These values are cached in the link as LINK_COST_FREE
3075 and LINK_COST_ZERO. */
3077 if (LINK_COST_FREE (link))
3078 cost = 0;
3079 #ifdef ADJUST_COST
3080 else if (!LINK_COST_ZERO (link))
3082 int ncost = cost;
3084 ADJUST_COST (used, link, insn, ncost);
3085 if (ncost < 1)
3087 LINK_COST_FREE (link) = 1;
3088 ncost = 0;
3090 if (cost == ncost)
3091 LINK_COST_ZERO (link) = 1;
3092 cost = ncost;
3094 #endif
3095 return cost;
3098 /* Compute the priority number for INSN. */
3100 static int
3101 priority (insn)
3102 rtx insn;
3104 int this_priority;
3105 rtx link;
3107 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
3108 return 0;
3110 if ((this_priority = INSN_PRIORITY (insn)) == 0)
3112 if (INSN_DEPEND (insn) == 0)
3113 this_priority = insn_cost (insn, 0, 0);
3114 else
3115 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
3117 rtx next;
3118 int next_priority;
3120 if (RTX_INTEGRATED_P (link))
3121 continue;
3123 next = XEXP (link, 0);
3125 /* Critical path is meaningful in block boundaries only. */
3126 if (BLOCK_NUM (next) != BLOCK_NUM (insn))
3127 continue;
3129 next_priority = insn_cost (insn, link, next) + priority (next);
3130 if (next_priority > this_priority)
3131 this_priority = next_priority;
3133 INSN_PRIORITY (insn) = this_priority;
3135 return this_priority;
3139 /* Remove all INSN_LISTs and EXPR_LISTs from the pending lists and add
3140 them to the unused_*_list variables, so that they can be reused. */
3142 static void
3143 free_pending_lists ()
3145 int bb;
3147 for (bb = 0; bb < current_nr_blocks; bb++)
3149 free_INSN_LIST_list (&bb_deps[bb].pending_read_insns);
3150 free_INSN_LIST_list (&bb_deps[bb].pending_write_insns);
3151 free_EXPR_LIST_list (&bb_deps[bb].pending_read_mems);
3152 free_EXPR_LIST_list (&bb_deps[bb].pending_write_mems);
3156 /* Add an INSN and MEM reference pair to a pending INSN_LIST and MEM_LIST.
3157 The MEM is a memory reference contained within INSN, which we are saving
3158 so that we can do memory aliasing on it. */
3160 static void
3161 add_insn_mem_dependence (deps, insn_list, mem_list, insn, mem)
3162 struct deps *deps;
3163 rtx *insn_list, *mem_list, insn, mem;
3165 register rtx link;
3167 link = alloc_INSN_LIST (insn, *insn_list);
3168 *insn_list = link;
3170 link = alloc_EXPR_LIST (VOIDmode, mem, *mem_list);
3171 *mem_list = link;
3173 deps->pending_lists_length++;
3176 /* Make a dependency between every memory reference on the pending lists
3177 and INSN, thus flushing the pending lists. If ONLY_WRITE, don't flush
3178 the read list. */
3180 static void
3181 flush_pending_lists (deps, insn, only_write)
3182 struct deps *deps;
3183 rtx insn;
3184 int only_write;
3186 rtx u;
3187 rtx link;
3189 while (deps->pending_read_insns && ! only_write)
3191 add_dependence (insn, XEXP (deps->pending_read_insns, 0),
3192 REG_DEP_ANTI);
3194 link = deps->pending_read_insns;
3195 deps->pending_read_insns = XEXP (deps->pending_read_insns, 1);
3196 free_INSN_LIST_node (link);
3198 link = deps->pending_read_mems;
3199 deps->pending_read_mems = XEXP (deps->pending_read_mems, 1);
3200 free_EXPR_LIST_node (link);
3202 while (deps->pending_write_insns)
3204 add_dependence (insn, XEXP (deps->pending_write_insns, 0),
3205 REG_DEP_ANTI);
3207 link = deps->pending_write_insns;
3208 deps->pending_write_insns = XEXP (deps->pending_write_insns, 1);
3209 free_INSN_LIST_node (link);
3211 link = deps->pending_write_mems;
3212 deps->pending_write_mems = XEXP (deps->pending_write_mems, 1);
3213 free_EXPR_LIST_node (link);
3215 deps->pending_lists_length = 0;
3217 /* last_pending_memory_flush is now a list of insns. */
3218 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3219 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3221 free_INSN_LIST_list (&deps->last_pending_memory_flush);
3222 deps->last_pending_memory_flush = alloc_INSN_LIST (insn, NULL_RTX);
3225 /* Analyze a single SET, CLOBBER, PRE_DEC, POST_DEC, PRE_INC or POST_INC
3226 rtx, X, creating all dependencies generated by the write to the
3227 destination of X, and reads of everything mentioned. */
3229 static void
3230 sched_analyze_1 (deps, x, insn)
3231 struct deps *deps;
3232 rtx x;
3233 rtx insn;
3235 register int regno;
3236 register rtx dest = XEXP (x, 0);
3237 enum rtx_code code = GET_CODE (x);
3239 if (dest == 0)
3240 return;
3242 if (GET_CODE (dest) == PARALLEL
3243 && GET_MODE (dest) == BLKmode)
3245 register int i;
3246 for (i = XVECLEN (dest, 0) - 1; i >= 0; i--)
3247 sched_analyze_1 (deps, XVECEXP (dest, 0, i), insn);
3248 if (GET_CODE (x) == SET)
3249 sched_analyze_2 (deps, SET_SRC (x), insn);
3250 return;
3253 while (GET_CODE (dest) == STRICT_LOW_PART || GET_CODE (dest) == SUBREG
3254 || GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3256 if (GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3258 /* The second and third arguments are values read by this insn. */
3259 sched_analyze_2 (deps, XEXP (dest, 1), insn);
3260 sched_analyze_2 (deps, XEXP (dest, 2), insn);
3262 dest = XEXP (dest, 0);
3265 if (GET_CODE (dest) == REG)
3267 register int i;
3269 regno = REGNO (dest);
3271 /* A hard reg in a wide mode may really be multiple registers.
3272 If so, mark all of them just like the first. */
3273 if (regno < FIRST_PSEUDO_REGISTER)
3275 i = HARD_REGNO_NREGS (regno, GET_MODE (dest));
3276 while (--i >= 0)
3278 int r = regno + i;
3279 rtx u;
3281 for (u = deps->reg_last_uses[r]; u; u = XEXP (u, 1))
3282 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3284 for (u = deps->reg_last_sets[r]; u; u = XEXP (u, 1))
3285 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3287 /* Clobbers need not be ordered with respect to one
3288 another, but sets must be ordered with respect to a
3289 pending clobber. */
3290 if (code == SET)
3292 free_INSN_LIST_list (&deps->reg_last_uses[r]);
3293 for (u = deps->reg_last_clobbers[r]; u; u = XEXP (u, 1))
3294 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3295 SET_REGNO_REG_SET (reg_pending_sets, r);
3297 else
3298 SET_REGNO_REG_SET (reg_pending_clobbers, r);
3300 /* Function calls clobber all call_used regs. */
3301 if (global_regs[r] || (code == SET && call_used_regs[r]))
3302 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3303 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3306 else
3308 rtx u;
3310 for (u = deps->reg_last_uses[regno]; u; u = XEXP (u, 1))
3311 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3313 for (u = deps->reg_last_sets[regno]; u; u = XEXP (u, 1))
3314 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3316 if (code == SET)
3318 free_INSN_LIST_list (&deps->reg_last_uses[regno]);
3319 for (u = deps->reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3320 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3321 SET_REGNO_REG_SET (reg_pending_sets, regno);
3323 else
3324 SET_REGNO_REG_SET (reg_pending_clobbers, regno);
3326 /* Pseudos that are REG_EQUIV to something may be replaced
3327 by that during reloading. We need only add dependencies for
3328 the address in the REG_EQUIV note. */
3329 if (!reload_completed
3330 && reg_known_equiv_p[regno]
3331 && GET_CODE (reg_known_value[regno]) == MEM)
3332 sched_analyze_2 (deps, XEXP (reg_known_value[regno], 0), insn);
3334 /* Don't let it cross a call after scheduling if it doesn't
3335 already cross one. */
3337 if (REG_N_CALLS_CROSSED (regno) == 0)
3338 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3339 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3342 else if (GET_CODE (dest) == MEM)
3344 /* Writing memory. */
3346 if (deps->pending_lists_length > 32)
3348 /* Flush all pending reads and writes to prevent the pending lists
3349 from getting any larger. Insn scheduling runs too slowly when
3350 these lists get long. The number 32 was chosen because it
3351 seems like a reasonable number. When compiling GCC with itself,
3352 this flush occurs 8 times for sparc, and 10 times for m88k using
3353 the number 32. */
3354 flush_pending_lists (deps, insn, 0);
3356 else
3358 rtx u;
3359 rtx pending, pending_mem;
3361 pending = deps->pending_read_insns;
3362 pending_mem = deps->pending_read_mems;
3363 while (pending)
3365 if (anti_dependence (XEXP (pending_mem, 0), dest))
3366 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3368 pending = XEXP (pending, 1);
3369 pending_mem = XEXP (pending_mem, 1);
3372 pending = deps->pending_write_insns;
3373 pending_mem = deps->pending_write_mems;
3374 while (pending)
3376 if (output_dependence (XEXP (pending_mem, 0), dest))
3377 add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
3379 pending = XEXP (pending, 1);
3380 pending_mem = XEXP (pending_mem, 1);
3383 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3384 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3386 add_insn_mem_dependence (deps, &deps->pending_write_insns,
3387 &deps->pending_write_mems, insn, dest);
3389 sched_analyze_2 (deps, XEXP (dest, 0), insn);
3392 /* Analyze reads. */
3393 if (GET_CODE (x) == SET)
3394 sched_analyze_2 (deps, SET_SRC (x), insn);
3397 /* Analyze the uses of memory and registers in rtx X in INSN. */
3399 static void
3400 sched_analyze_2 (deps, x, insn)
3401 struct deps *deps;
3402 rtx x;
3403 rtx insn;
3405 register int i;
3406 register int j;
3407 register enum rtx_code code;
3408 register const char *fmt;
3410 if (x == 0)
3411 return;
3413 code = GET_CODE (x);
3415 switch (code)
3417 case CONST_INT:
3418 case CONST_DOUBLE:
3419 case SYMBOL_REF:
3420 case CONST:
3421 case LABEL_REF:
3422 /* Ignore constants. Note that we must handle CONST_DOUBLE here
3423 because it may have a cc0_rtx in its CONST_DOUBLE_CHAIN field, but
3424 this does not mean that this insn is using cc0. */
3425 return;
3427 #ifdef HAVE_cc0
3428 case CC0:
3430 rtx link, prev;
3432 /* User of CC0 depends on immediately preceding insn. */
3433 SCHED_GROUP_P (insn) = 1;
3435 /* There may be a note before this insn now, but all notes will
3436 be removed before we actually try to schedule the insns, so
3437 it won't cause a problem later. We must avoid it here though. */
3438 prev = prev_nonnote_insn (insn);
3440 /* Make a copy of all dependencies on the immediately previous insn,
3441 and add to this insn. This is so that all the dependencies will
3442 apply to the group. Remove an explicit dependence on this insn
3443 as SCHED_GROUP_P now represents it. */
3445 if (find_insn_list (prev, LOG_LINKS (insn)))
3446 remove_dependence (insn, prev);
3448 for (link = LOG_LINKS (prev); link; link = XEXP (link, 1))
3449 add_dependence (insn, XEXP (link, 0), REG_NOTE_KIND (link));
3451 return;
3453 #endif
3455 case REG:
3457 rtx u;
3458 int regno = REGNO (x);
3459 if (regno < FIRST_PSEUDO_REGISTER)
3461 int i;
3463 i = HARD_REGNO_NREGS (regno, GET_MODE (x));
3464 while (--i >= 0)
3466 int r = regno + i;
3467 deps->reg_last_uses[r]
3468 = alloc_INSN_LIST (insn, deps->reg_last_uses[r]);
3470 for (u = deps->reg_last_sets[r]; u; u = XEXP (u, 1))
3471 add_dependence (insn, XEXP (u, 0), 0);
3473 /* ??? This should never happen. */
3474 for (u = deps->reg_last_clobbers[r]; u; u = XEXP (u, 1))
3475 add_dependence (insn, XEXP (u, 0), 0);
3477 if (call_used_regs[r] || global_regs[r])
3478 /* Function calls clobber all call_used regs. */
3479 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3480 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3483 else
3485 deps->reg_last_uses[regno]
3486 = alloc_INSN_LIST (insn, deps->reg_last_uses[regno]);
3488 for (u = deps->reg_last_sets[regno]; u; u = XEXP (u, 1))
3489 add_dependence (insn, XEXP (u, 0), 0);
3491 /* ??? This should never happen. */
3492 for (u = deps->reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3493 add_dependence (insn, XEXP (u, 0), 0);
3495 /* Pseudos that are REG_EQUIV to something may be replaced
3496 by that during reloading. We need only add dependencies for
3497 the address in the REG_EQUIV note. */
3498 if (!reload_completed
3499 && reg_known_equiv_p[regno]
3500 && GET_CODE (reg_known_value[regno]) == MEM)
3501 sched_analyze_2 (deps, XEXP (reg_known_value[regno], 0), insn);
3503 /* If the register does not already cross any calls, then add this
3504 insn to the sched_before_next_call list so that it will still
3505 not cross calls after scheduling. */
3506 if (REG_N_CALLS_CROSSED (regno) == 0)
3507 add_dependence (deps->sched_before_next_call, insn,
3508 REG_DEP_ANTI);
3510 return;
3513 case MEM:
3515 /* Reading memory. */
3516 rtx u;
3517 rtx pending, pending_mem;
3519 pending = deps->pending_read_insns;
3520 pending_mem = deps->pending_read_mems;
3521 while (pending)
3523 if (read_dependence (XEXP (pending_mem, 0), x))
3524 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3526 pending = XEXP (pending, 1);
3527 pending_mem = XEXP (pending_mem, 1);
3530 pending = deps->pending_write_insns;
3531 pending_mem = deps->pending_write_mems;
3532 while (pending)
3534 if (true_dependence (XEXP (pending_mem, 0), VOIDmode,
3535 x, rtx_varies_p))
3536 add_dependence (insn, XEXP (pending, 0), 0);
3538 pending = XEXP (pending, 1);
3539 pending_mem = XEXP (pending_mem, 1);
3542 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3543 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3545 /* Always add these dependencies to pending_reads, since
3546 this insn may be followed by a write. */
3547 add_insn_mem_dependence (deps, &deps->pending_read_insns,
3548 &deps->pending_read_mems, insn, x);
3550 /* Take advantage of tail recursion here. */
3551 sched_analyze_2 (deps, XEXP (x, 0), insn);
3552 return;
3555 /* Force pending stores to memory in case a trap handler needs them. */
3556 case TRAP_IF:
3557 flush_pending_lists (deps, insn, 1);
3558 break;
3560 case ASM_OPERANDS:
3561 case ASM_INPUT:
3562 case UNSPEC_VOLATILE:
3564 rtx u;
3566 /* Traditional and volatile asm instructions must be considered to use
3567 and clobber all hard registers, all pseudo-registers and all of
3568 memory. So must TRAP_IF and UNSPEC_VOLATILE operations.
3570 Consider for instance a volatile asm that changes the fpu rounding
3571 mode. An insn should not be moved across this even if it only uses
3572 pseudo-regs because it might give an incorrectly rounded result. */
3573 if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
3575 int max_reg = max_reg_num ();
3576 for (i = 0; i < max_reg; i++)
3578 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3579 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3580 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3582 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3583 add_dependence (insn, XEXP (u, 0), 0);
3585 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3586 add_dependence (insn, XEXP (u, 0), 0);
3588 reg_pending_sets_all = 1;
3590 flush_pending_lists (deps, insn, 0);
3593 /* For all ASM_OPERANDS, we must traverse the vector of input operands.
3594 We can not just fall through here since then we would be confused
3595 by the ASM_INPUT rtx inside ASM_OPERANDS, which do not indicate
3596 traditional asms unlike their normal usage. */
3598 if (code == ASM_OPERANDS)
3600 for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
3601 sched_analyze_2 (deps, ASM_OPERANDS_INPUT (x, j), insn);
3602 return;
3604 break;
3607 case PRE_DEC:
3608 case POST_DEC:
3609 case PRE_INC:
3610 case POST_INC:
3611 /* These both read and modify the result. We must handle them as writes
3612 to get proper dependencies for following instructions. We must handle
3613 them as reads to get proper dependencies from this to previous
3614 instructions. Thus we need to pass them to both sched_analyze_1
3615 and sched_analyze_2. We must call sched_analyze_2 first in order
3616 to get the proper antecedent for the read. */
3617 sched_analyze_2 (deps, XEXP (x, 0), insn);
3618 sched_analyze_1 (deps, x, insn);
3619 return;
3621 default:
3622 break;
3625 /* Other cases: walk the insn. */
3626 fmt = GET_RTX_FORMAT (code);
3627 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3629 if (fmt[i] == 'e')
3630 sched_analyze_2 (deps, XEXP (x, i), insn);
3631 else if (fmt[i] == 'E')
3632 for (j = 0; j < XVECLEN (x, i); j++)
3633 sched_analyze_2 (deps, XVECEXP (x, i, j), insn);
3637 /* Analyze an INSN with pattern X to find all dependencies. */
3639 static void
3640 sched_analyze_insn (deps, x, insn, loop_notes)
3641 struct deps *deps;
3642 rtx x, insn;
3643 rtx loop_notes;
3645 register RTX_CODE code = GET_CODE (x);
3646 rtx link;
3647 int maxreg = max_reg_num ();
3648 int i;
3650 if (code == SET || code == CLOBBER)
3651 sched_analyze_1 (deps, x, insn);
3652 else if (code == PARALLEL)
3654 register int i;
3655 for (i = XVECLEN (x, 0) - 1; i >= 0; i--)
3657 code = GET_CODE (XVECEXP (x, 0, i));
3658 if (code == SET || code == CLOBBER)
3659 sched_analyze_1 (deps, XVECEXP (x, 0, i), insn);
3660 else
3661 sched_analyze_2 (deps, XVECEXP (x, 0, i), insn);
3664 else
3665 sched_analyze_2 (deps, x, insn);
3667 /* Mark registers CLOBBERED or used by called function. */
3668 if (GET_CODE (insn) == CALL_INSN)
3669 for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
3671 if (GET_CODE (XEXP (link, 0)) == CLOBBER)
3672 sched_analyze_1 (deps, XEXP (link, 0), insn);
3673 else
3674 sched_analyze_2 (deps, XEXP (link, 0), insn);
3677 /* If there is a {LOOP,EHREGION}_{BEG,END} note in the middle of a basic
3678 block, then we must be sure that no instructions are scheduled across it.
3679 Otherwise, the reg_n_refs info (which depends on loop_depth) would
3680 become incorrect. */
3682 if (loop_notes)
3684 int max_reg = max_reg_num ();
3685 int schedule_barrier_found = 0;
3686 rtx link;
3688 /* Update loop_notes with any notes from this insn. Also determine
3689 if any of the notes on the list correspond to instruction scheduling
3690 barriers (loop, eh & setjmp notes, but not range notes. */
3691 link = loop_notes;
3692 while (XEXP (link, 1))
3694 if (INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_BEG
3695 || INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_END
3696 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_BEG
3697 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_END
3698 || INTVAL (XEXP (link, 0)) == NOTE_INSN_SETJMP)
3699 schedule_barrier_found = 1;
3701 link = XEXP (link, 1);
3703 XEXP (link, 1) = REG_NOTES (insn);
3704 REG_NOTES (insn) = loop_notes;
3706 /* Add dependencies if a scheduling barrier was found. */
3707 if (schedule_barrier_found)
3709 for (i = 0; i < max_reg; i++)
3711 rtx u;
3712 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3713 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3714 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3716 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3717 add_dependence (insn, XEXP (u, 0), 0);
3719 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3720 add_dependence (insn, XEXP (u, 0), 0);
3722 reg_pending_sets_all = 1;
3724 flush_pending_lists (deps, insn, 0);
3729 /* Accumulate clobbers until the next set so that it will be output dependent
3730 on all of them. At the next set we can clear the clobber list, since
3731 subsequent sets will be output dependent on it. */
3732 EXECUTE_IF_SET_IN_REG_SET
3733 (reg_pending_sets, 0, i,
3735 free_INSN_LIST_list (&deps->reg_last_sets[i]);
3736 free_INSN_LIST_list (&deps->reg_last_clobbers[i]);
3737 deps->reg_last_sets[i] = alloc_INSN_LIST (insn, NULL_RTX);
3739 EXECUTE_IF_SET_IN_REG_SET
3740 (reg_pending_clobbers, 0, i,
3742 deps->reg_last_clobbers[i]
3743 = alloc_INSN_LIST (insn, deps->reg_last_clobbers[i]);
3745 CLEAR_REG_SET (reg_pending_sets);
3746 CLEAR_REG_SET (reg_pending_clobbers);
3748 if (reg_pending_sets_all)
3750 for (i = 0; i < maxreg; i++)
3752 free_INSN_LIST_list (&deps->reg_last_sets[i]);
3753 free_INSN_LIST_list (&deps->reg_last_clobbers[i]);
3754 deps->reg_last_sets[i] = alloc_INSN_LIST (insn, NULL_RTX);
3757 reg_pending_sets_all = 0;
3760 /* Handle function calls and function returns created by the epilogue
3761 threading code. */
3762 if (GET_CODE (insn) == CALL_INSN || GET_CODE (insn) == JUMP_INSN)
3764 rtx dep_insn;
3765 rtx prev_dep_insn;
3767 /* When scheduling instructions, we make sure calls don't lose their
3768 accompanying USE insns by depending them one on another in order.
3770 Also, we must do the same thing for returns created by the epilogue
3771 threading code. Note this code works only in this special case,
3772 because other passes make no guarantee that they will never emit
3773 an instruction between a USE and a RETURN. There is such a guarantee
3774 for USE instructions immediately before a call. */
3776 prev_dep_insn = insn;
3777 dep_insn = PREV_INSN (insn);
3778 while (GET_CODE (dep_insn) == INSN
3779 && GET_CODE (PATTERN (dep_insn)) == USE
3780 && GET_CODE (XEXP (PATTERN (dep_insn), 0)) == REG)
3782 SCHED_GROUP_P (prev_dep_insn) = 1;
3784 /* Make a copy of all dependencies on dep_insn, and add to insn.
3785 This is so that all of the dependencies will apply to the
3786 group. */
3788 for (link = LOG_LINKS (dep_insn); link; link = XEXP (link, 1))
3789 add_dependence (insn, XEXP (link, 0), REG_NOTE_KIND (link));
3791 prev_dep_insn = dep_insn;
3792 dep_insn = PREV_INSN (dep_insn);
3797 /* Analyze every insn between HEAD and TAIL inclusive, creating LOG_LINKS
3798 for every dependency. */
3800 static void
3801 sched_analyze (deps, head, tail)
3802 struct deps *deps;
3803 rtx head, tail;
3805 register rtx insn;
3806 register rtx u;
3807 rtx loop_notes = 0;
3809 for (insn = head;; insn = NEXT_INSN (insn))
3811 if (GET_CODE (insn) == INSN || GET_CODE (insn) == JUMP_INSN)
3813 /* Clear out the stale LOG_LINKS from flow. */
3814 free_INSN_LIST_list (&LOG_LINKS (insn));
3816 /* Make each JUMP_INSN a scheduling barrier for memory
3817 references. */
3818 if (GET_CODE (insn) == JUMP_INSN)
3819 deps->last_pending_memory_flush
3820 = alloc_INSN_LIST (insn, deps->last_pending_memory_flush);
3821 sched_analyze_insn (deps, PATTERN (insn), insn, loop_notes);
3822 loop_notes = 0;
3824 else if (GET_CODE (insn) == CALL_INSN)
3826 rtx x;
3827 register int i;
3829 CANT_MOVE (insn) = 1;
3831 /* Clear out the stale LOG_LINKS from flow. */
3832 free_INSN_LIST_list (&LOG_LINKS (insn));
3834 /* Any instruction using a hard register which may get clobbered
3835 by a call needs to be marked as dependent on this call.
3836 This prevents a use of a hard return reg from being moved
3837 past a void call (i.e. it does not explicitly set the hard
3838 return reg). */
3840 /* If this call is followed by a NOTE_INSN_SETJMP, then assume that
3841 all registers, not just hard registers, may be clobbered by this
3842 call. */
3844 /* Insn, being a CALL_INSN, magically depends on
3845 `last_function_call' already. */
3847 if (NEXT_INSN (insn) && GET_CODE (NEXT_INSN (insn)) == NOTE
3848 && NOTE_LINE_NUMBER (NEXT_INSN (insn)) == NOTE_INSN_SETJMP)
3850 int max_reg = max_reg_num ();
3851 for (i = 0; i < max_reg; i++)
3853 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3854 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3855 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3857 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3858 add_dependence (insn, XEXP (u, 0), 0);
3860 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3861 add_dependence (insn, XEXP (u, 0), 0);
3863 reg_pending_sets_all = 1;
3865 /* Add a pair of REG_SAVE_NOTEs which we will later
3866 convert back into a NOTE_INSN_SETJMP note. See
3867 reemit_notes for why we use a pair of NOTEs. */
3868 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3869 GEN_INT (0),
3870 REG_NOTES (insn));
3871 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3872 GEN_INT (NOTE_INSN_SETJMP),
3873 REG_NOTES (insn));
3875 else
3877 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
3878 if (call_used_regs[i] || global_regs[i])
3880 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3881 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3883 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3884 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3886 SET_REGNO_REG_SET (reg_pending_clobbers, i);
3890 /* For each insn which shouldn't cross a call, add a dependence
3891 between that insn and this call insn. */
3892 x = LOG_LINKS (deps->sched_before_next_call);
3893 while (x)
3895 add_dependence (insn, XEXP (x, 0), REG_DEP_ANTI);
3896 x = XEXP (x, 1);
3898 free_INSN_LIST_list (&LOG_LINKS (deps->sched_before_next_call));
3900 sched_analyze_insn (deps, PATTERN (insn), insn, loop_notes);
3901 loop_notes = 0;
3903 /* In the absence of interprocedural alias analysis, we must flush
3904 all pending reads and writes, and start new dependencies starting
3905 from here. But only flush writes for constant calls (which may
3906 be passed a pointer to something we haven't written yet). */
3907 flush_pending_lists (deps, insn, CONST_CALL_P (insn));
3909 /* Depend this function call (actually, the user of this
3910 function call) on all hard register clobberage. */
3912 /* last_function_call is now a list of insns. */
3913 free_INSN_LIST_list (&deps->last_function_call);
3914 deps->last_function_call = alloc_INSN_LIST (insn, NULL_RTX);
3917 /* See comments on reemit_notes as to why we do this.
3918 ??? Actually, the reemit_notes just say what is done, not why. */
3920 else if (GET_CODE (insn) == NOTE
3921 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_START
3922 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_END))
3924 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE, NOTE_RANGE_INFO (insn),
3925 loop_notes);
3926 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3927 GEN_INT (NOTE_LINE_NUMBER (insn)),
3928 loop_notes);
3930 else if (GET_CODE (insn) == NOTE
3931 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG
3932 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_END
3933 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3934 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END
3935 || (NOTE_LINE_NUMBER (insn) == NOTE_INSN_SETJMP
3936 && GET_CODE (PREV_INSN (insn)) != CALL_INSN)))
3938 rtx rtx_region;
3940 if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3941 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END)
3942 rtx_region = GEN_INT (NOTE_EH_HANDLER (insn));
3943 else
3944 rtx_region = GEN_INT (0);
3946 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3947 rtx_region,
3948 loop_notes);
3949 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3950 GEN_INT (NOTE_LINE_NUMBER (insn)),
3951 loop_notes);
3952 CONST_CALL_P (loop_notes) = CONST_CALL_P (insn);
3955 if (insn == tail)
3956 return;
3958 abort ();
3961 /* Macros and functions for keeping the priority queue sorted, and
3962 dealing with queueing and dequeueing of instructions. */
3964 #define SCHED_SORT(READY, N_READY) \
3965 do { if ((N_READY) == 2) \
3966 swap_sort (READY, N_READY); \
3967 else if ((N_READY) > 2) \
3968 qsort (READY, N_READY, sizeof (rtx), rank_for_schedule); } \
3969 while (0)
3971 /* Returns a positive value if x is preferred; returns a negative value if
3972 y is preferred. Should never return 0, since that will make the sort
3973 unstable. */
3975 static int
3976 rank_for_schedule (x, y)
3977 const PTR x;
3978 const PTR y;
3980 rtx tmp = *(rtx *)y;
3981 rtx tmp2 = *(rtx *)x;
3982 rtx link;
3983 int tmp_class, tmp2_class, depend_count1, depend_count2;
3984 int val, priority_val, spec_val, prob_val, weight_val;
3987 /* Prefer insn with higher priority. */
3988 priority_val = INSN_PRIORITY (tmp2) - INSN_PRIORITY (tmp);
3989 if (priority_val)
3990 return priority_val;
3992 /* Prefer an insn with smaller contribution to registers-pressure. */
3993 if (!reload_completed &&
3994 (weight_val = INSN_REG_WEIGHT (tmp) - INSN_REG_WEIGHT (tmp2)))
3995 return (weight_val);
3997 /* Some comparison make sense in interblock scheduling only. */
3998 if (INSN_BB (tmp) != INSN_BB (tmp2))
4000 /* Prefer an inblock motion on an interblock motion. */
4001 if ((INSN_BB (tmp2) == target_bb) && (INSN_BB (tmp) != target_bb))
4002 return 1;
4003 if ((INSN_BB (tmp) == target_bb) && (INSN_BB (tmp2) != target_bb))
4004 return -1;
4006 /* Prefer a useful motion on a speculative one. */
4007 if ((spec_val = IS_SPECULATIVE_INSN (tmp) - IS_SPECULATIVE_INSN (tmp2)))
4008 return (spec_val);
4010 /* Prefer a more probable (speculative) insn. */
4011 prob_val = INSN_PROBABILITY (tmp2) - INSN_PROBABILITY (tmp);
4012 if (prob_val)
4013 return (prob_val);
4016 /* Compare insns based on their relation to the last-scheduled-insn. */
4017 if (last_scheduled_insn)
4019 /* Classify the instructions into three classes:
4020 1) Data dependent on last schedule insn.
4021 2) Anti/Output dependent on last scheduled insn.
4022 3) Independent of last scheduled insn, or has latency of one.
4023 Choose the insn from the highest numbered class if different. */
4024 link = find_insn_list (tmp, INSN_DEPEND (last_scheduled_insn));
4025 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp) == 1)
4026 tmp_class = 3;
4027 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
4028 tmp_class = 1;
4029 else
4030 tmp_class = 2;
4032 link = find_insn_list (tmp2, INSN_DEPEND (last_scheduled_insn));
4033 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp2) == 1)
4034 tmp2_class = 3;
4035 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
4036 tmp2_class = 1;
4037 else
4038 tmp2_class = 2;
4040 if ((val = tmp2_class - tmp_class))
4041 return val;
4044 /* Prefer the insn which has more later insns that depend on it.
4045 This gives the scheduler more freedom when scheduling later
4046 instructions at the expense of added register pressure. */
4047 depend_count1 = 0;
4048 for (link = INSN_DEPEND (tmp); link; link = XEXP (link, 1))
4049 depend_count1++;
4051 depend_count2 = 0;
4052 for (link = INSN_DEPEND (tmp2); link; link = XEXP (link, 1))
4053 depend_count2++;
4055 val = depend_count2 - depend_count1;
4056 if (val)
4057 return val;
4059 /* If insns are equally good, sort by INSN_LUID (original insn order),
4060 so that we make the sort stable. This minimizes instruction movement,
4061 thus minimizing sched's effect on debugging and cross-jumping. */
4062 return INSN_LUID (tmp) - INSN_LUID (tmp2);
4065 /* Resort the array A in which only element at index N may be out of order. */
4067 HAIFA_INLINE static void
4068 swap_sort (a, n)
4069 rtx *a;
4070 int n;
4072 rtx insn = a[n - 1];
4073 int i = n - 2;
4075 while (i >= 0 && rank_for_schedule (a + i, &insn) >= 0)
4077 a[i + 1] = a[i];
4078 i -= 1;
4080 a[i + 1] = insn;
4083 static int max_priority;
4085 /* Add INSN to the insn queue so that it can be executed at least
4086 N_CYCLES after the currently executing insn. Preserve insns
4087 chain for debugging purposes. */
4089 HAIFA_INLINE static void
4090 queue_insn (insn, n_cycles)
4091 rtx insn;
4092 int n_cycles;
4094 int next_q = NEXT_Q_AFTER (q_ptr, n_cycles);
4095 rtx link = alloc_INSN_LIST (insn, insn_queue[next_q]);
4096 insn_queue[next_q] = link;
4097 q_size += 1;
4099 if (sched_verbose >= 2)
4101 fprintf (dump, ";;\t\tReady-->Q: insn %d: ", INSN_UID (insn));
4103 if (INSN_BB (insn) != target_bb)
4104 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4106 fprintf (dump, "queued for %d cycles.\n", n_cycles);
4111 /* PREV is an insn that is ready to execute. Adjust its priority if that
4112 will help shorten or lengthen register lifetimes as appropriate. Also
4113 provide a hook for the target to tweek itself. */
4115 HAIFA_INLINE static void
4116 adjust_priority (prev)
4117 rtx prev ATTRIBUTE_UNUSED;
4119 /* ??? There used to be code here to try and estimate how an insn
4120 affected register lifetimes, but it did it by looking at REG_DEAD
4121 notes, which we removed in schedule_region. Nor did it try to
4122 take into account register pressure or anything useful like that.
4124 Revisit when we have a machine model to work with and not before. */
4126 #ifdef ADJUST_PRIORITY
4127 ADJUST_PRIORITY (prev);
4128 #endif
4131 /* Clock at which the previous instruction was issued. */
4132 static int last_clock_var;
4134 /* INSN is the "currently executing insn". Launch each insn which was
4135 waiting on INSN. READY is a vector of insns which are ready to fire.
4136 N_READY is the number of elements in READY. CLOCK is the current
4137 cycle. */
4139 static int
4140 schedule_insn (insn, ready, n_ready, clock)
4141 rtx insn;
4142 rtx *ready;
4143 int n_ready;
4144 int clock;
4146 rtx link;
4147 int unit;
4149 unit = insn_unit (insn);
4151 if (sched_verbose >= 2)
4153 fprintf (dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
4154 INSN_UID (insn));
4155 insn_print_units (insn);
4156 fprintf (dump, "\n");
4159 if (sched_verbose && unit == -1)
4160 visualize_no_unit (insn);
4162 if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
4163 schedule_unit (unit, insn, clock);
4165 if (INSN_DEPEND (insn) == 0)
4166 return n_ready;
4168 /* This is used by the function adjust_priority above. */
4169 if (n_ready > 0)
4170 max_priority = MAX (INSN_PRIORITY (ready[0]), INSN_PRIORITY (insn));
4171 else
4172 max_priority = INSN_PRIORITY (insn);
4174 for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
4176 rtx next = XEXP (link, 0);
4177 int cost = insn_cost (insn, link, next);
4179 INSN_TICK (next) = MAX (INSN_TICK (next), clock + cost);
4181 if ((INSN_DEP_COUNT (next) -= 1) == 0)
4183 int effective_cost = INSN_TICK (next) - clock;
4185 /* For speculative insns, before inserting to ready/queue,
4186 check live, exception-free, and issue-delay. */
4187 if (INSN_BB (next) != target_bb
4188 && (!IS_VALID (INSN_BB (next))
4189 || CANT_MOVE (next)
4190 || (IS_SPECULATIVE_INSN (next)
4191 && (insn_issue_delay (next) > 3
4192 || !check_live (next, INSN_BB (next))
4193 || !is_exception_free (next, INSN_BB (next), target_bb)))))
4194 continue;
4196 if (sched_verbose >= 2)
4198 fprintf (dump, ";;\t\tdependences resolved: insn %d ",
4199 INSN_UID (next));
4201 if (current_nr_blocks > 1 && INSN_BB (next) != target_bb)
4202 fprintf (dump, "/b%d ", BLOCK_NUM (next));
4204 if (effective_cost < 1)
4205 fprintf (dump, "into ready\n");
4206 else
4207 fprintf (dump, "into queue with cost=%d\n", effective_cost);
4210 /* Adjust the priority of NEXT and either put it on the ready
4211 list or queue it. */
4212 adjust_priority (next);
4213 if (effective_cost < 1)
4214 ready[n_ready++] = next;
4215 else
4216 queue_insn (next, effective_cost);
4220 /* Annotate the instruction with issue information -- TImode
4221 indicates that the instruction is expected not to be able
4222 to issue on the same cycle as the previous insn. A machine
4223 may use this information to decide how the instruction should
4224 be aligned. */
4225 if (reload_completed && issue_rate > 1)
4227 PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
4228 last_clock_var = clock;
4231 return n_ready;
4234 /* Functions for handling of notes. */
4236 /* Delete notes beginning with INSN and put them in the chain
4237 of notes ended by NOTE_LIST.
4238 Returns the insn following the notes. */
4240 static rtx
4241 unlink_other_notes (insn, tail)
4242 rtx insn, tail;
4244 rtx prev = PREV_INSN (insn);
4246 while (insn != tail && GET_CODE (insn) == NOTE)
4248 rtx next = NEXT_INSN (insn);
4249 /* Delete the note from its current position. */
4250 if (prev)
4251 NEXT_INSN (prev) = next;
4252 if (next)
4253 PREV_INSN (next) = prev;
4255 /* See sched_analyze to see how these are handled. */
4256 if (NOTE_LINE_NUMBER (insn) != NOTE_INSN_SETJMP
4257 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_BEG
4258 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_END
4259 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_START
4260 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_END
4261 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_BEG
4262 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_END)
4264 /* Insert the note at the end of the notes list. */
4265 PREV_INSN (insn) = note_list;
4266 if (note_list)
4267 NEXT_INSN (note_list) = insn;
4268 note_list = insn;
4271 insn = next;
4273 return insn;
4276 /* Delete line notes beginning with INSN. Record line-number notes so
4277 they can be reused. Returns the insn following the notes. */
4279 static rtx
4280 unlink_line_notes (insn, tail)
4281 rtx insn, tail;
4283 rtx prev = PREV_INSN (insn);
4285 while (insn != tail && GET_CODE (insn) == NOTE)
4287 rtx next = NEXT_INSN (insn);
4289 if (write_symbols != NO_DEBUG && NOTE_LINE_NUMBER (insn) > 0)
4291 /* Delete the note from its current position. */
4292 if (prev)
4293 NEXT_INSN (prev) = next;
4294 if (next)
4295 PREV_INSN (next) = prev;
4297 /* Record line-number notes so they can be reused. */
4298 LINE_NOTE (insn) = insn;
4300 else
4301 prev = insn;
4303 insn = next;
4305 return insn;
4308 /* Return the head and tail pointers of BB. */
4310 HAIFA_INLINE static void
4311 get_block_head_tail (b, headp, tailp)
4312 int b;
4313 rtx *headp;
4314 rtx *tailp;
4317 rtx head;
4318 rtx tail;
4320 /* HEAD and TAIL delimit the basic block being scheduled. */
4321 head = BLOCK_HEAD (b);
4322 tail = BLOCK_END (b);
4324 /* Don't include any notes or labels at the beginning of the
4325 basic block, or notes at the ends of basic blocks. */
4326 while (head != tail)
4328 if (GET_CODE (head) == NOTE)
4329 head = NEXT_INSN (head);
4330 else if (GET_CODE (tail) == NOTE)
4331 tail = PREV_INSN (tail);
4332 else if (GET_CODE (head) == CODE_LABEL)
4333 head = NEXT_INSN (head);
4334 else
4335 break;
4338 *headp = head;
4339 *tailp = tail;
4342 HAIFA_INLINE static void
4343 get_bb_head_tail (bb, headp, tailp)
4344 int bb;
4345 rtx *headp;
4346 rtx *tailp;
4348 get_block_head_tail (BB_TO_BLOCK (bb), headp, tailp);
4351 /* Delete line notes from bb. Save them so they can be later restored
4352 (in restore_line_notes ()). */
4354 static void
4355 rm_line_notes (bb)
4356 int bb;
4358 rtx next_tail;
4359 rtx tail;
4360 rtx head;
4361 rtx insn;
4363 get_bb_head_tail (bb, &head, &tail);
4365 if (head == tail
4366 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
4367 return;
4369 next_tail = NEXT_INSN (tail);
4370 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4372 rtx prev;
4374 /* Farm out notes, and maybe save them in NOTE_LIST.
4375 This is needed to keep the debugger from
4376 getting completely deranged. */
4377 if (GET_CODE (insn) == NOTE)
4379 prev = insn;
4380 insn = unlink_line_notes (insn, next_tail);
4382 if (prev == tail)
4383 abort ();
4384 if (prev == head)
4385 abort ();
4386 if (insn == next_tail)
4387 abort ();
4392 /* Save line number notes for each insn in bb. */
4394 static void
4395 save_line_notes (bb)
4396 int bb;
4398 rtx head, tail;
4399 rtx next_tail;
4401 /* We must use the true line number for the first insn in the block
4402 that was computed and saved at the start of this pass. We can't
4403 use the current line number, because scheduling of the previous
4404 block may have changed the current line number. */
4406 rtx line = line_note_head[BB_TO_BLOCK (bb)];
4407 rtx insn;
4409 get_bb_head_tail (bb, &head, &tail);
4410 next_tail = NEXT_INSN (tail);
4412 for (insn = BLOCK_HEAD (BB_TO_BLOCK (bb));
4413 insn != next_tail;
4414 insn = NEXT_INSN (insn))
4415 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4416 line = insn;
4417 else
4418 LINE_NOTE (insn) = line;
4422 /* After bb was scheduled, insert line notes into the insns list. */
4424 static void
4425 restore_line_notes (bb)
4426 int bb;
4428 rtx line, note, prev, new;
4429 int added_notes = 0;
4430 int b;
4431 rtx head, next_tail, insn;
4433 b = BB_TO_BLOCK (bb);
4435 head = BLOCK_HEAD (b);
4436 next_tail = NEXT_INSN (BLOCK_END (b));
4438 /* Determine the current line-number. We want to know the current
4439 line number of the first insn of the block here, in case it is
4440 different from the true line number that was saved earlier. If
4441 different, then we need a line number note before the first insn
4442 of this block. If it happens to be the same, then we don't want to
4443 emit another line number note here. */
4444 for (line = head; line; line = PREV_INSN (line))
4445 if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
4446 break;
4448 /* Walk the insns keeping track of the current line-number and inserting
4449 the line-number notes as needed. */
4450 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4451 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4452 line = insn;
4453 /* This used to emit line number notes before every non-deleted note.
4454 However, this confuses a debugger, because line notes not separated
4455 by real instructions all end up at the same address. I can find no
4456 use for line number notes before other notes, so none are emitted. */
4457 else if (GET_CODE (insn) != NOTE
4458 && (note = LINE_NOTE (insn)) != 0
4459 && note != line
4460 && (line == 0
4461 || NOTE_LINE_NUMBER (note) != NOTE_LINE_NUMBER (line)
4462 || NOTE_SOURCE_FILE (note) != NOTE_SOURCE_FILE (line)))
4464 line = note;
4465 prev = PREV_INSN (insn);
4466 if (LINE_NOTE (note))
4468 /* Re-use the original line-number note. */
4469 LINE_NOTE (note) = 0;
4470 PREV_INSN (note) = prev;
4471 NEXT_INSN (prev) = note;
4472 PREV_INSN (insn) = note;
4473 NEXT_INSN (note) = insn;
4475 else
4477 added_notes++;
4478 new = emit_note_after (NOTE_LINE_NUMBER (note), prev);
4479 NOTE_SOURCE_FILE (new) = NOTE_SOURCE_FILE (note);
4480 RTX_INTEGRATED_P (new) = RTX_INTEGRATED_P (note);
4483 if (sched_verbose && added_notes)
4484 fprintf (dump, ";; added %d line-number notes\n", added_notes);
4487 /* After scheduling the function, delete redundant line notes from the
4488 insns list. */
4490 static void
4491 rm_redundant_line_notes ()
4493 rtx line = 0;
4494 rtx insn = get_insns ();
4495 int active_insn = 0;
4496 int notes = 0;
4498 /* Walk the insns deleting redundant line-number notes. Many of these
4499 are already present. The remainder tend to occur at basic
4500 block boundaries. */
4501 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4502 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4504 /* If there are no active insns following, INSN is redundant. */
4505 if (active_insn == 0)
4507 notes++;
4508 NOTE_SOURCE_FILE (insn) = 0;
4509 NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
4511 /* If the line number is unchanged, LINE is redundant. */
4512 else if (line
4513 && NOTE_LINE_NUMBER (line) == NOTE_LINE_NUMBER (insn)
4514 && NOTE_SOURCE_FILE (line) == NOTE_SOURCE_FILE (insn))
4516 notes++;
4517 NOTE_SOURCE_FILE (line) = 0;
4518 NOTE_LINE_NUMBER (line) = NOTE_INSN_DELETED;
4519 line = insn;
4521 else
4522 line = insn;
4523 active_insn = 0;
4525 else if (!((GET_CODE (insn) == NOTE
4526 && NOTE_LINE_NUMBER (insn) == NOTE_INSN_DELETED)
4527 || (GET_CODE (insn) == INSN
4528 && (GET_CODE (PATTERN (insn)) == USE
4529 || GET_CODE (PATTERN (insn)) == CLOBBER))))
4530 active_insn++;
4532 if (sched_verbose && notes)
4533 fprintf (dump, ";; deleted %d line-number notes\n", notes);
4536 /* Delete notes between head and tail and put them in the chain
4537 of notes ended by NOTE_LIST. */
4539 static void
4540 rm_other_notes (head, tail)
4541 rtx head;
4542 rtx tail;
4544 rtx next_tail;
4545 rtx insn;
4547 if (head == tail
4548 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
4549 return;
4551 next_tail = NEXT_INSN (tail);
4552 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4554 rtx prev;
4556 /* Farm out notes, and maybe save them in NOTE_LIST.
4557 This is needed to keep the debugger from
4558 getting completely deranged. */
4559 if (GET_CODE (insn) == NOTE)
4561 prev = insn;
4563 insn = unlink_other_notes (insn, next_tail);
4565 if (prev == tail)
4566 abort ();
4567 if (prev == head)
4568 abort ();
4569 if (insn == next_tail)
4570 abort ();
4575 /* Functions for computation of registers live/usage info. */
4577 /* Calculate INSN_REG_WEIGHT for all insns of a block. */
4579 static void
4580 find_insn_reg_weight (b)
4581 int b;
4583 rtx insn, next_tail, head, tail;
4585 get_block_head_tail (b, &head, &tail);
4586 next_tail = NEXT_INSN (tail);
4588 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4590 int reg_weight = 0;
4591 rtx x;
4593 /* Handle register life information. */
4594 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
4595 continue;
4597 /* Increment weight for each register born here. */
4598 x = PATTERN (insn);
4599 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4600 && register_operand (SET_DEST (x), VOIDmode))
4601 reg_weight++;
4602 else if (GET_CODE (x) == PARALLEL)
4604 int j;
4605 for (j = XVECLEN (x, 0) - 1; j >= 0; j--)
4607 x = XVECEXP (PATTERN (insn), 0, j);
4608 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4609 && register_operand (SET_DEST (x), VOIDmode))
4610 reg_weight++;
4614 /* Decrement weight for each register that dies here. */
4615 for (x = REG_NOTES (insn); x; x = XEXP (x, 1))
4617 if (REG_NOTE_KIND (x) == REG_DEAD
4618 || REG_NOTE_KIND (x) == REG_UNUSED)
4619 reg_weight--;
4622 INSN_REG_WEIGHT (insn) = reg_weight;
4626 /* Scheduling clock, modified in schedule_block() and queue_to_ready (). */
4627 static int clock_var;
4629 /* Move insns that became ready to fire from queue to ready list. */
4631 static int
4632 queue_to_ready (ready, n_ready)
4633 rtx ready[];
4634 int n_ready;
4636 rtx insn;
4637 rtx link;
4639 q_ptr = NEXT_Q (q_ptr);
4641 /* Add all pending insns that can be scheduled without stalls to the
4642 ready list. */
4643 for (link = insn_queue[q_ptr]; link; link = XEXP (link, 1))
4646 insn = XEXP (link, 0);
4647 q_size -= 1;
4649 if (sched_verbose >= 2)
4650 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4652 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4653 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4655 ready[n_ready++] = insn;
4656 if (sched_verbose >= 2)
4657 fprintf (dump, "moving to ready without stalls\n");
4659 insn_queue[q_ptr] = 0;
4661 /* If there are no ready insns, stall until one is ready and add all
4662 of the pending insns at that point to the ready list. */
4663 if (n_ready == 0)
4665 register int stalls;
4667 for (stalls = 1; stalls < INSN_QUEUE_SIZE; stalls++)
4669 if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
4671 for (; link; link = XEXP (link, 1))
4673 insn = XEXP (link, 0);
4674 q_size -= 1;
4676 if (sched_verbose >= 2)
4677 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4679 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4680 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4682 ready[n_ready++] = insn;
4683 if (sched_verbose >= 2)
4684 fprintf (dump, "moving to ready with %d stalls\n", stalls);
4686 insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
4688 if (n_ready)
4689 break;
4693 if (sched_verbose && stalls)
4694 visualize_stall_cycles (BB_TO_BLOCK (target_bb), stalls);
4695 q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
4696 clock_var += stalls;
4698 return n_ready;
4701 /* Print the ready list for debugging purposes. Callable from debugger. */
4703 static void
4704 debug_ready_list (ready, n_ready)
4705 rtx ready[];
4706 int n_ready;
4708 int i;
4710 for (i = 0; i < n_ready; i++)
4712 fprintf (dump, " %d", INSN_UID (ready[i]));
4713 if (current_nr_blocks > 1 && INSN_BB (ready[i]) != target_bb)
4714 fprintf (dump, "/b%d", BLOCK_NUM (ready[i]));
4716 fprintf (dump, "\n");
4719 /* Print names of units on which insn can/should execute, for debugging. */
4721 static void
4722 insn_print_units (insn)
4723 rtx insn;
4725 int i;
4726 int unit = insn_unit (insn);
4728 if (unit == -1)
4729 fprintf (dump, "none");
4730 else if (unit >= 0)
4731 fprintf (dump, "%s", function_units[unit].name);
4732 else
4734 fprintf (dump, "[");
4735 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
4736 if (unit & 1)
4738 fprintf (dump, "%s", function_units[i].name);
4739 if (unit != 1)
4740 fprintf (dump, " ");
4742 fprintf (dump, "]");
4746 /* MAX_VISUAL_LINES is the maximum number of lines in visualization table
4747 of a basic block. If more lines are needed, table is splitted to two.
4748 n_visual_lines is the number of lines printed so far for a block.
4749 visual_tbl contains the block visualization info.
4750 vis_no_unit holds insns in a cycle that are not mapped to any unit. */
4751 #define MAX_VISUAL_LINES 100
4752 #define INSN_LEN 30
4753 int n_visual_lines;
4754 char *visual_tbl;
4755 int n_vis_no_unit;
4756 rtx vis_no_unit[10];
4758 /* Finds units that are in use in this fuction. Required only
4759 for visualization. */
4761 static void
4762 init_target_units ()
4764 rtx insn;
4765 int unit;
4767 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4769 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
4770 continue;
4772 unit = insn_unit (insn);
4774 if (unit < 0)
4775 target_units |= ~unit;
4776 else
4777 target_units |= (1 << unit);
4781 /* Return the length of the visualization table. */
4783 static int
4784 get_visual_tbl_length ()
4786 int unit, i;
4787 int n, n1;
4788 char *s;
4790 /* Compute length of one field in line. */
4791 s = (char *) alloca (INSN_LEN + 6);
4792 sprintf (s, " %33s", "uname");
4793 n1 = strlen (s);
4795 /* Compute length of one line. */
4796 n = strlen (";; ");
4797 n += n1;
4798 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
4799 if (function_units[unit].bitmask & target_units)
4800 for (i = 0; i < function_units[unit].multiplicity; i++)
4801 n += n1;
4802 n += n1;
4803 n += strlen ("\n") + 2;
4805 /* Compute length of visualization string. */
4806 return (MAX_VISUAL_LINES * n);
4809 /* Init block visualization debugging info. */
4811 static void
4812 init_block_visualization ()
4814 strcpy (visual_tbl, "");
4815 n_visual_lines = 0;
4816 n_vis_no_unit = 0;
4819 #define BUF_LEN 256
4821 static char *
4822 safe_concat (buf, cur, str)
4823 char *buf;
4824 char *cur;
4825 const char *str;
4827 char *end = buf + BUF_LEN - 2; /* Leave room for null. */
4828 int c;
4830 if (cur > end)
4832 *end = '\0';
4833 return end;
4836 while (cur < end && (c = *str++) != '\0')
4837 *cur++ = c;
4839 *cur = '\0';
4840 return cur;
4843 /* This recognizes rtx, I classified as expressions. These are always
4844 represent some action on values or results of other expression, that
4845 may be stored in objects representing values. */
4847 static void
4848 print_exp (buf, x, verbose)
4849 char *buf;
4850 rtx x;
4851 int verbose;
4853 char tmp[BUF_LEN];
4854 const char *st[4];
4855 char *cur = buf;
4856 const char *fun = (char *)0;
4857 const char *sep;
4858 rtx op[4];
4859 int i;
4861 for (i = 0; i < 4; i++)
4863 st[i] = (char *)0;
4864 op[i] = NULL_RTX;
4867 switch (GET_CODE (x))
4869 case PLUS:
4870 op[0] = XEXP (x, 0);
4871 if (GET_CODE (XEXP (x, 1)) == CONST_INT
4872 && INTVAL (XEXP (x, 1)) < 0)
4874 st[1] = "-";
4875 op[1] = GEN_INT (-INTVAL (XEXP (x, 1)));
4877 else
4879 st[1] = "+";
4880 op[1] = XEXP (x, 1);
4882 break;
4883 case LO_SUM:
4884 op[0] = XEXP (x, 0);
4885 st[1] = "+low(";
4886 op[1] = XEXP (x, 1);
4887 st[2] = ")";
4888 break;
4889 case MINUS:
4890 op[0] = XEXP (x, 0);
4891 st[1] = "-";
4892 op[1] = XEXP (x, 1);
4893 break;
4894 case COMPARE:
4895 fun = "cmp";
4896 op[0] = XEXP (x, 0);
4897 op[1] = XEXP (x, 1);
4898 break;
4899 case NEG:
4900 st[0] = "-";
4901 op[0] = XEXP (x, 0);
4902 break;
4903 case MULT:
4904 op[0] = XEXP (x, 0);
4905 st[1] = "*";
4906 op[1] = XEXP (x, 1);
4907 break;
4908 case DIV:
4909 op[0] = XEXP (x, 0);
4910 st[1] = "/";
4911 op[1] = XEXP (x, 1);
4912 break;
4913 case UDIV:
4914 fun = "udiv";
4915 op[0] = XEXP (x, 0);
4916 op[1] = XEXP (x, 1);
4917 break;
4918 case MOD:
4919 op[0] = XEXP (x, 0);
4920 st[1] = "%";
4921 op[1] = XEXP (x, 1);
4922 break;
4923 case UMOD:
4924 fun = "umod";
4925 op[0] = XEXP (x, 0);
4926 op[1] = XEXP (x, 1);
4927 break;
4928 case SMIN:
4929 fun = "smin";
4930 op[0] = XEXP (x, 0);
4931 op[1] = XEXP (x, 1);
4932 break;
4933 case SMAX:
4934 fun = "smax";
4935 op[0] = XEXP (x, 0);
4936 op[1] = XEXP (x, 1);
4937 break;
4938 case UMIN:
4939 fun = "umin";
4940 op[0] = XEXP (x, 0);
4941 op[1] = XEXP (x, 1);
4942 break;
4943 case UMAX:
4944 fun = "umax";
4945 op[0] = XEXP (x, 0);
4946 op[1] = XEXP (x, 1);
4947 break;
4948 case NOT:
4949 st[0] = "!";
4950 op[0] = XEXP (x, 0);
4951 break;
4952 case AND:
4953 op[0] = XEXP (x, 0);
4954 st[1] = "&";
4955 op[1] = XEXP (x, 1);
4956 break;
4957 case IOR:
4958 op[0] = XEXP (x, 0);
4959 st[1] = "|";
4960 op[1] = XEXP (x, 1);
4961 break;
4962 case XOR:
4963 op[0] = XEXP (x, 0);
4964 st[1] = "^";
4965 op[1] = XEXP (x, 1);
4966 break;
4967 case ASHIFT:
4968 op[0] = XEXP (x, 0);
4969 st[1] = "<<";
4970 op[1] = XEXP (x, 1);
4971 break;
4972 case LSHIFTRT:
4973 op[0] = XEXP (x, 0);
4974 st[1] = " 0>>";
4975 op[1] = XEXP (x, 1);
4976 break;
4977 case ASHIFTRT:
4978 op[0] = XEXP (x, 0);
4979 st[1] = ">>";
4980 op[1] = XEXP (x, 1);
4981 break;
4982 case ROTATE:
4983 op[0] = XEXP (x, 0);
4984 st[1] = "<-<";
4985 op[1] = XEXP (x, 1);
4986 break;
4987 case ROTATERT:
4988 op[0] = XEXP (x, 0);
4989 st[1] = ">->";
4990 op[1] = XEXP (x, 1);
4991 break;
4992 case ABS:
4993 fun = "abs";
4994 op[0] = XEXP (x, 0);
4995 break;
4996 case SQRT:
4997 fun = "sqrt";
4998 op[0] = XEXP (x, 0);
4999 break;
5000 case FFS:
5001 fun = "ffs";
5002 op[0] = XEXP (x, 0);
5003 break;
5004 case EQ:
5005 op[0] = XEXP (x, 0);
5006 st[1] = "==";
5007 op[1] = XEXP (x, 1);
5008 break;
5009 case NE:
5010 op[0] = XEXP (x, 0);
5011 st[1] = "!=";
5012 op[1] = XEXP (x, 1);
5013 break;
5014 case GT:
5015 op[0] = XEXP (x, 0);
5016 st[1] = ">";
5017 op[1] = XEXP (x, 1);
5018 break;
5019 case GTU:
5020 fun = "gtu";
5021 op[0] = XEXP (x, 0);
5022 op[1] = XEXP (x, 1);
5023 break;
5024 case LT:
5025 op[0] = XEXP (x, 0);
5026 st[1] = "<";
5027 op[1] = XEXP (x, 1);
5028 break;
5029 case LTU:
5030 fun = "ltu";
5031 op[0] = XEXP (x, 0);
5032 op[1] = XEXP (x, 1);
5033 break;
5034 case GE:
5035 op[0] = XEXP (x, 0);
5036 st[1] = ">=";
5037 op[1] = XEXP (x, 1);
5038 break;
5039 case GEU:
5040 fun = "geu";
5041 op[0] = XEXP (x, 0);
5042 op[1] = XEXP (x, 1);
5043 break;
5044 case LE:
5045 op[0] = XEXP (x, 0);
5046 st[1] = "<=";
5047 op[1] = XEXP (x, 1);
5048 break;
5049 case LEU:
5050 fun = "leu";
5051 op[0] = XEXP (x, 0);
5052 op[1] = XEXP (x, 1);
5053 break;
5054 case SIGN_EXTRACT:
5055 fun = (verbose) ? "sign_extract" : "sxt";
5056 op[0] = XEXP (x, 0);
5057 op[1] = XEXP (x, 1);
5058 op[2] = XEXP (x, 2);
5059 break;
5060 case ZERO_EXTRACT:
5061 fun = (verbose) ? "zero_extract" : "zxt";
5062 op[0] = XEXP (x, 0);
5063 op[1] = XEXP (x, 1);
5064 op[2] = XEXP (x, 2);
5065 break;
5066 case SIGN_EXTEND:
5067 fun = (verbose) ? "sign_extend" : "sxn";
5068 op[0] = XEXP (x, 0);
5069 break;
5070 case ZERO_EXTEND:
5071 fun = (verbose) ? "zero_extend" : "zxn";
5072 op[0] = XEXP (x, 0);
5073 break;
5074 case FLOAT_EXTEND:
5075 fun = (verbose) ? "float_extend" : "fxn";
5076 op[0] = XEXP (x, 0);
5077 break;
5078 case TRUNCATE:
5079 fun = (verbose) ? "trunc" : "trn";
5080 op[0] = XEXP (x, 0);
5081 break;
5082 case FLOAT_TRUNCATE:
5083 fun = (verbose) ? "float_trunc" : "ftr";
5084 op[0] = XEXP (x, 0);
5085 break;
5086 case FLOAT:
5087 fun = (verbose) ? "float" : "flt";
5088 op[0] = XEXP (x, 0);
5089 break;
5090 case UNSIGNED_FLOAT:
5091 fun = (verbose) ? "uns_float" : "ufl";
5092 op[0] = XEXP (x, 0);
5093 break;
5094 case FIX:
5095 fun = "fix";
5096 op[0] = XEXP (x, 0);
5097 break;
5098 case UNSIGNED_FIX:
5099 fun = (verbose) ? "uns_fix" : "ufx";
5100 op[0] = XEXP (x, 0);
5101 break;
5102 case PRE_DEC:
5103 st[0] = "--";
5104 op[0] = XEXP (x, 0);
5105 break;
5106 case PRE_INC:
5107 st[0] = "++";
5108 op[0] = XEXP (x, 0);
5109 break;
5110 case POST_DEC:
5111 op[0] = XEXP (x, 0);
5112 st[1] = "--";
5113 break;
5114 case POST_INC:
5115 op[0] = XEXP (x, 0);
5116 st[1] = "++";
5117 break;
5118 case CALL:
5119 st[0] = "call ";
5120 op[0] = XEXP (x, 0);
5121 if (verbose)
5123 st[1] = " argc:";
5124 op[1] = XEXP (x, 1);
5126 break;
5127 case IF_THEN_ELSE:
5128 st[0] = "{(";
5129 op[0] = XEXP (x, 0);
5130 st[1] = ")?";
5131 op[1] = XEXP (x, 1);
5132 st[2] = ":";
5133 op[2] = XEXP (x, 2);
5134 st[3] = "}";
5135 break;
5136 case TRAP_IF:
5137 fun = "trap_if";
5138 op[0] = TRAP_CONDITION (x);
5139 break;
5140 case UNSPEC:
5141 case UNSPEC_VOLATILE:
5143 cur = safe_concat (buf, cur, "unspec");
5144 if (GET_CODE (x) == UNSPEC_VOLATILE)
5145 cur = safe_concat (buf, cur, "/v");
5146 cur = safe_concat (buf, cur, "[");
5147 sep = "";
5148 for (i = 0; i < XVECLEN (x, 0); i++)
5150 print_pattern (tmp, XVECEXP (x, 0, i), verbose);
5151 cur = safe_concat (buf, cur, sep);
5152 cur = safe_concat (buf, cur, tmp);
5153 sep = ",";
5155 cur = safe_concat (buf, cur, "] ");
5156 sprintf (tmp, "%d", XINT (x, 1));
5157 cur = safe_concat (buf, cur, tmp);
5159 break;
5160 default:
5161 /* If (verbose) debug_rtx (x); */
5162 st[0] = GET_RTX_NAME (GET_CODE (x));
5163 break;
5166 /* Print this as a function? */
5167 if (fun)
5169 cur = safe_concat (buf, cur, fun);
5170 cur = safe_concat (buf, cur, "(");
5173 for (i = 0; i < 4; i++)
5175 if (st[i])
5176 cur = safe_concat (buf, cur, st[i]);
5178 if (op[i])
5180 if (fun && i != 0)
5181 cur = safe_concat (buf, cur, ",");
5183 print_value (tmp, op[i], verbose);
5184 cur = safe_concat (buf, cur, tmp);
5188 if (fun)
5189 cur = safe_concat (buf, cur, ")");
5190 } /* print_exp */
5192 /* Prints rtxes, I customly classified as values. They're constants,
5193 registers, labels, symbols and memory accesses. */
5195 static void
5196 print_value (buf, x, verbose)
5197 char *buf;
5198 rtx x;
5199 int verbose;
5201 char t[BUF_LEN];
5202 char *cur = buf;
5204 switch (GET_CODE (x))
5206 case CONST_INT:
5207 sprintf (t, HOST_WIDE_INT_PRINT_HEX, INTVAL (x));
5208 cur = safe_concat (buf, cur, t);
5209 break;
5210 case CONST_DOUBLE:
5211 sprintf (t, "<0x%lx,0x%lx>", (long)XWINT (x, 2), (long)XWINT (x, 3));
5212 cur = safe_concat (buf, cur, t);
5213 break;
5214 case CONST_STRING:
5215 cur = safe_concat (buf, cur, "\"");
5216 cur = safe_concat (buf, cur, XSTR (x, 0));
5217 cur = safe_concat (buf, cur, "\"");
5218 break;
5219 case SYMBOL_REF:
5220 cur = safe_concat (buf, cur, "`");
5221 cur = safe_concat (buf, cur, XSTR (x, 0));
5222 cur = safe_concat (buf, cur, "'");
5223 break;
5224 case LABEL_REF:
5225 sprintf (t, "L%d", INSN_UID (XEXP (x, 0)));
5226 cur = safe_concat (buf, cur, t);
5227 break;
5228 case CONST:
5229 print_value (t, XEXP (x, 0), verbose);
5230 cur = safe_concat (buf, cur, "const(");
5231 cur = safe_concat (buf, cur, t);
5232 cur = safe_concat (buf, cur, ")");
5233 break;
5234 case HIGH:
5235 print_value (t, XEXP (x, 0), verbose);
5236 cur = safe_concat (buf, cur, "high(");
5237 cur = safe_concat (buf, cur, t);
5238 cur = safe_concat (buf, cur, ")");
5239 break;
5240 case REG:
5241 if (REGNO (x) < FIRST_PSEUDO_REGISTER)
5243 int c = reg_names[ REGNO (x) ][0];
5244 if (c >= '0' && c <= '9')
5245 cur = safe_concat (buf, cur, "%");
5247 cur = safe_concat (buf, cur, reg_names[ REGNO (x) ]);
5249 else
5251 sprintf (t, "r%d", REGNO (x));
5252 cur = safe_concat (buf, cur, t);
5254 break;
5255 case SUBREG:
5256 print_value (t, SUBREG_REG (x), verbose);
5257 cur = safe_concat (buf, cur, t);
5258 sprintf (t, "#%d", SUBREG_WORD (x));
5259 cur = safe_concat (buf, cur, t);
5260 break;
5261 case SCRATCH:
5262 cur = safe_concat (buf, cur, "scratch");
5263 break;
5264 case CC0:
5265 cur = safe_concat (buf, cur, "cc0");
5266 break;
5267 case PC:
5268 cur = safe_concat (buf, cur, "pc");
5269 break;
5270 case MEM:
5271 print_value (t, XEXP (x, 0), verbose);
5272 cur = safe_concat (buf, cur, "[");
5273 cur = safe_concat (buf, cur, t);
5274 cur = safe_concat (buf, cur, "]");
5275 break;
5276 default:
5277 print_exp (t, x, verbose);
5278 cur = safe_concat (buf, cur, t);
5279 break;
5281 } /* print_value */
5283 /* The next step in insn detalization, its pattern recognition. */
5285 static void
5286 print_pattern (buf, x, verbose)
5287 char *buf;
5288 rtx x;
5289 int verbose;
5291 char t1[BUF_LEN], t2[BUF_LEN], t3[BUF_LEN];
5293 switch (GET_CODE (x))
5295 case SET:
5296 print_value (t1, SET_DEST (x), verbose);
5297 print_value (t2, SET_SRC (x), verbose);
5298 sprintf (buf, "%s=%s", t1, t2);
5299 break;
5300 case RETURN:
5301 sprintf (buf, "return");
5302 break;
5303 case CALL:
5304 print_exp (buf, x, verbose);
5305 break;
5306 case CLOBBER:
5307 print_value (t1, XEXP (x, 0), verbose);
5308 sprintf (buf, "clobber %s", t1);
5309 break;
5310 case USE:
5311 print_value (t1, XEXP (x, 0), verbose);
5312 sprintf (buf, "use %s", t1);
5313 break;
5314 case PARALLEL:
5316 int i;
5318 sprintf (t1, "{");
5319 for (i = 0; i < XVECLEN (x, 0); i++)
5321 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5322 sprintf (t3, "%s%s;", t1, t2);
5323 strcpy (t1, t3);
5325 sprintf (buf, "%s}", t1);
5327 break;
5328 case SEQUENCE:
5330 int i;
5332 sprintf (t1, "%%{");
5333 for (i = 0; i < XVECLEN (x, 0); i++)
5335 print_insn (t2, XVECEXP (x, 0, i), verbose);
5336 sprintf (t3, "%s%s;", t1, t2);
5337 strcpy (t1, t3);
5339 sprintf (buf, "%s%%}", t1);
5341 break;
5342 case ASM_INPUT:
5343 sprintf (buf, "asm {%s}", XSTR (x, 0));
5344 break;
5345 case ADDR_VEC:
5346 break;
5347 case ADDR_DIFF_VEC:
5348 print_value (buf, XEXP (x, 0), verbose);
5349 break;
5350 case TRAP_IF:
5351 print_value (t1, TRAP_CONDITION (x), verbose);
5352 sprintf (buf, "trap_if %s", t1);
5353 break;
5354 case UNSPEC:
5356 int i;
5358 sprintf (t1, "unspec{");
5359 for (i = 0; i < XVECLEN (x, 0); i++)
5361 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5362 sprintf (t3, "%s%s;", t1, t2);
5363 strcpy (t1, t3);
5365 sprintf (buf, "%s}", t1);
5367 break;
5368 case UNSPEC_VOLATILE:
5370 int i;
5372 sprintf (t1, "unspec/v{");
5373 for (i = 0; i < XVECLEN (x, 0); i++)
5375 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5376 sprintf (t3, "%s%s;", t1, t2);
5377 strcpy (t1, t3);
5379 sprintf (buf, "%s}", t1);
5381 break;
5382 default:
5383 print_value (buf, x, verbose);
5385 } /* print_pattern */
5387 /* This is the main function in rtl visualization mechanism. It
5388 accepts an rtx and tries to recognize it as an insn, then prints it
5389 properly in human readable form, resembling assembler mnemonics.
5390 For every insn it prints its UID and BB the insn belongs too.
5391 (Probably the last "option" should be extended somehow, since it
5392 depends now on sched.c inner variables ...) */
5394 static void
5395 print_insn (buf, x, verbose)
5396 char *buf;
5397 rtx x;
5398 int verbose;
5400 char t[BUF_LEN];
5401 rtx insn = x;
5403 switch (GET_CODE (x))
5405 case INSN:
5406 print_pattern (t, PATTERN (x), verbose);
5407 if (verbose)
5408 sprintf (buf, "b%d: i% 4d: %s", INSN_BB (x),
5409 INSN_UID (x), t);
5410 else
5411 sprintf (buf, "%-4d %s", INSN_UID (x), t);
5412 break;
5413 case JUMP_INSN:
5414 print_pattern (t, PATTERN (x), verbose);
5415 if (verbose)
5416 sprintf (buf, "b%d: i% 4d: jump %s", INSN_BB (x),
5417 INSN_UID (x), t);
5418 else
5419 sprintf (buf, "%-4d %s", INSN_UID (x), t);
5420 break;
5421 case CALL_INSN:
5422 x = PATTERN (insn);
5423 if (GET_CODE (x) == PARALLEL)
5425 x = XVECEXP (x, 0, 0);
5426 print_pattern (t, x, verbose);
5428 else
5429 strcpy (t, "call <...>");
5430 if (verbose)
5431 sprintf (buf, "b%d: i% 4d: %s", INSN_BB (insn),
5432 INSN_UID (insn), t);
5433 else
5434 sprintf (buf, "%-4d %s", INSN_UID (insn), t);
5435 break;
5436 case CODE_LABEL:
5437 sprintf (buf, "L%d:", INSN_UID (x));
5438 break;
5439 case BARRIER:
5440 sprintf (buf, "i% 4d: barrier", INSN_UID (x));
5441 break;
5442 case NOTE:
5443 if (NOTE_LINE_NUMBER (x) > 0)
5444 sprintf (buf, "%4d note \"%s\" %d", INSN_UID (x),
5445 NOTE_SOURCE_FILE (x), NOTE_LINE_NUMBER (x));
5446 else
5447 sprintf (buf, "%4d %s", INSN_UID (x),
5448 GET_NOTE_INSN_NAME (NOTE_LINE_NUMBER (x)));
5449 break;
5450 default:
5451 if (verbose)
5453 sprintf (buf, "Not an INSN at all\n");
5454 debug_rtx (x);
5456 else
5457 sprintf (buf, "i%-4d <What?>", INSN_UID (x));
5459 } /* print_insn */
5461 /* Print visualization debugging info. */
5463 static void
5464 print_block_visualization (b, s)
5465 int b;
5466 const char *s;
5468 int unit, i;
5470 /* Print header. */
5471 fprintf (dump, "\n;; ==================== scheduling visualization for block %d %s \n", b, s);
5473 /* Print names of units. */
5474 fprintf (dump, ";; %-8s", "clock");
5475 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5476 if (function_units[unit].bitmask & target_units)
5477 for (i = 0; i < function_units[unit].multiplicity; i++)
5478 fprintf (dump, " %-33s", function_units[unit].name);
5479 fprintf (dump, " %-8s\n", "no-unit");
5481 fprintf (dump, ";; %-8s", "=====");
5482 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5483 if (function_units[unit].bitmask & target_units)
5484 for (i = 0; i < function_units[unit].multiplicity; i++)
5485 fprintf (dump, " %-33s", "==============================");
5486 fprintf (dump, " %-8s\n", "=======");
5488 /* Print insns in each cycle. */
5489 fprintf (dump, "%s\n", visual_tbl);
5492 /* Print insns in the 'no_unit' column of visualization. */
5494 static void
5495 visualize_no_unit (insn)
5496 rtx insn;
5498 vis_no_unit[n_vis_no_unit] = insn;
5499 n_vis_no_unit++;
5502 /* Print insns scheduled in clock, for visualization. */
5504 static void
5505 visualize_scheduled_insns (b, clock)
5506 int b, clock;
5508 int i, unit;
5510 /* If no more room, split table into two. */
5511 if (n_visual_lines >= MAX_VISUAL_LINES)
5513 print_block_visualization (b, "(incomplete)");
5514 init_block_visualization ();
5517 n_visual_lines++;
5519 sprintf (visual_tbl + strlen (visual_tbl), ";; %-8d", clock);
5520 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5521 if (function_units[unit].bitmask & target_units)
5522 for (i = 0; i < function_units[unit].multiplicity; i++)
5524 int instance = unit + i * FUNCTION_UNITS_SIZE;
5525 rtx insn = unit_last_insn[instance];
5527 /* Print insns that still keep the unit busy. */
5528 if (insn &&
5529 actual_hazard_this_instance (unit, instance, insn, clock, 0))
5531 char str[BUF_LEN];
5532 print_insn (str, insn, 0);
5533 str[INSN_LEN] = '\0';
5534 sprintf (visual_tbl + strlen (visual_tbl), " %-33s", str);
5536 else
5537 sprintf (visual_tbl + strlen (visual_tbl), " %-33s", "------------------------------");
5540 /* Print insns that are not assigned to any unit. */
5541 for (i = 0; i < n_vis_no_unit; i++)
5542 sprintf (visual_tbl + strlen (visual_tbl), " %-8d",
5543 INSN_UID (vis_no_unit[i]));
5544 n_vis_no_unit = 0;
5546 sprintf (visual_tbl + strlen (visual_tbl), "\n");
5549 /* Print stalled cycles. */
5551 static void
5552 visualize_stall_cycles (b, stalls)
5553 int b, stalls;
5555 int i;
5557 /* If no more room, split table into two. */
5558 if (n_visual_lines >= MAX_VISUAL_LINES)
5560 print_block_visualization (b, "(incomplete)");
5561 init_block_visualization ();
5564 n_visual_lines++;
5566 sprintf (visual_tbl + strlen (visual_tbl), ";; ");
5567 for (i = 0; i < stalls; i++)
5568 sprintf (visual_tbl + strlen (visual_tbl), ".");
5569 sprintf (visual_tbl + strlen (visual_tbl), "\n");
5572 /* move_insn1: Remove INSN from insn chain, and link it after LAST insn. */
5574 static rtx
5575 move_insn1 (insn, last)
5576 rtx insn, last;
5578 NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
5579 PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
5581 NEXT_INSN (insn) = NEXT_INSN (last);
5582 PREV_INSN (NEXT_INSN (last)) = insn;
5584 NEXT_INSN (last) = insn;
5585 PREV_INSN (insn) = last;
5587 return insn;
5590 /* Search INSN for REG_SAVE_NOTE note pairs for NOTE_INSN_SETJMP,
5591 NOTE_INSN_{LOOP,EHREGION}_{BEG,END}; and convert them back into
5592 NOTEs. The REG_SAVE_NOTE note following first one is contains the
5593 saved value for NOTE_BLOCK_NUMBER which is useful for
5594 NOTE_INSN_EH_REGION_{BEG,END} NOTEs. LAST is the last instruction
5595 output by the instruction scheduler. Return the new value of LAST. */
5597 static rtx
5598 reemit_notes (insn, last)
5599 rtx insn;
5600 rtx last;
5602 rtx note, retval;
5604 retval = last;
5605 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
5607 if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
5609 int note_type = INTVAL (XEXP (note, 0));
5610 if (note_type == NOTE_INSN_SETJMP)
5612 retval = emit_note_after (NOTE_INSN_SETJMP, insn);
5613 CONST_CALL_P (retval) = CONST_CALL_P (note);
5614 remove_note (insn, note);
5615 note = XEXP (note, 1);
5617 else if (note_type == NOTE_INSN_RANGE_START
5618 || note_type == NOTE_INSN_RANGE_END)
5620 last = emit_note_before (note_type, last);
5621 remove_note (insn, note);
5622 note = XEXP (note, 1);
5623 NOTE_RANGE_INFO (last) = XEXP (note, 0);
5625 else
5627 last = emit_note_before (note_type, last);
5628 remove_note (insn, note);
5629 note = XEXP (note, 1);
5630 if (note_type == NOTE_INSN_EH_REGION_BEG
5631 || note_type == NOTE_INSN_EH_REGION_END)
5632 NOTE_EH_HANDLER (last) = INTVAL (XEXP (note, 0));
5634 remove_note (insn, note);
5637 return retval;
5640 /* Move INSN, and all insns which should be issued before it,
5641 due to SCHED_GROUP_P flag. Reemit notes if needed.
5643 Return the last insn emitted by the scheduler, which is the
5644 return value from the first call to reemit_notes. */
5646 static rtx
5647 move_insn (insn, last)
5648 rtx insn, last;
5650 rtx retval = NULL;
5652 /* If INSN has SCHED_GROUP_P set, then issue it and any other
5653 insns with SCHED_GROUP_P set first. */
5654 while (SCHED_GROUP_P (insn))
5656 rtx prev = PREV_INSN (insn);
5658 /* Move a SCHED_GROUP_P insn. */
5659 move_insn1 (insn, last);
5660 /* If this is the first call to reemit_notes, then record
5661 its return value. */
5662 if (retval == NULL_RTX)
5663 retval = reemit_notes (insn, insn);
5664 else
5665 reemit_notes (insn, insn);
5666 insn = prev;
5669 /* Now move the first non SCHED_GROUP_P insn. */
5670 move_insn1 (insn, last);
5672 /* If this is the first call to reemit_notes, then record
5673 its return value. */
5674 if (retval == NULL_RTX)
5675 retval = reemit_notes (insn, insn);
5676 else
5677 reemit_notes (insn, insn);
5679 return retval;
5682 /* Return an insn which represents a SCHED_GROUP, which is
5683 the last insn in the group. */
5685 static rtx
5686 group_leader (insn)
5687 rtx insn;
5689 rtx prev;
5693 prev = insn;
5694 insn = next_nonnote_insn (insn);
5696 while (insn && SCHED_GROUP_P (insn) && (GET_CODE (insn) != CODE_LABEL));
5698 return prev;
5701 /* Use forward list scheduling to rearrange insns of block BB in region RGN,
5702 possibly bringing insns from subsequent blocks in the same region.
5703 Return number of insns scheduled. */
5705 static int
5706 schedule_block (bb, rgn_n_insns)
5707 int bb;
5708 int rgn_n_insns;
5710 /* Local variables. */
5711 rtx insn, last;
5712 rtx *ready;
5713 int n_ready = 0;
5714 int can_issue_more;
5716 /* Flow block of this bb. */
5717 int b = BB_TO_BLOCK (bb);
5719 /* target_n_insns == number of insns in b before scheduling starts.
5720 sched_target_n_insns == how many of b's insns were scheduled.
5721 sched_n_insns == how many insns were scheduled in b. */
5722 int target_n_insns = 0;
5723 int sched_target_n_insns = 0;
5724 int sched_n_insns = 0;
5726 #define NEED_NOTHING 0
5727 #define NEED_HEAD 1
5728 #define NEED_TAIL 2
5729 int new_needs;
5731 /* Head/tail info for this block. */
5732 rtx prev_head;
5733 rtx next_tail;
5734 rtx head;
5735 rtx tail;
5736 int bb_src;
5738 /* We used to have code to avoid getting parameters moved from hard
5739 argument registers into pseudos.
5741 However, it was removed when it proved to be of marginal benefit
5742 and caused problems because schedule_block and compute_forward_dependences
5743 had different notions of what the "head" insn was. */
5744 get_bb_head_tail (bb, &head, &tail);
5746 /* Interblock scheduling could have moved the original head insn from this
5747 block into a proceeding block. This may also cause schedule_block and
5748 compute_forward_dependences to have different notions of what the
5749 "head" insn was.
5751 If the interblock movement happened to make this block start with
5752 some notes (LOOP, EH or SETJMP) before the first real insn, then
5753 HEAD will have various special notes attached to it which must be
5754 removed so that we don't end up with extra copies of the notes. */
5755 if (GET_RTX_CLASS (GET_CODE (head)) == 'i')
5757 rtx note;
5759 for (note = REG_NOTES (head); note; note = XEXP (note, 1))
5760 if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
5761 remove_note (head, note);
5764 next_tail = NEXT_INSN (tail);
5765 prev_head = PREV_INSN (head);
5767 /* If the only insn left is a NOTE or a CODE_LABEL, then there is no need
5768 to schedule this block. */
5769 if (head == tail
5770 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
5771 return (sched_n_insns);
5773 /* Debug info. */
5774 if (sched_verbose)
5776 fprintf (dump, ";; ======================================================\n");
5777 fprintf (dump,
5778 ";; -- basic block %d from %d to %d -- %s reload\n",
5779 b, INSN_UID (BLOCK_HEAD (b)), INSN_UID (BLOCK_END (b)),
5780 (reload_completed ? "after" : "before"));
5781 fprintf (dump, ";; ======================================================\n");
5782 fprintf (dump, "\n");
5784 visual_tbl = (char *) alloca (get_visual_tbl_length ());
5785 init_block_visualization ();
5788 /* Remove remaining note insns from the block, save them in
5789 note_list. These notes are restored at the end of
5790 schedule_block (). */
5791 note_list = 0;
5792 rm_other_notes (head, tail);
5794 target_bb = bb;
5796 /* Prepare current target block info. */
5797 if (current_nr_blocks > 1)
5799 candidate_table = (candidate *) xmalloc (current_nr_blocks
5800 * sizeof (candidate));
5802 bblst_last = 0;
5803 /* ??? It is not clear why bblst_size is computed this way. The original
5804 number was clearly too small as it resulted in compiler failures.
5805 Multiplying by the original number by 2 (to account for update_bbs
5806 members) seems to be a reasonable solution. */
5807 /* ??? Or perhaps there is a bug somewhere else in this file? */
5808 bblst_size = (current_nr_blocks - bb) * rgn_nr_edges * 2;
5809 bblst_table = (int *) xmalloc (bblst_size * sizeof (int));
5811 bitlst_table_last = 0;
5812 bitlst_table_size = rgn_nr_edges;
5813 bitlst_table = (int *) xmalloc (rgn_nr_edges * sizeof (int));
5815 compute_trg_info (bb);
5818 clear_units ();
5820 /* Allocate the ready list. */
5821 ready = (rtx *) xmalloc ((rgn_n_insns + 1) * sizeof (rtx));
5823 /* Print debugging information. */
5824 if (sched_verbose >= 5)
5825 debug_dependencies ();
5828 /* Initialize ready list with all 'ready' insns in target block.
5829 Count number of insns in the target block being scheduled. */
5830 n_ready = 0;
5831 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
5833 rtx next;
5835 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
5836 continue;
5837 next = NEXT_INSN (insn);
5839 if (INSN_DEP_COUNT (insn) == 0
5840 && (SCHED_GROUP_P (next) == 0 || GET_RTX_CLASS (GET_CODE (next)) != 'i'))
5841 ready[n_ready++] = insn;
5842 if (!(SCHED_GROUP_P (insn)))
5843 target_n_insns++;
5846 /* Add to ready list all 'ready' insns in valid source blocks.
5847 For speculative insns, check-live, exception-free, and
5848 issue-delay. */
5849 for (bb_src = bb + 1; bb_src < current_nr_blocks; bb_src++)
5850 if (IS_VALID (bb_src))
5852 rtx src_head;
5853 rtx src_next_tail;
5854 rtx tail, head;
5856 get_bb_head_tail (bb_src, &head, &tail);
5857 src_next_tail = NEXT_INSN (tail);
5858 src_head = head;
5860 if (head == tail
5861 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
5862 continue;
5864 for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn))
5866 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
5867 continue;
5869 if (!CANT_MOVE (insn)
5870 && (!IS_SPECULATIVE_INSN (insn)
5871 || (insn_issue_delay (insn) <= 3
5872 && check_live (insn, bb_src)
5873 && is_exception_free (insn, bb_src, target_bb))))
5875 rtx next;
5877 /* Note that we havn't squirrled away the notes for
5878 blocks other than the current. So if this is a
5879 speculative insn, NEXT might otherwise be a note. */
5880 next = next_nonnote_insn (insn);
5881 if (INSN_DEP_COUNT (insn) == 0
5882 && (! next
5883 || SCHED_GROUP_P (next) == 0
5884 || GET_RTX_CLASS (GET_CODE (next)) != 'i'))
5885 ready[n_ready++] = insn;
5890 #ifdef MD_SCHED_INIT
5891 MD_SCHED_INIT (dump, sched_verbose);
5892 #endif
5894 /* No insns scheduled in this block yet. */
5895 last_scheduled_insn = 0;
5897 /* Q_SIZE is the total number of insns in the queue. */
5898 q_ptr = 0;
5899 q_size = 0;
5900 last_clock_var = 0;
5901 bzero ((char *) insn_queue, sizeof (insn_queue));
5903 /* Start just before the beginning of time. */
5904 clock_var = -1;
5906 /* We start inserting insns after PREV_HEAD. */
5907 last = prev_head;
5909 /* Initialize INSN_QUEUE, LIST and NEW_NEEDS. */
5910 new_needs = (NEXT_INSN (prev_head) == BLOCK_HEAD (b)
5911 ? NEED_HEAD : NEED_NOTHING);
5912 if (PREV_INSN (next_tail) == BLOCK_END (b))
5913 new_needs |= NEED_TAIL;
5915 /* Loop until all the insns in BB are scheduled. */
5916 while (sched_target_n_insns < target_n_insns)
5918 clock_var++;
5920 /* Add to the ready list all pending insns that can be issued now.
5921 If there are no ready insns, increment clock until one
5922 is ready and add all pending insns at that point to the ready
5923 list. */
5924 n_ready = queue_to_ready (ready, n_ready);
5926 if (n_ready == 0)
5927 abort ();
5929 if (sched_verbose >= 2)
5931 fprintf (dump, ";;\t\tReady list after queue_to_ready: ");
5932 debug_ready_list (ready, n_ready);
5935 /* Sort the ready list based on priority. */
5936 SCHED_SORT (ready, n_ready);
5938 /* Allow the target to reorder the list, typically for
5939 better instruction bundling. */
5940 #ifdef MD_SCHED_REORDER
5941 MD_SCHED_REORDER (dump, sched_verbose, ready, n_ready, clock_var,
5942 can_issue_more);
5943 #else
5944 can_issue_more = issue_rate;
5945 #endif
5947 if (sched_verbose)
5949 fprintf (dump, "\n;;\tReady list (t =%3d): ", clock_var);
5950 debug_ready_list (ready, n_ready);
5953 /* Issue insns from ready list. */
5954 while (n_ready != 0 && can_issue_more)
5956 /* Select and remove the insn from the ready list. */
5957 rtx insn = ready[--n_ready];
5958 int cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
5960 if (cost >= 1)
5962 queue_insn (insn, cost);
5963 continue;
5966 /* An interblock motion? */
5967 if (INSN_BB (insn) != target_bb)
5969 rtx temp;
5970 basic_block b1;
5972 if (IS_SPECULATIVE_INSN (insn))
5974 if (!check_live (insn, INSN_BB (insn)))
5975 continue;
5976 update_live (insn, INSN_BB (insn));
5978 /* For speculative load, mark insns fed by it. */
5979 if (IS_LOAD_INSN (insn) || FED_BY_SPEC_LOAD (insn))
5980 set_spec_fed (insn);
5982 nr_spec++;
5984 nr_inter++;
5986 /* Find the beginning of the scheduling group. */
5987 /* ??? Ought to update basic block here, but later bits of
5988 schedule_block assumes the original insn block is
5989 still intact. */
5991 temp = insn;
5992 while (SCHED_GROUP_P (temp))
5993 temp = PREV_INSN (temp);
5995 /* Update source block boundaries. */
5996 b1 = BLOCK_FOR_INSN (temp);
5997 if (temp == b1->head && insn == b1->end)
5999 /* We moved all the insns in the basic block.
6000 Emit a note after the last insn and update the
6001 begin/end boundaries to point to the note. */
6002 rtx note = emit_note_after (NOTE_INSN_DELETED, insn);
6003 b1->head = note;
6004 b1->end = note;
6006 else if (insn == b1->end)
6008 /* We took insns from the end of the basic block,
6009 so update the end of block boundary so that it
6010 points to the first insn we did not move. */
6011 b1->end = PREV_INSN (temp);
6013 else if (temp == b1->head)
6015 /* We took insns from the start of the basic block,
6016 so update the start of block boundary so that
6017 it points to the first insn we did not move. */
6018 b1->head = NEXT_INSN (insn);
6021 else
6023 /* In block motion. */
6024 sched_target_n_insns++;
6027 last_scheduled_insn = insn;
6028 last = move_insn (insn, last);
6029 sched_n_insns++;
6031 #ifdef MD_SCHED_VARIABLE_ISSUE
6032 MD_SCHED_VARIABLE_ISSUE (dump, sched_verbose, insn,
6033 can_issue_more);
6034 #else
6035 can_issue_more--;
6036 #endif
6038 n_ready = schedule_insn (insn, ready, n_ready, clock_var);
6040 /* Close this block after scheduling its jump. */
6041 if (GET_CODE (last_scheduled_insn) == JUMP_INSN)
6042 break;
6045 /* Debug info. */
6046 if (sched_verbose)
6047 visualize_scheduled_insns (b, clock_var);
6050 /* Debug info. */
6051 if (sched_verbose)
6053 fprintf (dump, ";;\tReady list (final): ");
6054 debug_ready_list (ready, n_ready);
6055 print_block_visualization (b, "");
6058 /* Sanity check -- queue must be empty now. Meaningless if region has
6059 multiple bbs. */
6060 if (current_nr_blocks > 1)
6061 if (!flag_schedule_interblock && q_size != 0)
6062 abort ();
6064 /* Update head/tail boundaries. */
6065 head = NEXT_INSN (prev_head);
6066 tail = last;
6068 /* Restore-other-notes: NOTE_LIST is the end of a chain of notes
6069 previously found among the insns. Insert them at the beginning
6070 of the insns. */
6071 if (note_list != 0)
6073 rtx note_head = note_list;
6075 while (PREV_INSN (note_head))
6077 note_head = PREV_INSN (note_head);
6080 PREV_INSN (note_head) = PREV_INSN (head);
6081 NEXT_INSN (PREV_INSN (head)) = note_head;
6082 PREV_INSN (head) = note_list;
6083 NEXT_INSN (note_list) = head;
6084 head = note_head;
6087 /* Update target block boundaries. */
6088 if (new_needs & NEED_HEAD)
6089 BLOCK_HEAD (b) = head;
6091 if (new_needs & NEED_TAIL)
6092 BLOCK_END (b) = tail;
6094 /* Debugging. */
6095 if (sched_verbose)
6097 fprintf (dump, ";; total time = %d\n;; new basic block head = %d\n",
6098 clock_var, INSN_UID (BLOCK_HEAD (b)));
6099 fprintf (dump, ";; new basic block end = %d\n\n",
6100 INSN_UID (BLOCK_END (b)));
6103 /* Clean up. */
6104 if (current_nr_blocks > 1)
6106 free (candidate_table);
6107 free (bblst_table);
6108 free (bitlst_table);
6110 free (ready);
6112 return (sched_n_insns);
6113 } /* schedule_block () */
6116 /* Print the bit-set of registers, S, callable from debugger. */
6118 extern void
6119 debug_reg_vector (s)
6120 regset s;
6122 int regno;
6124 EXECUTE_IF_SET_IN_REG_SET (s, 0, regno,
6126 fprintf (dump, " %d", regno);
6129 fprintf (dump, "\n");
6132 /* Use the backward dependences from LOG_LINKS to build
6133 forward dependences in INSN_DEPEND. */
6135 static void
6136 compute_block_forward_dependences (bb)
6137 int bb;
6139 rtx insn, link;
6140 rtx tail, head;
6141 rtx next_tail;
6142 enum reg_note dep_type;
6144 get_bb_head_tail (bb, &head, &tail);
6145 next_tail = NEXT_INSN (tail);
6146 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
6148 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
6149 continue;
6151 insn = group_leader (insn);
6153 for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
6155 rtx x = group_leader (XEXP (link, 0));
6156 rtx new_link;
6158 if (x != XEXP (link, 0))
6159 continue;
6161 #ifdef ENABLE_CHECKING
6162 /* If add_dependence is working properly there should never
6163 be notes, deleted insns or duplicates in the backward
6164 links. Thus we need not check for them here.
6166 However, if we have enabled checking we might as well go
6167 ahead and verify that add_dependence worked properly. */
6168 if (GET_CODE (x) == NOTE
6169 || INSN_DELETED_P (x)
6170 || find_insn_list (insn, INSN_DEPEND (x)))
6171 abort ();
6172 #endif
6174 new_link = alloc_INSN_LIST (insn, INSN_DEPEND (x));
6176 dep_type = REG_NOTE_KIND (link);
6177 PUT_REG_NOTE_KIND (new_link, dep_type);
6179 INSN_DEPEND (x) = new_link;
6180 INSN_DEP_COUNT (insn) += 1;
6185 /* Initialize variables for region data dependence analysis.
6186 n_bbs is the number of region blocks. */
6188 static void
6189 init_deps (deps)
6190 struct deps *deps;
6192 int maxreg = max_reg_num ();
6193 deps->reg_last_uses = (rtx *) xcalloc (maxreg, sizeof (rtx));
6194 deps->reg_last_sets = (rtx *) xcalloc (maxreg, sizeof (rtx));
6195 deps->reg_last_clobbers = (rtx *) xcalloc (maxreg, sizeof (rtx));
6197 deps->pending_read_insns = 0;
6198 deps->pending_read_mems = 0;
6199 deps->pending_write_insns = 0;
6200 deps->pending_write_mems = 0;
6201 deps->pending_lists_length = 0;
6202 deps->last_pending_memory_flush = 0;
6203 deps->last_function_call = 0;
6205 deps->sched_before_next_call
6206 = gen_rtx_INSN (VOIDmode, 0, NULL_RTX, NULL_RTX,
6207 NULL_RTX, 0, NULL_RTX, NULL_RTX);
6208 LOG_LINKS (deps->sched_before_next_call) = 0;
6211 /* Add dependences so that branches are scheduled to run last in their
6212 block. */
6214 static void
6215 add_branch_dependences (head, tail)
6216 rtx head, tail;
6218 rtx insn, last;
6220 /* For all branches, calls, uses, clobbers, and cc0 setters, force them
6221 to remain in order at the end of the block by adding dependencies and
6222 giving the last a high priority. There may be notes present, and
6223 prev_head may also be a note.
6225 Branches must obviously remain at the end. Calls should remain at the
6226 end since moving them results in worse register allocation. Uses remain
6227 at the end to ensure proper register allocation. cc0 setters remaim
6228 at the end because they can't be moved away from their cc0 user. */
6229 insn = tail;
6230 last = 0;
6231 while (GET_CODE (insn) == CALL_INSN
6232 || GET_CODE (insn) == JUMP_INSN
6233 || (GET_CODE (insn) == INSN
6234 && (GET_CODE (PATTERN (insn)) == USE
6235 || GET_CODE (PATTERN (insn)) == CLOBBER
6236 #ifdef HAVE_cc0
6237 || sets_cc0_p (PATTERN (insn))
6238 #endif
6240 || GET_CODE (insn) == NOTE)
6242 if (GET_CODE (insn) != NOTE)
6244 if (last != 0
6245 && !find_insn_list (insn, LOG_LINKS (last)))
6247 add_dependence (last, insn, REG_DEP_ANTI);
6248 INSN_REF_COUNT (insn)++;
6251 CANT_MOVE (insn) = 1;
6253 last = insn;
6254 /* Skip over insns that are part of a group.
6255 Make each insn explicitly depend on the previous insn.
6256 This ensures that only the group header will ever enter
6257 the ready queue (and, when scheduled, will automatically
6258 schedule the SCHED_GROUP_P block). */
6259 while (SCHED_GROUP_P (insn))
6261 rtx temp = prev_nonnote_insn (insn);
6262 add_dependence (insn, temp, REG_DEP_ANTI);
6263 insn = temp;
6267 /* Don't overrun the bounds of the basic block. */
6268 if (insn == head)
6269 break;
6271 insn = PREV_INSN (insn);
6274 /* Make sure these insns are scheduled last in their block. */
6275 insn = last;
6276 if (insn != 0)
6277 while (insn != head)
6279 insn = prev_nonnote_insn (insn);
6281 if (INSN_REF_COUNT (insn) != 0)
6282 continue;
6284 add_dependence (last, insn, REG_DEP_ANTI);
6285 INSN_REF_COUNT (insn) = 1;
6287 /* Skip over insns that are part of a group. */
6288 while (SCHED_GROUP_P (insn))
6289 insn = prev_nonnote_insn (insn);
6293 /* After computing the dependencies for block BB, propagate the dependencies
6294 found in TMP_DEPS to the successors of the block. MAX_REG is the number
6295 of registers. */
6296 static void
6297 propagate_deps (bb, tmp_deps, max_reg)
6298 int bb;
6299 struct deps *tmp_deps;
6300 int max_reg;
6302 int b = BB_TO_BLOCK (bb);
6303 int e, first_edge;
6304 int reg;
6305 rtx link_insn, link_mem;
6306 rtx u;
6308 /* These lists should point to the right place, for correct
6309 freeing later. */
6310 bb_deps[bb].pending_read_insns = tmp_deps->pending_read_insns;
6311 bb_deps[bb].pending_read_mems = tmp_deps->pending_read_mems;
6312 bb_deps[bb].pending_write_insns = tmp_deps->pending_write_insns;
6313 bb_deps[bb].pending_write_mems = tmp_deps->pending_write_mems;
6315 /* bb's structures are inherited by its successors. */
6316 first_edge = e = OUT_EDGES (b);
6317 if (e <= 0)
6318 return;
6322 rtx x;
6323 int b_succ = TO_BLOCK (e);
6324 int bb_succ = BLOCK_TO_BB (b_succ);
6325 struct deps *succ_deps = bb_deps + bb_succ;
6327 /* Only bbs "below" bb, in the same region, are interesting. */
6328 if (CONTAINING_RGN (b) != CONTAINING_RGN (b_succ)
6329 || bb_succ <= bb)
6331 e = NEXT_OUT (e);
6332 continue;
6335 for (reg = 0; reg < max_reg; reg++)
6337 /* reg-last-uses lists are inherited by bb_succ. */
6338 for (u = tmp_deps->reg_last_uses[reg]; u; u = XEXP (u, 1))
6340 if (find_insn_list (XEXP (u, 0),
6341 succ_deps->reg_last_uses[reg]))
6342 continue;
6344 succ_deps->reg_last_uses[reg]
6345 = alloc_INSN_LIST (XEXP (u, 0),
6346 succ_deps->reg_last_uses[reg]);
6349 /* reg-last-defs lists are inherited by bb_succ. */
6350 for (u = tmp_deps->reg_last_sets[reg]; u; u = XEXP (u, 1))
6352 if (find_insn_list (XEXP (u, 0),
6353 succ_deps->reg_last_sets[reg]))
6354 continue;
6356 succ_deps->reg_last_sets[reg]
6357 = alloc_INSN_LIST (XEXP (u, 0),
6358 succ_deps->reg_last_sets[reg]);
6361 for (u = tmp_deps->reg_last_clobbers[reg]; u; u = XEXP (u, 1))
6363 if (find_insn_list (XEXP (u, 0),
6364 succ_deps->reg_last_clobbers[reg]))
6365 continue;
6367 succ_deps->reg_last_clobbers[reg]
6368 = alloc_INSN_LIST (XEXP (u, 0),
6369 succ_deps->reg_last_clobbers[reg]);
6373 /* Mem read/write lists are inherited by bb_succ. */
6374 link_insn = tmp_deps->pending_read_insns;
6375 link_mem = tmp_deps->pending_read_mems;
6376 while (link_insn)
6378 if (!(find_insn_mem_list (XEXP (link_insn, 0),
6379 XEXP (link_mem, 0),
6380 succ_deps->pending_read_insns,
6381 succ_deps->pending_read_mems)))
6382 add_insn_mem_dependence (succ_deps, &succ_deps->pending_read_insns,
6383 &succ_deps->pending_read_mems,
6384 XEXP (link_insn, 0), XEXP (link_mem, 0));
6385 link_insn = XEXP (link_insn, 1);
6386 link_mem = XEXP (link_mem, 1);
6389 link_insn = tmp_deps->pending_write_insns;
6390 link_mem = tmp_deps->pending_write_mems;
6391 while (link_insn)
6393 if (!(find_insn_mem_list (XEXP (link_insn, 0),
6394 XEXP (link_mem, 0),
6395 succ_deps->pending_write_insns,
6396 succ_deps->pending_write_mems)))
6397 add_insn_mem_dependence (succ_deps,
6398 &succ_deps->pending_write_insns,
6399 &succ_deps->pending_write_mems,
6400 XEXP (link_insn, 0), XEXP (link_mem, 0));
6402 link_insn = XEXP (link_insn, 1);
6403 link_mem = XEXP (link_mem, 1);
6406 /* last_function_call is inherited by bb_succ. */
6407 for (u = tmp_deps->last_function_call; u; u = XEXP (u, 1))
6409 if (find_insn_list (XEXP (u, 0),
6410 succ_deps->last_function_call))
6411 continue;
6413 succ_deps->last_function_call
6414 = alloc_INSN_LIST (XEXP (u, 0),
6415 succ_deps->last_function_call);
6418 /* last_pending_memory_flush is inherited by bb_succ. */
6419 for (u = tmp_deps->last_pending_memory_flush; u; u = XEXP (u, 1))
6421 if (find_insn_list (XEXP (u, 0),
6422 succ_deps->last_pending_memory_flush))
6423 continue;
6425 succ_deps->last_pending_memory_flush
6426 = alloc_INSN_LIST (XEXP (u, 0),
6427 succ_deps->last_pending_memory_flush);
6430 /* sched_before_next_call is inherited by bb_succ. */
6431 x = LOG_LINKS (tmp_deps->sched_before_next_call);
6432 for (; x; x = XEXP (x, 1))
6433 add_dependence (succ_deps->sched_before_next_call,
6434 XEXP (x, 0), REG_DEP_ANTI);
6436 e = NEXT_OUT (e);
6438 while (e != first_edge);
6441 /* Compute backward dependences inside bb. In a multiple blocks region:
6442 (1) a bb is analyzed after its predecessors, and (2) the lists in
6443 effect at the end of bb (after analyzing for bb) are inherited by
6444 bb's successrs.
6446 Specifically for reg-reg data dependences, the block insns are
6447 scanned by sched_analyze () top-to-bottom. Two lists are
6448 maintained by sched_analyze (): reg_last_sets[] for register DEFs,
6449 and reg_last_uses[] for register USEs.
6451 When analysis is completed for bb, we update for its successors:
6452 ; - DEFS[succ] = Union (DEFS [succ], DEFS [bb])
6453 ; - USES[succ] = Union (USES [succ], DEFS [bb])
6455 The mechanism for computing mem-mem data dependence is very
6456 similar, and the result is interblock dependences in the region. */
6458 static void
6459 compute_block_backward_dependences (bb)
6460 int bb;
6462 int i;
6463 rtx head, tail;
6464 int max_reg = max_reg_num ();
6465 struct deps tmp_deps;
6467 tmp_deps = bb_deps[bb];
6469 /* Do the analysis for this block. */
6470 get_bb_head_tail (bb, &head, &tail);
6471 sched_analyze (&tmp_deps, head, tail);
6472 add_branch_dependences (head, tail);
6474 if (current_nr_blocks > 1)
6475 propagate_deps (bb, &tmp_deps, max_reg);
6477 /* Free up the INSN_LISTs.
6479 Note this loop is executed max_reg * nr_regions times. It's first
6480 implementation accounted for over 90% of the calls to free_INSN_LIST_list.
6481 The list was empty for the vast majority of those calls. On the PA, not
6482 calling free_INSN_LIST_list in those cases improves -O2 compile times by
6483 3-5% on average. */
6484 for (i = 0; i < max_reg; ++i)
6486 if (tmp_deps.reg_last_clobbers[i])
6487 free_INSN_LIST_list (&tmp_deps.reg_last_clobbers[i]);
6488 if (tmp_deps.reg_last_sets[i])
6489 free_INSN_LIST_list (&tmp_deps.reg_last_sets[i]);
6490 if (tmp_deps.reg_last_uses[i])
6491 free_INSN_LIST_list (&tmp_deps.reg_last_uses[i]);
6494 /* Assert that we won't need bb_reg_last_* for this block anymore. */
6495 free (bb_deps[bb].reg_last_uses);
6496 free (bb_deps[bb].reg_last_sets);
6497 free (bb_deps[bb].reg_last_clobbers);
6498 bb_deps[bb].reg_last_uses = 0;
6499 bb_deps[bb].reg_last_sets = 0;
6500 bb_deps[bb].reg_last_clobbers = 0;
6503 /* Print dependences for debugging, callable from debugger. */
6505 void
6506 debug_dependencies ()
6508 int bb;
6510 fprintf (dump, ";; --------------- forward dependences: ------------ \n");
6511 for (bb = 0; bb < current_nr_blocks; bb++)
6513 if (1)
6515 rtx head, tail;
6516 rtx next_tail;
6517 rtx insn;
6519 get_bb_head_tail (bb, &head, &tail);
6520 next_tail = NEXT_INSN (tail);
6521 fprintf (dump, "\n;; --- Region Dependences --- b %d bb %d \n",
6522 BB_TO_BLOCK (bb), bb);
6524 fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
6525 "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
6526 fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
6527 "----", "----", "--", "---", "----", "----", "--------", "-----");
6528 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
6530 rtx link;
6531 int unit, range;
6533 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
6535 int n;
6536 fprintf (dump, ";; %6d ", INSN_UID (insn));
6537 if (GET_CODE (insn) == NOTE)
6539 n = NOTE_LINE_NUMBER (insn);
6540 if (n < 0)
6541 fprintf (dump, "%s\n", GET_NOTE_INSN_NAME (n));
6542 else
6543 fprintf (dump, "line %d, file %s\n", n,
6544 NOTE_SOURCE_FILE (insn));
6546 else
6547 fprintf (dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
6548 continue;
6551 unit = insn_unit (insn);
6552 range = (unit < 0
6553 || function_units[unit].blockage_range_function == 0) ? 0 :
6554 function_units[unit].blockage_range_function (insn);
6555 fprintf (dump,
6556 ";; %s%5d%6d%6d%6d%6d%6d %3d -%3d ",
6557 (SCHED_GROUP_P (insn) ? "+" : " "),
6558 INSN_UID (insn),
6559 INSN_CODE (insn),
6560 INSN_BB (insn),
6561 INSN_DEP_COUNT (insn),
6562 INSN_PRIORITY (insn),
6563 insn_cost (insn, 0, 0),
6564 (int) MIN_BLOCKAGE_COST (range),
6565 (int) MAX_BLOCKAGE_COST (range));
6566 insn_print_units (insn);
6567 fprintf (dump, "\t: ");
6568 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
6569 fprintf (dump, "%d ", INSN_UID (XEXP (link, 0)));
6570 fprintf (dump, "\n");
6574 fprintf (dump, "\n");
6577 /* Set_priorities: compute priority of each insn in the block. */
6579 static int
6580 set_priorities (bb)
6581 int bb;
6583 rtx insn;
6584 int n_insn;
6586 rtx tail;
6587 rtx prev_head;
6588 rtx head;
6590 get_bb_head_tail (bb, &head, &tail);
6591 prev_head = PREV_INSN (head);
6593 if (head == tail
6594 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
6595 return 0;
6597 n_insn = 0;
6598 for (insn = tail; insn != prev_head; insn = PREV_INSN (insn))
6601 if (GET_CODE (insn) == NOTE)
6602 continue;
6604 if (!(SCHED_GROUP_P (insn)))
6605 n_insn++;
6606 (void) priority (insn);
6609 return n_insn;
6612 /* Schedule a region. A region is either an inner loop, a loop-free
6613 subroutine, or a single basic block. Each bb in the region is
6614 scheduled after its flow predecessors. */
6616 static void
6617 schedule_region (rgn)
6618 int rgn;
6620 int bb;
6621 int rgn_n_insns = 0;
6622 int sched_rgn_n_insns = 0;
6624 /* Set variables for the current region. */
6625 current_nr_blocks = RGN_NR_BLOCKS (rgn);
6626 current_blocks = RGN_BLOCKS (rgn);
6628 reg_pending_sets = ALLOCA_REG_SET ();
6629 reg_pending_clobbers = ALLOCA_REG_SET ();
6630 reg_pending_sets_all = 0;
6632 /* Initializations for region data dependence analyisis. */
6633 bb_deps = (struct deps *) xmalloc (sizeof (struct deps) * current_nr_blocks);
6634 for (bb = 0; bb < current_nr_blocks; bb++)
6635 init_deps (bb_deps + bb);
6637 /* Compute LOG_LINKS. */
6638 for (bb = 0; bb < current_nr_blocks; bb++)
6639 compute_block_backward_dependences (bb);
6641 /* Compute INSN_DEPEND. */
6642 for (bb = current_nr_blocks - 1; bb >= 0; bb--)
6643 compute_block_forward_dependences (bb);
6645 /* Delete line notes and set priorities. */
6646 for (bb = 0; bb < current_nr_blocks; bb++)
6648 if (write_symbols != NO_DEBUG)
6650 save_line_notes (bb);
6651 rm_line_notes (bb);
6654 rgn_n_insns += set_priorities (bb);
6657 /* Compute interblock info: probabilities, split-edges, dominators, etc. */
6658 if (current_nr_blocks > 1)
6660 int i;
6662 prob = (float *) xmalloc ((current_nr_blocks) * sizeof (float));
6664 bbset_size = current_nr_blocks / HOST_BITS_PER_WIDE_INT + 1;
6665 dom = (bbset *) xmalloc (current_nr_blocks * sizeof (bbset));
6666 for (i = 0; i < current_nr_blocks; i++)
6667 dom[i] = (bbset) xcalloc (bbset_size, sizeof (HOST_WIDE_INT));
6669 /* Edge to bit. */
6670 rgn_nr_edges = 0;
6671 edge_to_bit = (int *) xmalloc (nr_edges * sizeof (int));
6672 for (i = 1; i < nr_edges; i++)
6673 if (CONTAINING_RGN (FROM_BLOCK (i)) == rgn)
6674 EDGE_TO_BIT (i) = rgn_nr_edges++;
6675 rgn_edges = (int *) xmalloc (rgn_nr_edges * sizeof (int));
6677 rgn_nr_edges = 0;
6678 for (i = 1; i < nr_edges; i++)
6679 if (CONTAINING_RGN (FROM_BLOCK (i)) == (rgn))
6680 rgn_edges[rgn_nr_edges++] = i;
6682 /* Split edges. */
6683 edgeset_size = rgn_nr_edges / HOST_BITS_PER_WIDE_INT + 1;
6684 edgeset_bitsize = rgn_nr_edges;
6685 pot_split = (edgeset *) xmalloc (current_nr_blocks * sizeof (edgeset));
6686 ancestor_edges
6687 = (edgeset *) xmalloc (current_nr_blocks * sizeof (edgeset));
6688 for (i = 0; i < current_nr_blocks; i++)
6690 pot_split[i] =
6691 (edgeset) xcalloc (edgeset_size, sizeof (HOST_WIDE_INT));
6692 ancestor_edges[i] =
6693 (edgeset) xcalloc (edgeset_size, sizeof (HOST_WIDE_INT));
6696 /* Compute probabilities, dominators, split_edges. */
6697 for (bb = 0; bb < current_nr_blocks; bb++)
6698 compute_dom_prob_ps (bb);
6701 /* Now we can schedule all blocks. */
6702 for (bb = 0; bb < current_nr_blocks; bb++)
6703 sched_rgn_n_insns += schedule_block (bb, rgn_n_insns);
6705 /* Sanity check: verify that all region insns were scheduled. */
6706 if (sched_rgn_n_insns != rgn_n_insns)
6707 abort ();
6709 /* Restore line notes. */
6710 if (write_symbols != NO_DEBUG)
6712 for (bb = 0; bb < current_nr_blocks; bb++)
6713 restore_line_notes (bb);
6716 /* Done with this region. */
6717 free_pending_lists ();
6719 FREE_REG_SET (reg_pending_sets);
6720 FREE_REG_SET (reg_pending_clobbers);
6722 free (bb_deps);
6724 if (current_nr_blocks > 1)
6726 int i;
6728 free (prob);
6729 for (i = 0; i < current_nr_blocks; ++i)
6731 free (dom[i]);
6732 free (pot_split[i]);
6733 free (ancestor_edges[i]);
6735 free (dom);
6736 free (edge_to_bit);
6737 free (rgn_edges);
6738 free (pot_split);
6739 free (ancestor_edges);
6743 /* The one entry point in this file. DUMP_FILE is the dump file for
6744 this pass. */
6746 void
6747 schedule_insns (dump_file)
6748 FILE *dump_file;
6750 int *deaths_in_region;
6751 sbitmap blocks, large_region_blocks;
6752 int max_uid;
6753 int b;
6754 rtx insn;
6755 int rgn;
6756 int luid;
6757 int any_large_regions;
6759 /* Disable speculative loads in their presence if cc0 defined. */
6760 #ifdef HAVE_cc0
6761 flag_schedule_speculative_load = 0;
6762 #endif
6764 /* Taking care of this degenerate case makes the rest of
6765 this code simpler. */
6766 if (n_basic_blocks == 0)
6767 return;
6769 /* Set dump and sched_verbose for the desired debugging output. If no
6770 dump-file was specified, but -fsched-verbose-N (any N), print to stderr.
6771 For -fsched-verbose-N, N>=10, print everything to stderr. */
6772 sched_verbose = sched_verbose_param;
6773 if (sched_verbose_param == 0 && dump_file)
6774 sched_verbose = 1;
6775 dump = ((sched_verbose_param >= 10 || !dump_file) ? stderr : dump_file);
6777 nr_inter = 0;
6778 nr_spec = 0;
6780 /* Initialize issue_rate. */
6781 issue_rate = ISSUE_RATE;
6783 split_all_insns (1);
6785 /* We use LUID 0 for the fake insn (UID 0) which holds dependencies for
6786 pseudos which do not cross calls. */
6787 max_uid = get_max_uid () + 1;
6789 h_i_d = (struct haifa_insn_data *) xcalloc (max_uid, sizeof (*h_i_d));
6791 h_i_d[0].luid = 0;
6792 luid = 1;
6793 for (b = 0; b < n_basic_blocks; b++)
6794 for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
6796 INSN_LUID (insn) = luid;
6798 /* Increment the next luid, unless this is a note. We don't
6799 really need separate IDs for notes and we don't want to
6800 schedule differently depending on whether or not there are
6801 line-number notes, i.e., depending on whether or not we're
6802 generating debugging information. */
6803 if (GET_CODE (insn) != NOTE)
6804 ++luid;
6806 if (insn == BLOCK_END (b))
6807 break;
6810 /* ?!? We could save some memory by computing a per-region luid mapping
6811 which could reduce both the number of vectors in the cache and the size
6812 of each vector. Instead we just avoid the cache entirely unless the
6813 average number of instructions in a basic block is very high. See
6814 the comment before the declaration of true_dependency_cache for
6815 what we consider "very high". */
6816 if (luid / n_basic_blocks > 100 * 5)
6818 true_dependency_cache = sbitmap_vector_alloc (luid, luid);
6819 sbitmap_vector_zero (true_dependency_cache, luid);
6822 nr_regions = 0;
6823 rgn_table = (region *) xmalloc ((n_basic_blocks) * sizeof (region));
6824 rgn_bb_table = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6825 block_to_bb = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6826 containing_rgn = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6828 blocks = sbitmap_alloc (n_basic_blocks);
6829 large_region_blocks = sbitmap_alloc (n_basic_blocks);
6831 compute_bb_for_insn (max_uid);
6833 /* Compute regions for scheduling. */
6834 if (reload_completed
6835 || n_basic_blocks == 1
6836 || !flag_schedule_interblock)
6838 find_single_block_region ();
6840 else
6842 /* Verify that a 'good' control flow graph can be built. */
6843 if (is_cfg_nonregular ())
6845 find_single_block_region ();
6847 else
6849 sbitmap *dom;
6850 struct edge_list *edge_list;
6852 dom = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);
6854 /* The scheduler runs after flow; therefore, we can't blindly call
6855 back into find_basic_blocks since doing so could invalidate the
6856 info in global_live_at_start.
6858 Consider a block consisting entirely of dead stores; after life
6859 analysis it would be a block of NOTE_INSN_DELETED notes. If
6860 we call find_basic_blocks again, then the block would be removed
6861 entirely and invalidate our the register live information.
6863 We could (should?) recompute register live information. Doing
6864 so may even be beneficial. */
6865 edge_list = create_edge_list ();
6867 /* Compute the dominators and post dominators. We don't
6868 currently use post dominators, but we should for
6869 speculative motion analysis. */
6870 compute_flow_dominators (dom, NULL);
6872 /* build_control_flow will return nonzero if it detects unreachable
6873 blocks or any other irregularity with the cfg which prevents
6874 cross block scheduling. */
6875 if (build_control_flow (edge_list) != 0)
6876 find_single_block_region ();
6877 else
6878 find_rgns (edge_list, dom);
6880 if (sched_verbose >= 3)
6881 debug_regions ();
6883 /* For now. This will move as more and more of haifa is converted
6884 to using the cfg code in flow.c. */
6885 free (dom);
6889 deaths_in_region = (int *) xmalloc (sizeof(int) * nr_regions);
6891 init_alias_analysis ();
6893 if (write_symbols != NO_DEBUG)
6895 rtx line;
6897 line_note_head = (rtx *) xcalloc (n_basic_blocks, sizeof (rtx));
6899 /* Save-line-note-head:
6900 Determine the line-number at the start of each basic block.
6901 This must be computed and saved now, because after a basic block's
6902 predecessor has been scheduled, it is impossible to accurately
6903 determine the correct line number for the first insn of the block. */
6905 for (b = 0; b < n_basic_blocks; b++)
6906 for (line = BLOCK_HEAD (b); line; line = PREV_INSN (line))
6907 if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
6909 line_note_head[b] = line;
6910 break;
6914 /* Find units used in this fuction, for visualization. */
6915 if (sched_verbose)
6916 init_target_units ();
6918 /* ??? Add a NOTE after the last insn of the last basic block. It is not
6919 known why this is done. */
6921 insn = BLOCK_END (n_basic_blocks - 1);
6922 if (NEXT_INSN (insn) == 0
6923 || (GET_CODE (insn) != NOTE
6924 && GET_CODE (insn) != CODE_LABEL
6925 /* Don't emit a NOTE if it would end up between an unconditional
6926 jump and a BARRIER. */
6927 && !(GET_CODE (insn) == JUMP_INSN
6928 && GET_CODE (NEXT_INSN (insn)) == BARRIER)))
6929 emit_note_after (NOTE_INSN_DELETED, BLOCK_END (n_basic_blocks - 1));
6931 /* Compute INSN_REG_WEIGHT for all blocks. We must do this before
6932 removing death notes. */
6933 for (b = n_basic_blocks - 1; b >= 0; b--)
6934 find_insn_reg_weight (b);
6936 /* Remove all death notes from the subroutine. */
6937 for (rgn = 0; rgn < nr_regions; rgn++)
6939 sbitmap_zero (blocks);
6940 for (b = RGN_NR_BLOCKS (rgn) - 1; b >= 0; --b)
6941 SET_BIT (blocks, rgn_bb_table [RGN_BLOCKS (rgn) + b]);
6943 deaths_in_region[rgn] = count_or_remove_death_notes (blocks, 1);
6946 /* Schedule every region in the subroutine. */
6947 for (rgn = 0; rgn < nr_regions; rgn++)
6948 schedule_region (rgn);
6950 /* Update life analysis for the subroutine. Do single block regions
6951 first so that we can verify that live_at_start didn't change. Then
6952 do all other blocks. */
6953 /* ??? There is an outside possibility that update_life_info, or more
6954 to the point propagate_block, could get called with non-zero flags
6955 more than once for one basic block. This would be kinda bad if it
6956 were to happen, since REG_INFO would be accumulated twice for the
6957 block, and we'd have twice the REG_DEAD notes.
6959 I'm fairly certain that this _shouldn't_ happen, since I don't think
6960 that live_at_start should change at region heads. Not sure what the
6961 best way to test for this kind of thing... */
6963 allocate_reg_life_data ();
6964 compute_bb_for_insn (max_uid);
6966 any_large_regions = 0;
6967 sbitmap_ones (large_region_blocks);
6969 for (rgn = 0; rgn < nr_regions; rgn++)
6970 if (RGN_NR_BLOCKS (rgn) > 1)
6971 any_large_regions = 1;
6972 else
6974 sbitmap_zero (blocks);
6975 SET_BIT (blocks, rgn_bb_table[RGN_BLOCKS (rgn)]);
6976 RESET_BIT (large_region_blocks, rgn_bb_table[RGN_BLOCKS (rgn)]);
6978 /* Don't update reg info after reload, since that affects
6979 regs_ever_live, which should not change after reload. */
6980 update_life_info (blocks, UPDATE_LIFE_LOCAL,
6981 (reload_completed ? PROP_DEATH_NOTES
6982 : PROP_DEATH_NOTES | PROP_REG_INFO));
6984 /* In the single block case, the count of registers that died should
6985 not have changed during the schedule. */
6986 if (count_or_remove_death_notes (blocks, 0) != deaths_in_region[rgn])
6987 abort ();
6990 if (any_large_regions)
6992 update_life_info (large_region_blocks, UPDATE_LIFE_GLOBAL,
6993 PROP_DEATH_NOTES | PROP_REG_INFO);
6996 /* Reposition the prologue and epilogue notes in case we moved the
6997 prologue/epilogue insns. */
6998 if (reload_completed)
6999 reposition_prologue_and_epilogue_notes (get_insns ());
7001 /* Delete redundant line notes. */
7002 if (write_symbols != NO_DEBUG)
7003 rm_redundant_line_notes ();
7005 if (sched_verbose)
7007 if (reload_completed == 0 && flag_schedule_interblock)
7009 fprintf (dump, "\n;; Procedure interblock/speculative motions == %d/%d \n",
7010 nr_inter, nr_spec);
7012 else
7014 if (nr_inter > 0)
7015 abort ();
7017 fprintf (dump, "\n\n");
7020 /* Clean up. */
7021 end_alias_analysis ();
7023 if (true_dependency_cache)
7025 free (true_dependency_cache);
7026 true_dependency_cache = NULL;
7028 free (rgn_table);
7029 free (rgn_bb_table);
7030 free (block_to_bb);
7031 free (containing_rgn);
7033 free (h_i_d);
7035 if (write_symbols != NO_DEBUG)
7036 free (line_note_head);
7038 if (edge_table)
7040 free (edge_table);
7041 edge_table = NULL;
7044 if (in_edges)
7046 free (in_edges);
7047 in_edges = NULL;
7049 if (out_edges)
7051 free (out_edges);
7052 out_edges = NULL;
7055 sbitmap_free (blocks);
7056 sbitmap_free (large_region_blocks);
7058 free (deaths_in_region);
7061 #endif /* INSN_SCHEDULING */