Make sure THEN block has any insns at before testing for indirect jump
[official-gcc.git] / gcc / haifa-sched.c
blob6371b454f30b1b2fcacd3334aff0b3a047f4f8bd
1 /* Instruction scheduling pass.
2 Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998,
3 1999, 2000 Free Software Foundation, Inc.
4 Contributed by Michael Tiemann (tiemann@cygnus.com) Enhanced by,
5 and currently maintained by, Jim Wilson (wilson@cygnus.com)
7 This file is part of GNU CC.
9 GNU CC is free software; you can redistribute it and/or modify it
10 under the terms of the GNU General Public License as published by the
11 Free Software Foundation; either version 2, or (at your option) any
12 later version.
14 GNU CC is distributed in the hope that it will be useful, but WITHOUT
15 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
16 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
17 for more details.
19 You should have received a copy of the GNU General Public License
20 along with GNU CC; see the file COPYING. If not, write to the Free
21 the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA
22 02111-1307, USA. */
25 /* Instruction scheduling pass.
27 This pass implements list scheduling within basic blocks. It is
28 run twice: (1) after flow analysis, but before register allocation,
29 and (2) after register allocation.
31 The first run performs interblock scheduling, moving insns between
32 different blocks in the same "region", and the second runs only
33 basic block scheduling.
35 Interblock motions performed are useful motions and speculative
36 motions, including speculative loads. Motions requiring code
37 duplication are not supported. The identification of motion type
38 and the check for validity of speculative motions requires
39 construction and analysis of the function's control flow graph.
40 The scheduler works as follows:
42 We compute insn priorities based on data dependencies. Flow
43 analysis only creates a fraction of the data-dependencies we must
44 observe: namely, only those dependencies which the combiner can be
45 expected to use. For this pass, we must therefore create the
46 remaining dependencies we need to observe: register dependencies,
47 memory dependencies, dependencies to keep function calls in order,
48 and the dependence between a conditional branch and the setting of
49 condition codes are all dealt with here.
51 The scheduler first traverses the data flow graph, starting with
52 the last instruction, and proceeding to the first, assigning values
53 to insn_priority as it goes. This sorts the instructions
54 topologically by data dependence.
56 Once priorities have been established, we order the insns using
57 list scheduling. This works as follows: starting with a list of
58 all the ready insns, and sorted according to priority number, we
59 schedule the insn from the end of the list by placing its
60 predecessors in the list according to their priority order. We
61 consider this insn scheduled by setting the pointer to the "end" of
62 the list to point to the previous insn. When an insn has no
63 predecessors, we either queue it until sufficient time has elapsed
64 or add it to the ready list. As the instructions are scheduled or
65 when stalls are introduced, the queue advances and dumps insns into
66 the ready list. When all insns down to the lowest priority have
67 been scheduled, the critical path of the basic block has been made
68 as short as possible. The remaining insns are then scheduled in
69 remaining slots.
71 Function unit conflicts are resolved during forward list scheduling
72 by tracking the time when each insn is committed to the schedule
73 and from that, the time the function units it uses must be free.
74 As insns on the ready list are considered for scheduling, those
75 that would result in a blockage of the already committed insns are
76 queued until no blockage will result.
78 The following list shows the order in which we want to break ties
79 among insns in the ready list:
81 1. choose insn with the longest path to end of bb, ties
82 broken by
83 2. choose insn with least contribution to register pressure,
84 ties broken by
85 3. prefer in-block upon interblock motion, ties broken by
86 4. prefer useful upon speculative motion, ties broken by
87 5. choose insn with largest control flow probability, ties
88 broken by
89 6. choose insn with the least dependences upon the previously
90 scheduled insn, or finally
91 7 choose the insn which has the most insns dependent on it.
92 8. choose insn with lowest UID.
94 Memory references complicate matters. Only if we can be certain
95 that memory references are not part of the data dependency graph
96 (via true, anti, or output dependence), can we move operations past
97 memory references. To first approximation, reads can be done
98 independently, while writes introduce dependencies. Better
99 approximations will yield fewer dependencies.
101 Before reload, an extended analysis of interblock data dependences
102 is required for interblock scheduling. This is performed in
103 compute_block_backward_dependences ().
105 Dependencies set up by memory references are treated in exactly the
106 same way as other dependencies, by using LOG_LINKS backward
107 dependences. LOG_LINKS are translated into INSN_DEPEND forward
108 dependences for the purpose of forward list scheduling.
110 Having optimized the critical path, we may have also unduly
111 extended the lifetimes of some registers. If an operation requires
112 that constants be loaded into registers, it is certainly desirable
113 to load those constants as early as necessary, but no earlier.
114 I.e., it will not do to load up a bunch of registers at the
115 beginning of a basic block only to use them at the end, if they
116 could be loaded later, since this may result in excessive register
117 utilization.
119 Note that since branches are never in basic blocks, but only end
120 basic blocks, this pass will not move branches. But that is ok,
121 since we can use GNU's delayed branch scheduling pass to take care
122 of this case.
124 Also note that no further optimizations based on algebraic
125 identities are performed, so this pass would be a good one to
126 perform instruction splitting, such as breaking up a multiply
127 instruction into shifts and adds where that is profitable.
129 Given the memory aliasing analysis that this pass should perform,
130 it should be possible to remove redundant stores to memory, and to
131 load values from registers instead of hitting memory.
133 Before reload, speculative insns are moved only if a 'proof' exists
134 that no exception will be caused by this, and if no live registers
135 exist that inhibit the motion (live registers constraints are not
136 represented by data dependence edges).
138 This pass must update information that subsequent passes expect to
139 be correct. Namely: reg_n_refs, reg_n_sets, reg_n_deaths,
140 reg_n_calls_crossed, and reg_live_length. Also, BLOCK_HEAD,
141 BLOCK_END.
143 The information in the line number notes is carefully retained by
144 this pass. Notes that refer to the starting and ending of
145 exception regions are also carefully retained by this pass. All
146 other NOTE insns are grouped in their same relative order at the
147 beginning of basic blocks and regions that have been scheduled.
149 The main entry point for this pass is schedule_insns(), called for
150 each function. The work of the scheduler is organized in three
151 levels: (1) function level: insns are subject to splitting,
152 control-flow-graph is constructed, regions are computed (after
153 reload, each region is of one block), (2) region level: control
154 flow graph attributes required for interblock scheduling are
155 computed (dominators, reachability, etc.), data dependences and
156 priorities are computed, and (3) block level: insns in the block
157 are actually scheduled. */
159 #include "config.h"
160 #include "system.h"
161 #include "toplev.h"
162 #include "rtl.h"
163 #include "tm_p.h"
164 #include "hard-reg-set.h"
165 #include "basic-block.h"
166 #include "regs.h"
167 #include "function.h"
168 #include "flags.h"
169 #include "insn-config.h"
170 #include "insn-attr.h"
171 #include "except.h"
172 #include "toplev.h"
173 #include "recog.h"
175 extern char *reg_known_equiv_p;
176 extern rtx *reg_known_value;
178 #ifdef INSN_SCHEDULING
180 /* target_units bitmask has 1 for each unit in the cpu. It should be
181 possible to compute this variable from the machine description.
182 But currently it is computed by examining the insn list. Since
183 this is only needed for visualization, it seems an acceptable
184 solution. (For understanding the mapping of bits to units, see
185 definition of function_units[] in "insn-attrtab.c".) */
187 static int target_units = 0;
189 /* issue_rate is the number of insns that can be scheduled in the same
190 machine cycle. It can be defined in the config/mach/mach.h file,
191 otherwise we set it to 1. */
193 static int issue_rate;
195 #ifndef ISSUE_RATE
196 #define ISSUE_RATE 1
197 #endif
199 /* sched-verbose controls the amount of debugging output the
200 scheduler prints. It is controlled by -fsched-verbose=N:
201 N>0 and no -DSR : the output is directed to stderr.
202 N>=10 will direct the printouts to stderr (regardless of -dSR).
203 N=1: same as -dSR.
204 N=2: bb's probabilities, detailed ready list info, unit/insn info.
205 N=3: rtl at abort point, control-flow, regions info.
206 N=5: dependences info. */
208 #define MAX_RGN_BLOCKS 10
209 #define MAX_RGN_INSNS 100
211 static int sched_verbose_param = 0;
212 static int sched_verbose = 0;
214 /* nr_inter/spec counts interblock/speculative motion for the function. */
215 static int nr_inter, nr_spec;
218 /* Debugging file. All printouts are sent to dump, which is always set,
219 either to stderr, or to the dump listing file (-dRS). */
220 static FILE *dump = 0;
222 /* fix_sched_param() is called from toplev.c upon detection
223 of the -fsched-verbose=N option. */
225 void
226 fix_sched_param (param, val)
227 const char *param, *val;
229 if (!strcmp (param, "verbose"))
230 sched_verbose_param = atoi (val);
231 else
232 warning ("fix_sched_param: unknown param: %s", param);
235 /* Describe state of dependencies used during sched_analyze phase. */
236 struct deps
238 /* The *_insns and *_mems are paired lists. Each pending memory operation
239 will have a pointer to the MEM rtx on one list and a pointer to the
240 containing insn on the other list in the same place in the list. */
242 /* We can't use add_dependence like the old code did, because a single insn
243 may have multiple memory accesses, and hence needs to be on the list
244 once for each memory access. Add_dependence won't let you add an insn
245 to a list more than once. */
247 /* An INSN_LIST containing all insns with pending read operations. */
248 rtx pending_read_insns;
250 /* An EXPR_LIST containing all MEM rtx's which are pending reads. */
251 rtx pending_read_mems;
253 /* An INSN_LIST containing all insns with pending write operations. */
254 rtx pending_write_insns;
256 /* An EXPR_LIST containing all MEM rtx's which are pending writes. */
257 rtx pending_write_mems;
259 /* Indicates the combined length of the two pending lists. We must prevent
260 these lists from ever growing too large since the number of dependencies
261 produced is at least O(N*N), and execution time is at least O(4*N*N), as
262 a function of the length of these pending lists. */
263 int pending_lists_length;
265 /* The last insn upon which all memory references must depend.
266 This is an insn which flushed the pending lists, creating a dependency
267 between it and all previously pending memory references. This creates
268 a barrier (or a checkpoint) which no memory reference is allowed to cross.
270 This includes all non constant CALL_INSNs. When we do interprocedural
271 alias analysis, this restriction can be relaxed.
272 This may also be an INSN that writes memory if the pending lists grow
273 too large. */
274 rtx last_pending_memory_flush;
276 /* The last function call we have seen. All hard regs, and, of course,
277 the last function call, must depend on this. */
278 rtx last_function_call;
280 /* Used to keep post-call psuedo/hard reg movements together with
281 the call. */
282 int in_post_call_group_p;
284 /* The LOG_LINKS field of this is a list of insns which use a pseudo
285 register that does not already cross a call. We create
286 dependencies between each of those insn and the next call insn,
287 to ensure that they won't cross a call after scheduling is done. */
288 rtx sched_before_next_call;
290 /* Element N is the next insn that sets (hard or pseudo) register
291 N within the current basic block; or zero, if there is no
292 such insn. Needed for new registers which may be introduced
293 by splitting insns. */
294 rtx *reg_last_uses;
295 rtx *reg_last_sets;
296 rtx *reg_last_clobbers;
299 static regset reg_pending_sets;
300 static regset reg_pending_clobbers;
301 static int reg_pending_sets_all;
303 /* To speed up the test for duplicate dependency links we keep a record
304 of true dependencies created by add_dependence when the average number
305 of instructions in a basic block is very large.
307 Studies have shown that there is typically around 5 instructions between
308 branches for typical C code. So we can make a guess that the average
309 basic block is approximately 5 instructions long; we will choose 100X
310 the average size as a very large basic block.
312 Each insn has an associated bitmap for its dependencies. Each bitmap
313 has enough entries to represent a dependency on any other insn in the
314 insn chain. */
315 static sbitmap *true_dependency_cache;
317 /* Indexed by INSN_UID, the collection of all data associated with
318 a single instruction. */
320 struct haifa_insn_data
322 /* A list of insns which depend on the instruction. Unlike LOG_LINKS,
323 it represents forward dependancies. */
324 rtx depend;
326 /* The line number note in effect for each insn. For line number
327 notes, this indicates whether the note may be reused. */
328 rtx line_note;
330 /* Logical uid gives the original ordering of the insns. */
331 int luid;
333 /* A priority for each insn. */
334 int priority;
336 /* The number of incoming edges in the forward dependency graph.
337 As scheduling proceds, counts are decreased. An insn moves to
338 the ready queue when its counter reaches zero. */
339 int dep_count;
341 /* An encoding of the blockage range function. Both unit and range
342 are coded. */
343 unsigned int blockage;
345 /* Number of instructions referring to this insn. */
346 int ref_count;
348 /* The minimum clock tick at which the insn becomes ready. This is
349 used to note timing constraints for the insns in the pending list. */
350 int tick;
352 short cost;
354 /* An encoding of the function units used. */
355 short units;
357 /* This weight is an estimation of the insn's contribution to
358 register pressure. */
359 short reg_weight;
361 /* Some insns (e.g. call) are not allowed to move across blocks. */
362 unsigned int cant_move : 1;
364 /* Set if there's DEF-USE dependance between some speculatively
365 moved load insn and this one. */
366 unsigned int fed_by_spec_load : 1;
367 unsigned int is_load_insn : 1;
370 static struct haifa_insn_data *h_i_d;
372 #define INSN_DEPEND(INSN) (h_i_d[INSN_UID (INSN)].depend)
373 #define INSN_LUID(INSN) (h_i_d[INSN_UID (INSN)].luid)
374 #define INSN_PRIORITY(INSN) (h_i_d[INSN_UID (INSN)].priority)
375 #define INSN_DEP_COUNT(INSN) (h_i_d[INSN_UID (INSN)].dep_count)
376 #define INSN_COST(INSN) (h_i_d[INSN_UID (INSN)].cost)
377 #define INSN_UNIT(INSN) (h_i_d[INSN_UID (INSN)].units)
378 #define INSN_REG_WEIGHT(INSN) (h_i_d[INSN_UID (INSN)].reg_weight)
380 #define INSN_BLOCKAGE(INSN) (h_i_d[INSN_UID (INSN)].blockage)
381 #define UNIT_BITS 5
382 #define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)
383 #define ENCODE_BLOCKAGE(U, R) \
384 (((U) << BLOCKAGE_BITS \
385 | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS \
386 | MAX_BLOCKAGE_COST (R))
387 #define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))
388 #define BLOCKAGE_RANGE(B) \
389 (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \
390 | ((B) & BLOCKAGE_MASK))
392 /* Encodings of the `<name>_unit_blockage_range' function. */
393 #define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))
394 #define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))
396 #define DONE_PRIORITY -1
397 #define MAX_PRIORITY 0x7fffffff
398 #define TAIL_PRIORITY 0x7ffffffe
399 #define LAUNCH_PRIORITY 0x7f000001
400 #define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)
401 #define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)
403 #define INSN_REF_COUNT(INSN) (h_i_d[INSN_UID (INSN)].ref_count)
404 #define LINE_NOTE(INSN) (h_i_d[INSN_UID (INSN)].line_note)
405 #define INSN_TICK(INSN) (h_i_d[INSN_UID (INSN)].tick)
406 #define CANT_MOVE(insn) (h_i_d[INSN_UID (insn)].cant_move)
407 #define FED_BY_SPEC_LOAD(insn) (h_i_d[INSN_UID (insn)].fed_by_spec_load)
408 #define IS_LOAD_INSN(insn) (h_i_d[INSN_UID (insn)].is_load_insn)
410 /* Vector indexed by basic block number giving the starting line-number
411 for each basic block. */
412 static rtx *line_note_head;
414 /* List of important notes we must keep around. This is a pointer to the
415 last element in the list. */
416 static rtx note_list;
418 /* Queues, etc. */
420 /* An instruction is ready to be scheduled when all insns preceding it
421 have already been scheduled. It is important to ensure that all
422 insns which use its result will not be executed until its result
423 has been computed. An insn is maintained in one of four structures:
425 (P) the "Pending" set of insns which cannot be scheduled until
426 their dependencies have been satisfied.
427 (Q) the "Queued" set of insns that can be scheduled when sufficient
428 time has passed.
429 (R) the "Ready" list of unscheduled, uncommitted insns.
430 (S) the "Scheduled" list of insns.
432 Initially, all insns are either "Pending" or "Ready" depending on
433 whether their dependencies are satisfied.
435 Insns move from the "Ready" list to the "Scheduled" list as they
436 are committed to the schedule. As this occurs, the insns in the
437 "Pending" list have their dependencies satisfied and move to either
438 the "Ready" list or the "Queued" set depending on whether
439 sufficient time has passed to make them ready. As time passes,
440 insns move from the "Queued" set to the "Ready" list. Insns may
441 move from the "Ready" list to the "Queued" set if they are blocked
442 due to a function unit conflict.
444 The "Pending" list (P) are the insns in the INSN_DEPEND of the unscheduled
445 insns, i.e., those that are ready, queued, and pending.
446 The "Queued" set (Q) is implemented by the variable `insn_queue'.
447 The "Ready" list (R) is implemented by the variables `ready' and
448 `n_ready'.
449 The "Scheduled" list (S) is the new insn chain built by this pass.
451 The transition (R->S) is implemented in the scheduling loop in
452 `schedule_block' when the best insn to schedule is chosen.
453 The transition (R->Q) is implemented in `queue_insn' when an
454 insn is found to have a function unit conflict with the already
455 committed insns.
456 The transitions (P->R and P->Q) are implemented in `schedule_insn' as
457 insns move from the ready list to the scheduled list.
458 The transition (Q->R) is implemented in 'queue_to_insn' as time
459 passes or stalls are introduced. */
461 /* Implement a circular buffer to delay instructions until sufficient
462 time has passed. INSN_QUEUE_SIZE is a power of two larger than
463 MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c. This is the
464 longest time an isnsn may be queued. */
465 static rtx insn_queue[INSN_QUEUE_SIZE];
466 static int q_ptr = 0;
467 static int q_size = 0;
468 #define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))
469 #define NEXT_Q_AFTER(X, C) (((X)+C) & (INSN_QUEUE_SIZE-1))
471 /* Forward declarations. */
472 static void add_dependence PARAMS ((rtx, rtx, enum reg_note));
473 static void remove_dependence PARAMS ((rtx, rtx));
474 static rtx find_insn_list PARAMS ((rtx, rtx));
475 static void set_sched_group_p PARAMS ((rtx));
476 static int insn_unit PARAMS ((rtx));
477 static unsigned int blockage_range PARAMS ((int, rtx));
478 static void clear_units PARAMS ((void));
479 static int actual_hazard_this_instance PARAMS ((int, int, rtx, int, int));
480 static void schedule_unit PARAMS ((int, rtx, int));
481 static int actual_hazard PARAMS ((int, rtx, int, int));
482 static int potential_hazard PARAMS ((int, rtx, int));
483 static int insn_cost PARAMS ((rtx, rtx, rtx));
484 static int priority PARAMS ((rtx));
485 static void free_pending_lists PARAMS ((void));
486 static void add_insn_mem_dependence PARAMS ((struct deps *, rtx *, rtx *, rtx,
487 rtx));
488 static void flush_pending_lists PARAMS ((struct deps *, rtx, int));
489 static void sched_analyze_1 PARAMS ((struct deps *, rtx, rtx));
490 static void sched_analyze_2 PARAMS ((struct deps *, rtx, rtx));
491 static void sched_analyze_insn PARAMS ((struct deps *, rtx, rtx, rtx));
492 static void sched_analyze PARAMS ((struct deps *, rtx, rtx));
493 static int rank_for_schedule PARAMS ((const PTR, const PTR));
494 static void swap_sort PARAMS ((rtx *, int));
495 static void queue_insn PARAMS ((rtx, int));
496 static int schedule_insn PARAMS ((rtx, rtx *, int, int));
497 static void find_insn_reg_weight PARAMS ((int));
498 static int schedule_block PARAMS ((int, int));
499 static char *safe_concat PARAMS ((char *, char *, const char *));
500 static int insn_issue_delay PARAMS ((rtx));
501 static void adjust_priority PARAMS ((rtx));
503 /* Control flow graph edges are kept in circular lists. */
504 typedef struct
506 int from_block;
507 int to_block;
508 int next_in;
509 int next_out;
511 haifa_edge;
512 static haifa_edge *edge_table;
514 #define NEXT_IN(edge) (edge_table[edge].next_in)
515 #define NEXT_OUT(edge) (edge_table[edge].next_out)
516 #define FROM_BLOCK(edge) (edge_table[edge].from_block)
517 #define TO_BLOCK(edge) (edge_table[edge].to_block)
519 /* Number of edges in the control flow graph. (In fact, larger than
520 that by 1, since edge 0 is unused.) */
521 static int nr_edges;
523 /* Circular list of incoming/outgoing edges of a block. */
524 static int *in_edges;
525 static int *out_edges;
527 #define IN_EDGES(block) (in_edges[block])
528 #define OUT_EDGES(block) (out_edges[block])
532 static int is_cfg_nonregular PARAMS ((void));
533 static int build_control_flow PARAMS ((struct edge_list *));
534 static void new_edge PARAMS ((int, int));
537 /* A region is the main entity for interblock scheduling: insns
538 are allowed to move between blocks in the same region, along
539 control flow graph edges, in the 'up' direction. */
540 typedef struct
542 int rgn_nr_blocks; /* Number of blocks in region. */
543 int rgn_blocks; /* cblocks in the region (actually index in rgn_bb_table). */
545 region;
547 /* Number of regions in the procedure. */
548 static int nr_regions;
550 /* Table of region descriptions. */
551 static region *rgn_table;
553 /* Array of lists of regions' blocks. */
554 static int *rgn_bb_table;
556 /* Topological order of blocks in the region (if b2 is reachable from
557 b1, block_to_bb[b2] > block_to_bb[b1]). Note: A basic block is
558 always referred to by either block or b, while its topological
559 order name (in the region) is refered to by bb. */
560 static int *block_to_bb;
562 /* The number of the region containing a block. */
563 static int *containing_rgn;
565 #define RGN_NR_BLOCKS(rgn) (rgn_table[rgn].rgn_nr_blocks)
566 #define RGN_BLOCKS(rgn) (rgn_table[rgn].rgn_blocks)
567 #define BLOCK_TO_BB(block) (block_to_bb[block])
568 #define CONTAINING_RGN(block) (containing_rgn[block])
570 void debug_regions PARAMS ((void));
571 static void find_single_block_region PARAMS ((void));
572 static void find_rgns PARAMS ((struct edge_list *, sbitmap *));
573 static int too_large PARAMS ((int, int *, int *));
575 extern void debug_live PARAMS ((int, int));
577 /* Blocks of the current region being scheduled. */
578 static int current_nr_blocks;
579 static int current_blocks;
581 /* The mapping from bb to block. */
582 #define BB_TO_BLOCK(bb) (rgn_bb_table[current_blocks + (bb)])
585 /* Bit vectors and bitset operations are needed for computations on
586 the control flow graph. */
588 typedef unsigned HOST_WIDE_INT *bitset;
589 typedef struct
591 int *first_member; /* Pointer to the list start in bitlst_table. */
592 int nr_members; /* The number of members of the bit list. */
594 bitlst;
596 static int bitlst_table_last;
597 static int bitlst_table_size;
598 static int *bitlst_table;
600 static char bitset_member PARAMS ((bitset, int, int));
601 static void extract_bitlst PARAMS ((bitset, int, int, bitlst *));
603 /* Target info declarations.
605 The block currently being scheduled is referred to as the "target" block,
606 while other blocks in the region from which insns can be moved to the
607 target are called "source" blocks. The candidate structure holds info
608 about such sources: are they valid? Speculative? Etc. */
609 typedef bitlst bblst;
610 typedef struct
612 char is_valid;
613 char is_speculative;
614 int src_prob;
615 bblst split_bbs;
616 bblst update_bbs;
618 candidate;
620 static candidate *candidate_table;
622 /* A speculative motion requires checking live information on the path
623 from 'source' to 'target'. The split blocks are those to be checked.
624 After a speculative motion, live information should be modified in
625 the 'update' blocks.
627 Lists of split and update blocks for each candidate of the current
628 target are in array bblst_table. */
629 static int *bblst_table, bblst_size, bblst_last;
631 #define IS_VALID(src) ( candidate_table[src].is_valid )
632 #define IS_SPECULATIVE(src) ( candidate_table[src].is_speculative )
633 #define SRC_PROB(src) ( candidate_table[src].src_prob )
635 /* The bb being currently scheduled. */
636 static int target_bb;
638 /* List of edges. */
639 typedef bitlst edgelst;
641 /* Target info functions. */
642 static void split_edges PARAMS ((int, int, edgelst *));
643 static void compute_trg_info PARAMS ((int));
644 void debug_candidate PARAMS ((int));
645 void debug_candidates PARAMS ((int));
648 /* Bit-set of bbs, where bit 'i' stands for bb 'i'. */
649 typedef bitset bbset;
651 /* Number of words of the bbset. */
652 static int bbset_size;
654 /* Dominators array: dom[i] contains the bbset of dominators of
655 bb i in the region. */
656 static bbset *dom;
658 /* bb 0 is the only region entry. */
659 #define IS_RGN_ENTRY(bb) (!bb)
661 /* Is bb_src dominated by bb_trg. */
662 #define IS_DOMINATED(bb_src, bb_trg) \
663 ( bitset_member (dom[bb_src], bb_trg, bbset_size) )
665 /* Probability: Prob[i] is a float in [0, 1] which is the probability
666 of bb i relative to the region entry. */
667 static float *prob;
669 /* The probability of bb_src, relative to bb_trg. Note, that while the
670 'prob[bb]' is a float in [0, 1], this macro returns an integer
671 in [0, 100]. */
672 #define GET_SRC_PROB(bb_src, bb_trg) ((int) (100.0 * (prob[bb_src] / \
673 prob[bb_trg])))
675 /* Bit-set of edges, where bit i stands for edge i. */
676 typedef bitset edgeset;
678 /* Number of edges in the region. */
679 static int rgn_nr_edges;
681 /* Array of size rgn_nr_edges. */
682 static int *rgn_edges;
684 /* Number of words in an edgeset. */
685 static int edgeset_size;
687 /* Number of bits in an edgeset. */
688 static int edgeset_bitsize;
690 /* Mapping from each edge in the graph to its number in the rgn. */
691 static int *edge_to_bit;
692 #define EDGE_TO_BIT(edge) (edge_to_bit[edge])
694 /* The split edges of a source bb is different for each target
695 bb. In order to compute this efficiently, the 'potential-split edges'
696 are computed for each bb prior to scheduling a region. This is actually
697 the split edges of each bb relative to the region entry.
699 pot_split[bb] is the set of potential split edges of bb. */
700 static edgeset *pot_split;
702 /* For every bb, a set of its ancestor edges. */
703 static edgeset *ancestor_edges;
705 static void compute_dom_prob_ps PARAMS ((int));
707 #define ABS_VALUE(x) (((x)<0)?(-(x)):(x))
708 #define INSN_PROBABILITY(INSN) (SRC_PROB (BLOCK_TO_BB (BLOCK_NUM (INSN))))
709 #define IS_SPECULATIVE_INSN(INSN) (IS_SPECULATIVE (BLOCK_TO_BB (BLOCK_NUM (INSN))))
710 #define INSN_BB(INSN) (BLOCK_TO_BB (BLOCK_NUM (INSN)))
712 /* Parameters affecting the decision of rank_for_schedule(). */
713 #define MIN_DIFF_PRIORITY 2
714 #define MIN_PROBABILITY 40
715 #define MIN_PROB_DIFF 10
717 /* Speculative scheduling functions. */
718 static int check_live_1 PARAMS ((int, rtx));
719 static void update_live_1 PARAMS ((int, rtx));
720 static int check_live PARAMS ((rtx, int));
721 static void update_live PARAMS ((rtx, int));
722 static void set_spec_fed PARAMS ((rtx));
723 static int is_pfree PARAMS ((rtx, int, int));
724 static int find_conditional_protection PARAMS ((rtx, int));
725 static int is_conditionally_protected PARAMS ((rtx, int, int));
726 static int may_trap_exp PARAMS ((rtx, int));
727 static int haifa_classify_insn PARAMS ((rtx));
728 static int is_prisky PARAMS ((rtx, int, int));
729 static int is_exception_free PARAMS ((rtx, int, int));
731 static char find_insn_mem_list PARAMS ((rtx, rtx, rtx, rtx));
732 static void compute_block_forward_dependences PARAMS ((int));
733 static void add_branch_dependences PARAMS ((rtx, rtx));
734 static void compute_block_backward_dependences PARAMS ((int));
735 void debug_dependencies PARAMS ((void));
737 /* Notes handling mechanism:
738 =========================
739 Generally, NOTES are saved before scheduling and restored after scheduling.
740 The scheduler distinguishes between three types of notes:
742 (1) LINE_NUMBER notes, generated and used for debugging. Here,
743 before scheduling a region, a pointer to the LINE_NUMBER note is
744 added to the insn following it (in save_line_notes()), and the note
745 is removed (in rm_line_notes() and unlink_line_notes()). After
746 scheduling the region, this pointer is used for regeneration of
747 the LINE_NUMBER note (in restore_line_notes()).
749 (2) LOOP_BEGIN, LOOP_END, SETJMP, EHREGION_BEG, EHREGION_END notes:
750 Before scheduling a region, a pointer to the note is added to the insn
751 that follows or precedes it. (This happens as part of the data dependence
752 computation). After scheduling an insn, the pointer contained in it is
753 used for regenerating the corresponding note (in reemit_notes).
755 (3) All other notes (e.g. INSN_DELETED): Before scheduling a block,
756 these notes are put in a list (in rm_other_notes() and
757 unlink_other_notes ()). After scheduling the block, these notes are
758 inserted at the beginning of the block (in schedule_block()). */
760 static rtx unlink_other_notes PARAMS ((rtx, rtx));
761 static rtx unlink_line_notes PARAMS ((rtx, rtx));
762 static void rm_line_notes PARAMS ((int));
763 static void save_line_notes PARAMS ((int));
764 static void restore_line_notes PARAMS ((int));
765 static void rm_redundant_line_notes PARAMS ((void));
766 static void rm_other_notes PARAMS ((rtx, rtx));
767 static rtx reemit_notes PARAMS ((rtx, rtx));
769 static void get_block_head_tail PARAMS ((int, rtx *, rtx *));
770 static void get_bb_head_tail PARAMS ((int, rtx *, rtx *));
772 static int queue_to_ready PARAMS ((rtx [], int));
774 static void debug_ready_list PARAMS ((rtx[], int));
775 static void init_target_units PARAMS ((void));
776 static void insn_print_units PARAMS ((rtx));
777 static int get_visual_tbl_length PARAMS ((void));
778 static void init_block_visualization PARAMS ((void));
779 static void print_block_visualization PARAMS ((int, const char *));
780 static void visualize_scheduled_insns PARAMS ((int, int));
781 static void visualize_no_unit PARAMS ((rtx));
782 static void visualize_stall_cycles PARAMS ((int, int));
783 static void print_exp PARAMS ((char *, rtx, int));
784 static void print_value PARAMS ((char *, rtx, int));
785 static void print_pattern PARAMS ((char *, rtx, int));
786 static void print_insn PARAMS ((char *, rtx, int));
787 void debug_reg_vector PARAMS ((regset));
789 static rtx move_insn1 PARAMS ((rtx, rtx));
790 static rtx move_insn PARAMS ((rtx, rtx));
791 static rtx group_leader PARAMS ((rtx));
792 static int set_priorities PARAMS ((int));
793 static void init_deps PARAMS ((struct deps *));
794 static void schedule_region PARAMS ((int));
795 static void propagate_deps PARAMS ((int, struct deps *, int));
797 #endif /* INSN_SCHEDULING */
799 #define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))
801 /* Add ELEM wrapped in an INSN_LIST with reg note kind DEP_TYPE to the
802 LOG_LINKS of INSN, if not already there. DEP_TYPE indicates the type
803 of dependence that this link represents. */
805 static void
806 add_dependence (insn, elem, dep_type)
807 rtx insn;
808 rtx elem;
809 enum reg_note dep_type;
811 rtx link, next;
813 /* Don't depend an insn on itself. */
814 if (insn == elem)
815 return;
817 /* We can get a dependency on deleted insns due to optimizations in
818 the register allocation and reloading or due to splitting. Any
819 such dependency is useless and can be ignored. */
820 if (GET_CODE (elem) == NOTE)
821 return;
823 /* If elem is part of a sequence that must be scheduled together, then
824 make the dependence point to the last insn of the sequence.
825 When HAVE_cc0, it is possible for NOTEs to exist between users and
826 setters of the condition codes, so we must skip past notes here.
827 Otherwise, NOTEs are impossible here. */
828 next = next_nonnote_insn (elem);
829 if (next && SCHED_GROUP_P (next)
830 && GET_CODE (next) != CODE_LABEL)
832 /* Notes will never intervene here though, so don't bother checking
833 for them. */
834 /* Hah! Wrong. */
835 /* We must reject CODE_LABELs, so that we don't get confused by one
836 that has LABEL_PRESERVE_P set, which is represented by the same
837 bit in the rtl as SCHED_GROUP_P. A CODE_LABEL can never be
838 SCHED_GROUP_P. */
840 rtx nnext;
841 while ((nnext = next_nonnote_insn (next)) != NULL
842 && SCHED_GROUP_P (nnext)
843 && GET_CODE (nnext) != CODE_LABEL)
844 next = nnext;
846 /* Again, don't depend an insn on itself. */
847 if (insn == next)
848 return;
850 /* Make the dependence to NEXT, the last insn of the group, instead
851 of the original ELEM. */
852 elem = next;
855 #ifdef INSN_SCHEDULING
856 /* (This code is guarded by INSN_SCHEDULING, otherwise INSN_BB is undefined.)
857 No need for interblock dependences with calls, since
858 calls are not moved between blocks. Note: the edge where
859 elem is a CALL is still required. */
860 if (GET_CODE (insn) == CALL_INSN
861 && (INSN_BB (elem) != INSN_BB (insn)))
862 return;
864 /* If we already have a true dependency for ELEM, then we do not
865 need to do anything. Avoiding the list walk below can cut
866 compile times dramatically for some code. */
867 if (true_dependency_cache
868 && TEST_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem)))
869 return;
870 #endif
872 /* Check that we don't already have this dependence. */
873 for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
874 if (XEXP (link, 0) == elem)
876 /* If this is a more restrictive type of dependence than the existing
877 one, then change the existing dependence to this type. */
878 if ((int) dep_type < (int) REG_NOTE_KIND (link))
879 PUT_REG_NOTE_KIND (link, dep_type);
881 #ifdef INSN_SCHEDULING
882 /* If we are adding a true dependency to INSN's LOG_LINKs, then
883 note that in the bitmap cache of true dependency information. */
884 if ((int)dep_type == 0 && true_dependency_cache)
885 SET_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem));
886 #endif
887 return;
889 /* Might want to check one level of transitivity to save conses. */
891 link = alloc_INSN_LIST (elem, LOG_LINKS (insn));
892 LOG_LINKS (insn) = link;
894 /* Insn dependency, not data dependency. */
895 PUT_REG_NOTE_KIND (link, dep_type);
897 #ifdef INSN_SCHEDULING
898 /* If we are adding a true dependency to INSN's LOG_LINKs, then
899 note that in the bitmap cache of true dependency information. */
900 if ((int)dep_type == 0 && true_dependency_cache)
901 SET_BIT (true_dependency_cache[INSN_LUID (insn)], INSN_LUID (elem));
902 #endif
905 /* Remove ELEM wrapped in an INSN_LIST from the LOG_LINKS
906 of INSN. Abort if not found. */
908 static void
909 remove_dependence (insn, elem)
910 rtx insn;
911 rtx elem;
913 rtx prev, link, next;
914 int found = 0;
916 for (prev = 0, link = LOG_LINKS (insn); link; link = next)
918 next = XEXP (link, 1);
919 if (XEXP (link, 0) == elem)
921 if (prev)
922 XEXP (prev, 1) = next;
923 else
924 LOG_LINKS (insn) = next;
926 #ifdef INSN_SCHEDULING
927 /* If we are removing a true dependency from the LOG_LINKS list,
928 make sure to remove it from the cache too. */
929 if (REG_NOTE_KIND (link) == 0 && true_dependency_cache)
930 RESET_BIT (true_dependency_cache[INSN_LUID (insn)],
931 INSN_LUID (elem));
932 #endif
934 free_INSN_LIST_node (link);
936 found = 1;
938 else
939 prev = link;
942 if (!found)
943 abort ();
944 return;
947 /* Return the INSN_LIST containing INSN in LIST, or NULL
948 if LIST does not contain INSN. */
950 static inline rtx
951 find_insn_list (insn, list)
952 rtx insn;
953 rtx list;
955 while (list)
957 if (XEXP (list, 0) == insn)
958 return list;
959 list = XEXP (list, 1);
961 return 0;
964 /* Set SCHED_GROUP_P and care for the rest of the bookkeeping that
965 goes along with that. */
967 static void
968 set_sched_group_p (insn)
969 rtx insn;
971 rtx link, prev;
973 SCHED_GROUP_P (insn) = 1;
975 /* There may be a note before this insn now, but all notes will
976 be removed before we actually try to schedule the insns, so
977 it won't cause a problem later. We must avoid it here though. */
978 prev = prev_nonnote_insn (insn);
980 /* Make a copy of all dependencies on the immediately previous insn,
981 and add to this insn. This is so that all the dependencies will
982 apply to the group. Remove an explicit dependence on this insn
983 as SCHED_GROUP_P now represents it. */
985 if (find_insn_list (prev, LOG_LINKS (insn)))
986 remove_dependence (insn, prev);
988 for (link = LOG_LINKS (prev); link; link = XEXP (link, 1))
989 add_dependence (insn, XEXP (link, 0), REG_NOTE_KIND (link));
992 #ifndef INSN_SCHEDULING
993 void
994 schedule_insns (dump_file)
995 FILE *dump_file ATTRIBUTE_UNUSED;
998 #else
999 #ifndef __GNUC__
1000 #define __inline
1001 #endif
1003 #ifndef HAIFA_INLINE
1004 #define HAIFA_INLINE __inline
1005 #endif
1007 /* Computation of memory dependencies. */
1009 /* Data structures for the computation of data dependences in a regions. We
1010 keep one mem_deps structure for every basic block. Before analyzing the
1011 data dependences for a bb, its variables are initialized as a function of
1012 the variables of its predecessors. When the analysis for a bb completes,
1013 we save the contents to the corresponding bb_mem_deps[bb] variable. */
1015 static struct deps *bb_deps;
1017 /* Pointer to the last instruction scheduled. Used by rank_for_schedule,
1018 so that insns independent of the last scheduled insn will be preferred
1019 over dependent instructions. */
1021 static rtx last_scheduled_insn;
1023 /* Functions for construction of the control flow graph. */
1025 /* Return 1 if control flow graph should not be constructed, 0 otherwise.
1027 We decide not to build the control flow graph if there is possibly more
1028 than one entry to the function, if computed branches exist, of if we
1029 have nonlocal gotos. */
1031 static int
1032 is_cfg_nonregular ()
1034 int b;
1035 rtx insn;
1036 RTX_CODE code;
1038 /* If we have a label that could be the target of a nonlocal goto, then
1039 the cfg is not well structured. */
1040 if (nonlocal_goto_handler_labels)
1041 return 1;
1043 /* If we have any forced labels, then the cfg is not well structured. */
1044 if (forced_labels)
1045 return 1;
1047 /* If this function has a computed jump, then we consider the cfg
1048 not well structured. */
1049 if (current_function_has_computed_jump)
1050 return 1;
1052 /* If we have exception handlers, then we consider the cfg not well
1053 structured. ?!? We should be able to handle this now that flow.c
1054 computes an accurate cfg for EH. */
1055 if (exception_handler_labels)
1056 return 1;
1058 /* If we have non-jumping insns which refer to labels, then we consider
1059 the cfg not well structured. */
1060 /* Check for labels referred to other thn by jumps. */
1061 for (b = 0; b < n_basic_blocks; b++)
1062 for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
1064 code = GET_CODE (insn);
1065 if (GET_RTX_CLASS (code) == 'i')
1067 rtx note;
1069 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
1070 if (REG_NOTE_KIND (note) == REG_LABEL)
1071 return 1;
1074 if (insn == BLOCK_END (b))
1075 break;
1078 /* All the tests passed. Consider the cfg well structured. */
1079 return 0;
1082 /* Build the control flow graph and set nr_edges.
1084 Instead of trying to build a cfg ourselves, we rely on flow to
1085 do it for us. Stamp out useless code (and bug) duplication.
1087 Return nonzero if an irregularity in the cfg is found which would
1088 prevent cross block scheduling. */
1090 static int
1091 build_control_flow (edge_list)
1092 struct edge_list *edge_list;
1094 int i, unreachable, num_edges;
1096 /* This already accounts for entry/exit edges. */
1097 num_edges = NUM_EDGES (edge_list);
1099 /* Unreachable loops with more than one basic block are detected
1100 during the DFS traversal in find_rgns.
1102 Unreachable loops with a single block are detected here. This
1103 test is redundant with the one in find_rgns, but it's much
1104 cheaper to go ahead and catch the trivial case here. */
1105 unreachable = 0;
1106 for (i = 0; i < n_basic_blocks; i++)
1108 basic_block b = BASIC_BLOCK (i);
1110 if (b->pred == NULL
1111 || (b->pred->src == b
1112 && b->pred->pred_next == NULL))
1113 unreachable = 1;
1116 /* ??? We can kill these soon. */
1117 in_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1118 out_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1119 edge_table = (haifa_edge *) xcalloc (num_edges, sizeof (haifa_edge));
1121 nr_edges = 0;
1122 for (i = 0; i < num_edges; i++)
1124 edge e = INDEX_EDGE (edge_list, i);
1126 if (e->dest != EXIT_BLOCK_PTR
1127 && e->src != ENTRY_BLOCK_PTR)
1128 new_edge (e->src->index, e->dest->index);
1131 /* Increment by 1, since edge 0 is unused. */
1132 nr_edges++;
1134 return unreachable;
1138 /* Record an edge in the control flow graph from SOURCE to TARGET.
1140 In theory, this is redundant with the s_succs computed above, but
1141 we have not converted all of haifa to use information from the
1142 integer lists. */
1144 static void
1145 new_edge (source, target)
1146 int source, target;
1148 int e, next_edge;
1149 int curr_edge, fst_edge;
1151 /* Check for duplicates. */
1152 fst_edge = curr_edge = OUT_EDGES (source);
1153 while (curr_edge)
1155 if (FROM_BLOCK (curr_edge) == source
1156 && TO_BLOCK (curr_edge) == target)
1158 return;
1161 curr_edge = NEXT_OUT (curr_edge);
1163 if (fst_edge == curr_edge)
1164 break;
1167 e = ++nr_edges;
1169 FROM_BLOCK (e) = source;
1170 TO_BLOCK (e) = target;
1172 if (OUT_EDGES (source))
1174 next_edge = NEXT_OUT (OUT_EDGES (source));
1175 NEXT_OUT (OUT_EDGES (source)) = e;
1176 NEXT_OUT (e) = next_edge;
1178 else
1180 OUT_EDGES (source) = e;
1181 NEXT_OUT (e) = e;
1184 if (IN_EDGES (target))
1186 next_edge = NEXT_IN (IN_EDGES (target));
1187 NEXT_IN (IN_EDGES (target)) = e;
1188 NEXT_IN (e) = next_edge;
1190 else
1192 IN_EDGES (target) = e;
1193 NEXT_IN (e) = e;
1198 /* BITSET macros for operations on the control flow graph. */
1200 /* Compute bitwise union of two bitsets. */
1201 #define BITSET_UNION(set1, set2, len) \
1202 do { register bitset tp = set1, sp = set2; \
1203 register int i; \
1204 for (i = 0; i < len; i++) \
1205 *(tp++) |= *(sp++); } while (0)
1207 /* Compute bitwise intersection of two bitsets. */
1208 #define BITSET_INTER(set1, set2, len) \
1209 do { register bitset tp = set1, sp = set2; \
1210 register int i; \
1211 for (i = 0; i < len; i++) \
1212 *(tp++) &= *(sp++); } while (0)
1214 /* Compute bitwise difference of two bitsets. */
1215 #define BITSET_DIFFER(set1, set2, len) \
1216 do { register bitset tp = set1, sp = set2; \
1217 register int i; \
1218 for (i = 0; i < len; i++) \
1219 *(tp++) &= ~*(sp++); } while (0)
1221 /* Inverts every bit of bitset 'set'. */
1222 #define BITSET_INVERT(set, len) \
1223 do { register bitset tmpset = set; \
1224 register int i; \
1225 for (i = 0; i < len; i++, tmpset++) \
1226 *tmpset = ~*tmpset; } while (0)
1228 /* Turn on the index'th bit in bitset set. */
1229 #define BITSET_ADD(set, index, len) \
1231 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1232 abort (); \
1233 else \
1234 set[index/HOST_BITS_PER_WIDE_INT] |= \
1235 1 << (index % HOST_BITS_PER_WIDE_INT); \
1238 /* Turn off the index'th bit in set. */
1239 #define BITSET_REMOVE(set, index, len) \
1241 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1242 abort (); \
1243 else \
1244 set[index/HOST_BITS_PER_WIDE_INT] &= \
1245 ~(1 << (index%HOST_BITS_PER_WIDE_INT)); \
1249 /* Check if the index'th bit in bitset set is on. */
1251 static char
1252 bitset_member (set, index, len)
1253 bitset set;
1254 int index, len;
1256 if (index >= HOST_BITS_PER_WIDE_INT * len)
1257 abort ();
1258 return (set[index / HOST_BITS_PER_WIDE_INT] &
1259 1 << (index % HOST_BITS_PER_WIDE_INT)) ? 1 : 0;
1263 /* Translate a bit-set SET to a list BL of the bit-set members. */
1265 static void
1266 extract_bitlst (set, len, bitlen, bl)
1267 bitset set;
1268 int len;
1269 int bitlen;
1270 bitlst *bl;
1272 int i, j, offset;
1273 unsigned HOST_WIDE_INT word;
1275 /* bblst table space is reused in each call to extract_bitlst. */
1276 bitlst_table_last = 0;
1278 bl->first_member = &bitlst_table[bitlst_table_last];
1279 bl->nr_members = 0;
1281 /* Iterate over each word in the bitset. */
1282 for (i = 0; i < len; i++)
1284 word = set[i];
1285 offset = i * HOST_BITS_PER_WIDE_INT;
1287 /* Iterate over each bit in the word, but do not
1288 go beyond the end of the defined bits. */
1289 for (j = 0; offset < bitlen && word; j++)
1291 if (word & 1)
1293 bitlst_table[bitlst_table_last++] = offset;
1294 (bl->nr_members)++;
1296 word >>= 1;
1297 ++offset;
1304 /* Functions for the construction of regions. */
1306 /* Print the regions, for debugging purposes. Callable from debugger. */
1308 void
1309 debug_regions ()
1311 int rgn, bb;
1313 fprintf (dump, "\n;; ------------ REGIONS ----------\n\n");
1314 for (rgn = 0; rgn < nr_regions; rgn++)
1316 fprintf (dump, ";;\trgn %d nr_blocks %d:\n", rgn,
1317 rgn_table[rgn].rgn_nr_blocks);
1318 fprintf (dump, ";;\tbb/block: ");
1320 for (bb = 0; bb < rgn_table[rgn].rgn_nr_blocks; bb++)
1322 current_blocks = RGN_BLOCKS (rgn);
1324 if (bb != BLOCK_TO_BB (BB_TO_BLOCK (bb)))
1325 abort ();
1327 fprintf (dump, " %d/%d ", bb, BB_TO_BLOCK (bb));
1330 fprintf (dump, "\n\n");
1335 /* Build a single block region for each basic block in the function.
1336 This allows for using the same code for interblock and basic block
1337 scheduling. */
1339 static void
1340 find_single_block_region ()
1342 int i;
1344 for (i = 0; i < n_basic_blocks; i++)
1346 rgn_bb_table[i] = i;
1347 RGN_NR_BLOCKS (i) = 1;
1348 RGN_BLOCKS (i) = i;
1349 CONTAINING_RGN (i) = i;
1350 BLOCK_TO_BB (i) = 0;
1352 nr_regions = n_basic_blocks;
1356 /* Update number of blocks and the estimate for number of insns
1357 in the region. Return 1 if the region is "too large" for interblock
1358 scheduling (compile time considerations), otherwise return 0. */
1360 static int
1361 too_large (block, num_bbs, num_insns)
1362 int block, *num_bbs, *num_insns;
1364 (*num_bbs)++;
1365 (*num_insns) += (INSN_LUID (BLOCK_END (block)) -
1366 INSN_LUID (BLOCK_HEAD (block)));
1367 if ((*num_bbs > MAX_RGN_BLOCKS) || (*num_insns > MAX_RGN_INSNS))
1368 return 1;
1369 else
1370 return 0;
1374 /* Update_loop_relations(blk, hdr): Check if the loop headed by max_hdr[blk]
1375 is still an inner loop. Put in max_hdr[blk] the header of the most inner
1376 loop containing blk. */
1377 #define UPDATE_LOOP_RELATIONS(blk, hdr) \
1379 if (max_hdr[blk] == -1) \
1380 max_hdr[blk] = hdr; \
1381 else if (dfs_nr[max_hdr[blk]] > dfs_nr[hdr]) \
1382 RESET_BIT (inner, hdr); \
1383 else if (dfs_nr[max_hdr[blk]] < dfs_nr[hdr]) \
1385 RESET_BIT (inner,max_hdr[blk]); \
1386 max_hdr[blk] = hdr; \
1391 /* Find regions for interblock scheduling.
1393 A region for scheduling can be:
1395 * A loop-free procedure, or
1397 * A reducible inner loop, or
1399 * A basic block not contained in any other region.
1402 ?!? In theory we could build other regions based on extended basic
1403 blocks or reverse extended basic blocks. Is it worth the trouble?
1405 Loop blocks that form a region are put into the region's block list
1406 in topological order.
1408 This procedure stores its results into the following global (ick) variables
1410 * rgn_nr
1411 * rgn_table
1412 * rgn_bb_table
1413 * block_to_bb
1414 * containing region
1417 We use dominator relationships to avoid making regions out of non-reducible
1418 loops.
1420 This procedure needs to be converted to work on pred/succ lists instead
1421 of edge tables. That would simplify it somewhat. */
1423 static void
1424 find_rgns (edge_list, dom)
1425 struct edge_list *edge_list;
1426 sbitmap *dom;
1428 int *max_hdr, *dfs_nr, *stack, *degree;
1429 char no_loops = 1;
1430 int node, child, loop_head, i, head, tail;
1431 int count = 0, sp, idx = 0, current_edge = out_edges[0];
1432 int num_bbs, num_insns, unreachable;
1433 int too_large_failure;
1435 /* Note if an edge has been passed. */
1436 sbitmap passed;
1438 /* Note if a block is a natural loop header. */
1439 sbitmap header;
1441 /* Note if a block is an natural inner loop header. */
1442 sbitmap inner;
1444 /* Note if a block is in the block queue. */
1445 sbitmap in_queue;
1447 /* Note if a block is in the block queue. */
1448 sbitmap in_stack;
1450 int num_edges = NUM_EDGES (edge_list);
1452 /* Perform a DFS traversal of the cfg. Identify loop headers, inner loops
1453 and a mapping from block to its loop header (if the block is contained
1454 in a loop, else -1).
1456 Store results in HEADER, INNER, and MAX_HDR respectively, these will
1457 be used as inputs to the second traversal.
1459 STACK, SP and DFS_NR are only used during the first traversal. */
1461 /* Allocate and initialize variables for the first traversal. */
1462 max_hdr = (int *) xmalloc (n_basic_blocks * sizeof (int));
1463 dfs_nr = (int *) xcalloc (n_basic_blocks, sizeof (int));
1464 stack = (int *) xmalloc (nr_edges * sizeof (int));
1466 inner = sbitmap_alloc (n_basic_blocks);
1467 sbitmap_ones (inner);
1469 header = sbitmap_alloc (n_basic_blocks);
1470 sbitmap_zero (header);
1472 passed = sbitmap_alloc (nr_edges);
1473 sbitmap_zero (passed);
1475 in_queue = sbitmap_alloc (n_basic_blocks);
1476 sbitmap_zero (in_queue);
1478 in_stack = sbitmap_alloc (n_basic_blocks);
1479 sbitmap_zero (in_stack);
1481 for (i = 0; i < n_basic_blocks; i++)
1482 max_hdr[i] = -1;
1484 /* DFS traversal to find inner loops in the cfg. */
1486 sp = -1;
1487 while (1)
1489 if (current_edge == 0 || TEST_BIT (passed, current_edge))
1491 /* We have reached a leaf node or a node that was already
1492 processed. Pop edges off the stack until we find
1493 an edge that has not yet been processed. */
1494 while (sp >= 0
1495 && (current_edge == 0 || TEST_BIT (passed, current_edge)))
1497 /* Pop entry off the stack. */
1498 current_edge = stack[sp--];
1499 node = FROM_BLOCK (current_edge);
1500 child = TO_BLOCK (current_edge);
1501 RESET_BIT (in_stack, child);
1502 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1503 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1504 current_edge = NEXT_OUT (current_edge);
1507 /* See if have finished the DFS tree traversal. */
1508 if (sp < 0 && TEST_BIT (passed, current_edge))
1509 break;
1511 /* Nope, continue the traversal with the popped node. */
1512 continue;
1515 /* Process a node. */
1516 node = FROM_BLOCK (current_edge);
1517 child = TO_BLOCK (current_edge);
1518 SET_BIT (in_stack, node);
1519 dfs_nr[node] = ++count;
1521 /* If the successor is in the stack, then we've found a loop.
1522 Mark the loop, if it is not a natural loop, then it will
1523 be rejected during the second traversal. */
1524 if (TEST_BIT (in_stack, child))
1526 no_loops = 0;
1527 SET_BIT (header, child);
1528 UPDATE_LOOP_RELATIONS (node, child);
1529 SET_BIT (passed, current_edge);
1530 current_edge = NEXT_OUT (current_edge);
1531 continue;
1534 /* If the child was already visited, then there is no need to visit
1535 it again. Just update the loop relationships and restart
1536 with a new edge. */
1537 if (dfs_nr[child])
1539 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1540 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1541 SET_BIT (passed, current_edge);
1542 current_edge = NEXT_OUT (current_edge);
1543 continue;
1546 /* Push an entry on the stack and continue DFS traversal. */
1547 stack[++sp] = current_edge;
1548 SET_BIT (passed, current_edge);
1549 current_edge = OUT_EDGES (child);
1551 /* This is temporary until haifa is converted to use rth's new
1552 cfg routines which have true entry/exit blocks and the
1553 appropriate edges from/to those blocks.
1555 Generally we update dfs_nr for a node when we process its
1556 out edge. However, if the node has no out edge then we will
1557 not set dfs_nr for that node. This can confuse the scheduler
1558 into thinking that we have unreachable blocks, which in turn
1559 disables cross block scheduling.
1561 So, if we have a node with no out edges, go ahead and mark it
1562 as reachable now. */
1563 if (current_edge == 0)
1564 dfs_nr[child] = ++count;
1567 /* Another check for unreachable blocks. The earlier test in
1568 is_cfg_nonregular only finds unreachable blocks that do not
1569 form a loop.
1571 The DFS traversal will mark every block that is reachable from
1572 the entry node by placing a nonzero value in dfs_nr. Thus if
1573 dfs_nr is zero for any block, then it must be unreachable. */
1574 unreachable = 0;
1575 for (i = 0; i < n_basic_blocks; i++)
1576 if (dfs_nr[i] == 0)
1578 unreachable = 1;
1579 break;
1582 /* Gross. To avoid wasting memory, the second pass uses the dfs_nr array
1583 to hold degree counts. */
1584 degree = dfs_nr;
1586 for (i = 0; i < n_basic_blocks; i++)
1587 degree[i] = 0;
1588 for (i = 0; i < num_edges; i++)
1590 edge e = INDEX_EDGE (edge_list, i);
1592 if (e->dest != EXIT_BLOCK_PTR)
1593 degree[e->dest->index]++;
1596 /* Do not perform region scheduling if there are any unreachable
1597 blocks. */
1598 if (!unreachable)
1600 int *queue;
1602 if (no_loops)
1603 SET_BIT (header, 0);
1605 /* Second travsersal:find reducible inner loops and topologically sort
1606 block of each region. */
1608 queue = (int *) xmalloc (n_basic_blocks * sizeof (int));
1610 /* Find blocks which are inner loop headers. We still have non-reducible
1611 loops to consider at this point. */
1612 for (i = 0; i < n_basic_blocks; i++)
1614 if (TEST_BIT (header, i) && TEST_BIT (inner, i))
1616 edge e;
1617 int j;
1619 /* Now check that the loop is reducible. We do this separate
1620 from finding inner loops so that we do not find a reducible
1621 loop which contains an inner non-reducible loop.
1623 A simple way to find reducible/natural loops is to verify
1624 that each block in the loop is dominated by the loop
1625 header.
1627 If there exists a block that is not dominated by the loop
1628 header, then the block is reachable from outside the loop
1629 and thus the loop is not a natural loop. */
1630 for (j = 0; j < n_basic_blocks; j++)
1632 /* First identify blocks in the loop, except for the loop
1633 entry block. */
1634 if (i == max_hdr[j] && i != j)
1636 /* Now verify that the block is dominated by the loop
1637 header. */
1638 if (!TEST_BIT (dom[j], i))
1639 break;
1643 /* If we exited the loop early, then I is the header of
1644 a non-reducible loop and we should quit processing it
1645 now. */
1646 if (j != n_basic_blocks)
1647 continue;
1649 /* I is a header of an inner loop, or block 0 in a subroutine
1650 with no loops at all. */
1651 head = tail = -1;
1652 too_large_failure = 0;
1653 loop_head = max_hdr[i];
1655 /* Decrease degree of all I's successors for topological
1656 ordering. */
1657 for (e = BASIC_BLOCK (i)->succ; e; e = e->succ_next)
1658 if (e->dest != EXIT_BLOCK_PTR)
1659 --degree[e->dest->index];
1661 /* Estimate # insns, and count # blocks in the region. */
1662 num_bbs = 1;
1663 num_insns = (INSN_LUID (BLOCK_END (i))
1664 - INSN_LUID (BLOCK_HEAD (i)));
1667 /* Find all loop latches (blocks with back edges to the loop
1668 header) or all the leaf blocks in the cfg has no loops.
1670 Place those blocks into the queue. */
1671 if (no_loops)
1673 for (j = 0; j < n_basic_blocks; j++)
1674 /* Leaf nodes have only a single successor which must
1675 be EXIT_BLOCK. */
1676 if (BASIC_BLOCK (j)->succ
1677 && BASIC_BLOCK (j)->succ->dest == EXIT_BLOCK_PTR
1678 && BASIC_BLOCK (j)->succ->succ_next == NULL)
1680 queue[++tail] = j;
1681 SET_BIT (in_queue, j);
1683 if (too_large (j, &num_bbs, &num_insns))
1685 too_large_failure = 1;
1686 break;
1690 else
1692 edge e;
1694 for (e = BASIC_BLOCK (i)->pred; e; e = e->pred_next)
1696 if (e->src == ENTRY_BLOCK_PTR)
1697 continue;
1699 node = e->src->index;
1701 if (max_hdr[node] == loop_head && node != i)
1703 /* This is a loop latch. */
1704 queue[++tail] = node;
1705 SET_BIT (in_queue, node);
1707 if (too_large (node, &num_bbs, &num_insns))
1709 too_large_failure = 1;
1710 break;
1717 /* Now add all the blocks in the loop to the queue.
1719 We know the loop is a natural loop; however the algorithm
1720 above will not always mark certain blocks as being in the
1721 loop. Consider:
1722 node children
1723 a b,c
1725 c a,d
1729 The algorithm in the DFS traversal may not mark B & D as part
1730 of the loop (ie they will not have max_hdr set to A).
1732 We know they can not be loop latches (else they would have
1733 had max_hdr set since they'd have a backedge to a dominator
1734 block). So we don't need them on the initial queue.
1736 We know they are part of the loop because they are dominated
1737 by the loop header and can be reached by a backwards walk of
1738 the edges starting with nodes on the initial queue.
1740 It is safe and desirable to include those nodes in the
1741 loop/scheduling region. To do so we would need to decrease
1742 the degree of a node if it is the target of a backedge
1743 within the loop itself as the node is placed in the queue.
1745 We do not do this because I'm not sure that the actual
1746 scheduling code will properly handle this case. ?!? */
1748 while (head < tail && !too_large_failure)
1750 edge e;
1751 child = queue[++head];
1753 for (e = BASIC_BLOCK (child)->pred; e; e = e->pred_next)
1755 node = e->src->index;
1757 /* See discussion above about nodes not marked as in
1758 this loop during the initial DFS traversal. */
1759 if (e->src == ENTRY_BLOCK_PTR
1760 || max_hdr[node] != loop_head)
1762 tail = -1;
1763 break;
1765 else if (!TEST_BIT (in_queue, node) && node != i)
1767 queue[++tail] = node;
1768 SET_BIT (in_queue, node);
1770 if (too_large (node, &num_bbs, &num_insns))
1772 too_large_failure = 1;
1773 break;
1779 if (tail >= 0 && !too_large_failure)
1781 /* Place the loop header into list of region blocks. */
1782 degree[i] = -1;
1783 rgn_bb_table[idx] = i;
1784 RGN_NR_BLOCKS (nr_regions) = num_bbs;
1785 RGN_BLOCKS (nr_regions) = idx++;
1786 CONTAINING_RGN (i) = nr_regions;
1787 BLOCK_TO_BB (i) = count = 0;
1789 /* Remove blocks from queue[] when their in degree
1790 becomes zero. Repeat until no blocks are left on the
1791 list. This produces a topological list of blocks in
1792 the region. */
1793 while (tail >= 0)
1795 if (head < 0)
1796 head = tail;
1797 child = queue[head];
1798 if (degree[child] == 0)
1800 edge e;
1802 degree[child] = -1;
1803 rgn_bb_table[idx++] = child;
1804 BLOCK_TO_BB (child) = ++count;
1805 CONTAINING_RGN (child) = nr_regions;
1806 queue[head] = queue[tail--];
1808 for (e = BASIC_BLOCK (child)->succ;
1810 e = e->succ_next)
1811 if (e->dest != EXIT_BLOCK_PTR)
1812 --degree[e->dest->index];
1814 else
1815 --head;
1817 ++nr_regions;
1821 free (queue);
1824 /* Any block that did not end up in a region is placed into a region
1825 by itself. */
1826 for (i = 0; i < n_basic_blocks; i++)
1827 if (degree[i] >= 0)
1829 rgn_bb_table[idx] = i;
1830 RGN_NR_BLOCKS (nr_regions) = 1;
1831 RGN_BLOCKS (nr_regions) = idx++;
1832 CONTAINING_RGN (i) = nr_regions++;
1833 BLOCK_TO_BB (i) = 0;
1836 free (max_hdr);
1837 free (dfs_nr);
1838 free (stack);
1839 free (passed);
1840 free (header);
1841 free (inner);
1842 free (in_queue);
1843 free (in_stack);
1847 /* Functions for regions scheduling information. */
1849 /* Compute dominators, probability, and potential-split-edges of bb.
1850 Assume that these values were already computed for bb's predecessors. */
1852 static void
1853 compute_dom_prob_ps (bb)
1854 int bb;
1856 int nxt_in_edge, fst_in_edge, pred;
1857 int fst_out_edge, nxt_out_edge, nr_out_edges, nr_rgn_out_edges;
1859 prob[bb] = 0.0;
1860 if (IS_RGN_ENTRY (bb))
1862 BITSET_ADD (dom[bb], 0, bbset_size);
1863 prob[bb] = 1.0;
1864 return;
1867 fst_in_edge = nxt_in_edge = IN_EDGES (BB_TO_BLOCK (bb));
1869 /* Intialize dom[bb] to '111..1'. */
1870 BITSET_INVERT (dom[bb], bbset_size);
1874 pred = FROM_BLOCK (nxt_in_edge);
1875 BITSET_INTER (dom[bb], dom[BLOCK_TO_BB (pred)], bbset_size);
1877 BITSET_UNION (ancestor_edges[bb], ancestor_edges[BLOCK_TO_BB (pred)],
1878 edgeset_size);
1880 BITSET_ADD (ancestor_edges[bb], EDGE_TO_BIT (nxt_in_edge), edgeset_size);
1882 nr_out_edges = 1;
1883 nr_rgn_out_edges = 0;
1884 fst_out_edge = OUT_EDGES (pred);
1885 nxt_out_edge = NEXT_OUT (fst_out_edge);
1886 BITSET_UNION (pot_split[bb], pot_split[BLOCK_TO_BB (pred)],
1887 edgeset_size);
1889 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (fst_out_edge), edgeset_size);
1891 /* The successor doesn't belong in the region? */
1892 if (CONTAINING_RGN (TO_BLOCK (fst_out_edge)) !=
1893 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1894 ++nr_rgn_out_edges;
1896 while (fst_out_edge != nxt_out_edge)
1898 ++nr_out_edges;
1899 /* The successor doesn't belong in the region? */
1900 if (CONTAINING_RGN (TO_BLOCK (nxt_out_edge)) !=
1901 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1902 ++nr_rgn_out_edges;
1903 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (nxt_out_edge), edgeset_size);
1904 nxt_out_edge = NEXT_OUT (nxt_out_edge);
1908 /* Now nr_rgn_out_edges is the number of region-exit edges from
1909 pred, and nr_out_edges will be the number of pred out edges
1910 not leaving the region. */
1911 nr_out_edges -= nr_rgn_out_edges;
1912 if (nr_rgn_out_edges > 0)
1913 prob[bb] += 0.9 * prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1914 else
1915 prob[bb] += prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1916 nxt_in_edge = NEXT_IN (nxt_in_edge);
1918 while (fst_in_edge != nxt_in_edge);
1920 BITSET_ADD (dom[bb], bb, bbset_size);
1921 BITSET_DIFFER (pot_split[bb], ancestor_edges[bb], edgeset_size);
1923 if (sched_verbose >= 2)
1924 fprintf (dump, ";; bb_prob(%d, %d) = %3d\n", bb, BB_TO_BLOCK (bb), (int) (100.0 * prob[bb]));
1925 } /* compute_dom_prob_ps */
1927 /* Functions for target info. */
1929 /* Compute in BL the list of split-edges of bb_src relatively to bb_trg.
1930 Note that bb_trg dominates bb_src. */
1932 static void
1933 split_edges (bb_src, bb_trg, bl)
1934 int bb_src;
1935 int bb_trg;
1936 edgelst *bl;
1938 int es = edgeset_size;
1939 edgeset src = (edgeset) xcalloc (es, sizeof (HOST_WIDE_INT));
1941 while (es--)
1942 src[es] = (pot_split[bb_src])[es];
1943 BITSET_DIFFER (src, pot_split[bb_trg], edgeset_size);
1944 extract_bitlst (src, edgeset_size, edgeset_bitsize, bl);
1945 free (src);
1949 /* Find the valid candidate-source-blocks for the target block TRG, compute
1950 their probability, and check if they are speculative or not.
1951 For speculative sources, compute their update-blocks and split-blocks. */
1953 static void
1954 compute_trg_info (trg)
1955 int trg;
1957 register candidate *sp;
1958 edgelst el;
1959 int check_block, update_idx;
1960 int i, j, k, fst_edge, nxt_edge;
1962 /* Define some of the fields for the target bb as well. */
1963 sp = candidate_table + trg;
1964 sp->is_valid = 1;
1965 sp->is_speculative = 0;
1966 sp->src_prob = 100;
1968 for (i = trg + 1; i < current_nr_blocks; i++)
1970 sp = candidate_table + i;
1972 sp->is_valid = IS_DOMINATED (i, trg);
1973 if (sp->is_valid)
1975 sp->src_prob = GET_SRC_PROB (i, trg);
1976 sp->is_valid = (sp->src_prob >= MIN_PROBABILITY);
1979 if (sp->is_valid)
1981 split_edges (i, trg, &el);
1982 sp->is_speculative = (el.nr_members) ? 1 : 0;
1983 if (sp->is_speculative && !flag_schedule_speculative)
1984 sp->is_valid = 0;
1987 if (sp->is_valid)
1989 sp->split_bbs.first_member = &bblst_table[bblst_last];
1990 sp->split_bbs.nr_members = el.nr_members;
1991 for (j = 0; j < el.nr_members; bblst_last++, j++)
1992 bblst_table[bblst_last] =
1993 TO_BLOCK (rgn_edges[el.first_member[j]]);
1994 sp->update_bbs.first_member = &bblst_table[bblst_last];
1995 update_idx = 0;
1996 for (j = 0; j < el.nr_members; j++)
1998 check_block = FROM_BLOCK (rgn_edges[el.first_member[j]]);
1999 fst_edge = nxt_edge = OUT_EDGES (check_block);
2002 for (k = 0; k < el.nr_members; k++)
2003 if (EDGE_TO_BIT (nxt_edge) == el.first_member[k])
2004 break;
2006 if (k >= el.nr_members)
2008 bblst_table[bblst_last++] = TO_BLOCK (nxt_edge);
2009 update_idx++;
2012 nxt_edge = NEXT_OUT (nxt_edge);
2014 while (fst_edge != nxt_edge);
2016 sp->update_bbs.nr_members = update_idx;
2019 else
2021 sp->split_bbs.nr_members = sp->update_bbs.nr_members = 0;
2023 sp->is_speculative = 0;
2024 sp->src_prob = 0;
2027 } /* compute_trg_info */
2030 /* Print candidates info, for debugging purposes. Callable from debugger. */
2032 void
2033 debug_candidate (i)
2034 int i;
2036 if (!candidate_table[i].is_valid)
2037 return;
2039 if (candidate_table[i].is_speculative)
2041 int j;
2042 fprintf (dump, "src b %d bb %d speculative \n", BB_TO_BLOCK (i), i);
2044 fprintf (dump, "split path: ");
2045 for (j = 0; j < candidate_table[i].split_bbs.nr_members; j++)
2047 int b = candidate_table[i].split_bbs.first_member[j];
2049 fprintf (dump, " %d ", b);
2051 fprintf (dump, "\n");
2053 fprintf (dump, "update path: ");
2054 for (j = 0; j < candidate_table[i].update_bbs.nr_members; j++)
2056 int b = candidate_table[i].update_bbs.first_member[j];
2058 fprintf (dump, " %d ", b);
2060 fprintf (dump, "\n");
2062 else
2064 fprintf (dump, " src %d equivalent\n", BB_TO_BLOCK (i));
2069 /* Print candidates info, for debugging purposes. Callable from debugger. */
2071 void
2072 debug_candidates (trg)
2073 int trg;
2075 int i;
2077 fprintf (dump, "----------- candidate table: target: b=%d bb=%d ---\n",
2078 BB_TO_BLOCK (trg), trg);
2079 for (i = trg + 1; i < current_nr_blocks; i++)
2080 debug_candidate (i);
2084 /* Functions for speculative scheduing. */
2086 /* Return 0 if x is a set of a register alive in the beginning of one
2087 of the split-blocks of src, otherwise return 1. */
2089 static int
2090 check_live_1 (src, x)
2091 int src;
2092 rtx x;
2094 register int i;
2095 register int regno;
2096 register rtx reg = SET_DEST (x);
2098 if (reg == 0)
2099 return 1;
2101 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2102 || GET_CODE (reg) == SIGN_EXTRACT
2103 || GET_CODE (reg) == STRICT_LOW_PART)
2104 reg = XEXP (reg, 0);
2106 if (GET_CODE (reg) == PARALLEL
2107 && GET_MODE (reg) == BLKmode)
2109 register int i;
2110 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2111 if (check_live_1 (src, XVECEXP (reg, 0, i)))
2112 return 1;
2113 return 0;
2116 if (GET_CODE (reg) != REG)
2117 return 1;
2119 regno = REGNO (reg);
2121 if (regno < FIRST_PSEUDO_REGISTER && global_regs[regno])
2123 /* Global registers are assumed live. */
2124 return 0;
2126 else
2128 if (regno < FIRST_PSEUDO_REGISTER)
2130 /* Check for hard registers. */
2131 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2132 while (--j >= 0)
2134 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2136 int b = candidate_table[src].split_bbs.first_member[i];
2138 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start,
2139 regno + j))
2141 return 0;
2146 else
2148 /* Check for psuedo registers. */
2149 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2151 int b = candidate_table[src].split_bbs.first_member[i];
2153 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start, regno))
2155 return 0;
2161 return 1;
2165 /* If x is a set of a register R, mark that R is alive in the beginning
2166 of every update-block of src. */
2168 static void
2169 update_live_1 (src, x)
2170 int src;
2171 rtx x;
2173 register int i;
2174 register int regno;
2175 register rtx reg = SET_DEST (x);
2177 if (reg == 0)
2178 return;
2180 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2181 || GET_CODE (reg) == SIGN_EXTRACT
2182 || GET_CODE (reg) == STRICT_LOW_PART)
2183 reg = XEXP (reg, 0);
2185 if (GET_CODE (reg) == PARALLEL
2186 && GET_MODE (reg) == BLKmode)
2188 register int i;
2189 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2190 update_live_1 (src, XVECEXP (reg, 0, i));
2191 return;
2194 if (GET_CODE (reg) != REG)
2195 return;
2197 /* Global registers are always live, so the code below does not apply
2198 to them. */
2200 regno = REGNO (reg);
2202 if (regno >= FIRST_PSEUDO_REGISTER || !global_regs[regno])
2204 if (regno < FIRST_PSEUDO_REGISTER)
2206 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2207 while (--j >= 0)
2209 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2211 int b = candidate_table[src].update_bbs.first_member[i];
2213 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start,
2214 regno + j);
2218 else
2220 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2222 int b = candidate_table[src].update_bbs.first_member[i];
2224 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start, regno);
2231 /* Return 1 if insn can be speculatively moved from block src to trg,
2232 otherwise return 0. Called before first insertion of insn to
2233 ready-list or before the scheduling. */
2235 static int
2236 check_live (insn, src)
2237 rtx insn;
2238 int src;
2240 /* Find the registers set by instruction. */
2241 if (GET_CODE (PATTERN (insn)) == SET
2242 || GET_CODE (PATTERN (insn)) == CLOBBER)
2243 return check_live_1 (src, PATTERN (insn));
2244 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2246 int j;
2247 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2248 if ((GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2249 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2250 && !check_live_1 (src, XVECEXP (PATTERN (insn), 0, j)))
2251 return 0;
2253 return 1;
2256 return 1;
2260 /* Update the live registers info after insn was moved speculatively from
2261 block src to trg. */
2263 static void
2264 update_live (insn, src)
2265 rtx insn;
2266 int src;
2268 /* Find the registers set by instruction. */
2269 if (GET_CODE (PATTERN (insn)) == SET
2270 || GET_CODE (PATTERN (insn)) == CLOBBER)
2271 update_live_1 (src, PATTERN (insn));
2272 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2274 int j;
2275 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2276 if (GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2277 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2278 update_live_1 (src, XVECEXP (PATTERN (insn), 0, j));
2282 /* Exception Free Loads:
2284 We define five classes of speculative loads: IFREE, IRISKY,
2285 PFREE, PRISKY, and MFREE.
2287 IFREE loads are loads that are proved to be exception-free, just
2288 by examining the load insn. Examples for such loads are loads
2289 from TOC and loads of global data.
2291 IRISKY loads are loads that are proved to be exception-risky,
2292 just by examining the load insn. Examples for such loads are
2293 volatile loads and loads from shared memory.
2295 PFREE loads are loads for which we can prove, by examining other
2296 insns, that they are exception-free. Currently, this class consists
2297 of loads for which we are able to find a "similar load", either in
2298 the target block, or, if only one split-block exists, in that split
2299 block. Load2 is similar to load1 if both have same single base
2300 register. We identify only part of the similar loads, by finding
2301 an insn upon which both load1 and load2 have a DEF-USE dependence.
2303 PRISKY loads are loads for which we can prove, by examining other
2304 insns, that they are exception-risky. Currently we have two proofs for
2305 such loads. The first proof detects loads that are probably guarded by a
2306 test on the memory address. This proof is based on the
2307 backward and forward data dependence information for the region.
2308 Let load-insn be the examined load.
2309 Load-insn is PRISKY iff ALL the following hold:
2311 - insn1 is not in the same block as load-insn
2312 - there is a DEF-USE dependence chain (insn1, ..., load-insn)
2313 - test-insn is either a compare or a branch, not in the same block
2314 as load-insn
2315 - load-insn is reachable from test-insn
2316 - there is a DEF-USE dependence chain (insn1, ..., test-insn)
2318 This proof might fail when the compare and the load are fed
2319 by an insn not in the region. To solve this, we will add to this
2320 group all loads that have no input DEF-USE dependence.
2322 The second proof detects loads that are directly or indirectly
2323 fed by a speculative load. This proof is affected by the
2324 scheduling process. We will use the flag fed_by_spec_load.
2325 Initially, all insns have this flag reset. After a speculative
2326 motion of an insn, if insn is either a load, or marked as
2327 fed_by_spec_load, we will also mark as fed_by_spec_load every
2328 insn1 for which a DEF-USE dependence (insn, insn1) exists. A
2329 load which is fed_by_spec_load is also PRISKY.
2331 MFREE (maybe-free) loads are all the remaining loads. They may be
2332 exception-free, but we cannot prove it.
2334 Now, all loads in IFREE and PFREE classes are considered
2335 exception-free, while all loads in IRISKY and PRISKY classes are
2336 considered exception-risky. As for loads in the MFREE class,
2337 these are considered either exception-free or exception-risky,
2338 depending on whether we are pessimistic or optimistic. We have
2339 to take the pessimistic approach to assure the safety of
2340 speculative scheduling, but we can take the optimistic approach
2341 by invoking the -fsched_spec_load_dangerous option. */
2343 enum INSN_TRAP_CLASS
2345 TRAP_FREE = 0, IFREE = 1, PFREE_CANDIDATE = 2,
2346 PRISKY_CANDIDATE = 3, IRISKY = 4, TRAP_RISKY = 5
2349 #define WORST_CLASS(class1, class2) \
2350 ((class1 > class2) ? class1 : class2)
2352 /* Non-zero if block bb_to is equal to, or reachable from block bb_from. */
2353 #define IS_REACHABLE(bb_from, bb_to) \
2354 (bb_from == bb_to \
2355 || IS_RGN_ENTRY (bb_from) \
2356 || (bitset_member (ancestor_edges[bb_to], \
2357 EDGE_TO_BIT (IN_EDGES (BB_TO_BLOCK (bb_from))), \
2358 edgeset_size)))
2360 /* Non-zero iff the address is comprised from at most 1 register. */
2361 #define CONST_BASED_ADDRESS_P(x) \
2362 (GET_CODE (x) == REG \
2363 || ((GET_CODE (x) == PLUS || GET_CODE (x) == MINUS \
2364 || (GET_CODE (x) == LO_SUM)) \
2365 && (GET_CODE (XEXP (x, 0)) == CONST_INT \
2366 || GET_CODE (XEXP (x, 1)) == CONST_INT)))
2368 /* Turns on the fed_by_spec_load flag for insns fed by load_insn. */
2370 static void
2371 set_spec_fed (load_insn)
2372 rtx load_insn;
2374 rtx link;
2376 for (link = INSN_DEPEND (load_insn); link; link = XEXP (link, 1))
2377 if (GET_MODE (link) == VOIDmode)
2378 FED_BY_SPEC_LOAD (XEXP (link, 0)) = 1;
2379 } /* set_spec_fed */
2381 /* On the path from the insn to load_insn_bb, find a conditional
2382 branch depending on insn, that guards the speculative load. */
2384 static int
2385 find_conditional_protection (insn, load_insn_bb)
2386 rtx insn;
2387 int load_insn_bb;
2389 rtx link;
2391 /* Iterate through DEF-USE forward dependences. */
2392 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
2394 rtx next = XEXP (link, 0);
2395 if ((CONTAINING_RGN (BLOCK_NUM (next)) ==
2396 CONTAINING_RGN (BB_TO_BLOCK (load_insn_bb)))
2397 && IS_REACHABLE (INSN_BB (next), load_insn_bb)
2398 && load_insn_bb != INSN_BB (next)
2399 && GET_MODE (link) == VOIDmode
2400 && (GET_CODE (next) == JUMP_INSN
2401 || find_conditional_protection (next, load_insn_bb)))
2402 return 1;
2404 return 0;
2405 } /* find_conditional_protection */
2407 /* Returns 1 if the same insn1 that participates in the computation
2408 of load_insn's address is feeding a conditional branch that is
2409 guarding on load_insn. This is true if we find a the two DEF-USE
2410 chains:
2411 insn1 -> ... -> conditional-branch
2412 insn1 -> ... -> load_insn,
2413 and if a flow path exist:
2414 insn1 -> ... -> conditional-branch -> ... -> load_insn,
2415 and if insn1 is on the path
2416 region-entry -> ... -> bb_trg -> ... load_insn.
2418 Locate insn1 by climbing on LOG_LINKS from load_insn.
2419 Locate the branch by following INSN_DEPEND from insn1. */
2421 static int
2422 is_conditionally_protected (load_insn, bb_src, bb_trg)
2423 rtx load_insn;
2424 int bb_src, bb_trg;
2426 rtx link;
2428 for (link = LOG_LINKS (load_insn); link; link = XEXP (link, 1))
2430 rtx insn1 = XEXP (link, 0);
2432 /* Must be a DEF-USE dependence upon non-branch. */
2433 if (GET_MODE (link) != VOIDmode
2434 || GET_CODE (insn1) == JUMP_INSN)
2435 continue;
2437 /* Must exist a path: region-entry -> ... -> bb_trg -> ... load_insn. */
2438 if (INSN_BB (insn1) == bb_src
2439 || (CONTAINING_RGN (BLOCK_NUM (insn1))
2440 != CONTAINING_RGN (BB_TO_BLOCK (bb_src)))
2441 || (!IS_REACHABLE (bb_trg, INSN_BB (insn1))
2442 && !IS_REACHABLE (INSN_BB (insn1), bb_trg)))
2443 continue;
2445 /* Now search for the conditional-branch. */
2446 if (find_conditional_protection (insn1, bb_src))
2447 return 1;
2449 /* Recursive step: search another insn1, "above" current insn1. */
2450 return is_conditionally_protected (insn1, bb_src, bb_trg);
2453 /* The chain does not exist. */
2454 return 0;
2455 } /* is_conditionally_protected */
2457 /* Returns 1 if a clue for "similar load" 'insn2' is found, and hence
2458 load_insn can move speculatively from bb_src to bb_trg. All the
2459 following must hold:
2461 (1) both loads have 1 base register (PFREE_CANDIDATEs).
2462 (2) load_insn and load1 have a def-use dependence upon
2463 the same insn 'insn1'.
2464 (3) either load2 is in bb_trg, or:
2465 - there's only one split-block, and
2466 - load1 is on the escape path, and
2468 From all these we can conclude that the two loads access memory
2469 addresses that differ at most by a constant, and hence if moving
2470 load_insn would cause an exception, it would have been caused by
2471 load2 anyhow. */
2473 static int
2474 is_pfree (load_insn, bb_src, bb_trg)
2475 rtx load_insn;
2476 int bb_src, bb_trg;
2478 rtx back_link;
2479 register candidate *candp = candidate_table + bb_src;
2481 if (candp->split_bbs.nr_members != 1)
2482 /* Must have exactly one escape block. */
2483 return 0;
2485 for (back_link = LOG_LINKS (load_insn);
2486 back_link; back_link = XEXP (back_link, 1))
2488 rtx insn1 = XEXP (back_link, 0);
2490 if (GET_MODE (back_link) == VOIDmode)
2492 /* Found a DEF-USE dependence (insn1, load_insn). */
2493 rtx fore_link;
2495 for (fore_link = INSN_DEPEND (insn1);
2496 fore_link; fore_link = XEXP (fore_link, 1))
2498 rtx insn2 = XEXP (fore_link, 0);
2499 if (GET_MODE (fore_link) == VOIDmode)
2501 /* Found a DEF-USE dependence (insn1, insn2). */
2502 if (haifa_classify_insn (insn2) != PFREE_CANDIDATE)
2503 /* insn2 not guaranteed to be a 1 base reg load. */
2504 continue;
2506 if (INSN_BB (insn2) == bb_trg)
2507 /* insn2 is the similar load, in the target block. */
2508 return 1;
2510 if (*(candp->split_bbs.first_member) == BLOCK_NUM (insn2))
2511 /* insn2 is a similar load, in a split-block. */
2512 return 1;
2518 /* Couldn't find a similar load. */
2519 return 0;
2520 } /* is_pfree */
2522 /* Returns a class that insn with GET_DEST(insn)=x may belong to,
2523 as found by analyzing insn's expression. */
2525 static int
2526 may_trap_exp (x, is_store)
2527 rtx x;
2528 int is_store;
2530 enum rtx_code code;
2532 if (x == 0)
2533 return TRAP_FREE;
2534 code = GET_CODE (x);
2535 if (is_store)
2537 if (code == MEM)
2538 return TRAP_RISKY;
2539 else
2540 return TRAP_FREE;
2542 if (code == MEM)
2544 /* The insn uses memory: a volatile load. */
2545 if (MEM_VOLATILE_P (x))
2546 return IRISKY;
2547 /* An exception-free load. */
2548 if (!may_trap_p (x))
2549 return IFREE;
2550 /* A load with 1 base register, to be further checked. */
2551 if (CONST_BASED_ADDRESS_P (XEXP (x, 0)))
2552 return PFREE_CANDIDATE;
2553 /* No info on the load, to be further checked. */
2554 return PRISKY_CANDIDATE;
2556 else
2558 const char *fmt;
2559 int i, insn_class = TRAP_FREE;
2561 /* Neither store nor load, check if it may cause a trap. */
2562 if (may_trap_p (x))
2563 return TRAP_RISKY;
2564 /* Recursive step: walk the insn... */
2565 fmt = GET_RTX_FORMAT (code);
2566 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
2568 if (fmt[i] == 'e')
2570 int tmp_class = may_trap_exp (XEXP (x, i), is_store);
2571 insn_class = WORST_CLASS (insn_class, tmp_class);
2573 else if (fmt[i] == 'E')
2575 int j;
2576 for (j = 0; j < XVECLEN (x, i); j++)
2578 int tmp_class = may_trap_exp (XVECEXP (x, i, j), is_store);
2579 insn_class = WORST_CLASS (insn_class, tmp_class);
2580 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2581 break;
2584 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2585 break;
2587 return insn_class;
2589 } /* may_trap_exp */
2592 /* Classifies insn for the purpose of verifying that it can be
2593 moved speculatively, by examining it's patterns, returning:
2594 TRAP_RISKY: store, or risky non-load insn (e.g. division by variable).
2595 TRAP_FREE: non-load insn.
2596 IFREE: load from a globaly safe location.
2597 IRISKY: volatile load.
2598 PFREE_CANDIDATE, PRISKY_CANDIDATE: load that need to be checked for
2599 being either PFREE or PRISKY. */
2601 static int
2602 haifa_classify_insn (insn)
2603 rtx insn;
2605 rtx pat = PATTERN (insn);
2606 int tmp_class = TRAP_FREE;
2607 int insn_class = TRAP_FREE;
2608 enum rtx_code code;
2610 if (GET_CODE (pat) == PARALLEL)
2612 int i, len = XVECLEN (pat, 0);
2614 for (i = len - 1; i >= 0; i--)
2616 code = GET_CODE (XVECEXP (pat, 0, i));
2617 switch (code)
2619 case CLOBBER:
2620 /* Test if it is a 'store'. */
2621 tmp_class = may_trap_exp (XEXP (XVECEXP (pat, 0, i), 0), 1);
2622 break;
2623 case SET:
2624 /* Test if it is a store. */
2625 tmp_class = may_trap_exp (SET_DEST (XVECEXP (pat, 0, i)), 1);
2626 if (tmp_class == TRAP_RISKY)
2627 break;
2628 /* Test if it is a load. */
2629 tmp_class =
2630 WORST_CLASS (tmp_class,
2631 may_trap_exp (SET_SRC (XVECEXP (pat, 0, i)), 0));
2632 break;
2633 case COND_EXEC:
2634 case TRAP_IF:
2635 tmp_class = TRAP_RISKY;
2636 break;
2637 default:;
2639 insn_class = WORST_CLASS (insn_class, tmp_class);
2640 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2641 break;
2644 else
2646 code = GET_CODE (pat);
2647 switch (code)
2649 case CLOBBER:
2650 /* Test if it is a 'store'. */
2651 tmp_class = may_trap_exp (XEXP (pat, 0), 1);
2652 break;
2653 case SET:
2654 /* Test if it is a store. */
2655 tmp_class = may_trap_exp (SET_DEST (pat), 1);
2656 if (tmp_class == TRAP_RISKY)
2657 break;
2658 /* Test if it is a load. */
2659 tmp_class =
2660 WORST_CLASS (tmp_class,
2661 may_trap_exp (SET_SRC (pat), 0));
2662 break;
2663 case COND_EXEC:
2664 case TRAP_IF:
2665 tmp_class = TRAP_RISKY;
2666 break;
2667 default:;
2669 insn_class = tmp_class;
2672 return insn_class;
2674 } /* haifa_classify_insn */
2676 /* Return 1 if load_insn is prisky (i.e. if load_insn is fed by
2677 a load moved speculatively, or if load_insn is protected by
2678 a compare on load_insn's address). */
2680 static int
2681 is_prisky (load_insn, bb_src, bb_trg)
2682 rtx load_insn;
2683 int bb_src, bb_trg;
2685 if (FED_BY_SPEC_LOAD (load_insn))
2686 return 1;
2688 if (LOG_LINKS (load_insn) == NULL)
2689 /* Dependence may 'hide' out of the region. */
2690 return 1;
2692 if (is_conditionally_protected (load_insn, bb_src, bb_trg))
2693 return 1;
2695 return 0;
2696 } /* is_prisky */
2698 /* Insn is a candidate to be moved speculatively from bb_src to bb_trg.
2699 Return 1 if insn is exception-free (and the motion is valid)
2700 and 0 otherwise. */
2702 static int
2703 is_exception_free (insn, bb_src, bb_trg)
2704 rtx insn;
2705 int bb_src, bb_trg;
2707 int insn_class = haifa_classify_insn (insn);
2709 /* Handle non-load insns. */
2710 switch (insn_class)
2712 case TRAP_FREE:
2713 return 1;
2714 case TRAP_RISKY:
2715 return 0;
2716 default:;
2719 /* Handle loads. */
2720 if (!flag_schedule_speculative_load)
2721 return 0;
2722 IS_LOAD_INSN (insn) = 1;
2723 switch (insn_class)
2725 case IFREE:
2726 return (1);
2727 case IRISKY:
2728 return 0;
2729 case PFREE_CANDIDATE:
2730 if (is_pfree (insn, bb_src, bb_trg))
2731 return 1;
2732 /* Don't 'break' here: PFREE-candidate is also PRISKY-candidate. */
2733 case PRISKY_CANDIDATE:
2734 if (!flag_schedule_speculative_load_dangerous
2735 || is_prisky (insn, bb_src, bb_trg))
2736 return 0;
2737 break;
2738 default:;
2741 return flag_schedule_speculative_load_dangerous;
2742 } /* is_exception_free */
2745 /* Process an insn's memory dependencies. There are four kinds of
2746 dependencies:
2748 (0) read dependence: read follows read
2749 (1) true dependence: read follows write
2750 (2) anti dependence: write follows read
2751 (3) output dependence: write follows write
2753 We are careful to build only dependencies which actually exist, and
2754 use transitivity to avoid building too many links. */
2756 /* Return 1 if the pair (insn, x) is found in (LIST, LIST1), or 0
2757 otherwise. */
2759 HAIFA_INLINE static char
2760 find_insn_mem_list (insn, x, list, list1)
2761 rtx insn, x;
2762 rtx list, list1;
2764 while (list)
2766 if (XEXP (list, 0) == insn
2767 && XEXP (list1, 0) == x)
2768 return 1;
2769 list = XEXP (list, 1);
2770 list1 = XEXP (list1, 1);
2772 return 0;
2776 /* Compute the function units used by INSN. This caches the value
2777 returned by function_units_used. A function unit is encoded as the
2778 unit number if the value is non-negative and the compliment of a
2779 mask if the value is negative. A function unit index is the
2780 non-negative encoding. */
2782 HAIFA_INLINE static int
2783 insn_unit (insn)
2784 rtx insn;
2786 register int unit = INSN_UNIT (insn);
2788 if (unit == 0)
2790 recog_memoized (insn);
2792 /* A USE insn, or something else we don't need to understand.
2793 We can't pass these directly to function_units_used because it will
2794 trigger a fatal error for unrecognizable insns. */
2795 if (INSN_CODE (insn) < 0)
2796 unit = -1;
2797 else
2799 unit = function_units_used (insn);
2800 /* Increment non-negative values so we can cache zero. */
2801 if (unit >= 0)
2802 unit++;
2804 /* We only cache 16 bits of the result, so if the value is out of
2805 range, don't cache it. */
2806 if (FUNCTION_UNITS_SIZE < HOST_BITS_PER_SHORT
2807 || unit >= 0
2808 || (unit & ~((1 << (HOST_BITS_PER_SHORT - 1)) - 1)) == 0)
2809 INSN_UNIT (insn) = unit;
2811 return (unit > 0 ? unit - 1 : unit);
2814 /* Compute the blockage range for executing INSN on UNIT. This caches
2815 the value returned by the blockage_range_function for the unit.
2816 These values are encoded in an int where the upper half gives the
2817 minimum value and the lower half gives the maximum value. */
2819 HAIFA_INLINE static unsigned int
2820 blockage_range (unit, insn)
2821 int unit;
2822 rtx insn;
2824 unsigned int blockage = INSN_BLOCKAGE (insn);
2825 unsigned int range;
2827 if ((int) UNIT_BLOCKED (blockage) != unit + 1)
2829 range = function_units[unit].blockage_range_function (insn);
2830 /* We only cache the blockage range for one unit and then only if
2831 the values fit. */
2832 if (HOST_BITS_PER_INT >= UNIT_BITS + 2 * BLOCKAGE_BITS)
2833 INSN_BLOCKAGE (insn) = ENCODE_BLOCKAGE (unit + 1, range);
2835 else
2836 range = BLOCKAGE_RANGE (blockage);
2838 return range;
2841 /* A vector indexed by function unit instance giving the last insn to use
2842 the unit. The value of the function unit instance index for unit U
2843 instance I is (U + I * FUNCTION_UNITS_SIZE). */
2844 static rtx unit_last_insn[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2846 /* A vector indexed by function unit instance giving the minimum time when
2847 the unit will unblock based on the maximum blockage cost. */
2848 static int unit_tick[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2850 /* A vector indexed by function unit number giving the number of insns
2851 that remain to use the unit. */
2852 static int unit_n_insns[FUNCTION_UNITS_SIZE];
2854 /* Reset the function unit state to the null state. */
2856 static void
2857 clear_units ()
2859 bzero ((char *) unit_last_insn, sizeof (unit_last_insn));
2860 bzero ((char *) unit_tick, sizeof (unit_tick));
2861 bzero ((char *) unit_n_insns, sizeof (unit_n_insns));
2864 /* Return the issue-delay of an insn. */
2866 HAIFA_INLINE static int
2867 insn_issue_delay (insn)
2868 rtx insn;
2870 int i, delay = 0;
2871 int unit = insn_unit (insn);
2873 /* Efficiency note: in fact, we are working 'hard' to compute a
2874 value that was available in md file, and is not available in
2875 function_units[] structure. It would be nice to have this
2876 value there, too. */
2877 if (unit >= 0)
2879 if (function_units[unit].blockage_range_function &&
2880 function_units[unit].blockage_function)
2881 delay = function_units[unit].blockage_function (insn, insn);
2883 else
2884 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2885 if ((unit & 1) != 0 && function_units[i].blockage_range_function
2886 && function_units[i].blockage_function)
2887 delay = MAX (delay, function_units[i].blockage_function (insn, insn));
2889 return delay;
2892 /* Return the actual hazard cost of executing INSN on the unit UNIT,
2893 instance INSTANCE at time CLOCK if the previous actual hazard cost
2894 was COST. */
2896 HAIFA_INLINE static int
2897 actual_hazard_this_instance (unit, instance, insn, clock, cost)
2898 int unit, instance, clock, cost;
2899 rtx insn;
2901 int tick = unit_tick[instance]; /* Issue time of the last issued insn. */
2903 if (tick - clock > cost)
2905 /* The scheduler is operating forward, so unit's last insn is the
2906 executing insn and INSN is the candidate insn. We want a
2907 more exact measure of the blockage if we execute INSN at CLOCK
2908 given when we committed the execution of the unit's last insn.
2910 The blockage value is given by either the unit's max blockage
2911 constant, blockage range function, or blockage function. Use
2912 the most exact form for the given unit. */
2914 if (function_units[unit].blockage_range_function)
2916 if (function_units[unit].blockage_function)
2917 tick += (function_units[unit].blockage_function
2918 (unit_last_insn[instance], insn)
2919 - function_units[unit].max_blockage);
2920 else
2921 tick += ((int) MAX_BLOCKAGE_COST (blockage_range (unit, insn))
2922 - function_units[unit].max_blockage);
2924 if (tick - clock > cost)
2925 cost = tick - clock;
2927 return cost;
2930 /* Record INSN as having begun execution on the units encoded by UNIT at
2931 time CLOCK. */
2933 HAIFA_INLINE static void
2934 schedule_unit (unit, insn, clock)
2935 int unit, clock;
2936 rtx insn;
2938 int i;
2940 if (unit >= 0)
2942 int instance = unit;
2943 #if MAX_MULTIPLICITY > 1
2944 /* Find the first free instance of the function unit and use that
2945 one. We assume that one is free. */
2946 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2948 if (!actual_hazard_this_instance (unit, instance, insn, clock, 0))
2949 break;
2950 instance += FUNCTION_UNITS_SIZE;
2952 #endif
2953 unit_last_insn[instance] = insn;
2954 unit_tick[instance] = (clock + function_units[unit].max_blockage);
2956 else
2957 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2958 if ((unit & 1) != 0)
2959 schedule_unit (i, insn, clock);
2962 /* Return the actual hazard cost of executing INSN on the units encoded by
2963 UNIT at time CLOCK if the previous actual hazard cost was COST. */
2965 HAIFA_INLINE static int
2966 actual_hazard (unit, insn, clock, cost)
2967 int unit, clock, cost;
2968 rtx insn;
2970 int i;
2972 if (unit >= 0)
2974 /* Find the instance of the function unit with the minimum hazard. */
2975 int instance = unit;
2976 int best_cost = actual_hazard_this_instance (unit, instance, insn,
2977 clock, cost);
2978 #if MAX_MULTIPLICITY > 1
2979 int this_cost;
2981 if (best_cost > cost)
2983 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2985 instance += FUNCTION_UNITS_SIZE;
2986 this_cost = actual_hazard_this_instance (unit, instance, insn,
2987 clock, cost);
2988 if (this_cost < best_cost)
2990 best_cost = this_cost;
2991 if (this_cost <= cost)
2992 break;
2996 #endif
2997 cost = MAX (cost, best_cost);
2999 else
3000 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
3001 if ((unit & 1) != 0)
3002 cost = actual_hazard (i, insn, clock, cost);
3004 return cost;
3007 /* Return the potential hazard cost of executing an instruction on the
3008 units encoded by UNIT if the previous potential hazard cost was COST.
3009 An insn with a large blockage time is chosen in preference to one
3010 with a smaller time; an insn that uses a unit that is more likely
3011 to be used is chosen in preference to one with a unit that is less
3012 used. We are trying to minimize a subsequent actual hazard. */
3014 HAIFA_INLINE static int
3015 potential_hazard (unit, insn, cost)
3016 int unit, cost;
3017 rtx insn;
3019 int i, ncost;
3020 unsigned int minb, maxb;
3022 if (unit >= 0)
3024 minb = maxb = function_units[unit].max_blockage;
3025 if (maxb > 1)
3027 if (function_units[unit].blockage_range_function)
3029 maxb = minb = blockage_range (unit, insn);
3030 maxb = MAX_BLOCKAGE_COST (maxb);
3031 minb = MIN_BLOCKAGE_COST (minb);
3034 if (maxb > 1)
3036 /* Make the number of instructions left dominate. Make the
3037 minimum delay dominate the maximum delay. If all these
3038 are the same, use the unit number to add an arbitrary
3039 ordering. Other terms can be added. */
3040 ncost = minb * 0x40 + maxb;
3041 ncost *= (unit_n_insns[unit] - 1) * 0x1000 + unit;
3042 if (ncost > cost)
3043 cost = ncost;
3047 else
3048 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
3049 if ((unit & 1) != 0)
3050 cost = potential_hazard (i, insn, cost);
3052 return cost;
3055 /* Compute cost of executing INSN given the dependence LINK on the insn USED.
3056 This is the number of cycles between instruction issue and
3057 instruction results. */
3059 HAIFA_INLINE static int
3060 insn_cost (insn, link, used)
3061 rtx insn, link, used;
3063 register int cost = INSN_COST (insn);
3065 if (cost == 0)
3067 recog_memoized (insn);
3069 /* A USE insn, or something else we don't need to understand.
3070 We can't pass these directly to result_ready_cost because it will
3071 trigger a fatal error for unrecognizable insns. */
3072 if (INSN_CODE (insn) < 0)
3074 INSN_COST (insn) = 1;
3075 return 1;
3077 else
3079 cost = result_ready_cost (insn);
3081 if (cost < 1)
3082 cost = 1;
3084 INSN_COST (insn) = cost;
3088 /* In this case estimate cost without caring how insn is used. */
3089 if (link == 0 && used == 0)
3090 return cost;
3092 /* A USE insn should never require the value used to be computed. This
3093 allows the computation of a function's result and parameter values to
3094 overlap the return and call. */
3095 recog_memoized (used);
3096 if (INSN_CODE (used) < 0)
3097 LINK_COST_FREE (link) = 1;
3099 /* If some dependencies vary the cost, compute the adjustment. Most
3100 commonly, the adjustment is complete: either the cost is ignored
3101 (in the case of an output- or anti-dependence), or the cost is
3102 unchanged. These values are cached in the link as LINK_COST_FREE
3103 and LINK_COST_ZERO. */
3105 if (LINK_COST_FREE (link))
3106 cost = 0;
3107 #ifdef ADJUST_COST
3108 else if (!LINK_COST_ZERO (link))
3110 int ncost = cost;
3112 ADJUST_COST (used, link, insn, ncost);
3113 if (ncost < 1)
3115 LINK_COST_FREE (link) = 1;
3116 ncost = 0;
3118 if (cost == ncost)
3119 LINK_COST_ZERO (link) = 1;
3120 cost = ncost;
3122 #endif
3123 return cost;
3126 /* Compute the priority number for INSN. */
3128 static int
3129 priority (insn)
3130 rtx insn;
3132 int this_priority;
3133 rtx link;
3135 if (! INSN_P (insn))
3136 return 0;
3138 if ((this_priority = INSN_PRIORITY (insn)) == 0)
3140 if (INSN_DEPEND (insn) == 0)
3141 this_priority = insn_cost (insn, 0, 0);
3142 else
3143 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
3145 rtx next;
3146 int next_priority;
3148 if (RTX_INTEGRATED_P (link))
3149 continue;
3151 next = XEXP (link, 0);
3153 /* Critical path is meaningful in block boundaries only. */
3154 if (BLOCK_NUM (next) != BLOCK_NUM (insn))
3155 continue;
3157 next_priority = insn_cost (insn, link, next) + priority (next);
3158 if (next_priority > this_priority)
3159 this_priority = next_priority;
3161 INSN_PRIORITY (insn) = this_priority;
3163 return this_priority;
3167 /* Remove all INSN_LISTs and EXPR_LISTs from the pending lists and add
3168 them to the unused_*_list variables, so that they can be reused. */
3170 static void
3171 free_pending_lists ()
3173 int bb;
3175 for (bb = 0; bb < current_nr_blocks; bb++)
3177 free_INSN_LIST_list (&bb_deps[bb].pending_read_insns);
3178 free_INSN_LIST_list (&bb_deps[bb].pending_write_insns);
3179 free_EXPR_LIST_list (&bb_deps[bb].pending_read_mems);
3180 free_EXPR_LIST_list (&bb_deps[bb].pending_write_mems);
3184 /* Add an INSN and MEM reference pair to a pending INSN_LIST and MEM_LIST.
3185 The MEM is a memory reference contained within INSN, which we are saving
3186 so that we can do memory aliasing on it. */
3188 static void
3189 add_insn_mem_dependence (deps, insn_list, mem_list, insn, mem)
3190 struct deps *deps;
3191 rtx *insn_list, *mem_list, insn, mem;
3193 register rtx link;
3195 link = alloc_INSN_LIST (insn, *insn_list);
3196 *insn_list = link;
3198 link = alloc_EXPR_LIST (VOIDmode, mem, *mem_list);
3199 *mem_list = link;
3201 deps->pending_lists_length++;
3204 /* Make a dependency between every memory reference on the pending lists
3205 and INSN, thus flushing the pending lists. If ONLY_WRITE, don't flush
3206 the read list. */
3208 static void
3209 flush_pending_lists (deps, insn, only_write)
3210 struct deps *deps;
3211 rtx insn;
3212 int only_write;
3214 rtx u;
3215 rtx link;
3217 while (deps->pending_read_insns && ! only_write)
3219 add_dependence (insn, XEXP (deps->pending_read_insns, 0),
3220 REG_DEP_ANTI);
3222 link = deps->pending_read_insns;
3223 deps->pending_read_insns = XEXP (deps->pending_read_insns, 1);
3224 free_INSN_LIST_node (link);
3226 link = deps->pending_read_mems;
3227 deps->pending_read_mems = XEXP (deps->pending_read_mems, 1);
3228 free_EXPR_LIST_node (link);
3230 while (deps->pending_write_insns)
3232 add_dependence (insn, XEXP (deps->pending_write_insns, 0),
3233 REG_DEP_ANTI);
3235 link = deps->pending_write_insns;
3236 deps->pending_write_insns = XEXP (deps->pending_write_insns, 1);
3237 free_INSN_LIST_node (link);
3239 link = deps->pending_write_mems;
3240 deps->pending_write_mems = XEXP (deps->pending_write_mems, 1);
3241 free_EXPR_LIST_node (link);
3243 deps->pending_lists_length = 0;
3245 /* last_pending_memory_flush is now a list of insns. */
3246 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3247 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3249 free_INSN_LIST_list (&deps->last_pending_memory_flush);
3250 deps->last_pending_memory_flush = alloc_INSN_LIST (insn, NULL_RTX);
3253 /* Analyze a single SET, CLOBBER, PRE_DEC, POST_DEC, PRE_INC or POST_INC
3254 rtx, X, creating all dependencies generated by the write to the
3255 destination of X, and reads of everything mentioned. */
3257 static void
3258 sched_analyze_1 (deps, x, insn)
3259 struct deps *deps;
3260 rtx x;
3261 rtx insn;
3263 register int regno;
3264 register rtx dest = XEXP (x, 0);
3265 enum rtx_code code = GET_CODE (x);
3267 if (dest == 0)
3268 return;
3270 if (GET_CODE (dest) == PARALLEL
3271 && GET_MODE (dest) == BLKmode)
3273 register int i;
3274 for (i = XVECLEN (dest, 0) - 1; i >= 0; i--)
3275 sched_analyze_1 (deps, XVECEXP (dest, 0, i), insn);
3276 if (GET_CODE (x) == SET)
3277 sched_analyze_2 (deps, SET_SRC (x), insn);
3278 return;
3281 while (GET_CODE (dest) == STRICT_LOW_PART || GET_CODE (dest) == SUBREG
3282 || GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3284 if (GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3286 /* The second and third arguments are values read by this insn. */
3287 sched_analyze_2 (deps, XEXP (dest, 1), insn);
3288 sched_analyze_2 (deps, XEXP (dest, 2), insn);
3290 dest = XEXP (dest, 0);
3293 if (GET_CODE (dest) == REG)
3295 register int i;
3297 regno = REGNO (dest);
3299 /* A hard reg in a wide mode may really be multiple registers.
3300 If so, mark all of them just like the first. */
3301 if (regno < FIRST_PSEUDO_REGISTER)
3303 i = HARD_REGNO_NREGS (regno, GET_MODE (dest));
3304 while (--i >= 0)
3306 int r = regno + i;
3307 rtx u;
3309 for (u = deps->reg_last_uses[r]; u; u = XEXP (u, 1))
3310 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3312 for (u = deps->reg_last_sets[r]; u; u = XEXP (u, 1))
3313 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3315 /* Clobbers need not be ordered with respect to one
3316 another, but sets must be ordered with respect to a
3317 pending clobber. */
3318 if (code == SET)
3320 free_INSN_LIST_list (&deps->reg_last_uses[r]);
3321 for (u = deps->reg_last_clobbers[r]; u; u = XEXP (u, 1))
3322 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3323 SET_REGNO_REG_SET (reg_pending_sets, r);
3325 else
3326 SET_REGNO_REG_SET (reg_pending_clobbers, r);
3328 /* Function calls clobber all call_used regs. */
3329 if (global_regs[r] || (code == SET && call_used_regs[r]))
3330 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3331 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3334 else
3336 rtx u;
3338 for (u = deps->reg_last_uses[regno]; u; u = XEXP (u, 1))
3339 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3341 for (u = deps->reg_last_sets[regno]; u; u = XEXP (u, 1))
3342 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3344 if (code == SET)
3346 free_INSN_LIST_list (&deps->reg_last_uses[regno]);
3347 for (u = deps->reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3348 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3349 SET_REGNO_REG_SET (reg_pending_sets, regno);
3351 else
3352 SET_REGNO_REG_SET (reg_pending_clobbers, regno);
3354 /* Pseudos that are REG_EQUIV to something may be replaced
3355 by that during reloading. We need only add dependencies for
3356 the address in the REG_EQUIV note. */
3357 if (!reload_completed
3358 && reg_known_equiv_p[regno]
3359 && GET_CODE (reg_known_value[regno]) == MEM)
3360 sched_analyze_2 (deps, XEXP (reg_known_value[regno], 0), insn);
3362 /* Don't let it cross a call after scheduling if it doesn't
3363 already cross one. */
3365 if (REG_N_CALLS_CROSSED (regno) == 0)
3366 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3367 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3370 else if (GET_CODE (dest) == MEM)
3372 /* Writing memory. */
3374 if (deps->pending_lists_length > 32)
3376 /* Flush all pending reads and writes to prevent the pending lists
3377 from getting any larger. Insn scheduling runs too slowly when
3378 these lists get long. The number 32 was chosen because it
3379 seems like a reasonable number. When compiling GCC with itself,
3380 this flush occurs 8 times for sparc, and 10 times for m88k using
3381 the number 32. */
3382 flush_pending_lists (deps, insn, 0);
3384 else
3386 rtx u;
3387 rtx pending, pending_mem;
3389 pending = deps->pending_read_insns;
3390 pending_mem = deps->pending_read_mems;
3391 while (pending)
3393 if (anti_dependence (XEXP (pending_mem, 0), dest))
3394 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3396 pending = XEXP (pending, 1);
3397 pending_mem = XEXP (pending_mem, 1);
3400 pending = deps->pending_write_insns;
3401 pending_mem = deps->pending_write_mems;
3402 while (pending)
3404 if (output_dependence (XEXP (pending_mem, 0), dest))
3405 add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
3407 pending = XEXP (pending, 1);
3408 pending_mem = XEXP (pending_mem, 1);
3411 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3412 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3414 add_insn_mem_dependence (deps, &deps->pending_write_insns,
3415 &deps->pending_write_mems, insn, dest);
3417 sched_analyze_2 (deps, XEXP (dest, 0), insn);
3420 /* Analyze reads. */
3421 if (GET_CODE (x) == SET)
3422 sched_analyze_2 (deps, SET_SRC (x), insn);
3425 /* Analyze the uses of memory and registers in rtx X in INSN. */
3427 static void
3428 sched_analyze_2 (deps, x, insn)
3429 struct deps *deps;
3430 rtx x;
3431 rtx insn;
3433 register int i;
3434 register int j;
3435 register enum rtx_code code;
3436 register const char *fmt;
3438 if (x == 0)
3439 return;
3441 code = GET_CODE (x);
3443 switch (code)
3445 case CONST_INT:
3446 case CONST_DOUBLE:
3447 case SYMBOL_REF:
3448 case CONST:
3449 case LABEL_REF:
3450 /* Ignore constants. Note that we must handle CONST_DOUBLE here
3451 because it may have a cc0_rtx in its CONST_DOUBLE_CHAIN field, but
3452 this does not mean that this insn is using cc0. */
3453 return;
3455 #ifdef HAVE_cc0
3456 case CC0:
3457 /* User of CC0 depends on immediately preceding insn. */
3458 set_sched_group_p (insn);
3459 return;
3460 #endif
3462 case REG:
3464 rtx u;
3465 int regno = REGNO (x);
3466 if (regno < FIRST_PSEUDO_REGISTER)
3468 int i;
3470 i = HARD_REGNO_NREGS (regno, GET_MODE (x));
3471 while (--i >= 0)
3473 int r = regno + i;
3474 deps->reg_last_uses[r]
3475 = alloc_INSN_LIST (insn, deps->reg_last_uses[r]);
3477 for (u = deps->reg_last_sets[r]; u; u = XEXP (u, 1))
3478 add_dependence (insn, XEXP (u, 0), 0);
3480 /* ??? This should never happen. */
3481 for (u = deps->reg_last_clobbers[r]; u; u = XEXP (u, 1))
3482 add_dependence (insn, XEXP (u, 0), 0);
3484 if (call_used_regs[r] || global_regs[r])
3485 /* Function calls clobber all call_used regs. */
3486 for (u = deps->last_function_call; u; u = XEXP (u, 1))
3487 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3490 else
3492 deps->reg_last_uses[regno]
3493 = alloc_INSN_LIST (insn, deps->reg_last_uses[regno]);
3495 for (u = deps->reg_last_sets[regno]; u; u = XEXP (u, 1))
3496 add_dependence (insn, XEXP (u, 0), 0);
3498 /* ??? This should never happen. */
3499 for (u = deps->reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3500 add_dependence (insn, XEXP (u, 0), 0);
3502 /* Pseudos that are REG_EQUIV to something may be replaced
3503 by that during reloading. We need only add dependencies for
3504 the address in the REG_EQUIV note. */
3505 if (!reload_completed
3506 && reg_known_equiv_p[regno]
3507 && GET_CODE (reg_known_value[regno]) == MEM)
3508 sched_analyze_2 (deps, XEXP (reg_known_value[regno], 0), insn);
3510 /* If the register does not already cross any calls, then add this
3511 insn to the sched_before_next_call list so that it will still
3512 not cross calls after scheduling. */
3513 if (REG_N_CALLS_CROSSED (regno) == 0)
3514 add_dependence (deps->sched_before_next_call, insn,
3515 REG_DEP_ANTI);
3517 return;
3520 case MEM:
3522 /* Reading memory. */
3523 rtx u;
3524 rtx pending, pending_mem;
3526 pending = deps->pending_read_insns;
3527 pending_mem = deps->pending_read_mems;
3528 while (pending)
3530 if (read_dependence (XEXP (pending_mem, 0), x))
3531 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3533 pending = XEXP (pending, 1);
3534 pending_mem = XEXP (pending_mem, 1);
3537 pending = deps->pending_write_insns;
3538 pending_mem = deps->pending_write_mems;
3539 while (pending)
3541 if (true_dependence (XEXP (pending_mem, 0), VOIDmode,
3542 x, rtx_varies_p))
3543 add_dependence (insn, XEXP (pending, 0), 0);
3545 pending = XEXP (pending, 1);
3546 pending_mem = XEXP (pending_mem, 1);
3549 for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
3550 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3552 /* Always add these dependencies to pending_reads, since
3553 this insn may be followed by a write. */
3554 add_insn_mem_dependence (deps, &deps->pending_read_insns,
3555 &deps->pending_read_mems, insn, x);
3557 /* Take advantage of tail recursion here. */
3558 sched_analyze_2 (deps, XEXP (x, 0), insn);
3559 return;
3562 /* Force pending stores to memory in case a trap handler needs them. */
3563 case TRAP_IF:
3564 flush_pending_lists (deps, insn, 1);
3565 break;
3567 case ASM_OPERANDS:
3568 case ASM_INPUT:
3569 case UNSPEC_VOLATILE:
3571 rtx u;
3573 /* Traditional and volatile asm instructions must be considered to use
3574 and clobber all hard registers, all pseudo-registers and all of
3575 memory. So must TRAP_IF and UNSPEC_VOLATILE operations.
3577 Consider for instance a volatile asm that changes the fpu rounding
3578 mode. An insn should not be moved across this even if it only uses
3579 pseudo-regs because it might give an incorrectly rounded result. */
3580 if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
3582 int max_reg = max_reg_num ();
3583 for (i = 0; i < max_reg; i++)
3585 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3586 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3587 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3589 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3590 add_dependence (insn, XEXP (u, 0), 0);
3592 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3593 add_dependence (insn, XEXP (u, 0), 0);
3595 reg_pending_sets_all = 1;
3597 flush_pending_lists (deps, insn, 0);
3600 /* For all ASM_OPERANDS, we must traverse the vector of input operands.
3601 We can not just fall through here since then we would be confused
3602 by the ASM_INPUT rtx inside ASM_OPERANDS, which do not indicate
3603 traditional asms unlike their normal usage. */
3605 if (code == ASM_OPERANDS)
3607 for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
3608 sched_analyze_2 (deps, ASM_OPERANDS_INPUT (x, j), insn);
3609 return;
3611 break;
3614 case PRE_DEC:
3615 case POST_DEC:
3616 case PRE_INC:
3617 case POST_INC:
3618 /* These both read and modify the result. We must handle them as writes
3619 to get proper dependencies for following instructions. We must handle
3620 them as reads to get proper dependencies from this to previous
3621 instructions. Thus we need to pass them to both sched_analyze_1
3622 and sched_analyze_2. We must call sched_analyze_2 first in order
3623 to get the proper antecedent for the read. */
3624 sched_analyze_2 (deps, XEXP (x, 0), insn);
3625 sched_analyze_1 (deps, x, insn);
3626 return;
3628 case POST_MODIFY:
3629 case PRE_MODIFY:
3630 /* op0 = op0 + op1 */
3631 sched_analyze_2 (deps, XEXP (x, 0), insn);
3632 sched_analyze_2 (deps, XEXP (x, 1), insn);
3633 sched_analyze_1 (deps, x, insn);
3634 return;
3636 default:
3637 break;
3640 /* Other cases: walk the insn. */
3641 fmt = GET_RTX_FORMAT (code);
3642 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3644 if (fmt[i] == 'e')
3645 sched_analyze_2 (deps, XEXP (x, i), insn);
3646 else if (fmt[i] == 'E')
3647 for (j = 0; j < XVECLEN (x, i); j++)
3648 sched_analyze_2 (deps, XVECEXP (x, i, j), insn);
3652 /* Analyze an INSN with pattern X to find all dependencies. */
3654 static void
3655 sched_analyze_insn (deps, x, insn, loop_notes)
3656 struct deps *deps;
3657 rtx x, insn;
3658 rtx loop_notes;
3660 register RTX_CODE code = GET_CODE (x);
3661 rtx link;
3662 int maxreg = max_reg_num ();
3663 int i;
3665 if (code == COND_EXEC)
3667 sched_analyze_2 (deps, COND_EXEC_TEST (x), insn);
3669 /* ??? Should be recording conditions so we reduce the number of
3670 false dependancies. */
3671 x = COND_EXEC_CODE (x);
3672 code = GET_CODE (x);
3674 if (code == SET || code == CLOBBER)
3675 sched_analyze_1 (deps, x, insn);
3676 else if (code == PARALLEL)
3678 register int i;
3679 for (i = XVECLEN (x, 0) - 1; i >= 0; i--)
3681 rtx sub = XVECEXP (x, 0, i);
3682 code = GET_CODE (sub);
3684 if (code == COND_EXEC)
3686 sched_analyze_2 (deps, COND_EXEC_TEST (sub), insn);
3687 sub = COND_EXEC_CODE (sub);
3688 code = GET_CODE (sub);
3690 if (code == SET || code == CLOBBER)
3691 sched_analyze_1 (deps, sub, insn);
3692 else
3693 sched_analyze_2 (deps, sub, insn);
3696 else
3697 sched_analyze_2 (deps, x, insn);
3699 /* Mark registers CLOBBERED or used by called function. */
3700 if (GET_CODE (insn) == CALL_INSN)
3701 for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
3703 if (GET_CODE (XEXP (link, 0)) == CLOBBER)
3704 sched_analyze_1 (deps, XEXP (link, 0), insn);
3705 else
3706 sched_analyze_2 (deps, XEXP (link, 0), insn);
3709 /* If there is a {LOOP,EHREGION}_{BEG,END} note in the middle of a basic
3710 block, then we must be sure that no instructions are scheduled across it.
3711 Otherwise, the reg_n_refs info (which depends on loop_depth) would
3712 become incorrect. */
3714 if (loop_notes)
3716 int max_reg = max_reg_num ();
3717 int schedule_barrier_found = 0;
3718 rtx link;
3720 /* Update loop_notes with any notes from this insn. Also determine
3721 if any of the notes on the list correspond to instruction scheduling
3722 barriers (loop, eh & setjmp notes, but not range notes. */
3723 link = loop_notes;
3724 while (XEXP (link, 1))
3726 if (INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_BEG
3727 || INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_END
3728 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_BEG
3729 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_END
3730 || INTVAL (XEXP (link, 0)) == NOTE_INSN_SETJMP)
3731 schedule_barrier_found = 1;
3733 link = XEXP (link, 1);
3735 XEXP (link, 1) = REG_NOTES (insn);
3736 REG_NOTES (insn) = loop_notes;
3738 /* Add dependencies if a scheduling barrier was found. */
3739 if (schedule_barrier_found)
3741 for (i = 0; i < max_reg; i++)
3743 rtx u;
3744 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3745 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3746 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3748 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3749 add_dependence (insn, XEXP (u, 0), 0);
3751 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3752 add_dependence (insn, XEXP (u, 0), 0);
3754 reg_pending_sets_all = 1;
3756 flush_pending_lists (deps, insn, 0);
3761 /* Accumulate clobbers until the next set so that it will be output dependent
3762 on all of them. At the next set we can clear the clobber list, since
3763 subsequent sets will be output dependent on it. */
3764 EXECUTE_IF_SET_IN_REG_SET
3765 (reg_pending_sets, 0, i,
3767 free_INSN_LIST_list (&deps->reg_last_sets[i]);
3768 free_INSN_LIST_list (&deps->reg_last_clobbers[i]);
3769 deps->reg_last_sets[i] = alloc_INSN_LIST (insn, NULL_RTX);
3771 EXECUTE_IF_SET_IN_REG_SET
3772 (reg_pending_clobbers, 0, i,
3774 deps->reg_last_clobbers[i]
3775 = alloc_INSN_LIST (insn, deps->reg_last_clobbers[i]);
3777 CLEAR_REG_SET (reg_pending_sets);
3778 CLEAR_REG_SET (reg_pending_clobbers);
3780 if (reg_pending_sets_all)
3782 for (i = 0; i < maxreg; i++)
3784 free_INSN_LIST_list (&deps->reg_last_sets[i]);
3785 free_INSN_LIST_list (&deps->reg_last_clobbers[i]);
3786 deps->reg_last_sets[i] = alloc_INSN_LIST (insn, NULL_RTX);
3789 reg_pending_sets_all = 0;
3792 /* If a post-call group is still open, see if it should remain so.
3793 This insn must be a simple move of a hard reg to a pseudo or
3794 vice-versa.
3796 We must avoid moving these insns for correctness on
3797 SMALL_REGISTER_CLASS machines, and for special registers like
3798 PIC_OFFSET_TABLE_REGNUM. For simplicity, extend this to all
3799 hard regs for all targets. */
3801 if (deps->in_post_call_group_p)
3803 rtx tmp, set = single_set (insn);
3804 int src_regno, dest_regno;
3806 if (set == NULL)
3807 goto end_call_group;
3809 tmp = SET_DEST (set);
3810 if (GET_CODE (tmp) == SUBREG)
3811 tmp = SUBREG_REG (tmp);
3812 if (GET_CODE (tmp) == REG)
3813 dest_regno = REGNO (tmp);
3814 else
3815 goto end_call_group;
3817 tmp = SET_SRC (set);
3818 if (GET_CODE (tmp) == SUBREG)
3819 tmp = SUBREG_REG (tmp);
3820 if (GET_CODE (tmp) == REG)
3821 src_regno = REGNO (tmp);
3822 else
3823 goto end_call_group;
3825 if (src_regno < FIRST_PSEUDO_REGISTER
3826 || dest_regno < FIRST_PSEUDO_REGISTER)
3828 set_sched_group_p (insn);
3829 CANT_MOVE (insn) = 1;
3831 else
3833 end_call_group:
3834 deps->in_post_call_group_p = 0;
3839 /* Analyze every insn between HEAD and TAIL inclusive, creating LOG_LINKS
3840 for every dependency. */
3842 static void
3843 sched_analyze (deps, head, tail)
3844 struct deps *deps;
3845 rtx head, tail;
3847 register rtx insn;
3848 register rtx u;
3849 rtx loop_notes = 0;
3851 for (insn = head;; insn = NEXT_INSN (insn))
3853 if (GET_CODE (insn) == INSN || GET_CODE (insn) == JUMP_INSN)
3855 /* Clear out the stale LOG_LINKS from flow. */
3856 free_INSN_LIST_list (&LOG_LINKS (insn));
3858 /* Clear out stale SCHED_GROUP_P. */
3859 SCHED_GROUP_P (insn) = 0;
3861 /* Make each JUMP_INSN a scheduling barrier for memory
3862 references. */
3863 if (GET_CODE (insn) == JUMP_INSN)
3864 deps->last_pending_memory_flush
3865 = alloc_INSN_LIST (insn, deps->last_pending_memory_flush);
3866 sched_analyze_insn (deps, PATTERN (insn), insn, loop_notes);
3867 loop_notes = 0;
3869 else if (GET_CODE (insn) == CALL_INSN)
3871 rtx x;
3872 register int i;
3874 /* Clear out stale SCHED_GROUP_P. */
3875 SCHED_GROUP_P (insn) = 0;
3877 CANT_MOVE (insn) = 1;
3879 /* Clear out the stale LOG_LINKS from flow. */
3880 free_INSN_LIST_list (&LOG_LINKS (insn));
3882 /* Any instruction using a hard register which may get clobbered
3883 by a call needs to be marked as dependent on this call.
3884 This prevents a use of a hard return reg from being moved
3885 past a void call (i.e. it does not explicitly set the hard
3886 return reg). */
3888 /* If this call is followed by a NOTE_INSN_SETJMP, then assume that
3889 all registers, not just hard registers, may be clobbered by this
3890 call. */
3892 /* Insn, being a CALL_INSN, magically depends on
3893 `last_function_call' already. */
3895 if (NEXT_INSN (insn) && GET_CODE (NEXT_INSN (insn)) == NOTE
3896 && NOTE_LINE_NUMBER (NEXT_INSN (insn)) == NOTE_INSN_SETJMP)
3898 int max_reg = max_reg_num ();
3899 for (i = 0; i < max_reg; i++)
3901 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3902 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3903 free_INSN_LIST_list (&deps->reg_last_uses[i]);
3905 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3906 add_dependence (insn, XEXP (u, 0), 0);
3908 for (u = deps->reg_last_clobbers[i]; u; u = XEXP (u, 1))
3909 add_dependence (insn, XEXP (u, 0), 0);
3911 reg_pending_sets_all = 1;
3913 /* Add a pair of REG_SAVE_NOTEs which we will later
3914 convert back into a NOTE_INSN_SETJMP note. See
3915 reemit_notes for why we use a pair of NOTEs. */
3916 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3917 GEN_INT (0),
3918 REG_NOTES (insn));
3919 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3920 GEN_INT (NOTE_INSN_SETJMP),
3921 REG_NOTES (insn));
3923 else
3925 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
3926 if (call_used_regs[i] || global_regs[i])
3928 for (u = deps->reg_last_uses[i]; u; u = XEXP (u, 1))
3929 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3931 for (u = deps->reg_last_sets[i]; u; u = XEXP (u, 1))
3932 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3934 SET_REGNO_REG_SET (reg_pending_clobbers, i);
3938 /* For each insn which shouldn't cross a call, add a dependence
3939 between that insn and this call insn. */
3940 x = LOG_LINKS (deps->sched_before_next_call);
3941 while (x)
3943 add_dependence (insn, XEXP (x, 0), REG_DEP_ANTI);
3944 x = XEXP (x, 1);
3946 free_INSN_LIST_list (&LOG_LINKS (deps->sched_before_next_call));
3948 sched_analyze_insn (deps, PATTERN (insn), insn, loop_notes);
3949 loop_notes = 0;
3951 /* In the absence of interprocedural alias analysis, we must flush
3952 all pending reads and writes, and start new dependencies starting
3953 from here. But only flush writes for constant calls (which may
3954 be passed a pointer to something we haven't written yet). */
3955 flush_pending_lists (deps, insn, CONST_CALL_P (insn));
3957 /* Depend this function call (actually, the user of this
3958 function call) on all hard register clobberage. */
3960 /* last_function_call is now a list of insns. */
3961 free_INSN_LIST_list (&deps->last_function_call);
3962 deps->last_function_call = alloc_INSN_LIST (insn, NULL_RTX);
3964 /* Before reload, begin a post-call group, so as to keep the
3965 lifetimes of hard registers correct. */
3966 if (! reload_completed)
3967 deps->in_post_call_group_p = 1;
3970 /* See comments on reemit_notes as to why we do this.
3971 ??? Actually, the reemit_notes just say what is done, not why. */
3973 else if (GET_CODE (insn) == NOTE
3974 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_BEG
3975 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_END))
3977 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE, NOTE_RANGE_INFO (insn),
3978 loop_notes);
3979 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3980 GEN_INT (NOTE_LINE_NUMBER (insn)),
3981 loop_notes);
3983 else if (GET_CODE (insn) == NOTE
3984 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG
3985 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_END
3986 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3987 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END
3988 || (NOTE_LINE_NUMBER (insn) == NOTE_INSN_SETJMP
3989 && GET_CODE (PREV_INSN (insn)) != CALL_INSN)))
3991 rtx rtx_region;
3993 if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3994 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END)
3995 rtx_region = GEN_INT (NOTE_EH_HANDLER (insn));
3996 else
3997 rtx_region = GEN_INT (0);
3999 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
4000 rtx_region,
4001 loop_notes);
4002 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
4003 GEN_INT (NOTE_LINE_NUMBER (insn)),
4004 loop_notes);
4005 CONST_CALL_P (loop_notes) = CONST_CALL_P (insn);
4008 if (insn == tail)
4009 return;
4011 abort ();
4014 /* Macros and functions for keeping the priority queue sorted, and
4015 dealing with queueing and dequeueing of instructions. */
4017 #define SCHED_SORT(READY, N_READY) \
4018 do { if ((N_READY) == 2) \
4019 swap_sort (READY, N_READY); \
4020 else if ((N_READY) > 2) \
4021 qsort (READY, N_READY, sizeof (rtx), rank_for_schedule); } \
4022 while (0)
4024 /* Returns a positive value if x is preferred; returns a negative value if
4025 y is preferred. Should never return 0, since that will make the sort
4026 unstable. */
4028 static int
4029 rank_for_schedule (x, y)
4030 const PTR x;
4031 const PTR y;
4033 rtx tmp = *(const rtx *)y;
4034 rtx tmp2 = *(const rtx *)x;
4035 rtx link;
4036 int tmp_class, tmp2_class, depend_count1, depend_count2;
4037 int val, priority_val, spec_val, prob_val, weight_val;
4040 /* Prefer insn with higher priority. */
4041 priority_val = INSN_PRIORITY (tmp2) - INSN_PRIORITY (tmp);
4042 if (priority_val)
4043 return priority_val;
4045 /* Prefer an insn with smaller contribution to registers-pressure. */
4046 if (!reload_completed &&
4047 (weight_val = INSN_REG_WEIGHT (tmp) - INSN_REG_WEIGHT (tmp2)))
4048 return (weight_val);
4050 /* Some comparison make sense in interblock scheduling only. */
4051 if (INSN_BB (tmp) != INSN_BB (tmp2))
4053 /* Prefer an inblock motion on an interblock motion. */
4054 if ((INSN_BB (tmp2) == target_bb) && (INSN_BB (tmp) != target_bb))
4055 return 1;
4056 if ((INSN_BB (tmp) == target_bb) && (INSN_BB (tmp2) != target_bb))
4057 return -1;
4059 /* Prefer a useful motion on a speculative one. */
4060 if ((spec_val = IS_SPECULATIVE_INSN (tmp) - IS_SPECULATIVE_INSN (tmp2)))
4061 return (spec_val);
4063 /* Prefer a more probable (speculative) insn. */
4064 prob_val = INSN_PROBABILITY (tmp2) - INSN_PROBABILITY (tmp);
4065 if (prob_val)
4066 return (prob_val);
4069 /* Compare insns based on their relation to the last-scheduled-insn. */
4070 if (last_scheduled_insn)
4072 /* Classify the instructions into three classes:
4073 1) Data dependent on last schedule insn.
4074 2) Anti/Output dependent on last scheduled insn.
4075 3) Independent of last scheduled insn, or has latency of one.
4076 Choose the insn from the highest numbered class if different. */
4077 link = find_insn_list (tmp, INSN_DEPEND (last_scheduled_insn));
4078 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp) == 1)
4079 tmp_class = 3;
4080 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
4081 tmp_class = 1;
4082 else
4083 tmp_class = 2;
4085 link = find_insn_list (tmp2, INSN_DEPEND (last_scheduled_insn));
4086 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp2) == 1)
4087 tmp2_class = 3;
4088 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
4089 tmp2_class = 1;
4090 else
4091 tmp2_class = 2;
4093 if ((val = tmp2_class - tmp_class))
4094 return val;
4097 /* Prefer the insn which has more later insns that depend on it.
4098 This gives the scheduler more freedom when scheduling later
4099 instructions at the expense of added register pressure. */
4100 depend_count1 = 0;
4101 for (link = INSN_DEPEND (tmp); link; link = XEXP (link, 1))
4102 depend_count1++;
4104 depend_count2 = 0;
4105 for (link = INSN_DEPEND (tmp2); link; link = XEXP (link, 1))
4106 depend_count2++;
4108 val = depend_count2 - depend_count1;
4109 if (val)
4110 return val;
4112 /* If insns are equally good, sort by INSN_LUID (original insn order),
4113 so that we make the sort stable. This minimizes instruction movement,
4114 thus minimizing sched's effect on debugging and cross-jumping. */
4115 return INSN_LUID (tmp) - INSN_LUID (tmp2);
4118 /* Resort the array A in which only element at index N may be out of order. */
4120 HAIFA_INLINE static void
4121 swap_sort (a, n)
4122 rtx *a;
4123 int n;
4125 rtx insn = a[n - 1];
4126 int i = n - 2;
4128 while (i >= 0 && rank_for_schedule (a + i, &insn) >= 0)
4130 a[i + 1] = a[i];
4131 i -= 1;
4133 a[i + 1] = insn;
4136 static int max_priority;
4138 /* Add INSN to the insn queue so that it can be executed at least
4139 N_CYCLES after the currently executing insn. Preserve insns
4140 chain for debugging purposes. */
4142 HAIFA_INLINE static void
4143 queue_insn (insn, n_cycles)
4144 rtx insn;
4145 int n_cycles;
4147 int next_q = NEXT_Q_AFTER (q_ptr, n_cycles);
4148 rtx link = alloc_INSN_LIST (insn, insn_queue[next_q]);
4149 insn_queue[next_q] = link;
4150 q_size += 1;
4152 if (sched_verbose >= 2)
4154 fprintf (dump, ";;\t\tReady-->Q: insn %d: ", INSN_UID (insn));
4156 if (INSN_BB (insn) != target_bb)
4157 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4159 fprintf (dump, "queued for %d cycles.\n", n_cycles);
4164 /* PREV is an insn that is ready to execute. Adjust its priority if that
4165 will help shorten or lengthen register lifetimes as appropriate. Also
4166 provide a hook for the target to tweek itself. */
4168 HAIFA_INLINE static void
4169 adjust_priority (prev)
4170 rtx prev ATTRIBUTE_UNUSED;
4172 /* ??? There used to be code here to try and estimate how an insn
4173 affected register lifetimes, but it did it by looking at REG_DEAD
4174 notes, which we removed in schedule_region. Nor did it try to
4175 take into account register pressure or anything useful like that.
4177 Revisit when we have a machine model to work with and not before. */
4179 #ifdef ADJUST_PRIORITY
4180 ADJUST_PRIORITY (prev);
4181 #endif
4184 /* Clock at which the previous instruction was issued. */
4185 static int last_clock_var;
4187 /* INSN is the "currently executing insn". Launch each insn which was
4188 waiting on INSN. READY is a vector of insns which are ready to fire.
4189 N_READY is the number of elements in READY. CLOCK is the current
4190 cycle. */
4192 static int
4193 schedule_insn (insn, ready, n_ready, clock)
4194 rtx insn;
4195 rtx *ready;
4196 int n_ready;
4197 int clock;
4199 rtx link;
4200 int unit;
4202 unit = insn_unit (insn);
4204 if (sched_verbose >= 2)
4206 fprintf (dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
4207 INSN_UID (insn));
4208 insn_print_units (insn);
4209 fprintf (dump, "\n");
4212 if (sched_verbose && unit == -1)
4213 visualize_no_unit (insn);
4215 if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
4216 schedule_unit (unit, insn, clock);
4218 if (INSN_DEPEND (insn) == 0)
4219 return n_ready;
4221 /* This is used by the function adjust_priority above. */
4222 if (n_ready > 0)
4223 max_priority = MAX (INSN_PRIORITY (ready[0]), INSN_PRIORITY (insn));
4224 else
4225 max_priority = INSN_PRIORITY (insn);
4227 for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
4229 rtx next = XEXP (link, 0);
4230 int cost = insn_cost (insn, link, next);
4232 INSN_TICK (next) = MAX (INSN_TICK (next), clock + cost);
4234 if ((INSN_DEP_COUNT (next) -= 1) == 0)
4236 int effective_cost = INSN_TICK (next) - clock;
4238 /* For speculative insns, before inserting to ready/queue,
4239 check live, exception-free, and issue-delay. */
4240 if (INSN_BB (next) != target_bb
4241 && (!IS_VALID (INSN_BB (next))
4242 || CANT_MOVE (next)
4243 || (IS_SPECULATIVE_INSN (next)
4244 && (insn_issue_delay (next) > 3
4245 || !check_live (next, INSN_BB (next))
4246 || !is_exception_free (next, INSN_BB (next), target_bb)))))
4247 continue;
4249 if (sched_verbose >= 2)
4251 fprintf (dump, ";;\t\tdependences resolved: insn %d ",
4252 INSN_UID (next));
4254 if (current_nr_blocks > 1 && INSN_BB (next) != target_bb)
4255 fprintf (dump, "/b%d ", BLOCK_NUM (next));
4257 if (effective_cost < 1)
4258 fprintf (dump, "into ready\n");
4259 else
4260 fprintf (dump, "into queue with cost=%d\n", effective_cost);
4263 /* Adjust the priority of NEXT and either put it on the ready
4264 list or queue it. */
4265 adjust_priority (next);
4266 if (effective_cost < 1)
4267 ready[n_ready++] = next;
4268 else
4269 queue_insn (next, effective_cost);
4273 /* Annotate the instruction with issue information -- TImode
4274 indicates that the instruction is expected not to be able
4275 to issue on the same cycle as the previous insn. A machine
4276 may use this information to decide how the instruction should
4277 be aligned. */
4278 if (reload_completed && issue_rate > 1)
4280 PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
4281 last_clock_var = clock;
4284 return n_ready;
4287 /* Functions for handling of notes. */
4289 /* Delete notes beginning with INSN and put them in the chain
4290 of notes ended by NOTE_LIST.
4291 Returns the insn following the notes. */
4293 static rtx
4294 unlink_other_notes (insn, tail)
4295 rtx insn, tail;
4297 rtx prev = PREV_INSN (insn);
4299 while (insn != tail && GET_CODE (insn) == NOTE)
4301 rtx next = NEXT_INSN (insn);
4302 /* Delete the note from its current position. */
4303 if (prev)
4304 NEXT_INSN (prev) = next;
4305 if (next)
4306 PREV_INSN (next) = prev;
4308 /* See sched_analyze to see how these are handled. */
4309 if (NOTE_LINE_NUMBER (insn) != NOTE_INSN_SETJMP
4310 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_BEG
4311 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_END
4312 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_BEG
4313 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_END
4314 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_BEG
4315 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_END)
4317 /* Insert the note at the end of the notes list. */
4318 PREV_INSN (insn) = note_list;
4319 if (note_list)
4320 NEXT_INSN (note_list) = insn;
4321 note_list = insn;
4324 insn = next;
4326 return insn;
4329 /* Delete line notes beginning with INSN. Record line-number notes so
4330 they can be reused. Returns the insn following the notes. */
4332 static rtx
4333 unlink_line_notes (insn, tail)
4334 rtx insn, tail;
4336 rtx prev = PREV_INSN (insn);
4338 while (insn != tail && GET_CODE (insn) == NOTE)
4340 rtx next = NEXT_INSN (insn);
4342 if (write_symbols != NO_DEBUG && NOTE_LINE_NUMBER (insn) > 0)
4344 /* Delete the note from its current position. */
4345 if (prev)
4346 NEXT_INSN (prev) = next;
4347 if (next)
4348 PREV_INSN (next) = prev;
4350 /* Record line-number notes so they can be reused. */
4351 LINE_NOTE (insn) = insn;
4353 else
4354 prev = insn;
4356 insn = next;
4358 return insn;
4361 /* Return the head and tail pointers of BB. */
4363 HAIFA_INLINE static void
4364 get_block_head_tail (b, headp, tailp)
4365 int b;
4366 rtx *headp;
4367 rtx *tailp;
4370 rtx head;
4371 rtx tail;
4373 /* HEAD and TAIL delimit the basic block being scheduled. */
4374 head = BLOCK_HEAD (b);
4375 tail = BLOCK_END (b);
4377 /* Don't include any notes or labels at the beginning of the
4378 basic block, or notes at the ends of basic blocks. */
4379 while (head != tail)
4381 if (GET_CODE (head) == NOTE)
4382 head = NEXT_INSN (head);
4383 else if (GET_CODE (tail) == NOTE)
4384 tail = PREV_INSN (tail);
4385 else if (GET_CODE (head) == CODE_LABEL)
4386 head = NEXT_INSN (head);
4387 else
4388 break;
4391 *headp = head;
4392 *tailp = tail;
4395 HAIFA_INLINE static void
4396 get_bb_head_tail (bb, headp, tailp)
4397 int bb;
4398 rtx *headp;
4399 rtx *tailp;
4401 get_block_head_tail (BB_TO_BLOCK (bb), headp, tailp);
4404 /* Delete line notes from bb. Save them so they can be later restored
4405 (in restore_line_notes ()). */
4407 static void
4408 rm_line_notes (bb)
4409 int bb;
4411 rtx next_tail;
4412 rtx tail;
4413 rtx head;
4414 rtx insn;
4416 get_bb_head_tail (bb, &head, &tail);
4418 if (head == tail && (! INSN_P (head)))
4419 return;
4421 next_tail = NEXT_INSN (tail);
4422 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4424 rtx prev;
4426 /* Farm out notes, and maybe save them in NOTE_LIST.
4427 This is needed to keep the debugger from
4428 getting completely deranged. */
4429 if (GET_CODE (insn) == NOTE)
4431 prev = insn;
4432 insn = unlink_line_notes (insn, next_tail);
4434 if (prev == tail)
4435 abort ();
4436 if (prev == head)
4437 abort ();
4438 if (insn == next_tail)
4439 abort ();
4444 /* Save line number notes for each insn in bb. */
4446 static void
4447 save_line_notes (bb)
4448 int bb;
4450 rtx head, tail;
4451 rtx next_tail;
4453 /* We must use the true line number for the first insn in the block
4454 that was computed and saved at the start of this pass. We can't
4455 use the current line number, because scheduling of the previous
4456 block may have changed the current line number. */
4458 rtx line = line_note_head[BB_TO_BLOCK (bb)];
4459 rtx insn;
4461 get_bb_head_tail (bb, &head, &tail);
4462 next_tail = NEXT_INSN (tail);
4464 for (insn = BLOCK_HEAD (BB_TO_BLOCK (bb));
4465 insn != next_tail;
4466 insn = NEXT_INSN (insn))
4467 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4468 line = insn;
4469 else
4470 LINE_NOTE (insn) = line;
4474 /* After bb was scheduled, insert line notes into the insns list. */
4476 static void
4477 restore_line_notes (bb)
4478 int bb;
4480 rtx line, note, prev, new;
4481 int added_notes = 0;
4482 int b;
4483 rtx head, next_tail, insn;
4485 b = BB_TO_BLOCK (bb);
4487 head = BLOCK_HEAD (b);
4488 next_tail = NEXT_INSN (BLOCK_END (b));
4490 /* Determine the current line-number. We want to know the current
4491 line number of the first insn of the block here, in case it is
4492 different from the true line number that was saved earlier. If
4493 different, then we need a line number note before the first insn
4494 of this block. If it happens to be the same, then we don't want to
4495 emit another line number note here. */
4496 for (line = head; line; line = PREV_INSN (line))
4497 if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
4498 break;
4500 /* Walk the insns keeping track of the current line-number and inserting
4501 the line-number notes as needed. */
4502 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4503 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4504 line = insn;
4505 /* This used to emit line number notes before every non-deleted note.
4506 However, this confuses a debugger, because line notes not separated
4507 by real instructions all end up at the same address. I can find no
4508 use for line number notes before other notes, so none are emitted. */
4509 else if (GET_CODE (insn) != NOTE
4510 && (note = LINE_NOTE (insn)) != 0
4511 && note != line
4512 && (line == 0
4513 || NOTE_LINE_NUMBER (note) != NOTE_LINE_NUMBER (line)
4514 || NOTE_SOURCE_FILE (note) != NOTE_SOURCE_FILE (line)))
4516 line = note;
4517 prev = PREV_INSN (insn);
4518 if (LINE_NOTE (note))
4520 /* Re-use the original line-number note. */
4521 LINE_NOTE (note) = 0;
4522 PREV_INSN (note) = prev;
4523 NEXT_INSN (prev) = note;
4524 PREV_INSN (insn) = note;
4525 NEXT_INSN (note) = insn;
4527 else
4529 added_notes++;
4530 new = emit_note_after (NOTE_LINE_NUMBER (note), prev);
4531 NOTE_SOURCE_FILE (new) = NOTE_SOURCE_FILE (note);
4532 RTX_INTEGRATED_P (new) = RTX_INTEGRATED_P (note);
4535 if (sched_verbose && added_notes)
4536 fprintf (dump, ";; added %d line-number notes\n", added_notes);
4539 /* After scheduling the function, delete redundant line notes from the
4540 insns list. */
4542 static void
4543 rm_redundant_line_notes ()
4545 rtx line = 0;
4546 rtx insn = get_insns ();
4547 int active_insn = 0;
4548 int notes = 0;
4550 /* Walk the insns deleting redundant line-number notes. Many of these
4551 are already present. The remainder tend to occur at basic
4552 block boundaries. */
4553 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4554 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4556 /* If there are no active insns following, INSN is redundant. */
4557 if (active_insn == 0)
4559 notes++;
4560 NOTE_SOURCE_FILE (insn) = 0;
4561 NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
4563 /* If the line number is unchanged, LINE is redundant. */
4564 else if (line
4565 && NOTE_LINE_NUMBER (line) == NOTE_LINE_NUMBER (insn)
4566 && NOTE_SOURCE_FILE (line) == NOTE_SOURCE_FILE (insn))
4568 notes++;
4569 NOTE_SOURCE_FILE (line) = 0;
4570 NOTE_LINE_NUMBER (line) = NOTE_INSN_DELETED;
4571 line = insn;
4573 else
4574 line = insn;
4575 active_insn = 0;
4577 else if (!((GET_CODE (insn) == NOTE
4578 && NOTE_LINE_NUMBER (insn) == NOTE_INSN_DELETED)
4579 || (GET_CODE (insn) == INSN
4580 && (GET_CODE (PATTERN (insn)) == USE
4581 || GET_CODE (PATTERN (insn)) == CLOBBER))))
4582 active_insn++;
4584 if (sched_verbose && notes)
4585 fprintf (dump, ";; deleted %d line-number notes\n", notes);
4588 /* Delete notes between head and tail and put them in the chain
4589 of notes ended by NOTE_LIST. */
4591 static void
4592 rm_other_notes (head, tail)
4593 rtx head;
4594 rtx tail;
4596 rtx next_tail;
4597 rtx insn;
4599 if (head == tail && (! INSN_P (head)))
4600 return;
4602 next_tail = NEXT_INSN (tail);
4603 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4605 rtx prev;
4607 /* Farm out notes, and maybe save them in NOTE_LIST.
4608 This is needed to keep the debugger from
4609 getting completely deranged. */
4610 if (GET_CODE (insn) == NOTE)
4612 prev = insn;
4614 insn = unlink_other_notes (insn, next_tail);
4616 if (prev == tail)
4617 abort ();
4618 if (prev == head)
4619 abort ();
4620 if (insn == next_tail)
4621 abort ();
4626 /* Functions for computation of registers live/usage info. */
4628 /* Calculate INSN_REG_WEIGHT for all insns of a block. */
4630 static void
4631 find_insn_reg_weight (b)
4632 int b;
4634 rtx insn, next_tail, head, tail;
4636 get_block_head_tail (b, &head, &tail);
4637 next_tail = NEXT_INSN (tail);
4639 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4641 int reg_weight = 0;
4642 rtx x;
4644 /* Handle register life information. */
4645 if (! INSN_P (insn))
4646 continue;
4648 /* Increment weight for each register born here. */
4649 x = PATTERN (insn);
4650 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4651 && register_operand (SET_DEST (x), VOIDmode))
4652 reg_weight++;
4653 else if (GET_CODE (x) == PARALLEL)
4655 int j;
4656 for (j = XVECLEN (x, 0) - 1; j >= 0; j--)
4658 x = XVECEXP (PATTERN (insn), 0, j);
4659 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4660 && register_operand (SET_DEST (x), VOIDmode))
4661 reg_weight++;
4665 /* Decrement weight for each register that dies here. */
4666 for (x = REG_NOTES (insn); x; x = XEXP (x, 1))
4668 if (REG_NOTE_KIND (x) == REG_DEAD
4669 || REG_NOTE_KIND (x) == REG_UNUSED)
4670 reg_weight--;
4673 INSN_REG_WEIGHT (insn) = reg_weight;
4677 /* Scheduling clock, modified in schedule_block() and queue_to_ready (). */
4678 static int clock_var;
4680 /* Move insns that became ready to fire from queue to ready list. */
4682 static int
4683 queue_to_ready (ready, n_ready)
4684 rtx ready[];
4685 int n_ready;
4687 rtx insn;
4688 rtx link;
4690 q_ptr = NEXT_Q (q_ptr);
4692 /* Add all pending insns that can be scheduled without stalls to the
4693 ready list. */
4694 for (link = insn_queue[q_ptr]; link; link = XEXP (link, 1))
4697 insn = XEXP (link, 0);
4698 q_size -= 1;
4700 if (sched_verbose >= 2)
4701 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4703 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4704 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4706 ready[n_ready++] = insn;
4707 if (sched_verbose >= 2)
4708 fprintf (dump, "moving to ready without stalls\n");
4710 insn_queue[q_ptr] = 0;
4712 /* If there are no ready insns, stall until one is ready and add all
4713 of the pending insns at that point to the ready list. */
4714 if (n_ready == 0)
4716 register int stalls;
4718 for (stalls = 1; stalls < INSN_QUEUE_SIZE; stalls++)
4720 if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
4722 for (; link; link = XEXP (link, 1))
4724 insn = XEXP (link, 0);
4725 q_size -= 1;
4727 if (sched_verbose >= 2)
4728 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4730 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4731 fprintf (dump, "(b%d) ", BLOCK_NUM (insn));
4733 ready[n_ready++] = insn;
4734 if (sched_verbose >= 2)
4735 fprintf (dump, "moving to ready with %d stalls\n", stalls);
4737 insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
4739 if (n_ready)
4740 break;
4744 if (sched_verbose && stalls)
4745 visualize_stall_cycles (BB_TO_BLOCK (target_bb), stalls);
4746 q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
4747 clock_var += stalls;
4749 return n_ready;
4752 /* Print the ready list for debugging purposes. Callable from debugger. */
4754 static void
4755 debug_ready_list (ready, n_ready)
4756 rtx ready[];
4757 int n_ready;
4759 int i;
4761 for (i = 0; i < n_ready; i++)
4763 fprintf (dump, " %d", INSN_UID (ready[i]));
4764 if (current_nr_blocks > 1 && INSN_BB (ready[i]) != target_bb)
4765 fprintf (dump, "/b%d", BLOCK_NUM (ready[i]));
4767 fprintf (dump, "\n");
4770 /* Print names of units on which insn can/should execute, for debugging. */
4772 static void
4773 insn_print_units (insn)
4774 rtx insn;
4776 int i;
4777 int unit = insn_unit (insn);
4779 if (unit == -1)
4780 fprintf (dump, "none");
4781 else if (unit >= 0)
4782 fprintf (dump, "%s", function_units[unit].name);
4783 else
4785 fprintf (dump, "[");
4786 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
4787 if (unit & 1)
4789 fprintf (dump, "%s", function_units[i].name);
4790 if (unit != 1)
4791 fprintf (dump, " ");
4793 fprintf (dump, "]");
4797 /* MAX_VISUAL_LINES is the maximum number of lines in visualization table
4798 of a basic block. If more lines are needed, table is splitted to two.
4799 n_visual_lines is the number of lines printed so far for a block.
4800 visual_tbl contains the block visualization info.
4801 vis_no_unit holds insns in a cycle that are not mapped to any unit. */
4802 #define MAX_VISUAL_LINES 100
4803 #define INSN_LEN 30
4804 int n_visual_lines;
4805 char *visual_tbl;
4806 int n_vis_no_unit;
4807 rtx vis_no_unit[10];
4809 /* Finds units that are in use in this fuction. Required only
4810 for visualization. */
4812 static void
4813 init_target_units ()
4815 rtx insn;
4816 int unit;
4818 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4820 if (! INSN_P (insn))
4821 continue;
4823 unit = insn_unit (insn);
4825 if (unit < 0)
4826 target_units |= ~unit;
4827 else
4828 target_units |= (1 << unit);
4832 /* Return the length of the visualization table. */
4834 static int
4835 get_visual_tbl_length ()
4837 int unit, i;
4838 int n, n1;
4839 char *s;
4841 /* Compute length of one field in line. */
4842 s = (char *) alloca (INSN_LEN + 6);
4843 sprintf (s, " %33s", "uname");
4844 n1 = strlen (s);
4846 /* Compute length of one line. */
4847 n = strlen (";; ");
4848 n += n1;
4849 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
4850 if (function_units[unit].bitmask & target_units)
4851 for (i = 0; i < function_units[unit].multiplicity; i++)
4852 n += n1;
4853 n += n1;
4854 n += strlen ("\n") + 2;
4856 /* Compute length of visualization string. */
4857 return (MAX_VISUAL_LINES * n);
4860 /* Init block visualization debugging info. */
4862 static void
4863 init_block_visualization ()
4865 strcpy (visual_tbl, "");
4866 n_visual_lines = 0;
4867 n_vis_no_unit = 0;
4870 #define BUF_LEN 2048
4872 static char *
4873 safe_concat (buf, cur, str)
4874 char *buf;
4875 char *cur;
4876 const char *str;
4878 char *end = buf + BUF_LEN - 2; /* Leave room for null. */
4879 int c;
4881 if (cur > end)
4883 *end = '\0';
4884 return end;
4887 while (cur < end && (c = *str++) != '\0')
4888 *cur++ = c;
4890 *cur = '\0';
4891 return cur;
4894 /* This recognizes rtx, I classified as expressions. These are always
4895 represent some action on values or results of other expression, that
4896 may be stored in objects representing values. */
4898 static void
4899 print_exp (buf, x, verbose)
4900 char *buf;
4901 rtx x;
4902 int verbose;
4904 char tmp[BUF_LEN];
4905 const char *st[4];
4906 char *cur = buf;
4907 const char *fun = (char *)0;
4908 const char *sep;
4909 rtx op[4];
4910 int i;
4912 for (i = 0; i < 4; i++)
4914 st[i] = (char *)0;
4915 op[i] = NULL_RTX;
4918 switch (GET_CODE (x))
4920 case PLUS:
4921 op[0] = XEXP (x, 0);
4922 if (GET_CODE (XEXP (x, 1)) == CONST_INT
4923 && INTVAL (XEXP (x, 1)) < 0)
4925 st[1] = "-";
4926 op[1] = GEN_INT (-INTVAL (XEXP (x, 1)));
4928 else
4930 st[1] = "+";
4931 op[1] = XEXP (x, 1);
4933 break;
4934 case LO_SUM:
4935 op[0] = XEXP (x, 0);
4936 st[1] = "+low(";
4937 op[1] = XEXP (x, 1);
4938 st[2] = ")";
4939 break;
4940 case MINUS:
4941 op[0] = XEXP (x, 0);
4942 st[1] = "-";
4943 op[1] = XEXP (x, 1);
4944 break;
4945 case COMPARE:
4946 fun = "cmp";
4947 op[0] = XEXP (x, 0);
4948 op[1] = XEXP (x, 1);
4949 break;
4950 case NEG:
4951 st[0] = "-";
4952 op[0] = XEXP (x, 0);
4953 break;
4954 case MULT:
4955 op[0] = XEXP (x, 0);
4956 st[1] = "*";
4957 op[1] = XEXP (x, 1);
4958 break;
4959 case DIV:
4960 op[0] = XEXP (x, 0);
4961 st[1] = "/";
4962 op[1] = XEXP (x, 1);
4963 break;
4964 case UDIV:
4965 fun = "udiv";
4966 op[0] = XEXP (x, 0);
4967 op[1] = XEXP (x, 1);
4968 break;
4969 case MOD:
4970 op[0] = XEXP (x, 0);
4971 st[1] = "%";
4972 op[1] = XEXP (x, 1);
4973 break;
4974 case UMOD:
4975 fun = "umod";
4976 op[0] = XEXP (x, 0);
4977 op[1] = XEXP (x, 1);
4978 break;
4979 case SMIN:
4980 fun = "smin";
4981 op[0] = XEXP (x, 0);
4982 op[1] = XEXP (x, 1);
4983 break;
4984 case SMAX:
4985 fun = "smax";
4986 op[0] = XEXP (x, 0);
4987 op[1] = XEXP (x, 1);
4988 break;
4989 case UMIN:
4990 fun = "umin";
4991 op[0] = XEXP (x, 0);
4992 op[1] = XEXP (x, 1);
4993 break;
4994 case UMAX:
4995 fun = "umax";
4996 op[0] = XEXP (x, 0);
4997 op[1] = XEXP (x, 1);
4998 break;
4999 case NOT:
5000 st[0] = "!";
5001 op[0] = XEXP (x, 0);
5002 break;
5003 case AND:
5004 op[0] = XEXP (x, 0);
5005 st[1] = "&";
5006 op[1] = XEXP (x, 1);
5007 break;
5008 case IOR:
5009 op[0] = XEXP (x, 0);
5010 st[1] = "|";
5011 op[1] = XEXP (x, 1);
5012 break;
5013 case XOR:
5014 op[0] = XEXP (x, 0);
5015 st[1] = "^";
5016 op[1] = XEXP (x, 1);
5017 break;
5018 case ASHIFT:
5019 op[0] = XEXP (x, 0);
5020 st[1] = "<<";
5021 op[1] = XEXP (x, 1);
5022 break;
5023 case LSHIFTRT:
5024 op[0] = XEXP (x, 0);
5025 st[1] = " 0>>";
5026 op[1] = XEXP (x, 1);
5027 break;
5028 case ASHIFTRT:
5029 op[0] = XEXP (x, 0);
5030 st[1] = ">>";
5031 op[1] = XEXP (x, 1);
5032 break;
5033 case ROTATE:
5034 op[0] = XEXP (x, 0);
5035 st[1] = "<-<";
5036 op[1] = XEXP (x, 1);
5037 break;
5038 case ROTATERT:
5039 op[0] = XEXP (x, 0);
5040 st[1] = ">->";
5041 op[1] = XEXP (x, 1);
5042 break;
5043 case ABS:
5044 fun = "abs";
5045 op[0] = XEXP (x, 0);
5046 break;
5047 case SQRT:
5048 fun = "sqrt";
5049 op[0] = XEXP (x, 0);
5050 break;
5051 case FFS:
5052 fun = "ffs";
5053 op[0] = XEXP (x, 0);
5054 break;
5055 case EQ:
5056 op[0] = XEXP (x, 0);
5057 st[1] = "==";
5058 op[1] = XEXP (x, 1);
5059 break;
5060 case NE:
5061 op[0] = XEXP (x, 0);
5062 st[1] = "!=";
5063 op[1] = XEXP (x, 1);
5064 break;
5065 case GT:
5066 op[0] = XEXP (x, 0);
5067 st[1] = ">";
5068 op[1] = XEXP (x, 1);
5069 break;
5070 case GTU:
5071 fun = "gtu";
5072 op[0] = XEXP (x, 0);
5073 op[1] = XEXP (x, 1);
5074 break;
5075 case LT:
5076 op[0] = XEXP (x, 0);
5077 st[1] = "<";
5078 op[1] = XEXP (x, 1);
5079 break;
5080 case LTU:
5081 fun = "ltu";
5082 op[0] = XEXP (x, 0);
5083 op[1] = XEXP (x, 1);
5084 break;
5085 case GE:
5086 op[0] = XEXP (x, 0);
5087 st[1] = ">=";
5088 op[1] = XEXP (x, 1);
5089 break;
5090 case GEU:
5091 fun = "geu";
5092 op[0] = XEXP (x, 0);
5093 op[1] = XEXP (x, 1);
5094 break;
5095 case LE:
5096 op[0] = XEXP (x, 0);
5097 st[1] = "<=";
5098 op[1] = XEXP (x, 1);
5099 break;
5100 case LEU:
5101 fun = "leu";
5102 op[0] = XEXP (x, 0);
5103 op[1] = XEXP (x, 1);
5104 break;
5105 case SIGN_EXTRACT:
5106 fun = (verbose) ? "sign_extract" : "sxt";
5107 op[0] = XEXP (x, 0);
5108 op[1] = XEXP (x, 1);
5109 op[2] = XEXP (x, 2);
5110 break;
5111 case ZERO_EXTRACT:
5112 fun = (verbose) ? "zero_extract" : "zxt";
5113 op[0] = XEXP (x, 0);
5114 op[1] = XEXP (x, 1);
5115 op[2] = XEXP (x, 2);
5116 break;
5117 case SIGN_EXTEND:
5118 fun = (verbose) ? "sign_extend" : "sxn";
5119 op[0] = XEXP (x, 0);
5120 break;
5121 case ZERO_EXTEND:
5122 fun = (verbose) ? "zero_extend" : "zxn";
5123 op[0] = XEXP (x, 0);
5124 break;
5125 case FLOAT_EXTEND:
5126 fun = (verbose) ? "float_extend" : "fxn";
5127 op[0] = XEXP (x, 0);
5128 break;
5129 case TRUNCATE:
5130 fun = (verbose) ? "trunc" : "trn";
5131 op[0] = XEXP (x, 0);
5132 break;
5133 case FLOAT_TRUNCATE:
5134 fun = (verbose) ? "float_trunc" : "ftr";
5135 op[0] = XEXP (x, 0);
5136 break;
5137 case FLOAT:
5138 fun = (verbose) ? "float" : "flt";
5139 op[0] = XEXP (x, 0);
5140 break;
5141 case UNSIGNED_FLOAT:
5142 fun = (verbose) ? "uns_float" : "ufl";
5143 op[0] = XEXP (x, 0);
5144 break;
5145 case FIX:
5146 fun = "fix";
5147 op[0] = XEXP (x, 0);
5148 break;
5149 case UNSIGNED_FIX:
5150 fun = (verbose) ? "uns_fix" : "ufx";
5151 op[0] = XEXP (x, 0);
5152 break;
5153 case PRE_DEC:
5154 st[0] = "--";
5155 op[0] = XEXP (x, 0);
5156 break;
5157 case PRE_INC:
5158 st[0] = "++";
5159 op[0] = XEXP (x, 0);
5160 break;
5161 case POST_DEC:
5162 op[0] = XEXP (x, 0);
5163 st[1] = "--";
5164 break;
5165 case POST_INC:
5166 op[0] = XEXP (x, 0);
5167 st[1] = "++";
5168 break;
5169 case CALL:
5170 st[0] = "call ";
5171 op[0] = XEXP (x, 0);
5172 if (verbose)
5174 st[1] = " argc:";
5175 op[1] = XEXP (x, 1);
5177 break;
5178 case IF_THEN_ELSE:
5179 st[0] = "{(";
5180 op[0] = XEXP (x, 0);
5181 st[1] = ")?";
5182 op[1] = XEXP (x, 1);
5183 st[2] = ":";
5184 op[2] = XEXP (x, 2);
5185 st[3] = "}";
5186 break;
5187 case TRAP_IF:
5188 fun = "trap_if";
5189 op[0] = TRAP_CONDITION (x);
5190 break;
5191 case UNSPEC:
5192 case UNSPEC_VOLATILE:
5194 cur = safe_concat (buf, cur, "unspec");
5195 if (GET_CODE (x) == UNSPEC_VOLATILE)
5196 cur = safe_concat (buf, cur, "/v");
5197 cur = safe_concat (buf, cur, "[");
5198 sep = "";
5199 for (i = 0; i < XVECLEN (x, 0); i++)
5201 print_pattern (tmp, XVECEXP (x, 0, i), verbose);
5202 cur = safe_concat (buf, cur, sep);
5203 cur = safe_concat (buf, cur, tmp);
5204 sep = ",";
5206 cur = safe_concat (buf, cur, "] ");
5207 sprintf (tmp, "%d", XINT (x, 1));
5208 cur = safe_concat (buf, cur, tmp);
5210 break;
5211 default:
5212 /* If (verbose) debug_rtx (x); */
5213 st[0] = GET_RTX_NAME (GET_CODE (x));
5214 break;
5217 /* Print this as a function? */
5218 if (fun)
5220 cur = safe_concat (buf, cur, fun);
5221 cur = safe_concat (buf, cur, "(");
5224 for (i = 0; i < 4; i++)
5226 if (st[i])
5227 cur = safe_concat (buf, cur, st[i]);
5229 if (op[i])
5231 if (fun && i != 0)
5232 cur = safe_concat (buf, cur, ",");
5234 print_value (tmp, op[i], verbose);
5235 cur = safe_concat (buf, cur, tmp);
5239 if (fun)
5240 cur = safe_concat (buf, cur, ")");
5241 } /* print_exp */
5243 /* Prints rtxes, I customly classified as values. They're constants,
5244 registers, labels, symbols and memory accesses. */
5246 static void
5247 print_value (buf, x, verbose)
5248 char *buf;
5249 rtx x;
5250 int verbose;
5252 char t[BUF_LEN];
5253 char *cur = buf;
5255 switch (GET_CODE (x))
5257 case CONST_INT:
5258 sprintf (t, HOST_WIDE_INT_PRINT_HEX, INTVAL (x));
5259 cur = safe_concat (buf, cur, t);
5260 break;
5261 case CONST_DOUBLE:
5262 sprintf (t, "<0x%lx,0x%lx>", (long)XWINT (x, 2), (long)XWINT (x, 3));
5263 cur = safe_concat (buf, cur, t);
5264 break;
5265 case CONST_STRING:
5266 cur = safe_concat (buf, cur, "\"");
5267 cur = safe_concat (buf, cur, XSTR (x, 0));
5268 cur = safe_concat (buf, cur, "\"");
5269 break;
5270 case SYMBOL_REF:
5271 cur = safe_concat (buf, cur, "`");
5272 cur = safe_concat (buf, cur, XSTR (x, 0));
5273 cur = safe_concat (buf, cur, "'");
5274 break;
5275 case LABEL_REF:
5276 sprintf (t, "L%d", INSN_UID (XEXP (x, 0)));
5277 cur = safe_concat (buf, cur, t);
5278 break;
5279 case CONST:
5280 print_value (t, XEXP (x, 0), verbose);
5281 cur = safe_concat (buf, cur, "const(");
5282 cur = safe_concat (buf, cur, t);
5283 cur = safe_concat (buf, cur, ")");
5284 break;
5285 case HIGH:
5286 print_value (t, XEXP (x, 0), verbose);
5287 cur = safe_concat (buf, cur, "high(");
5288 cur = safe_concat (buf, cur, t);
5289 cur = safe_concat (buf, cur, ")");
5290 break;
5291 case REG:
5292 if (REGNO (x) < FIRST_PSEUDO_REGISTER)
5294 int c = reg_names[ REGNO (x) ][0];
5295 if (c >= '0' && c <= '9')
5296 cur = safe_concat (buf, cur, "%");
5298 cur = safe_concat (buf, cur, reg_names[ REGNO (x) ]);
5300 else
5302 sprintf (t, "r%d", REGNO (x));
5303 cur = safe_concat (buf, cur, t);
5305 break;
5306 case SUBREG:
5307 print_value (t, SUBREG_REG (x), verbose);
5308 cur = safe_concat (buf, cur, t);
5309 sprintf (t, "#%d", SUBREG_WORD (x));
5310 cur = safe_concat (buf, cur, t);
5311 break;
5312 case SCRATCH:
5313 cur = safe_concat (buf, cur, "scratch");
5314 break;
5315 case CC0:
5316 cur = safe_concat (buf, cur, "cc0");
5317 break;
5318 case PC:
5319 cur = safe_concat (buf, cur, "pc");
5320 break;
5321 case MEM:
5322 print_value (t, XEXP (x, 0), verbose);
5323 cur = safe_concat (buf, cur, "[");
5324 cur = safe_concat (buf, cur, t);
5325 cur = safe_concat (buf, cur, "]");
5326 break;
5327 default:
5328 print_exp (t, x, verbose);
5329 cur = safe_concat (buf, cur, t);
5330 break;
5332 } /* print_value */
5334 /* The next step in insn detalization, its pattern recognition. */
5336 static void
5337 print_pattern (buf, x, verbose)
5338 char *buf;
5339 rtx x;
5340 int verbose;
5342 char t1[BUF_LEN], t2[BUF_LEN], t3[BUF_LEN];
5344 switch (GET_CODE (x))
5346 case SET:
5347 print_value (t1, SET_DEST (x), verbose);
5348 print_value (t2, SET_SRC (x), verbose);
5349 sprintf (buf, "%s=%s", t1, t2);
5350 break;
5351 case RETURN:
5352 sprintf (buf, "return");
5353 break;
5354 case CALL:
5355 print_exp (buf, x, verbose);
5356 break;
5357 case CLOBBER:
5358 print_value (t1, XEXP (x, 0), verbose);
5359 sprintf (buf, "clobber %s", t1);
5360 break;
5361 case USE:
5362 print_value (t1, XEXP (x, 0), verbose);
5363 sprintf (buf, "use %s", t1);
5364 break;
5365 case COND_EXEC:
5366 print_value (t1, COND_EXEC_CODE (x), verbose);
5367 print_value (t2, COND_EXEC_TEST (x), verbose);
5368 sprintf (buf, "cond_exec %s %s", t1, t2);
5369 break;
5370 case PARALLEL:
5372 int i;
5374 sprintf (t1, "{");
5375 for (i = 0; i < XVECLEN (x, 0); i++)
5377 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5378 sprintf (t3, "%s%s;", t1, t2);
5379 strcpy (t1, t3);
5381 sprintf (buf, "%s}", t1);
5383 break;
5384 case SEQUENCE:
5386 int i;
5388 sprintf (t1, "%%{");
5389 for (i = 0; i < XVECLEN (x, 0); i++)
5391 print_insn (t2, XVECEXP (x, 0, i), verbose);
5392 sprintf (t3, "%s%s;", t1, t2);
5393 strcpy (t1, t3);
5395 sprintf (buf, "%s%%}", t1);
5397 break;
5398 case ASM_INPUT:
5399 sprintf (buf, "asm {%s}", XSTR (x, 0));
5400 break;
5401 case ADDR_VEC:
5402 break;
5403 case ADDR_DIFF_VEC:
5404 print_value (buf, XEXP (x, 0), verbose);
5405 break;
5406 case TRAP_IF:
5407 print_value (t1, TRAP_CONDITION (x), verbose);
5408 sprintf (buf, "trap_if %s", t1);
5409 break;
5410 case UNSPEC:
5412 int i;
5414 sprintf (t1, "unspec{");
5415 for (i = 0; i < XVECLEN (x, 0); i++)
5417 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5418 sprintf (t3, "%s%s;", t1, t2);
5419 strcpy (t1, t3);
5421 sprintf (buf, "%s}", t1);
5423 break;
5424 case UNSPEC_VOLATILE:
5426 int i;
5428 sprintf (t1, "unspec/v{");
5429 for (i = 0; i < XVECLEN (x, 0); i++)
5431 print_pattern (t2, XVECEXP (x, 0, i), verbose);
5432 sprintf (t3, "%s%s;", t1, t2);
5433 strcpy (t1, t3);
5435 sprintf (buf, "%s}", t1);
5437 break;
5438 default:
5439 print_value (buf, x, verbose);
5441 } /* print_pattern */
5443 /* This is the main function in rtl visualization mechanism. It
5444 accepts an rtx and tries to recognize it as an insn, then prints it
5445 properly in human readable form, resembling assembler mnemonics.
5446 For every insn it prints its UID and BB the insn belongs too.
5447 (Probably the last "option" should be extended somehow, since it
5448 depends now on sched.c inner variables ...) */
5450 static void
5451 print_insn (buf, x, verbose)
5452 char *buf;
5453 rtx x;
5454 int verbose;
5456 char t[BUF_LEN];
5457 rtx insn = x;
5459 switch (GET_CODE (x))
5461 case INSN:
5462 print_pattern (t, PATTERN (x), verbose);
5463 if (verbose)
5464 sprintf (buf, "b%d: i% 4d: %s", INSN_BB (x),
5465 INSN_UID (x), t);
5466 else
5467 sprintf (buf, "%-4d %s", INSN_UID (x), t);
5468 break;
5469 case JUMP_INSN:
5470 print_pattern (t, PATTERN (x), verbose);
5471 if (verbose)
5472 sprintf (buf, "b%d: i% 4d: jump %s", INSN_BB (x),
5473 INSN_UID (x), t);
5474 else
5475 sprintf (buf, "%-4d %s", INSN_UID (x), t);
5476 break;
5477 case CALL_INSN:
5478 x = PATTERN (insn);
5479 if (GET_CODE (x) == PARALLEL)
5481 x = XVECEXP (x, 0, 0);
5482 print_pattern (t, x, verbose);
5484 else
5485 strcpy (t, "call <...>");
5486 if (verbose)
5487 sprintf (buf, "b%d: i% 4d: %s", INSN_BB (insn),
5488 INSN_UID (insn), t);
5489 else
5490 sprintf (buf, "%-4d %s", INSN_UID (insn), t);
5491 break;
5492 case CODE_LABEL:
5493 sprintf (buf, "L%d:", INSN_UID (x));
5494 break;
5495 case BARRIER:
5496 sprintf (buf, "i% 4d: barrier", INSN_UID (x));
5497 break;
5498 case NOTE:
5499 if (NOTE_LINE_NUMBER (x) > 0)
5500 sprintf (buf, "%4d note \"%s\" %d", INSN_UID (x),
5501 NOTE_SOURCE_FILE (x), NOTE_LINE_NUMBER (x));
5502 else
5503 sprintf (buf, "%4d %s", INSN_UID (x),
5504 GET_NOTE_INSN_NAME (NOTE_LINE_NUMBER (x)));
5505 break;
5506 default:
5507 if (verbose)
5509 sprintf (buf, "Not an INSN at all\n");
5510 debug_rtx (x);
5512 else
5513 sprintf (buf, "i%-4d <What?>", INSN_UID (x));
5515 } /* print_insn */
5517 /* Print visualization debugging info. */
5519 static void
5520 print_block_visualization (b, s)
5521 int b;
5522 const char *s;
5524 int unit, i;
5526 /* Print header. */
5527 fprintf (dump, "\n;; ==================== scheduling visualization for block %d %s \n", b, s);
5529 /* Print names of units. */
5530 fprintf (dump, ";; %-8s", "clock");
5531 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5532 if (function_units[unit].bitmask & target_units)
5533 for (i = 0; i < function_units[unit].multiplicity; i++)
5534 fprintf (dump, " %-33s", function_units[unit].name);
5535 fprintf (dump, " %-8s\n", "no-unit");
5537 fprintf (dump, ";; %-8s", "=====");
5538 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5539 if (function_units[unit].bitmask & target_units)
5540 for (i = 0; i < function_units[unit].multiplicity; i++)
5541 fprintf (dump, " %-33s", "==============================");
5542 fprintf (dump, " %-8s\n", "=======");
5544 /* Print insns in each cycle. */
5545 fprintf (dump, "%s\n", visual_tbl);
5548 /* Print insns in the 'no_unit' column of visualization. */
5550 static void
5551 visualize_no_unit (insn)
5552 rtx insn;
5554 vis_no_unit[n_vis_no_unit] = insn;
5555 n_vis_no_unit++;
5558 /* Print insns scheduled in clock, for visualization. */
5560 static void
5561 visualize_scheduled_insns (b, clock)
5562 int b, clock;
5564 int i, unit;
5566 /* If no more room, split table into two. */
5567 if (n_visual_lines >= MAX_VISUAL_LINES)
5569 print_block_visualization (b, "(incomplete)");
5570 init_block_visualization ();
5573 n_visual_lines++;
5575 sprintf (visual_tbl + strlen (visual_tbl), ";; %-8d", clock);
5576 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
5577 if (function_units[unit].bitmask & target_units)
5578 for (i = 0; i < function_units[unit].multiplicity; i++)
5580 int instance = unit + i * FUNCTION_UNITS_SIZE;
5581 rtx insn = unit_last_insn[instance];
5583 /* Print insns that still keep the unit busy. */
5584 if (insn &&
5585 actual_hazard_this_instance (unit, instance, insn, clock, 0))
5587 char str[BUF_LEN];
5588 print_insn (str, insn, 0);
5589 str[INSN_LEN] = '\0';
5590 sprintf (visual_tbl + strlen (visual_tbl), " %-33s", str);
5592 else
5593 sprintf (visual_tbl + strlen (visual_tbl), " %-33s", "------------------------------");
5596 /* Print insns that are not assigned to any unit. */
5597 for (i = 0; i < n_vis_no_unit; i++)
5598 sprintf (visual_tbl + strlen (visual_tbl), " %-8d",
5599 INSN_UID (vis_no_unit[i]));
5600 n_vis_no_unit = 0;
5602 sprintf (visual_tbl + strlen (visual_tbl), "\n");
5605 /* Print stalled cycles. */
5607 static void
5608 visualize_stall_cycles (b, stalls)
5609 int b, stalls;
5611 int i;
5613 /* If no more room, split table into two. */
5614 if (n_visual_lines >= MAX_VISUAL_LINES)
5616 print_block_visualization (b, "(incomplete)");
5617 init_block_visualization ();
5620 n_visual_lines++;
5622 sprintf (visual_tbl + strlen (visual_tbl), ";; ");
5623 for (i = 0; i < stalls; i++)
5624 sprintf (visual_tbl + strlen (visual_tbl), ".");
5625 sprintf (visual_tbl + strlen (visual_tbl), "\n");
5628 /* move_insn1: Remove INSN from insn chain, and link it after LAST insn. */
5630 static rtx
5631 move_insn1 (insn, last)
5632 rtx insn, last;
5634 NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
5635 PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
5637 NEXT_INSN (insn) = NEXT_INSN (last);
5638 PREV_INSN (NEXT_INSN (last)) = insn;
5640 NEXT_INSN (last) = insn;
5641 PREV_INSN (insn) = last;
5643 return insn;
5646 /* Search INSN for REG_SAVE_NOTE note pairs for NOTE_INSN_SETJMP,
5647 NOTE_INSN_{LOOP,EHREGION}_{BEG,END}; and convert them back into
5648 NOTEs. The REG_SAVE_NOTE note following first one is contains the
5649 saved value for NOTE_BLOCK_NUMBER which is useful for
5650 NOTE_INSN_EH_REGION_{BEG,END} NOTEs. LAST is the last instruction
5651 output by the instruction scheduler. Return the new value of LAST. */
5653 static rtx
5654 reemit_notes (insn, last)
5655 rtx insn;
5656 rtx last;
5658 rtx note, retval;
5660 retval = last;
5661 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
5663 if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
5665 enum insn_note note_type = INTVAL (XEXP (note, 0));
5667 if (note_type == NOTE_INSN_SETJMP)
5669 retval = emit_note_after (NOTE_INSN_SETJMP, insn);
5670 CONST_CALL_P (retval) = CONST_CALL_P (note);
5671 remove_note (insn, note);
5672 note = XEXP (note, 1);
5674 else if (note_type == NOTE_INSN_RANGE_BEG
5675 || note_type == NOTE_INSN_RANGE_END)
5677 last = emit_note_before (note_type, last);
5678 remove_note (insn, note);
5679 note = XEXP (note, 1);
5680 NOTE_RANGE_INFO (last) = XEXP (note, 0);
5682 else
5684 last = emit_note_before (note_type, last);
5685 remove_note (insn, note);
5686 note = XEXP (note, 1);
5687 if (note_type == NOTE_INSN_EH_REGION_BEG
5688 || note_type == NOTE_INSN_EH_REGION_END)
5689 NOTE_EH_HANDLER (last) = INTVAL (XEXP (note, 0));
5691 remove_note (insn, note);
5694 return retval;
5697 /* Move INSN, and all insns which should be issued before it,
5698 due to SCHED_GROUP_P flag. Reemit notes if needed.
5700 Return the last insn emitted by the scheduler, which is the
5701 return value from the first call to reemit_notes. */
5703 static rtx
5704 move_insn (insn, last)
5705 rtx insn, last;
5707 rtx retval = NULL;
5709 /* If INSN has SCHED_GROUP_P set, then issue it and any other
5710 insns with SCHED_GROUP_P set first. */
5711 while (SCHED_GROUP_P (insn))
5713 rtx prev = PREV_INSN (insn);
5715 /* Move a SCHED_GROUP_P insn. */
5716 move_insn1 (insn, last);
5717 /* If this is the first call to reemit_notes, then record
5718 its return value. */
5719 if (retval == NULL_RTX)
5720 retval = reemit_notes (insn, insn);
5721 else
5722 reemit_notes (insn, insn);
5723 insn = prev;
5726 /* Now move the first non SCHED_GROUP_P insn. */
5727 move_insn1 (insn, last);
5729 /* If this is the first call to reemit_notes, then record
5730 its return value. */
5731 if (retval == NULL_RTX)
5732 retval = reemit_notes (insn, insn);
5733 else
5734 reemit_notes (insn, insn);
5736 return retval;
5739 /* Return an insn which represents a SCHED_GROUP, which is
5740 the last insn in the group. */
5742 static rtx
5743 group_leader (insn)
5744 rtx insn;
5746 rtx prev;
5750 prev = insn;
5751 insn = next_nonnote_insn (insn);
5753 while (insn && SCHED_GROUP_P (insn) && (GET_CODE (insn) != CODE_LABEL));
5755 return prev;
5758 /* Use forward list scheduling to rearrange insns of block BB in region RGN,
5759 possibly bringing insns from subsequent blocks in the same region.
5760 Return number of insns scheduled. */
5762 static int
5763 schedule_block (bb, rgn_n_insns)
5764 int bb;
5765 int rgn_n_insns;
5767 /* Local variables. */
5768 rtx insn, last;
5769 rtx *ready;
5770 int n_ready = 0;
5771 int can_issue_more;
5773 /* Flow block of this bb. */
5774 int b = BB_TO_BLOCK (bb);
5776 /* target_n_insns == number of insns in b before scheduling starts.
5777 sched_target_n_insns == how many of b's insns were scheduled.
5778 sched_n_insns == how many insns were scheduled in b. */
5779 int target_n_insns = 0;
5780 int sched_target_n_insns = 0;
5781 int sched_n_insns = 0;
5783 #define NEED_NOTHING 0
5784 #define NEED_HEAD 1
5785 #define NEED_TAIL 2
5786 int new_needs;
5788 /* Head/tail info for this block. */
5789 rtx prev_head;
5790 rtx next_tail;
5791 rtx head;
5792 rtx tail;
5793 int bb_src;
5795 /* We used to have code to avoid getting parameters moved from hard
5796 argument registers into pseudos.
5798 However, it was removed when it proved to be of marginal benefit
5799 and caused problems because schedule_block and compute_forward_dependences
5800 had different notions of what the "head" insn was. */
5801 get_bb_head_tail (bb, &head, &tail);
5803 /* rm_other_notes only removes notes which are _inside_ the
5804 block---that is, it won't remove notes before the first real insn
5805 or after the last real insn of the block. So if the first insn
5806 has a REG_SAVE_NOTE which would otherwise be emitted before the
5807 insn, it is redundant with the note before the start of the
5808 block, and so we have to take it out.
5810 FIXME: Probably the same thing should be done with REG_SAVE_NOTEs
5811 referencing NOTE_INSN_SETJMP at the end of the block. */
5812 if (INSN_P (head))
5814 rtx note;
5816 for (note = REG_NOTES (head); note; note = XEXP (note, 1))
5817 if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
5819 if (INTVAL (XEXP (note, 0)) != NOTE_INSN_SETJMP)
5821 remove_note (head, note);
5822 note = XEXP (note, 1);
5823 remove_note (head, note);
5825 else
5826 note = XEXP (note, 1);
5830 next_tail = NEXT_INSN (tail);
5831 prev_head = PREV_INSN (head);
5833 /* If the only insn left is a NOTE or a CODE_LABEL, then there is no need
5834 to schedule this block. */
5835 if (head == tail && (! INSN_P (head)))
5836 return (sched_n_insns);
5838 /* Debug info. */
5839 if (sched_verbose)
5841 fprintf (dump, ";; ======================================================\n");
5842 fprintf (dump,
5843 ";; -- basic block %d from %d to %d -- %s reload\n",
5844 b, INSN_UID (BLOCK_HEAD (b)), INSN_UID (BLOCK_END (b)),
5845 (reload_completed ? "after" : "before"));
5846 fprintf (dump, ";; ======================================================\n");
5847 fprintf (dump, "\n");
5849 visual_tbl = (char *) alloca (get_visual_tbl_length ());
5850 init_block_visualization ();
5853 /* Remove remaining note insns from the block, save them in
5854 note_list. These notes are restored at the end of
5855 schedule_block (). */
5856 note_list = 0;
5857 rm_other_notes (head, tail);
5859 target_bb = bb;
5861 /* Prepare current target block info. */
5862 if (current_nr_blocks > 1)
5864 candidate_table = (candidate *) xmalloc (current_nr_blocks
5865 * sizeof (candidate));
5867 bblst_last = 0;
5868 /* ??? It is not clear why bblst_size is computed this way. The original
5869 number was clearly too small as it resulted in compiler failures.
5870 Multiplying by the original number by 2 (to account for update_bbs
5871 members) seems to be a reasonable solution. */
5872 /* ??? Or perhaps there is a bug somewhere else in this file? */
5873 bblst_size = (current_nr_blocks - bb) * rgn_nr_edges * 2;
5874 bblst_table = (int *) xmalloc (bblst_size * sizeof (int));
5876 bitlst_table_last = 0;
5877 bitlst_table_size = rgn_nr_edges;
5878 bitlst_table = (int *) xmalloc (rgn_nr_edges * sizeof (int));
5880 compute_trg_info (bb);
5883 clear_units ();
5885 /* Allocate the ready list. */
5886 ready = (rtx *) xmalloc ((rgn_n_insns + 1) * sizeof (rtx));
5888 /* Print debugging information. */
5889 if (sched_verbose >= 5)
5890 debug_dependencies ();
5893 /* Initialize ready list with all 'ready' insns in target block.
5894 Count number of insns in the target block being scheduled. */
5895 n_ready = 0;
5896 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
5898 rtx next;
5900 if (! INSN_P (insn))
5901 continue;
5902 next = NEXT_INSN (insn);
5904 if (INSN_DEP_COUNT (insn) == 0
5905 && (SCHED_GROUP_P (next) == 0 || ! INSN_P (next)))
5906 ready[n_ready++] = insn;
5907 if (!(SCHED_GROUP_P (insn)))
5908 target_n_insns++;
5911 /* Add to ready list all 'ready' insns in valid source blocks.
5912 For speculative insns, check-live, exception-free, and
5913 issue-delay. */
5914 for (bb_src = bb + 1; bb_src < current_nr_blocks; bb_src++)
5915 if (IS_VALID (bb_src))
5917 rtx src_head;
5918 rtx src_next_tail;
5919 rtx tail, head;
5921 get_bb_head_tail (bb_src, &head, &tail);
5922 src_next_tail = NEXT_INSN (tail);
5923 src_head = head;
5925 if (head == tail && (! INSN_P (head)))
5926 continue;
5928 for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn))
5930 if (! INSN_P (insn))
5931 continue;
5933 if (!CANT_MOVE (insn)
5934 && (!IS_SPECULATIVE_INSN (insn)
5935 || (insn_issue_delay (insn) <= 3
5936 && check_live (insn, bb_src)
5937 && is_exception_free (insn, bb_src, target_bb))))
5939 rtx next;
5941 /* Note that we havn't squirrled away the notes for
5942 blocks other than the current. So if this is a
5943 speculative insn, NEXT might otherwise be a note. */
5944 next = next_nonnote_insn (insn);
5945 if (INSN_DEP_COUNT (insn) == 0
5946 && (! next
5947 || SCHED_GROUP_P (next) == 0
5948 || ! INSN_P (next)))
5949 ready[n_ready++] = insn;
5954 #ifdef MD_SCHED_INIT
5955 MD_SCHED_INIT (dump, sched_verbose);
5956 #endif
5958 /* No insns scheduled in this block yet. */
5959 last_scheduled_insn = 0;
5961 /* Q_SIZE is the total number of insns in the queue. */
5962 q_ptr = 0;
5963 q_size = 0;
5964 last_clock_var = 0;
5965 bzero ((char *) insn_queue, sizeof (insn_queue));
5967 /* Start just before the beginning of time. */
5968 clock_var = -1;
5970 /* We start inserting insns after PREV_HEAD. */
5971 last = prev_head;
5973 /* Initialize INSN_QUEUE, LIST and NEW_NEEDS. */
5974 new_needs = (NEXT_INSN (prev_head) == BLOCK_HEAD (b)
5975 ? NEED_HEAD : NEED_NOTHING);
5976 if (PREV_INSN (next_tail) == BLOCK_END (b))
5977 new_needs |= NEED_TAIL;
5979 /* Loop until all the insns in BB are scheduled. */
5980 while (sched_target_n_insns < target_n_insns)
5982 clock_var++;
5984 /* Add to the ready list all pending insns that can be issued now.
5985 If there are no ready insns, increment clock until one
5986 is ready and add all pending insns at that point to the ready
5987 list. */
5988 n_ready = queue_to_ready (ready, n_ready);
5990 if (n_ready == 0)
5991 abort ();
5993 if (sched_verbose >= 2)
5995 fprintf (dump, ";;\t\tReady list after queue_to_ready: ");
5996 debug_ready_list (ready, n_ready);
5999 /* Sort the ready list based on priority. */
6000 SCHED_SORT (ready, n_ready);
6002 /* Allow the target to reorder the list, typically for
6003 better instruction bundling. */
6004 #ifdef MD_SCHED_REORDER
6005 MD_SCHED_REORDER (dump, sched_verbose, ready, n_ready, clock_var,
6006 can_issue_more);
6007 #else
6008 can_issue_more = issue_rate;
6009 #endif
6011 if (sched_verbose)
6013 fprintf (dump, "\n;;\tReady list (t =%3d): ", clock_var);
6014 debug_ready_list (ready, n_ready);
6017 /* Issue insns from ready list. */
6018 while (n_ready != 0 && can_issue_more)
6020 /* Select and remove the insn from the ready list. */
6021 rtx insn = ready[--n_ready];
6022 int cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
6024 if (cost >= 1)
6026 queue_insn (insn, cost);
6027 continue;
6030 /* An interblock motion? */
6031 if (INSN_BB (insn) != target_bb)
6033 rtx temp;
6034 basic_block b1;
6036 if (IS_SPECULATIVE_INSN (insn))
6038 if (!check_live (insn, INSN_BB (insn)))
6039 continue;
6040 update_live (insn, INSN_BB (insn));
6042 /* For speculative load, mark insns fed by it. */
6043 if (IS_LOAD_INSN (insn) || FED_BY_SPEC_LOAD (insn))
6044 set_spec_fed (insn);
6046 nr_spec++;
6048 nr_inter++;
6050 /* Find the beginning of the scheduling group. */
6051 /* ??? Ought to update basic block here, but later bits of
6052 schedule_block assumes the original insn block is
6053 still intact. */
6055 temp = insn;
6056 while (SCHED_GROUP_P (temp))
6057 temp = PREV_INSN (temp);
6059 /* Update source block boundaries. */
6060 b1 = BLOCK_FOR_INSN (temp);
6061 if (temp == b1->head && insn == b1->end)
6063 /* We moved all the insns in the basic block.
6064 Emit a note after the last insn and update the
6065 begin/end boundaries to point to the note. */
6066 rtx note = emit_note_after (NOTE_INSN_DELETED, insn);
6067 b1->head = note;
6068 b1->end = note;
6070 else if (insn == b1->end)
6072 /* We took insns from the end of the basic block,
6073 so update the end of block boundary so that it
6074 points to the first insn we did not move. */
6075 b1->end = PREV_INSN (temp);
6077 else if (temp == b1->head)
6079 /* We took insns from the start of the basic block,
6080 so update the start of block boundary so that
6081 it points to the first insn we did not move. */
6082 b1->head = NEXT_INSN (insn);
6085 else
6087 /* In block motion. */
6088 sched_target_n_insns++;
6091 last_scheduled_insn = insn;
6092 last = move_insn (insn, last);
6093 sched_n_insns++;
6095 #ifdef MD_SCHED_VARIABLE_ISSUE
6096 MD_SCHED_VARIABLE_ISSUE (dump, sched_verbose, insn,
6097 can_issue_more);
6098 #else
6099 can_issue_more--;
6100 #endif
6102 n_ready = schedule_insn (insn, ready, n_ready, clock_var);
6104 /* Close this block after scheduling its jump. */
6105 if (GET_CODE (last_scheduled_insn) == JUMP_INSN)
6106 break;
6109 /* Debug info. */
6110 if (sched_verbose)
6111 visualize_scheduled_insns (b, clock_var);
6114 /* Debug info. */
6115 if (sched_verbose)
6117 fprintf (dump, ";;\tReady list (final): ");
6118 debug_ready_list (ready, n_ready);
6119 print_block_visualization (b, "");
6122 /* Sanity check -- queue must be empty now. Meaningless if region has
6123 multiple bbs. */
6124 if (current_nr_blocks > 1)
6125 if (!flag_schedule_interblock && q_size != 0)
6126 abort ();
6128 /* Update head/tail boundaries. */
6129 head = NEXT_INSN (prev_head);
6130 tail = last;
6132 /* Restore-other-notes: NOTE_LIST is the end of a chain of notes
6133 previously found among the insns. Insert them at the beginning
6134 of the insns. */
6135 if (note_list != 0)
6137 rtx note_head = note_list;
6139 while (PREV_INSN (note_head))
6141 note_head = PREV_INSN (note_head);
6144 PREV_INSN (note_head) = PREV_INSN (head);
6145 NEXT_INSN (PREV_INSN (head)) = note_head;
6146 PREV_INSN (head) = note_list;
6147 NEXT_INSN (note_list) = head;
6148 head = note_head;
6151 /* Update target block boundaries. */
6152 if (new_needs & NEED_HEAD)
6153 BLOCK_HEAD (b) = head;
6155 if (new_needs & NEED_TAIL)
6156 BLOCK_END (b) = tail;
6158 /* Debugging. */
6159 if (sched_verbose)
6161 fprintf (dump, ";; total time = %d\n;; new basic block head = %d\n",
6162 clock_var, INSN_UID (BLOCK_HEAD (b)));
6163 fprintf (dump, ";; new basic block end = %d\n\n",
6164 INSN_UID (BLOCK_END (b)));
6167 /* Clean up. */
6168 if (current_nr_blocks > 1)
6170 free (candidate_table);
6171 free (bblst_table);
6172 free (bitlst_table);
6174 free (ready);
6176 return (sched_n_insns);
6177 } /* schedule_block () */
6180 /* Print the bit-set of registers, S, callable from debugger. */
6182 extern void
6183 debug_reg_vector (s)
6184 regset s;
6186 int regno;
6188 EXECUTE_IF_SET_IN_REG_SET (s, 0, regno,
6190 fprintf (dump, " %d", regno);
6193 fprintf (dump, "\n");
6196 /* Use the backward dependences from LOG_LINKS to build
6197 forward dependences in INSN_DEPEND. */
6199 static void
6200 compute_block_forward_dependences (bb)
6201 int bb;
6203 rtx insn, link;
6204 rtx tail, head;
6205 rtx next_tail;
6206 enum reg_note dep_type;
6208 get_bb_head_tail (bb, &head, &tail);
6209 next_tail = NEXT_INSN (tail);
6210 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
6212 if (! INSN_P (insn))
6213 continue;
6215 insn = group_leader (insn);
6217 for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
6219 rtx x = group_leader (XEXP (link, 0));
6220 rtx new_link;
6222 if (x != XEXP (link, 0))
6223 continue;
6225 #ifdef ENABLE_CHECKING
6226 /* If add_dependence is working properly there should never
6227 be notes, deleted insns or duplicates in the backward
6228 links. Thus we need not check for them here.
6230 However, if we have enabled checking we might as well go
6231 ahead and verify that add_dependence worked properly. */
6232 if (GET_CODE (x) == NOTE
6233 || INSN_DELETED_P (x)
6234 || find_insn_list (insn, INSN_DEPEND (x)))
6235 abort ();
6236 #endif
6238 new_link = alloc_INSN_LIST (insn, INSN_DEPEND (x));
6240 dep_type = REG_NOTE_KIND (link);
6241 PUT_REG_NOTE_KIND (new_link, dep_type);
6243 INSN_DEPEND (x) = new_link;
6244 INSN_DEP_COUNT (insn) += 1;
6249 /* Initialize variables for region data dependence analysis.
6250 n_bbs is the number of region blocks. */
6252 static void
6253 init_deps (deps)
6254 struct deps *deps;
6256 int maxreg = max_reg_num ();
6257 deps->reg_last_uses = (rtx *) xcalloc (maxreg, sizeof (rtx));
6258 deps->reg_last_sets = (rtx *) xcalloc (maxreg, sizeof (rtx));
6259 deps->reg_last_clobbers = (rtx *) xcalloc (maxreg, sizeof (rtx));
6261 deps->pending_read_insns = 0;
6262 deps->pending_read_mems = 0;
6263 deps->pending_write_insns = 0;
6264 deps->pending_write_mems = 0;
6265 deps->pending_lists_length = 0;
6266 deps->last_pending_memory_flush = 0;
6267 deps->last_function_call = 0;
6268 deps->in_post_call_group_p = 0;
6270 deps->sched_before_next_call
6271 = gen_rtx_INSN (VOIDmode, 0, NULL_RTX, NULL_RTX,
6272 NULL_RTX, 0, NULL_RTX, NULL_RTX);
6273 LOG_LINKS (deps->sched_before_next_call) = 0;
6276 /* Add dependences so that branches are scheduled to run last in their
6277 block. */
6279 static void
6280 add_branch_dependences (head, tail)
6281 rtx head, tail;
6283 rtx insn, last;
6285 /* For all branches, calls, uses, clobbers, and cc0 setters, force them
6286 to remain in order at the end of the block by adding dependencies and
6287 giving the last a high priority. There may be notes present, and
6288 prev_head may also be a note.
6290 Branches must obviously remain at the end. Calls should remain at the
6291 end since moving them results in worse register allocation. Uses remain
6292 at the end to ensure proper register allocation. cc0 setters remaim
6293 at the end because they can't be moved away from their cc0 user. */
6294 insn = tail;
6295 last = 0;
6296 while (GET_CODE (insn) == CALL_INSN
6297 || GET_CODE (insn) == JUMP_INSN
6298 || (GET_CODE (insn) == INSN
6299 && (GET_CODE (PATTERN (insn)) == USE
6300 || GET_CODE (PATTERN (insn)) == CLOBBER
6301 #ifdef HAVE_cc0
6302 || sets_cc0_p (PATTERN (insn))
6303 #endif
6305 || GET_CODE (insn) == NOTE)
6307 if (GET_CODE (insn) != NOTE)
6309 if (last != 0
6310 && !find_insn_list (insn, LOG_LINKS (last)))
6312 add_dependence (last, insn, REG_DEP_ANTI);
6313 INSN_REF_COUNT (insn)++;
6316 CANT_MOVE (insn) = 1;
6318 last = insn;
6319 /* Skip over insns that are part of a group.
6320 Make each insn explicitly depend on the previous insn.
6321 This ensures that only the group header will ever enter
6322 the ready queue (and, when scheduled, will automatically
6323 schedule the SCHED_GROUP_P block). */
6324 while (SCHED_GROUP_P (insn))
6326 rtx temp = prev_nonnote_insn (insn);
6327 add_dependence (insn, temp, REG_DEP_ANTI);
6328 insn = temp;
6332 /* Don't overrun the bounds of the basic block. */
6333 if (insn == head)
6334 break;
6336 insn = PREV_INSN (insn);
6339 /* Make sure these insns are scheduled last in their block. */
6340 insn = last;
6341 if (insn != 0)
6342 while (insn != head)
6344 insn = prev_nonnote_insn (insn);
6346 if (INSN_REF_COUNT (insn) != 0)
6347 continue;
6349 add_dependence (last, insn, REG_DEP_ANTI);
6350 INSN_REF_COUNT (insn) = 1;
6352 /* Skip over insns that are part of a group. */
6353 while (SCHED_GROUP_P (insn))
6354 insn = prev_nonnote_insn (insn);
6358 /* After computing the dependencies for block BB, propagate the dependencies
6359 found in TMP_DEPS to the successors of the block. MAX_REG is the number
6360 of registers. */
6361 static void
6362 propagate_deps (bb, tmp_deps, max_reg)
6363 int bb;
6364 struct deps *tmp_deps;
6365 int max_reg;
6367 int b = BB_TO_BLOCK (bb);
6368 int e, first_edge;
6369 int reg;
6370 rtx link_insn, link_mem;
6371 rtx u;
6373 /* These lists should point to the right place, for correct
6374 freeing later. */
6375 bb_deps[bb].pending_read_insns = tmp_deps->pending_read_insns;
6376 bb_deps[bb].pending_read_mems = tmp_deps->pending_read_mems;
6377 bb_deps[bb].pending_write_insns = tmp_deps->pending_write_insns;
6378 bb_deps[bb].pending_write_mems = tmp_deps->pending_write_mems;
6380 /* bb's structures are inherited by its successors. */
6381 first_edge = e = OUT_EDGES (b);
6382 if (e <= 0)
6383 return;
6387 rtx x;
6388 int b_succ = TO_BLOCK (e);
6389 int bb_succ = BLOCK_TO_BB (b_succ);
6390 struct deps *succ_deps = bb_deps + bb_succ;
6392 /* Only bbs "below" bb, in the same region, are interesting. */
6393 if (CONTAINING_RGN (b) != CONTAINING_RGN (b_succ)
6394 || bb_succ <= bb)
6396 e = NEXT_OUT (e);
6397 continue;
6400 for (reg = 0; reg < max_reg; reg++)
6402 /* reg-last-uses lists are inherited by bb_succ. */
6403 for (u = tmp_deps->reg_last_uses[reg]; u; u = XEXP (u, 1))
6405 if (find_insn_list (XEXP (u, 0),
6406 succ_deps->reg_last_uses[reg]))
6407 continue;
6409 succ_deps->reg_last_uses[reg]
6410 = alloc_INSN_LIST (XEXP (u, 0),
6411 succ_deps->reg_last_uses[reg]);
6414 /* reg-last-defs lists are inherited by bb_succ. */
6415 for (u = tmp_deps->reg_last_sets[reg]; u; u = XEXP (u, 1))
6417 if (find_insn_list (XEXP (u, 0),
6418 succ_deps->reg_last_sets[reg]))
6419 continue;
6421 succ_deps->reg_last_sets[reg]
6422 = alloc_INSN_LIST (XEXP (u, 0),
6423 succ_deps->reg_last_sets[reg]);
6426 for (u = tmp_deps->reg_last_clobbers[reg]; u; u = XEXP (u, 1))
6428 if (find_insn_list (XEXP (u, 0),
6429 succ_deps->reg_last_clobbers[reg]))
6430 continue;
6432 succ_deps->reg_last_clobbers[reg]
6433 = alloc_INSN_LIST (XEXP (u, 0),
6434 succ_deps->reg_last_clobbers[reg]);
6438 /* Mem read/write lists are inherited by bb_succ. */
6439 link_insn = tmp_deps->pending_read_insns;
6440 link_mem = tmp_deps->pending_read_mems;
6441 while (link_insn)
6443 if (!(find_insn_mem_list (XEXP (link_insn, 0),
6444 XEXP (link_mem, 0),
6445 succ_deps->pending_read_insns,
6446 succ_deps->pending_read_mems)))
6447 add_insn_mem_dependence (succ_deps, &succ_deps->pending_read_insns,
6448 &succ_deps->pending_read_mems,
6449 XEXP (link_insn, 0), XEXP (link_mem, 0));
6450 link_insn = XEXP (link_insn, 1);
6451 link_mem = XEXP (link_mem, 1);
6454 link_insn = tmp_deps->pending_write_insns;
6455 link_mem = tmp_deps->pending_write_mems;
6456 while (link_insn)
6458 if (!(find_insn_mem_list (XEXP (link_insn, 0),
6459 XEXP (link_mem, 0),
6460 succ_deps->pending_write_insns,
6461 succ_deps->pending_write_mems)))
6462 add_insn_mem_dependence (succ_deps,
6463 &succ_deps->pending_write_insns,
6464 &succ_deps->pending_write_mems,
6465 XEXP (link_insn, 0), XEXP (link_mem, 0));
6467 link_insn = XEXP (link_insn, 1);
6468 link_mem = XEXP (link_mem, 1);
6471 /* last_function_call is inherited by bb_succ. */
6472 for (u = tmp_deps->last_function_call; u; u = XEXP (u, 1))
6474 if (find_insn_list (XEXP (u, 0),
6475 succ_deps->last_function_call))
6476 continue;
6478 succ_deps->last_function_call
6479 = alloc_INSN_LIST (XEXP (u, 0),
6480 succ_deps->last_function_call);
6483 /* last_pending_memory_flush is inherited by bb_succ. */
6484 for (u = tmp_deps->last_pending_memory_flush; u; u = XEXP (u, 1))
6486 if (find_insn_list (XEXP (u, 0),
6487 succ_deps->last_pending_memory_flush))
6488 continue;
6490 succ_deps->last_pending_memory_flush
6491 = alloc_INSN_LIST (XEXP (u, 0),
6492 succ_deps->last_pending_memory_flush);
6495 /* sched_before_next_call is inherited by bb_succ. */
6496 x = LOG_LINKS (tmp_deps->sched_before_next_call);
6497 for (; x; x = XEXP (x, 1))
6498 add_dependence (succ_deps->sched_before_next_call,
6499 XEXP (x, 0), REG_DEP_ANTI);
6501 e = NEXT_OUT (e);
6503 while (e != first_edge);
6506 /* Compute backward dependences inside bb. In a multiple blocks region:
6507 (1) a bb is analyzed after its predecessors, and (2) the lists in
6508 effect at the end of bb (after analyzing for bb) are inherited by
6509 bb's successrs.
6511 Specifically for reg-reg data dependences, the block insns are
6512 scanned by sched_analyze () top-to-bottom. Two lists are
6513 maintained by sched_analyze (): reg_last_sets[] for register DEFs,
6514 and reg_last_uses[] for register USEs.
6516 When analysis is completed for bb, we update for its successors:
6517 ; - DEFS[succ] = Union (DEFS [succ], DEFS [bb])
6518 ; - USES[succ] = Union (USES [succ], DEFS [bb])
6520 The mechanism for computing mem-mem data dependence is very
6521 similar, and the result is interblock dependences in the region. */
6523 static void
6524 compute_block_backward_dependences (bb)
6525 int bb;
6527 int i;
6528 rtx head, tail;
6529 int max_reg = max_reg_num ();
6530 struct deps tmp_deps;
6532 tmp_deps = bb_deps[bb];
6534 /* Do the analysis for this block. */
6535 get_bb_head_tail (bb, &head, &tail);
6536 sched_analyze (&tmp_deps, head, tail);
6537 add_branch_dependences (head, tail);
6539 if (current_nr_blocks > 1)
6540 propagate_deps (bb, &tmp_deps, max_reg);
6542 /* Free up the INSN_LISTs.
6544 Note this loop is executed max_reg * nr_regions times. It's first
6545 implementation accounted for over 90% of the calls to free_INSN_LIST_list.
6546 The list was empty for the vast majority of those calls. On the PA, not
6547 calling free_INSN_LIST_list in those cases improves -O2 compile times by
6548 3-5% on average. */
6549 for (i = 0; i < max_reg; ++i)
6551 if (tmp_deps.reg_last_clobbers[i])
6552 free_INSN_LIST_list (&tmp_deps.reg_last_clobbers[i]);
6553 if (tmp_deps.reg_last_sets[i])
6554 free_INSN_LIST_list (&tmp_deps.reg_last_sets[i]);
6555 if (tmp_deps.reg_last_uses[i])
6556 free_INSN_LIST_list (&tmp_deps.reg_last_uses[i]);
6559 /* Assert that we won't need bb_reg_last_* for this block anymore. */
6560 free (bb_deps[bb].reg_last_uses);
6561 free (bb_deps[bb].reg_last_sets);
6562 free (bb_deps[bb].reg_last_clobbers);
6563 bb_deps[bb].reg_last_uses = 0;
6564 bb_deps[bb].reg_last_sets = 0;
6565 bb_deps[bb].reg_last_clobbers = 0;
6568 /* Print dependences for debugging, callable from debugger. */
6570 void
6571 debug_dependencies ()
6573 int bb;
6575 fprintf (dump, ";; --------------- forward dependences: ------------ \n");
6576 for (bb = 0; bb < current_nr_blocks; bb++)
6578 if (1)
6580 rtx head, tail;
6581 rtx next_tail;
6582 rtx insn;
6584 get_bb_head_tail (bb, &head, &tail);
6585 next_tail = NEXT_INSN (tail);
6586 fprintf (dump, "\n;; --- Region Dependences --- b %d bb %d \n",
6587 BB_TO_BLOCK (bb), bb);
6589 fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
6590 "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
6591 fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
6592 "----", "----", "--", "---", "----", "----", "--------", "-----");
6593 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
6595 rtx link;
6596 int unit, range;
6598 if (! INSN_P (insn))
6600 int n;
6601 fprintf (dump, ";; %6d ", INSN_UID (insn));
6602 if (GET_CODE (insn) == NOTE)
6604 n = NOTE_LINE_NUMBER (insn);
6605 if (n < 0)
6606 fprintf (dump, "%s\n", GET_NOTE_INSN_NAME (n));
6607 else
6608 fprintf (dump, "line %d, file %s\n", n,
6609 NOTE_SOURCE_FILE (insn));
6611 else
6612 fprintf (dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
6613 continue;
6616 unit = insn_unit (insn);
6617 range = (unit < 0
6618 || function_units[unit].blockage_range_function == 0) ? 0 :
6619 function_units[unit].blockage_range_function (insn);
6620 fprintf (dump,
6621 ";; %s%5d%6d%6d%6d%6d%6d %3d -%3d ",
6622 (SCHED_GROUP_P (insn) ? "+" : " "),
6623 INSN_UID (insn),
6624 INSN_CODE (insn),
6625 INSN_BB (insn),
6626 INSN_DEP_COUNT (insn),
6627 INSN_PRIORITY (insn),
6628 insn_cost (insn, 0, 0),
6629 (int) MIN_BLOCKAGE_COST (range),
6630 (int) MAX_BLOCKAGE_COST (range));
6631 insn_print_units (insn);
6632 fprintf (dump, "\t: ");
6633 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
6634 fprintf (dump, "%d ", INSN_UID (XEXP (link, 0)));
6635 fprintf (dump, "\n");
6639 fprintf (dump, "\n");
6642 /* Set_priorities: compute priority of each insn in the block. */
6644 static int
6645 set_priorities (bb)
6646 int bb;
6648 rtx insn;
6649 int n_insn;
6651 rtx tail;
6652 rtx prev_head;
6653 rtx head;
6655 get_bb_head_tail (bb, &head, &tail);
6656 prev_head = PREV_INSN (head);
6658 if (head == tail && (! INSN_P (head)))
6659 return 0;
6661 n_insn = 0;
6662 for (insn = tail; insn != prev_head; insn = PREV_INSN (insn))
6665 if (GET_CODE (insn) == NOTE)
6666 continue;
6668 if (!(SCHED_GROUP_P (insn)))
6669 n_insn++;
6670 (void) priority (insn);
6673 return n_insn;
6676 /* Schedule a region. A region is either an inner loop, a loop-free
6677 subroutine, or a single basic block. Each bb in the region is
6678 scheduled after its flow predecessors. */
6680 static void
6681 schedule_region (rgn)
6682 int rgn;
6684 int bb;
6685 int rgn_n_insns = 0;
6686 int sched_rgn_n_insns = 0;
6687 regset_head reg_pending_sets_head;
6688 regset_head reg_pending_clobbers_head;
6690 /* Set variables for the current region. */
6691 current_nr_blocks = RGN_NR_BLOCKS (rgn);
6692 current_blocks = RGN_BLOCKS (rgn);
6694 reg_pending_sets = INITIALIZE_REG_SET (reg_pending_sets_head);
6695 reg_pending_clobbers = INITIALIZE_REG_SET (reg_pending_clobbers_head);
6696 reg_pending_sets_all = 0;
6698 /* Initializations for region data dependence analyisis. */
6699 bb_deps = (struct deps *) xmalloc (sizeof (struct deps) * current_nr_blocks);
6700 for (bb = 0; bb < current_nr_blocks; bb++)
6701 init_deps (bb_deps + bb);
6703 /* Compute LOG_LINKS. */
6704 for (bb = 0; bb < current_nr_blocks; bb++)
6705 compute_block_backward_dependences (bb);
6707 /* Compute INSN_DEPEND. */
6708 for (bb = current_nr_blocks - 1; bb >= 0; bb--)
6709 compute_block_forward_dependences (bb);
6711 /* Delete line notes and set priorities. */
6712 for (bb = 0; bb < current_nr_blocks; bb++)
6714 if (write_symbols != NO_DEBUG)
6716 save_line_notes (bb);
6717 rm_line_notes (bb);
6720 rgn_n_insns += set_priorities (bb);
6723 /* Compute interblock info: probabilities, split-edges, dominators, etc. */
6724 if (current_nr_blocks > 1)
6726 int i;
6728 prob = (float *) xmalloc ((current_nr_blocks) * sizeof (float));
6730 bbset_size = current_nr_blocks / HOST_BITS_PER_WIDE_INT + 1;
6731 dom = (bbset *) xmalloc (current_nr_blocks * sizeof (bbset));
6732 for (i = 0; i < current_nr_blocks; i++)
6733 dom[i] = (bbset) xcalloc (bbset_size, sizeof (HOST_WIDE_INT));
6735 /* Edge to bit. */
6736 rgn_nr_edges = 0;
6737 edge_to_bit = (int *) xmalloc (nr_edges * sizeof (int));
6738 for (i = 1; i < nr_edges; i++)
6739 if (CONTAINING_RGN (FROM_BLOCK (i)) == rgn)
6740 EDGE_TO_BIT (i) = rgn_nr_edges++;
6741 rgn_edges = (int *) xmalloc (rgn_nr_edges * sizeof (int));
6743 rgn_nr_edges = 0;
6744 for (i = 1; i < nr_edges; i++)
6745 if (CONTAINING_RGN (FROM_BLOCK (i)) == (rgn))
6746 rgn_edges[rgn_nr_edges++] = i;
6748 /* Split edges. */
6749 edgeset_size = rgn_nr_edges / HOST_BITS_PER_WIDE_INT + 1;
6750 edgeset_bitsize = rgn_nr_edges;
6751 pot_split = (edgeset *) xmalloc (current_nr_blocks * sizeof (edgeset));
6752 ancestor_edges
6753 = (edgeset *) xmalloc (current_nr_blocks * sizeof (edgeset));
6754 for (i = 0; i < current_nr_blocks; i++)
6756 pot_split[i] =
6757 (edgeset) xcalloc (edgeset_size, sizeof (HOST_WIDE_INT));
6758 ancestor_edges[i] =
6759 (edgeset) xcalloc (edgeset_size, sizeof (HOST_WIDE_INT));
6762 /* Compute probabilities, dominators, split_edges. */
6763 for (bb = 0; bb < current_nr_blocks; bb++)
6764 compute_dom_prob_ps (bb);
6767 /* Now we can schedule all blocks. */
6768 for (bb = 0; bb < current_nr_blocks; bb++)
6769 sched_rgn_n_insns += schedule_block (bb, rgn_n_insns);
6771 /* Sanity check: verify that all region insns were scheduled. */
6772 if (sched_rgn_n_insns != rgn_n_insns)
6773 abort ();
6775 /* Restore line notes. */
6776 if (write_symbols != NO_DEBUG)
6778 for (bb = 0; bb < current_nr_blocks; bb++)
6779 restore_line_notes (bb);
6782 /* Done with this region. */
6783 free_pending_lists ();
6785 FREE_REG_SET (reg_pending_sets);
6786 FREE_REG_SET (reg_pending_clobbers);
6788 free (bb_deps);
6790 if (current_nr_blocks > 1)
6792 int i;
6794 free (prob);
6795 for (i = 0; i < current_nr_blocks; ++i)
6797 free (dom[i]);
6798 free (pot_split[i]);
6799 free (ancestor_edges[i]);
6801 free (dom);
6802 free (edge_to_bit);
6803 free (rgn_edges);
6804 free (pot_split);
6805 free (ancestor_edges);
6809 /* The one entry point in this file. DUMP_FILE is the dump file for
6810 this pass. */
6812 void
6813 schedule_insns (dump_file)
6814 FILE *dump_file;
6816 int *deaths_in_region;
6817 sbitmap blocks, large_region_blocks;
6818 int max_uid;
6819 int b;
6820 rtx insn;
6821 int rgn;
6822 int luid;
6823 int any_large_regions;
6825 /* Disable speculative loads in their presence if cc0 defined. */
6826 #ifdef HAVE_cc0
6827 flag_schedule_speculative_load = 0;
6828 #endif
6830 /* Taking care of this degenerate case makes the rest of
6831 this code simpler. */
6832 if (n_basic_blocks == 0)
6833 return;
6835 /* Set dump and sched_verbose for the desired debugging output. If no
6836 dump-file was specified, but -fsched-verbose=N (any N), print to stderr.
6837 For -fsched-verbose=N, N>=10, print everything to stderr. */
6838 sched_verbose = sched_verbose_param;
6839 if (sched_verbose_param == 0 && dump_file)
6840 sched_verbose = 1;
6841 dump = ((sched_verbose_param >= 10 || !dump_file) ? stderr : dump_file);
6843 nr_inter = 0;
6844 nr_spec = 0;
6846 /* Initialize issue_rate. */
6847 issue_rate = ISSUE_RATE;
6849 split_all_insns (1);
6851 /* We use LUID 0 for the fake insn (UID 0) which holds dependencies for
6852 pseudos which do not cross calls. */
6853 max_uid = get_max_uid () + 1;
6855 h_i_d = (struct haifa_insn_data *) xcalloc (max_uid, sizeof (*h_i_d));
6857 h_i_d[0].luid = 0;
6858 luid = 1;
6859 for (b = 0; b < n_basic_blocks; b++)
6860 for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
6862 INSN_LUID (insn) = luid;
6864 /* Increment the next luid, unless this is a note. We don't
6865 really need separate IDs for notes and we don't want to
6866 schedule differently depending on whether or not there are
6867 line-number notes, i.e., depending on whether or not we're
6868 generating debugging information. */
6869 if (GET_CODE (insn) != NOTE)
6870 ++luid;
6872 if (insn == BLOCK_END (b))
6873 break;
6876 /* ?!? We could save some memory by computing a per-region luid mapping
6877 which could reduce both the number of vectors in the cache and the size
6878 of each vector. Instead we just avoid the cache entirely unless the
6879 average number of instructions in a basic block is very high. See
6880 the comment before the declaration of true_dependency_cache for
6881 what we consider "very high". */
6882 if (luid / n_basic_blocks > 100 * 5)
6884 true_dependency_cache = sbitmap_vector_alloc (luid, luid);
6885 sbitmap_vector_zero (true_dependency_cache, luid);
6888 nr_regions = 0;
6889 rgn_table = (region *) xmalloc ((n_basic_blocks) * sizeof (region));
6890 rgn_bb_table = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6891 block_to_bb = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6892 containing_rgn = (int *) xmalloc ((n_basic_blocks) * sizeof (int));
6894 blocks = sbitmap_alloc (n_basic_blocks);
6895 large_region_blocks = sbitmap_alloc (n_basic_blocks);
6897 compute_bb_for_insn (max_uid);
6899 /* Compute regions for scheduling. */
6900 if (reload_completed
6901 || n_basic_blocks == 1
6902 || !flag_schedule_interblock)
6904 find_single_block_region ();
6906 else
6908 /* Verify that a 'good' control flow graph can be built. */
6909 if (is_cfg_nonregular ())
6911 find_single_block_region ();
6913 else
6915 sbitmap *dom;
6916 struct edge_list *edge_list;
6918 dom = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);
6920 /* The scheduler runs after flow; therefore, we can't blindly call
6921 back into find_basic_blocks since doing so could invalidate the
6922 info in global_live_at_start.
6924 Consider a block consisting entirely of dead stores; after life
6925 analysis it would be a block of NOTE_INSN_DELETED notes. If
6926 we call find_basic_blocks again, then the block would be removed
6927 entirely and invalidate our the register live information.
6929 We could (should?) recompute register live information. Doing
6930 so may even be beneficial. */
6931 edge_list = create_edge_list ();
6933 /* Compute the dominators and post dominators. We don't
6934 currently use post dominators, but we should for
6935 speculative motion analysis. */
6936 compute_flow_dominators (dom, NULL);
6938 /* build_control_flow will return nonzero if it detects unreachable
6939 blocks or any other irregularity with the cfg which prevents
6940 cross block scheduling. */
6941 if (build_control_flow (edge_list) != 0)
6942 find_single_block_region ();
6943 else
6944 find_rgns (edge_list, dom);
6946 if (sched_verbose >= 3)
6947 debug_regions ();
6949 /* We are done with flow's edge list. */
6950 free_edge_list (edge_list);
6952 /* For now. This will move as more and more of haifa is converted
6953 to using the cfg code in flow.c. */
6954 free (dom);
6958 deaths_in_region = (int *) xmalloc (sizeof (int) * nr_regions);
6960 init_alias_analysis ();
6962 if (write_symbols != NO_DEBUG)
6964 rtx line;
6966 line_note_head = (rtx *) xcalloc (n_basic_blocks, sizeof (rtx));
6968 /* Save-line-note-head:
6969 Determine the line-number at the start of each basic block.
6970 This must be computed and saved now, because after a basic block's
6971 predecessor has been scheduled, it is impossible to accurately
6972 determine the correct line number for the first insn of the block. */
6974 for (b = 0; b < n_basic_blocks; b++)
6975 for (line = BLOCK_HEAD (b); line; line = PREV_INSN (line))
6976 if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
6978 line_note_head[b] = line;
6979 break;
6983 /* Find units used in this fuction, for visualization. */
6984 if (sched_verbose)
6985 init_target_units ();
6987 /* ??? Add a NOTE after the last insn of the last basic block. It is not
6988 known why this is done. */
6990 insn = BLOCK_END (n_basic_blocks - 1);
6991 if (NEXT_INSN (insn) == 0
6992 || (GET_CODE (insn) != NOTE
6993 && GET_CODE (insn) != CODE_LABEL
6994 /* Don't emit a NOTE if it would end up between an unconditional
6995 jump and a BARRIER. */
6996 && !(GET_CODE (insn) == JUMP_INSN
6997 && GET_CODE (NEXT_INSN (insn)) == BARRIER)))
6998 emit_note_after (NOTE_INSN_DELETED, BLOCK_END (n_basic_blocks - 1));
7000 /* Compute INSN_REG_WEIGHT for all blocks. We must do this before
7001 removing death notes. */
7002 for (b = n_basic_blocks - 1; b >= 0; b--)
7003 find_insn_reg_weight (b);
7005 /* Remove all death notes from the subroutine. */
7006 for (rgn = 0; rgn < nr_regions; rgn++)
7008 sbitmap_zero (blocks);
7009 for (b = RGN_NR_BLOCKS (rgn) - 1; b >= 0; --b)
7010 SET_BIT (blocks, rgn_bb_table [RGN_BLOCKS (rgn) + b]);
7012 deaths_in_region[rgn] = count_or_remove_death_notes (blocks, 1);
7015 /* Schedule every region in the subroutine. */
7016 for (rgn = 0; rgn < nr_regions; rgn++)
7017 schedule_region (rgn);
7019 /* Update life analysis for the subroutine. Do single block regions
7020 first so that we can verify that live_at_start didn't change. Then
7021 do all other blocks. */
7022 /* ??? There is an outside possibility that update_life_info, or more
7023 to the point propagate_block, could get called with non-zero flags
7024 more than once for one basic block. This would be kinda bad if it
7025 were to happen, since REG_INFO would be accumulated twice for the
7026 block, and we'd have twice the REG_DEAD notes.
7028 I'm fairly certain that this _shouldn't_ happen, since I don't think
7029 that live_at_start should change at region heads. Not sure what the
7030 best way to test for this kind of thing... */
7032 allocate_reg_life_data ();
7033 compute_bb_for_insn (max_uid);
7035 any_large_regions = 0;
7036 sbitmap_ones (large_region_blocks);
7038 for (rgn = 0; rgn < nr_regions; rgn++)
7039 if (RGN_NR_BLOCKS (rgn) > 1)
7040 any_large_regions = 1;
7041 else
7043 sbitmap_zero (blocks);
7044 SET_BIT (blocks, rgn_bb_table[RGN_BLOCKS (rgn)]);
7045 RESET_BIT (large_region_blocks, rgn_bb_table[RGN_BLOCKS (rgn)]);
7047 /* Don't update reg info after reload, since that affects
7048 regs_ever_live, which should not change after reload. */
7049 update_life_info (blocks, UPDATE_LIFE_LOCAL,
7050 (reload_completed ? PROP_DEATH_NOTES
7051 : PROP_DEATH_NOTES | PROP_REG_INFO));
7053 #ifndef HAVE_conditional_execution
7054 /* ??? REG_DEAD notes only exist for unconditional deaths. We need
7055 a count of the conditional plus unconditional deaths for this to
7056 work out. */
7057 /* In the single block case, the count of registers that died should
7058 not have changed during the schedule. */
7059 if (count_or_remove_death_notes (blocks, 0) != deaths_in_region[rgn])
7060 abort ();
7061 #endif
7064 if (any_large_regions)
7066 update_life_info (large_region_blocks, UPDATE_LIFE_GLOBAL,
7067 PROP_DEATH_NOTES | PROP_REG_INFO);
7070 /* Reposition the prologue and epilogue notes in case we moved the
7071 prologue/epilogue insns. */
7072 if (reload_completed)
7073 reposition_prologue_and_epilogue_notes (get_insns ());
7075 /* Delete redundant line notes. */
7076 if (write_symbols != NO_DEBUG)
7077 rm_redundant_line_notes ();
7079 if (sched_verbose)
7081 if (reload_completed == 0 && flag_schedule_interblock)
7083 fprintf (dump, "\n;; Procedure interblock/speculative motions == %d/%d \n",
7084 nr_inter, nr_spec);
7086 else
7088 if (nr_inter > 0)
7089 abort ();
7091 fprintf (dump, "\n\n");
7094 /* Clean up. */
7095 end_alias_analysis ();
7097 if (true_dependency_cache)
7099 free (true_dependency_cache);
7100 true_dependency_cache = NULL;
7102 free (rgn_table);
7103 free (rgn_bb_table);
7104 free (block_to_bb);
7105 free (containing_rgn);
7107 free (h_i_d);
7109 if (write_symbols != NO_DEBUG)
7110 free (line_note_head);
7112 if (edge_table)
7114 free (edge_table);
7115 edge_table = NULL;
7118 if (in_edges)
7120 free (in_edges);
7121 in_edges = NULL;
7123 if (out_edges)
7125 free (out_edges);
7126 out_edges = NULL;
7129 sbitmap_free (blocks);
7130 sbitmap_free (large_region_blocks);
7132 free (deaths_in_region);
7135 #endif /* INSN_SCHEDULING */