isl_basic_map_realign: perform Gaussian elimination on result Many parts of isl assume that Gaussian elimination has been applied to the equality constraints. In particular singleton_extract_point makes this assumption. The input to singleton_extract_point may have undergone parameter alignment. This parameter alignment (ultimately performed by isl_basic_map_realign) therefore needs to make sure the result preserves this property. Reported-by: Tobias Grosser <tobias@grosser.es> Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_map_simplify.c: coalesce_divs: add braces to avoid "dangling else" warning The "dangling else" that clang complains about was introduced in 8f099db (isl_map_simplify.c: coalesce_divs: fix overly zealous clean-up, Thu Jan 14 17:11:15 2016 +0100). Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_schedule_constraints_compute_schedule: optionally maximize coincidence Now that the scheduler can be instructed to incrementally combine (clusters of) strongly connected components, it becomes easy to refuse to combine such clusters if this combination would reduce the number of coincident schedule dimensions. Introduce an option to select this behavior. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_schedule_constraints_compute_schedule: allow merging with unbounded distance The previous commit introduced the option to incrementally combine SCCs. However, each time two (or more) clusters of SCCs are combined, the merge is only allowed if at least one proximity constraint is completely optimized. Take for example the atax benchmark from PolyBench: for (i = 0; i < _PB_N; i++) y[i] = 0; for (i = 0; i < _PB_M; i++) { tmp[i] = SCALAR_VAL(0.0); for (j = 0; j < _PB_N; j++) tmp[i] = tmp[i] + A[i][j] * x[j]; for (j = 0; j < _PB_N; j++) y[j] = y[j] + A[i][j] * tmp[i]; } The third statement is not merged with the fourth statement because every value written by the third statement is used by several instance of the fourth statement. It may however be useful to tile all statements together. This would especially be the case if the third statement were to be removed, such that there would be a proximity dependence between the second and the fourth statement. In both cases, a single value written by one statement is read by several instances of another statement. Allow merging such statements by allowing non-bounded dependence distances, but only in cases where the proximity constraint itself does not allow complete optimization in all dimensions. However, such merges are only allowed after all other complete merges have been tried. For example, in atax, the first statement can also be merged with the fourth statement and this can be done with zero distances. This merge should be done first because after the merge of the third and the fourth statement there is an internal lower bound on the distance over proximity constraints and the optimizer will therefore not attempt to reduce the distance over the proximity constraint between the first and the fourth statement beyond this lower bound. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_schedule_constraints_compute_schedule: allow incremental combination of SCCs Unless the schedule_serialize_sccs option is set, an entire (weakly) connected component is currently scheduled as a whole. This may result in schedule problems with a large number of variables, which in turn may result in high scheduling times, and it makes it more difficult to control how strongly connected components in the dependence graph should be combined as everything needs to be encoded in a single (I)LP problem. Introduce an alternative way of computing a schedule for a weakly connected component where a partial schedule is first computed for each strongly connected component (SCC) within the weakly connected component separately and where these SCCs are subsequently combined into clusters by scheduling the SCCs (or clusters of SCCs) with respect to each other. This usually results in a smaller number of variables since the number of statements in an SCC and the number of clusters that are being combined together in one go is usually smaller than the total number of statements in the weakly connected component. There is also more direct control over which SCCs should be combined together. In particular, in the next commit this flexibility will be exploited to (optionally) only combine clusters if this does not reduce the level of coincidence. The heuristic used to select pairs of clusters to be merged has only received limited investigation. The current heuristic favors the combination of bands that are linked by a proximity edge that allows many schedule dimensions to be aligned and that are not far apart in the topological order of the strongly connected components. The merge of two (or more) clusters is only accepted if their is at least one proximity constraint between two clusters that is completely optimized (up to a small constant). An alternative for scheduling clusters with respect to each other would be to reschedule all statements in the clusters that are being merged. This would typically require more variables, but it would allow for schedules that are now being missed because the schedule of an SCC is fixed once it has been computed. A schedule_whole_component option is introduced to choose between the old and the new algorithm, which is default on, corresponding to the original behavior. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

add isl_basic_map_transform_dims That is, generalize isl_basic_set_transform_dims to a function that also applies to basic maps. This will be useful for evaluating the relative importance of proximity constraints. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

add isl_tarjan_graph_component This is useful if the user is only interested in the strongly connected component containing a particular node. In particular, we will use this function to combine SCCs into a cluster containing two pre-selected SCCs and all intermediate SCCs. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_scheduler.c: make detect_ccs more generally useful In particular, take a function defining the edges as input instead of deciding which function to use based on a "weak" property. This will allow us to reuse this function to detect strongly connected components in modified graphs. In particular, it will allow us to keep into account the clustering when we incrementally combine strongly connected components into clusters. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_scheduler.c: compute_schedule_from_partial: allow uninitialized graph The function compute_schedule_from_partial currently only gets called on the same graph on which compute_schedule_wcc_band was called before. This means that necessarily compute_maxvar has been called on the graph as well. In the future, we will want to also call compute_schedule_from_partial on other graphs, in which case compute_schedule_from_partial may need to call compute_maxvar itself. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_scheduler.c: compute_schedule_wcc: extract out compute_schedule_wcc_partial This further reduces the size of compute_schedule_wcc and, together with the previous commit, this allows us to perform additional computations between the pure construction of a band and the incoporation of that band in the schedule. This will allow us in particular to attempt to combine strongly connected components first. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

isl_scheduler.c: compute_schedule_wcc: extract out compute_schedule_finish_band This reduces the size of compute_schedule_wcc and, together with the next commit, this will allow us to perform additional computations between the pure construction of a band and the incoporation of that band in the schedule. This will allow us in particular to attempt to combine strongly connected components first. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>