From bda4da29890fe53199ba01c9937378d4926f7984 Mon Sep 17 00:00:00 2001
From: Sven Verdoolaege
Date: Wed, 9 Mar 2016 10:28:51 +0100
Subject: [PATCH] turn on --isl-schedule-maximize-coincidence by default

Maximizing coincidence ensures that two groups of statements
with a different number of coincident schedule dimensions
are not fused together into a single group with the smaller
number of coincident schedule dimensions.
This means that the group with the higher number of coincident
schedule dimensions can be mapped to more block and thread dimensions.

For example, with --isl-schedule-maximize-coincidence turned on,
the PolyBench trmm test case gets the schedule

    domain: "[n, m] -> { S_1[i, j] : 0 <= i < m and 0 <= j < n; S_0[i, j, k] : i >= 0 and 0 <= j < n and i < k < m }"
    child:
      sequence:
      - filter: "[n, m] -> { S_0[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_0[i, j, k] -> [(j)] }, { S_0[i, j, k] -> [(k)] }, { S_0[i, j, k] -> [(i)] }]"
          permutable: 1
          coincident: [ 1, 0, 0 ]
      - filter: "[n, m] -> { S_1[i, j] }"
        child:
          schedule: "[n, m] -> [{ S_1[i, j] -> [(i)] }, { S_1[i, j] -> [(j)] }]"
          permutable: 1
          coincident: [ 1, 1 ]

Note that there is a sequence of two groups, one with one coincident
schedule dimension and one with two coincident schedule dimensions.
With the option turned off, the following schedule is constructed instead

    domain: "[n, m] -> { S_1[i, j] : 0 <= i < m and 0 <= j < n; S_0[i, j, k] : i >= 0 and 0 <= j < n and i < k < m }"
    child:
      schedule: "[n, m] -> [{ S_0[i, j, k] -> [(j)]; S_1[i, j] -> [(j)] }, { S_0[i, j, k] -> [(k)]; S_1[i, j] -> [(m)] }, { S_0[i, j, k] -> [(i)]; S_1[i, j] -> [(i)] }]"
      permutable: 1
      coincident: [ 1, 0, 0 ]
      child:
        sequence:
        - filter: "[n, m] -> { S_0[i, j, k] }"
        - filter: "[n, m] -> { S_1[i, j] }"

That is, the two statements are fused into a single group
with only one coincident schedule dimension.

Note that the maximize coincidence option only has an effect
when the whole component option is turned off, so do that as well.
This in turn has the effect of making the maximize band depth option,
which is already turned on by default, more effective.
For example, for the PolyBench symm test case, when the whole component
option is turned on, the following schedule is generated.

    domain: "[n, m] -> { S_0[i, j] : 0 <= i < m and 0 <= j < n; S_1[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_2[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_3[i, j] : 0 <= i < m and 0 <= j < n }"
    child:
      schedule: "[n, m] -> [{ S_0[i, j] -> [(i)]; S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(i)]; S_1[i, j, k] -> [(k)] }, { S_0[i, j] -> [(j)]; S_3[i, j] -> [(j)]; S_2[i, j, k] -> [(j)]; S_1[i, j, k] -> [(j)] }]"
      permutable: 1
      coincident: [ 1, 1 ]
      child:
        sequence:
        - filter: "[n, m] -> { S_0[i, j] }"
        - filter: "[n, m] -> { S_3[i, j]; S_2[i, j, k] }"
          child:
            schedule: "[n, m] -> [{ S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(k)] }]"
            child:
              set:
              - filter: "[n, m] -> { S_3[i, j] }"
              - filter: "[n, m] -> { S_2[i, j, k] }"
        - filter: "[n, m] -> { S_1[i, j, k] }"
          child:
            schedule: "[n, m] -> [{ S_1[i, j, k] -> [(i)] }]"

This schedule does not actually maximize the band depth
as shown by the schedule computed with the whole component
option turned off:

    domain: "[n, m] -> { S_0[i, j] : 0 <= i < m and 0 <= j < n; S_1[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_2[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_3[i, j] : 0 <= i < m and 0 <= j < n }"
    child:
      sequence:
      - filter: "[n, m] -> { S_0[i, j]; S_3[i, j]; S_2[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_0[i, j] -> [(i)]; S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(i)] }, { S_0[i, j] -> [(j)]; S_3[i, j] -> [(j)]; S_2[i, j, k] -> [(j)] }]"
          permutable: 1
          coincident: [ 1, 1 ]
          child:
            sequence:
            - filter: "[n, m] -> { S_0[i, j] }"
            - filter: "[n, m] -> { S_2[i, j, k] }"
              child:
                schedule: "[n, m] -> [{ S_2[i, j, k] -> [(k)] }]"
            - filter: "[n, m] -> { S_3[i, j] }"
      - filter: "[n, m] -> { S_1[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_1[i, j, k] -> [(j)] }, { S_1[i, j, k] -> [(k)] }, { S_1[i, j, k] -> [(i)] }]"
          permutable: 1
          coincident: [ 1, 1, 0 ]

This means that the code will now be split over two kernels,
but the code in the second kernel can be tiled in 3D instead
of only 2D.  It is not immediately obvious whether this is
an improvement or not.  Perhaps the usefulness of the maximize
band depth option should be reevaluated.

Signed-off-by: Sven Verdoolaege
---
 ppcg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ppcg.c b/ppcg.c
index ce4a3ab..fa9c1b0 100644
--- a/ppcg.c
+++ b/ppcg.c
@@ -1031,7 +1031,9 @@ int main(int argc, char **argv)
 	ppcg_options_set_target_defaults(options->ppcg);
 	isl_options_set_ast_build_detect_min_max(ctx, 1);
 	isl_options_set_ast_print_macro_once(ctx, 1);
+	isl_options_set_schedule_whole_component(ctx, 0);
 	isl_options_set_schedule_maximize_band_depth(ctx, 1);
+	isl_options_set_schedule_maximize_coincidence(ctx, 1);
 	pet_options_set_encapsulate_dynamic_control(ctx, 1);
 
 	argc = options_parse(options, argc, argv, ISL_ARG_ALL);
-- 
2.11.4.GIT
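
The trmm comparison in the commit message comes down to counting
the members marked 1 in each band's "coincident" array. The sketch
below is only an illustration of that count. The helper
count_coincident and the three arrays are hypothetical names that
appear in neither isl nor PPCG; the array contents are copied from
the trmm schedules quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of isl or PPCG: count the band members
 * that are marked coincident in a "coincident" array of length n. */
int count_coincident(const int *coincident, size_t n)
{
	int count = 0;

	assert(coincident != NULL || n == 0);
	for (size_t i = 0; i < n; ++i)
		if (coincident[i])
			count++;
	return count;
}

/* "coincident" arrays copied from the trmm schedules in the commit message. */

/* Option on: S_0 and S_1 are kept in separate groups. */
const int trmm_s0_separate[] = { 1, 0, 0 };
const int trmm_s1_separate[] = { 1, 1 };

/* Option off: S_0 and S_1 are fused into a single band. */
const int trmm_fused[] = { 1, 0, 0 };
```

Calling count_coincident on trmm_s1_separate with length 2 gives 2,
while calling it on trmm_fused with length 3 gives 1, which is the
drop in coincident dimensions for S_1 that the commit message
describes.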