Public Git Hosting - ppcg.git/commit

turn on --isl-schedule-maximize-coincidence by default

Maximizing coincidence ensures that two groups of statements
with a different number of coincident schedule dimensions
are not fused together into a single group with the smaller
number of coincident schedule dimensions.
This means that the group with the higher number of
coincident schedule dimensions can be mapped to more
block and thread dimensions.
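
As a minimal sketch of this reasoning, the snippet below counts the coincident members of a band. It assumes that this count bounds the number of block and thread dimensions a group can be mapped to, ignoring any hardware limits. The group names `A` and `B`, the helper `mappable_dims` and the coincident flags are illustrative and not taken from the scheduler.

```python
def mappable_dims(coincident):
    """Count the coincident members of a band.

    Assumption: this count bounds the number of block/thread
    dimensions the band can be mapped to.
    """
    return sum(coincident)


# Illustrative coincident flags, one band per group (hypothetical values).
separate = {"A": [1, 0, 0], "B": [1, 1]}

# If the two groups are fused, they share a single band whose number of
# coincident members is the smaller of the two.
fused_band = [1, 0, 0]

separate_dims = {name: mappable_dims(flags) for name, flags in separate.items()}
fused_dims = {name: mappable_dims(fused_band) for name in separate}

print(separate_dims)
print(fused_dims)
```

In this model, group `B` keeps two mappable dimensions when it is scheduled on its own, but only one when it is fused with group `A`.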

For example, with --isl-schedule-maximize-coincidence turned on,
the PolyBench trmm test case gets the schedule:

    domain: "[n, m] -> { S_1[i, j] : 0 <= i < m and 0 <= j < n; S_0[i, j, k] : i >= 0 and 0 <= j < n and i < k < m }"
    child:
      sequence:
      - filter: "[n, m] -> { S_0[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_0[i, j, k] -> [(j)] }, { S_0[i, j, k] -> [(k)] }, { S_0[i, j, k] -> [(i)] }]"
          permutable: 1
          coincident: [ 1, 0, 0 ]
      - filter: "[n, m] -> { S_1[i, j] }"
        child:
          schedule: "[n, m] -> [{ S_1[i, j] -> [(i)] }, { S_1[i, j] -> [(j)] }]"
          permutable: 1
          coincident: [ 1, 1 ]

Note that there is a sequence of two groups, one with
one coincident schedule dimension and one with two
coincident schedule dimensions.  With the option turned off,
the following schedule is constructed instead:

    domain: "[n, m] -> { S_1[i, j] : 0 <= i < m and 0 <= j < n; S_0[i, j, k] : i >= 0 and 0 <= j < n and i < k < m }"
    child:
      schedule: "[n, m] -> [{ S_0[i, j, k] -> [(j)]; S_1[i, j] -> [(j)] }, { S_0[i, j, k] -> [(k)]; S_1[i, j] -> [(m)] }, { S_0[i, j, k] -> [(i)]; S_1[i, j] -> [(i)] }]"
      permutable: 1
      coincident: [ 1, 0, 0 ]
      child:
        sequence:
        - filter: "[n, m] -> { S_0[i, j, k] }"
        - filter: "[n, m] -> { S_1[i, j] }"

That is, the two statements are fused into a single group
with only one coincident schedule dimension.
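
Both schedules order the same statement instances, only differently. The sketch below enumerates the instances from the domain above for small `n` and `m`, executes them in the order given by each schedule, and compares the result with a source-order execution. The statement bodies (`B[i][j] += A[k][i] * B[k][j]` for `S_0` and `B[i][j] *= alpha` for `S_1`) and the source loop order are assumptions chosen to be consistent with the domain. They are not part of this commit, and all helper names are hypothetical.

```python
def instances(n, m):
    """Enumerate the statement instances of the domain shown above."""
    s0 = [("S_0", (i, j, k))
          for i in range(m) for j in range(n) for k in range(i + 1, m)]
    s1 = [("S_1", (i, j)) for i in range(m) for j in range(n)]
    return s0 + s1


def original_time(inst):
    """Assumed source order: loops i, j, then the k loop (S_0) before S_1."""
    name, idx = inst
    if name == "S_0":
        i, j, k = idx
        return (i, j, 0, k)
    i, j = idx
    return (i, j, 1, 0)


def split_time(inst):
    """Option on: a sequence of two bands, S_0 -> (j, k, i), S_1 -> (i, j)."""
    name, idx = inst
    if name == "S_0":
        i, j, k = idx
        return (0, j, k, i)
    i, j = idx
    return (1, i, j, 0)


def fused_time(inst, m):
    """Option off: one band, S_0 -> (j, k, i), S_1 -> (j, m, i), S_0 first."""
    name, idx = inst
    if name == "S_0":
        i, j, k = idx
        return (j, k, i, 0)
    i, j = idx
    return (j, m, i, 1)


def run(order, A, B, alpha):
    """Execute the assumed statement bodies in the given order on a copy of B."""
    B = [row[:] for row in B]
    for name, idx in order:
        if name == "S_0":
            i, j, k = idx
            B[i][j] += A[k][i] * B[k][j]
        else:
            i, j = idx
            B[i][j] *= alpha
    return B


# Small integer inputs keep the comparison exact.
n, m, alpha = 3, 4, 2
A = [[r * m + c + 1 for c in range(m)] for r in range(m)]
B = [[r + 2 * c + 1 for c in range(n)] for r in range(m)]
insts = instances(n, m)

reference = run(sorted(insts, key=original_time), A, B, alpha)
split = run(sorted(insts, key=split_time), A, B, alpha)
fused = run(sorted(insts, key=lambda inst: fused_time(inst, m)), A, B, alpha)

print(split == reference, fused == reference)
```

Under these assumptions both orders compute the same values. The difference between the two schedules is therefore only in how much parallelism each group exposes, not in what is computed.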

Note that the maximize coincidence option only has an effect
when the whole component option is turned off, so turn off
the whole component option as well.
This in turn has the effect of making the maximize band depth option,
which is already turned on by default, more effective.

For example, for the PolyBench symm test case, when the whole
component option is turned on, the following schedule is generated.

    domain: "[n, m] -> { S_0[i, j] : 0 <= i < m and 0 <= j < n; S_1[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_2[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_3[i, j] : 0 <= i < m and 0 <= j < n }"
    child:
      schedule: "[n, m] -> [{ S_0[i, j] -> [(i)]; S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(i)]; S_1[i, j, k] -> [(k)] }, { S_0[i, j] -> [(j)]; S_3[i, j] -> [(j)]; S_2[i, j, k] -> [(j)]; S_1[i, j, k] -> [(j)] }]"
      permutable: 1
      coincident: [ 1, 1 ]
      child:
        sequence:
        - filter: "[n, m] -> { S_0[i, j] }"
        - filter: "[n, m] -> { S_3[i, j]; S_2[i, j, k] }"
          child:
            schedule: "[n, m] -> [{ S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(k)] }]"
            child:
              set:
              - filter: "[n, m] -> { S_3[i, j] }"
              - filter: "[n, m] -> { S_2[i, j, k] }"
        - filter: "[n, m] -> { S_1[i, j, k] }"
          child:
            schedule: "[n, m] -> [{ S_1[i, j, k] -> [(i)] }]"

This schedule does not actually maximize the band depth, as
shown by the schedule computed with the whole component option turned off:

    domain: "[n, m] -> { S_0[i, j] : 0 <= i < m and 0 <= j < n; S_1[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_2[i, j, k] : i < m and 0 <= j < n and 0 <= k < i; S_3[i, j] : 0 <= i < m and 0 <= j < n }"
    child:
      sequence:
      - filter: "[n, m] -> { S_0[i, j]; S_3[i, j]; S_2[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_0[i, j] -> [(i)]; S_3[i, j] -> [(i)]; S_2[i, j, k] -> [(i)] }, { S_0[i, j] -> [(j)]; S_3[i, j] -> [(j)]; S_2[i, j, k] -> [(j)] }]"
          permutable: 1
          coincident: [ 1, 1 ]
          child:
            sequence:
            - filter: "[n, m] -> { S_0[i, j] }"
            - filter: "[n, m] -> { S_2[i, j, k] }"
              child:
                schedule: "[n, m] -> [{ S_2[i, j, k] -> [(k)] }]"
            - filter: "[n, m] -> { S_3[i, j] }"
      - filter: "[n, m] -> { S_1[i, j, k] }"
        child:
          schedule: "[n, m] -> [{ S_1[i, j, k] -> [(j)] }, { S_1[i, j, k] -> [(k)] }, { S_1[i, j, k] -> [(i)] }]"
          permutable: 1
          coincident: [ 1, 1, 0 ]

This means that the code will now be split over two kernels,
but the code in the second kernel can be tiled in 3D instead of only 2D.
It is not immediately obvious whether this is an improvement or not.
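
The trade-off can be read off the shape of the two trees. The sketch below reduces each symm schedule above to a hypothetical nested dictionary that keeps only the outermost bands and their depths. It assumes that each outermost band becomes one kernel and that its depth bounds the tiling dimensionality. The dictionary encoding and the helper `outer_bands` are illustrative only.

```python
def outer_bands(node):
    """Return the depth of the outermost band under each branch of the tree."""
    if "band" in node:
        return [node["band"]]
    return [depth for child in node["sequence"] for depth in outer_bands(child)]


# Whole component option on: a single outer band of depth 2.
whole_component_on = {"band": 2}

# Whole component option off: a sequence of two outer bands, depths 2 and 3.
whole_component_off = {"sequence": [{"band": 2}, {"band": 3}]}

for label, tree in [("on", whole_component_on), ("off", whole_component_off)]:
    depths = outer_bands(tree)
    print(label, "kernels:", len(depths), "band depths:", depths)
```

In this model, turning the option off trades one kernel with a 2D band for two kernels, the second of which has a 3D band.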

Perhaps the usefulness of the maximize band depth option should
be reevaluated.

Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>

commit	bda4da29890fe53199ba01c9937378d4926f7984
author	Sven Verdoolaege <skimo@kotnet.org>
	Wed, 9 Mar 2016 09:28:51 +0000 (10:28 +0100)
committer	Sven Verdoolaege <skimo@kotnet.org>
	Wed, 9 Mar 2016 10:26:44 +0000 (11:26 +0100)
tree	1d1c9873d9c1fe38952342cfde8d6c6bda2afe03
parent	7213b0634e886821305428c1e9253e9ef4a1352c