ppcg.git
13 months ago  PPCG 0.09.1  (master, ppcg-0.09.1)
Sven Verdoolaege [Sun, 2 Apr 2023 12:20:19 +0000 (2 14:20 +0200)]
PPCG 0.09.1

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
13 months ago  update pet to version 0.11.7
Sven Verdoolaege [Sun, 2 Apr 2023 08:40:34 +0000 (2 10:40 +0200)]
update pet to version 0.11.7

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
13 months ago  update isl to version 0.26
Sven Verdoolaege [Sun, 2 Apr 2023 08:11:25 +0000 (2 10:11 +0200)]
update isl to version 0.26

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
13 months ago  update isl for change in coalescing
Sven Verdoolaege [Tue, 7 Mar 2023 20:56:33 +0000 (7 21:56 +0100)]
update isl for change in coalescing

This has an effect on one of the pet test cases.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
14 months ago  update pet for support for recent versions of clang
Sven Verdoolaege [Sun, 12 Feb 2023 21:56:53 +0000 (12 22:56 +0100)]
update pet for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
14 months ago  update isl for support for recent versions of clang
Sven Verdoolaege [Sun, 12 Feb 2023 21:55:47 +0000 (12 22:55 +0100)]
update isl for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
14 months ago  add configure~ to .gitignore
Sven Verdoolaege [Sun, 12 Feb 2023 11:38:23 +0000 (12 12:38 +0100)]
add configure~ to .gitignore

Autoconf 2.70 (and later) may leave such backup files.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
14 months ago  cpu.c: construct_cpu_schedule_constraints: set context
Sven Verdoolaege [Sun, 12 Feb 2023 11:34:53 +0000 (12 12:34 +0100)]
cpu.c: construct_cpu_schedule_constraints: set context

Do this for consistency with the GPU backend.
The context provides extra information that may be exploited
by the scheduler.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
14 months ago  replace obsolete AC_PROG_LIBTOOL by LT_INIT
Sven Verdoolaege [Sun, 5 Feb 2023 10:50:17 +0000 (5 11:50 +0100)]
replace obsolete AC_PROG_LIBTOOL by LT_INIT

AC_PROG_LIBTOOL has been obsolete since libtool 2.2.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
16 months ago  schedule.c: add missing include
Sven Verdoolaege [Sat, 17 Dec 2022 13:32:55 +0000 (17 14:32 +0100)]
schedule.c: add missing include

This was missing from ppcg-0.08.5-9-g6b473a41 (try and remove strides
in bands before tiling, Sat Jul 17 23:08:11 2021 +0200).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  PPCG 0.09  (ppcg-0.09)
Sven Verdoolaege [Sat, 2 Jul 2022 14:16:32 +0000 (2 16:16 +0200)]
PPCG 0.09

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  update pet to version 0.11.6
Sven Verdoolaege [Sat, 2 Jul 2022 13:58:49 +0000 (2 15:58 +0200)]
update pet to version 0.11.6

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  update isl to version 0.25
Sven Verdoolaege [Sat, 2 Jul 2022 13:23:29 +0000 (2 15:23 +0200)]
update isl to version 0.25

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  update pet for support for recent versions of clang
Sven Verdoolaege [Sat, 2 Jul 2022 12:15:00 +0000 (2 14:15 +0200)]
update pet for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  update isl for support for recent versions of clang
Sven Verdoolaege [Sat, 2 Jul 2022 10:42:13 +0000 (2 12:42 +0200)]
update isl for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  inform user about empty context
Sven Verdoolaege [Mon, 29 Nov 2021 15:03:27 +0000 (29 16:03 +0100)]
inform user about empty context

An empty context means that the original code cannot be executed.
It can be confusing to see the entire code is missing from
the generated code if the user is not aware of the empty context.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  avoid using empty context to print array declarations
Sven Verdoolaege [Sun, 28 Nov 2021 20:34:02 +0000 (28 21:34 +0100)]
avoid using empty context to print array declarations

The context can be empty if it turns out the scop cannot
be executed for any value of the parameters.
Using such an empty context to simplify expressions
can result in void expressions that cannot be printed.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
22 months ago  update pet for preserving array extents in case of empty context
Sven Verdoolaege [Sat, 2 Jul 2022 08:39:19 +0000 (2 10:39 +0200)]
update pet for preserving array extents in case of empty context

This is needed for the test case in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu: avoid mapping some nested non-permutable bands to the device
Sven Verdoolaege [Sat, 27 Nov 2021 16:43:45 +0000 (27 17:43 +0100)]
gpu: avoid mapping some nested non-permutable bands to the device

If the outer node of the schedule tree is a set or a sequence,
then all children (for a set) or the initial and final children
(for a sequence) that do not have any permutable bands
are already executed on the CPU.
The outer node may however have children that are themselves
set or sequence nodes.  If only some of their children have
permutable bands, then the entire child of the outer node
was still being mapped to the device.
Allow initial/final children to be collected recursively
to avoid more non-permutable bands getting mapped to the device.
In the case of a set node, recursion can be performed on any child
since the initial/final parts of any child of a set node can
be moved first/last.
In the case of a sequence node, recursion can only
be performed on the first/last child.

Note that the entire set of collected descendants is moved
before/after the other descendants.  The relative order
within each of these two parts is preserved.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu: avoid mapping final non-permutable bands to the device
Sven Verdoolaege [Fri, 24 Dec 2021 15:04:38 +0000 (24 16:04 +0100)]
gpu: avoid mapping final non-permutable bands to the device

If the outer node of the schedule tree is a sequence and the final
children of this sequence do not have any permutable bands,
then there is no point in including these final children
in the part that is mapped to the device.
These final children can be run on the CPU instead.
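
As a rough illustration of the split described above, the following Python sketch models the children of a sequence as (name, has_permutable_band) pairs. It is not PPCG code; the function name and the example children are hypothetical.

```python
def split_sequence(children):
    """Split the children of a sequence into (initial, device, final).

    Each child is a (name, has_permutable_band) pair.  Leading and trailing
    children without permutable bands stay on the CPU; everything in
    between is the part that is mapped to the device.
    """
    first = 0
    while first < len(children) and not children[first][1]:
        first += 1
    last = len(children)
    while last > first and not children[last - 1][1]:
        last -= 1
    return children[:first], children[first:last], children[last:]


# Hypothetical sequence: only kernel1 and kernel2 have permutable bands.
seq = [("init", False), ("kernel1", True), ("scalar", False),
       ("kernel2", True), ("finish", False)]
initial, device, final = split_sequence(seq)
```

Note that "scalar" stays in the device part in this model: order matters in a sequence, so only the leading and trailing children are peeled off.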

This extends earlier support for separating out independent and
initial non-permutable bands in ppcg-0.03-191-g6fa73710 (gpu:
avoid mapping independent non-permutable bands to the device,
Thu Oct 24 13:15:10 2013 +0200) and ppcg-0.04-49-g201e18aa (gpu:
avoid mapping initial non-permutable bands to the device,
Tue Dec 15 12:13:30 2015 +0100).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: isolate_permutable_subtrees: say why permutable subtrees are put first
Sven Verdoolaege [Mon, 27 Dec 2021 10:27:46 +0000 (27 11:27 +0100)]
gpu.c: isolate_permutable_subtrees: say why permutable subtrees are put first

That is, in the case of a set node, the permutable subtrees
can be executed either before or after the subtrees without
permutable bands.  Explain why the permutable subtrees
are executed first.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: get_non_parallel_subtree_filters: optionally select final subtrees
Sven Verdoolaege [Fri, 24 Dec 2021 14:58:52 +0000 (24 15:58 +0100)]
gpu.c: get_non_parallel_subtree_filters: optionally select final subtrees

This will allow isolate_permutable_subtrees to separate out
both initial and final subtrees in an upcoming commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: get_non_parallel_subtree_filters: select subtrees based on node type
Sven Verdoolaege [Fri, 24 Dec 2021 14:55:33 +0000 (24 15:55 +0100)]
gpu.c: get_non_parallel_subtree_filters: select subtrees based on node type

The function get_non_parallel_subtree_filters selects either
the initial non-parallel subtrees or all non-parallel subtrees.
Originally, the choice was made based on a function parameter,
but in the calls of this function this parameter is uniquely
determined by the node type.
Directly use the node type inside get_non_parallel_subtree_filters.
This makes it easier to introduce a function parameter
to choose between the initial and the final subtrees
(for sequence nodes) in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: get_non_parallel_subtree_filters: use isl_bool for local variable
Sven Verdoolaege [Fri, 24 Dec 2021 14:42:29 +0000 (24 15:42 +0100)]
gpu.c: get_non_parallel_subtree_filters: use isl_bool for local variable

This was missing from ppcg-0.07-21-g56f199d9 (gpu.c:
has_any_permutable_node: return isl_bool, Thu Sep 14 17:26:13 2017 +0200).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: get_non_parallel_subtree_filters: fix typo in comment
Sven Verdoolaege [Mon, 27 Dec 2021 10:42:48 +0000 (27 11:42 +0100)]
gpu.c: get_non_parallel_subtree_filters: fix typo in comment

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: declare_accessed_local_variables: skip empty domains
Sven Verdoolaege [Sat, 27 Nov 2021 16:37:45 +0000 (27 17:37 +0100)]
gpu.c: declare_accessed_local_variables: skip empty domains

If the domain is empty then clearly no declarations are needed
for the statement instances in the domain and the iteration
over the arrays can be skipped.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: declare_accessed_local_variables: use isl_bool for local variable
Sven Verdoolaege [Mon, 27 Dec 2021 09:55:19 +0000 (27 10:55 +0100)]
gpu.c: declare_accessed_local_variables: use isl_bool for local variable

This was missing from ppcg-0.03-215-gacbf2ded (update isl for introduction
of isl_bool and isl_stat, Sun May 24 11:18:42 2015 +0200).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  take into account live-out writes while computing may-persist set
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
take into account live-out writes while computing may-persist set

For determining which elements need to be copied into the device
the set of elements that may need to be preserved by
the selected subtree is computed since the copy-out operation
will overwrite these elements.
This may-persist set consists of the elements
that may need to be preserved by the entire analyzed fragment and
those that are in potential dataflow across the selected subtree.

There is, however, a third class of elements that may need
to be preserved and those are the ones that are effectively written
by the analyzed fragment, that are needed after this fragment and
that may get overwritten by a copy-out operation.

For example, in the new test cases, an element of the b-array
is written in a statement that does not get mapped
to the device and that is executed before the statement
that is mapped to the device.  Since this statement
also writes to (different elements of) the b-array,
this array gets copied out and overwrites the original element
if the b-array is not copied in first.

Add those elements to the may-persist set as well
so that they will be taken into account while computing
the copy-in set.
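
The b-array scenario can be mimicked with a small Python model. This is only a sketch under stated assumptions: the array has four elements, the copy-out transfers the whole array, and the function name and string markers are made up for illustration.

```python
def run(copy_in):
    """Simulate host/device transfers for a 4-element array b.

    copy_in is the set of indices copied to the device before the kernel.
    Returns the host copy of b after the copy-out.
    """
    host = {i: None for i in range(4)}
    host[0] = "host"                    # written by a statement kept on the CPU
    device = {i: "garbage" for i in range(4)}
    for i in copy_in:                   # copy-in: host -> device
        device[i] = host[i]
    for i in (1, 2, 3):                 # kernel writes different elements of b
        device[i] = "kernel"
    for i in range(4):                  # copy-out of the whole array (assumption)
        host[i] = device[i]
    return host


without_copy_in = run(set())    # b[0] is clobbered by the copy-out
with_copy_in = run({0})         # b[0] is in the copy-in set and survives
```

In this model, adding index 0 to the copy-in set is what the may-persist extension achieves for the element written on the CPU.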

Note that the outer_may_overwrite field of ppcg_may_persist_data
could also be replaced by its domain.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  update isl for isl_union_map_intersect_domain_wrapped_domain
Sven Verdoolaege [Fri, 24 Dec 2021 13:35:50 +0000 (24 14:35 +0100)]
update isl for isl_union_map_intersect_domain_wrapped_domain

This is needed in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: add_to_from_device: store copy-out relation in copy_out
Sven Verdoolaege [Sat, 25 Dec 2021 12:58:44 +0000 (25 13:58 +0100)]
gpu.c: add_to_from_device: store copy-out relation in copy_out

The variable may_write is reused for different kinds of relations,
one of which is constructed starting from the copy-out relation,
which is itself constructed from another may-write relation.
The copy-out relation was therefore also stored in may_write and
then copied into copy_out.  Store the copy_out relation directly
in copy_out and then copy that to may_write instead.
This clarifies what the object now stored in copy_out represents
at the point of the call to node_may_persist,
to which it will be passed in an upcoming commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: add_to_from_device: move call to node_may_persist up
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: add_to_from_device: move call to node_may_persist up

In an upcoming commit, the copy-out relation in terms
of the statement instances will be passed down to node_may_persist.
The call therefore needs to be performed before
the statement instances are replaced by the prefix schedule
in the copy-out relation.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: add_to_from_device: pass down domain to node_may_persist
Sven Verdoolaege [Sun, 28 Nov 2021 14:31:47 +0000 (28 15:31 +0100)]
gpu.c: add_to_from_device: pass down domain to node_may_persist

The domain has already been computed by the caller of add_to_from_device
so it does not need to be recomputed inside node_may_persist.

Note that there is a call to mark_kernels in between,
which may group statements and cause the expanded domain
in node_may_persist to not have all original domain constraints.
That is, the domain passed down may be a subset
of the originally computed domain, but the extra elements
are irrelevant because they do not belong to the original
statement instance sets.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: add_to_from_device: compute copy-out before mapping to prefix schedule
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: add_to_from_device: compute copy-out before mapping to prefix schedule

The copy-out relation is needed in terms of the statement instances
in an upcoming commit, so compute the copy-out from the may-write
before replacing the statement instances by the prefix schedule.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: filter_flow: extract out apply_filter
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: filter_flow: extract out apply_filter

The extracted function will be reused in an upcoming commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: update_may_persist_at_filter: remove minor code duplication
Sven Verdoolaege [Mon, 29 Nov 2021 21:18:07 +0000 (29 22:18 +0100)]
gpu.c: update_may_persist_at_filter: remove minor code duplication

The same call to filter_flow needs to be performed for both types
of nodes, so it only needs to appear once.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: update_may_persist_at_filter: extract out remove_all_external_flow
Sven Verdoolaege [Mon, 29 Nov 2021 21:17:35 +0000 (29 22:17 +0100)]
gpu.c: update_may_persist_at_filter: extract out remove_all_external_flow

This makes it easier to remove some minor code duplication
in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: update_may_persist_at_filter: return isl_stat
Sven Verdoolaege [Sat, 25 Dec 2021 14:44:35 +0000 (25 15:44 +0100)]
gpu.c: update_may_persist_at_filter: return isl_stat

This clarifies what the possible return values are.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: update_may_persist_at_band: return isl_stat
Sven Verdoolaege [Sat, 25 Dec 2021 14:22:52 +0000 (25 15:22 +0100)]
gpu.c: update_may_persist_at_band: return isl_stat

This clarifies what the possible return values are.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: fix typo in comment
Sven Verdoolaege [Sun, 28 Nov 2021 11:10:25 +0000 (28 12:10 +0100)]
gpu.c: fix typo in comment

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  try and remove strides in bands before tiling
Sven Verdoolaege [Sat, 17 Jul 2021 21:08:11 +0000 (17 23:08 +0200)]
try and remove strides in bands before tiling

If the tiled band is strided then the tile sizes
are effectively reduced by the strides.
This is especially important for the GPU target
where the point loops get mapped directly to thread ids and
it is wasteful to only use a strided subset of those.
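
To see why a strided band effectively shrinks the tiles, consider this minimal Python sketch. It is not PPCG code; the function name, the stride of 4 and the tile size of 32 are assumptions chosen for illustration.

```python
def points_per_tile(iterations, tile_size):
    """Count how many iterations fall in each tile of the given size."""
    counts = {}
    for i in iterations:
        tile = i // tile_size
        counts[tile] = counts.get(tile, 0) + 1
    return counts


# Hypothetical band whose iterations are the multiples of 4 in [0, 128).
strided = range(0, 128, 4)
# The same band after removing the stride: i' = i / 4.
dense = [i // 4 for i in strided]

strided_counts = points_per_tile(strided, 32)   # 8 points in each of 4 tiles
dense_counts = points_per_tile(dense, 32)       # 32 points in a single tile
```

With the stride in place, each tile of 32 holds only 8 iterations, so three quarters of the thread ids would sit idle in this model.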

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  schedule.c: shift_to_origin: move up construction of universe domain
Sven Verdoolaege [Sat, 17 Jul 2021 21:08:11 +0000 (17 23:08 +0200)]
schedule.c: shift_to_origin: move up construction of universe domain

This will be useful, though not strictly necessary, in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  update isl for isl_set_get_lattice_tile
Sven Verdoolaege [Wed, 21 Jul 2021 15:25:09 +0000 (21 17:25 +0200)]
update isl for isl_set_get_lattice_tile

This will be used in an upcoming commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  add test case with a strided statement instance set
Sven Verdoolaege [Sat, 17 Jul 2021 21:07:49 +0000 (17 23:07 +0200)]
add test case with a strided statement instance set

There is currently no special handling of strided statement instance sets,
but some support will be added in an upcoming commit.
Add a test case to ensure this does not break code generation.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  update pet for more relaxed pet_expr_is_equal index comparison
Sven Verdoolaege [Mon, 26 Jul 2021 19:38:31 +0000 (26 21:38 +0200)]
update pet for more relaxed pet_expr_is_equal index comparison

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  gpu.c: insert_empty_permutable_band: set domain of partial schedule
Sven Verdoolaege [Sun, 11 Jul 2021 15:27:03 +0000 (11 17:27 +0200)]
gpu.c: insert_empty_permutable_band: set domain of partial schedule

This ensures that the shifting introduced in the previous commit
can also be applied to these zero-dimensional bands.
In particular, shift_to_origin calls isl_schedule_node_band_shift,
which checks that the domain of the partial schedule
of the band node is not affected by the shift.
This assumes, however, that the partial schedule
has a domain to begin with.  Since insert_empty_permutable_band
did not originally set the domain, calling shift_to_origin
on such a node would result in a failure.

An alternative approach would be to special case
zero-dimensional band nodes in shift_to_origin,
or even to skip the shifting if the shift is obviously zero.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  try and shift bands to the origin before tiling
Sven Verdoolaege [Thu, 13 May 2021 13:31:47 +0000 (13 15:31 +0200)]
try and shift bands to the origin before tiling

If the tiled band does not start at the origin
then the initial tiles are likely to be partial tiles.
Shifting to the origin improves the chances
of starting with full tiles.
Do not perform any shift if this would involve piecewise expressions
since doing so is likely to result in more complicated code.
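
The effect on partial tiles can be checked with a small Python sketch. It is a simplified one-dimensional model, not PPCG code; the function name, the band bounds 5..68 and the tile size 32 are assumptions.

```python
def count_tiles(lower, upper, tile_size):
    """Return (full, partial) tile counts for iterations lower..upper
    (inclusive), with tiles aligned at multiples of tile_size."""
    full = partial = 0
    for t in range(lower // tile_size, upper // tile_size + 1):
        start = max(lower, t * tile_size)
        end = min(upper, t * tile_size + tile_size - 1)
        if end - start + 1 == tile_size:
            full += 1
        else:
            partial += 1
    return full, partial


original = count_tiles(5, 68, 32)   # 64 iterations starting at 5
shifted = count_tiles(0, 63, 32)    # the same band shifted to the origin
```

In this model the unshifted band needs three tiles, two of them partial, while the shifted band fits exactly into two full tiles.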

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  update isl for add isl_pw_*_{isa,as}_*
Sven Verdoolaege [Thu, 13 May 2021 12:44:26 +0000 (13 14:44 +0200)]
update isl for add isl_pw_*_{isa,as}_*

These will be used in the next commit.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
2 years ago  build libisl.la before libpet.la
Sven Verdoolaege [Thu, 13 May 2021 09:43:08 +0000 (13 11:43 +0200)]
build libisl.la before libpet.la

If both are being built, then libpet.la depends on libisl.la.
The same mechanism was originally used to ensure libcloog-isl.la
was built after libisl.la, but this was removed in 7d8b25e5
(drop cloog submodule, Tue Aug 21 18:29:05 2012 +0200).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  PPCG 0.08.5  (ppcg-0.08.5)
Sven Verdoolaege [Sat, 1 May 2021 10:53:34 +0000 (1 12:53 +0200)]
PPCG 0.08.5

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  update pet to version 0.11.5
Sven Verdoolaege [Sat, 1 May 2021 10:43:10 +0000 (1 12:43 +0200)]
update pet to version 0.11.5

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  update isl to version 0.24
Sven Verdoolaege [Sat, 1 May 2021 10:22:53 +0000 (1 12:22 +0200)]
update isl to version 0.24

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  gpu.c: extract_single_tagged_access: clean up space copying
Sven Verdoolaege [Wed, 17 Feb 2021 21:39:14 +0000 (17 22:39 +0100)]
gpu.c: extract_single_tagged_access: clean up space copying

The copy of space is needed by isl_space_domain,
but it was being assigned to space2, which is subsequently overwritten.
In practice, this happens to work out because taking
a copy only changes the reference count and returns
the same pointer, but the original code is still confusing,
if not technically incorrect.
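
The reason the original code happened to work can be shown with a toy reference-counted object in Python. This is a sketch only: the class and functions are hypothetical stand-ins, not the isl API.

```python
class Space:
    """Toy stand-in for a reference-counted object (not the isl API)."""

    def __init__(self):
        self.refs = 1


def space_copy(space):
    """Taking a copy only bumps the count and hands back the same object."""
    space.refs += 1
    return space


def space_free(space):
    """Consume one reference."""
    space.refs -= 1


space = Space()
space2 = space_copy(space)       # space2 is the very same object as space
same_object = space2 is space
space_free(space2)               # consuming either name releases one reference
remaining = space.refs
```

Because both names refer to the same object, it does not matter in this model which name the extra reference is assigned to, only that the count is balanced.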

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  PPCG 0.08.4  (ppcg-0.08.4)
Sven Verdoolaege [Sat, 14 Nov 2020 11:49:53 +0000 (14 12:49 +0100)]
PPCG 0.08.4

3 years ago  update pet to version 0.11.4
Sven Verdoolaege [Sun, 1 Nov 2020 16:42:31 +0000 (1 17:42 +0100)]
update pet to version 0.11.4

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  update isl to version 0.23
Sven Verdoolaege [Sun, 1 Nov 2020 14:53:18 +0000 (1 15:53 +0100)]
update isl to version 0.23

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  README: refer to pet/README for the latest supported release of clang
Sven Verdoolaege [Thu, 9 Jul 2020 08:06:39 +0000 (9 10:06 +0200)]
README: refer to pet/README for the latest supported release of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
3 years ago  tell user about eliminated dead code
Sven Verdoolaege [Fri, 3 Jul 2020 17:01:57 +0000 (3 19:01 +0200)]
tell user about eliminated dead code

It may be confusing to see some statements or statement instances
missing from the generated code if the user is not aware
of dead code elimination.

4 years ago  PPCG 0.08.3  (ppcg-0.08.3)
Sven Verdoolaege [Sat, 12 Oct 2019 11:16:03 +0000 (12 13:16 +0200)]
PPCG 0.08.3

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  update pet to version 0.11.3
Sven Verdoolaege [Sat, 12 Oct 2019 11:04:45 +0000 (12 13:04 +0200)]
update pet to version 0.11.3

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  update isl to version 0.22
Sven Verdoolaege [Sat, 12 Oct 2019 10:57:45 +0000 (12 12:57 +0200)]
update isl to version 0.22

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  take into account contraction when generating OpenMP support
Sven Verdoolaege [Thu, 22 Aug 2019 20:22:23 +0000 (22 22:22 +0200)]
take into account contraction when generating OpenMP support

Commit ppcg-0.05-103-gc7d7a176 (optionally group chains of statements,
Thu Mar 17 17:06:40 2016 +0100) introduced the possibility
of grouping several statements into a single statement for scheduling,
but failed to update the detection of parallel loops
in the OpenMP support accordingly.

In particular, when statements are grouped together,
the schedule tree refers to the groups rather than the individual
statements, while the dependence relations still refer
to the original statements.
The practical result is that all dependences get ignored and
every loop is considered parallel when grouping takes place.

Reformulate the partial schedule used in the detection
of parallel loops in terms of the original statements
such that the dependences do get taken into account.
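
The mismatch between groups and original statements can be reproduced in a few lines of Python. This is a simplified model, not PPCG code; the statement names S1 and S2, the group name G and the helper function are hypothetical.

```python
def is_parallel(schedule, dependences):
    """A loop is parallel if no dependence connects two different iterations.

    schedule maps a statement instance (name, i) to its loop iteration;
    dependences is a list of (source, sink) instance pairs.  Pairs whose
    instances do not appear in the schedule are ignored, mimicking an
    intersection with an unrelated domain.
    """
    for src, dst in dependences:
        if src in schedule and dst in schedule:
            if schedule[src] != schedule[dst]:
                return False
    return True


N = 4
# Dependences on the original statements: S2(i) depends on S1(i - 1).
deps = [(("S1", i - 1), ("S2", i)) for i in range(1, N)]
# Schedule after grouping S1 and S2 into a single group G.
group_schedule = {("G", i): i for i in range(N)}
# Contraction: original statement instance -> group instance.
contraction = {(s, i): ("G", i) for s in ("S1", "S2") for i in range(N)}
# Schedule reformulated in terms of the original statements.
stmt_schedule = {inst: group_schedule[g] for inst, g in contraction.items()}

wrong = is_parallel(group_schedule, deps)   # dependences are silently ignored
right = is_parallel(stmt_schedule, deps)    # loop-carried dependence is seen
```

Composing the schedule with the contraction is what brings the schedule and the dependences back into the same terms.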

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  cpu.c: print_scop: extract out init_build_info
Sven Verdoolaege [Thu, 22 Aug 2019 20:13:51 +0000 (22 22:13 +0200)]
cpu.c: print_scop: extract out init_build_info

This makes it easier to add extra initializations without
cluttering print_scop.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  cpu.c: ast_schedule_dim_is_parallel: pass ast_build_userinfo
Sven Verdoolaege [Thu, 22 Aug 2019 20:19:47 +0000 (22 22:19 +0200)]
cpu.c: ast_schedule_dim_is_parallel: pass ast_build_userinfo

This will give ast_schedule_dim_is_parallel access to other fields
in the structure.  In particular, it will allow the function
to access the contraction that will be added to this structure.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  update pet for support for recent versions of clang
Sven Verdoolaege [Sat, 17 Aug 2019 18:26:17 +0000 (17 20:26 +0200)]
update pet for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  update isl for support for recent versions of clang
Sven Verdoolaege [Sat, 17 Aug 2019 18:23:32 +0000 (17 20:23 +0200)]
update isl for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
4 years ago  tell user about any grouping that is applied prior to scheduling
Sven Verdoolaege [Sat, 17 Aug 2019 10:13:08 +0000 (17 12:13 +0200)]
tell user about any grouping that is applied prior to scheduling

The computed schedule can be confusing if the user
is unaware of the grouping.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  PPCG 0.08.2  (ppcg-0.08.2)
Sven Verdoolaege [Sun, 10 Mar 2019 15:47:17 +0000 (10 16:47 +0100)]
PPCG 0.08.2

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update pet to version 0.11.2
Sven Verdoolaege [Sun, 10 Mar 2019 15:36:08 +0000 (10 16:36 +0100)]
update pet to version 0.11.2

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update isl to version 0.21
Sven Verdoolaege [Sun, 10 Mar 2019 15:26:58 +0000 (10 16:26 +0100)]
update isl to version 0.21

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update pet for support for recent versions of clang
Sven Verdoolaege [Sat, 26 Jan 2019 09:43:35 +0000 (26 10:43 +0100)]
update pet for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update isl for support for recent versions of clang
Sven Verdoolaege [Sat, 26 Jan 2019 09:25:45 +0000 (26 10:25 +0100)]
update isl for support for recent versions of clang

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update isl for move of interface/all.h
Sven Verdoolaege [Sat, 26 Jan 2019 09:13:46 +0000 (26 10:13 +0100)]
update isl for move of interface/all.h

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  PPCG 0.08.1  (ppcg-0.08.1)
Sven Verdoolaege [Tue, 17 Jul 2018 12:40:27 +0000 (17 14:40 +0200)]
PPCG 0.08.1

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update pet to version 0.11.1
Sven Verdoolaege [Tue, 17 Jul 2018 12:23:13 +0000 (17 14:23 +0200)]
update pet to version 0.11.1

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update isl to version 0.20
Sven Verdoolaege [Tue, 17 Jul 2018 09:23:09 +0000 (17 11:23 +0200)]
update isl to version 0.20

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: can_tile: use isl_map_get_range_simple_fixed_box_hull
Sven Verdoolaege [Tue, 15 May 2018 13:13:19 +0000 (15 15:13 +0200)]
gpu_group.c: can_tile: use isl_map_get_range_simple_fixed_box_hull

This functionality was moved to isl.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update isl for isl_map_get_range_simple_fixed_box_hull
Sven Verdoolaege [Tue, 15 May 2018 13:13:06 +0000 (15 15:13 +0200)]
update isl for isl_map_get_range_simple_fixed_box_hull

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  update pet for direct header inclusions
Sven Verdoolaege [Tue, 15 May 2018 10:09:18 +0000 (15 12:09 +0200)]
update pet for direct header inclusions

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: compute_array_dim_size: improve error handling
Sven Verdoolaege [Thu, 24 May 2018 12:33:02 +0000 (24 14:33 +0200)]
gpu_group.c: compute_array_dim_size: improve error handling

In particular, start from an infinite bound to be able
to detect the difference between failure to find a bound and
failures that occur during the computation.
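
The idea of starting from an infinite bound can be sketched in Python as follows. This is an illustration only, with a hypothetical function name; in this model an exception plays the role of a failure during the computation.

```python
import math


def smallest_bound(candidates):
    """Return the smallest usable bound among the candidates.

    Starting from an infinite bound makes "no bound found" (math.inf)
    distinguishable from a failure during the computation (an exception).
    Each candidate is a callable returning a size, or None if unsuitable.
    """
    best = math.inf
    for candidate in candidates:
        size = candidate()          # may raise: a genuine computation failure
        if size is not None and size < best:
            best = size
    return best


none_found = smallest_bound([lambda: None])
found = smallest_bound([lambda: 8, lambda: None, lambda: 4])
```

A caller can then treat an infinite result as "cannot tile" without mistaking it for an error.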

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: compute_size_in_direction: extract out is_suitable_bound
Sven Verdoolaege [Thu, 24 May 2018 11:01:46 +0000 (24 13:01 +0200)]
gpu_group.c: compute_size_in_direction: extract out is_suitable_bound

This improves readability and error handling.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: compute_size_in_direction: drop unused variable
Sven Verdoolaege [Thu, 24 May 2018 10:59:31 +0000 (24 12:59 +0200)]
gpu_group.c: compute_size_in_direction: drop unused variable

It was already unused when it was introduced in 9f18b065 (initial
version of ppcg, Tue Jun 21 17:40:50 2011 +0200).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: can_tile: separate stride detection from bound computation
Sven Verdoolaege [Tue, 15 May 2018 08:27:54 +0000 (15 10:27 +0200)]
gpu_group.c: can_tile: separate stride detection from bound computation

In particular, instead of looking at each individual dimension
in the access and determining a stride and a bound for each
separately, first determine (and remove) all strides and
only then compute bounds.
This will make it easier to promote the entire bound computation
functionality to isl.
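
The two-phase approach (strides first, bounds afterwards) can be illustrated with a one-dimensional Python sketch. It is not PPCG or isl code; the function names are hypothetical and the sign convention of the shift may differ from the one PPCG uses.

```python
from functools import reduce
from math import gcd


def stride_info(values):
    """Return (shift, stride) such that every value equals shift modulo
    stride.  The trivial case yields (0, 1)."""
    base = min(values)
    stride = reduce(gcd, (v - base for v in values), 0)
    if stride == 0:                 # a single distinct value
        stride = 1
    return base % stride, stride


def extent(values):
    """Phase 1: remove the stride.  Phase 2: bound what is left."""
    shift, stride = stride_info(values)
    reduced = [(v - shift) // stride for v in values]
    return max(reduced) - min(reduced) + 1


accessed = [3, 7, 11, 15]            # hypothetical accessed indices
info = stride_info(accessed)         # shift 3, stride 4
size = extent(accessed)              # 4 elements instead of 13
trivial = stride_info([0, 1, 2])     # shift 0, stride 1
```

Removing the stride first shrinks the box that has to be bounded, which is the point of ordering the two phases this way.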

The code also gets simplified a bit and it could be further simplified
by changing the way the shifts and the strides are stored.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_array_bound: always store shift/stride
Sven Verdoolaege [Tue, 15 May 2018 07:28:03 +0000 (15 09:28 +0200)]
gpu_array_bound: always store shift/stride

Since ppcg-0.08-4-g9040bd2f (gpu_group.c: set_stride: use
isl_map_get_range_stride_info, Mon Apr 16 15:53:49 2018 +0200),
a valid shift and stride is always computed (0 and 1 in the trivial
case), so these results might as well be stored even in the trivial
case.  This allows some special casing to be removed.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: set_stride: return indication of whether stride was found
Sven Verdoolaege [Tue, 15 May 2018 07:25:26 +0000 (15 09:25 +0200)]
gpu_group.c: set_stride: return indication of whether stride was found

This allows the caller to use this information directly instead
of having to figure it out indirectly from bound->stride.
This will make it easier to always store a stride in the next commit,
even if it is trivial.
Note that the caller could also check if this stride is trivial
or not, but the callee already performs this check, so it might
as well return this information.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: can_tile: return isl_bool
Sven Verdoolaege [Mon, 14 May 2018 14:33:33 +0000 (14 16:33 +0200)]
gpu_group.c: can_tile: return isl_bool

This allows some error handling in the caller to be moved
inside can_tile, where it is easier to follow.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
5 years ago  gpu_group.c: compute_group_bounds_core: return isl_stat
Sven Verdoolaege [Mon, 14 May 2018 14:26:25 +0000 (14 16:26 +0200)]
gpu_group.c: compute_group_bounds_core: return isl_stat

This clarifies what the possible return values are.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago print.c: directly include required headers
Sven Verdoolaege [Tue, 24 Apr 2018 19:51:16 +0000 (24 21:51 +0200)]
print.c: directly include required headers

Do so instead of relying on the headers getting included indirectly.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago gpu_group.c: set_stride: use isl_map_get_range_stride_info
Sven Verdoolaege [Mon, 16 Apr 2018 13:53:49 +0000 (16 15:53 +0200)]
gpu_group.c: set_stride: use isl_map_get_range_stride_info

This removes some code duplication with respect to isl.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago update isl for isl_map_get_range_stride_info
Sven Verdoolaege [Mon, 16 Apr 2018 13:53:00 +0000 (16 15:53 +0200)]
update isl for isl_map_get_range_stride_info

Note that this update also includes some minor improvements
to the AST generation, which may affect code generated by PPCG.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago gpu_group.c: check_stride: extract out set_stride
Sven Verdoolaege [Mon, 16 Apr 2018 13:33:45 +0000 (16 15:33 +0200)]
gpu_group.c: check_stride: extract out set_stride

That is, isolate the part that looks for a stride
so that it can easily be replaced.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago gpu_group.c: fix typo in comment
Sven Verdoolaege [Tue, 17 Apr 2018 14:00:17 +0000 (17 16:00 +0200)]
gpu_group.c: fix typo in comment

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago PPCG 0.08 ppcg-0.08
Sven Verdoolaege [Sun, 18 Feb 2018 16:48:08 +0000 (18 17:48 +0100)]
PPCG 0.08

6 years ago update pet to version 0.11
Sven Verdoolaege [Sun, 18 Feb 2018 11:07:53 +0000 (18 12:07 +0100)]
update pet to version 0.11

6 years ago update isl to version 0.19
Sven Verdoolaege [Sat, 17 Feb 2018 15:12:07 +0000 (17 16:12 +0100)]
update isl to version 0.19

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago check working OpenMP support
Sven Verdoolaege [Sat, 17 Feb 2018 19:11:50 +0000 (17 20:11 +0100)]
check working OpenMP support

Some versions of clang miscompile a mapping of the PolyBench/C correlation
benchmark to OpenMP.  The value of a parameter appears to be reset
after a parallel for loop.  Check that this pattern produces
correct results before running PolyBench/C OpenMP tests.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago m4/ax_check_openmp.m4: print result of check for OpenMP support
Sven Verdoolaege [Sat, 17 Feb 2018 19:05:33 +0000 (17 20:05 +0100)]
m4/ax_check_openmp.m4: print result of check for OpenMP support

This was missing from ppcg-0.03-17-gce4c7542 (polybench_test.sh.in:
do not perform openmp tests when using clang, Thu Feb 5 13:10:48 2015 +0100).

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago polybench_test.sh.in: break early if execution fails
Sven Verdoolaege [Sat, 17 Feb 2018 19:17:25 +0000 (17 20:17 +0100)]
polybench_test.sh.in: break early if execution fails

If execution fails, then there is no point in comparing the computed output.
In fact, trying to compare the output only leads to confusion.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago hybrid.c: fix typo in comment
Sven Verdoolaege [Sat, 17 Feb 2018 18:59:45 +0000 (17 19:59 +0100)]
hybrid.c: fix typo in comment

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
6 years ago ppcg.c: generate_name: allocate enough space for variable name
Sven Verdoolaege [Sat, 17 Feb 2018 15:16:29 +0000 (17 16:16 +0100)]
ppcg.c: generate_name: allocate enough space for variable name

The original size should be enough in practice, but it causes
gcc 7.2.0 to generate a warning.

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
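
One common way to avoid this kind of warning about formatted output is
to measure the required length before allocating. The sketch below shows
that general technique only; make_name is a hypothetical helper, and
neither it nor the "%s_%d" format is taken from generate_name in ppcg.c,
which may have been fixed differently.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not PPCG's generate_name.
 * Build "<prefix>_<index>" in a freshly allocated buffer whose size
 * is measured rather than guessed.
 * The caller is responsible for freeing the result.
 * Return NULL on failure.
 */
static char *make_name(const char *prefix, int index)
{
	char *name;
	int len;

	/* With a NULL buffer of size 0, snprintf only reports the length
	 * (excluding the terminating NUL) that the output would have.
	 */
	len = snprintf(NULL, 0, "%s_%d", prefix, index);
	if (len < 0)
		return NULL;
	name = malloc((size_t) len + 1);
	if (!name)
		return NULL;
	snprintf(name, (size_t) len + 1, "%s_%d", prefix, index);
	return name;
}
```

Since the buffer size is derived from the same format string and
arguments as the final call, the compiler has no reason to assume that
the output could be truncated.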
6 years ago gpu.c: fix typo in comment
Sven Verdoolaege [Sun, 19 Nov 2017 12:06:27 +0000 (19 13:06 +0100)]
gpu.c: fix typo in comment

Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>