Sven Verdoolaege [Sun, 18 Aug 2024 15:22:34 +0000 (18 17:22 +0200)]
PPCG 0.09.2
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 18 Aug 2024 15:20:59 +0000 (18 17:20 +0200)]
update pet to version 0.11.8
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 18 Aug 2024 15:19:36 +0000 (18 17:19 +0200)]
update isl to version 0.27
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 7 Jul 2024 12:57:36 +0000 (7 14:57 +0200)]
update pet for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 7 Jul 2024 12:49:29 +0000 (7 14:49 +0200)]
update isl for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 2 Apr 2023 12:20:19 +0000 (2 14:20 +0200)]
PPCG 0.09.1
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 2 Apr 2023 08:40:34 +0000 (2 10:40 +0200)]
update pet to version 0.11.7
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 2 Apr 2023 08:11:25 +0000 (2 10:11 +0200)]
update isl to version 0.26
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 7 Mar 2023 20:56:33 +0000 (7 21:56 +0100)]
update isl for change in coalescing
This has an effect on one of the pet test cases.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 12 Feb 2023 21:56:53 +0000 (12 22:56 +0100)]
update pet for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 12 Feb 2023 21:55:47 +0000 (12 22:55 +0100)]
update isl for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 12 Feb 2023 11:38:23 +0000 (12 12:38 +0100)]
add configure~ to .gitignore
Autoconf 2.70 (and later) may leave such backup files.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 12 Feb 2023 11:34:53 +0000 (12 12:34 +0100)]
cpu.c: construct_cpu_schedule_constraints: set context
Do this for consistency with the GPU backend.
The context provides extra information that may be exploited
by the scheduler.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 5 Feb 2023 10:50:17 +0000 (5 11:50 +0100)]
replace obsolete AC_PROG_LIBTOOL by LT_INIT
AC_PROG_LIBTOOL has been obsolete since libtool 2.2.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Dec 2022 13:32:55 +0000 (17 14:32 +0100)]
schedule.c: add missing include
This was missing from
ppcg-0.08.5-9-g6b473a41 (try and remove strides
in bands before tiling, Sat Jul 17 23:08:11 2021 +0200).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 14:16:32 +0000 (2 16:16 +0200)]
PPCG 0.09
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 13:58:49 +0000 (2 15:58 +0200)]
update pet to version 0.11.6
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 13:23:29 +0000 (2 15:23 +0200)]
update isl to version 0.25
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 12:15:00 +0000 (2 14:15 +0200)]
update pet for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 10:42:13 +0000 (2 12:42 +0200)]
update isl for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 29 Nov 2021 15:03:27 +0000 (29 16:03 +0100)]
inform user about empty context
An empty context means that the original code cannot be executed.
It can be confusing to see that the entire code is missing from
the generated code if the user is not aware of the empty context.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 28 Nov 2021 20:34:02 +0000 (28 21:34 +0100)]
avoid using empty context to print array declarations
The context can be empty if it turns out the scop cannot
be executed for any value of the parameters.
Using such an empty context to simplify expressions
can result in void expressions that cannot be printed.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 2 Jul 2022 08:39:19 +0000 (2 10:39 +0200)]
update pet for preserving array extents in case of empty context
This is needed for the test case in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 27 Nov 2021 16:43:45 +0000 (27 17:43 +0100)]
gpu: avoid mapping some nested non-permutable bands to the device
If the outer node of the schedule tree is a set or a sequence,
then all children (in the case of a set) or the initial and final
children (in the case of a sequence) that do not have
any permutable bands are already executed on the CPU.
The outer node may however have children that are themselves
set or sequence nodes. If only some of their children have
permutable bands, then the entire child of the outer node
was still being mapped to the device.
Allow initial/final children to be collected recursively
to avoid more non-permutable bands getting mapped to the device.
In the case of a set node, recursion can be performed on any child
since the initial/final parts of any child of a set node can
be moved first/last.
In the case of a sequence node, recursion can only
be performed on the first/last child.
Note that the entire set of collected descendants is moved
before/after the other descendants. The relative order
within each of these two parts is preserved.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 15:04:38 +0000 (24 16:04 +0100)]
gpu: avoid mapping final non-permutable bands to the device
If the outer node of the schedule tree is a sequence and the final
children of this sequence do not have any permutable bands,
then there is no point in including these final children
in the part that is mapped to the device.
Instead, these final children can be run on the CPU.
This extends earlier support for separating out independent and
initial non-permutable bands in
ppcg-0.03-191-g6fa73710 (gpu:
avoid mapping independent non-permutable bands to the device,
Thu Oct 24 13:15:10 2013 +0200) and
ppcg-0.04-49-g201e18aa (gpu:
avoid mapping initial non-permutable bands to the device,
Tue Dec 15 12:13:30 2015 +0100).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
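As a rough illustration of the separation described in the commit above, the following sketch splits the children of a sequence into an initial part, a device part and a final part. It is a toy model, not PPCG code: the children are represented by hypothetical (name, has_permutable_band) pairs, and the function name split_sequence is made up for this example.

```python
def split_sequence(children):
    """Split the children of a sequence into (initial, device, final).

    children: list of (name, has_permutable_band) pairs in execution order.
    Only leading and trailing children without permutable bands are
    separated out; the order inside a sequence has to be preserved, so a
    non-permutable child in the middle stays in the device part.
    """
    first = 0
    while first < len(children) and not children[first][1]:
        first += 1
    last = len(children)
    while last > first and not children[last - 1][1]:
        last -= 1
    return children[:first], children[first:last], children[last:]


children = [
    ("init", False),
    ("kernel1", True),
    ("middle", False),
    ("kernel2", True),
    ("finish", False),
]
initial, device, final = split_sequence(children)
print([name for name, _ in initial])  # run on the CPU before the device part
print([name for name, _ in device])   # mapped to the device
print([name for name, _ in final])    # run on the CPU after the device part
```

In this model, "middle" remains in the device part because moving it would change the execution order of the sequence.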
Sven Verdoolaege [Mon, 27 Dec 2021 10:27:46 +0000 (27 11:27 +0100)]
gpu.c: isolate_permutable_subtrees: say why permutable subtrees are put first
That is, in the case of a set node, the permutable subtrees
can be executed either before or after the subtrees without
permutable bands. Explain why the permutable subtrees
are executed first.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 14:58:52 +0000 (24 15:58 +0100)]
gpu.c: get_non_parallel_subtree_filters: optionally select final subtrees
This will allow isolate_permutable_subtrees to separate out
both initial and final subtrees in an upcoming commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 14:55:33 +0000 (24 15:55 +0100)]
gpu.c: get_non_parallel_subtree_filters: select subtrees based on node type
The function get_non_parallel_subtree_filters selects either
the initial non-parallel subtrees or all non-parallel subtrees.
Originally, the choice was made based on a function parameter,
but in the calls of this function this parameter is uniquely
determined by the node type.
Directly use the node type inside get_non_parallel_subtree_filters.
This makes it easier to introduce a function parameter
to choose between the initial and the final subtrees
(for sequence nodes) in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 14:42:29 +0000 (24 15:42 +0100)]
gpu.c: get_non_parallel_subtree_filters: use isl_bool for local variable
This was missing from
ppcg-0.07-21-g56f199d9 (gpu.c:
has_any_permutable_node: return isl_bool, Thu Sep 14 17:26:13 2017 +0200).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 27 Dec 2021 10:42:48 +0000 (27 11:42 +0100)]
gpu.c: get_non_parallel_subtree_filters: fix typo in comment
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 27 Nov 2021 16:37:45 +0000 (27 17:37 +0100)]
gpu.c: declare_accessed_local_variables: skip empty domains
If the domain is empty then clearly no declarations are needed
for the statement instances in the domain and the iteration
over the arrays can be skipped.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 27 Dec 2021 09:55:19 +0000 (27 10:55 +0100)]
gpu.c: declare_accessed_local_variables: use isl_bool for local variable
This was missing from
ppcg-0.03-215-gacbf2ded (update isl for introduction
of isl_bool and isl_stat, Sun May 24 11:18:42 2015 +0200).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
take into account live-out writes while computing may-persist set
For determining which elements need to be copied into the device,
the set of elements that may need to be preserved by
the selected subtree is computed since the copy-out operation
will overwrite these elements.
This may-persist set consists of the elements
that may need to be preserved by the entire analyzed fragment and
those that are in potential dataflow across the selected subtree.
There is, however, a third class of elements that may need
to be preserved and those are the ones that are effectively written
by the analyzed fragment, that are needed after this fragment and
that may get overwritten by a copy-out operation.
For example, in the new test cases, an element of the b-array
is written in a statement that does not get mapped
to the device and that is executed before the statement
that is mapped to the device. Since this statement
also writes to (different elements of) the b-array,
this array gets copied out and overwrites the original element
if the b-array is not copied in first.
Add those elements to the may-persist set as well
so that they will be taken into account while computing
the copy-in set.
Note that the outer_may_overwrite field of ppcg_may_persist_data
could also be replaced by its domain.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
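The situation with the b-array described in the commit above can be mimicked with a small simulation. This is only a sketch under simplifying assumptions, not PPCG code: the host and device buffers are plain lists, the copy-out is assumed to transfer the whole array, and the function name run and the values used are made up.

```python
def run(copy_in_first):
    """Simulate a host write followed by a kernel and a copy-out of b."""
    n = 4
    host_b = [0] * n
    host_b[0] = 7                  # host statement writes b[0] before the kernel

    device_b = [None] * n          # device buffer starts out uninitialized
    if copy_in_first:
        device_b = list(host_b)    # copy-in of the b-array

    for i in range(1, n):          # kernel writes different elements of b
        device_b[i] = 10 * i

    host_b = list(device_b)        # copy-out of the entire b-array
    return host_b


print(run(copy_in_first=False))    # b[0] written on the host is lost
print(run(copy_in_first=True))     # b[0] survives the copy-out
```

Adding the live-out element b[0] to the may-persist set corresponds to choosing the second variant, where the array is copied in before the kernel runs.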
Sven Verdoolaege [Fri, 24 Dec 2021 13:35:50 +0000 (24 14:35 +0100)]
update isl for isl_union_map_intersect_domain_wrapped_domain
This is needed in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 25 Dec 2021 12:58:44 +0000 (25 13:58 +0100)]
gpu.c: add_to_from_device: store copy-out relation in copy_out
The variable may_write is reused for different kinds of relations,
one of which is constructed starting from the copy-out relation,
which is itself constructed from another may-write relation.
The copy-out relation was therefore also stored in may_write and
then copied into copy_out. Store the copy_out relation directly
in copy_out and then copy that to may_write instead.
This clarifies what the object now stored in copy_out represents
at the point of the call to node_may_persist,
to which it will be passed in an upcoming commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: add_to_from_device: move call to node_may_persist up
In an upcoming commit, the copy-out relation in terms
of the statement instances will be passed down to node_may_persist.
The call therefore needs to be performed before
the statement instances are replaced by the prefix schedule
in the copy-out relation.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 28 Nov 2021 14:31:47 +0000 (28 15:31 +0100)]
gpu.c: add_to_from_device: pass down domain to node_may_persist
The domain has already been computed by the caller of add_to_from_device
so it does not need to be recomputed inside node_may_persist.
Note that there is a call to mark_kernels in between,
which may group statements and cause the expanded domain
in node_may_persist to not have all original domain constraints.
That is, the domain passed down may be a subset
of the originally computed domain, but the extra elements
are irrelevant because they do not belong to the original
statement instance sets.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: add_to_from_device: compute copy-out before mapping to prefix schedule
The copy-out relation is needed in terms of the statement instances
in an upcoming commit, so compute the copy-out from the may-write
before replacing the statement instances by the prefix schedule.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 24 Dec 2021 13:36:44 +0000 (24 14:36 +0100)]
gpu.c: filter_flow: extract out apply_filter
The extracted function will be reused in an upcoming commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 29 Nov 2021 21:18:07 +0000 (29 22:18 +0100)]
gpu.c: update_may_persist_at_filter: remove minor code duplication
The same call to filter_flow needs to be performed for both types
of nodes, so it only needs to appear once.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 29 Nov 2021 21:17:35 +0000 (29 22:17 +0100)]
gpu.c: update_may_persist_at_filter: extract out remove_all_external_flow
This makes it easier to remove some minor code duplication
in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 25 Dec 2021 14:44:35 +0000 (25 15:44 +0100)]
gpu.c: update_may_persist_at_filter: return isl_stat
This clarifies what the possible return values are.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 25 Dec 2021 14:22:52 +0000 (25 15:22 +0100)]
gpu.c: update_may_persist_at_band: return isl_stat
This clarifies what the possible return values are.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 28 Nov 2021 11:10:25 +0000 (28 12:10 +0100)]
gpu.c: fix typo in comment
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Jul 2021 21:08:11 +0000 (17 23:08 +0200)]
try and remove strides in bands before tiling
If the tiled band is strided then the tile sizes
are effectively reduced by the strides.
This is especially important for the GPU target
where the point loops get mapped directly to thread ids and
it is wasteful to only use a strided subset of those.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
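The effect of a stride on the tile sizes can be seen in a small one-dimensional model. The sketch below is not PPCG code; the stride of 4, the tile size of 32 and the helper names points_per_tile and remove_stride are assumptions chosen for illustration.

```python
def points_per_tile(values, tile_size):
    """Group schedule values by tile index and count the points in each tile."""
    counts = {}
    for v in values:
        tile = v // tile_size
        counts[tile] = counts.get(tile, 0) + 1
    return counts


def remove_stride(values, stride, offset):
    """Map values of the form offset + stride * k to k."""
    return [(v - offset) // stride for v in values]


stride, offset, tile_size = 4, 0, 32
strided = list(range(offset, 256, stride))   # 0, 4, 8, ..., 252

before = points_per_tile(strided, tile_size)
after = points_per_tile(remove_stride(strided, stride, offset), tile_size)

print(max(before.values()))  # 8: only a quarter of each tile is used
print(max(after.values()))   # 32: every point of each tile is used
```

With the stride in place, each tile of 32 points holds only 8 statement instances, which on a GPU would leave most of the thread ids idle.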
Sven Verdoolaege [Sat, 17 Jul 2021 21:08:11 +0000 (17 23:08 +0200)]
schedule.c: shift_to_origin: move up construction of universe domain
This will be useful, though not strictly necessary, in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Wed, 21 Jul 2021 15:25:09 +0000 (21 17:25 +0200)]
update isl for isl_set_get_lattice_tile
This will be used in an upcoming commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Jul 2021 21:07:49 +0000 (17 23:07 +0200)]
add test case with a strided statement instance set
There is currently no special handling of strided statement instance sets,
but some support will be added in an upcoming commit.
Add a test case to ensure this does not break code generation.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 26 Jul 2021 19:38:31 +0000 (26 21:38 +0200)]
update pet for more relaxed pet_expr_is_equal index comparison
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 11 Jul 2021 15:27:03 +0000 (11 17:27 +0200)]
gpu.c: insert_empty_permutable_band: set domain of partial schedule
This ensures that the shifting introduced in the previous commit
can also be applied to these zero-dimensional bands.
In particular, shift_to_origin calls isl_schedule_node_band_shift,
which checks that the domain of the partial schedule
of the band node is not affected by the shift.
This assumes, however, that the partial schedule
has a domain to begin with. Since insert_empty_permutable_band
did not originally set the domain, calling shift_to_origin
on such a node would result in a failure.
An alternative approach would be to special case
zero-dimensional band nodes in shift_to_origin,
or even to skip the shifting if the shift is obviously zero.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 13 May 2021 13:31:47 +0000 (13 15:31 +0200)]
try and shift bands to the origin before tiling
If the tiled band does not start at the origin
then the initial tiles are likely to be partial tiles.
Shifting to the origin improves the chances
of starting with full tiles.
Do not perform any shift if this would involve piecewise expressions
since doing so is likely to result in more complicated code.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
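The benefit of shifting can be checked on a one-dimensional band. The sketch below is a toy model rather than PPCG code; the bounds 5 and 68, the tile size 32 and the helper name tile_fill are assumptions made for the example.

```python
def tile_fill(lower, upper, tile_size):
    """Return (full, partial) tile counts for the points lower..upper (inclusive)."""
    counts = {}
    for i in range(lower, upper + 1):
        counts[i // tile_size] = counts.get(i // tile_size, 0) + 1
    full = sum(1 for c in counts.values() if c == tile_size)
    return full, len(counts) - full


lower, upper, tile_size = 5, 68, 32            # 64 iterations, not starting at 0
unshifted = tile_fill(lower, upper, tile_size)
shifted = tile_fill(0, upper - lower, tile_size)

print(unshifted)  # (1, 2): one full tile and two partial tiles
print(shifted)    # (2, 0): two full tiles
```

The same 64 iterations need three tiles when the band starts at 5, but fit exactly into two full tiles once the band is shifted to start at 0.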
Sven Verdoolaege [Thu, 13 May 2021 12:44:26 +0000 (13 14:44 +0200)]
update isl for isl_pw_*_{isa,as}_*
These will be used in the next commit.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 13 May 2021 09:43:08 +0000 (13 11:43 +0200)]
build libisl.la before libpet.la
If both are being built, then libpet.la depends on libisl.la.
The same mechanism was originally used to ensure libcloog-isl.la
was built after libisl.la, but this was removed in
7d8b25e5
(drop cloog submodule, Tue Aug 21 18:29:05 2012 +0200).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 1 May 2021 10:53:34 +0000 (1 12:53 +0200)]
PPCG 0.08.5
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 1 May 2021 10:43:10 +0000 (1 12:43 +0200)]
update pet to version 0.11.5
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 1 May 2021 10:22:53 +0000 (1 12:22 +0200)]
update isl to version 0.24
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Wed, 17 Feb 2021 21:39:14 +0000 (17 22:39 +0100)]
gpu.c: extract_single_tagged_access: clean up space copying
The copy of space is needed by isl_space_domain,
but it was being assigned to space2, which is subsequently overwritten.
In practice, this happens to work out because taking
a copy only changes the reference count and returns
the same pointer, but the original code is still confusing,
if not technically incorrect.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 14 Nov 2020 11:49:53 +0000 (14 12:49 +0100)]
PPCG 0.08.4
Sven Verdoolaege [Sun, 1 Nov 2020 16:42:31 +0000 (1 17:42 +0100)]
update pet to version 0.11.4
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 1 Nov 2020 14:53:18 +0000 (1 15:53 +0100)]
update isl to version 0.23
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 9 Jul 2020 08:06:39 +0000 (9 10:06 +0200)]
README: refer to pet/README for the latest supported release of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Fri, 3 Jul 2020 17:01:57 +0000 (3 19:01 +0200)]
tell user about eliminated dead code
It may be confusing to see some statements or statement instances
missing from the generated code if the user is not aware
of dead code elimination.
Sven Verdoolaege [Sat, 12 Oct 2019 11:16:03 +0000 (12 13:16 +0200)]
PPCG 0.08.3
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 12 Oct 2019 11:04:45 +0000 (12 13:04 +0200)]
update pet to version 0.11.3
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 12 Oct 2019 10:57:45 +0000 (12 12:57 +0200)]
update isl to version 0.22
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 22 Aug 2019 20:22:23 +0000 (22 22:22 +0200)]
take into account contraction when generating OpenMP support
Commit
ppcg-0.05-103-gc7d7a176 (optionally group chains of statements,
Thu Mar 17 17:06:40 2016 +0100) introduced the possibility
of grouping several statements into a single statement for scheduling,
but failed to update the detection of parallel loops
in the OpenMP support accordingly.
In particular, when statements are grouped together,
the schedule tree refers to the groups rather than the individual
statements, while the dependence relations still refer
to the original statements.
The practical result is that all dependences get ignored and
every loop is considered parallel when grouping takes place.
Reformulate the partial schedule used in the detection
of parallel loops in terms of the original statements
such that the dependences do get taken into account.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
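The mismatch described in the commit above can be reproduced in a toy model. The sketch below is not the PPCG implementation: the schedule is a dictionary from statement names to functions, the contraction only renames statements (it leaves the iteration untouched), and the names is_parallel, S1, S2 and G are hypothetical.

```python
def is_parallel(schedule, deps):
    """Check that no dependence known to the schedule is carried by the loop.

    schedule: dict mapping a statement name to a function from
              iteration to loop value.
    deps: list of ((src_stmt, src_iter), (dst_stmt, dst_iter)) pairs.
    """
    for (src, i), (dst, j) in deps:
        if src not in schedule or dst not in schedule:
            continue    # dependence does not involve any scheduled statement
        if schedule[src](i) != schedule[dst](j):
            return False
    return True


# Statements S1 and S2 are grouped into G for scheduling.
contraction = {"S1": "G", "S2": "G"}
group_schedule = {"G": lambda i: i}

# Dependences still refer to the original statements.
deps = [(("S1", i), ("S2", i + 1)) for i in range(3)]

# Reformulate the schedule in terms of the original statements.
original_schedule = {s: group_schedule[g] for s, g in contraction.items()}

print(is_parallel(group_schedule, deps))     # True: all dependences are ignored
print(is_parallel(original_schedule, deps))  # False: the loop carries a dependence
```

Because the group schedule only knows about G, none of the dependences on S1 and S2 apply to it and the loop is wrongly reported as parallel; expressing the schedule in terms of the original statements makes the dependences visible again.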
Sven Verdoolaege [Thu, 22 Aug 2019 20:13:51 +0000 (22 22:13 +0200)]
cpu.c: print_scop: extract out init_build_info
This makes it easier to add extra initializations without
cluttering print_scop.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 22 Aug 2019 20:19:47 +0000 (22 22:19 +0200)]
cpu.c: ast_schedule_dim_is_parallel: pass ast_build_userinfo
This will give ast_schedule_dim_is_parallel access to other fields
in the structure. In particular, it will allow the function
to access the contraction that will be added to this structure.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Aug 2019 18:26:17 +0000 (17 20:26 +0200)]
update pet for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Aug 2019 18:23:32 +0000 (17 20:23 +0200)]
update isl for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Aug 2019 10:13:08 +0000 (17 12:13 +0200)]
tell user about any grouping that is applied prior to scheduling
The computed schedule can be confusing if the user
is unaware of the grouping.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 10 Mar 2019 15:47:17 +0000 (10 16:47 +0100)]
PPCG 0.08.2
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 10 Mar 2019 15:36:08 +0000 (10 16:36 +0100)]
update pet to version 0.11.2
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 10 Mar 2019 15:26:58 +0000 (10 16:26 +0100)]
update isl to version 0.21
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 26 Jan 2019 09:43:35 +0000 (26 10:43 +0100)]
update pet for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 26 Jan 2019 09:25:45 +0000 (26 10:25 +0100)]
update isl for support for recent versions of clang
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 26 Jan 2019 09:13:46 +0000 (26 10:13 +0100)]
update isl for move of interface/all.h
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 17 Jul 2018 12:40:27 +0000 (17 14:40 +0200)]
PPCG 0.08.1
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 17 Jul 2018 12:23:13 +0000 (17 14:23 +0200)]
update pet to version 0.11.1
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 17 Jul 2018 09:23:09 +0000 (17 11:23 +0200)]
update isl to version 0.20
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 13:13:19 +0000 (15 15:13 +0200)]
gpu_group.c: can_tile: use isl_map_get_range_simple_fixed_box_hull
This functionality was moved to isl.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 13:13:06 +0000 (15 15:13 +0200)]
update isl for isl_map_get_range_simple_fixed_box_hull
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 10:09:18 +0000 (15 12:09 +0200)]
update pet for direct header inclusions
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 24 May 2018 12:33:02 +0000 (24 14:33 +0200)]
gpu_group.c: compute_array_dim_size: improve error handling
In particular, start from an infinite bound to be able
to detect the difference between failure to find a bound and
failures that occur during the computation.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 24 May 2018 11:01:46 +0000 (24 13:01 +0200)]
gpu_group.c: compute_size_in_direction: extract out is_suitable_bound
This improves readability and error handling.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Thu, 24 May 2018 10:59:31 +0000 (24 12:59 +0200)]
gpu_group.c: compute_size_in_direction: drop unused variable
It was already unused when it was introduced in
9f18b065 (initial
version of ppcg, Tue Jun 21 17:40:50 2011 +0200).
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 08:27:54 +0000 (15 10:27 +0200)]
gpu_group.c: can_tile: separate stride detection from bound computation
In particular, instead of looking at each individual dimension
in the access and determining a stride and a bound for each
separately, first determine (and remove) all strides and
only then compute bounds.
This will make it easier to promote the entire bound computation
functionality to isl.
The code also gets simplified a bit and it could be further simplified
by changing the way the shifts and the strides are stored.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 07:28:03 +0000 (15 09:28 +0200)]
gpu_array_bound: always store shift/stride
Since
ppcg-0.08-4-g9040bd2f (gpu_group.c: set_stride: use
isl_map_get_range_stride_info, Mon Apr 16 15:53:49 2018 +0200),
a valid shift and stride are always computed (0 and 1 in the trivial
case), so these results might as well be stored even in the trivial
case. This allows some special casing to be removed.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 15 May 2018 07:25:26 +0000 (15 09:25 +0200)]
gpu_group.c: set_stride: return indication of whether stride was found
This allows the caller to use this information directly instead
of having to figure it out indirectly from bound->stride.
This will make it easier to always store a stride in the next commit,
even if it is trivial.
Note that the caller could also check if this stride is trivial
or not, but the callee already performs this check, so it might
as well return this information.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 14 May 2018 14:33:33 +0000 (14 16:33 +0200)]
gpu_group.c: can_tile: return isl_bool
This allows some error handling in the caller to be moved
inside can_tile, where it is easier to follow.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 14 May 2018 14:26:25 +0000 (14 16:26 +0200)]
gpu_group.c: compute_group_bounds_core: return isl_stat
This clarifies what the possible return values are.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 24 Apr 2018 19:51:16 +0000 (24 21:51 +0200)]
print.c: directly include required headers
Do so instead of relying on the headers getting included indirectly.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 16 Apr 2018 13:53:49 +0000 (16 15:53 +0200)]
gpu_group.c: set_stride: use isl_map_get_range_stride_info
This removes some code duplication with respect to isl.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 16 Apr 2018 13:53:00 +0000 (16 15:53 +0200)]
update isl for isl_map_get_range_stride_info
Note that this update also includes some minor improvement
to the AST generation, which may affect code generated by PPCG.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Mon, 16 Apr 2018 13:33:45 +0000 (16 15:33 +0200)]
gpu_group.c: check_stride: extract out set_stride
That is, isolate the part that looks for a stride
such that it can easily be replaced.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Tue, 17 Apr 2018 14:00:17 +0000 (17 16:00 +0200)]
gpu_group.c: fix typo in comment
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sun, 18 Feb 2018 16:48:08 +0000 (18 17:48 +0100)]
PPCG 0.08
Sven Verdoolaege [Sun, 18 Feb 2018 11:07:53 +0000 (18 12:07 +0100)]
update pet to version 0.11
Sven Verdoolaege [Sat, 17 Feb 2018 15:12:07 +0000 (17 16:12 +0100)]
update isl to version 0.19
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>
Sven Verdoolaege [Sat, 17 Feb 2018 19:11:50 +0000 (17 20:11 +0100)]
check working OpenMP support
Some versions of clang miscompile a mapping of the PolyBench/C correlation
benchmark to OpenMP. The value of a parameter appears to be reset
after a parallel for loop. Check that this pattern produces
correct results before running PolyBench/C OpenMP tests.
Signed-off-by: Sven Verdoolaege <sven.verdoolaege@gmail.com>