gpu: avoid mapping initial non-permutable bands to the device
commit 201e18aa8617c4dd44aca28708c4d87a8a956a21
author Sven Verdoolaege <skimo@kotnet.org>
Tue, 15 Dec 2015 11:13:30 +0000 (15 12:13 +0100)
committer Sven Verdoolaege <skimo@kotnet.org>
Tue, 5 Jan 2016 16:45:40 +0000 (5 17:45 +0100)
tree 9cbcb4744c4dd24ba2b823d0ea839a18c60add56
parent 5d6e7f1a81e6719106cc1a2bf8e1fb873720b6a1

If the outer node of the schedule tree is a sequence and the initial
children of this sequence do not have any permutable bands,
then there is no point in including these initial children
in the part that is mapped to the device.
Instead, these initial children can be run on the CPU and
any results produced can be copied to the device.
This should be cheaper than running one or more single-instance kernels
on the device.
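
The split described above can be illustrated with a minimal sketch.
It is not the code from gpu.c: the real implementation works on isl
schedule trees, whereas "struct child", its "has_permutable_band" flag
and "count_initial_host_children" are hypothetical names introduced
here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one child of the outer sequence node.
 * The flag only records whether the subtree rooted at this child
 * contains at least one permutable band. */
struct child {
	const char *name;
	int has_permutable_band;
};

/* Return the number of initial children without any permutable band.
 * In this sketch, children [0, result) would be run on the CPU and
 * children [result, n) would form the part mapped to the device. */
static size_t count_initial_host_children(const struct child *children,
	size_t n)
{
	size_t i;

	assert(children || n == 0);
	for (i = 0; i < n; ++i)
		if (children[i].has_permutable_band)
			break;
	return i;
}
```

Only the leading run of children is split off in this sketch: a child
without permutable bands that follows a child with one stays in the
part that is mapped to the device.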

Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>
gpu.c