restrict grid to actually executed blocks
The grid sizes used for wrapping the iterations over the blocks
in the grid (either the default sizes or those specified by
the user) may be larger than the number of blocks actually
required to run the kernel, and the required number may depend
on the parameter values. Construct a box containing
the origin and the ids of the blocks that actually have
work to do and restrict the grid to this box. The box may still
contain blocks that have no work, but there should be fewer
of them than before.
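
As a rough illustration of the idea only, and not the code in this
commit, the sketch below computes such a box for a concrete list of
active block ids in plain C. The name restrict_grid, the fixed DIMS and
the explicit id list are all hypothetical; the commit itself talks about
sizes that may depend on parameter values, which a fixed list of
integers cannot express.

```c
#define DIMS 2

/*
 * Hypothetical helper, for illustration only.
 * Shrink grid[] to the smallest box that contains the origin and
 * every block id in ids[].  A dimension is never enlarged.
 */
static void restrict_grid(int grid[DIMS], int n_ids, const int ids[][DIMS])
{
	for (int d = 0; d < DIMS; ++d) {
		int max = 0;	/* the box always contains the origin */

		/* largest active block id along dimension d */
		for (int i = 0; i < n_ids; ++i)
			if (ids[i][d] > max)
				max = ids[i][d];

		/* ids 0..max are needed, i.e., max + 1 blocks */
		if (max + 1 < grid[d])
			grid[d] = max + 1;
	}
}
```

For a 256 x 256 grid in which only blocks (0, 0), (3, 1) and (1, 4)
have work, this yields a 4 x 5 grid. That box holds 20 blocks, of which
17 still have no work, but that is far fewer than in the original grid.
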
Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>