gpu: tile kernel band in schedule tree

Since the kernel code is currently still generated from the flat
schedule (derived from the schedule tree before the band is tiled),
this tiling does not have any effect.
It is, however, a preparatory step towards generating the entire
code from the schedule tree.

The scaling is performed inside create_kernel since we will
want to perform some other operations on the schedule tree
before the scaling. In particular,
the mapping to block identifiers is most easily constructed
on the unscaled band node.

Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>