Downsampler finished. I tested several variants of combining and not
combining loops while downsampling an image with an alpha mask and an
outside mask. Results, measured on a 4096x4096 px image:
Downsampling the image and the mask in a single loop is slower than
in two separate loops. I assume this has to do with CPU caching.
Downsampling short/float pixels and mapping them into a byte[] must be
done in a single loop for optimal speed.
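
A minimal sketch of what "downsample and map in a single loop" could look
like for 16-bit data. It is not the Downsampler code: the class and method
names are hypothetical, and 2x2 box averaging, linear min/max scaling and
row-major pixels are assumptions made for illustration.

```java
// Sketch only: names are hypothetical, not the Downsampler API.
final class ShortDownsampleSketch {

    /**
     * Averages each 2x2 block of an unsigned 16-bit image and maps the
     * average linearly from [min, max] to [0, 255] in the same loop.
     * Assumes row-major pixels; an odd last row or column is dropped.
     */
    static byte[] downsampleShortToByte(final short[] src, final int w, final int h,
                                        final int min, final int max) {
        final int w2 = w / 2, h2 = h / 2;
        final byte[] dst = new byte[w2 * h2];
        final double scale = 255.0 / Math.max(1, max - min);
        for (int y = 0; y < h2; y++) {
            final int row0 = 2 * y * w;   // upper source row
            final int row1 = row0 + w;    // lower source row
            for (int x = 0; x < w2; x++) {
                final int x0 = 2 * x;
                // Mask with 0xffff because Java shorts are signed.
                final int sum = (src[row0 + x0] & 0xffff) + (src[row0 + x0 + 1] & 0xffff)
                              + (src[row1 + x0] & 0xffff) + (src[row1 + x0 + 1] & 0xffff);
                final long v = Math.round((sum / 4.0 - min) * scale);
                // Clamp to [0, 255] and store directly, with no intermediate short[].
                dst[y * w2 + x] = (byte) Math.max(0L, Math.min(255L, v));
            }
        }
        return dst;
    }
}
```

The point of the single loop is that the averaged value is mapped while it
is still in a local variable, so no intermediate short[] is written and
read back.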
Downsampling and mapping the channels of RGB into three independent byte[]
must be done in a single loop for optimal speed.
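
The same idea for color, as a hedged sketch: each packed RGB pixel is read
once and the three channel averages are written to three byte[] in the same
pass. The names are hypothetical, and 2x2 box averaging with rounding is an
assumption, not necessarily what downsampleColor() does.

```java
// Sketch only: names are hypothetical, not the Downsampler API.
final class ColorDownsampleSketch {

    /**
     * Averages each 2x2 block of a packed 0xRRGGBB image and splits the
     * result into three byte[] (red, green, blue) in one loop.
     * Assumes row-major pixels; an odd last row or column is dropped.
     */
    static byte[][] downsampleRgbToChannels(final int[] rgb, final int w, final int h) {
        final int w2 = w / 2, h2 = h / 2;
        final byte[] r = new byte[w2 * h2];
        final byte[] g = new byte[w2 * h2];
        final byte[] b = new byte[w2 * h2];
        for (int y = 0; y < h2; y++) {
            final int row0 = 2 * y * w;   // upper source row
            final int row1 = row0 + w;    // lower source row
            for (int x = 0; x < w2; x++) {
                final int x0 = 2 * x;
                final int p0 = rgb[row0 + x0], p1 = rgb[row0 + x0 + 1];
                final int p2 = rgb[row1 + x0], p3 = rgb[row1 + x0 + 1];
                // Sum each channel over the four source pixels.
                final int rs = ((p0 >> 16) & 0xff) + ((p1 >> 16) & 0xff)
                             + ((p2 >> 16) & 0xff) + ((p3 >> 16) & 0xff);
                final int gs = ((p0 >> 8) & 0xff) + ((p1 >> 8) & 0xff)
                             + ((p2 >> 8) & 0xff) + ((p3 >> 8) & 0xff);
                final int bs = (p0 & 0xff) + (p1 & 0xff) + (p2 & 0xff) + (p3 & 0xff);
                final int k = y * w2 + x;
                // Add 2 before dividing by 4 to round to nearest.
                r[k] = (byte) ((rs + 2) / 4);
                g[k] = (byte) ((gs + 2) / 4);
                b[k] = (byte) ((bs + 2) / 4);
            }
        }
        return new byte[][] { r, g, b };
    }
}
```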
Downsampling and merging an alpha and outside mask must be done in a single
loop.
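
A sketch of merging the two masks while downsampling. The merge rule here
is an assumption for illustration only: a downsampled pixel keeps its
averaged alpha if all four outside samples are 255, and becomes 0
otherwise. The actual rule in downsampleAlphaAndOutside() may differ, and
the names below are hypothetical.

```java
// Sketch only: names and the merge rule are assumptions, not the Downsampler API.
final class AlphaOutsideSketch {

    /**
     * Averages each 2x2 block of the alpha mask and, in the same loop,
     * zeroes the result wherever the outside mask is not fully set (255)
     * for all four source pixels. Assumes row-major pixels of equal size.
     */
    static byte[] downsampleAndMerge(final byte[] alpha, final byte[] outside,
                                     final int w, final int h) {
        final int w2 = w / 2, h2 = h / 2;
        final byte[] dst = new byte[w2 * h2];
        for (int y = 0; y < h2; y++) {
            final int row0 = 2 * y * w;   // upper source row
            final int row1 = row0 + w;    // lower source row
            for (int x = 0; x < w2; x++) {
                final int i0 = row0 + 2 * x, i1 = i0 + 1;
                final int i2 = row1 + 2 * x, i3 = i2 + 1;
                final int sum = (alpha[i0] & 0xff) + (alpha[i1] & 0xff)
                              + (alpha[i2] & 0xff) + (alpha[i3] & 0xff);
                // The AND of all four samples is 0xff only if each one is 0xff.
                final boolean inside =
                    (outside[i0] & outside[i1] & outside[i2] & outside[i3] & 0xff) == 0xff;
                dst[y * w2 + x] = inside ? (byte) ((sum + 2) / 4) : 0;
            }
        }
        return dst;
    }
}
```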
So, picture and alpha[+outside] are separate operations, which is nice.
Pictures:
For ByteProcessor, use downsampleByteProcessor() and use getPixels() to
get the byte[] pixels;
for ShortProcessor, use downsampleShort() and use the returned byte[];
for FloatProcessor, use downsampleFloat() and use the returned byte[];
for ColorProcessor, use downsampleColor() and use the returned byte[][].
Alpha:
For alpha without outside, use downsampleByteProcessor() and use
getPixels() to get the byte[] pixels;
for outside without alpha, use downsampleOutside() and use getPixels() to
get the byte[] pixels;
for alpha and outside, use downsampleAlphaAndOutside() and use getPixels()
to get the byte[] pixels of both.