Use MEM_ALIGN_ATTR in libdemac instead of a fixed alignment. Speeds up decoding on ARM11 by ~6%.