Refactor x86 decl-based scatter vectorization, prepare for SLP
This refactors the x86 decl-based scatter vectorization in the
same way I did for the gather path. It also prepares scatters
for SLP, mainly single-lane SLP, since several pieces needed
to support multi-lane scatters are still missing.
Tested extensively on the SLP-only branch which has the ability
to force SLP even for single lanes.
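For reference, the kind of source loop affected here is an
indirect store through an index vector. The sketch below is
illustrative only and is not taken from the testsuite; the function
names are hypothetical, and whether the vectorizer emits a scatter
for them depends on the target flags (for example an AVX-512 capable
-march) and the cost model. The second function adds a condition on
the store, which is the case where a mask operand is involved.

```c
#include <assert.h>

/* Unconditional indirect store: out[idx[i]] = in[i].
   A scatter candidate when the target provides scatter support.  */
void
scatter_store (double *restrict out, const int *restrict idx,
	       const double *restrict in, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    out[idx[i]] = in[i];
}

/* Conditional indirect store: only lanes with a nonzero mask
   entry are written, so the store needs a mask operand.  */
void
masked_scatter_store (double *restrict out, const int *restrict idx,
		      const double *restrict in,
		      const int *restrict mask, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    if (mask[i])
      out[idx[i]] = in[i];
}
```

Both functions assume the entries of idx are in bounds for out; with
duplicate indices the later iteration wins, as in the scalar loop.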
PR tree-optimization/111133
* tree-vect-stmts.cc (vect_build_scatter_store_calls):
Remove and refactor to ...
(vect_build_one_scatter_store_call): ... this new function.
(vectorizable_store): Use vect_check_scalar_mask to record
the SLP node for the mask operand. Generate code for scatters
with builtin decls from the main scatter vectorization
path and prepare that for SLP.
* tree-vect-slp.cc (vect_get_operand_map): Do not look
at the VDEF to decide between scatter and gather since that
doesn't work for patterns. Use whether the LHS is an SSA_NAME
instead.