Handle gcc __builtin_strcmp using 128/256 bit vectors with sse4.1, avx/avx2
* amd64 front end: redo the translation into IR for PTEST, so as to
use only IROps which we know Memcheck can do exact instrumentation
for. Handling for both the 128- and 256-bit cases has been
changed.
* ir_opt.c: add some constant folding rules to support the above. In
particular, for the case `ptest %reg, %reg` (the same reg twice), we
want rflags.C to be set to a defined-1 even if %reg is completely
undefined. Doing that requires folding `x and not(x)` to zero when
x has type V128 or V256.
* memcheck/tests/amd64/rh2257546_{128,256}.c: new test cases
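
For reference, a minimal sketch of why `ptest %reg, %reg` always sets
rflags.C. This is a plain C model of the architectural PTEST flag
semantics, not the VEX IR generated by the front end; the names `V128`,
`PtestFlags` and `ptest_model` are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct { uint64_t lo, hi; } V128;

/* The two flags PTEST computes; the other arithmetic flags are cleared. */
typedef struct { int zf, cf; } PtestFlags;

/* Model of PTEST on 128-bit operands:
     ZF = ((src AND dst) == 0)
     CF = ((src AND NOT dst) == 0)                                      */
static PtestFlags ptest_model(V128 dst, V128 src)
{
    PtestFlags f;
    f.zf = ((src.lo &  dst.lo) | (src.hi &  dst.hi)) == 0;
    f.cf = ((src.lo & ~dst.lo) | (src.hi & ~dst.hi)) == 0;
    return f;
}

/* With both operands the same register, CF depends on x AND NOT x,
   which is zero for every x, so CF is 1 whatever the register holds. */
static void check_same_reg_sets_carry(V128 x)
{
    PtestFlags f = ptest_model(x, x);
    assert(f.cf == 1);
}
```

Because the result is independent of x, the instrumented computation
can be treated as fully defined once `x and not(x)` is folded to zero.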
https://bugzilla.redhat.com/show_bug.cgi?id=2257546