aarch64: Specify that FEAT_MOPS sequences clobber CC
commit cbdffae5745327b0e5eb887afc512daf34b049b1
author Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Wed, 30 Nov 2022 17:38:16 +0000
committer Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Wed, 30 Nov 2022 17:38:16 +0000
tree ff8575e5bfb0fbd897da1a0ad1b614b0990bdb5b
parent 031d3f095520f0e1ee03e29b7ad5067c2a3f96e0

According to the architecture pseudocode, the FEAT_MOPS sequences overwrite the NZCV flags
as part of their operation, so GCC needs to model that in the relevant RTL patterns.
For the testcase:
#include <stddef.h> /* for size_t */

void g ();
void foo (int a, size_t N, char *__restrict__ in,
         char *__restrict__ out)
{
  if (a != 3)
    __builtin_memcpy (out, in, N);
  if (a > 3)
    g ();
}

we currently generate:
foo:
        cmp     w0, 3
        bne     .L6
.L1:
        ret
.L6:
        cpyfp   [x3]!, [x2]!, x1!
        cpyfm   [x3]!, [x2]!, x1!
        cpyfe   [x3]!, [x2]!, x1!
        ble     .L1 // Flags reused after CPYF* sequence
        b       g

This is wrong: the CPYF* sequence overwrites the NZCV flags, so the result of the cmp must
be recalculated after the MOPS sequence before the ble can use it.
With this patch GCC inserts a "cmp w0, 3" before the ble, similar to what clang does.
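
The scenario above can also be checked at run time. The following is only a sketch and is not the content of the new mops_*.c tests: the names g_calls, calls_for and run_checks are hypothetical helpers introduced here. It assumes GCC or clang (for __builtin_memcpy, __restrict__ and the noinline attribute). It exercises the miscompilation only when built for a target with FEAT_MOPS enabled (for example an -march setting that includes +mops) and run on hardware or an emulator that implements it; elsewhere it merely checks the C semantics of foo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter: records how often g was called.  */
static int g_calls;

void
g (void)
{
  g_calls++;
}

/* The function from the testcase.  noinline asks the compiler to keep
   foo out of line, so the compare, the copy and the second branch stay
   together in one function body.  */
__attribute__ ((noinline)) void
foo (int a, size_t N, char *__restrict__ in, char *__restrict__ out)
{
  if (a != 3)
    __builtin_memcpy (out, in, N);
  if (a > 3)
    g ();
}

/* Hypothetical helper: runs foo once and returns how many times g ran.  */
int
calls_for (int a)
{
  char src[16] = "0123456789abcde";
  char dst[16] = { 0 };

  g_calls = 0;
  foo (a, sizeof src, src, dst);
  return g_calls;
}

/* Covers a < 3, a == 3 and a > 3.  g must run only in the last case,
   whether or not the copy in between changed the condition flags.  */
void
run_checks (void)
{
  assert (calls_for (2) == 0);
  assert (calls_for (3) == 0);
  assert (calls_for (4) == 1);
}
```

Calling run_checks () from a test driver aborts if the second branch was decided on stale flags. The a < 3 case is the interesting one: the copy runs, and g must still not be called afterwards.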

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk and to the GCC 12 branch after some baking time.

gcc/ChangeLog:

	* config/aarch64/aarch64.md (aarch64_cpymemdi): Specify clobber of CC reg.
	(*aarch64_cpymemdi): Likewise.
	(aarch64_movmemdi): Likewise.
	(aarch64_setmemdi): Likewise.
	(*aarch64_setmemdi): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/mops_5.c: New test.
	* gcc.target/aarch64/mops_6.c: Likewise.
	* gcc.target/aarch64/mops_7.c: Likewise.

gcc/config/aarch64/aarch64.md
gcc/testsuite/gcc.target/aarch64/mops_5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/mops_6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/mops_7.c [new file with mode: 0644]