i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
commit ad5b757d99b5a121198b79a6a42c1f15ae86a190
author Uros Bizjak <ubizjak@gmail.com>
Tue, 8 Aug 2023 16:53:51 +0000 (18:53 +0200)
committer Uros Bizjak <ubizjak@gmail.com>
Tue, 8 Aug 2023 16:56:07 +0000 (18:56 +0200)
tree e9f3179e9e8ac70689fc986ada4d12247a448cb1
parent aadc5c07feb0ab08729ab25d0d896b55860ad9e6

Also introduce -m[no-]partial-vector-fp-math option to disable trapping
V2SF named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.
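
For illustration, the kind of source code affected by this option can be
sketched as below.  This is a minimal example, not taken from the patch or
its tests: the type name v2sf and the function add2 are hypothetical, and
the 8-byte vector type is assumed to map to V2SFmode on x86-64.  How the
function is actually lowered depends on the compiler version and flags.

```c
/* Hypothetical example; v2sf and add2 are illustrative names only.
   Two packed floats (8 bytes), written with the GCC vector extension.  */
typedef float v2sf __attribute__ ((vector_size (8)));

/* Element-wise addition of two 2-float vectors.  With partial vector
   FP math enabled the compiler may use a 4-lane SSE instruction for
   this; with -mno-partial-vector-fp-math it is expected to fall back
   to other (e.g. scalar) code.  Possible ways to compare the output:

     gcc -O2 -S example.c
     gcc -O2 -mno-partial-vector-fp-math -S example.c
     gcc -O2 -fno-trapping-math -S example.c  */
v2sf
add2 (v2sf a, v2sf b)
{
  return a + b;
}
```

Calling add2 with { 1.0f, 2.0f } and { 3.0f, 4.0f } yields { 4.0f, 6.0f }
regardless of which of the above flags is used; only the generated
instructions are expected to differ.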

The new option is enabled by default because, even with sanitization,
the Polyhedron capacita benchmark shows a small but consistent speedup
of 2 to 3% vs. scalar code.

Using -fno-trapping-math improves Polyhedron capacita runtime by 8 to 9%
vs. scalar code.  This matches what clang does by default, as it
defaults to -fno-trapping-math.
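
Why the upper part matters under trapping math can be sketched with a
plain C model.  This is not GCC's implementation: the 4-element arrays
merely stand in for a V4SFmode register whose low two lanes hold the
V2SF value, and sanitize_upper and invalid_lanes are hypothetical helper
names.  The model assumes the upper lanes may hold leftover values such
as +inf and -inf, whose sum is an invalid operation.

```c
#include <math.h>   /* INFINITY macro only; no libm functions are called */

#define LANES 4     /* models a 4-lane V4SFmode register */

/* Zero lanes 2 and 3, modelling the "sanitize upper part" step.  */
void
sanitize_upper (float v[LANES])
{
  v[2] = 0.0f;
  v[3] = 0.0f;
}

/* Count lanes where a + b produces NaN from non-NaN inputs
   (for example inf + -inf).  On real hardware such a lane would
   raise the invalid-operation exception.  */
int
invalid_lanes (const float a[LANES], const float b[LANES])
{
  int n = 0;
  for (int i = 0; i < LANES; i++)
    {
      float s = a[i] + b[i];
      if (s != s && a[i] == a[i] && b[i] == b[i])
        n++;
    }
  return n;
}
```

In this model, a full-width add on unsanitized registers can report an
invalid lane even though the two meaningful lanes are fine; after
sanitize_upper on both operands the count drops to zero.  With
-fno-trapping-math such spurious exceptions are declared irrelevant, so
the extra zeroing step can be skipped.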

PR target/110832

gcc/ChangeLog:

* config/i386/i386.opt (mpartial-vector-fp-math): New option.
* config/i386/mmx.md (movq_<mode>_to_sse): Do not sanitize
upper part of V2SFmode register with -fno-trapping-math.
(<plusminusmult:insn>v2sf3): Enable for ix86_partial_vec_fp_math.
(divv2sf3): Ditto.
(<smaxmin:code>v2sf3): Ditto.
(sqrtv2sf2): Ditto.
(*mmx_haddv2sf3_low): Ditto.
(*mmx_hsubv2sf3_low): Ditto.
(vec_addsubv2sf3): Ditto.
(vec_cmpv2sfv2si): Ditto.
(vcond<V2FI:mode>v2sf): Ditto.
(fmav2sf4): Ditto.
(fmsv2sf4): Ditto.
(fnmav2sf4): Ditto.
(fnmsv2sf4): Ditto.
(fix_truncv2sfv2si2): Ditto.
(fixuns_truncv2sfv2si2): Ditto.
(floatv2siv2sf2): Ditto.
(floatunsv2siv2sf2): Ditto.
(nearbyintv2sf2): Ditto.
(rintv2sf2): Ditto.
(lrintv2sfv2si2): Ditto.
(ceilv2sf2): Ditto.
(lceilv2sfv2si2): Ditto.
(floorv2sf2): Ditto.
(lfloorv2sfv2si2): Ditto.
(btruncv2sf2): Ditto.
(roundv2sf2): Ditto.
(lroundv2sfv2si2): Ditto.
* doc/invoke.texi (x86 Options): Document
-mpartial-vector-fp-math option.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110832-1.c: New test.
* gcc.target/i386/pr110832-2.c: New test.
* gcc.target/i386/pr110832-3.c: New test.
gcc/config/i386/i386.opt
gcc/config/i386/mmx.md
gcc/doc/invoke.texi
gcc/testsuite/gcc.target/i386/pr110832-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr110832-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr110832-3.c [new file with mode: 0644]