rs6000: inline ldouble __gcc_qsub

While testing IEEE 128-bit float for PPC64LE, Michael Meissner
noticed that __gcc_qsub is substantially slower than __gcc_qadd.
__gcc_qsub calls __gcc_qadd with the second operand negated.
Because the functions are normally invoked through the libgcc
shared object, the extra PLT overhead has a large impact on the
overall time of the function. This patch moves the body of
__gcc_qadd into a static inline function, ldouble_qadd_internal,
which both __gcc_qadd and __gcc_qsub call.
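
The shape of the change can be sketched roughly as follows. This is an
illustration only: dd_t, dd_add_internal, dd_add and dd_sub are
hypothetical stand-ins rather than the names in ibm-ldouble.c, and the
arithmetic in the shared body is a placeholder, not the real
double-double addition algorithm.

```c
/* Hypothetical stand-in for the IBM double-double long double type.  */
typedef struct { double hi, lo; } dd_t;

/* Shared body.  Being static inline, it is expanded into each caller,
   so no call through the PLT is needed to reach it.  The arithmetic
   here is a placeholder, not the real algorithm.  */
static inline dd_t
dd_add_internal (double a, double aa, double c, double cc)
{
  dd_t r;
  r.hi = a + c;
  r.lo = aa + cc;
  return r;
}

/* Exported entry point for addition (plays the role of __gcc_qadd).  */
dd_t
dd_add (double a, double aa, double c, double cc)
{
  return dd_add_internal (a, aa, c, cc);
}

/* Exported entry point for subtraction (plays the role of __gcc_qsub).
   It negates both halves of the second operand and uses the inlined
   body directly instead of calling the exported add function.  */
dd_t
dd_sub (double a, double aa, double c, double cc)
{
  return dd_add_internal (a, aa, -c, -cc);
}
```

With this layout a subtraction costs one external call, the same as an
addition, instead of two.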

libgcc/ChangeLog:

	* config/rs6000/ibm-ldouble.c (ldouble_qadd_internal): Rename from
	__gcc_qadd.
	(__gcc_qadd): Call ldouble_qadd_internal.
	(__gcc_qsub): Call ldouble_qadd_internal with second long double
	argument negated.