fix ELF x86_64 function calls
There were several bugs in the handling of structs that should be
packed into registers, and no support at all for structs that should
be split into an integer register plus an SSE register.
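For illustration, a struct of the kind affected might look like the
sketch below. The names Mixed and mixed_double are made up for this
example and do not come from the TinyCC sources. Under the System V
x86_64 calling convention, the first eightbyte of such a struct is
classified INTEGER and the second SSE, so passing or returning it by
value uses one general-purpose register and one SSE register.

```c
/* Hypothetical test case, not taken from the TinyCC tree. */
struct Mixed {
    long long count;  /* eightbyte 0: INTEGER class */
    double    scale;  /* eightbyte 1: SSE class */
};

/* Takes and returns the struct by value, so both the argument path
   and the return path exercise the integer/SSE split. */
struct Mixed mixed_double(struct Mixed m)
{
    m.count *= 2;
    m.scale *= 2.0;
    return m;
}
```

Compiling a caller of mixed_double with the compiler under test and the
callee with another ABI-conforming compiler (or the reverse) is one way
to check that both sides agree on the register assignment.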
Each architecture must now define a RegArgs type that describes how
an argument is distributed across machine registers. All
architectures except x86_64/ELF use
    typedef int RegArgs;
and interpret the value as the number of registers used.
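A rough sketch of the two flavours follows. Only the typedef int form
is quoted from this message. SketchRegArgs64 and the helper functions
are hypothetical names, and the per-eightbyte layout is an assumption
made for illustration; the actual x86_64/ELF definition may differ.

```c
/* Generic case, as stated above: the value is a register count. */
typedef int RegArgs;

/* Hypothetical helper. Treating 0 as "not passed in registers" is an
   assumption of this sketch. */
int regargs_in_regs(RegArgs a)
{
    return a > 0;
}

/* Hypothetical x86_64/ELF-style description: one class per eightbyte. */
enum SketchClass { SK_NONE, SK_INTEGER, SK_SSE };

typedef struct {
    enum SketchClass eightbyte[2];
} SketchRegArgs64;

/* Counts how many eightbytes of the description have class cls. */
int sketch_count(const SketchRegArgs64 *a, enum SketchClass cls)
{
    int i, n = 0;
    for (i = 0; i < 2; i++)
        if (a->eightbyte[i] == cls)
            n++;
    return n;
}
```

A plain count cannot express "one integer register plus one SSE
register", which is presumably why x86_64/ELF needs a richer type.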
gfunc_sret now takes a RegArgs * as its last argument; if non-NULL, the
pointed-to RegArgs (struct or int) is initialized to describe whether
and how the argument fits into machine registers.
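The optional out-parameter pattern can be sketched as below.
sketch_sret is a hypothetical stand-in: it does not reproduce
gfunc_sret's real signature or classification logic, and the 16-byte
limit and 8-byte register size are assumptions of the sketch.

```c
#include <stddef.h>

typedef int RegArgs;  /* generic definition quoted in this message */

/* Hypothetical stand-in for an sret-style query. Returns 1 if a value
   of `size` bytes must go through memory, 0 if it fits in registers.
   When `out` is non-NULL it receives the number of 8-byte registers
   needed (0 for the memory case). */
int sketch_sret(size_t size, RegArgs *out)
{
    int nregs = (size > 0 && size <= 16) ? (int)((size + 7) / 8) : 0;

    if (out != NULL)
        *out = nregs;
    return nregs == 0;
}
```

Callers that only need the yes/no answer can pass NULL and skip the
description entirely.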
vdup() is exported for use in the architecture-specific code.