Simplify PME GPU constants

Values that depend on threadsPerAtom are now computed directly
from an enum class value, rather than indirectly from a bool.
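
As a rough illustration of the idea, not the actual code in this change,
the sketch below derives per-atom constants from an enum value. The
enumerator names, c_pmeOrder, c_warpSize, threadsPerAtomValue() and
atomsPerWarp() are all assumptions made for the example.

```cpp
// Hypothetical stand-in for the enum named in this change; the
// enumerators are assumed for illustration.
enum class ThreadsPerAtom : int
{
    Order,        // "order" threads cooperate on each atom
    OrderSquared, // "order * order" threads cooperate on each atom
    Count
};

constexpr int c_pmeOrder = 4;  // assumed interpolation order
constexpr int c_warpSize = 32; // assumed warp width

// Map the enum value straight to a thread count, with no intermediate bool.
// Any value other than Order is treated as OrderSquared in this sketch.
constexpr int threadsPerAtomValue(ThreadsPerAtom threadsPerAtom)
{
    return threadsPerAtom == ThreadsPerAtom::Order ? c_pmeOrder : c_pmeOrder * c_pmeOrder;
}

// A dependent constant, computed from the template value at compile time.
template<ThreadsPerAtom threadsPerAtom>
constexpr int atomsPerWarp()
{
    return c_warpSize / threadsPerAtomValue(threadsPerAtom);
}

static_assert(atomsPerWarp<ThreadsPerAtom::Order>() == 8, "32 / 4");
static_assert(atomsPerWarp<ThreadsPerAtom::OrderSquared>() == 2, "32 / 16");
```

The template parameter is declared without const, so it compares
against a plain ThreadsPerAtom enumerator with no qualifier mismatch.
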
Turned clang-format off for the sections of code that declare
template functions. Those declarations are easier to read and maintain
when laid out like tabular data rather than free-form code.
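
A minimal sketch of the tabular layout, assuming a made-up function
template kernelVariant() and made-up parameters; only the
"// clang-format off" and "// clang-format on" markers are real
clang-format features.

```cpp
// Hypothetical enum and function template, used only to show the layout.
enum class ThreadsPerAtom : int
{
    Order,
    OrderSquared,
    Count
};

// Packs the template arguments into a small integer so each
// instantiation is distinguishable.
template<int order, bool computeSplines, bool spreadCharges, ThreadsPerAtom threadsPerAtom>
int kernelVariant()
{
    return (computeSplines ? 1 : 0) | (spreadCharges ? 2 : 0)
           | (threadsPerAtom == ThreadsPerAtom::OrderSquared ? 4 : 0);
}

// With formatting disabled, the explicit instantiations keep their
// column alignment and read like rows of a table.
// clang-format off
template int kernelVariant<4, true,  true,  ThreadsPerAtom::Order>();
template int kernelVariant<4, true,  false, ThreadsPerAtom::Order>();
template int kernelVariant<4, false, true,  ThreadsPerAtom::Order>();
template int kernelVariant<4, true,  true,  ThreadsPerAtom::OrderSquared>();
template int kernelVariant<4, true,  false, ThreadsPerAtom::OrderSquared>();
template int kernelVariant<4, false, true,  ThreadsPerAtom::OrderSquared>();
// clang-format on
```

Without the markers, clang-format would normalize the spacing and the
columns would no longer line up.
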
Removed const from some template value parameters. The qualifier is
redundant, because template values are always constant, and it confused
a compiler when a const ThreadsPerAtom was compared with a
ThreadsPerAtom.
Change-Id: I295d4c2ea52b7912b8bd9a09ee178d104e9bfcb0