kernel - Improve lockf performance
* Improve fcntl lockf performance with two small optimizations.
Together these improve uncontended performance by a little
over 10%.
* Add a 2-entry-per-cpu range lock allocation cache. This
covers the most typical lock/unlock situations.
* Conditionalize the setting of VMAYHAVELOCKS to avoid unnecessary
atomic ops.
* Remove the clearing of VMAYHAVELOCKS. Leaving the flag set costs
basically nothing in close(), while clearing it costs several
branches and an atomic op in the lockf critical path.
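
For illustration only, a minimal userland sketch of the two ideas
above: a 2-entry-per-cpu allocation cache and a test-before-set of a
"may have locks" flag. This is not the kernel code from this commit.
All names here (NCPU, struct range, range_alloc, range_free,
F_MAYHAVELOCKS, mark_may_have_locks) are hypothetical stand-ins, and
the sketch assumes each cache is only touched by its own cpu, so the
cache itself needs no locking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NCPU            4       /* hypothetical cpu count for the sketch */
#define CACHE_SLOTS     2       /* two cached entries per cpu */
#define F_MAYHAVELOCKS  0x1u    /* stand-in for VMAYHAVELOCKS */

struct range {                  /* stand-in for a range lock record */
    long start;
    long end;
};

struct cpu_cache {
    struct range *slot[CACHE_SLOTS];
    int count;                  /* number of valid entries in slot[] */
};

static struct cpu_cache caches[NCPU];

/* Reuse a cached entry when one is available, else hit the allocator. */
static struct range *
range_alloc(int cpu)
{
    struct cpu_cache *c;

    assert(cpu >= 0 && cpu < NCPU);
    c = &caches[cpu];
    if (c->count > 0)
        return c->slot[--c->count];
    return malloc(sizeof(struct range));
}

/* Park the entry in the per-cpu cache; free it only if the cache is full. */
static void
range_free(int cpu, struct range *r)
{
    struct cpu_cache *c;

    assert(cpu >= 0 && cpu < NCPU);
    c = &caches[cpu];
    if (c->count < CACHE_SLOTS)
        c->slot[c->count++] = r;
    else
        free(r);
}

/*
 * Set the flag only when it is not already set.  The common case is a
 * plain load and a branch.  Returns 1 if the atomic read-modify-write
 * was issued, 0 if it was skipped.
 */
static int
mark_may_have_locks(atomic_uint *flags)
{
    if (atomic_load_explicit(flags, memory_order_relaxed) & F_MAYHAVELOCKS)
        return 0;
    atomic_fetch_or(flags, F_MAYHAVELOCKS);
    return 1;
}
```

A lock followed by an unlock on the same cpu then recycles the same
record without touching the allocator, and repeated locking of an
already-flagged object skips the atomic op. The flag is never cleared
in this sketch, matching the last item above.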