kernel - Refactor kern_sendfile()
* Refactor kern_sendfile() to greatly improve performance.
* Use vm_page_lookup_sbusy_try() exclusively to acquire VM pages
to assign to the mbufs.
* Instead of holding pages in a fancy manner, just issue the
UIO_NOCOPY / VMIO VOP_READ() in the blind and loop back up to
retry the lookup.
* The VOP_READ() is still synchronous. It is unclear whether
making VOP_READ() asynchronous via the pagerops would
improve performance versus simply limiting the number of
connections per worker. At least in localhost tests, we
seem to hit a hardware memory bottleneck long before we
hit a CPU bottleneck.
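
The lookup / blind-read / retry pattern described above can be
modelled in user space roughly as follows. This is only a sketch
under stated assumptions, not the kernel code: struct sim_page,
sim_lookup_try(), sim_blind_read() and sim_sendfile() are all
hypothetical stand-ins, and busy-page contention, error handling
and mbuf construction are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical stand-in for a VM page; NOT the kernel's struct vm_page. */
struct sim_page {
	bool resident;	/* contents are valid in the simulated cache */
	bool held;	/* acquired by the sender while it is attached */
};

static struct sim_page cache[NPAGES];
static int blind_reads;	/* number of simulated synchronous reads */

/* Models a non-blocking lookup: returns NULL instead of sleeping. */
static struct sim_page *
sim_lookup_try(size_t idx)
{
	assert(idx < NPAGES);
	if (!cache[idx].resident)
		return NULL;
	cache[idx].held = true;
	return &cache[idx];
}

/*
 * Models the synchronous "blind" read: it only populates the cache
 * and copies nothing back to the caller.
 */
static void
sim_blind_read(size_t idx)
{
	assert(idx < NPAGES);
	cache[idx].resident = true;
	blind_reads++;
}

/* Returns the number of pages that would have been handed to mbufs. */
static size_t
sim_sendfile(size_t first, size_t count)
{
	size_t sent = 0;

	assert(first + count <= NPAGES);
	for (size_t idx = first; idx < first + count; idx++) {
		struct sim_page *p;

		/* Try the lookup; on a miss, read blindly and loop back up. */
		while ((p = sim_lookup_try(idx)) == NULL)
			sim_blind_read(idx);

		/* The real code would attach the page to an mbuf here. */
		p->held = false;
		sent++;
	}
	return sent;
}
```

In this model a page that is already resident costs no read at
all, and a missing page costs exactly one blind read followed by
a second lookup attempt.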