kernel - usched_dfly revamp (4), improve tail
* Improve tail performance (many more cpu-bound processes than available
cpus).
* Experiment with removing the LWKT priority adjustments for kernel vs user.
Instead give LWKT a hint about the user scheduler when scheduling a thread.
LWKT's round-robin is left unhinted so that, hopefully, LWKTs starved
while running in kernel mode still get their turn.
* Implement a better calculation for the per-thread uload: base it on
estcpu instead of the priority.
* Adjust default weightings for the new uload calculation scale.
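
A rough sketch of the estcpu-based uload idea, for illustration only. The
function name, the constants, and the linear scaling are assumptions made
for this example and are not taken from the usched_dfly source:

```c
/* Hypothetical constants, NOT the values used by usched_dfly. */
#define SKETCH_ESTCPU_MAX	1024	/* assumed upper bound on estcpu */
#define SKETCH_ULOAD_SCALE	1000	/* assumed load of a fully cpu-bound thread */

/*
 * Map a thread's estimated cpu usage onto a per-thread load weight.
 * The input is clamped to [0, SKETCH_ESTCPU_MAX] and then scaled
 * linearly, so a thread that burns more cpu contributes more load
 * regardless of its priority.
 */
static int
sketch_uload_from_estcpu(int estcpu)
{
	if (estcpu < 0)
		estcpu = 0;
	if (estcpu > SKETCH_ESTCPU_MAX)
		estcpu = SKETCH_ESTCPU_MAX;
	return (estcpu * SKETCH_ULOAD_SCALE / SKETCH_ESTCPU_MAX);
}
```

Because the load now follows measured cpu use rather than priority, its
range changes, which is why any weightings tuned against the old scale
would need to be adjusted.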