inpcb/in6pcb: Split port token
The original single local port space is divided into ncpus2 local port
space groups. We denote a local port space group as PG(N), where N is
in [0, ncpus2).
Properties of PG(N):
- PG(N) only contains local ports matching the following condition:
    (host_order(port) & ncpus2_mask) == N
- PG(N) is protected by its own token.
On the explicit local port bind(2) path and the accept(2) path, PG(N) is
selected using the local port that is already available (accept(2)) or
supplied (bind(2)):
    N = host_order(port) & ncpus2_mask
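
The mapping above can be sketched in C roughly as follows. This is only
an illustration: port_group_index() is a hypothetical helper name, not a
function from the tree, and it assumes ncpus2_mask == ncpus2 - 1 with
ncpus2 being a power of 2.

```c
#include <stdint.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper (name is an assumption): map a port held in
 * network byte order to the index N of its port space group PG(N).
 * Assumes ncpus2_mask == ncpus2 - 1, with ncpus2 a power of 2.
 */
static inline unsigned int
port_group_index(uint16_t port_net, unsigned int ncpus2_mask)
{
	/* Convert to host order first, then keep only the low bits. */
	return (ntohs(port_net) & ncpus2_mask);
}
```

With ncpus2 == 8, for example, port 8081 would land in PG(1), since
8081 & 7 == 1.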
On the implicit local port selection path (bind(2) and connect(2)),
PG(N) is selected and used in the following way:
    N = mycpuid;
    N1 = N;
    again:
    if (find free port in PG(N)) {
        DONE;
    } else {
        N = (N + 1) & ncpus2_mask;
        if (N != N1)
            goto again;
        FAILED;
    }
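
A minimal userland sketch of the search loop above, for illustration
only. pg_select(), pg_try_fn and pg_bitmap_try() are hypothetical names;
the "find free port in PG(N)" step is modeled by a callback, and the
starting index is masked with ncpus2_mask here as an assumption so that
the loop always terminates.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if a free port is found in PG(n). */
typedef bool (*pg_try_fn)(unsigned int n, void *arg);

/*
 * Start at the current CPU's group and walk the groups round-robin.
 * Returns the index of the group that supplied a port, or -1 if every
 * group has been tried without success.
 */
static int
pg_select(unsigned int mycpuid, unsigned int ncpus2_mask,
    pg_try_fn try_group, void *arg)
{
	unsigned int n = mycpuid & ncpus2_mask;	/* masking is an assumption */
	unsigned int start = n;

	do {
		if (try_group(n, arg))
			return ((int)n);	/* DONE */
		n = (n + 1) & ncpus2_mask;
	} while (n != start);

	return (-1);				/* FAILED */
}

/*
 * Example callback for experimentation: arg points to a bitmask in
 * which bit n is set when PG(n) still has a free port.
 */
static bool
pg_bitmap_try(unsigned int n, void *arg)
{
	return ((*(unsigned int *)arg >> n) & 1u) != 0;
}
```

Starting from the local CPU's group means that, in the common case
where that group still has free ports, only the local group's token is
touched.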
PG(N) is now recorded in the inpcb struct, so when the inpcb is
destroyed, we know which port space group it should use.
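
The idea of recording the group can be modeled as below. This is a
simplified sketch, not the kernel data structures: struct port_group,
struct pcb, pcb_bind_group() and pcb_destroy() are all hypothetical
names, PG_COUNT is an assumed ncpus2, and token handling is only noted
in comments.

```c
#include <stddef.h>

#define PG_COUNT	4	/* assumed ncpus2, for illustration only */

/* Hypothetical stand-in for one port space group. */
struct port_group {
	int	pg_nports;	/* ports currently bound in this group */
};

/* Hypothetical stand-in for the inpcb. */
struct pcb {
	struct port_group *pcb_pg;	/* group recorded at bind time */
};

static struct port_group port_groups[PG_COUNT];

/* Bind path: account the port in PG(n) and remember the group. */
static void
pcb_bind_group(struct pcb *p, unsigned int n)
{
	struct port_group *pg = &port_groups[n];

	/* The token of PG(n) would be held around this update. */
	pg->pg_nports++;
	p->pcb_pg = pg;
}

/* Destroy path: the recorded group says which token is needed. */
static void
pcb_destroy(struct pcb *p)
{
	struct port_group *pg = p->pcb_pg;

	if (pg == NULL)
		return;		/* never bound to a local port */
	/* Only the token of the recorded group would be taken here. */
	pg->pg_nports--;
	p->pcb_pg = NULL;
}
```

Because the group is stored with the pcb, the destroy path does not
need to recompute N from the port.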
On i7-3770 w/ Intel 82599ES, using tools/kq_connect_client:
The port token contention rate on each hyperthread is reduced from
120K/s to 40K/s. Admittedly, the contention rate is still high, but it
is much better than before.
The major source of port token contention is now the contention between
the implicit local port selection path and the inpcb destroy path. There
may be a way to choose a local port that hashes the inpcb to the current
CPU; this needs more investigation.