ctdb-tests: Wait for child process when killing cluster mutex helper
The following test sometimes fails:
==================================================
Running "cluster_mutex_test lock-unlock-lock-unlock ./tests/var/cluster_mutex.lockfile"
--------------------------------------------------
Output (Exit status: 134):
--------------------------------------------------
LOCK
UNLOCK
CONTENTION
NOLOCK
cluster_mutex_test: ../../tests/src/cluster_mutex_test.c:307: test_lock_unlock_lock_unlock: Assertion `dl2->mh != NULL' failed.
--------------------------------------------------
Required output (Exit status: 0):
--------------------------------------------------
LOCK
UNLOCK
LOCK
UNLOCK
FAILED
==========================================================================
TEST FAILED: tests/cunit/cluster_mutex_001.sh (status 1) (duration: 0s)
==========================================================================
This is due to a race in the test. For the first UNLOCK a signal is
sent to the cluster mutex handler but the test tries to retake the
lock before that process is scheduled and the signal is processed.
Therefore, the fcntl() lock is still held and contention is seen.
After unlocking, tests need to wait until the child has gone, so build
this into ctdb_kill(). This is one of the only places where the PID
is accessible.
Outside of testing, on a real system, nothing will never try
to (re)take the lock so quickly.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>