eventscripts: Become unhealthy faster on nfsd failure
commit fec69034ee939c98b64030aa6a502556d2b4704b
author Martin Schwenke <martin@meltin.net>
Mon, 12 Aug 2013 01:36:25 +0000 (12 11:36 +1000)
committer Amitay Isaacs <amitay@gmail.com>
Wed, 14 Aug 2013 06:10:30 +0000 (14 16:10 +1000)
tree b6177ac2d4bf816ecfbce5dc775fa99847d467a5
parent 4cb3e2cd78053eeb4583faadc252b0595d31f7d5

Anecdotal evidence suggests that most nfsd RPC check failures are due
to cluster filesystem or storage problems.  Apparently these are rarely
helped by attempting to restart the NFS service because the restart
tends to hang.

Fail after 2 nfsd RPC check failures, instead of waiting for 6
failures.  Restart on every 10th failure to try to bring the node back
to good health.

Update unit tests to match.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e9ef93f7b6dad59eabaa32124df81f3e74c651ef)
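
The policy described in the commit message (unhealthy from the 2nd
failure, restart on every 10th failure) can be sketched in shell as
follows.  This is only an illustration: the function name
nfsd_failure_actions and its output words are hypothetical, and the
code is not the content of 20.nfsd.check or of the 60.nfs eventscript.

```shell
# Hypothetical helper, not part of ctdb: print the actions that the
# policy in the commit message implies for a given number of
# consecutive nfsd RPC check failures.
nfsd_failure_actions ()
{
    _count="$1"
    _actions=""

    # Assumed threshold from the commit message: the node is marked
    # unhealthy from the 2nd failure onwards (previously the 6th).
    if [ "$_count" -ge 2 ] ; then
        _actions="unhealthy"
    fi

    # Assumed interval from the commit message: attempt a restart on
    # every 10th failure (10, 20, 30, ...).
    if [ "$_count" -gt 0 ] && [ $((_count % 10)) -eq 0 ] ; then
        _actions="${_actions:+$_actions }restart"
    fi

    # Print "none" when no action applies.
    echo "${_actions:-none}"
}
```

For example, calling nfsd_failure_actions with 1 prints "none", with 2
prints "unhealthy", and with 10 prints "unhealthy restart".  The order
in which the two actions are listed is an arbitrary choice of this
sketch.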
ctdb/config/nfs-rpc-checks.d/20.nfsd.check
ctdb/tests/eventscripts/60.nfs.monitor.112.sh
ctdb/tests/eventscripts/60.nfs.monitor.113.sh
ctdb/tests/eventscripts/60.nfs.monitor.114.sh