ctdb/config/events/README

The events/ directory contains event scripts used by CTDB.  Event
scripts are triggered on certain events, such as startup, monitoring
or public IP allocation.  Scripts may be specific to services,
networking or internal CTDB operations.

Scripts are divided into subdirectories for different CTDB components.
Right now the only component is "legacy".

All event scripts start with the prefix 'NN.' where N is a digit.  The
event scripts are run in sequence based on NN.  Thus 10.interface will
be run before 60.nfs.  It is recommended to keep each NN unique.
However, scripts with the same NN prefix will be executed in
alphanumeric sort order.

As a special case, any event script that ends with a '~' character
will be ignored since this is a common suffix that some editors append
to older versions of a file.  Similarly, any event script with
multiple '.'s will be ignored as package managers can create copies
with an additional suffix starting with '.' (e.g. .rpmnew,
.dpkg-dist).

Only executable event scripts are run by CTDB.  Any event script that
does not have execute permission is ignored.
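
The naming and permission rules above can be sketched as a pair of
shell functions.  This is only an illustration of the rules as stated
in this file, not the code CTDB uses to select scripts, and both
function names are hypothetical.

```shell
# Hypothetical helper, not part of CTDB: succeed if the file name in
# $1 follows the naming rules described in this README.
event_script_name_ok ()
{
    _name="${1##*/}"    # strip any leading directory

    case "$_name" in
    *~)            return 1 ;;  # editor backup file
    *.*.*)         return 1 ;;  # more than one '.', e.g. 50.samba.rpmnew
    [0-9][0-9].?*) return 0 ;;  # required 'NN.' prefix
    *)             return 1 ;;
    esac
}

# Hypothetical helper: name is acceptable and the file is executable
event_script_would_run ()
{
    event_script_name_ok "$1" && [ -x "$1" ]
}
```

For example, "event_script_name_ok 50.samba.rpmnew" fails because the
name contains two '.' characters.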

Event scripts are called with a varying number of arguments.  The
first argument is the event name and the rest of the arguments depend
on the event name.

Event scripts must return 0 for success and non-zero for failure.
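
The calling convention above suggests a script that dispatches on its
first argument.  The following is a minimal sketch, not a script
shipped with CTDB: the name 99.example, the function handle_event and
the messages are hypothetical, and treating unhandled events as
success is an assumption.

```shell
#!/bin/sh
# 99.example: hypothetical event script skeleton

handle_event ()
{
    _event="${1:-}"
    [ $# -gt 0 ] && shift

    case "$_event" in
    startup)
        echo "starting example service"
        ;;
    takeip)
        # takeip <interface> <ip-address> <netmask-bits>
        echo "took $2/$3 on $1"
        ;;
    *)
        # Assumption: events this script does not handle succeed
        ;;
    esac
}

# The exit status of the script is the status of the handler
handle_event "$@"
```

Because the handler is the last command, a non-zero status from any
branch becomes the failure status of the script.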

Output of event scripts is logged.  On failure the output of the
failing event script is included in the output of "ctdb scriptstatus".

The following events are supported (with arguments shown):

init

        This event is triggered once when CTDB is starting up.  This
        event is used to do some basic cleanup and initialisation.

        During the "init" event CTDB is not listening on its Unix
        domain socket, so the "ctdb" CLI will not work.

        Failure of this event will cause CTDB to terminate.

        Example: 00.ctdb creates $CTDB_SCRIPT_VARDIR
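
        A handler for "init" in the spirit of this example might look
        like the sketch below.  It is not the code from 00.ctdb: the
        function name init_event and the fallback directory are
        hypothetical, chosen so that the example runs outside CTDB.

```shell
# Hypothetical fallback so the sketch can run without CTDB
CTDB_SCRIPT_VARDIR="${CTDB_SCRIPT_VARDIR:-${TMPDIR:-/tmp}/ctdb-example/scripts}"

# Create the script state directory; fail the event if this is not
# possible, since nothing later can rely on the directory otherwise
init_event ()
{
    if ! mkdir -p "$CTDB_SCRIPT_VARDIR" ; then
        echo "ERROR: unable to create $CTDB_SCRIPT_VARDIR"
        return 1
    fi
}
```

        The handler does not call the "ctdb" CLI, which is
        unavailable during this event.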

setup

        This event is triggered once, after the "init" event has
        completed.

        For this and any subsequent events the CTDB Unix domain socket
        is available, so the "ctdb" CLI will work.

        Failure of this event will cause CTDB to terminate.

        Example: 11.natgw checks that it has valid configuration

startup

        This event is triggered after the "setup" event has completed
        and CTDB has finished its initial database recovery.

        This event starts all services that are managed by CTDB.  Each
        service that is managed by CTDB should implement this event
        and use it to (re)start the service.

        If the "startup" event fails then CTDB will retry it until it
        succeeds.  There is no limit on the number of retries.

        Example: 50.samba uses this event to start the Samba daemon.

shutdown

        This event is triggered when CTDB is shutting down.

        This event shuts down all services that are managed by CTDB.
        Each service that is managed by CTDB should implement this
        event and use it to stop the service.

        Example: 50.samba uses this event to shut down the Samba
        daemon.

monitor

        This event is run periodically.  The interval between
        successive "monitor" events is configured using the
        MonitorInterval tunable, which defaults to 15 seconds.

        This event is triggered by CTDB to continuously monitor that
        all managed services are healthy.  If all event scripts
        complete the "monitor" event successfully then the node is
        marked HEALTHY.  If any event script fails then no subsequent
        scripts will be run for that event and the node is marked
        UNHEALTHY.

        Each service that is managed by CTDB should implement this
        event and use it to monitor the service.

        Example: 10.interface checks that each configured interface
        for public IP addresses has a physical link established.
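
        A service-side "monitor" handler can be sketched as a loop
        over health checks that fails on the first problem found.
        The function name monitor_event and the idea of passing
        checks as arguments are hypothetical; real scripts typically
        hard-code their checks.

```shell
# Hypothetical "monitor" handler: each argument is a health check
# command.  The first failing check fails the event, and the message
# printed here is what ends up in the logged script output.
monitor_event ()
{
    for _check in "$@" ; do
        # Deliberately unquoted so a check may carry its own arguments
        if ! $_check >/dev/null 2>&1 ; then
            echo "ERROR: health check failed: $_check"
            return 1
        fi
    done

    return 0
}
```

        Printing a short reason before returning non-zero makes the
        failure easier to diagnose from the logged script output.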

startrecovery

        This event is triggered every time a database recovery process
        is started.

        This is rarely used.

recovered

        This event is triggered every time a database recovery process
        is completed.

        This is rarely used.

takeip <interface> <ip-address> <netmask-bits>

        This event is triggered for each public IP address taken by a
        node during IP address (re)assignment.  Multiple "takeip"
        events can be run in parallel if multiple IP addresses are
        being assigned.

        Example: In 10.interface the "ip" command (from the Linux
        iproute2 package) is used to add the specified public IP
        address to the specified interface.  The "ip" command can
        safely be run concurrently.  However, the "iptables" command
        cannot be run concurrently so a wrapper is used to serialise
        runs using exclusive locking.
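
        The two ideas in this example can be sketched as follows.
        This is not the code from 10.interface: the function names,
        the IP_CMD override and the lock file path are hypothetical,
        and flock(1) from util-linux is assumed as one way to obtain
        an exclusive lock.

```shell
IP_CMD="${IP_CMD:-ip}"    # assumption: overridable, e.g. echo for a dry run
LOCK_FILE="${LOCK_FILE:-/tmp/example-iptables.lock}"    # hypothetical path

# takeip <interface> <ip-address> <netmask-bits>
takeip_event ()
{
    _iface="$1" ; _ip="$2" ; _maskbits="$3"

    # Safe to run concurrently with other "takeip" events
    $IP_CMD addr add "${_ip}/${_maskbits}" dev "$_iface"
}

# Run a command while holding an exclusive lock, so that concurrent
# events execute it one at a time
run_serialised ()
{
    flock -x "$LOCK_FILE" "$@"
}
```

        A command that cannot be run concurrently would then be
        invoked as "run_serialised <command> <arguments>".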

        If substantial work is required to reconfigure a service when
        a public IP address is taken over it can be better to defer
        service reconfiguration to the "ipreallocated" event, after
        all IP addresses have been assigned.

        Example: 60.nfs uses ctdb_service_set_reconfigure() to flag
        that public IP addresses have changed so that service
        reconfiguration will occur in the "ipreallocated" event.

releaseip <interface> <ip-address> <netmask-bits>

        This event is triggered for each public IP address released by
        a node during IP address (re)assignment.  Multiple "releaseip"
        events can be run in parallel if multiple IP addresses are
        being unassigned.

        In all other regards, this event is analogous to the "takeip"
        event above.

updateip <old-interface> <new-interface> <ip-address> <netmask-bits>

        This event is triggered for each public IP address moved
        between interfaces on a node during IP address (re)assignment.
        Multiple "updateip" events can be run in parallel if multiple
        IP addresses are being moved.

        This event is only used if multiple interfaces are capable of
        hosting an IP address, as specified in the public addresses
        configuration file.

        This event is similar to the "takeip" event above.
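
        The argument order for "updateip" differs from "takeip", so a
        short sketch may help.  It assumes that moving an address
        amounts to deleting it from the old interface and adding it
        to the new one; the function name and the IP_CMD override are
        hypothetical and the shipped scripts may do more than this.

```shell
IP_CMD="${IP_CMD:-ip}"    # assumption: overridable, e.g. echo for a dry run

# updateip <old-interface> <new-interface> <ip-address> <netmask-bits>
updateip_event ()
{
    _oiface="$1" ; _niface="$2" ; _ip="$3" ; _maskbits="$4"

    # Fail the event if the address cannot be removed from the old
    # interface, otherwise add it to the new interface
    $IP_CMD addr del "${_ip}/${_maskbits}" dev "$_oiface" || return 1
    $IP_CMD addr add "${_ip}/${_maskbits}" dev "$_niface"
}
```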

ipreallocated

        This event is triggered on all nodes as the last step of
        public IP address (re)assignment.  It is triggered
        unconditionally after any "releaseip", "takeip" and "updateip"
        events, even on nodes where those events did not run because
        public IP address assignments have not changed.

        This event is used to reconfigure services.

        Since "ipreallocated" is always run, this allows
        reconfiguration to depend on the states of other nodes rather
        than just IP addresses.

        Example: 11.natgw recalculates the NAT gateway master and
        updates the relevant network configuration on each node if the
        NAT gateway master has changed.

Additional notes for "takeip", "releaseip", "updateip",
"ipreallocated":

* Failure of any of these events causes IP allocation to be retried.

* An event script can use ctdb_service_set_reconfigure() in "takeip",
  "releaseip" or "updateip" events to flag that its service needs to
  be reconfigured.  The "ipreallocated" event can then use
  ctdb_service_needs_reconfigure() to test whether public IPs have
  changed and so determine what type of reconfiguration (if any) is
  needed.
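
The deferred reconfiguration pattern in the last note can be sketched
with a flag file.  The functions below are stand-ins written for this
example, not the real ctdb_service_set_reconfigure() and
ctdb_service_needs_reconfigure(); their names, the flag location and
the step that clears the flag are all hypothetical.

```shell
# Hypothetical flag file location
EXAMPLE_FLAG="${EXAMPLE_FLAG:-${TMPDIR:-/tmp}/example-service.reconfigure}"

# Stand-ins for the CTDB helper functions named above
example_set_reconfigure ()   { : >"$EXAMPLE_FLAG" ; }
example_needs_reconfigure () { [ -e "$EXAMPLE_FLAG" ] ; }

example_event ()
{
    case "$1" in
    takeip|releaseip|updateip)
        # Cheap: only record that public IPs changed on this node
        example_set_reconfigure
        ;;
    ipreallocated)
        if example_needs_reconfigure ; then
            rm -f "$EXAMPLE_FLAG"
            echo "reconfiguring after public IP changes"
        else
            echo "no public IP changes on this node"
        fi
        ;;
    esac
}
```

However many addresses move during one reassignment, the expensive
step runs at most once, in "ipreallocated".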