* A missing network interface now causes monitoring to fail and the
  node to become unhealthy.

* Changed ctdb command's default control timeout from 3s to 10s.

* debug-hung-script.sh now includes the output of "ctdb scriptstatus"
  to provide more information.

* Starting the CTDB daemon by running ctdbd directly no longer
  unconditionally removes an existing Unix socket.

* ctdbd once again successfully kills client processes on releasing
  public IPs. It was checking for them as tracked child processes
  and not finding them, so was not killing them.

* ctdbd_wrapper now exports CTDB_SOCKET so that child processes of
  ctdbd (such as uses of ctdb in eventscripts) use the correct socket.

* Always use Jenkins hash when creating volatile databases. There
  were a few places where TDBs would be attached with the wrong flags.

* Vacuuming code fixes in CTDB 2.2 introduced bugs in the new code
  which led to header corruption for empty records. This resulted in
  inconsistent headers on two nodes, so a request for such a record
  would bounce between nodes indefinitely and log "High hopcount"
  messages. This also caused performance degradation.

* ctdbd was losing log messages at shutdown because they weren't being
  given time to flush. ctdbd now sleeps for a second during shutdown
  to allow time to flush log messages.

* Improved socket handling introduced in CTDB 2.2 caused ctdbd to
  process a large number of packets available on a single FD before
  polling other FDs. Fixed-size queue buffers are now used to allow
  fair scheduling across multiple FDs.

Important internal changes
--------------------------

* A node that fails to take/release multiple IPs will only incur a
  single banning credit. This makes a brief failure less likely to
  cause a node to be banned.

* ctdb killtcp has been changed to read connections from stdin and
  10.interface now uses this feature to improve the time taken to
  kill connections.

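  A sketch of the new usage (illustrative only: the input format is
  assumed to be one "srcip:port dstip:port" pair per line, and the
  addresses shown are made up):

      echo "10.1.1.1:445 10.1.1.2:49152" | ctdb killtcp
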
* Improvements to hot records statistics in ctdb dbstatistics.

* Recovery daemon now assembles up-to-date node flags information
  from remote nodes before checking if any flags are inconsistent
  and forcing a recovery.

* ctdbd no longer creates multiple lock sub-processes for the same
  key. This reduces the number of lock sub-processes substantially.

* Changed the nfsd RPC check failure policy to failover quickly
  instead of trying to repair a node first by restarting NFS. Such
  restarts would often hang if the cause of the RPC check failure was
  the cluster filesystem or storage.

* Logging improvements relating to high hopcounts and sticky records.

* Make sure lower level tdb messages are logged correctly.

* CTDB commands disable/enable/stop/continue are now resilient to
  individual control failures and will retry when failures occur.

* 2 new configuration variables for 60.nfs eventscript:

  - CTDB_MONITOR_NFS_THREAD_COUNT
  - CTDB_NFS_DUMP_STUCK_THREADS

  See ctdb.sysconfig for details.

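  A sketch of how these might appear in ctdb.sysconfig (the values
  are illustrative; see ctdb.sysconfig for the authoritative
  semantics):

      # Monitor the number of nfsd threads
      CTDB_MONITOR_NFS_THREAD_COUNT=yes
      # Dump stack traces of stuck nfsd threads
      CTDB_NFS_DUMP_STUCK_THREADS=5
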
* Removed DeadlockTimeout tunable. To enable debugging of locking
  issues set

     CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh

* In overall statistics and database statistics, lock buckets have
  been updated to use the following timings:

     < 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s

* Initscript is now simplified with most CTDB-specific functionality
  split out to ctdbd_wrapper, which is used to start and stop ctdbd.

* Add systemd support.

* CTDB subprocesses are now given informative names to allow them to
  be easily distinguished when using programs like "top" or "perf".

* The ctdb tool no longer exits from its retry loop if a control
  times out (e.g. under high load). This simple fix prevents exiting
  the retry loop on any error.

* When updating flags on all nodes, use the correct updated flags.
  This should avoid spurious flag change messages in the logs.

* The recovery daemon will not ban other nodes if the current node
  is banned.

* ctdb dbstatistics command now correctly outputs database statistics.

* Fixed a panic with overlapping shutdowns (regression in 2.2).

* Fixed 60.ganesha "monitor" event (regression in 2.2).

* Fixed a buffer overflow in the "reloadips" implementation.

* Fixed segmentation faults in ping_pong (called with incorrect
  argument) and test binaries (called when ctdbd not running).

Important internal changes
--------------------------

* The recovery daemon on a stopped or banned node will stop
  participating in any cluster activity.

* Improve cluster wide database traverse by sending the records
  directly from traverse child process to requesting node.

* TDB checking and dropping of all IPs moved from initscript to
  "init" event.

* To avoid "rogue IPs" the release IP callback now fails if the
  released IP is still present on an interface.

* The "stopped" event has been removed.

  The "ipreallocated" event is now run when a node is stopped. Use
  this instead of "stopped".

* New --pidfile option for ctdbd, used by the initscript.

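  For example (the path shown is illustrative only):

      ctdbd --pidfile=/var/run/ctdb/ctdbd.pid
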
* The 60.nfs eventscript now uses configuration files in
  /etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
  hardcoding them into the script.

* Notification handler scripts can now be dropped into
  /etc/ctdb/notify.d/.

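  A minimal sketch of such a script (the filename is hypothetical,
  and this assumes the event name, e.g. "unhealthy", is passed as
  the first argument):

      #!/bin/sh
      # /etc/ctdb/notify.d/99-log-event (hypothetical name)
      # Log the CTDB node event passed as the first argument
      logger "CTDB node event: $1"
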
* The NoIPTakeoverOnDisabled tunable has been renamed to
  NoIPHostOnAllDisabled and now works properly when set on individual
  nodes.

* New ctdb subcommand "runstate" prints the current internal runstate.
  Runstates are used for serialising startup.

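  For example (the output shown is illustrative; RUNNING is one of
  the possible runstates):

      # ctdb runstate
      RUNNING
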
* The Unix domain socket is now set to non-blocking after the
  connection succeeds. This avoids connections failing with EAGAIN
  and not being retried.

* Fetching from the log ringbuffer now succeeds if the buffer is full.

* Fix a severe recovery bug that can lead to data corruption for SMB
  clients.

* The statd-callout script now runs as root via sudo.

* "ctdb delip" no longer fails if it is unable to move the IP.

* A race in the ctdb tool's ipreallocate code was fixed. This fixes
  potential bugs in the "disable", "enable", "stop", "continue",
  "ban", "unban", "ipreallocate" and "sync" commands.

* The monitor cancellation code could sometimes hang indefinitely.
  This could cause "ctdb stop" and "ctdb shutdown" to fail.

Important internal changes
--------------------------

* The socket I/O handling has been optimised to improve performance.

* IPs will not be assigned to nodes during CTDB initialisation. They
  will only be assigned to nodes that are in the "running" runstate.

* Improved database locking code. One improvement is to use a
  standalone locking helper executable - this avoids creating many
  forked copies of ctdbd and potentially running a node out of memory.

* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
  "ipreallocated" events.

* Message handlers are now indexed, providing a significant
  performance improvement.