1 Application Interface Specification Implementation
2 --------------------------------------------------
4 * Add GMI & sort queues to build environment.
5 * Changed queue.h to be more consistent with naming conventions used in tree.
6 * Simplified cluster membership service by modifying it to use GMI.
7 * Modified availability management framework to use GMI.
8 * Modified the method by which services connect to the main executive to make
9 services standalone objects. This could facilitate plugin services in the future.
10 * Modified checkpointing for multinode using GMI.
11 * Healthcheck timeouts were being caused by slow access to the filesystem for
12 sockets and deleted sockets. This was fixed by using the abstract namespace
13 which is memory based.
14 * Generic logging facility added to get rid of numerous #ifdef DEBUGs and avoid
15 the SIGPIPE and signal changes created by the libc's version of syslog.
16 * Authentication for API<->executive added. Only uid=0 or gid=ais processes can
17 connect to the executive via the libraries to provide service.
18 * This release has a lot of cleanups!
19 * Removed all point-to-point authentication since GMI is now used.
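
The healthcheck fix above relies on the Linux abstract socket namespace, where a UNIX-domain socket name lives in memory rather than on the filesystem. A minimal sketch of binding such a socket follows; the function name `abstract_listen` and the backlog of 5 are illustrative assumptions, not taken from the executive's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening AF_UNIX stream socket whose
 * name lives in the Linux abstract namespace (no filesystem entry).
 * Returns the socket descriptor, or -1 on failure. */
int abstract_listen(const char *name)
{
    struct sockaddr_un addr;
    size_t len;
    socklen_t addrlen;
    int fd;

    assert(name != NULL);
    len = strlen(name);
    if (len == 0 || len > sizeof(addr.sun_path) - 1)
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* A leading NUL byte in sun_path selects the abstract namespace;
     * the name itself follows and is not NUL-terminated. */
    memcpy(addr.sun_path + 1, name, len);
    addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);

    if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because no file is created, there is nothing to unlink on shutdown; the name disappears when the last descriptor is closed.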
21 * GMI provides extended virtual synchrony semantics with:
22 * - agreed message ordering (all processors agree on message order)
23 * - using available hardware multicast
24 * - group membership algorithm that currently supports 16 processors
25 * - message fragmentation to fit MTU that avoids UDP fragmentation
26 * - full recovery of messages during configuration change
27 * - 512kb message support
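
The fragmentation item above can be illustrated with a small sketch that splits a large message into pieces small enough to avoid UDP/IP fragmentation. The payload budget (1500-byte Ethernet MTU, IPv4 and UDP headers, and a 16-byte fragment header) and the function names are assumptions made for illustration; GMI's real header layout is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed per-packet payload budget: 1500-byte MTU minus a 20-byte IPv4
 * header, an 8-byte UDP header, and a hypothetical 16-byte fragment
 * header. */
#define FRAG_PAYLOAD_MAX (1500 - 20 - 8 - 16)

/* Number of fragments needed to carry msg_len bytes (rounds up). */
size_t frag_count(size_t msg_len)
{
    return (msg_len + FRAG_PAYLOAD_MAX - 1) / FRAG_PAYLOAD_MAX;
}

/* Copy fragment number `index` of the message into `out`, which must
 * hold at least FRAG_PAYLOAD_MAX bytes. Returns the fragment length,
 * or 0 if `index` is past the end of the message. */
size_t frag_copy(const unsigned char *msg, size_t msg_len,
                 size_t index, unsigned char *out)
{
    size_t offset = index * FRAG_PAYLOAD_MAX;
    size_t len;

    assert(msg != NULL && out != NULL);
    if (offset >= msg_len)
        return 0;
    len = msg_len - offset;
    if (len > FRAG_PAYLOAD_MAX)
        len = FRAG_PAYLOAD_MAX;
    memcpy(out, msg + offset, len);
    return len;
}
```

Under these assumptions, and reading "512kb" as 512 * 1024 bytes, a maximum-size message needs 361 fragments, the last one carrying 128 bytes.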
31 * Changed all send/recv functions to sendmsg/recvmsg.
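
One benefit of sendmsg over send is that a header and its payload can be handed to the kernel in a single call through an iovec array. The sketch below shows that pattern; `struct example_header` and `send_two_part` are hypothetical names, not the executive's actual message types.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical message header, for illustration only. */
struct example_header {
    int id;
    int size;
};

/* Send a header followed by a payload with one sendmsg call.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_two_part(int fd, const struct example_header *hdr,
                      const void *payload, size_t payload_len)
{
    struct iovec iov[2];
    struct msghdr msg;

    assert(hdr != NULL);
    /* iov_base is not const-qualified, so the casts drop const;
     * sendmsg only reads from these buffers. */
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = sizeof(*hdr);
    iov[1].iov_base = (void *)payload;
    iov[1].iov_len = payload_len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;

    return sendmsg(fd, &msg, 0);
}
```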
34 * Remove current poll code and replace with poll abstraction.
37 * Add CKPT service to AIS Executive for single node checkpointing.
40 * Added support for pthreads to clm and amf interfaces. Critical
41 sections and shared data now protected by mutex. If Dispatch in
42 one thread, Finalize in another thread will cause Dispatch to behave
44 * Moved global receive buffer for APIs into handles or stack.
45 * Allocate and free instance memory in handle manager instead of in each API.
46 * Merge defect fixes from 0.22.1-0.22.8 into development tree.
47 * CkptCheckpointHandleT and CkptSectionIteratorT functions created a new
48 connection each time any APIs using those types were called. Now these
49 types have been encapsulated into their own handle database which doesn't
50 create (expensive) connections for each API call.
51 * Changed named types such as MESSAGE_CKPT_REQ* to MESSAGE_REQ_CKPT* to match
55 * Added ability for authentication to use none, password, or DSA depending on
56 settings in file /etc/ais/authtype. The values are no authentication, password
57 authentication, or dsa authentication.
58 * Added DSA authentication to node-to-node communication.
59 Server generates 16 byte random number, sends random number to connecting
60 client, client signs random number message with DSA private key, server
61 verifies signature with public key of client for random number message.
62 * Added DSA key generator.
63 * Added password authentication. File /etc/ais/aiskeys/password is used by the
64 server to compare the client's password. If they match, the connection is
66 * Added no authentication option.
67 * Fixed PPC compile to compile cleanly with -Wall.
68 * Added version checking to APIs.
71 * Fixed problem where messages queued on outbound queues were not sent
72 during the poll loop because poll wasn't passed the correct events flag.
73 This problem was introduced in the select-to-poll conversion in 0.23.
74 * Healthchecks for CLM service intra-node now run on separate timers per connection.
75 * Fixed problem where intra-node connections were completely broken as a result
76 of the change from select to poll in 0.23.
77 * Fixed bug in AMF timer_del on NULL timer.
78 * Fixed a few bugs in AMF intra-node communications that would result in a segfault.
79 * Made pollfd_table global to reduce variable passing and simplify code.
80 * Cleaned up code by compiling with -Wall, which found several bugs.
81 * Some minor cleanups of makefiles from major reorg in 0.23.
82 * Replaced memory malloc/free/realloc with memory pool versions to avoid
83 failed memory allocation requests and improve realtime response.
84 * Implemented the library portion of all checkpointing (Ckpt) APIs.
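
The outbound queue fix in this release comes down to asking poll for POLLOUT whenever a connection has data waiting to be sent. A minimal sketch of that rule follows, assuming a flat byte buffer stands in for the queue; `conn_poll_events` and `conn_try_flush` are hypothetical names.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Events mask for a connection: always interested in input, and in
 * output only while there is queued data to send. */
short conn_poll_events(size_t outq_len)
{
    short events = POLLIN;

    if (outq_len > 0)
        events |= POLLOUT;
    return events;
}

/* Poll once without blocking and write the queued bytes if the
 * descriptor is writable. Returns the number of bytes written, 0 if
 * there was nothing to send or the descriptor was not writable, and
 * -1 on error. */
ssize_t conn_try_flush(int fd, const char *outq, size_t outq_len)
{
    struct pollfd pfd;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = conn_poll_events(outq_len);
    pfd.revents = 0;

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    if (!(pfd.revents & POLLOUT))
        return 0;
    return write(fd, outq, outq_len);
}
```

Leaving POLLOUT out of the mask when the queue is empty also matters: a writable socket would otherwise wake the loop on every iteration.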
87 * Reorganized executive into exec directory and split executive components
89 * Placed library interfaces into lib directory.
90 * Placed test components into test directory.
91 * Abstracted some of the service setup and teardown code that was
92 integrated into the main loops and disconnect function call to call
93 generic functions {amf|clm}InitializeExecutive, and {amf|clm}FinalizeApi.
94 * Cleaned up connection (ci) datatypes.
95 * Removed old timers, replaced with generic timer implementation.
96 * Replaced select with poll in executive.
97 * Lock memory of process and set RR prio 99 to avoid priority inversions.
98 * Fixed problem where the executive would crash when accept couldn't connect
99 because of resource exhaustion.
102 * Fix defect in HA state and operational state machines where states are not
103 always determined correctly.
104 * Fix defect where an invalid argument to saAmfErrorReport would crash the executive.
105 * Fix defect where no /var/run/aisexec.pid was created for service.
108 * Fix defect where testclm doesn't exit if no connection to AIS executive is possible.
109 * Fix defect where aisexec crashes if no /etc/groups.conf file is present.
110 * Fix defect where aisexec opens "groups.conf" instead of "/etc/groups.conf" file.
111 * Fix defect where aisexec doesn't set SaClmClusterNodeT data structure if local interface
112 is not defined in /etc/clusterips, or an empty/no /etc/clusterips file is present.
113 * Fixed some basic error reporting in aisexec to use syslog's LOG_ERR instead
115 * Fixed SEGV if component not found for componentcapabilitymodelget API.
118 * Correctly parse model values in configuration file for both service
119 groups and components.
120 * Changed variables with text nodeexec to aisexec.
121 * Fixed a bug where select wouldn't retry because errno was checked for -EINTR
122 when it should be checked for EINTR.
123 * Correctly determine startup HA state and send appropriate ha state
124 changes to registered receivers.
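
The EINTR fix above reflects that userspace errno values are positive, so an interrupted call must be compared against EINTR, not -EINTR. A minimal sketch of the retry pattern follows; `wait_readable` and `read_when_ready` are hypothetical helpers, not functions from aisexec.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Block until fd is readable, retrying when a signal interrupts
 * select. Returns select's result: 1 when readable, -1 on error. */
int wait_readable(int fd)
{
    fd_set readfds;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    do {
        /* select may modify the set, so rebuild it before each try. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        rc = select(fd + 1, &readfds, NULL, NULL, NULL);
    } while (rc < 0 && errno == EINTR); /* errno is positive EINTR */
    return rc;
}

/* Wait for input, then read it, retrying the read on EINTR as well. */
ssize_t read_when_ready(int fd, void *buf, size_t len)
{
    ssize_t n;

    if (wait_readable(fd) < 0)
        return -1;
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```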
127 * Implemented saAmfCSISetCallback.
128 * Implemented saAmfCSIRemoveCallback.
129 * Implemented saAmfProtectionGroupTrackStart.
130 * Implemented saAmfProtectionGroupTrackCallback.
131 * Implemented saAmfProtectionGroupTrackStop.
132 * Implemented saAmfErrorReport.
133 * Implemented saAmfErrorCancelAll.
134 * Implemented saAmfResponse.
135 * Implemented saAmfComponentCapabilityModelGet.
136 * Implemented saAmfPendingOperationGet dummy function.
137 This function will have to be rewritten to be
138 correct, but completes the API for now.
139 * Fixed problem where queued messages would not cause select
140 to be triggered. Occurred in component register, unregister,
141 track start, track stop functions.
144 * Implemented saAmfReadinessStateSetCallback.
145 * Implemented saAmfStoppingComplete.
146 * Implemented saAmfComponentTerminateCallback.
147 * Implemented saAmfHAStateGet.
150 * Implemented saAmfHealthcheckCallback.
151 * Implemented saAmfResponse.
152 * Implemented saAmfReadinessStateGet.
153 * Made connect non-blocking to fix bug where blocking connects
154 could cause timeout on heartbeating.
155 * Made recv's non-blocking by adding small buffer to each connection
156 and recving and processing as needed.
157 * Integrated message dispatch for libais and nodeexec connections.
158 * Fixed bug where SA_TRACK_CURRENT does not return current state of
159 cluster membership after second invocation of the testclm application.
160 * Separated several functions from main.
161 * Fixed bug where outqs were not flushed when data present within them
162 at end of processing loop before next select. Previously they would
163 only flush when new data was sent on the queues.
164 * Fixed bug where CLM and AMF always processed dispatch functions in
165 SA_DISPATCH_BLOCKING mode.
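
The non-blocking connect and recv changes in this release both depend on putting the descriptor into non-blocking mode. A minimal sketch using fcntl follows; `set_nonblocking` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Add O_NONBLOCK to the descriptor's file status flags while keeping
 * the flags that are already set. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}
```

After this call, a read or recv with no data available fails with EAGAIN or EWOULDBLOCK instead of stalling, and a connect that cannot complete immediately fails with EINPROGRESS, so the main loop can keep servicing heartbeats.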
168 * Zero out component data structure during parse.
169 * Make component register/unregister update state in the group list.
170 * Return ERR_NOT_EXIST and ERR_EXIST and BAD_OPERATION error codes for
171 register and unregister as per spec.
172 * renamed executive message handlers to include exec in name of handler
174 * Handle null proxyCompName to register and unregister as per spec.
175 * Fixed off-by-one in handle database that resulted in badness when
176 allocating memory after creating any handle (i.e., creating two handles).
177 * Added component enumerator which enumerates all components and executes
178 a function on the component.
179 * Added component enumerator to unregister all library and nodeexec
180 connections that are disconnected.
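
A growable handle table is easy to get wrong by one element, either when sizing the reallocation or when bounds-checking a lookup. The sketch below shows one way to keep both checks consistent. It is an illustration under assumed names (`struct handle_db`, `handle_create`, `handle_get`, `handle_destroy`), not the project's handle database, and the original defect is not described in enough detail to say it had this exact shape.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handle table: slot i holds the instance for handle i,
 * or NULL when the slot is free. */
struct handle_db {
    void **slots;
    unsigned int count; /* number of slots currently allocated */
};

/* Store `instance` in the first free slot, growing the table by one
 * slot when none is free. Returns 0 and sets *handle_out on success,
 * -1 if memory cannot be allocated. */
int handle_create(struct handle_db *db, void *instance,
                  unsigned int *handle_out)
{
    unsigned int i;
    void **grown;

    assert(instance != NULL);
    for (i = 0; i < db->count; i++) {
        if (db->slots[i] == NULL)
            break;
    }
    if (i == db->count) {
        /* The new table needs count + 1 elements, not count. */
        grown = realloc(db->slots, (db->count + 1) * sizeof(*grown));
        if (grown == NULL)
            return -1;
        db->slots = grown;
        db->count++;
    }
    db->slots[i] = instance;
    *handle_out = i;
    return 0;
}

/* Look up a handle; valid indexes are 0 .. count - 1. */
void *handle_get(const struct handle_db *db, unsigned int handle)
{
    if (handle >= db->count)
        return NULL;
    return db->slots[handle];
}

/* Release a handle so its slot can be reused. */
void handle_destroy(struct handle_db *db, unsigned int handle)
{
    if (handle < db->count)
        db->slots[handle] = NULL;
}
```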
183 * Added correct time stamping for SaClmClusterNodeT structure.
184 * Made nodeexec message handlers use the message_handler structure.
185 * Made nodeexec_process_receive handle messages made of only headers
186 with no payloads. Previously, nodeexec would lock.
187 * Separated heartbeat into two shorter messages, one request and one response.
188 * Changed names of some structures to be more consistent.
191 * Added simple linked list implementation list.h.
192 * Modified parser to read data into linked lists. This makes processing
193 register/unregister/healthchecks/management of HAState easier.
194 * Exported queue implementation from nodeexec.c to queue.h.
195 * Separated parser and parser testing code into separate source files.
196 * Modified makefile to build parser test code.
197 * Modified parser test code to display new linked-list implementation.
198 * Partially implemented register/unregister/get component name commands
202 * Genericized nodeexec handler so it could run any type of service
203 and each service has its own set of handler functions so as not to crash
204 the node executive. This allows a set of messages to be designed
205 to not be crashable, vs trying to figure out all of the interactions
206 between a flat message name space.
207 * Added size field for messages stored into the outq so
208 two send_messages in the nodeexec wouldn't crash. The send_message
209 function requires the size on messages, and the size was retrieved from
210 the message header. In a two-part send, there is no message header in
211 the second message. This was resulting in junk data and possible
212 crashes on messages that must be queued waiting for the libais to
213 recv on the other end.
214 This also fixes the MESSAGE_MAGIC value being displayed in some
215 NodeId fields of the testclm application.
216 * Changed lots of type names to something more consistent.
217 * Changed lots of enumerated types names to something more consistent.
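
The size field fix above can be sketched as a queue whose entries carry their own length, so a payload queued without a header can still be sent correctly later. The fixed depth, the item size limit, and the names `outq_add`, `outq_peek`, and `outq_remove` are assumptions for illustration; the queue in queue.h may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUTQ_DEPTH 8      /* hypothetical queue depth */
#define OUTQ_ITEM_MAX 256 /* hypothetical largest queued part */

/* Each queued part records its own size instead of relying on a
 * message header that the second part of a two-part send lacks. */
struct outq_item {
    size_t size;
    unsigned char data[OUTQ_ITEM_MAX];
};

struct outq {
    struct outq_item items[OUTQ_DEPTH];
    unsigned int head; /* index of the oldest item */
    unsigned int used; /* number of items queued */
};

/* Queue a copy of buf. Returns 0 on success, -1 if the queue is full
 * or the part is too large. */
int outq_add(struct outq *q, const void *buf, size_t size)
{
    struct outq_item *item;

    assert(q != NULL && buf != NULL);
    if (q->used == OUTQ_DEPTH || size > OUTQ_ITEM_MAX)
        return -1;
    item = &q->items[(q->head + q->used) % OUTQ_DEPTH];
    memcpy(item->data, buf, size);
    item->size = size;
    q->used++;
    return 0;
}

/* Oldest queued part, or NULL when the queue is empty. */
const struct outq_item *outq_peek(const struct outq *q)
{
    return q->used ? &q->items[q->head] : NULL;
}

/* Drop the oldest queued part after it has been sent. */
void outq_remove(struct outq *q)
{
    if (q->used) {
        q->head = (q->head + 1) % OUTQ_DEPTH;
        q->used--;
    }
}
```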
220 * Abstracted some of the networking functions for EINTR and other errors.
221 * Added library handle verification and generic handle database mechanism
222 available for all services.
223 * Implemented database mechanism and all abstracted functions on cluster
227 * Defined AMF configuration file.
228 * Added configuration parser for AMF service.
229 * Defined initial AMF header/c files.
232 * Initial release of cluster membership service.