1 Copyright (c) 2002-2004 MontaVista Software, Inc.
2 Copyright (c) 2006 Red Hat, Inc.
6 This software licensed under BSD license, the text of which follows:
8 Redistribution and use in source and binary forms, with or without
9 modification, are permitted provided that the following conditions are met:
11 - Redistributions of source code must retain the above copyright notice,
12 this list of conditions and the following disclaimer.
13 - Redistributions in binary form must reproduce the above copyright notice,
14 this list of conditions and the following disclaimer in the documentation
15 and/or other materials provided with the distribution.
16 - Neither the name of the MontaVista Software, Inc. nor the names of its
17 contributors may be used to endorse or promote products derived from this
18 software without specific prior written permission.
20 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
23 ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
24 LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
25 CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
26 SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
27 INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
28 CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
29 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
30 THE POSSIBILITY OF SUCH DAMAGE.
32 -------------------------------------------------------------------------------
33 This file provides a map for developers to understand how to contribute
34 to the openais project. The purpose of this document is to prepare a
35 developer to write a service for openais, or understand the architecture
38 The following is described in this document:
40 * all files, purpose, and dependencies
41 * architecture of openais
42 * taking advantage of virtual synchrony
46 -------------------------------------------------------------------------------
47 all files, purpose, and dependencies.
48 -------------------------------------------------------------------------------
56 Definitions for AMF interface.
60 Definitions for CKPT interface.
64 Definitions for CLM interface.
68 Definitions for the AMF interface.
Definitions for the EVT interface.
76 Definitions for the LCK interface.
79 Definitions for the CFG interface.
82 Definitions for the CPG interface.
85 Definitions for the EVS interface.
88 IPC interface between client and server for AMF service.
91 IPC interface between client and server for CFG service.
94 IPC interface between client and server for CKPT service.
97 IPC interface between client and server for CLM service.
100 IPC interface between client and server for CPG service.
103 IPC interface between client and server for EVS service.
106 IPC interface between client and server for EVT service.
109 IPC interface for generic operations.
112 IPC interface between client and server for LCK service.
115 IPC interface between client and server for MSG service.
118 Handle database implementation.
121 Linked list implementation.
124 Byte swapping implementation.
127 FIFO queue implementation.
Sort queue where items are sorted according to a sequence number. Avoids
sorting, hence insertion of a new element is O(1). Inline implementation.
140 AMF user library linked into user application.
144 CFG user library linked into user application.
148 CKPT user library linked into user application.
152 CLM user library linked into user application.
156 CPG user library linked into user application.
160 EVS user library linked into user application.
164 EVT user library linked into user application.
168 LCK user library linked into user application.
MSG user library linked into user application.
176 AMF user library linked into user application.
180 CKPT user library linked into user application.
184 EVT user library linked into user application.
188 Utility functions used by all libraries.
195 Parser plugin for default configuration file format.
198 Poll abstraction interface.
201 AMF application handling.
204 AMF cluster handling.
207 AMF component level handling.
210 Defines all AMF symbol names.
213 AMF node level handling.
216 AMF service group handling.
219 AMF Service instance handling.
222 AMF service unit handling.
225 AMF utility functions.
228 Server side implementation of CFG service which is used to display
229 redundant ring status and reenabling redundant rings.
232 Server side implementation of Checkpointing (CKPT API).
235 Server side implementation of Cluster Membership (CLM API).
Server side implementation of closed process groups (CPG API).
241 Cryptography functions used by openais.
244 Server side implementation of extended virtual synchrony passthrough
248 Server side implementation of Event Service (EVT API).
251 All IPC operations used by openais.
257 Secret key generator used by openais encryption tools.
260 Server side implementation of the distributed lock service (LCK API).
263 Main function which connects all components together.
265 exec/mainconfig.{c|h}
266 Reads main configuration that is set in the configuration parser.
272 Server side implementation of message service (MSG API).
275 Object database used to configure services.
277 exec/openais-instantiate.c
278 instantiates a component by forking and exec'ing it and writing its
282 Non-blocking thread-based logging service with overflow protection.
285 Service handling routines including the default service handler
289 The synchronization service implementation.
Thread-based timer service.
295 Timer list used to expire timers.
exec/totemconfig.{c|h}
Builds the totem configuration from the configuration file data parsed by
aisparser.
302 General definitions for the totem protocol used by the totem stack.
305 IP handling functions for totem - lowest on stack.
The totem multi ring protocol, currently unimplemented. Sits between
totemsrp and totempg.
312 Network handling functions for totem - between totemip and totemrrp.
315 Process groups interface which is used by all applications - highest on
319 Redundant ring functions for totem - between totemnet and totemsrp.
322 Utility functions used by openais executive.
325 Defines build version.
328 Virtual Synchrony plugin API.
331 Virtual Synchrony YKD Dynamic Linear Voting algorithm.
338 Counts the lines of code in the AIS implementation.
340 -------------------------------------------------------------------------------
341 architecture of openais
342 -------------------------------------------------------------------------------
344 The openais standards based cluster framework is a generic cluster plugin
345 architecture used to create cluster APIs and services. Usually there are
346 libraries which implement APIs and are linked into the end user application.
347 The libraries request services from the aisexec process, called the AIS
348 executive. The AIS executive uses the Totem protocol stack to communicate
349 within the cluster and execute operations on behalf of the user. Finally the
350 response of the API is delivered once the operation has completed.
--------------------------------------------------
|        AMF and more services libraries         |
--------------------------------------------------
--------------------------------------------------
|               openais Executive                |
|  +---------+ +--------+ +---------+            |
|  | Object  | |  AIS   | | Service |            |
|  | Database| | Config | | Handler |            |
|  | Service | | Parser | | Manager |            |
|  +---------+ +--------+ +---------+            |
|  +-------+ +-------+                           |
|  |Service| |svcs...|                           |
|  +-------+ +-------+                           |
|  +--------------------------------+ +--------+ |
|  |             Totem              | | Timers | |
|  |             Stack              | |  API   | |
|  +--------------------------------+ +--------+ |
--------------------------------------------------
Figure 1: openais Architecture
390 Every application that intends to use openais links with the libais library.
391 This library uses IPC, or more specifically BSD unix sockets, to communicate
392 with the executive. The library is a small program responsible only for
393 packaging the request into a message. This message is sent, using IPC, to
394 the executive which then processes it. The library then waits for a response.
396 The library itself contains very little intelligence. Some utility services
399 * create a connection to the executive
400 * send messages to the executive
401 * retrieve messages from the executive
403 * create a handle instance
404 * destroy a handle instance
405 * get a reference to a handle instance
406 * release a reference to a handle instance
When a library connects, it sends the service type in a message. The
service type is stored and used later to reference the message handlers
for both the library message handlers and executive message handlers.
Every message sent contains an integer identifier, which is used to index
into an array of message handlers to determine the correct message handler
to execute for the library. Hence a message is uniquely identified by the
message handler ID number and the service handler ID number.
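The two-level lookup described above can be sketched as follows. This is a minimal illustration, not the executive's actual code: every name prefixed with sketch_ is hypothetical, and the real handler tables are the ones defined through exec/service.h.

```c
#include <stddef.h>

/* Hypothetical handler signature, modelled on the library handlers. */
typedef int (*sketch_handler_fn_t) (void *conn, void *msg);

/* Every message starts with a header that carries its id. */
struct sketch_header {
	int size;
	int id;
};

struct sketch_service {
	const sketch_handler_fn_t *handlers;	/* indexed by message id */
	int handler_count;
};

static int sketch_trackstart (void *conn, void *msg)
{
	(void)conn; (void)msg;
	return (100);
}

static int sketch_trackstop (void *conn, void *msg)
{
	(void)conn; (void)msg;
	return (101);
}

static const sketch_handler_fn_t sketch_clm_handlers[] = {
	sketch_trackstart,	/* message id 0 */
	sketch_trackstop	/* message id 1 */
};

/* Indexed by the service type stored when the library connected. */
static const struct sketch_service sketch_services[] = {
	{ sketch_clm_handlers, 2 }
};

#define SKETCH_SERVICE_COUNT \
	((int)(sizeof (sketch_services) / sizeof (sketch_services[0])))

/* Returns the handler result, or -1 for an unknown service or message id. */
static int sketch_dispatch (int service, void *conn, void *msg)
{
	const struct sketch_header *header = msg;

	if (service < 0 || service >= SKETCH_SERVICE_COUNT) {
		return (-1);
	}
	if (header->id < 0 ||
		header->id >= sketch_services[service].handler_count) {
		return (-1);
	}
	return (sketch_services[service].handlers[header->id] (conn, msg));
}
```

The bounds checks matter because both indexes arrive from the other side of the IPC connection and cannot be trusted.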
416 When a library sends a message via IPC, the delivery of the message occurs
417 to the proper library message handler. The library message handler is
418 responsible for sending the message via the totem process groups API to all
421 This simplifies the library handler significantly. The main purpose of the
422 library handler should be to package the library request into a message that
423 can be sent to all nodes.
425 The totem process groups API sends the message according to the extended
426 virtual synchrony model. The group messaging interface also delivers the
427 message according to the extended virtual synchrony model. This has several
428 advantages which are described in the virtual synchrony section. One
429 advantage that must be described now is that messages are self-delivered;
430 if a node sends a message, that same message is delivered back to that
433 When the executive message is delivered, it is processed by the executive
434 message handler. The executive message handler contains the brains of
435 AIS and is responsible for making all decisions relating to the request
436 from the libais library user.
438 -------------------------------------------------------------------------------
439 taking advantage of virtual synchrony
440 -------------------------------------------------------------------------------
443 processor: a system responsible for executing the virtual synchrony model
444 configuration: the list of processors under which messages are delivered
445 partition: one or more processors leave the configuration
446 merge: one or more processors join the configuration
447 group messaging: sending a message from one sender to many receivers
Virtual synchrony is a model for group messaging. This is often confused
with particular implementations of virtual synchrony. Try to focus on
what virtual synchrony provides, not how it provides it, unless interested
in working on the group messaging interface of openais.
454 Virtual synchrony provides several advantages:
456 * integrated membership
457 * strong membership guarantees
458 * agreed ordering of delivered messages
459 * same delivery of configuration changes and messages on every node
461 * reliable communication in the face of unreliable networks
462 * recovery of messages sent within a configuration where possible
463 * use of network multicast using standard UDP/IP
Integrated membership allows the group messaging interface to give
configuration change events to the API services. This is obviously beneficial
to the cluster membership service (and its respective API), but is helpful
to other services as described later.
Strong membership guarantees allow a distributed application to make decisions
based upon the configuration (membership). Every service in openais registers
a configuration change function. This function is called whenever a
configuration change occurs. The information passed is the current processors,
the processors that have left the configuration, and the processors that have
joined the configuration. This information is then used to make decisions
within a distributed state machine. One example usage is that an AMF component
running on a specific processor has left the configuration, so failover actions
must now be taken with the new configuration (and known components).
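A configuration change function of this kind might look like the sketch below. It is only an illustration of the failover example: the component table and all sketch_ names are hypothetical, and the parameter list is a simplified form of the member, left, and joined lists described above.

```c
#include <stddef.h>

#define SKETCH_MAX_COMPONENTS 8

/* Hypothetical component table: which processor hosts each component. */
struct sketch_component {
	unsigned int nodeid;
	int needs_failover;
};

static struct sketch_component sketch_components[SKETCH_MAX_COMPONENTS];
static int sketch_component_count;

/*
 * Marks every component hosted on a processor that left the
 * configuration and returns how many components were newly marked.
 */
static int sketch_confchg_fn (
	const unsigned int *member_list, int member_list_entries,
	const unsigned int *left_list, int left_list_entries,
	const unsigned int *joined_list, int joined_list_entries)
{
	int i;
	int j;
	int marked = 0;

	/* Only the left list matters for failover in this sketch. */
	(void)member_list; (void)member_list_entries;
	(void)joined_list; (void)joined_list_entries;

	for (i = 0; i < left_list_entries; i++) {
		for (j = 0; j < sketch_component_count; j++) {
			if (sketch_components[j].nodeid == left_list[i] &&
				sketch_components[j].needs_failover == 0) {
				sketch_components[j].needs_failover = 1;
				marked++;
			}
		}
	}
	return (marked);
}
```

Because every processor is delivered the same configuration change, every processor marks the same components.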
480 Virtual synchrony requires that messages may be delivered in agreed order.
481 FIFO order indicates that one sender and one receiver agree on the order of
482 messages sent. Agreed ordering takes this requirement to groups, requiring that
483 one sender and all receivers agree on the order of messages sent.
Consider a lock service. The service is responsible for arbitrating locks
between multiple processors in the system. With FIFO ordering, this is very
difficult because requests for a lock made at about the same time by two separate
processors may arrive at the receivers in different orders. Agreed ordering
ensures that all the processors are delivered the messages in the same order.
In this case the first lock message will always be from processor X, while the
second lock message will always be from processor Y. Hence the first request
is always honored by all processors, and the second request is rejected (since
the lock is taken). This is how race conditions are avoided in distributed
systems.
496 Every processor is delivered a configuration change and messages within a
497 configuration in the same order. This ensures that any distributed state
498 machine will make the same decisions on every processor within the
499 configuration. This also allows the configuration and the messages to be
500 considered when making decisions.
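The lock service example can be reduced to a small deterministic state machine. The sketch below uses hypothetical names and is not the LCK service implementation; it assumes node identifiers are nonzero so that 0 can mean "unlocked".

```c
#define SKETCH_UNLOCKED 0u

/* One replica of the lock state, kept on every processor. */
struct sketch_lock {
	unsigned int owner;	/* nodeid of the holder, or SKETCH_UNLOCKED */
};

/*
 * Called when a lock request is delivered. Returns 1 if the request is
 * granted and 0 if it is rejected because the lock is already taken.
 */
static int sketch_lock_deliver (struct sketch_lock *lock, unsigned int nodeid)
{
	if (lock->owner == SKETCH_UNLOCKED) {
		lock->owner = nodeid;
		return (1);
	}
	return (0);
}
```

If two replicas apply the same requests in the same agreed order, they reach the same owner without exchanging any further messages. With different delivery orders on different processors, the replicas would disagree.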
Virtual synchrony requires that every node is delivered messages that it
sends. This enables the logic to be placed in one location (the handler
for the delivery of the group message) instead of two separate places. This
also allows messages that are sent to be ordered in the stream of other
messages within the configuration.
Certain guarantees are required by virtual synchrony. If a message is sent,
it must be delivered by every processor unless that processor fails. If a
particular processor fails, a configuration change occurs creating a new
configuration under which a new set of decisions may be made. This implies
that even unreliable networks must reliably deliver messages. The
implementation in openais works on unreliable as well as reliable networks.
515 Every message sent must be delivered, unless a configuration change occurs.
516 In the case of a configuration change, every message that can be recovered
517 must be recovered before the new configuration is installed. Some systems
518 during partition won't continue to recover messages within the old
519 configuration even though those messages can be recovered. Virtual synchrony
520 makes that impossible, except for those members that are no longer part
Finally, virtual synchrony takes advantage of hardware multicast to avoid
duplicated packets and scale to large transmit rates. On a 100 Mbit network,
openais can approach wire speed depending on the number of messages queued
for a particular processor.
528 What does all of this mean for the developer?
530 * messages are delivered reliably
531 * messages are delivered in the same order to all nodes
532 * configuration and messages can both be used to make decisions
534 -------------------------------------------------------------------------------
536 -------------------------------------------------------------------------------
538 The first stage in adding a library to the system is to develop the library.
540 Library code should follow these guidelines:
542 * use SA Forum coding style for SA Forum APIs to aid in debugging
543 * use openais coding guidelines for APIs that are not SA Forum that
544 are to be merged into the openais tree.
545 * implement all library code within one file named after the api.
546 examples are ckpt.c, clm.c, amf.c.
547 * use parallel structure as much as possible between different APIs
548 * make use of utility services provided by util.c.
549 * if something is needed that is generic and useful by all services,
550 submit patches for other libraries to use these services.
551 * use the reference counting handle manager for handle management.
struct saVersionDatabase {
	int versionCount;
	SaVersionT *versionsSupported;
};
562 The versionCount number describes how many entries are in the version database.
563 The versionsSupported member is an array of SaVersionT describing the acceptable
564 versions this API supports.
566 An api developer specifies versions supported by adding the following C
567 code to the library file:
572 static SaVersionT clmVersionsSupported[] = {
577 static struct saVersionDatabase clmVersionDatabase = {
578 sizeof (clmVersionsSupported) / sizeof (SaVersionT),
582 After this is specified, the following API is used to check versions:
586 struct saVersionDatabase *versionDatabase,
587 const SaVersionT *version);
589 An example usage of this is
error = saVersionVerify (&clmVersionDatabase, version);
where version is a pointer to an SaVersionT passed into the API.
error will be SA_OK if the version is valid as specified in the
version database.
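A version check against such a database could be sketched as follows. The types and names are hypothetical stand-ins for SaVersionT and saVersionDatabase, the table entry is illustrative, and the matching rule (same release code and major version, requested minor version not newer than the supported one) is an assumption rather than the documented behaviour of the real verify function.

```c
#include <stddef.h>

/* Hypothetical stand-in for SaVersionT. */
struct sketch_version {
	char release_code;
	unsigned char major;
	unsigned char minor;
};

/* Hypothetical stand-in for struct saVersionDatabase. */
struct sketch_version_database {
	int version_count;
	const struct sketch_version *versions_supported;
};

static const struct sketch_version sketch_clm_versions[] = {
	{ 'B', 1, 1 }	/* illustrative entry */
};

static const struct sketch_version_database sketch_clm_database = {
	sizeof (sketch_clm_versions) / sizeof (sketch_clm_versions[0]),
	sketch_clm_versions
};

/* Returns 0 if the requested version is acceptable, -1 otherwise. */
static int sketch_version_verify (
	const struct sketch_version_database *database,
	const struct sketch_version *version)
{
	int i;

	if (version == NULL) {
		return (-1);
	}
	for (i = 0; i < database->version_count; i++) {
		const struct sketch_version *ok = &database->versions_supported[i];

		if (version->release_code == ok->release_code &&
			version->major == ok->major &&
			version->minor <= ok->minor) {
			return (0);
		}
	}
	return (-1);
}
```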
603 Every handle instance is stored in a handle database. The handle database
604 stores instance information for every handle used by libraries. The system
605 includes reference counting and is safe for use in threaded applications.
607 The handle database structure is:
609 struct saHandleDatabase {
610 unsigned int handleCount;
611 struct saHandle *handles;
612 pthread_mutex_t mutex;
613 void (*handleInstanceDestructor) (void *);
616 handleCount is the number of handles
617 handles is an array of handles
618 mutex is a pthread mutex used to mutually exclude access to the handle db
handleInstanceDestructor is a callback that is called when the handle
should be freed because its reference count has dropped to zero.
622 The handle database is defined in a library as follows:
624 static void clmHandleInstanceDestructor (void *);
626 static struct saHandleDatabase clmHandleDatabase = {
629 .mutex = PTHREAD_MUTEX_INITIALIZER,
630 .handleInstanceDestructor = clmHandleInstanceDestructor
633 There are several APIs to access the handle database:
637 struct saHandleDatabase *handleDatabase,
Creates an instance of size instanceSize in the handleDatabase parameter,
returning the handle number in handleOut. The handle instance reference
count starts at the value 1.
647 struct saHandleDatabase *handleDatabase,
648 unsigned int handle);
650 Destroys further access to the handle. Once the handle reference count
651 drops to zero, the database destructor is called for the handle. The handle
652 instance reference count is decremented by 1.
655 saHandleInstanceGet (
656 struct saHandleDatabase *handleDatabase,
Gets the instance for the specified handle from the handleDatabase and returns
it in the instance parameter. If the handle is valid, SA_OK is returned;
otherwise an error is returned. This is used to ensure a handle is
valid. Every get call increases the reference count on a handle instance
by 1.
667 saHandleInstancePut (
668 struct saHandleDatabase *handleDatabase,
669 unsigned int handle);
671 Decrements the reference count by 1. If the reference count indicates
672 the handle has been destroyed, it will then be removed from the database
673 and the destructor called on the instance data. The put call takes care
674 of freeing the handle instance data.
676 Create a data structure for the instance, and use it within the libraries
677 to store state information about the instance. This information can be
678 the handle, a mutex for protecting I/O, a queue for queueing async messages
679 or whatever is needed by the API.
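The create, get, put, and destroy life cycle described above can be sketched with a small fixed-size table. This is a simplified model with hypothetical names, not the code in util.c: it omits the mutex for brevity and uses a fixed number of slots instead of a growing array.

```c
#include <stdlib.h>

#define SKETCH_MAX_HANDLES 16

struct sketch_handle {
	void *instance;
	int ref_count;
	int destroyed;	/* set by destroy; slot is freed when ref_count hits 0 */
	int in_use;
};

struct sketch_handle_database {
	struct sketch_handle handles[SKETCH_MAX_HANDLES];
	void (*destructor) (void *instance);
};

static int sketch_destructor_calls;

/* Example destructor that only counts how often it runs. */
static void sketch_count_destructor (void *instance)
{
	(void)instance;
	sketch_destructor_calls++;
}

/* Creates an instance; the reference count starts at 1. */
static int sketch_handle_create (struct sketch_handle_database *db,
	size_t instance_size, unsigned int *handle_out)
{
	unsigned int i;

	for (i = 0; i < SKETCH_MAX_HANDLES; i++) {
		if (db->handles[i].in_use == 0) {
			db->handles[i].instance = calloc (1, instance_size);
			if (db->handles[i].instance == NULL) {
				return (-1);
			}
			db->handles[i].ref_count = 1;
			db->handles[i].destroyed = 0;
			db->handles[i].in_use = 1;
			*handle_out = i;
			return (0);
		}
	}
	return (-1);
}

/* Takes a reference; fails for unknown or destroyed handles. */
static int sketch_handle_get (struct sketch_handle_database *db,
	unsigned int handle, void **instance)
{
	if (handle >= SKETCH_MAX_HANDLES || db->handles[handle].in_use == 0 ||
		db->handles[handle].destroyed) {
		return (-1);
	}
	db->handles[handle].ref_count++;
	*instance = db->handles[handle].instance;
	return (0);
}

/* Drops a reference; the last put runs the destructor and frees the slot. */
static int sketch_handle_put (struct sketch_handle_database *db,
	unsigned int handle)
{
	if (handle >= SKETCH_MAX_HANDLES || db->handles[handle].in_use == 0) {
		return (-1);
	}
	if (--db->handles[handle].ref_count == 0) {
		if (db->destructor != NULL) {
			db->destructor (db->handles[handle].instance);
		}
		free (db->handles[handle].instance);
		db->handles[handle].in_use = 0;
	}
	return (0);
}

/* Blocks further gets and drops the reference taken at create time. */
static int sketch_handle_destroy (struct sketch_handle_database *db,
	unsigned int handle)
{
	if (handle >= SKETCH_MAX_HANDLES || db->handles[handle].in_use == 0 ||
		db->handles[handle].destroyed) {
		return (-1);
	}
	db->handles[handle].destroyed = 1;
	return (sketch_handle_put (db, handle));
}
```

The point of the split between destroy and put is that a thread still inside an API call keeps the instance alive until it releases its own reference.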
681 -----------------------------------
682 communicating with the executive
683 -----------------------------------
A service connection is created with the following API:
691 enum service_types service);
694 The responseOut parameter specifies the file descriptor where response messages
695 will be delivered. The callback out parameter describes the file descriptor
696 where callback messages are delivered.
698 The service specifies the service to use.
700 Messages are sent and received from the executive with the following functions:
702 SaAisErrorT saSendMsgRetry (
707 the s member is the socket to use retrieved with saServiceConnect
708 The iov is the iovector used to send a message.
709 the iov_len is the number of elements in iov.
711 This sends an IO-vectorized message.
720 the s member is the socket to use retrieved with saServiceConnect
721 the msg member is a pointer to the message to send to the service
722 the len member is the length of the message to send
723 the flags parameter is the flags to use with the sendmsg system call
This sends a data blob to the executive.
728 A message is received from the executive with the function:
the s member is the socket to use retrieved with saServiceConnect
the msg member is a pointer to the buffer that receives the message
the len member is the length of the message to receive
the flags parameter is the flags to use with the recvmsg system call
A message may be sent and a reply waited for with the following function:
743 SaAisErrorT saSendMsgReceiveReply (
747 void *responseMessage,
s is the socket to send and receive the response.
iov is the iovector to send.
iov_len is the number of elements in iov.
responseMessage is the data block used to store the response.
responseLen is the length of the data block that is expected to be received.
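The send and receive pattern behind these functions can be illustrated with plain POSIX calls. The sketch below is not the util.c implementation: the sketch_ names are hypothetical, a short write is simply treated as failure, interrupted system calls are not retried, and MSG_NOSIGNAL is assumed to be available (it is on Linux).

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical request header: total size first, then message id. */
struct sketch_req_header {
	int size;
	int id;
};

/* Sends all iovec elements as one message; returns 0 on success. */
static int sketch_send_iov (int s, struct iovec *iov, int iov_len)
{
	struct msghdr msg_hdr;
	size_t total = 0;
	int i;

	for (i = 0; i < iov_len; i++) {
		total += iov[i].iov_len;
	}
	memset (&msg_hdr, 0, sizeof (msg_hdr));
	msg_hdr.msg_iov = iov;
	msg_hdr.msg_iovlen = iov_len;

	/* Anything other than a complete write counts as failure here. */
	return (sendmsg (s, &msg_hdr, MSG_NOSIGNAL) == (ssize_t)total ? 0 : -1);
}

/* Reads exactly len bytes into msg; returns 0 on success. */
static int sketch_recv_all (int s, void *msg, size_t len)
{
	char *cursor = msg;
	size_t received = 0;

	while (received < len) {
		ssize_t result = recv (s, cursor + received, len - received, 0);

		if (result <= 0) {
			return (-1);	/* error or peer closed the socket */
		}
		received += (size_t)result;
	}
	return (0);
}
```

Sending the header and payload as two iovec elements avoids copying them into one contiguous buffer first.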
Waiting for a file descriptor using the poll system call is done with the API:
764 where the parameters are the standard poll parameters.
766 Messages can be received out of order searching for a specific message id with:
771 Please follow the style of the messages. It makes debugging much easier
772 if parallel style is used.
A service should be added to the service_types enumeration in ipc_gen or, in the
case of an external project, a number should be registered with the project.
789 These are the request CLM message identifiers:
791 Each library should have an ipc_APINAME.h file in include. It should define
792 request types and response types.
795 MESSAGE_REQ_CLM_TRACKSTART = 0,
796 MESSAGE_REQ_CLM_TRACKSTOP = 1,
797 MESSAGE_REQ_CLM_NODEGET = 2,
798 MESSAGE_REQ_CLM_NODEGETASYNC = 3
801 These are the response CLM message identifiers:
804 MESSAGE_RES_CLM_TRACKCALLBACK = 0,
805 MESSAGE_RES_CLM_TRACKSTART = 1,
806 MESSAGE_RES_CLM_TRACKSTOP = 2,
807 MESSAGE_RES_CLM_NODEGET = 3,
808 MESSAGE_RES_CLM_NODEGETASYNC = 4,
809 MESSAGE_RES_CLM_NODEGETCALLBACK = 5
A request header should be placed at the front of every message sent by
the library.
typedef struct {
	int size __attribute__((aligned(8)));
	int id __attribute__((aligned(8)));
} mar_req_header_t __attribute__((aligned(8)));
There is also a response message header which should start every response
message:
typedef struct {
	int size __attribute__((aligned(8)));
	int id __attribute__((aligned(8)));
	SaAisErrorT error __attribute__((aligned(8)));
} mar_res_header_t __attribute__((aligned(8)));
829 the error parameter is used to pass errors from the executive to the library,
830 including SA_ERR_TRY_AGAIN for flow control, which is described later.
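The effect of the aligned(8) attributes on the header layout can be checked with a small sketch. It assumes a GCC-compatible compiler; the sketch_ types are hypothetical, and a plain int stands in for SaAisErrorT. The attributes presumably exist so that the layout is identical on 32-bit and 64-bit builds.

```c
#include <stddef.h>

/* Mirrors the response header; int stands in for SaAisErrorT. */
typedef struct {
	int size __attribute__((aligned(8)));
	int id __attribute__((aligned(8)));
	int error __attribute__((aligned(8)));
} sketch_res_header_t __attribute__((aligned(8)));

/* Hypothetical response message: header first, payload after. */
struct sketch_res_nodeget {
	sketch_res_header_t header;
	unsigned int nodeid __attribute__((aligned(8)));
};

/* Fills in the header so the receiver knows how many bytes to read. */
static void sketch_res_init (struct sketch_res_nodeget *res,
	int id, int error, unsigned int nodeid)
{
	res->header.size = (int)sizeof (*res);
	res->header.id = id;
	res->header.error = error;
	res->nodeid = nodeid;
}
```

Each member starts on an 8-byte boundary, so the header occupies 24 bytes even though it holds three 4-byte integers.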
832 This is described later:
835 mar_uint32_t nodeid __attribute__((aligned(8)));
836 void *conn __attribute__((aligned(8)));
837 } mar_message_source_t __attribute__((aligned(8)));
839 This is the MESSAGE_REQ_CLM_TRACKSTART message id above:
841 struct req_clm_trackstart {
842 mar_req_header_t header;
844 SaClmClusterNotificationT *notificationBufferAddress;
845 SaUint32T numberOfItems;
848 The saClmClusterTrackStart api should create this message and send it to the
851 responses should be of:
853 struct res_clm_trackstart
858 * Avoid doing anything tricky in the library itself. Let the executive
859 handler do all of the work of the system. minimize what the API does.
860 * Once an api is developed, it must be added to the makefile. Just add
861 a line for the file to EXECOBJS build line.
862 * protect I/O send/recv with a mutex.
863 * always look at other libraries when there is a question about how to
864 do something. It has likely been thought out in another library.
866 -------------------------------------------------------------------------------
868 -------------------------------------------------------------------------------
Services are defined by service handlers and messages described in
include/ipc_SERVICE.h. These two pieces of information are used by the
executive to dispatch the correct messages to the correct recipients.
873 -------------------------------
874 the service handler structure
875 -------------------------------
877 A service is added by defining a structure defined in exec/service.h. The
878 structure is a little daunting:
880 struct libais_handler {
881 int (*libais_handler_fn) (void *conn, void *msg);
884 enum openais_flow_control flow_control;
The response_size, response_id, and flow_control for a library handler are
used for flow control. A response message will be sent to the library of the
size response_size, with the header id of response_id if the totem message
queue is full. Some library APIs may not need to block in this condition
(because they don't have to use totem), so they should specify
OPENAIS_FLOW_CONTROL_NOT_REQUIRED in the flow control field.
894 The libais_handler_fn is a function to be called when the library handler is
895 requested to be executed.
897 struct openais_exec_handler {
898 void (*exec_handler_fn) (void *msg, unsigned int nodeid);
899 void (*exec_endian_convert_fn) (void *msg);
902 The exec_handler_fn is a function to be called when the executive handler is
903 requested to execute.
The exec_endian_convert_fn is a function to be called to convert the endianness
of the executive message. Note messages are not stored in big or little endian
format before transmit. Instead they are transmitted in either big endian or
little endian depending on the byte order of the transmitter and converted to
the host machine order on receipt of the message.
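A receiver-side conversion function might look like the sketch below. The message layout and all sketch_ names are hypothetical, and how the receiver learns that the sender's byte order differs is left as a flag passed by the caller, which is an assumption of this sketch.

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit value. */
static uint32_t sketch_swab32 (uint32_t x)
{
	return ((x >> 24) |
		((x >> 8) & 0x0000ff00u) |
		((x << 8) & 0x00ff0000u) |
		(x << 24));
}

/* Hypothetical executive message with three 32-bit fields. */
struct sketch_exec_msg {
	uint32_t size;
	uint32_t id;
	uint32_t nodeid;
};

/* Converts every multi-byte field of the message in place. */
static void sketch_exec_endian_convert (void *msg)
{
	struct sketch_exec_msg *exec_msg = msg;

	exec_msg->size = sketch_swab32 (exec_msg->size);
	exec_msg->id = sketch_swab32 (exec_msg->id);
	exec_msg->nodeid = sketch_swab32 (exec_msg->nodeid);
}

/* Converts only when the sender used the opposite byte order. */
static void sketch_deliver (void *msg, int endian_conversion_required)
{
	if (endian_conversion_required) {
		sketch_exec_endian_convert (msg);
	}
	/* The executive handler would run here on host-order data. */
}
```

Converting on receipt means nodes with the same byte order never pay for a swap.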
911 struct openais_service_handler {
914 unsigned int private_data_size;
915 int (*lib_init_fn) (void *conn);
916 int (*lib_exit_fn) (void *conn);
917 struct openais_lib_handler *lib_service;
918 int lib_service_count;
919 struct openais_exec_handler *exec_service;
920 int (*exec_init_fn) (struct objdb_iface_ver0 *);
921 int (*config_init_fn) (struct objdb_iface_ver0 *);
922 void (*exec_dump_fn) (void);
923 int exec_service_count;
925 enum totem_configuration_type configuration_type,
926 unsigned int *member_list, int member_list_entries,
927 unsigned int *left_list, int left_list_entries,
928 unsigned int *joined_list, int joined_list_entries,
929 struct memb_ring_id *ring_id);
930 void (*sync_init) (void);
931 int (*sync_process) (void);
932 void (*sync_activate) (void);
933 void (*sync_abort) (void);
936 name is the name of the service.
938 id is the identifier of the service.
940 private_data_size is the size of the private data used by the connection
941 which the library and executive handlers can reference.
943 lib_init_fn is the function executed when a library connection is made to
946 lib_exit_fn is the function executed when a library connection is exited
947 either because the application closed the file descriptor, or the OS
948 closed the file descriptor.
950 lib_service is an array of openais_lib_handler data structures which define
951 the library service handler.
953 lib_service_count is the number of elements in lib_service.
955 exec_service is an array of openais_exec_handler data structures which define
956 the executive service handler.
958 exec_init_fn is a function used to initialize the executive service. This
961 config_init_fn is called to parse config files and populate the object
964 exec_dump_fn is called when SIGUSR2 is sent to the executive to dump the
965 current state of the service.
967 exec_service_count is the number of entries in the exec_service array.
969 confchg_fn is called every time a configuration change occurs.
971 sync_init is called when the service should begin synchronization.
973 sync_process is called to process synchronization messages.
975 sync_activate is called to activate the current service synchronization.
977 sync_abort is called to abort the current service synchronization.
The totem protocol includes flow control so that it doesn't send too many
messages when the network is completely full. But the library can
still send messages to the executive much faster than the executive can send
them over totem. So the library relies on the group messaging flow control to
control flow of messages sent from the library. If the totem queues are full,
no more messages may be sent, so the executive in ipc.c automatically detects
this scenario and returns an SA_ERR_TRY_AGAIN error.
When a library gets SA_ERR_TRY_AGAIN, the library may either retry, or return
this error to the user if the error is allowed by the API definitions.
The other information is critical to ensuring that the library reads the correct
message and size of message. Make sure the libais_handler matches the messages
used in the handler function.
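The "retry" option for a library that receives a try-again error can be sketched as a bounded loop. The names and error values below are hypothetical placeholders, not the SA Forum constants, and a real library would normally wait between attempts instead of retrying immediately.

```c
#define SKETCH_OK 1
#define SKETCH_ERR_TRY_AGAIN 6

/* A send operation that returns SKETCH_OK or an error code. */
typedef int (*sketch_send_fn_t) (void *context);

/*
 * Calls send_fn until it stops returning SKETCH_ERR_TRY_AGAIN or
 * max_attempts is reached. The last result is returned and the number
 * of calls made is stored in attempts_out.
 */
static int sketch_send_with_retry (sketch_send_fn_t send_fn, void *context,
	int max_attempts, int *attempts_out)
{
	int error = SKETCH_ERR_TRY_AGAIN;
	int calls = 0;

	while (calls < max_attempts) {
		error = send_fn (context);
		calls++;
		if (error != SKETCH_ERR_TRY_AGAIN) {
			break;
		}
		/* A real library would sleep or poll here before retrying. */
	}
	*attempts_out = calls;
	return (error);
}

/* Test double: reports a full queue until the counter reaches zero. */
static int sketch_busy_send (void *context)
{
	int *remaining_busy = context;

	if (*remaining_busy > 0) {
		(*remaining_busy)--;
		return (SKETCH_ERR_TRY_AGAIN);
	}
	return (SKETCH_OK);
}
```

Bounding the number of attempts lets the API fall back to returning the error to the user when the queues stay full.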
996 ------------------------------------------------
997 dynamically linking the service handler plugin
998 ------------------------------------------------
1000 The service handler needs some special magic to dynamically be linked into
1004 * Dynamic loader definition
1006 static struct openais_service_handler *clm_get_service_handler_ver0 (void);
1008 static struct openais_service_handler_iface_ver0 clm_service_handler_iface = {
1009 .openais_get_service_handler_ver0 = clm_get_service_handler_ver0
1012 static struct lcr_iface openais_clm_ver0[1] = {
1014 .name = "openais_clm",
1016 .versions_replace = 0,
1017 .versions_replace_count = 0,
1019 .dependency_count = 0,
1020 .constructor = NULL,
1026 static struct lcr_comp clm_comp_ver0 = {
1028 .ifaces = openais_clm_ver0
1031 static struct openais_service_handler *clm_get_service_handler_ver0 (void)
1033 return (&clm_service_handler);
1036 __attribute__ ((constructor)) static void clm_comp_register (void) {
1037 lcr_interfaces_set (&openais_clm_ver0[0], &clm_service_handler_iface);
1039 lcr_component_register (&clm_comp_ver0);
Once this code is added (substitute clm for the service being implemented),
the service will be loaded if it's in the default services list.
1045 The default service list is specified in service.c:default_services. If
1046 creating an external plugin, there are configuration parameters which may
1047 be used to add your plugin into the openais scanning of plugins.
1049 ---------------------------------
1050 Connection specific information
1051 ---------------------------------
Every connection may have specific connection information if private data
is greater than zero for the service handler. This is used to allow each
library connection to maintain private state to that connection. The private
data for a connection can be retrieved with:
	struct service_pd *service_pd = (struct service_pd *)openais_conn_private_data_get (conn);
1058 where service is the name of the service implemented and conn is the connection
1059 information likely passed into the library handler or stored in a
1060 message_source structure for later use by an executive handler.
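As a rough illustration of the pattern above, the following self-contained
sketch mimics per-connection private data. struct mock_conn,
mock_conn_init (), mock_conn_private_data_get () and struct demo_pd are
hypothetical stand-ins invented for this example. They are not the openais
implementation; they only assume that each connection owns one zeroed private
area of the size the service asked for.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-connection state for a service called "demo". */
struct demo_pd {
	int track_flags;
	unsigned int requests_seen;
};

/* Mock connection: stands in for the opaque conn pointer. */
struct mock_conn {
	void *private_data;
};

/* Mock accessor, assumed to behave like openais_conn_private_data_get (). */
static void *mock_conn_private_data_get (void *conn)
{
	return (((struct mock_conn *)conn)->private_data);
}

/* Allocate zeroed private data of the given size; returns 0 on success. */
static int mock_conn_init (struct mock_conn *conn, size_t private_data_size)
{
	conn->private_data = NULL;
	if (private_data_size > 0) {
		conn->private_data = calloc (1, private_data_size);
		if (conn->private_data == NULL) {
			return (-1);
		}
	}
	return (0);
}

/* A library handler would fetch its per-connection state like this. */
static unsigned int demo_handle_request (void *conn)
{
	struct demo_pd *demo_pd =
		(struct demo_pd *)mock_conn_private_data_get (conn);

	assert (demo_pd != NULL);
	demo_pd->requests_seen += 1;
	return (demo_pd->requests_seen);
}
```

Two connections initialised this way keep independent counters, which is the
point of private data: state for one library connection never leaks into
another.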
1062 ------------------------------
1063 sending responses to the api
1064 ------------------------------
A message is sent to the library from the executive message handler using:

	extern int openais_conn_send_response (void *conn_info, void *msg,
		int mlen);

conn_info is passed into the library message handler or stored in the
executive message. It identifies the connection to send the response to.

msg is the message to send.
mlen is the length of the message to send.
1078 Keep in mind that struct res_message should be at the beginning of the response
1079 message so that it follows the style used in the rest of openais.
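The layout rule above can be sketched as follows. This is only an
illustration: struct mock_res_header, its field names, struct
mock_res_lib_demo, the message id 1 and mock_conn_send_response () are all
assumptions made for this example, not openais definitions. The mock takes a
connection, a message and a length in that order, and simply copies the
message into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical common response header; field names are assumptions. */
struct mock_res_header {
	int size;	/* total length of the response */
	int id;		/* identifies the response type */
	int error;	/* result code, 0 meaning success here */
};

/* Hypothetical service response: the header comes first. */
struct mock_res_lib_demo {
	struct mock_res_header header;
	unsigned int node_count;
};

/* Buffer standing in for the library side of the connection. */
static char mock_wire[64];
static int mock_wire_len;

/* Mock sender: copies the message, returns 0 on success, -1 on error. */
static int mock_conn_send_response (void *conn, void *msg, int mlen)
{
	(void)conn;
	assert (msg != NULL);
	if (mlen < 0 || (size_t)mlen > sizeof (mock_wire)) {
		return (-1);
	}
	memcpy (mock_wire, msg, (size_t)mlen);
	mock_wire_len = mlen;
	return (0);
}

/* Build a response with the header filled in first, then send it. */
static int demo_send_response (void *conn, unsigned int node_count)
{
	struct mock_res_lib_demo res;

	memset (&res, 0, sizeof (res));
	res.header.size = sizeof (res);
	res.header.id = 1;	/* hypothetical message id */
	res.header.error = 0;
	res.node_count = node_count;

	return (mock_conn_send_response (conn, &res, sizeof (res)));
}
```

Because the header sits at offset zero, a receiver can read the size and id
of any response before it knows which service-specific structure follows.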
1081 --------------------------------------------
1082 deferring response to an executive message
1083 --------------------------------------------
1085 The message source structure is used to store information about the source of a
1086 message so a later executive message can respond to a library request. In
1087 a library handler, the source field should be set up with:
	message_source_set (&req_exec_ZZZZZZZ.source, conn);
	gmi_mcast (req_exec_ZZZZZZZ);

In this case conn is the connection pointer passed into the library message
handler.

Then the executive message handler determines if this processor is responsible
for sending the response:

	if (message_source_is_local (&req_exec_ZZZZZZZ->source)) {
		openais_conn_send_response ();
	}
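The deferred-response flow can be sketched end to end with mocks. Everything
prefixed mock_ or demo_ below is hypothetical and exists only for this
example, including the layout of the source record (a node id plus the
connection pointer). The sketch assumes that the executive message is
delivered to every node and that only the node that recorded the source
should answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical source record: which node and which connection asked. */
struct mock_message_source {
	unsigned int nodeid;
	void *conn;
};

/* Hypothetical executive message carrying the source with the payload. */
struct mock_req_exec_demo {
	struct mock_message_source source;
	int payload;
};

static unsigned int mock_local_nodeid = 1;	/* id of "this" processor */
static int mock_responses_sent;

static void mock_message_source_set (struct mock_message_source *source,
	void *conn)
{
	assert (source != NULL);
	source->nodeid = mock_local_nodeid;
	source->conn = conn;
}

static int mock_message_source_is_local (
	const struct mock_message_source *source)
{
	return (source->nodeid == mock_local_nodeid);
}

/* Library handler side: remember the requester before multicasting. */
static void demo_lib_handler (void *conn, struct mock_req_exec_demo *req)
{
	mock_message_source_set (&req->source, conn);
	req->payload = 42;
	/* a real handler would multicast req here */
}

/* Executive handler side: runs on every node, responds on only one. */
static void demo_exec_handler (const struct mock_req_exec_demo *req)
{
	if (mock_message_source_is_local (&req->source)) {
		/* a real handler would respond to req->source.conn here */
		mock_responses_sent += 1;
	}
}
```

Running demo_exec_handler () with a different mock_local_nodeid models
delivery on a remote node: the work is still done there, but no response is
sent.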
To send a message to every processor, including the local processor for self
delivery, according to virtual synchrony semantics, use the totempg interface.

The totempg interface supports multiple users at one time. If you need the
full totempg interface (defined in totempg.h), please ask for assistance on
the mailing list. If you simply want to use multicast transmissions in
openais, do the following:
	int res = totempg_groups_mcast_joined (openais_group_handle, &req_exec_clm_iovec, 1, TOTEMPG_AGREED);
	assert (res == 0);	/* call kept outside assert () so NDEBUG builds still multicast */
1118 Every library handler has the prototype:
1120 static int message_handler_req_clm_init (void *conn, void *msg);
1122 The start of the handler function should look something like this:
	int message_handler_req_clm_trackstart (void *conn,
		void *message)
	{
		struct req_clm_trackstart *req_clm_trackstart =
			(struct req_clm_trackstart *)message;

		{ package up library handler message into executive message }
		{ multicast message using totempg interface }
	}
This assigns the void *message to a structure that can be used by the
library handler.

The conn parameter indicates where the response should be sent.
1138 Use the tricks described in deferring a response to the executive handler to
1139 have the executive handler respond to the message.
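A minimal sketch of a library handler that only packages and multicasts is
shown below. It does not use the real totempg interface: mock_mcast () is a
hypothetical stand-in that gathers a struct iovec array into a buffer, and
the mock_req_lib_demo and mock_req_exec_demo layouts are invented for this
example.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical library request and executive message layouts. */
struct mock_req_lib_demo {
	int track_flags;
};

struct mock_req_exec_demo {
	void *conn;		/* remembered so a response can be sent later */
	int track_flags;
};

static char mock_mcast_buf[128];
static size_t mock_mcast_len;

/* Mock multicast: gathers the iovec entries into one buffer.
 * Returns 0 on success, -1 if the message does not fit. */
static int mock_mcast (const struct iovec *iov, int iov_len)
{
	size_t off = 0;
	int i;

	for (i = 0; i < iov_len; i++) {
		if (off + iov[i].iov_len > sizeof (mock_mcast_buf)) {
			return (-1);
		}
		memcpy (&mock_mcast_buf[off], iov[i].iov_base, iov[i].iov_len);
		off += iov[i].iov_len;
	}
	mock_mcast_len = off;
	return (0);
}

/* Library handler: cast, package, multicast; no real work is done here. */
static int demo_lib_handler (void *conn, void *message)
{
	struct mock_req_lib_demo *req_lib = (struct mock_req_lib_demo *)message;
	struct mock_req_exec_demo req_exec;
	struct iovec iov;

	assert (message != NULL);
	memset (&req_exec, 0, sizeof (req_exec));
	req_exec.conn = conn;
	req_exec.track_flags = req_lib->track_flags;

	iov.iov_base = &req_exec;
	iov.iov_len = sizeof (req_exec);

	return (mock_mcast (&iov, 1));
}
```

Keeping the library handler this small leaves the state changes to the
executive handler, which sees the message in the same order on every node.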
Avoid doing anything tricky in a library handler. Do all the work in the
executive handler at first, and optimize only later if it proves possible.
1148 Every executive handler has the prototype:
1150 static int message_handler_req_exec_clm_nodejoin (void *msg,
1151 unsigned int nodeid);
1153 The start of the handler function should look something like this:
	static int message_handler_req_exec_clm_nodejoin (void *msg,
		unsigned int nodeid)
	{
		struct req_exec_clm_nodejoin *req_exec_clm_nodejoin =
			(struct req_exec_clm_nodejoin *)msg;

		{ do real work of executing request, this is done on every node }
	}
1163 The conn_info structure is not available. If it is needed, it can be stored
1164 in the message sent by the library message handler in a source structure.
The msg field contains the message sent by the library handler.
1168 The nodeid is a unique node identifier of the node that originated the message.
1170 --------------------
1172 --------------------
1173 This should be used to initialize any state for the connection.
1175 --------------------
1177 --------------------
1178 This function is called every time a service connection is disconnected by
1179 the executive. Free memory, change structures, or whatever work needs to
1180 be done to clean up.
If the exit_fn cannot complete because it is waiting for some event, it may
return -1, which allows the executive to make some forward progress before
exit_fn is called again. Return 0 when the exit has completed. This is
most useful when totem should be used to queue a message, but the queue is
full. In this case, waiting a few more seconds may open up the queue, so
return -1, and the executive will call exit_fn again. Do NOT return -1
forever or the ais executive will spin.
1190 If -1 is returned, ENSURE that the state of the library hasn't changed so much that
1191 exit_fn cannot be called again. If exit_fn returns -1, it WILL be called again
1192 so expect it in the code.
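The retry contract can be sketched as below. The names are hypothetical
(struct demo_pd, mock_queue_leave_message (), demo_exit_fn ()) and the queue
is a simple counter rather than totem. The sketch only assumes what is
described above: return -1 to be called again, return 0 when finished, and
leave the state untouched when returning -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection state cleaned up by the exit function. */
struct demo_pd {
	int leave_queued;	/* set once the leave message was queued */
};

static int mock_queue_free_slots;	/* 0 means the queue is full */

/* Mock enqueue: returns 0 on success, -1 when the queue is full. */
static int mock_queue_leave_message (void)
{
	if (mock_queue_free_slots == 0) {
		return (-1);
	}
	mock_queue_free_slots -= 1;
	return (0);
}

/* Exit function sketch: safe to call repeatedly until it returns 0. */
static int demo_exit_fn (struct demo_pd *pd)
{
	assert (pd != NULL);
	if (pd->leave_queued == 0) {
		if (mock_queue_leave_message () != 0) {
			return (-1);	/* nothing changed; try again later */
		}
		pd->leave_queued = 1;
	}
	/* release remaining resources here, exactly once */
	return (0);
}
```

The leave_queued flag is what makes a second call harmless: work that has
already succeeded is skipped instead of being repeated.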
1197 This function is called whenever a configuration change occurs. Some
1198 services may not need this function, while others may. This is a good way
1199 to sync up joining nodes with the current state of the information stored
1200 on a particular processor.
1202 -------------------------------------------------------------------------------
1204 -------------------------------------------------------------------------------
1205 GDB is your friend, especially the "where" command. But it stops execution.
1206 This has a nasty side effect of killing the current configuration. In this
1207 case GDB may become your enemy.
1209 printf is your friend when GDB is your enemy.
If stuck, ask on the mailing list and send your patches. A lot of time has been
spent designing openais, and even more time debugging it. There are people
who can help you debug problems, especially around things like message
1216 Submit patches early to get feedback, especially around things like parallel
1217 style. Parallel style is very important to ensure maintainability by the
1220 If this document is wrong or incomplete, complain so we can get it fixed