fix for trie index
[csql.git] / src / usermanual / UserManual.tex
1 \documentclass[12pt]{article}
2 \usepackage{graphicx} % needed for including graphics e.g. EPS, PS
3 \topmargin -1.5cm % read Lamport p.163
4 \oddsidemargin -0.04cm % read Lamport p.163
5 \evensidemargin -0.04cm % same as oddsidemargin but for left-hand pages
6 \textwidth 16.59cm
7 \textheight 21.94cm
8 %\pagestyle{empty} % Uncomment if don't want page numbers
9 \parskip 7.2pt % sets spacing between paragraphs
10 %\renewcommand{\baselinestretch}{1.5} % Uncomment for 1.5 spacing between lines
11 \parindent 0pt % sets leading space for paragraphs
14 \begin{document}
16 \title{CSQL User Manual }
17 \author{}
18 \date{}
19 \maketitle
21 \tableofcontents
22 \pagebreak
24 % Start your text
25 \section{Introduction}
26 \label{Introduction}
28 \emph {CSQL} is compact SQL main memory database which delivers ultra high performance and throughput. It is a free software, licensed with the GNU GENERAL PUBLIC LICENSE
29 It aids in the development of high performance, fault-resilent applications requiring concurrent access to the shared data.
31 This database is built keeping ultra fast performance in mind. This database suits well as a front end database for other commercial database to increase the throughput. Throughput of queries involving single table may improve by 20x. This loads all the data into memory and avoids on demand fetches from disk and avoids the overhead of buffer manager found in commercial disk based database systems. It provides fast access through shared memory.
33 This main memory database shall be used as a cache for disk based database to improve the throughput of your exising application by 20x times without any code change.
34 \subsection{Platforms Supported}
35 \label{psupported}
36 Linux \\
37 Solaris
39 \subsection{Compilers Supported}
40 \label{csupported}
41 g++ in Linux \\
42 CC in Solaris
44 \subsection{Key Features}
45 \label{keyfeatures}
46 Single row lookup takes less than 10 microsecs (100K selects/second)
48 Single row insert, update, delete takes less than 20 microsecs (50K writes/second, 180 Million inserts/hour)
50 Memory resident architecture to give ultra fast performance
52 Embedded Architecture - database is co-located with application
54 Small Footprint
56 Multi Layered: ODBC, JDBC, SQL API, DB API
58 Supports ACI of Transaction Properties
60 Isolation Level support:
64 Protection from process failures
66 Faster lookups through Hash Indexes
68 Row Level locking for high concurrency
70 Hand written user level mutexes which avoids major recovery
72 Multi user support with authentication
74 Primitive SQL support
76 Primitive JDBC Support
79 \subsection{Architecture}
80 \label{arch}
81 Main memory databases are times faster than disk based database systems as, all the data is available
82 in physical memory and avoids the buffer manager overhead which is found in disk based database systems.
83 Moreover data structures and algorithms can work more efficiently as compared to traditional disk based algorithms. \\
85 CSQL is compact main memory database SQL Engine which supports limited set of features and gives ultra
86 fast response to database queries. It supports only limited features which is used by most of the real time
87 applications which includes INSERT, UPDATE, DELETE, SELECT on single table with local predicates.
88 The primary difference between traditional database system and main memory database system is that in the
89 latter the whole data is stored in main memory.It differs in \\
91 Indexing - A conventional index stores a data value followed by the address at which the record can be
92 located. When all the data is in memory anyway, there is no need to store the data value
93 redundantly, only the address is required. \\
95 Caching/buffer management - no longer exists because all the data is in memory already. All of the buffer or cache
96 processing as well as disk latency associated with moving data from disk to
97 memory (and vice versa) are removed. \\
99 Optimization - is still required but becomes much simpler and is more likely to choose the fastest path,
100 since the number of alternatives is much less than with a disk-based database \\
102 CSQL is developed keeping one thing in mind - performance. It uses shared memory architecure where all
103 applications access the database though direct pointer referencing. Data access is synchronized using mutexes
104 for internal structures and through row level locks for rows. The is absolutely no disk I/O for any DML operation,
105 only for writing trace log messages disk is used. Rest all are handled and processed in-memory. \\
107 \subsection{Compiling the source:}
108 \label{compsrc}
109 Go to the root directory and enter the following commands.
110 \begin{verbatim}
111 $./configure --prefix=`pwd`/install
112 $make
113 $make install
114 \end{verbatim}
116 This will create "install" directory under the current directory and places all the executables created in bin directory and libraries in lib directory. \\
118 \subsection{Generating API Reference}
119 \label{genAPIref}
120 Go to the root directory and enter
121 \begin{verbatim}
122 $doxygen
123 \end{verbatim}
125 This will create "docs/html" directory under which API Reference html files are stored. Refer index.html in that directory.
127 \section{Getting Started}
128 \label{getstarted}
129 \subsection{Starting the Server}
130 \label{startserver}
131 You should find csqlserver executable under the bin directory of the installation.
132 Change csql.conf file under the installation root directory if necessary.
134 Set the \textbf {CSQL\_INSTALL\_ROOT } environment variable to absolute path of installation root directory. For example if csql is installed in /opt/csql, enter
136 \begin{verbatim}
137 $export CSQL_INSTALL_ROOT=/opt/csql
138 \end{verbatim}
140 Set the \textbf{CSQL\_CONFIG\_FILE} environment variable to absolute path of csql.conf file.
142 \begin{verbatim}
144 \end{verbatim}
146 Set the \textbf {LD\_LIBRARY\_PATH} environment variable to absolute path $CSQL\_INSTALL\_ROOT/lib
148 \begin{verbatim}
150 \end{verbatim}
152 Note: Alternatively, setupenv.ksh script in installation directory shall be run to set all the above environment variables.
154 You should find csqlserver executable under the bin directory of the installation. Running that command will start the server.
155 \begin{verbatim}
156 $./csqlserver
157 \end{verbatim}
159 This starts the server and creates the database file.
161 \subsection{Shutting down the Server}
162 \label{shutserver}
163 Pressing Ctrl-C on the terminal where csqlserver is running, is safe and will stop the server gracefully by removing the database file.
165 \subsection{Creating tables}
166 \label{creatingtables}
167 Tables and Indexes can be created though interactive sql tool named csql.
168 To create table t1 with two fields integer and char
169 \begin{verbatim}
170 $csql
171 CSQL>create table t1 (f1 int, f2 char (20), primary key (f1));
172 \end{verbatim}
173 If you see "Statement Executed" message after you enter the above command, it means the table is created successfully.
175 \subsection{Inserting data into tables}
176 \label{insertingdata}
177 Rows shall be inserted using INSERT SQL statement through csql tool
178 \begin{verbatim}
179 $csql
180 CSQL>insert into t1 values (10, 'Value');
182 \end{verbatim}
183 If you see "Statement Executed: Rows Affected = 1 " message after you enter the above command, it means the row is inserted successfully.
185 \subsection{Selecting data from tables}
186 \label{selectdata}
188 \begin{verbatim}
189 $csql
190 CSQL>select * from t1;
191 \end{verbatim}
192 It will display all the rows in the table t1
195 \subsection{Deleting data from tables}
196 \label{deletedata}
198 \begin{verbatim}
199 $csql
200 CSQL>delete from t1;
201 \end{verbatim}
202 It will delete all the rows in the table t1
203 \section{API Interfaces}
205 \subsection{SQL Support}
206 \label{SQL Syntax Support:}
208 \begin{verbatim}
209 CREATE TABLE {tablename} ({fielddefinitionlist}[, PRIMARY KEY ({fieldname})])
210 fielddefinitionlist shall be fieldname type [(size)] [NOT NULL]
212 CREATE INDEX {idxname} ON {tablename}({fieldname}) [HASH] [UNIQUE] [PRIMARY]
214 DROP TABLE {tablename}
216 DROP INDEX {indexname}
218 INSERT INTO {tablename} [fieldNameList] VALUES ({valuelist})
220 UPDATE {tablename} SET {fldname=value ,...} [WHERE {condition}]
222 DELETE FROM {tablename} [WHERE {condition}]
224 SELECT {* | fieldNameList} FROM {tablename} [WHERE {condition}]
225 condition shall be [NOT] {predicate} {AND | OR } {predicate}
226 predicate shall be {fieldName} {"=, !=, >,<, >=, <="} [fieldName | value]
227 LIKE {format}, BETWEEN {value} AND {value} , IN {list}
228 \end{verbatim}
230 DataType support:
234 CHAR
237 \subsection{JDBC Interface }
238 \label{JDBC Interface }
239 \subsubsection{Supported Interfaces}
240 \label{supportedInterfaces}
241 \begin{verbatim}
242 Connection:: connect, close
243 Statement:: execute
244 PreparedStatement:: execute, params
245 ResultSetMetaData::
246 ResultSet:: Forward only
247 \end{verbatim}
250 DataType Support:
251 All Primitive types
252 Date, Time, TimeStamp
254 \subsubsection{Connecting from jdbc}
255 \label{jdbcconnect}
256 The default username is root and password is manager for the csql database system. Below code snippet shows how to connect to csql database from jdbc applications \\
257 \begin{verbatim}
258 Class.forName("csql.jdbc.JdbcSqlDriver");
259 Connection con = DriverManager.getConnection("jdbc:csql", "root", "manager");
260 \end{verbatim}
262 \subsubsection{Inserting data through jdbc}
263 \label{jdbcinsert}
264 The below code snippet inserts into table t1 with two fields, f1 integer and f2 char \\
265 \begin{verbatim}
266 PreparedStatement stmt = null;
267 stmt = con.prepareStatement("INSERT INTO T1 (f1, f2) VALUES (?, ?);");
268 int ret =0;
269 for (int i =0 ; i< 10 ; i++) {
270 stmt.setInt(1, i);
271 stmt.setString(2, String.valueOf(i+100));
272 ret = stmt.executeUpdate();
273 if (ret != 1) break; //error
275 stmt.close();
276 con.commit();
277 \end{verbatim}
279 \subsection{SQLAPI }
280 \label{SQLAPI }
282 \begin{verbatim}
283 SqlConnection :: connect, disconnect, commit, rollback
285 SqlStatement:: prepare, execute, bindField, bindParam,
286 fetch, set[Type]Param, getProjFldInfo, getParamFldInfo
288 FieldInfo :: Meta data information for params and projection fields.
289 \end{verbatim}
292 \subsection{DB API}
293 \label{dbapi}
294 This allows user process and threads to access or manipulate the database. Main interfaces are Connection, DatabaseManager, Table, etc.
295 Connection provides interface to connect and disconnect to the database file
296 DatabaseManager provides interface to create and drop database objects including tables and indexes.
297 Table provides interface to insert, update, delete and fetch tuples.
299 Refer API reference under the directory docs/html. (If there is no html, you shall generate it by yourself using doxygen tools. Refer the previous section for this.)
301 \subsubsection{Getting a Connection}
302 \label{getconn}
303 Connection interface is the heart of all the interfaces as it is the entry point for database access and it provides interface for transaction commit/rollback.
305 You can obtain a connection to database using the following code snippet:
307 \begin{verbatim}
308 Connection conn;
309 DbRetVal rv = OK;
310 rv ="root", ",manager");
311 if (rv!= OK) return -1;
313 \end{verbatim}
315 \subsubsection{Creating Tables}
316 \label{createtable}
317 Database Manager provides interface for table creation and deletion. We shall obtain the DatabaseManager object from the connection object. The table or schema definition is encapsulated in TableDef interface. It provides methods to specify the field definition of the table.
318 For example to create table with two fields,
319 \begin{verbatim}
320 DatabaseManager *dbMgr = conn.getDatabaseManager();
321 if (dbMgr == NULL) { printf("Bad connection \n"); return -1;}
322 TableDef tabDef;
323 tabDef.addField("f1", typeInt, 0, NULL, true);
324 tabDef.addField("f2", typeString, 196);
325 rv = dbMgr->createTable("t1", tabDef);
326 if (rv != OK) { printf("Table creation failed\n"); return -1; }
327 \end{verbatim}
329 First parameter of addField method is field name, second is the type of the field, third parameter is length, fourth is default value, fifth is not null flag and last is primary key flag.
330 In our example, field "f1" is not null and it is the primary key for the table.
331 Call addField for all the field definition in the table and call createTable passing table name as first parameter and TableDef as second parameter.
333 \subsubsection{Inserting into the tables}
334 \label{inserttable}
335 Any DML operation requires a transaction to be started. All the operations happen within the context of this transaction.
336 Application buffer should first be binded to the respective fields using the bindFld method as mentioned in the below example; insertTuple method will pick values from the binded application buffer and creates a new row or tuple in the table.
337 \begin{verbatim}
338 Table *table = dbMgr->openTable("t1");
339 if (table == NULL) { printf("Unable to open table\n"); return -1; }
340 int id = 100;
341 char name[196] = "Tuticorin";
342 table->bindFld("f1", &id);
343 table->bindFld("f2", name);
344 conn.startTransaction();
345 ret = table->insertTuple();
346 if (ret != OK) { printf("Unable to insert tuple\n"); return -1; }
347 conn.commit();
348 \end{verbatim}
350 \subsubsection{Selecting tuples from the tables}
351 \label{seltable}
352 Application buffer should first be binded to the respective fields using the bindFld() method as we did for insert; fetch() method will copy the values from the row to the binded application buffer.
353 Predicate if required shall be created using the condition interface and set in the table object before the execution. Calling execute() method is must before calling the fetch() method, as it will create the execution plan for the scan. It selects appropriate index based on the predicate set.
355 \begin{verbatim}
356 Condition p1;
357 int val1 = 100;
358 p1.setTerm("f1", OpEquals, &val1);
359 table->setCondition(p1);
360 int id = 100;
361 char name[196] = "Bangalore";
362 table->bindFld("f1", &id);
363 table->bindFld("f2", name);
364 table->execute();
365 void *tuple = (char*)table->fetch() ;
366 if (tuple == NULL) {printf(" No tuple found \n" ); table->close();return -1;}
367 printf("tuple value is %d %s \n",id ,name);
368 table->close();
369 \end{verbatim}
371 \subsubsection{Updating or Deleting tuples}
372 \label{updtable}
373 This operation is allowed always on existing scan. When this method is called the current tuple in the scan in either updated with application buffer values which are binded, or gets deleted based on the method called.
374 For example:
375 \begin{verbatim}
376 tuple = (char*)table->fetch() ;
377 if (tuple == NULL) {printf(" No tuple found \n" ); table->close();return -1;}
378 strcpy(name, "50576543210"); //update the value
379 table->updateTuple();
380 \end{verbatim}
382 \section{Configuration}
383 \label{confFile}
384 \subsection{csql.conf}
385 \label{csqlconf}
386 Configuration is read from environment variable \begin{verbatim}CSQL_CONFIG_FILE \end{verbatim}. A default configuration is available at the src root directory with name csql.conf.
388 \subsubsection{Server Section}
389 \label{serversection}
390 This contains parameters which affect the functioning of the csql server process. For Server section parameters, make sure that the value is same for the server process and all the csql client process which connects to it. otherwise, behavior is undefined
392 \begin{verbatim} PAGE_SIZE \end{verbatim}
393 Default Value :8192 \\
394 Description: Each database is logically divided into pages and allocation happens in unit of pages. This defines the size of the allocation unit. Increasing this value will reduce frequent allocation of pages \\
397 \begin{verbatim} MAX_TRANS \end{verbatim}
398 Default Value :100 \\
399 Description: Total number of active transactions that can run concurrently in the system \\
401 \begin{verbatim} MAX_PROCS \end{verbatim}
402 Default Value :100 \\
403 Description: Total number of client process which can connect and work with the database concurrently \\
405 \begin{verbatim} MAX_SYS_DB_SIZE \end{verbatim}
406 Default Value :1048576 bytes \\
407 Description: Maximum size of the system database. Catalog information and control information(lock, logs, etc) are stored in this internal database. \\
409 \begin{verbatim} MAX_DB_SIZE \end{verbatim}
410 Default Value :10485760 bytes \\
411 Description: Maximum size of the user database. Rows and Index information of tables are stored in this database.\\
414 \begin{verbatim} SYS_DB_KEY \end{verbatim}
415 Default Value :2222 \\
416 Description: Shared memory key to be used by the system to create and locate system database. Change this value if u want to run two csql instances in same host \\
418 \begin{verbatim} USER_DB_KEY \end{verbatim}
419 Default Value :5555 \\
420 Description: Shared memory key to be used by the system to create and locate user database \\
422 \begin{verbatim} LOG_FILE \end{verbatim}
423 Default Value :/tmp/log/log.out \\
424 Description: Give full path for the log file. This is where important system actions are stored for tracing application events\\
427 \begin{verbatim} MAP_ADDRESS \end{verbatim}
428 Default Value :400000000 \\
429 Description: The virtual memory start address at which the shared memory segment will be created and attached\\
431 \subsubsection{Client Section}
432 \label{serversection}
433 This contains parameters which affect the functioning of the client process which connects to the database.
435 \begin{verbatim} MUTEX_TIMEOUT_SECS \end{verbatim}
436 Default Value : 0 \\
437 Description: Mutex interval in seconds
439 \begin{verbatim} MUTEX_TIMEOUT_USECS \end{verbatim}
440 Default Value : 1000 \\
441 Description: Mutex interval in milli seconds
443 \begin{verbatim} MUTEX_TIMEOUT_RETRIES \end{verbatim}
444 Default Value : 10 \\
445 Description: Mutex retry count in case it fails to acquire
448 \begin{verbatim} LOCK_TIMEOUT_SECS \end{verbatim}
449 Default Value : 0 \\
450 Description: Lock interval in seconds
452 \begin{verbatim} LOCK_TIMEOUT_USECS \end{verbatim}
453 Default Value : 1000 \\
454 Description: Lock interval in milli seconds
456 \begin{verbatim} LOCK_TIMEOUT_RETRIES \end{verbatim}
457 Default Value : 10 \\
458 Description: Lock retry count in case it fails to acquire
460 \section{Tools}
461 \label{tools}
462 \subsection{csql tool}
463 \label{csqltool}
464 \begin{verbatim}
465 csql is a command line tool for interactive SQL access to database.
467 Usage: csql [-u username] [-p passwd] [-s sqlfile]
468 username -> username to connect to database
469 password -> password to connect to database
470 sqlfile -> filename containing sql statements
471 \end{verbatim}
473 Example:
474 \begin{verbatim}
475 $csql -u root -p manager -s /tmp/import.sql
476 \end{verbatim}
478 \subsection{catalog tool}
479 \label{catalogtool}
480 \begin{verbatim}
481 catalog is a command line tool used to display the internal control structures of
482 csql database engine. It will print the information in XML format.
484 Usage: catalog [-u username] [-p passwd] [-l] [-i] [-d] [-T table] [-I index]
485 [-D <lock|trans|proc>]
486 l -> list all table with field information
487 i -> reinitialize catalog tables. Drops all tables.
488 d -> print db usage statistics
489 T -> list table information
490 I -> list index information
491 D -> print debug information for system tables
492 Note: If multiple options are specified, last one will be considered.
493 \end{verbatim}
495 Example:
496 \begin{verbatim}
497 $catalog -u root -p manager -T table1
498 \end{verbatim}
499 Sample output:
500 \begin{verbatim}
501 <Table Info>
502 <TableName> table1 </TableName>
503 <TupleCount> 0 </TupleCount>
504 <PagesUsed> 1 </PagesUsed>
505 <SpaceUsed> 16 </SpaceUsed>
506 <Indexes> 1 <Indexes>
507 <TupleLength> 20 </TupleLength>
508 <Fields> 2 </Fields>
509 <Indexes>
510 <IndexName> t1_idx1_Primary </IndexName>
511 </Indexes>
512 </Table Info>
513 \end{verbatim}
515 \section{Common Error Messages}
516 \label{errormessages}
517 \textbf {Message: Shared memory open failed} \\
518 Check whether the csqlserver process is running \\
520 \textbf {Message: Unable to create log file or Unable to start the logger}\\
521 Check whether the csqlserver process is running \\
522 Check whether log file directory exists which is specified in csql.conf \\
523 For default configuration /tmp/log directory should exist in the system.
525 \textbf {Message: Shared memory create failed}\\
526 Check the system limit for max shared memory segment \\
527 Run the following command to increase the limit to 1 GB \\
528 /sbin/sysctl -w kernel.shmmax=1000000000
530 % Stop your text
531 \end{document}