2 @setfilename ../../info/semantic
3 @set TITLE Semantic Manual
4 @set AUTHOR Eric M. Ludlam and David Ponce
5 @settitle @value{TITLE}
7 @c *************************************************************************
9 @c *************************************************************************
11 @c Merge all indexes into a single index for now.
12 @c We can always separate them later into two or more as needed.
19 @c @footnotestyle separate
25 This manual documents the Semantic library and utilities.
27 Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007,
28 2009 Free Software Foundation, Inc.
31 Permission is granted to copy, distribute and/or modify this document
32 under the terms of the GNU Free Documentation License, Version 1.3 or
33 any later version published by the Free Software Foundation; with no
34 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
35 and with the Back-Cover Texts as in (a) below. A copy of the license
36 is included in the section entitled ``GNU Free Documentation License.''
38 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
39 modify this GNU manual. Buying copies from the FSF supports it in
40 developing GNU and promoting software freedom.''
47 * Semantic: (semantic). Source code parser library and utilities.
53 @center @titlefont{Semantic}
55 @center by @value{AUTHOR}
68 @macro obsolete{old,new}
70 @strong{Compatibility}:
71 @code{\new\} introduced in @semantic{} version 2.0 supersedes
72 @code{\old\} which is now obsolete.
75 @c *************************************************************************
77 @c *************************************************************************
83 @semantic{} is a suite of Emacs libraries and utilities for parsing
84 source code. At its core is a lexical analyzer and two parser
85 generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
86 @semantic{} provides a variety of tools for making use of the parser
87 output, including user commands for code navigation and completion, as
88 well as enhancements for imenu, speedbar, whichfunc, eldoc,
89 hippie-expand, and several other parts of Emacs.
91 To send bug reports or participate in discussions about @semantic{},
92 use the mailing list cedet-semantic@@sourceforge.net via the URL:
93 @url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}
102 * Semantic Internals::
104 * GNU Free Documentation License::
109 @chapter Introduction
111 This chapter gives an overview of @semantic{} and its goals.
113 Ordinarily, Emacs uses regular expressions (and syntax tables) to
114 analyze source code for purposes such as syntax highlighting. This
115 approach, though simple and efficient, has its limitations: roughly
116 speaking, it only ``guesses'' the meaning of each piece of source code
117 in the context of the programming language, instead of rigorously
118 ``understanding'' it.
120 @semantic{} provides a new infrastructure to analyze source code using
121 @dfn{parsers} instead of regular expressions. It contains two
122 built-in parser generators (an @acronym{LL} generator named
123 @code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
124 both written in Emacs Lisp), and parsers for several common
125 programming languages. It can also make use of @dfn{external
126 parsers}---programs such as GNU Global and GNU IDUtils.
128 @semantic{} provides a uniform, language-independent @acronym{API} for
129 accessing the parser output. This output can be used by other Emacs
130 Lisp programs to implement ``syntax-aware'' behavior. @semantic{}
131 itself includes several such utilities, including user-level Emacs
132 commands for navigating, searching, and completing source code.
134 The following diagram illustrates the structure of the @semantic{}
139 The words in all capitals are those that @semantic{} itself provides.
140 Others are current or future languages or applications that are not
141 distributed along with @semantic{}.
150 +---------------+ +--------+ +--------+
151 C --->| C PARSER |--->| | | |
152 +---------------+ | | | |
153 +---------------+ | COMMON | | COMMON |<--- SPEEDBAR
154 Java --->| JAVA PARSER |--->| PARSE | | |
155 +---------------+ | TREE | | PARSE |<--- SEMANTICDB
156 +---------------+ | FORMAT | | API |<--- ecb
157 Scheme --->| SCHEME PARSER |--->| | | |
158 +---------------+ | | | |
159 +---------------+ | | | |
160 Texinfo --->| TEXI. PARSER |--->| | | |
161 +---------------+ | | | |
165 +---------------+ | | | |<--- app. 1
166 Lang. A --->| A Parser |--->| | | |
167 +---------------+ | | | |<--- app. 2
168 +---------------+ | | | |
169 Lang. B --->| B Parser |--->| | | |<--- app. 3
170 +---------------+ | | | |
174 +---------------+ | | | |
175 Lang. Y --->| Y Parser |--->| | | |<--- app. ?
176 +---------------+ | | | |
177 +---------------+ | | | |<--- app. ?
178 Lang. Z --->| Z Parser |--->| | | |
179 +---------------+ +--------+ +--------+
183 * Semantic Components::
186 @node Semantic Components
187 @section Semantic Components
189 In this section, we provide a more detailed description of the major
190 components of @semantic{}, and how they interact with one another.
192 The first step in parsing a source code file is to break it up into
193 its fundamental components. This step is called lexical analysis:
196 syntax table, keywords list, and options
200 input file ----> Lexer ----> token stream
204 The output of the lexical analyzer is a list of tokens that make up
205 the file. The next step is the actual parsing, shown below:
211 token stream ---> Parser ----> parse tree
215 The end result, the parse tree, is @semantic{}'s internal
216 representation of the parsed source code. @semantic{} provides an
217 @acronym{API} for Emacs Lisp programs to access the parse tree.
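As a minimal sketch of what using this @acronym{API} can look like, the
following Emacs Lisp summarizes a list of tags as name and class pairs.
The helper name @code{my-semantic-summarize-tags} is hypothetical.  The
sketch assumes that @code{semantic-fetch-tags}, @code{semantic-tag-name}
and @code{semantic-tag-class} are available as in the @semantic{}
bundled with Emacs, and that the current buffer has a working parser.

@example
(require 'semantic)

(defun my-semantic-summarize-tags (&optional tags)
  "Return a list of (NAME . CLASS) pairs for TAGS.
When TAGS is nil, use the top-level tags of the current buffer.
This is an illustrative helper, not part of Semantic."
  (mapcar (lambda (tag)
            (cons (semantic-tag-name tag)
                  (semantic-tag-class tag)))
          (or tags (semantic-fetch-tags))))
@end example

Called with no argument in a buffer where @semantic{} is active, the
helper asks the parser for the buffer's tags; called with an explicit
tag list, it only reads the tags it is given.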
219 Parsing large files can take several seconds or more. By default,
220 @semantic{} automatically caches parse trees by saving them in your
221 @file{.emacs.d} directory. When you revisit a previously-parsed file,
222 the parse tree is automatically reloaded from this cache, to save
223 time. @xref{SemanticDB}.
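A minimal configuration sketch for this cache follows.  It assumes the
@semantic{} bundled with Emacs, where @code{semantic-mode} turns on the
database submode by default and the variable
@code{semanticdb-default-save-directory} names the cache directory.  The
directory chosen below is only an illustration.

@example
(require 'semantic)

;; Hypothetical location for cached parse trees.
(setq semanticdb-default-save-directory
      (expand-file-name "semanticdb" user-emacs-directory))

;; Turn on Semantic.  The database submode is assumed to be among
;; the entries of `semantic-default-submodes'.
(semantic-mode 1)
@end example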
226 @chapter Using Semantic
228 @include sem-user.texi
230 @node Semantic Internals
231 @chapter Semantic Internals
233 This chapter provides an overview of the internals of @semantic{}.
234 This information is not needed by either application developers or
235 grammar developers.
237 It is mostly useful for hackers who would like to learn
238 more about how @semantic{} works.
241 * Parser code :: Code used for the parsers
242 * Tag handling :: Code used for manipulating tags
243 * Semanticdb Internals :: Code used in the semantic database
244 * Analyzer Internals :: Code used in the code analyzer
245 * Tools :: Code used in user tools
246 * Tests :: Code used for testing
252 @semantic{} parsing code is spread across a range of files.
256 The core infrastructure sets up buffers for parsing, and has all the
257 core parsing routines. Most parsing routines are overloadable, so the
258 actual implementation may be somewhere else.
260 @item semantic-edit.el
261 Incremental reparse based on user edits.
263 @item semantic-grammar.el
264 @itemx semantic-grammar.wy
265 Parser for the different grammar languages, and a major mode for
266 editing grammars in Emacs.
268 @item semantic-lex.el
269 Infrastructure for implementing lexical analyzers. Provides macros
270 for creating individual analyzers for specific features, and a way to
271 combine them together.
273 @item semantic-lex-spp.el
274 Infrastructure for a lexical symbolic preprocessor. This was written
275 to implement the C preprocessor, but could be used for other lexical
278 @item bovine/bovine-grammar.el
279 @itemx bovine/bovine-grammar-macros.el
280 @itemx bovine/semantic-bovine.el
281 The ``bovine'' grammar. This is the first grammar mode written for
282 @semantic{} and is useful for creating simple parsers.
284 @item wisent/wisent.el
285 @itemx wisent/bison-wisent.el
286 @itemx wisent/semantic-wisent.el
287 @itemx wisent/semantic-debug-grammar.el
288 A port of bison to Emacs. This infrastructure lets you create LALR
289 based parsers for @semantic{}.
291 @item semantic-ast.el
292 Manage Abstract Syntax Trees for parsers.
294 @item semantic-debug.el
295 Infrastructure for debugging grammars.
297 @item semantic-util.el
298 Various utilities for manipulating tags, such as describing the tag
299 under point, adding labels, and the all important
300 @code{semantic-something-to-tag-table}.
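To make the lexer output concrete, here is a small sketch of how
lexical tokens can be inspected.  It assumes the token constructor and
accessors @code{semantic-lex-token}, @code{semantic-lex-token-class},
@code{semantic-lex-token-start} and @code{semantic-lex-token-end} from
the Emacs-bundled library (loaded there as @code{semantic/lex}).  The
token stream is built by hand rather than produced by a real lexer, and
the names starting with @code{my-demo-} are hypothetical.

@example
(require 'semantic/lex)

(defun my-demo-token-texts (tokens source)
  "Return a (CLASS . TEXT) pair for each token in TOKENS.
SOURCE is a string holding the text the tokens refer to.
Token bounds are 1-based buffer positions, so shift them by one
before indexing into SOURCE."
  (mapcar (lambda (tok)
            (cons (semantic-lex-token-class tok)
                  (substring source
                             (1- (semantic-lex-token-start tok))
                             (1- (semantic-lex-token-end tok)))))
          tokens))

(defun my-demo-token-stream ()
  "Return a hand-made token stream for the text \"foo=1\"."
  (list (semantic-lex-token 'symbol 1 4)
        (semantic-lex-token 'punctuation 4 5)
        (semantic-lex-token 'number 5 6)))
@end example

For example, @code{(my-demo-token-texts (my-demo-token-stream) "foo=1")}
pairs each token class with the characters it covers.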
305 @section Tag handling
307 A tag represents an individual item found in a buffer, such as a
308 function or variable. Tag handling is implemented in several source files.
312 @item semantic-tag.el
313 Basic tag creation, queries, cloning, binding, and unbinding.
315 @item semantic-tag-write.el
316 Write a tag or tag list to a stream. These routines are used by
317 @file{semanticdb-file.el} when saving a list of tags.
319 @item semantic-tag-file.el
320 Files associated with tags. Goto-tag, file for include, and file for
323 @item semantic-tag-ls.el
324 Language-dependent features of a tag, such as parent calculation, slot
325 protection, and other states like abstract, virtual, static, and leaf.
327 @item semantic-dep.el
328 Include file handling. Contains the include path concepts, and
329 routines for looking up file names in the include path.
331 @item semantic-format.el
332 Convert a tag into a nicely formatted and colored string. Use
333 @code{semantic-test-all-format-tag-functions} to test different output
336 @item semantic-find.el
337 Find tags matching different conditions in a tag table.
338 These routines are used by @file{semanticdb-find.el} once the database
339 has been converted into a simpler tag table.
341 @item semantic-sort.el
342 Sorting lists of tags in different ways. Includes sorting a plain
343 list of tags forward or backward. Includes binning tags based on
344 attributes (bucketize), and tag adoption for multiple references to
347 @item semantic-doc.el
348 Capture documentation comments from near a tag.
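The tag routines listed above can be combined as in the following
sketch, which creates a few tags by hand, filters them by class, and
sorts the result.  It assumes the Emacs-bundled library names
(@code{semantic/find} and @code{semantic/sort}) and the functions
@code{semantic-tag-new-function}, @code{semantic-tag-new-variable},
@code{semantic-find-tags-by-class} and
@code{semantic-sort-tags-by-name-increasing}.  The @code{my-demo-}
helpers and the tag data are hypothetical.

@example
(require 'semantic)
(require 'semantic/find)
(require 'semantic/sort)

(defun my-demo-make-tags ()
  "Build a small hand-made tag table (illustrative data only)."
  (list (semantic-tag-new-function "main" "int" nil)
        (semantic-tag-new-variable "counter" "int")
        (semantic-tag-new-function "helper" "void" nil)))

(defun my-demo-function-names (tag-table)
  "Return the names of the function tags in TAG-TABLE, sorted by name."
  (mapcar #'semantic-tag-name
          (semantic-sort-tags-by-name-increasing
           (semantic-find-tags-by-class 'function tag-table))))
@end example

In real use the tag table would normally come from the parser or from
the database rather than from hand-built tags.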
352 @node Semanticdb Internals
353 @section Semanticdb Internals
355 @acronym{Semanticdb} complexity is certainly an issue. It is a rather
356 hairy problem to try and solve.
360 Defines a @dfn{database} and a @dfn{table} base class. You can
361 instantiate these classes, and use them, but they are not persistent.
363 This file also provides support for @code{semanticdb-minor-mode},
364 which automatically associates files with tables in databases so that
365 tags are @emph{saved} while a buffer is not in memory.
367 The database and tables both also provide application cache information
368 and a cache flushing system. The semanticdb search routines use caches
369 to save data structures that are complex to calculate.
371 Lastly, it provides the concept of @dfn{project root}. It is a system
372 by which a file can be associated with the root of a project, so if
373 you have a tree of directories and source files, it can find the root,
374 and allow a tag-search to span all available databases in that
377 @item semanticdb-file.el
378 Provides a subclass of the basic table so that it can be saved to
379 disk. Implements all the code needed to unbind/rebind tags to a
380 buffer and write them to a file.
382 @item semanticdb-el.el
383 Implements a special kind of @dfn{system} database that uses Emacs
384 internals to perform queries.
386 @item semanticdb-ebrowse.el
387 Implements a system database that uses Ebrowse to parse files into a
388 table that can be queried for tag names. Successful tag hits during a
389 find cause @semantic{} to pick up and parse the referenced files to
390 get the full details.
392 @item semanticdb-find.el
393 Infrastructure for searching groups of @semantic{} databases, and dealing
394 with the search results format.
396 @item semanticdb-ref.el
397 Tracks cross-references. Cross-references are needed when a buffer is
398 reparsed, and must alert other tables that any dependent caches may
399 need to be flushed. References are in the form of include files.
403 @node Analyzer Internals
404 @section Analyzer Internals
406 The @semantic{} analyzer is a complex engine which has been broken
407 down across several modules. When the @semantic{} analyzer fails,
408 start with @code{semantic-analyze-debug-assist}, then dive into some
412 @item semantic-analyze.el
413 The core analyzer for defining the @dfn{current context}. The
414 current context is an object that contains references to aspects of
415 the local context including the current prefix, and a tag list
416 defining what the prefix means.
418 @item semantic-analyze-complete.el
419 Provides @code{semantic-analyze-possible-completions}.
421 @item semantic-analyze-debug.el
422 The analyzer debugger. Useful when attempting to get everything
425 @item semantic-analyze-fcn.el
426 Various support functions needed by the analyzer.
428 @item semantic-ctxt.el
429 Local context parser. Contains overloadable functions used to move
430 around through different scopes, get local variables, and collect the
431 current prefix used when doing completion.
433 @item semantic-scope.el
434 Calculate @dfn{scope} for a location in a buffer. The scope includes
435 local variables, and tag lists in scope for various reasons, such as
436 C++ using statements.
438 @item semanticdb-typecache.el
439 The typecache is part of @code{semanticdb}, but is used primarily by
440 the analyzer to look up datatypes and complex names. The typecache is
441 bound across source files and builds a master lookup table for data
445 Interactive Analyzer functions. Simple routines that do completion or
446 lookups based on the results from the Analyzer. These routines are
447 meant as examples for application writers, but are quite useful as
450 @item semantic-ia-sb.el
451 Speedbar support for the analyzer, displaying context info, and
459 These files contain various tools a user can use.
462 @item semantic-idle.el
463 Idle scheduler for @semantic{}. Manages reparsing buffers after
464 edits, and large work tasks in idle time. Includes modes for showing
465 summary help and pop-up completion.
468 The @semantic{} navigator. Provides many ways to move through a
469 buffer based on the active tag table.
471 @item semantic-decorate.el
472 A minor mode for decorating tags based on details from the parser.
473 Includes overlines for functions, or coloring class fields based on
476 @item semantic-decorate-include.el
477 A decoration mode for include files, which assists users in setting up
478 parsing for their includes.
480 @item semantic-complete.el
481 Advanced completion prompts for reading tag names in the minibuffer, or
484 @item semantic-imenu.el
485 Imenu support for using @semantic{} tags in imenu.
487 @item semantic-mru-bookmark.el
488 Automatic bookmarking based on tags. Jump to locations you've been
489 before based on tag name.
492 Support for @semantic{} tag usage in Speedbar.
494 @item semantic-util-modes.el
495 A bunch of small minor modes that expose aspects of the semantic
496 parser state. Includes @code{semantic-stickyfunc-mode}.
499 @itemx document-vars.el
500 Create and update comments for tags.
502 @item semantic-adebug.el
503 Extensions of @file{data-debug.el} for @semantic{}.
505 @item semantic-chart.el
506 Draw some charts from stats generated from parsing.
509 @item semantic-elp.el
510 Profiler for helping to optimize the @semantic{} analyzer.
520 @item semantic-utest.el
521 Basic testing of parsing and incremental parsing for most supported
524 @item semantic-ia-utest.el
525 Test the semantic analyzer's ability to provide smart completions.
527 @item semantic-utest-c.el
528 Tests for the C parser's lexical pre-processor.
530 @item semantic-regtest.el
531 Regression tests from the older Semantic 1.x API.
540 In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
541 grammar file used for the 1.4 parser generator. This was a play on
542 Backus-Naur Form, which proved too confusing.
545 A verb representing what happens when a bovine parser parses a file.
548 In a bovine, or LL parser, the bovine lambda is a function to execute
549 when a specific set of match rules has succeeded in matching text from
553 A parser using the bovine parser generator. It is an LL parser
554 suitable for small simple languages.
561 A program which converts text into a stream of tokens by analyzing
562 them lexically. Lexers will commonly create strings, symbols,
563 keywords and punctuation, and strip whitespace and comments.
568 A nonterminal symbol or simply a nonterminal stands for a class of
569 syntactically equivalent groupings. A nonterminal symbol name is used
570 in writing grammar rules.
573 Some functions are defined via @code{define-overload}.
574 These can be overloaded via ....
577 A program that converts @b{tokens} to @b{tags}.
580 A tag is a representation of some entity in a language file, such as a
581 function, variable, or include statement. In semantic, the word tag is
582 used the same way it is used for the etags or ctags tools.
584 A tag is usually bound to a buffer region via overlay, or it just
585 specifies character locations in a file.
588 A single atomic item returned from a lexer. It represents some set
589 of characters found in a buffer.
592 The output of the lexer as well as the input to the parser.
595 A parser using the wisent parser generator. It is a port of bison to
596 Emacs Lisp. It is an LALR parser suitable for complex languages.
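The glossary entry on overloading leaves the override mechanism
unstated.  As one illustration only, and as an assumption about what is
meant, the @file{mode-local.el} library shipped with Emacs provides
@code{define-overloadable-function} and
@code{define-mode-local-override}.  The sketch below uses them to
define a function with a default body and a per-mode override.  All
@code{my-demo-} names are hypothetical.

@example
(require 'mode-local)

(define-overloadable-function my-demo-comment-prefix ()
  "Return a comment prefix for the current major mode.
The body below serves as the default implementation."
  "#")

(define-mode-local-override my-demo-comment-prefix emacs-lisp-mode ()
  "Emacs Lisp override for `my-demo-comment-prefix'."
  ";;")

(defun my-demo-prefix-in-mode (mode)
  "Return `my-demo-comment-prefix' as seen in a buffer using MODE."
  (with-temp-buffer
    (funcall mode)
    (my-demo-comment-prefix)))
@end example

Calling @code{my-demo-prefix-in-mode} with @code{emacs-lisp-mode} should
reach the override, while a mode without an override falls back to the
default body.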
600 @node GNU Free Documentation License
601 @appendix GNU Free Documentation License
602 @include doclicense.texi
615 @c Following comments are for the benefit of ispell.
617 @c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
618 @c LocalWords: backquote bnf bovinate bovinates LALR
619 @c LocalWords: bovinating bovination bovinator bucketize
620 @c LocalWords: cb cdr charquote checkcache cindex CLOS
621 @c LocalWords: concat concocting const constantness ctxt Decl defcustom
622 @c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
623 @c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
624 @c LocalWords: eq Exp EXPANDFULL expresssion fn foo func funcall
625 @c LocalWords: ia ids iff ifinfo imenu imenus init int isearch itemx java kbd
626 @c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
627 @c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
628 @c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
629 @c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
630 @c LocalWords: popup positionalonly positiononly positionormarker pre
631 @c LocalWords: printf printindex Programmatically pt punctuations quotemode
632 @c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
633 @c LocalWords: scopestart SEmantic semanticdb setfilename setq
634 @c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
635 @c LocalWords: streamorbuffer struct subalist submenu submenus
636 @c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
637 @c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
638 @c LocalWords: uref usedb var vskip xref yak
641 arch-tag: cbc6e78c-4ff1-410e-9fc7-936487e39bbf