Semantic is a program for Emacs which includes, at its core, a lexer and a compiler compiler (the bovinator). Additional tools include a bnf->semantic table converter, example tables, and a speedbar tool.
The core utility is the "semantic bovinator", which behaves much like yacc or bison. Because it is not designed to be as feature-rich as these tools, it takes its name from "bovine": the cow is a lesser cousin of the yak and the bison.
To send bug reports, or participate in discussions about semantic,
use the mailing list cedet-semantic@sourceforge.net via the URL:
<http://lists.sourceforge.net/lists/listinfo/cedet-semantic>
To install semantic, untar the distribution into a subdirectory, such as
/usr/share/emacs/site-lisp/semantic-#.#. Next, add the following
lines into your individual .emacs file, or into
site-lisp/site-start.el.
(setq semantic-load-turn-everything-on t)
(load-file "/path/to/semantic/semantic-load.el")
If you would like to turn individual tools on or off in your init file, skip the first line.
Semantic is a tool primarily for the Emacs-Lisp programmer. However, it comes with "applications" that non-programmers might find useful. This chapter is mostly for the benefit of these non-programmers, as it gives brief descriptions of basic concepts such as grammars, parsers, compiler-compilers, parse trees, etc.
The grammar of a natural language defines rules by which valid phrases and sentences can be composed using words, the fundamental units with which all sentences are created. In a similar fashion, a "context-free grammar" defines the rules by which programs can be composed using the fundamental units of the language, i.e., numbers, symbols, punctuation, etc. Context-free grammars are often specified in a well-known form called Backus-Naur Form, BNF for short. This is a systematic way of representing context-free grammars such that programs can read files with grammars written in BNF and generate code for a "parser" of that language. YACC (Yet Another Compiler Compiler) is one such program that has been part of UNIX operating systems since the 1970s. YACC is pronounced the same as "yak", the long-haired ox found in Asia. The parser generated by YACC is usually a C program. Bison is also a "compiler compiler" that takes BNF grammars and produces parsers in C. The difference between YACC and Bison is that Bison is free software and upward-compatible with YACC. It also comes with an excellent manual.
Semantic is similar in spirit to YACC and Bison. Semantic, however, is referred to as a bovinator rather than as a parser, because it is a lesser cousin of YACC and Bison. It is lesser in that it does not perform a full parse like YACC or Bison. Instead, it bovinates. "Bovination" refers to partial parsing which creates parse trees of only the topmost expressions rather than parsing every nested expression. This is sufficient for the purposes for which semantic was designed. Semantic is meant to be used within Emacs for providing editor-related features such as code browsers and translators, rather than for compiling, which requires far more complex and complete parsers. Semantic is not designed to be able to create full parse trees.
One key benefit of semantic is that it creates parse trees (perhaps the term bovine tree may be more accurate) with the same structure regardless of the type of language involved. Higher level applications written to work with bovine trees will then work with any language for which a grammar is available. For example, a code browser written today that supports C, C++, and Java may work without any change on other languages that do not even exist yet. All one has to do is write the BNF specification for the new language. The rest of the work is done by semantic. For certain languages, e.g., texinfo and other document-oriented languages, it is hard if not impossible to specify the syntax in BNF. Semantic provides a parser for texinfo nevertheless. Instead of a BNF grammar, texinfo files are "parsed" using regexps.
Semantic comes with grammars for these languages:
Several tools employing semantic that provide user-observable features are listed in the Tools section.
This chapter gives an overview of the major components of semantic and how they interact with each other to perform their job.
The first step of parsing is to break up the input file into its fundamental components. This step is called lexing. The output of the lexer is a list of tokens that make up the file.
        syntax table, keywords list, and options
                           |
                           |
                           v
    input file  ---->    Lexer    ----> token stream
The next step is the parsing shown below.
                      bovine table
                           |
                           v
    token stream --->   Parser    ----> parse tree
The end result, the parse tree, is created based on the "bovine table", which is the internal representation of the BNF language grammar used by semantic.
The semantic database caches parse trees by saving them
automatically into files named semantic.cache, then loading them
when appropriate instead of re-parsing. This saves the
time it takes to parse a file, which can be several seconds or more
for large files.
Finally, semantic provides an API for the Emacs-Lisp programmer to access the information in the parse tree.
In order to reduce a source file into a token list, it must first be converted into a token stream. Tokens are syntactic elements such as whitespace, symbols, strings, lists, and punctuation.
The lexer uses the major-mode's syntax table for conversion.
See Syntax Tables.
As long as that is set up correctly (along with the important
comment-start and comment-start-skip variables) the lexer
should already work for your language.
The primary entry point of the lexer is the semantic-flex function shown below. Normally, you do not need to call this function. It is usually called by semantic-bovinate-toplevel for you.
| semantic-flex start end &optional depth length | Function |
| Using the syntax table, do something roughly equivalent to flex. Semantically check between START and END. Optional argument DEPTH indicates at what level to scan over entire lists. The return value is a token stream. Each element is a list of the form (symbol start-expression . end-expression). END does not mark the end of the text scanned, only the end of the beginning of text scanned. Thus, if a string extends past END, the end of the return token will be larger than END. To truly restrict scanning, use `narrow-to-region'. The last argument, LENGTH, specifies that semantic-flex should only return LENGTH tokens. |
The semantic lexer breaks up the content of an Emacs buffer into a list of tokens. This process is based mostly on regular expressions, which in turn depend on the syntax table of the buffer's major mode being set up properly. See Major Modes. See Syntax Tables. See Regexps.
Specifically, the following regular expressions which rely on syntax tables are used:
\\s-
\\sw
\\s_
\\s.
\\s<
\\s>
\\s\\
\\s)
\\s$
\\s\"
\\s\'
In addition, Emacs' built-in features such as
comment-start-skip,
forward-comment,
forward-list,
and
forward-sexp
are employed.
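To make the dependence on the syntax table concrete, here is a minimal sketch that uses only built-in Emacs functions, not semantic itself. The helper name my-demo-symbol-at-start is hypothetical. It applies the symbol regexp from the list above to the same text under two syntax tables, one of which treats the period as a symbol constituent.

```elisp
(defun my-demo-symbol-at-start (text table)
  "Return the leading symbol of TEXT as matched under syntax TABLE.
Hypothetical helper for illustration; it is not part of semantic."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-syntax-table table
      ;; \\sw is word syntax, \\s_ is symbol syntax.
      (when (looking-at "\\(\\sw\\|\\s_\\)+")
        (match-string 0)))))

(let ((plain (make-syntax-table))    ; inherits the standard syntax table
      (dotted (make-syntax-table)))
  ;; In DOTTED, treat the period as a symbol constituent.
  (modify-syntax-entry ?. "_" dotted)
  (list (my-demo-symbol-at-start "stdio.h;" plain)
        (my-demo-symbol-at-start "stdio.h;" dotted)))
;; evaluates to ("stdio" "stdio.h")
```

The regexp is identical in both calls; only the syntax table differs, which is why a correctly configured major mode matters to the lexer.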
The lexer, semantic-flex, scans the content of a buffer and returns a token list. Let's illustrate this using this simple example.
00: /*
01:  * Simple program to demonstrate semantic.
02:  */
03:
04: #include <stdio.h>
05:
06: int i_1;
07:
08: int
09: main(int argc, char** argv)
10: {
11:     printf("Hello world.\n");
12: }
Evaluating (semantic-flex (point-min) (point-max))
within the buffer with the code above returns the following token list.
The input line and string that produced each token are shown after
each semicolon.
((punctuation 52 . 53) ; 04: #
(INCLUDE 53 . 60) ; 04: include
(punctuation 61 . 62) ; 04: <
(symbol 62 . 67) ; 04: stdio
(punctuation 67 . 68) ; 04: .
(symbol 68 . 69) ; 04: h
(punctuation 69 . 70) ; 04: >
(INT 72 . 75) ; 06: int
(symbol 76 . 79) ; 06: i_1
(punctuation 79 . 80) ; 06: ;
(INT 82 . 85) ; 08: int
(symbol 86 . 90) ; 09: main
(semantic-list 90 . 113) ; 09: (int argc, char** argv)
(semantic-list 114 . 147) ; 10-12: body of main function
)
As shown above, the token list is a list of "tokens". Each token in turn is a list of the form
(TOKEN-TYPE BEGINNING-POSITION . ENDING-POSITION)
where TOKEN-TYPE is a symbol, and the other two are integers giving the buffer positions that delimit the token, such that
(buffer-substring BEGINNING-POSITION ENDING-POSITION)
would return the string form of the token.
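As a small illustration of this token form, the following sketch pulls a token apart with ordinary list accessors. The helper my-demo-token-text is hypothetical and the token is written by hand, so that the example does not depend on semantic being loaded.

```elisp
(defun my-demo-token-text (token)
  "Return the buffer text covered by lexical TOKEN.
TOKEN has the form (TOKEN-TYPE BEGINNING-POSITION . ENDING-POSITION).
Hypothetical helper for illustration; it is not part of semantic."
  ;; (cadr token) is the beginning position, (cddr token) the end.
  (buffer-substring-no-properties (cadr token) (cddr token)))

(with-temp-buffer
  (insert "int i_1;")
  (let ((token '(symbol 5 . 8)))     ; hand-written token covering "i_1"
    (list (car token) (my-demo-token-text token))))
;; evaluates to (symbol "i_1")
```

Because the positions are stored as a dotted pair in the tail of the list, the end position is reached with cddr rather than with nth.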
Note that one line (line 4 above) can produce seven tokens while
the whole body of the function produces a single token.
This is because the depth parameter of semantic-flex was
not specified.
Let's see the output when depth is set to 1.
Evaluate (semantic-flex (point-min) (point-max) 1) in the same buffer.
Note the third argument of 1.
((punctuation 52 . 53) ; 04: #
(INCLUDE 53 . 60) ; 04: include
(punctuation 61 . 62) ; 04: <
(symbol 62 . 67) ; 04: stdio
(punctuation 67 . 68) ; 04: .
(symbol 68 . 69) ; 04: h
(punctuation 69 . 70) ; 04: >
(INT 72 . 75) ; 06: int
(symbol 76 . 79) ; 06: i_1
(punctuation 79 . 80) ; 06: ;
(INT 82 . 85) ; 08: int
(symbol 86 . 90) ; 09: main
(open-paren 90 . 91) ; 09: (
(INT 91 . 94) ; 09: int
(symbol 95 . 99) ; 09: argc
(punctuation 99 . 100) ; 09: ,
(CHAR 101 . 105) ; 09: char
(punctuation 105 . 106) ; 09: *
(punctuation 106 . 107) ; 09: *
(symbol 108 . 112) ; 09: argv
(close-paren 112 . 113) ; 09: )
(open-paren 114 . 115) ; 10: {
(symbol 120 . 126) ; 11: printf
(semantic-list 126 . 144) ; 11: ("Hello world.\n")
(punctuation 144 . 145) ; 11: ;
(close-paren 146 . 147) ; 12: }
)
The depth parameter "peeled away" one more level of "list" delimited by matching parentheses or braces. The depth parameter can be any number. However, the parser needs to be able to handle the extra tokens.
Treating a whole list as a single token is an interesting benefit of the
lexer having the full resources of Emacs at its disposal.
Skipping over matched parentheses is achieved by simply calling
the built-in functions forward-list and forward-sexp.
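The following sketch shows the skipping step in isolation, using only the built-in forward-list. The helper my-demo-list-token is hypothetical. It assumes the text starts with an open parenthesis and builds a token in the same shape as the semantic-list tokens shown earlier.

```elisp
(defun my-demo-list-token (text)
  "Return a semantic-list style token for the list that begins TEXT.
Hypothetical helper for illustration; it is not part of semantic."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((start (point)))
      ;; Jump over the whole balanced group in one step.
      (forward-list 1)
      (cons 'semantic-list (cons start (point))))))

(my-demo-list-token "(int argc, char** argv) { return 0; }")
;; evaluates to (semantic-list 1 . 24)
```

The token spans 23 characters, the same length as the argument list token (90 . 113) in the example output above.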
All common token symbols are enumerated below. Additional token
symbols can be generated by the lexer if the user option
semantic-flex-extensions is set. It is up to the user to add
matching extensions to the parser to deal with the lexer
extensions. An example use of semantic-flex-extensions is in
semantic-make.el, where semantic-flex-extensions is set to
the value of semantic-flex-make-extensions, which may generate
shell-command tokens.
bol
    nil.
charquote
    \\s\\+.
close-paren
    \\s).
    These are typically ), }, ], etc.
comment
    nil.
newline
    \\s-*\\(\n\\|\\s>\\).
    This token is produced only if the user set semantic-flex-enable-newlines to non-nil.
open-paren
    \\s(.
    These are typically (, {, [, etc. Note that these are not usually generated unless the depth argument to semantic-flex is greater than 0.
punctuation
    \\(\\s.\\|\\s$\\|\\s'\\).
semantic-list
string
    \\s\".
    The lexer relies on forward-sexp to find the matching end.
symbol
    \\(\\sw\\|\\s_\\)+.
whitespace
    nil. If semantic-ignore-comments is non-nil, comments are also treated as whitespace.
Although most lexer functions are called for you by other semantic functions, there are ways for you to extend or customize the lexer. The variables shown below serve this purpose.
| semantic-flex-unterminated-syntax-end-function | Variable |
| Function called when unterminated syntax is encountered. This should be set to one function. That function should take three parameters: SYNTAX, the type of syntax which is unterminated; SYNTAX-START, where the broken syntax begins; and FLEX-END, where the lexical analysis was asked to end. This function can be used for languages that can intelligently fix up broken syntax, or that exit lexical analysis via throw or signal when finding unterminated syntax. |
| semantic-flex-extensions | Variable |
Buffer local extensions to the lexical analyzer.
This should contain an alist with a key of a regex and a data element of
a function. The function should both move point, and return a lexical
token of the form:
( TYPE START . END)
|
| semantic-flex-syntax-modifications | Variable |
Changes the syntax table for a given buffer.
These changes are active only while the buffer is being flexed.
This is a list where each element has the form
(CHAR CLASS)
CHAR is the char passed to `modify-syntax-entry', and CLASS is the string also passed to `modify-syntax-entry' to define what syntax class CHAR has.
(setq semantic-flex-syntax-modifications '((?. "_")))
This makes the period . a symbol constituent. This may be necessary if filenames are prevalent, such as in Makefiles. |
| semantic-flex-enable-newlines | Variable |
When flexing, report 'newlines as syntactic elements.
Useful for languages where the newline is a special case terminator.
Only set this on a per mode basis, not globally.
|
| semantic-flex-enable-whitespace | Variable |
When flexing, report 'whitespace as syntactic elements.
Useful for languages where the syntax is whitespace dependent.
Only set this on a per mode basis, not globally.
|
| semantic-flex-enable-bol | Variable |
| When flexing, report beginning of lines as syntactic elements. Useful for languages like Python which are indentation sensitive. Only set this on a per mode basis, not globally. |
| semantic-number-expression | Variable |
Regular expression for matching a number.
If this value is nil, no number extraction is done during lex.
This expression tries to match C and Java like numbers.
DECIMAL_LITERAL:
    [1-9][0-9]*
  ;
HEX_LITERAL:
    0[xX][0-9a-fA-F]+
  ;
OCTAL_LITERAL:
    0[0-7]*
  ;
INTEGER_LITERAL:
    <DECIMAL_LITERAL>[lL]?
  | <HEX_LITERAL>[lL]?
  | <OCTAL_LITERAL>[lL]?
  ;
EXPONENT:
    [eE][+-]?[0-9]+
  ;
FLOATING_POINT_LITERAL:
    [0-9]+[.][0-9]*<EXPONENT>?[fFdD]?
  | [.][0-9]+<EXPONENT>?[fFdD]?
  | [0-9]+<EXPONENT>[fFdD]?
  | [0-9]+<EXPONENT>?[fFdD]
  ;
|
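As a rough sketch of what a lexer extension might look like, the code below defines a function that consumes a made-up "@name" construct at point and returns a token of the documented (TYPE START . END) form. The names my-lang-flex-at-word and my-lang-setup-lexer, the at-word token type, and the regexp are all hypothetical. The sketch also assumes that the alist entries are (REGEX . FUNCTION) pairs and that the function is called with no arguments while point is at the start of the match.

```elisp
(defun my-lang-flex-at-word ()
  "Hypothetical lexer extension: consume \"@name\" at point.
Move point past the match and return a token (TYPE START . END)."
  (when (looking-at "@[A-Za-z]+")
    ;; The extension is responsible for advancing point itself.
    (goto-char (match-end 0))
    (cons 'at-word (cons (match-beginning 0) (match-end 0)))))

(defun my-lang-setup-lexer ()
  "Hypothetical mode-hook function; sets the lexer variables buffer-locally."
  ;; Assumption: each element is a (REGEX . FUNCTION) pair.
  (setq-local semantic-flex-extensions
              '(("@[A-Za-z]+" . my-lang-flex-at-word)))
  ;; Per-mode setting, as the documentation above recommends.
  (setq-local semantic-flex-enable-newlines t))

(with-temp-buffer
  (insert "@foo bar")
  (goto-char (point-min))
  (my-lang-flex-at-word))
;; evaluates to (at-word 1 . 5)
```

A parser that receives at-word tokens still needs matching rules for them, as noted earlier for lexer extensions in general.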
Another important piece of the lexer is the keyword table (see Settings). Your language will want to set up a keyword table for fast conversion of symbol strings to language terminals.
The keywords table can also be used to store additional information about those keywords. The following programming functions can be useful when examining text in a language buffer.
| semantic-flex-keyword-p text | Function |
Return non-nil if TEXT is a keyword in the keyword table.
|
| semantic-flex-keyword-put text property value | Function |
| For keyword TEXT, set PROPERTY to VALUE. |
| semantic-token-put-no-side-effect token key value | Function |
For TOKEN, put the property KEY on it with VALUE without side effects.
If VALUE is nil, then remove the property from TOKEN.
All cons cells in the property list are replicated so that there
are no side effects if TOKEN is in shared lists.
|
| semantic-flex-keyword-get text property | Function |
| For keyword TEXT, get the value of PROPERTY. |
| semantic-flex-map-keywords fun &optional property | Function |
| Call function FUN on every semantic keyword. If optional PROPERTY is non-nil, call FUN only on every keyword which has a PROPERTY value. FUN receives a semantic keyword as argument. |
| semantic-flex-keywords &optional property | Function |
| Return a list of semantic keywords. If optional PROPERTY is non-nil, return only keywords which have PROPERTY set. |
Keyword properties can be set up in a BNF file for ease of maintenance. While examining the text in a language buffer, this can provide an easy and quick way of storing details about text in the buffer.
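To illustrate the idea of a keyword table that also carries properties, here is a self-contained sketch built on a plain obarray. It is not semantic's implementation. Every my-lang- name is hypothetical, and the functions only mirror the roles of the keyword functions documented above. It assumes Emacs 25 or later for obarray-make.

```elisp
(require 'obarray)

(defvar my-lang-keyword-table (obarray-make)
  "Hypothetical keyword table: an obarray keyed by keyword text.")

(defun my-lang-keyword-add (text terminal)
  "Register TEXT as a keyword that lexes to the symbol TERMINAL."
  (set (intern text my-lang-keyword-table) terminal))

(defun my-lang-keyword-p (text)
  "Return the terminal for TEXT if it is a keyword, else nil."
  (let ((sym (intern-soft text my-lang-keyword-table)))
    (and sym (boundp sym) (symbol-value sym))))

(defun my-lang-keyword-put (text property value)
  "For keyword TEXT, set PROPERTY to VALUE."
  (let ((sym (intern-soft text my-lang-keyword-table)))
    (when sym (put sym property value))))

(defun my-lang-keyword-get (text property)
  "For keyword TEXT, get the value of PROPERTY."
  (let ((sym (intern-soft text my-lang-keyword-table)))
    (and sym (get sym property))))

(my-lang-keyword-add "int" 'INT)
(my-lang-keyword-put "int" 'summary "Integral primitive type")
(list (my-lang-keyword-p "int")
      (my-lang-keyword-p "i_1")
      (my-lang-keyword-get "int" 'summary))
;; evaluates to (INT nil "Integral primitive type")
```

Storing the properties on the interned symbol keeps lookup fast and keeps the details next to the keyword they describe.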
Add known properties here when they are known.
When converting a source file into a nonterminal token stream
(parse-tree) it is important to specify rules to accomplish this. The
rules are stored in the buffer local variable
semantic-toplevel-bovine-table.
While it is certainly possible to write this table yourself, you will most
likely want to use the BNF converter (see BNF conversion).
This is an easier method for specifying your rules. You will still need
to specify a variable in your language for the table, however. A good
rule of thumb is to call it language-toplevel-bovine-table if it
is part of the language, or semantic-toplevel-language-bovine-table
if you donate it to the semantic package.
When initializing a major-mode for your language, you will set the
variable semantic-toplevel-bovine-table to the contents of your
language table. semantic-toplevel-bovine-table is always buffer
local.
Since it is important to know the format of the table when debugging, you should still attempt to understand the basics of the table.
Please see the documentation for the variable
semantic-toplevel-bovine-table for details on its format.
* add more doc here *
The BNF converter takes a file in "Bovine Normal Form" which is similar to "Backus-Naur Form". If you have ever used yacc or bison, you will find it similar. The BNF form used by semantic, however, does not include token precedence rules, and several other features needed to make real parser generators.
It is important to have an Emacs Lisp file with a variable ready to take
the output of your table (see Bovinating). Also, make sure that the
file semantic-bnf.el is loaded. Give your language file the
extension .bnf and you are ready.
The comment character is #.
When you want to test your file, use the keyboard shortcut C-c C-c to parse the file, generate the variable, and load the new definition in. It will then use the settings specified above to determine what to do. Use the shortcut C-c c to do the same thing, but spend extra time indenting the table nicely.
Make sure that you create the variable specified in the
%parsetable token before trying to convert the BNF file. A
simple definition like this is sufficient.
(defvar semantic-toplevel-lang-bovine-table nil
  "Table for use with semantic for parsing LANG.")
If you use tokens (created with the %token specifier), also
make sure you have a keyword table available, like this:
(defvar semantic-lang-keyword-table nil
  "Table for use with semantic for keywords.")
Specify the name of the keyword table with the %keywordtable
specifier.
The BNF file has two sections. The first is the settings section, and the second is the language definition, or list of semantic rules.
A setting is a keyword starting with a %. (This syntax is taken from yacc and bison. See (bison).)
There are several settings that can be made in the settings section. They are:
| %start <nonterminal> | Setting |
Specify an alternative to bovine-toplevel. (See below)
|
| %scopestart <nonterminal> | Setting |
Specify an alternative to bovine-inner-scope.
|
| %outputfile <filename> | Setting |
| Required. Specifies the file into which this file's output is stored. |
| %parsetable <lisp-variable-name> | Setting |
| Required. Specifies a lisp variable into which the output is stored. |
| %setupfunction <lisp-function-name> | Setting |
| Required. Name of a function into which setup code is to be inserted. |
| %keywordtable <lisp-variable-name> | Setting |
Required if there are %token keywords.
Specifies a lisp variable into which the output of a keyword table is
stored. This obarray is used to turn symbols into keywords when applicable.
|
| %token <name> "<text>" | Setting |
Optional. Specify a new token NAME. This is added to a lexical
keyword list using TEXT. The symbol is then converted into a new
lexical terminal. This requires that the %keywordtable specified
variable is available in the file specified by %outputfile.
|
| %token <name> type "<text>" | Setting |
| Optional. Specify a new token NAME. It is made from an existing lexical token of type TYPE. TEXT is a string which will be matched explicitly. NAME can be used in match rules as though they were flex tokens, but are converted back to TYPE "text" internally. |
| %put <NAME> symbol <VALUE> | Setting |
| %put <NAME> ( symbol1 <VALUE1> symbol2 <VALUE2> ... ) | Setting |
| %put ( <NAME1> <NAME2>...) symbol <VALUE> | Setting |
Tokens created without a type are considered keywords, and placed in a
keyword table. Use %put to apply properties to that keyword.
(see Lexing).
|
| %languagemode <lisp-function-name> | Setting |
| %languagemode ( <lisp-function-name1> <lisp-function-name2> ... ) | Setting |
| Optional. Specifies the Emacs major mode associated with the language being specified. When the language is converted, all buffers of this mode will get the new table installed. |
| %quotemode backquote | Setting |
| Optional. Specifies how symbol quoting is handled in the Optional Lambda Expressions. (See below) |
| %( ... )% | Setting |
Specify setup code to be inserted into the %setupfunction.
It will be inserted between two specifier strings, or added to
the end of the function.
|
When working inside %( ... )% tokens, any lisp expression can be
entered which will be placed inside the setup function. In general, you
probably want to set variables that tell Semantic and related tools how
the language works.
Here are some variables that control how different programs will work with your language.
| semantic-flex-depth | Variable |
| Default flexing depth. This specifies how many lists to create tokens in. |
| semantic-number-expression | Variable |
Regular expression for matching a number.
If this value is nil, no number extraction is done during lex.
Symbols which match this expression are returned as number
tokens instead of symbol tokens.
The default value for this variable should work in most languages. |
| semantic-flex-extensions | Variable |
Buffer local extensions to the lexical analyzer.
This should contain an alist with a key of a regex and a data element of
a function. The function should both move point, and return a lexical
token of the form:
( TYPE START . END)
|
| semantic-flex-syntax-modifications | Variable |
Updates to the syntax table for this buffer.
These changes are active only while this file is being flexed.
This is a list where each element is of the form:
(CHAR CLASS)
where CHAR is the char passed to modify-syntax-entry, and CLASS is the string also passed to modify-syntax-entry to define what class of syntax CHAR is. |
| semantic-flex-enable-newlines | Variable |
When flexing, report 'newlines as syntactic elements.
Useful for languages where the newline is a special case terminator.
Only set this on a per mode basis, not globally.
|
| semantic-ignore-comments | Variable |
Default comment handling.
t means to strip comments when flexing. Nil means to keep comments
as part of the token stream.
|
| semantic-symbol->name-assoc-list | Variable |
Association between symbols returned, and a string.
The string is used to represent a group of objects of the given type.
It is sometimes useful for a language to use a different string
in place of the default, even though that language will still
return a symbol. For example, Java returns includes, but the
string can be replaced with Imports.
|
| semantic-case-fold | Variable |
Value for case-fold-search when parsing.
|
| semantic-expand-nonterminal | Variable |
Function to call for each nonterminal production.
Return a list of non-terminals derived from the first argument, or nil
if it does not need to be expanded.
Languages with compound definitions should use this function to expand
from one compound symbol into several. For example, in C the
definition
int a, b;
is easily parsed into one token, but represents multiple variables. A function should be written which takes this compound token and turns it into two tokens, one for A, and the other for B. Within the language definition (the This list can then be detected by the function set in
Please see |
| semantic-override-table | Variable |
|
Buffer local semantic function overrides alist.
These overrides provide a hook for a `major-mode' to override specific
behaviors with respect to generated semantic toplevel nonterminals and
things that these non-terminals are useful for.
Each element must be of the form: (SYM . FUN)
where SYM is the symbol to override, and FUN is the function to
override it with.
Available override symbols:
Parameters mean:
|
| semantic-type-relation-separator-character | Variable |
| Character strings used to separate a parent/child relationship. This list of strings is used for displaying or finding separators in variable field dereferencing. The first string will be used for display. In C, a type field is separated like this: "type.field", thus the separator is a ".". In C, an additional value of "->" would be in the list, so that "type->field" could be found. |
| semantic-dependency-include-path | Variable |
| Defines the include path used when searching for files. This should be a list of directories to search which is specific to the file being included. This variable can also be set to a single function. If it is a function, it will be called with one argument, the file to find as a string, and it should return the full path to that file, or nil. |
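The expansion described for semantic-expand-nonterminal can be sketched as follows. This is only an illustration under stated assumptions: the function name my-lang-expand-nonterminal is hypothetical, the compound token is assumed to carry a list of names in its NAME slot, and positional (overlay) information is ignored entirely.

```elisp
(defun my-lang-expand-nonterminal (token)
  "Hypothetical expander: split a compound variable TOKEN.
If the NAME slot of TOKEN holds a list of names, return one token per
name.  Otherwise return nil, meaning no expansion is needed."
  (when (and (eq (nth 1 token) 'variable)
             (listp (car token)))
    ;; Each new token reuses the tail of the original (type, default
    ;; value, and so on) and differs only in its name.
    (mapcar (lambda (name) (cons name (cdr token)))
            (car token))))

;; Hand-written compound token standing for "int a, b;".
(my-lang-expand-nonterminal '(("a" "b") variable "int" nil nil nil))
;; evaluates to
;; (("a" variable "int" nil nil nil) ("b" variable "int" nil nil nil))
```

A function of this kind would be stored in semantic-expand-nonterminal from the setup code of the language. A real expander would also have to give each resulting token its own positional information.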
This configures Imenu to use semantic parsing.
| imenu-create-index-function | Variable |
|
The function to use for creating a buffer index.
It should be a function that takes no arguments and returns an index of the current buffer as an alist. Simple elements in the alist look like (INDEX-NAME . INDEX-POSITION). This function is called within a save-excursion. The variable is buffer-local. |
These are specific to the document tool.
document-comment-start
document-comment-line-prefix
document-comment-end
Writing the rules should be very similar to bison for basic syntax. Each rule is of the form
RESULT : MATCH1 (optional-lambda-expression)
| MATCH2 (optional-lambda-expression)
;
RESULT is a non-terminal, or a token synthesized in your grammar. MATCH is a list of elements that are to be matched if RESULT is to be made. The optional lambda expression is a list containing simplified rules for concocting the parse tree.
In bison, each time an element of a MATCH is found, it is "shifted" onto the parser stack. (The stack of matched elements.) When all of MATCH1's elements have been matched, it is "reduced" to RESULT. See (bison)Algorithm.
The first RESULT written into your language specification should
be bovine-toplevel, or the symbol specified with %start.
When starting a parse for a file, this is the default token iterated
over. You can use any token you want in place of bovine-toplevel
if you specify what that nonterminal will be with a %start token
in the settings section.
MATCH is made up of symbols and strings. A symbol such as
foo means that a syntactic token of type foo must be
matched. A string in the mix means that the previous symbol must have
the additional constraint of exactly matching it. Thus, the
combination:
symbol "moose"
means that a symbol must first be encountered, and then it must
string-match "moose". Be especially careful to remember that the
string is a regular expression. The code:
punctuation "."
will match any punctuation.
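The pitfall can be checked directly with string-match, which is a built-in Emacs function. The sketch below only demonstrates regular expression behavior. It makes no claim about how the bovinator applies the string internally beyond what is stated above.

```elisp
(list (string-match "." ",")                 ; unquoted: any character matches
      (string-match (regexp-quote ".") ",")  ; quoted: a comma no longer matches
      (string-match (regexp-quote ".") ".")  ; quoted: a literal period matches
      (regexp-quote "."))                    ; the quoted form of the regexp
;; evaluates to (0 nil 0 "\\.")
```

To restrict a match to a literal period, the period therefore has to be quoted in the regular expression.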
For the above example in bison, a LEX rule would be used to create a new token MOOSE. In this case, the MOOSE token would appear. For the bovinator, this task was mixed into the language definition to simplify implementation, though Bison's technique is more efficient.
To make a symbol match explicitly for keywords, for example, you can use
the %token command in the settings section to create new symbols.
%token MOOSE "moose"
find_a_moose: MOOSE
;
will match "moose" explicitly, unlike the previous example where moose need only appear in the symbol. This is because "moose" will be converted to MOOSE in the lexical analysis stage. Thus the symbol MOOSE won't be available any other way.
If we specify our token in this way:
%token MOOSE symbol "moose"
find_a_moose: MOOSE
;
then MOOSE will match the string "moose" explicitly, but it won't
do so at the lexical level, allowing use of the text "moose" in other
forms of regular expressions.
Non symbol tokens are also allowed. For example:
%token PERIOD punctuation "."
filename : symbol PERIOD symbol
;
will explicitly match one period when used in the above rule.
The OLE (Optional Lambda Expression) is converted into a bovine lambda (see Bovinating). This lambda has special short-cuts to simplify reading the Emacs BNF definition. An OLE like this:
( $1 )
results in a lambda return which consists entirely of the string or object found by matching the first (zeroth) element of match. An OLE like this:
( ,(foo $1) )
executes `foo' on the first argument, and then splices its return into the return list whereas:
( (foo $1) )
executes foo, and that is placed in the return list.
Here are other things that can appear inline:
$1
,$1
'$1
foo
(foo)
,(foo)
'(foo)
(EXPAND $1 nonterminal depth)
(EXPANDFULL $1 nonterminal depth)
bovine-toplevel. This lets you have
much simpler rules in this specific case, and also lets you have
positional information in the returned tokens, and error skipping.
(ASSOC symbol1 value1 symbol2 value2 ... )
( ( symbol1 . value1) (symbol2 . value2) ... )
If the symbol %quotemode backquote is specified, then use
,@ to splice a list in, and , to evaluate the expression.
This lets you send $1 as a symbol into a list instead of having
it expanded inline.
The rule:
SYMBOL : symbol
is equivalent to
SYMBOL : symbol
( $1 )
which, if it matched the string "A", would return
( "A" )
If this rule were used like this:
ASSIGN: SYMBOL punctuation "=" SYMBOL
( $1 $3 )
it would match "A=B", and return
( ("A") ("B") )
The letters A and B come back in lists because SYMBOL is a nonterminal, not an actual lexical element.
To get a better result with nonterminals, use , to splice lists in like this:
ASSIGN: SYMBOL punctuation "=" SYMBOL
( ,$1 ,$3 )
which would return
( "A" "B" )
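The difference between the two return values can be reproduced in plain Emacs Lisp. This sketch is only an analogy for the list structure involved, not the bovinator's own code. The variables match-1 and match-3 are hypothetical stand-ins for $1 and $3.

```elisp
(let ((match-1 '("A"))    ; value returned by the first SYMBOL
      (match-3 '("B")))   ; value returned by the second SYMBOL
  (list (list match-1 match-3)       ; nested, like ( $1 $3 )
        (append match-1 match-3)))   ; spliced, like ( ,$1 ,$3 )
;; evaluates to ((("A") ("B")) ("A" "B"))
```

Splicing removes one level of nesting, which is why the second form is usually the more convenient result when a match is itself a nonterminal.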
In order for a generalized program using Semantic to work with
multiple languages, it is important to have a consistent meaning for
the contents of the tokens returned. The variable
semantic-toplevel-bovine-table is documented with the complete
list of tokens that a functional or OO language may use. While any
given language is free to create its own tokens, such a language
definition would not produce a stream of tokens usable by a
generalized tool.
In general, all tokens returned from a parser should be generated with the following form:
("NAME" type-symbol ... "DOCSTRING" PROPERTIES OVERLAY)
NAME and type-symbol are the only syntactic elements of a
nonterminal which are guaranteed to exist. This means that a parser
which uses nil for either of these two slots, or some value
which is not type consistent is wrong.
NAME is also guaranteed to be a string. This string represents the name of the nonterminal, usually a named definition which the language will use elsewhere as a reference to the syntactic element found.
type-symbol is a symbol representing the type of the nonterminal. Valid type-symbols can be anything, as long as it is an Emacs Lisp symbol.
DOCSTRING is a required slot in the nonterminal, but can be nil. Some languages have the documentation saved as a comment nearby. In these cases, DOCSTRING is nil, and the function `semantic-find-documentation' can be used to locate the documentation instead.
PROPERTIES is a slot generated by the semantic parser harness,
and need not be provided by a language author. Access
nonterminal properties programmatically with semantic-token-put and
semantic-token-get.
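The guarantee on the first two slots can be expressed as a small predicate. This is a sketch for illustration only. The function my-demo-valid-token-p is hypothetical and is not part of semantic.

```elisp
(defun my-demo-valid-token-p (token)
  "Return t if TOKEN has a string NAME and a non-nil type symbol.
Hypothetical check for illustration; it is not part of semantic."
  (and (stringp (car token))     ; NAME must be a string
       (symbolp (nth 1 token))   ; the type must be a symbol ...
       (nth 1 token)             ; ... other than nil
       t))

(list (my-demo-valid-token-p '("main" function "int" nil nil nil))
      (my-demo-valid-token-p '(nil function "int" nil nil nil))
      (my-demo-valid-token-p '("main" nil)))
;; evaluates to (t nil nil)
```

The explicit test for nil is needed because nil is itself a symbol in Emacs Lisp.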
OVERLAY represents positional information for this token. It is
automatically generated by the semantic parser harness, and need not
be provided by the language author, unless they provide a nonterminal
expansion function via semantic-expand-nonterminal.
The OVERLAY property is accessed via several functions returning the beginning, end, and buffer of a token. Use these functions unless the overlay is really needed (see Token Queries). Depending on the overlay in a program can be dangerous because sometimes the overlay is replaced with an integer pair
[ START END ]
when the buffer the token belongs to is not in memory. This happens when a user has activated the Semantic Database semanticdb.
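The sketch below shows why code that assumes a real overlay is fragile: a helper has to cope with both representations. The names my-demo-position-start and my-demo-position-end are hypothetical. In real code the token query functions mentioned above should be preferred, since they already take care of this.

```elisp
(defun my-demo-position-start (position)
  "Return the start of POSITION, an overlay or a [START END] vector.
Hypothetical helper for illustration; it is not part of semantic."
  (if (overlayp position)
      (overlay-start position)
    (aref position 0)))

(defun my-demo-position-end (position)
  "Return the end of POSITION, an overlay or a [START END] vector."
  (if (overlayp position)
      (overlay-end position)
    (aref position 1)))

(with-temp-buffer
  (insert "0123456789")
  (let ((live (make-overlay 2 5))   ; buffer is in memory
        (saved [2 5]))              ; what remains when it is not
    (list (my-demo-position-start live) (my-demo-position-end live)
          (my-demo-position-start saved) (my-demo-position-end saved))))
;; evaluates to (2 5 2 5)
```

Calling overlay-start directly on the vector form would signal an error, which is the danger the paragraph above warns about.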
If a parser produces tokens for a functional language, then the following token formats are available.
("NAME" variable "TYPE" DEFAULT-VALUE EXTRA-SPEC
"DOCSTRING" PROPERTIES OVERLAY)
TYPE can be nil for untyped languages. Languages which
support variable declarations without a type (such as C) should supply
a string representing the default type for that language.
DEFAULT-VALUE can be a string, or something pre-parsed and language specific. Hopefully this slot will be better defined in future versions of Semantic.
EXTRA-SPEC are extra specifiers. See below.
("NAME" function "TYPE" ( ARG-LIST ) EXTRA-SPEC
"DOCSTRING" PROPERTIES OVERLAY)
TYPE can be nil for untyped languages, or for
procedures in languages which support functions with no return data.
See above for more.
ARG-LIST is a list of arguments passed to this function. Each element in the arg list can be one of the following:
("NAME" type "TYPE" ( PART-LIST ) ( PARENTS ) EXTRA-SPEC
"DOCSTRING" PROPERTIES OVERLAY)
PART-LIST is the list of individual entries inside compound types. Structures, for example, can contain several fields which can be represented as variables. Valid entries in a PART-LIST are:
PARENTS represents a list of parents of this type. Parents are used in two situations.
The structure of the PARENTS list is of this form:
( EXPLICIT-PARENTS . INTERFACE-PARENTS)
EXPLICIT-PARENTS can be a single string (just one parent) or a list of parents (in a multiple inheritance situation). It can also be nil.
INTERFACE-PARENTS is a list of strings representing the names of all INTERFACES, or abstract classes inherited from. It can also be nil.
This slot can be interesting because the form:
( nil "string")
is a valid parent where there is no explicit parent, and only an interface.
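The PARENTS form can be decoded with car and cdr. The sketch below assumes only the structure described here. The `my-` helpers and the class names are hypothetical, and real programs would normally use semantic-token-type-parent-superclass and semantic-token-type-parent-implement instead.

```elisp
(defun my-parents-explicit (parents)
  "Return the explicit parents in PARENTS as a list of strings.
PARENTS has the form (EXPLICIT-PARENTS . INTERFACE-PARENTS)."
  (let ((explicit (car parents)))
    (if (stringp explicit)
        (list explicit)        ; a single parent given as a bare string
      explicit)))              ; already a list of parents, or nil

(defun my-parents-interfaces (parents)
  "Return the interface parents in PARENTS: a list of strings, or nil."
  (cdr parents))

;; (nil "Runnable") reads as the pair (nil . ("Runnable")):
;; no explicit parent, and one interface.
(my-parents-explicit '(nil "Runnable"))        ; => nil
(my-parents-interfaces '(nil "Runnable"))      ; => ("Runnable")

;; One explicit parent and one interface.
(my-parents-explicit '("Base" "Runnable"))     ; => ("Base")
```

Normalizing the explicit parents to a list lets callers treat single and multiple inheritance the same way.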
("FILE" include SYSTEM "DOCSTRING" PROPERTIES OVERLAY)
An include token, such as an #include statement in C.
In this case, instead of NAME, a FILE is specified.
FILE can be a subset of the actual file to be loaded.
SYSTEM is true if this include is part of a set of system
includes. This field isn't currently being used and may be
eliminated.
("NAME" package DETAIL "DOCSTRING" PROPERTIES OVERLAY)
A package token, such as a package statement, or a provide in Emacs Lisp.
DETAIL might be an associated file name, or some other language specific bit of information.
Some default token types have a slot EXTRA-SPEC, for extra specifiers. These specifiers provide additional details not commonly used, or not available in all languages. This list is an alist, and if a given key is nil, it is not in the list, saving space. Some valid extra specifiers are:
(parent . "text")
(dereference . INT)
(pointer . INT)
* characters.
(typemodifiers . ( "text" ... ))
`register' and `volatile'
(suffix . "text")
(const . t)
(throws . ( "text" ... ))
(destructor . t)
(constructor . t)
(user-visible . t)
(prototype . t)
autoload statement creates prototypes.
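Because EXTRA-SPEC is an alist, a specifier can be looked up with assq, and a missing key simply yields nil. The following is a minimal sketch with a hand-built alist. The sample values and the `my-` names are hypothetical, and real programs would call semantic-token-extra-spec or one of the type-specific variants.

```elisp
(defvar my-extra-spec
  '((pointer . 1) (const . t) (typemodifiers . ("volatile")))
  "Hypothetical EXTRA-SPEC alist, as might describe a constant pointer variable.")

(defun my-extra-spec-value (extra-spec key)
  "Return the value stored under KEY in EXTRA-SPEC, or nil if KEY is absent."
  (cdr (assq key extra-spec)))

(my-extra-spec-value my-extra-spec 'pointer)        ; => 1
(my-extra-spec-value my-extra-spec 'typemodifiers)  ; => ("volatile")
(my-extra-spec-value my-extra-spec 'destructor)     ; => nil, key not present
```

Since an absent key and a nil value look the same to the caller, nothing is lost by leaving nil-valued keys out of the list.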
From a program you can use the function semantic-bovinate-toplevel.
This function takes one optional parameter specifying if the cache
should be refreshed. By default, the cached results of the last parse
are always used. Specifying that the cache should be checked will cause
it to be flushed if it is out of date.
Another function you can use is semantic-bovinate-nonterminal.
This function takes a token stream returned by the function
semantic-flex, followed by a DEPTH (as above). It takes an
additional optional argument, NONTERMINAL, which is the nonterminal in
your table with which to start parsing.
| bovinate &optional clear | Command |
| Bovinate the current buffer. Show output in a temp buffer. Optional argument CLEAR will clear the cache before bovinating. |
| semantic-clear-toplevel-cache | Command |
| Clear the toplevel bovine cache for the current buffer. Clearing the cache will force a complete reparse next time a token stream is requested. |
| semantic-bovinate-toplevel &optional checkcache | Function |
Bovinate the entire current buffer.
If the optional argument CHECKCACHE is non-nil, then flush the cache iff
there has been a size change.
|
Writing language files using BNF is significantly easier than writing them using regular expressions in a functional manner. Debugging them, however, can still prove challenging.
There are two ways to debug a language definition if it is not
behaving as expected. One way is to debug against the source .bnf
file. The second is to debug against the lisp table created from the
.bnf source, or perhaps written by hand.
If your language definition was written in BNF notation, debugging is
quite easy. The command bovinate-debug will start you off.
| bovinate-debug | Command |
| Bovinate the current buffer and run in debug mode. |
If you prefer debugging against the Lisp table, find the table in a
buffer, place the cursor in it, and use the command
semantic-bovinate-debug-set-table.
| semantic-bovinate-debug-set-table &optional clear | Command |
| Set the table for the next debug to be here. Optional argument CLEAR to unset the debug table. |
After the table is set, the bovinate-debug command can be run
at any time for the given language.
While debugging, two windows are visible. One window shows the file being parsed, and the syntactic token being tested is highlighted. The second window shows the table being used (either in the BNF source, or the Lisp table) with the current rule highlighted. The cursor will sit on the specific match rule being tested against.
In the minibuffer, a brief summary of the current situation is listed. The first element is the syntactic token which is a list of the form:
(TYPE START . END)
The rest of the display is a list of all strings collected for the currently tested rule. Each time a new rule is entered, the list is restarted. Upon returning from a rule into a previous match list, the previous match list is restored, with the production of the dependent rule in the list.
Use C-g to stop debugging. There are no commands for any fancier types of debugging.
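The syntactic token summarized in the minibuffer, (TYPE START . END), is a dotted list, so its END lives in the cdr of the cdr rather than in a third element. A minimal sketch of taking such a token apart follows. The sample token and the `my-` helper names are hypothetical.

```elisp
;; A syntactic token has the form (TYPE START . END).
;; The sample below is made up: a symbol spanning positions 10 to 16.
(defvar my-lex-token '(symbol 10 . 16)
  "Hypothetical syntactic token used for illustration.")

(defun my-lex-token-type (lex)
  "Return the TYPE of the syntactic token LEX."
  (car lex))

(defun my-lex-token-start (lex)
  "Return the START position of the syntactic token LEX."
  (car (cdr lex)))

(defun my-lex-token-end (lex)
  "Return the END position of the syntactic token LEX."
  (cdr (cdr lex)))

(defun my-lex-token-length (lex)
  "Return the number of characters covered by LEX."
  (- (my-lex-token-end lex) (my-lex-token-start lex)))

(my-lex-token-type my-lex-token)    ; => symbol
(my-lex-token-length my-lex-token)  ; => 6
```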
Once a source file has been parsed, the following APIs can be used to write programs that use the token stream most effectively.
When writing programs that use the bovinator, the following functions are needed to get details out of a nonterminal.
| semantic-equivalent-tokens-p token1 token2 | Function |
Compare TOKEN1 and TOKEN2 and return non-nil if they are equivalent.
Use eq to test if two tokens are the same. Use this function if tokens
are being copied and regrouped, to test whether two tokens represent the same
thing but may be constructed of different cons cells.
|
| semantic-token-token token | Function |
Retrieve from TOKEN the token identifier.
i.e., the symbol 'variable, 'function, 'type, or other.
|
| semantic-token-name token | Function |
| Retrieve the name of TOKEN. |
| semantic-token-docstring token &optional buffer | Function |
| Retrieve the documentation of TOKEN. Optional argument BUFFER indicates where to get the text from. If not provided, then only the POSITION can be provided. |
| semantic-token-overlay token | Function |
| Retrieve the OVERLAY part of TOKEN. The returned item may be an overlay or an unloaded buffer representation. |
| semantic-token-extent token | Function |
| Retrieve the extent (START END) of TOKEN. |
| semantic-token-start token | Function |
| Retrieve the start location of TOKEN. |
| semantic-token-end token | Function |
| Retrieve the end location of TOKEN. |
| semantic-token-type token | Function |
| Retrieve the type of TOKEN. |
| semantic-token-put token property value | Function |
| On token, set property to value. |
| semantic-token-get token property | Function |
| For token get the value of property. |
| semantic-token-extra-spec token spec | Function |
| Retrieve a specifier for the variable TOKEN. SPEC is the symbol whose modifier value to get. This function can get specifiers from any type of TOKEN. Do not use this function if you know what type of token you are dereferencing. Instead, use the function specific to that token type. It will be faster. |
| semantic-token-type-parts token | Function |
| Retrieve the parts of the type TOKEN. |
| semantic-token-type-parent token | Function |
Retrieve the parent of the type TOKEN.
The return value is a list. A value of nil means no parents.
The car of the list is either the parent class, or a list
of parent classes. The cdr of the list is the list of
interfaces, or abstract classes which are parents of TOKEN.
|
| semantic-token-type-parent-superclass token | Function |
| Retrieve the parent super classes of the type TOKEN. |
| semantic-token-type-parent-implement token | Function |
| Retrieve the parent interfaces of the type TOKEN. |
| semantic-token-type-modifiers token | Function |
| Retrieve the type modifiers for the type TOKEN. |
| semantic-token-type-extra-specs token | Function |
| Retrieve the extra specifiers for the type TOKEN. |
| semantic-token-type-extra-spec token spec | Function |
| Retrieve an extra specifier for the type TOKEN. SPEC is the symbol whose modifier value to get. |
| semantic-token-function-args token | Function |
| Retrieve the arguments of the function TOKEN. |
| semantic-token-function-modifiers token | Function |
| Retrieve the type modifiers of the function TOKEN. |
| semantic-token-function-destructor token | Function |
Non-nil if TOKEN is a destructor function.
|
| semantic-token-function-extra-specs token | Function |
| Retrieve the extra specifiers of the function TOKEN. |
| semantic-token-function-extra-spec token spec | Function |
| Retrieve a specifier for the function TOKEN. SPEC is the symbol whose specifier value to get. |
| semantic-token-function-throws token | Function |
Retrieve the throws signal of the function TOKEN.
This is an optional field, and returns nil if it doesn't exist.
|
| semantic-token-function-parent token | Function |
| The parent of the function TOKEN. A function has a parent if it is a method of a class, and if the function does not appear in the body of its parent class. |
| semantic-token-variable-const token | Function |
| Retrieve the status of constantness from the variable TOKEN. |
| semantic-token-variable-default token | Function |
| Retrieve the default value of the variable TOKEN. |
| semantic-token-variable-modifiers token | Function |
| Retrieve type modifiers for the variable TOKEN. |
| semantic-token-variable-extra-specs token | Function |
| Retrieve extra specifiers for the variable TOKEN. |
| semantic-token-variable-extra-spec token spec | Function |
| Retrieve a specifier value for the variable TOKEN. SPEC is the symbol whose specifier value to get. |
| semantic-token-include-system token | Function |
| Retrieve the flag indicating if the include TOKEN is a system include. |
For override methods that query a token, see Token Details.
These functions take some key, and return information found inside the nonterminal stream. Some will return one token (the first matching item found). Others will return a list of all items matching a given criterion. All these functions work regardless of a buffer being in memory or not.
| semantic-find-nonterminal-by-name name streamorbuffer &optional search-parts search-include | Function |
Find a nonterminal NAME within STREAMORBUFFER. NAME is a string.
If SEARCH-PARTS is non-nil, search children of tokens.
If SEARCH-INCLUDE is non-nil, search include files.
|
| semantic-find-nonterminal-by-property property value streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals with PROPERTY equal to VALUE in STREAMORBUFFER. Properties can be added with semantic-token-put. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-by-extra-spec spec streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals with a given SPEC in STREAMORBUFFER. SPEC is a symbol key into the modifiers association list. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-by-extra-spec-value spec value streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals with a given SPEC equal to VALUE in STREAMORBUFFER. SPEC is a symbol key into the modifiers association list. VALUE is the value that SPEC should match. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-by-position position streamorbuffer &optional nomedian | Function |
Find a nonterminal covering POSITION within STREAMORBUFFER.
POSITION is a number, or marker. If NOMEDIAN is non-nil, don't do
the median calculation, and return nil.
|
| semantic-find-innermost-nonterminal-by-position position streamorbuffer &optional nomedian | Function |
Find a list of nonterminals covering POSITION within STREAMORBUFFER.
POSITION is a number, or marker. If NOMEDIAN is non-nil, don't do
the median calculation, and return nil.
This function will find the topmost item, and recurse until no more
details are available or findable.
|
| semantic-find-nonterminal-by-token token streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals with a token TOKEN within STREAMORBUFFER. TOKEN is a symbol representing the type of the tokens to find. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-standard streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals in STREAMORBUFFER which define simple token types. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-by-type type streamorbuffer &optional search-parts search-includes | Function |
| Find all nonterminals with type TYPE within STREAMORBUFFER. TYPE is a string which is the name of the type of the token returned. Optional argument SEARCH-PARTS and SEARCH-INCLUDES are passed to semantic-find-nonterminal-by-function. |
| semantic-find-nonterminal-by-function function streamorbuffer &optional search-parts search-includes | Function |
Find all nonterminals for which FUNCTION matches within STREAMORBUFFER.
FUNCTION must return non-nil if an element of STREAM will be included
in the new list.
If optional argument SEARCH-PARTS is non-nil, all sub-parts of tokens are searched.
If SEARCH-INCLUDES is non-nil, then all include files are also
searched for matches.
|
| semantic-find-nonterminal-by-function-first-match function streamorbuffer &optional search-parts search-includes | Function |
Find the first nonterminal that FUNCTION matches within STREAMORBUFFER.
FUNCTION must return non-nil if an element of STREAM will be included
in the new list.
If optional argument SEARCH-PARTS is non-nil, all sub-parts of tokens are searched.
The over-loadable function semantic-nonterminal-children is used for
searching.
If SEARCH-INCLUDES is non-nil, then all include files are also
searched for matches.
|
| semantic-recursive-find-nonterminal-by-name name buffer | Function |
| Recursively find the first occurrence of NAME. Start search with BUFFER. Recurse through all dependencies till found. The return item is of the form (BUFFER TOKEN) where BUFFER is the buffer in which TOKEN (the token found to match NAME) was found. |
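To make the difference between "first match" and "all matches" searches concrete, here is a minimal sketch that searches a hand-built stream using only plain list functions. The stream contents and the `my-` functions are hypothetical stand-ins. They ignore SEARCH-PARTS and SEARCH-INCLUDES, and real programs should call semantic-find-nonterminal-by-name and semantic-find-nonterminal-by-token.

```elisp
(defvar my-stream
  '(("main"  function "int" nil nil nil nil nil)
    ("count" variable "int" "0" nil nil nil nil)
    ("usage" function nil   nil nil nil nil nil))
  "Hypothetical token stream: two functions and one variable.")

(defun my-find-by-name (name stream)
  "Return the first token in STREAM whose NAME slot equals NAME, or nil.
Because NAME is the first element of every token, `assoc' is enough."
  (assoc name stream))

(defun my-find-by-token (type-symbol stream)
  "Return a list of all tokens in STREAM whose type-symbol is TYPE-SYMBOL."
  (let (result)
    (dolist (token stream)
      (when (eq (nth 1 token) type-symbol)
        (push token result)))
    (nreverse result)))               ; restore stream order

(car (my-find-by-name "count" my-stream))             ; => "count"
(mapcar #'car (my-find-by-token 'function my-stream)) ; => ("main" "usage")
```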
When you just want to get at a nonterminal the cursor is on, there is
a more efficient mechanism than using
semantic-find-nonterminal-by-position. This mechanism
directly queries the overlays the parsing step leaves in the buffer.
This provides for very rapid retrieval of what function or variable
the cursor is currently in.
These functions query the current buffer's overlay system for tokens.
| semantic-find-nonterminal-by-overlay &optional positionormarker buffer | Function |
Find all nonterminals covering POSITIONORMARKER by using overlays.
If POSITIONORMARKER is nil, use the current point.
Optional BUFFER is used if POSITIONORMARKER is a number, otherwise the current
buffer is used. This finds all tokens covering the specified position
by checking for all overlays covering the current spot. They are then sorted
from largest to smallest via the start location.
|
| semantic-find-nonterminal-by-overlay-in-region start end &optional buffer | Function |
| Find all nonterminals which exist in whole or in part between START and END. Uses overlays to determine position. Optional BUFFER argument specifies the buffer to use. |
| semantic-current-nonterminal | Function |
| Return the current nonterminal in the current buffer. If there is more than one in the same location, return the smallest token. |
| semantic-current-nonterminal-parent | Function |
Return the current nonterminal's parent in the current buffer.
A token's parent would be a containing structure, such as a type
containing a field. Return nil if there is no parent.
|
Sometimes it is important to reorganize a token stream into a form that is better for display to a user. When doing this, it is important not to use functions with side effects that could affect the token cache.
There are some existing utility functions which will reorganize the token list for you.
| semantic-bucketize tokens &optional parent filter | Function |
| Sort TOKENS into a group of buckets based on token type. Unknown types are placed in a Misc bucket. Type bucket names are defined by `semantic-symbol->name-assoc-list'. If PARENT is specified, then TOKENS belong to this PARENT in some way. This will use `semantic-symbol->name-assoc-list-for-type-parts' to generate bucket names. Optional argument FILTER is a filter function to be applied to each bucket. The filter function will take one argument, which is a list of tokens, and may re-organize the list with side-effects. |
| semantic-bucketize-token-token | Variable |
| Function used to get a symbol describing the class of a token. This function must take one argument of a semantic token. It should return a symbol found in `semantic-symbol->name-assoc-list' which semantic-bucketize uses to bin up tokens. To create new bins for an application augment `semantic-symbol->name-assoc-list', and `semantic-symbol->name-assoc-list-for-type-parts' in addition to setting this variable (locally in your function). |
| semantic-adopt-external-members tokens | Function |
|
Rebuild TOKENS so that externally defined members are regrouped.
Some languages such as C++ and CLOS permit the declaration of member
functions outside the definition of the class. It is easier to study
the structure of a program when such methods are grouped together
more logically.
This function uses semantic-nonterminal-external-member-p to determine when a potential child is an externally defined member. Note: Applications which use this function must account for token types which do not have a position, but have children which *do* have positions. Applications should use |
| semantic-orphaned-member-metaparent-type | Variable |
In semantic-adopt-external-members, the type of 'type for metaparents.
A metaparent is a made-up type semantic token used to hold the child list
of orphaned members of a named type.
|
| semantic-mark-external-member-function | Variable |
Function called when an externally defined orphan is found.
By default, the token is always marked with the adopted property.
This function should be locally bound by a program that needs
to add additional behaviors into the token list.
This function is called with one argument which is a shallow copy
of the token to be modified. This function should return the
token (or a copy of it) which is then integrated into the
revised token list.
|
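The idea of regrouping a token list without side effects on the original can be sketched as follows. This is a simplified, hypothetical stand-in for semantic-bucketize, not its implementation: it keys buckets by the raw type-symbol rather than by names taken from `semantic-symbol->name-assoc-list', and it has no Misc bucket, PARENT, or FILTER handling.

```elisp
(defun my-bucketize (tokens)
  "Group TOKENS by type-symbol without modifying TOKENS.
Return an alist of (TYPE-SYMBOL . TOKENS), with buckets in order of
first appearance and tokens in their original order."
  (let (buckets)
    (dolist (token tokens)
      (let ((bucket (assq (nth 1 token) buckets)))
        (if bucket
            ;; Only the freshly built bucket cell is changed here.
            (setcdr bucket (cons token (cdr bucket)))
          (push (list (nth 1 token) token) buckets))))
    ;; Buckets and their contents were built in reverse; restore order.
    (mapcar (lambda (bucket)
              (cons (car bucket) (reverse (cdr bucket))))
            (nreverse buckets))))

;; Hypothetical input: two functions and one variable.
(let ((tokens '(("main"  function "int" nil nil nil nil nil)
                ("count" variable "int" "0" nil nil nil nil)
                ("usage" function nil   nil nil nil nil nil))))
  (mapcar (lambda (bucket)
            (cons (car bucket) (mapcar #'car (cdr bucket))))
          (my-bucketize tokens)))
;; => ((function "main" "usage") (variable "count"))
```

The destructive setcdr only touches cons cells created inside the function, so the caller's token list, and therefore the token cache, is left untouched.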
These functions provide ways of reading the names of items in a buffer with completion.
| semantic-read-symbol prompt &optional default stream filter | Function |
| Read a symbol name from the user for the current buffer. PROMPT is the prompt to use. Optional arguments: DEFAULT is the default choice. If no default is given, one is read from under point. STREAM is the list of tokens to complete from. FILTER provides a filter on the types of things to complete. FILTER must be a function to call on each element. (See !!! |
| semantic-read-variable prompt &optional default stream | Function |
| Read a variable name from the user for the current buffer. PROMPT is the prompt to use. Optional arguments: DEFAULT is the default choice. If no default is given, one is read from under point. STREAM is the list of tokens to complete from. |
| semantic-read-function prompt &optional default stream | Function |
| Read a function name from the user for the current buffer. PROMPT is the prompt to use. Optional arguments: DEFAULT is the default choice. If no default is given, one is read from under point. STREAM is the list of tokens to complete from. |
| semantic-read-type prompt &optional default stream | Function |
| Read a type name from the user for the current buffer. PROMPT is the prompt to use. Optional arguments: DEFAULT is the default choice. If no default is given, one is read from under point. STREAM is the list of tokens to complete from. |
These functions are called `override methods' because they provide generic behaviors, which a given language can override. For example, finding a dependency file in Emacs Lisp can be done with the `locate-library' command (which overrides the default behavior). In C, a dependency can be found by searching a generic search path which can be passed in via a variable.
Any given token consists of meta information which is best viewed in some textual form. This could be as simple as the token's name, or as involved as a prototype to be added to a header file in C. Not only are there several default converters from a token into text, but there are also some convenient variables that can be used with them. Use these variables to allow options on output forms when displaying tokens in your programs.
| semantic-token->text-functions | Variable |
List of functions which convert a token to text.
Each function must take the parameters TOKEN &optional PARENT COLOR.
TOKEN is the token to convert.
PARENT is a parent token or name which refers to the structure
or class which contains TOKEN. PARENT is NOT a class which a TOKEN
would claim as a parent.
COLOR indicates that the generated text should be colored using
font-lock.
|
| semantic-token->text-custom-list | Variable |
A list used by customizable variables to choose a token to text function.
Use this variable in the :type field of a customizable variable.
|
Every token to text conversion function must take the same parameters, which are TOKEN, the token to be converted, PARENT, the containing parent (like a structure which contains a variable), and COLOR, which is a flag specifying that color should be applied to the returned string.
When creating, or using these strings, particularly with color, use concat to build up larger strings instead of format. This will preserve text properties.
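The effect of concat on text properties can be checked directly. The sketch below is an illustration only: the helper my-colored-summary is hypothetical, and the `bold' face stands in for whatever font-lock face a real token-to-text function would apply when COLOR is non-nil.

```elisp
(defun my-colored-summary (name type)
  "Return \"NAME : TYPE\" as a string, with NAME shown in the `bold' face.
The pieces are joined with `concat', which keeps the text properties
of each argument."
  (concat (propertize name 'face 'bold) " : " type))

(let ((text (my-colored-summary "counter" "int")))
  (list text
        (get-text-property 0 'face text)    ; inside "counter" => bold
        (get-text-property 8 'face text)))  ; on the ":"       => nil
```

Only the characters that came from the propertized name carry the face; the separator and the type text remain plain.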
| semantic-name-nonterminal token &optional parent color | Function |
| Return the name string describing TOKEN. The name is the shortest possible representation. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
| semantic-summarize-nonterminal token &optional parent color | Function |
| Summarize TOKEN in a reasonable way. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
| semantic-prototype-nonterminal token &optional parent color | Function |
| Return a prototype for TOKEN. This function should be overloaded, though it need not be used. This is because it can be used to create code by language independent tools. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
| semantic-prototype-file buffer | Function |
| Return a file in which prototypes belonging to BUFFER should be placed. Default behavior (if not overridden) looks for a token specifying the prototype file, or the existence of an EDE variable indicating which file prototypes belong in. |
| semantic-abbreviate-nonterminal token &optional parent color | Function |
| Return an abbreviated string describing TOKEN. The abbreviation is to be short, with possible symbols indicating the type of token, or other information. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
| semantic-concise-prototype-nonterminal token &optional parent color | Function |
| Return a concise prototype for TOKEN. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
| semantic-uml-abbreviate-nonterminal token &optional parent color | Function |
| Return a UML style abbreviation for TOKEN. Optional argument PARENT is the parent type if TOKEN is a detail. Optional argument COLOR means highlight the prototype with font-lock colors. |
These functions help derive information about tokens that may not be obvious for non-traditional languages with their own token types.
| semantic-nonterminal-children token &optional positionalonly | Function |
Return the list of top level children belonging to TOKEN.
Children are any sub-tokens which may contain overlays.
The default behavior (if not overridden with nonterminal-children)
is to return type parts for a type, and arguments for a function.
If optional argument POSITIONALONLY is non- If this function is overridden, use semantic-nonterminal-children-default to also include the default behavior, and merely extend your own. Note for language authors: If a mode defines a language that has tokens in it with overlays that should not be considered children, you should still return them with this function. If you do not, then token re-parsing, and database saving will fail. |
| semantic-nonterminal-external-member-parent token | Function |
|
Return a parent for TOKEN when TOKEN is an external member.
TOKEN is an external member if it is defined at a toplevel and
has some sort of label defining a parent. The parent return will
be a string.
The default behavior, if not overridden with
If this function is overridden, use semantic-nonterminal-external-member-parent-default to also include the default behavior, and merely extend your own. |
| semantic-nonterminal-external-member-p parent token | Function |
Return non-nil if PARENT is the parent of TOKEN.
TOKEN is an external member of PARENT when it is somehow tagged
as having PARENT as its parent.
The default behavior, if not overridden with
If this function is overridden, use
|
| semantic-nonterminal-external-member-children token &optional usedb | Function |
Return the list of children which are not *in* TOKEN.
If optional argument USEDB is non-nil, then also search files in
the Semantic Database. If USEDB is a list of databases, search those
databases.
Children in this case are functions or types which are members of TOKEN, such as the parts of a type, but which are not defined inside the class. C++ and CLOS both permit methods of a class to be defined outside the bounds of the class' definition. The default behavior, if not overridden with
If this function is overridden, use semantic-nonterminal-external-member-children-default to also include the default behavior, and merely extend your own. |
| semantic-nonterminal-protection token &optional parent | Function |
Return protection information about TOKEN with optional PARENT.
This function returns one of the following symbols:
nil - No special protection. Language dependent.
'public - Anyone can access this TOKEN.
'private - Only methods in the local scope can access TOKEN.
'friend - Like private, except some outer scopes are allowed
access to token.
Some languages may choose to provide additional return symbols specific
to themselves. Use of this function should allow for this.
The default behavior (if not overridden with |
| semantic-nonterminal-abstract token &optional parent | Function |
Return non-nil if TOKEN is abstract.
Optional PARENT is the parent token of TOKEN.
In UML, abstract methods and classes have special meaning and behavior
in how methods are overridden. In UML, abstract methods are italicized.
The default behavior (if not overridden with |
| semantic-nonterminal-leaf token &optional parent | Function |
Return non-nil if TOKEN is leaf.
Optional PARENT is the parent token of TOKEN.
In UML, leaf methods and classes have special meaning and behavior.
The default behavior (if not overridden with |
| semantic-nonterminal-static token &optional parent | Function |
Return non-nil if TOKEN is static.
Optional PARENT is the parent token of TOKEN.
In UML, static methods and attributes mean that they are allocated
in the parent class, and are not instance specific.
UML notation specifies that STATIC entries are underlined.
The default behavior (if not overridden with |
| semantic-find-dependency token | Function |
Find the filename represented from TOKEN.
TOKEN may be a stripped element, in which case PARENT specifies a
parent token that has positional information.
Depends on semantic-dependency-include-path for searching. Always searches
`.' first, then searches additional paths.
|
| semantic-find-nonterminal token &optional parent | Function |
| Find the location of TOKEN. TOKEN may be a stripped element, in which case PARENT specifies a parent token that has position information. Different behaviors are provided depending on the type of token. For example, dependencies (includes) will seek out the file that is depended on, and functions will move to the specified definition. |
| semantic-find-documentation token | Function |
| Find documentation from TOKEN and return it as a clean string. TOKEN might have DOCUMENTATION set in it already. If not, there may be some documentation in a comment preceding TOKEN's definition which we can look for. When appropriate, this can be overridden by a language specific enhancement. |
| semantic-up-context &optional point | Function |
Move point up one context from POINT.
Return non-nil if there are no more context levels.
Overloaded functions using up-context take no parameters.
|
| semantic-beginning-of-context &optional point | Function |
Move POINT to the beginning of the current context.
Return non-nil if there is no upper context.
The default behavior uses semantic-up-context. It can
be overridden with beginning-of-context.
|
| semantic-end-of-context &optional point | Function |
Move POINT to the end of the current context.
Return non-nil if there is no upper context.
By default, this uses semantic-up-context, and assumes parenthetical
block delimiters. This can be overridden with end-of-context.
|
| semantic-get-local-variables &optional point | Function |
Get the local variables based on POINT's context.
Local variables are returned in Semantic token format.
By default, this calculates the current bounds using context blocks
navigation, then uses the parser with bovine-inner-scope to
parse tokens at the beginning of the context.
This can be overridden with get-local-variables.
|
| semantic-get-local-arguments &optional point | Function |
Get arguments (variables) from the current context at POINT.
Parameters are available if the point is in a function or method.
This function returns a list of tokens. If the local token returns
just a list of strings, then this function will convert them to tokens.
Part of this behavior can be overridden with get-local-arguments.
|
| semantic-get-all-local-variables &optional point | Function |
Get all local variables for this context, and parent contexts.
Local variables are returned in Semantic token format.
By default, this gets local variables, and local arguments.
This can be overridden with get-all-local-variables.
Optional argument POINT is the location to start getting the variables from.
|
The next set of functions handles local context parsing. This means looking at the code (locally), navigating it, and fetching information such as the type of the parameter the cursor may be typing in.
| semantic-end-of-command | Function |
Move to the end of the current command.
By default, uses semantic-command-separation-character.
Override with end-of-command.
|
| semantic-beginning-of-command | Function |
Move to the beginning of the current command.
By default, uses semantic-command-separation-character.
Override with beginning-of-command.
|
| semantic-ctxt-current-symbol &optional point | Function |
Return the current symbol the cursor is on at POINT in a list.
This will include a list of type/field names when applicable.
This can be overridden using ctxt-current-symbol.
|
| semantic-ctxt-current-assignment &optional point | Function |
Return the current assignment near the cursor at POINT.
Return a list as per semantic-ctxt-current-symbol.
Return nil if there is nothing relevant.
Override with ctxt-current-assignment.
|
| semantic-ctxt-current-function &optional point | Function |
Return the current function the cursor is in at POINT.
The function returned is the one accepting the arguments that
the cursor is currently in.
This can be overridden with ctxt-current-function.
|
| semantic-ctxt-current-argument &optional point | Function |
Return the current symbol the cursor is on at POINT.
Override with ctxt-current-argument.
|
| semantic-ctxt-scoped-types &optional point | Function |
Return a list of type names currently in scope at POINT.
Override with ctxt-scoped-types.
|
For details on using these functions to get more detailed information about the current context: See Context Analysis.
If you write a program that uses the stream of tokens in a persistent display or database, it is necessary to know when tokens change so that your displays can be updated. This is especially important as tokens can be replaced, changed, or deleted, and the associated overlays will then throw errors when you try to use them. Complete integration with token changes can be achieved via several very important hooks.
One interesting way to interact with the parser is to let it know that changes you are going to make will not require re-parsing.
| semantic-edits-are-safe | Variable |
When non-nil, modifications do not require a reparse.
This prevents tokens from being marked dirty, and it
prevents top level edits from causing a cache check.
Use this when writing programs that could cause a full
reparse, but will not change the tag structure, such
as adding or updating top-level comments.
|
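As a minimal sketch of how this variable might be used, the function below binds semantic-edits-are-safe around an edit that only adds a top-level comment. The function name my-insert-header-comment and the ";; " comment prefix are hypothetical and chosen only for illustration.

```elisp
;; Declared here so that `let' binds it dynamically even when
;; semantic is not loaded; semantic itself defines the variable.
(defvar semantic-edits-are-safe)

(defun my-insert-header-comment (text)
  "Insert TEXT as a comment line at the top of the current buffer.
The edit is flagged as safe, so it should not force a reparse.
This is a hypothetical helper, not part of semantic."
  (let ((semantic-edits-are-safe t))
    (save-excursion
      (goto-char (point-min))
      ;; The ";; " prefix assumes a Lisp-style comment syntax.
      (insert ";; " text "\n"))))
```

The binding lasts only for the duration of the edit, so later user edits are tracked as usual.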
Next, it is sometimes useful to know what the current parsing state is. These functions can let you know what level of re-parsing may be needed. Careful choices on when to reparse can make your program much faster.
| semantic-bovine-toplevel-full-reparse-needed-p &optional checkcache | Function |
Return non-nil if the current buffer needs a full reparse.
Optional argument CHECKCACHE indicates if the cache check should be made.
|
| semantic-bovine-toplevel-partial-reparse-needed-p &optional checkcache | Function |
Return non-nil if the current buffer needs a partial reparse.
This only returns non-nil if semantic-bovine-toplevel-full-reparse-needed-p
returns nil.
Optional argument CHECKCACHE indicates if the cache check should be made
when checking semantic-bovine-toplevel-full-reparse-needed-p.
|
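The two predicates above could be combined as in the following sketch, which reports the level of re-parsing the current buffer needs. The function name my-reparse-level and the returned symbols are hypothetical; passing t for CHECKCACHE is one possible choice, not a requirement.

```elisp
(defun my-reparse-level ()
  "Return `full', `partial' or `none' for the current buffer.
Return `unknown' when the semantic predicates are not available.
This is a hypothetical helper, not part of semantic."
  (cond
   ((not (and (fboundp 'semantic-bovine-toplevel-full-reparse-needed-p)
              (fboundp 'semantic-bovine-toplevel-partial-reparse-needed-p)))
    'unknown)
   ;; Check the expensive case first; the partial predicate only
   ;; returns non-nil when a full reparse is not needed.
   ((semantic-bovine-toplevel-full-reparse-needed-p t) 'full)
   ((semantic-bovine-toplevel-partial-reparse-needed-p t) 'partial)
   (t 'none)))
```

A caller can use the result to skip expensive work when the answer is none.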
If you need very close interaction with the user's editing, then these two hooks can be used to find out when a given tag is being changed. These hooks could even be used to cut down on re-parsing if used correctly.
For all hooks, make sure you are careful to add it as a local hook if you only want to affect a single buffer. Setting it globally can cause unwanted effects if your program is concerned with a single buffer.
| semantic-dirty-token-hooks | Variable |
Hooks run after a token is marked as dirty (edited by the user).
The functions must take TOKEN, START, and END as parameters.
This hook will only be called once when a token is first made dirty;
subsequent edits will not cause this to run a second time unless that
token is first cleaned. Any token marked as dirty will
also be passed to semantic-clean-token-hooks, unless a full
reparse is done instead.
|
| semantic-clean-token-hooks | Variable |
Hooks run after a token is marked as clean (re-parsed after user edits.)
The functions must take a TOKEN as a parameter.
Any token sent to this hook will first have been passed to
semantic-dirty-token-hooks. This hook is not called for tokens
marked dirty if the buffer is completely re-parsed. In that case, use
semantic-after-toplevel-cache-change-hook.
|
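The following sketch shows one way a program might register buffer-local functions on the two hooks above. The variable my-token-log and the three my- functions are hypothetical names; only the hook names and their argument lists come from the descriptions above.

```elisp
(defvar my-token-log nil
  "Hypothetical log of token events, newest first.")

(defun my-token-dirty (token start end)
  "Record that TOKEN was edited between START and END."
  (push (list 'dirty token start end) my-token-log))

(defun my-token-clean (token)
  "Record that TOKEN was re-parsed after an edit."
  (push (list 'clean token) my-token-log))

(defun my-track-tokens-in-buffer ()
  "Watch token changes in the current buffer only."
  ;; The final t makes each hook buffer-local, so other buffers
  ;; are not affected.
  (add-hook 'semantic-dirty-token-hooks #'my-token-dirty nil t)
  (add-hook 'semantic-clean-token-hooks #'my-token-clean nil t))
```

Calling my-track-tokens-in-buffer from a mode hook would limit the tracking to buffers of that mode.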
| semantic-change-hooks | Variable |
Hooks run when semantic detects a change in a buffer.
Each hook function must take three arguments, identical to the
common hook after-change-functions.
|
Lastly, if you just want to know when a buffer changes, use this hook.
| semantic-after-toplevel-bovinate-hook | Variable |
|
Hooks run after a toplevel token parse.
It is not run if the toplevel parse command is called, and buffer does
not need to be fully re-parsed.
This function is also called when the toplevel cache is flushed, and
the cache is emptied.
For language specific hooks, make sure you define this as a local hook.
This hook should not be used any more.
Use |
| semantic-after-toplevel-cache-change-hook | Variable |
|
Hooks run after the buffer token list has changed.
This list will change when a buffer is re-parsed, or when the token
list in a buffer is cleared. It is *NOT* called if the current token
list is only partially re-parsed.
Hook functions must take one argument, which is the new list of tokens associated with this buffer. For language specific hooks, make sure you define this as a local hook. |
| semantic-after-partial-cache-change-hook | Variable |
|
Hooks run after the buffer token list has been updated.
This list will change when the current token list has been partially
re-parsed.
Hook functions must take one argument, which is the list of tokens updated among the ones associated with this buffer. For language specific hooks, make sure you define this as a local hook. |
| semantic-before-toplevel-cache-flush-hook | Variable |
Hooks run before the toplevel nonterminal cache is flushed.
For language specific hooks, make sure you define this as a local hook.
This hook is called before a corresponding
semantic-after-toplevel-cache-change-hook which is also called
during a flush when the cache is given a new value of nil.
|
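As an illustration of the cache change hooks, the sketch below keeps a list of displayed token names up to date. It rebuilds the list after a full reparse and only adds missing names after a partial one. The names my-display-names, my-rebuild-display, my-refresh-display and my-install-display-hooks are hypothetical, and the code assumes that the name of a token is the first element of its list.

```elisp
(defvar-local my-display-names nil
  "Hypothetical display state: names of the tokens currently shown.")

(defun my-rebuild-display (tokens)
  "Rebuild the display from TOKENS, the complete new token list.
TOKENS is nil when the cache has been flushed."
  ;; Assumption: the car of each token is its name.
  (setq my-display-names (mapcar #'car tokens)))

(defun my-refresh-display (updated)
  "Add the names of UPDATED tokens that are not shown yet."
  (dolist (token updated)
    (unless (member (car token) my-display-names)
      (push (car token) my-display-names))))

(defun my-install-display-hooks ()
  "Install both display functions as buffer-local hooks."
  (add-hook 'semantic-after-toplevel-cache-change-hook
            #'my-rebuild-display nil t)
  (add-hook 'semantic-after-partial-cache-change-hook
            #'my-refresh-display nil t))
```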
Here are some simple examples that use different aspects of the
semantic library APIs. For fully functional example programs with
lots of comments, see the file semantic-examples.el.
If you need a command that asks the user for a token name, you can get full range completion using the query functions (see Nonterminal Completion).
(interactive (list (semantic-read-symbol "Symbol: ")))
If you have the name of a function or variable, and need to find its location in a buffer, you need a search function. There is a wide range of searches you can perform (see Nonterminal Streams).
(semantic-find-nonterminal-by-name
 "some-name"
 (current-buffer)
 t    ;; look inside structures and classes for these symbols
 nil) ;; do not look inside header files.
If you have the name of a function or variable, and need to find its location somewhere in a project, you need to use the Semantic Database (see semanticdb). There are many search functions similar to the ones found in Nonterminal Streams.
The Semantic Database is interesting in that the return structure is not
If you have a nonterminal token, or a list of them, you may want to find their position in a buffer.
(semanticdb-find-nonterminal-by-name
 "symbol"
 nil ;; Defaults to the current project's database list.
 t   ;; Search inside types
 nil ;; Do not search include files
 nil ;; Only search files in the same mode (all C files)
 t   ;; When a token is found, make sure it is loaded in a buffer.
 )
Note that semanticdb can find symbols in files that are not loaded into an Emacs buffer. These tokens do not have an associated overlay, and the function semantic-token-buffer will fail.
The last parameter tells the search function to find-file-noselect any file in which a matching token was found. This will allow you to merge all the tokens into a completion list, or other flat list needed by most functions that use association lists.
If you do not ask semanticdb to load those files, you will need to
explicitly request the database object (found in the car of
each sublist) get the file loaded. It is useful to not auto find all
files if you don't need to jump to that token.
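Merging a database search result into a flat association list could look like the following sketch. It assumes the result has the shape ((DATABASE TOKEN1 TOKEN2 ...) ...) and that the name of a token is the first element of its list. The function name my-semanticdb-results-to-alist is hypothetical.

```elisp
(defun my-semanticdb-results-to-alist (results)
  "Flatten RESULTS into an alist of (NAME . TOKEN) pairs.
RESULTS is a list of (DATABASE TOKEN...) sublists, as returned
by the semanticdb search functions.  The database objects are
dropped.  This is a hypothetical helper, not part of semantic."
  (let (alist)
    (dolist (entry results)
      ;; The car of ENTRY is the database; the cdr holds the tokens.
      (dolist (token (cdr entry))
        ;; Assumption: the car of each token is its name.
        (push (cons (car token) token) alist)))
    (nreverse alist)))
```

The resulting list keeps the order of the search result and can be handed to functions that expect an association list, such as completing-read.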
A nonterminal token is a rather unpleasant Lisp structure to decipher. As such, there is a wide range of functions available that can convert a token into a human-readable, and colorful, string (see Token->Text).
If your program interfaces with lots of users, you will probably want to have it define a configurable variable that will let users change the visible portion of your program.
(defcustom my-summary-function 'semantic-uml-prototype-nonterminal
  "*Function to use when showing info about my token."
  :group 'my-program
  :type semantic-token->text-custom-list)
Note the special type provided by semantic.
Next, you can call this function to create a string.
(funcall my-summary-function token
token-parent
t ; use color
)
In this case, token-parent is an optional argument. In many cases, parent is not used by the outputting function. The parent may be a struct or class that contains token, or nil for top-level definitions. In particular, C++ needs the parent to correctly calculate the protection of each method.
This chapter deals with how to derive the current context, and also how a language maintainer can get the current context API to work with their language.
By default, the behavior will function in C like languages. This means languages with parenthetical blocks, and type dereferencing which uses a similar form.
Source code is typically built up of control structures, and blocks of context, or lexical scope. Semantic terms these lexical scopes as a "context". The following functions can be used to navigate contexts. Some of them are override functions. Language authors can override a subset of them to make them work for their language.
| semantic-up-context &optional point | Function |
Move point up one context from POINT.
Return non-nil if there are no more context levels.
Overloaded functions using up-context take no parameters.
|
| semantic-beginning-of-context &optional point | Function |
Move POINT to the beginning of the current context.
Return non-nil if there is no upper context.
The default behavior uses semantic-up-context. It can
be overridden with beginning-of-context.
|
| semantic-end-of-context &optional point | Function |
Move POINT to the end of the current context.
Return non-nil if there is no upper context.
By default, this uses semantic-up-context, and assumes parenthetical
block delimiters. This can be overridden with end-of-context.
|
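The navigation functions above can be combined in small utilities. The sketch below counts how many contexts enclose point by calling semantic-up-context until it reports that no outer context remains. The function name my-context-depth is hypothetical.

```elisp
(defun my-context-depth ()
  "Return the number of contexts that enclose point.
Return nil when `semantic-up-context' is not available.
This is a hypothetical helper, not part of semantic."
  (when (fboundp 'semantic-up-context)
    (save-excursion
      (let ((depth 0))
        ;; `semantic-up-context' returns non-nil once there are
        ;; no more context levels, which ends the loop.
        (while (not (semantic-up-context))
          (setq depth (1+ depth)))
        depth))))
```

Because the loop runs inside save-excursion, point is left where it was.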
The next set of functions can be used to navigate across commands.
| semantic-end-of-command | Function |
Move to the end of the current command.
By default, uses semantic-command-separation-character.
Override with end-of-command.
|
| semantic-beginning-of-command | Function |
Move to the beginning of the current command.
By default, uses semantic-command-separation-character.
Override with beginning-of-command.
|
Within a given context, or block of code, local variables are often defined. These functions can be used to retrieve lists of locally scoped variables.
| semantic-get-local-variables &optional point | Function |
Get the local variables based on POINT's context.
Local variables are returned in Semantic token format.
By default, this calculates the current bounds using context blocks
navigation, then uses the parser with bovine-inner-scope to
parse tokens at the beginning of the context.
This can be overridden with get-local-variables.
|
| semantic-get-local-arguments &optional point | Function |
Get arguments (variables) from the current context at POINT.
Parameters are available if the point is in a function or method.
This function returns a list of tokens. If the local token returns
just a list of strings, then this function will convert them to tokens.
Part of this behavior can be overridden with get-local-arguments.
|
| semantic-get-all-local-variables &optional point | Function |
Get all local variables for this context, and parent contexts.
Local variables are returned in Semantic token format.
By default, this gets local variables, and local arguments.
This can be overridden with get-all-local-variables.
Optional argument POINT is the location to start getting the variables from.
|
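A program that only needs the names of the variables in scope could use a sketch like the one below. It assumes that the name of a token is the first element of its list. The function names my-token-names and my-local-variable-names are hypothetical.

```elisp
(defun my-token-names (tokens)
  "Return the names of TOKENS as a list of strings.
Assumption: the car of each token is its name."
  (mapcar #'car tokens))

(defun my-local-variable-names (&optional point)
  "Return the names of all local variables and arguments at POINT.
Return nil when `semantic-get-all-local-variables' is not available.
This is a hypothetical helper, not part of semantic."
  (when (fboundp 'semantic-get-all-local-variables)
    (my-token-names (semantic-get-all-local-variables point))))
```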
While a context has already been used to describe blocks of code, other contexts include more local details, such as the symbol the cursor is on, or the fact that we are assigning into some other variable.
These context deriving functions can be overridden to provide language specific behavior. By default, they assume a C-like language.
| semantic-ctxt-current-symbol &optional point | Function |
Return the current symbol the cursor is on at POINT in a list.
This will include a list of type/field names when applicable.
This can be overridden using ctxt-current-symbol.
|
| semantic-ctxt-current-assignment &optional point | Function |
Return the current assignment near the cursor at POINT.
Return a list as per semantic-ctxt-current-symbol.
Return nil if there is nothing relevant.
Override with ctxt-current-assignment.
|
| semantic-ctxt-current-function &optional point | Function |
| Return the current function the cursor is in at POINT. The function returned is the one accepting the arguments that the cursor is currently in. This can be overridden with ctxt-current-function. |
| semantic-ctxt-current-argument &optional point | Function |
Return the current symbol the cursor is on at POINT.
Override with ctxt-current-argument.
|
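As a small illustration, the sketch below shows the symbol under the cursor in the echo area. It assumes that semantic-ctxt-current-symbol returns a list of strings, such as a variable name followed by a field name. The function names my-format-symbol-path and my-describe-symbol-at-point, and the use of a dot as separator, are hypothetical.

```elisp
(defun my-format-symbol-path (names)
  "Join NAMES, a list of strings, with dots.
For example, the list (\"obj\" \"field\") becomes \"obj.field\"."
  (mapconcat #'identity names "."))

(defun my-describe-symbol-at-point ()
  "Show the symbol the cursor is on in the echo area.
This is a hypothetical command, not part of semantic."
  (interactive)
  (if (fboundp 'semantic-ctxt-current-symbol)
      ;; Assumption: the return value is a list of strings.
      (message "Symbol: %s"
               (my-format-symbol-path (semantic-ctxt-current-symbol)))
    (message "semantic-ctxt-current-symbol is not available")))
```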
The context parsing API is used in a context analysis library. This library provides high level routines for scanning through token databases to create lists of token associates. At its core is a set of EIEIO classes defining a context. The context contains information about what was parsed at a given position, such as the strings there, and the type of assignment. The analysis library then searches the databases to determine the types and names available.
Two high level functions which can be run interactively are:
| semantic-analyze-current-context position | Command |
Analyze the current context at POSITION.
If called interactively, display interesting information about POSITION
in a separate buffer.
Returns an object based on symbol semantic-analyze-context.
|
| semantic-analyze-possible-completions point | Command |
| Return a list of semantic tokens which are possible completions. Analysis is done at POINT. |
Several tools come with Semantic which would not be possible without it. In general, these tools will work with any language supported by Semantic.
Speedbar supports the display of tags through the Semantic parser. To
use this utility, add a line like this to your .emacs file:
(add-hook 'speedbar-load-hook (lambda () (require 'semantic-sb)))
or you can simply add:
(require 'semantic-sb)
Once installed, speedbar will use semantic to find tokens, and will display them appropriately. Tags from semantic will have additional details which can be seen, such as return type, or arguments to functions.
If you use semantic-load.el, you do not need to add the above
lines in your .emacs file.
Two additional speedbar modes are described in Speedbar Analysis, and class browser.
There is special support for creating Imenu entries using semantic. This is a highly customizable tool which can create specialized menu systems for navigating your source file.
By default, each language that wants special imenu support will set
itself up for it. To setup imenu for your buffers, use this command
in your .emacs file:
(add-hook 'semantic-init-hooks (lambda ()
(imenu-add-to-menubar "TOKENS")))
Also supported is which-func-mode. This usually uses imenu tags to show the current function. The semantic support for this function uses overlays, which is much faster.
If you use semantic-load.el, you do not need to add the above
lines in your .emacs file.
You can customize imenu with the following options:
| semantic-imenu-summary-function | Option |
| Function to use when creating items in Imenu. Some useful functions are: semantic-abbreviate-nonterminal semantic-summarize-nonterminal semantic-prototype-nonterminal |
| semantic-imenu-bucketize-file | Option |
Non-nil if tokens in a file are to be grouped into buckets.
|
| semantic-imenu-buckets-to-submenu | Option |
Non-nil if buckets of tokens are to be turned into submenus.
This option is ignored if semantic-imenu-bucketize-file is nil.
|
| semantic-imenu-expand-type-parts | Option |
Non-nil if types should have submenus containing their parts.
|
| semantic-imenu-bucketize-type-parts | Option |
Non-nil if elements of a type should be grouped into buckets.
Nil means to keep them in the same order.
Overridden to nil if semantic-imenu-bucketize-file is nil.
|
| semantic-imenu-sort-bucket-function | Option |
| Function to use when sorting tags in the buckets of functions. |
| semantic-imenu-index-directory | Option |
Non-nil to index the entire directory for tags.
Doesn't actually parse the entire directory, but displays tags for all files
currently listed in the current Semantic database.
This variable has no meaning if semanticdb is not active.
|
| semantic-imenu-auto-rebuild-directory-indexes | Option |
If non-nil automatically rebuild directory index imenus.
That is when a directory index imenu is updated, automatically rebuild
other buffer local ones based on the same semanticdb.
|
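A possible combination of these options for an init file is sketched below. The chosen values are only an example and are not the defaults.

```elisp
;; Example values only; adjust to taste.
(setq semantic-imenu-summary-function 'semantic-abbreviate-nonterminal
      semantic-imenu-bucketize-file t          ; group tokens into buckets
      semantic-imenu-buckets-to-submenu t      ; one submenu per bucket
      semantic-imenu-expand-type-parts t       ; types get submenus of parts
      semantic-imenu-bucketize-type-parts nil) ; keep parts in file order
```

Since semantic-imenu-buckets-to-submenu is ignored when semantic-imenu-bucketize-file is nil, the two are best set together.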
When adding support to a language, this variable may need to be set:
| semantic-imenu-expandable-token | Variable |
Tokens of this token type will be given a submenu with children.
By default, a type has interesting children. In Texinfo, however,
a section has interesting children.
|
Semanticdb is a utility which tracks your parsed files, and saves the parsed information to files. When you reload your source files, semanticdb automatically associates the file with the cached copy, saving time by not re-parsing your buffer.
Semanticdb also provides an API for programs to use. These functions will return token information without loading the source file into memory by checking the disk cache.
To use semanticdb, add the following to your .emacs file:
(require 'semanticdb)
(global-semanticdb-minor-mode 1)
If you have a tool which optionally uses the semantic database, it may be important to track if the database mode is turned on or off.
| semanticdb-mode-hooks | Option |
| Hooks run whenever global-semanticdb-minor-mode is run. Use semanticdb-minor-mode-p to determine if the mode has been turned on or off. |
| semanticdb-persistent-path | Option |
List of valid paths that semanticdb will cache tokens to.
When global-semanticdb-minor-mode is active, token lists will
be saved to disk when Emacs exits. Not all directories will have
tokens that should be saved.
The value should be a list of valid paths. A path can be a string,
indicating a directory in which to save a variable. An element in the
list can also be a symbol. Valid symbols are never, which will
disable any saving anywhere, always, which enables saving
everywhere, or project, which enables saving in any directory that
passes a list of predicates in semantic-project-predicates.
|
| semanticdb-project-roots | Option |
List of directories, where each directory is the root of some project.
All subdirectories of a root project are considered a part of one project.
Values in this list can be overridden by project management programs
via the semanticdb-project-root-functions variable.
|
The important difference between these two is that you may put just
"~" in semanticdb-persistent-path, but you may put individual
project directories into semanticdb-project-roots so that
different database lists don't get cross referenced incorrectly.
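The difference could be expressed in an init file as in the following sketch. The two project directories are hypothetical placeholders.

```elisp
;; Cache token lists for files under the home directory.
(setq semanticdb-persistent-path '("~"))

;; Keep each project separate so that searches in one project do
;; not cross reference the other.  Both directories are hypothetical.
(setq semanticdb-project-roots '("~/src/project-a" "~/src/project-b"))
```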
You can search for tokens in the database using the following functions. It is important to note that database search functions do not return a plain list of tokens. This is because some tokens may not be loaded in a buffer, which means that the found token would not have an overlay, and no way to determine where it came from.
As such, all search functions return a list of the form:
( (DATABASE TOKEN1 TOKEN2 ...) (DATABASE2 TOKEN3 TOKEN4 ...) ...)
| semanticdb-find-nonterminal-by-function function &optional databases search-parts search-includes diff-mode find-file-match | Function |
Find all occurrences of nonterminals which match FUNCTION.
Search in all DATABASES. If DATABASES is nil, search a range of
associated databases.
When SEARCH-PARTS is non-nil the search will include children of tokens.
When SEARCH-INCLUDES is non-nil, the search will include dependency files.
When DIFF-MODE is non-nil, search databases which are of a different mode.
A Mode is the major-mode that file was in when it was last parsed.
When FIND-FILE-MATCH is non-nil, then make sure any found token's file is
in an Emacs buffer.
|
| semanticdb-find-nonterminal-by-name name &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all occurrences of nonterminals with name NAME in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. |
| semanticdb-find-nonterminal-by-name-regexp regex &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all occurrences of nonterminals with name matching REGEX in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. |
| semanticdb-find-nonterminal-by-type type &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all nonterminals with a type of TYPE in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. |
| semanticdb-find-nonterminal-by-property property value &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all nonterminals with a PROPERTY equal to VALUE in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. Return a list ((DB-TABLE . TOKEN-LIST) ...). |
| semanticdb-find-nonterminal-by-extra-spec spec &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all nonterminals with a SPEC in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. Return a list ((DB-TABLE . TOKEN-LIST) ...). |
| semanticdb-find-nonterminal-by-extra-spec-value spec value &optional databases search-parts search-includes diff-mode find-file-match | Function |
| Find all nonterminals with a SPEC equal to VALUE in databases. See semanticdb-find-nonterminal-by-function for details on DATABASES, SEARCH-PARTS, SEARCH-INCLUDES, DIFF-MODE, and FIND-FILE-MATCH. Return a list ((DB-TABLE . TOKEN-LIST) ...). |
| semanticdb-file-stream file | Function |
| Return a list of tokens belonging to FILE. If file has database tokens available in the database, return them. If file does not have tokens available, then load the file, and create them. |
Senator stands for SEmantic NAvigaTOR and was written by David Ponce.
This library defines commands and a minor mode to navigate between semantic language tokens in the current buffer.
The following user level commands are provided by Senator.
| senator-next-token | Command |
| Move to the next token in the current buffer. |
| senator-previous-token | Command |
| Move to the previous token in the current buffer. |
| senator-jump sym | Command |
| Jump to the semantic symbol SYM. If called interactively and a prefix argument is supplied jump in the local type's context (see function senator-current-type-context). |
Searching using senator mode restricts the search only to the definition text, such as the name of the functions or variables in a given buffer.
| senator-isearch-toggle-semantic-mode | Command |
| Toggles semantic search in isearch mode. When semantic search is enabled, isearch is restricted to token names. |
| senator-search-forward string | Command |
| senator-search-backward string | Command |
| Search forward and backward for a token matching string. |
| re-search-forward regex | Command |
| re-search-backward regex | Command |
| Search forward and backward for a token matching the regular expression regex. |
| word-search-forward word | Command |
| word-search-backward word | Command |
| Search forward and backward for a token whose name matches word. |
Completion in senator scans all known definitions in the local file, and uses that information to provide the completion.
| senator-complete-symbol | Command |
| Complete the current symbol under point. |
| senator-completion-menu-keyboard-popup | Command |
| Popup a completion menu for the symbol at point. |
Token Copy/Paste is a high level form of the typical copy yank used by Emacs. Copying a token saves the meta-information related to the function or item the cursor is currently in. When that information is yanked into a new buffer, the form of the text created is based on the current status of the programming buffer.
For example, pasting a function into a different file results in a function call template being inserted. In a Texinfo file, a @deffn is created with documentation for that function or command.
| senator-copy-token | Command |
| Take the current token, and place it in the token ring. |
| senator-kill-token | Command |
Take the current token, place it in the token ring, and kill it.
Killing the token removes the text for that token, and places it into
the kill ring. Retrieve that text with yank.
|
| senator-yank-token | Command |
| Yank a token from the token ring. The form the token takes is different depending on where it is being yanked to. |
| senator-copy-token-to-register register &optional kill-flag | Command |
| Copy the current token into REGISTER. Optional argument KILL-FLAG will delete the text of the token to the kill ring. |
For programmers, to provide specialized pasting, create an override
function for insert-foreign-token (see Settings).
| senator-minor-mode | Command |
|
Toggle the SEmantic NAvigaTOR key bindings in the current buffer.
The following default key bindings are provided when semantic minor mode is enabled:
|
To enable the Senator keymap in all modes that support semantic parsing, use this:
(add-hook 'semantic-init-hooks 'senator-minor-mode)
To customize navigation around different types of tokens, use the following variables:
| senator-step-at-token-ids | Option |
List of token identifiers where to step.
Token identifier is symbol 'variable, 'function, 'type, or other. If
nil navigation steps at any token found. This is a buffer local
variable. It can be set in a mode hook to get a specific language
navigation.
|
| senator-step-at-start-end-token-ids | Option |
List of token identifiers where to step at start and end.
Token identifier is symbol 'variable, 'function, 'type, or other. If
nil, navigation only steps at the beginning of tokens. If t, step at start
and end of any token where it is allowed to step. Also, stepping at
start and end of a token prevents stepping inside its children. This
is a buffer local variable. It can be set in a mode hook to get a
specific language navigation.
|
To have a mode specific customization, do something like this in a hook:
(add-hook 'mode-hook
(lambda ()
(setq senator-step-at-token-ids '(function variable))
(setq senator-step-at-start-end-token-ids '(function))
))
This will cause navigation and search commands to stop only between functions and variables, and to step at start and end of functions only.
Any comments, suggestions, bug reports or upgrade requests are welcome. Please send them to David Ponce at david@dponce.com
The semantic analyzer is a library tool that performs context analysis and can derive useful information.
| semantic-analyze-current-context position | Command |
| Analyze the current context at POSITION. If called interactively, display interesting information about POSITION in a separate buffer. Returns an object based on symbol semantic-analyze-context. |
While this can be used as a command, it is mostly useful that way
while debugging the analyzer, or tools using the return value. Use
the Emacs command describe-class to learn more about using
semantic-analyze-context.
Another command that uses the analyzer context derives a completion list.
| semantic-analyze-possible-completions context | Command |
|
Return a list of semantic tokens which are possible completions.
CONTEXT is either a position (such as point), or a pre-calculated
context. Passing in a context is useful if the caller also needs
to access parts of the analysis.
Completions run through the following filters:
Context type matching can identify the following:
When called interactively, this function displays the list of possible completions. This is useful for debugging. |
The file semantic-ia.el contains two commands for performing
smart completion using the analysis library. The analysis that calculates
these completions is done through the analyzer and completion
mechanism. These functions just provide commands that can be bound
to key bindings.
| semantic-ia-complete-symbol point | Command |
| Complete the current symbol at POINT. Completion options are calculated with semantic-analyze-possible-completions. |
| semantic-ia-complete-symbol-menu point | Command |
| Complete the current symbol via a menu based at POINT. Completion options are calculated with semantic-analyze-possible-completions. |
The Analyzer output can be used through a speedbar interface. This interface lists details about the analysis, such as the current function, local arguments and variables, details on the prefix (the symbol the cursor is on), and a list of all possible completions. Completions are specified in semantic-analyze-possible-completions analyzer.
Each entry can be jumped to by clicking on the name. For strongly typed languages, this means you will jump to the definition of the variable, slot, or type definition.
In addition each entry has an <i> button. Clicking on this will display a summary of everything that is known about the variable or type displayed on that line.
If you click on the name of a variable in the "Completions" menu, then the text that was recently analyzed will be replaced with the name of the token that was clicked on in speedbar.
| semantic-speedbar-analysis | Command |
| Start Speedbar in semantic analysis mode. The analyzer displays information about the current context, plus a smart list of possible completions. |
You can also enter speedbar analyzer mode by selecting "Analyze" from the "Display" menu item on speedbar's menu.
The semantic class browser is a library that can convert a project group of files into an EIEIO based structure that contains links between structures so that the inheritance links between semantic tokens can be easily navigated.
The core to this library is one function in semantic-cb.el.
| semantic-cb-new-class-browser | Function |
| Create an object representing this project's organization. The object returned is of type semantic-cb-project, which contains the slot `:types', a list of all top-level types. Each element is a class of type semantic-cb-token, or semantic-cb-type. |
Use the Emacs function `describe-class' to learn more about these classes.
You can access the class inheritance structure through a speedbar interface. You can choose the "Class Browser" option from Speedbar's "Display" menu item, or use the following command:
| semantic-cb-speedbar-mode | Command |
| Bring speedbar up, and put it into Class Browser mode. This will use the Class Browser logic applied to the current Semantic project database to build the available relations. The structure of the class hierarchy can then be navigated using traditional speedbar interactions. |
The document program uses semantic token streams to aid in the
creation of texinfo documentation.
For example, the following is a code fragment from document.el
that comes with semantic:
(defun document (&optional resetfile)
  "Document the function or variable the cursor is in.
Optional argument RESETFILE is provided w/ universal argument.
When non-nil, query for a new documentation file."
  ...
  )
While visiting document.el, put the cursor somewhere within the
function shown above. Then type M-x document.
After asking for the texinfo file name, which in this case is
semantic.texi, this will update the texinfo
documentation of the document function in that file.
The result is that the following texinfo text will be either created
or updated in the semantic.texi file:
@deffn Command document &optional resetfile
Document the function or variable the cursor is in.
Optional argument @var{RESETFILE} is provided w/ universal argument.
When non-@code{nil}, query for a new documentation file.
@end deffn
Note that the function name, arguments, and documentation string
are put in the right place.
Within the doc-string, the function arguments are marked with
the @var command and the nil code fragment is marked with
the @code command.
This example provides just a glimpse of what is possible with the
syntactic information provided by semantic.
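The markup step described above can be imitated with a few lines of Emacs Lisp. The function below is a hypothetical sketch, not the code used by document.el: its name and its regexp-based approach are assumptions made for illustration. It wraps each upper-cased argument name in @var and each standalone nil in @code.

```elisp
(defun my-sketch-docstring-to-texinfo (docstring args)
  "Return DOCSTRING with texinfo markup added.
Each name in ARGS (a list of lowercase strings) is marked with @var
where it appears in upper case, and each standalone nil with @code.
Illustrative sketch only; this is not how document.el is implemented."
  (let ((case-fold-search nil)          ; match upper-case names only
        (text docstring))
    (dolist (arg args)
      (setq text (replace-regexp-in-string
                  (concat "\\b" (regexp-quote (upcase arg)) "\\b")
                  "@var{\\&}"           ; \& stands for the matched text
                  text t)))
    (replace-regexp-in-string "\\bnil\\b" "@code{nil}" text t t)))

;; Usage:
;; (my-sketch-docstring-to-texinfo
;;  "Optional argument RESETFILE is provided w/ universal argument."
;;  '("resetfile"))
;; => "Optional argument @var{RESETFILE} is provided w/ universal argument."
```

The real document command gets the argument list from the semantic token for the function, so it does not have to guess which words are arguments.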
The main entry points for the documentation generator are the following commands:
| document &optional resetfile | Command |
| Document the function or variable the cursor is in. Optional argument RESETFILE is provided w/ universal argument. When non-nil, query for a new documentation file. |
| document-inline | Command |
| Document the current function with an inline comment. |
| document-insert-defun-comment nonterm buffer | Command |
| Insert mode-comment documentation about NONTERM from BUFFER. |
| document-insert-new-file-header header | Command |
| Insert a new header file into this buffer. Add reference to HEADER. Used by prototype if this file doesn't have an introductory comment. |
In addition to these base documentation commands, the texinfo semantic parser includes two convenience functions for working directly with texinfo files.
| semantic-texi-update-doc &optional token | Command |
| Update the documentation for TOKEN. If the current buffer is a texinfo file, then find the source doc, and update it. If the current buffer is a source file, then get the documentation for this item, find the existing doc in the associated manual, and update that. |
| semantic-texi-goto-source &optional token | Command |
| Jump to the source for the definition in the texinfo file TOKEN. If TOKEN is nil, it is derived from the deffn under POINT. |
The following commands draw charts of statistics generated from parsing:
| semantic-chart-nonterminals-by-token &optional buffer-or-stream | Command |
| Create a bar chart representing the number of nonterminals for a token. Each bar represents how many toplevel nonterminals in BUFFER-OR-STREAM exist with a given token type. See `semantic-symbol->name-assoc-list' for tokens which will be charted. |
| semantic-chart-database-size &optional buffer-or-stream | Command |
| Create a bar chart representing the size of each file in semanticdb. Each bar represents how many toplevel nonterminals in BUFFER-OR-STREAM exist in each database entry. |
| semantic-chart-nonterminal-complexity-token &optional symbol buffer-or-stream | Command |
| Create a bar chart representing the complexity of some tokens. Complexity is calculated for tokens with a token of SYMBOL. Each bar represents the complexity of some nonterminal in BUFFER-OR-STREAM. Only the most complex items are charted. |
| semantic-show-dirty-mode &optional arg | Command |
| Minor mode for highlighting dirty tokens. With prefix argument ARG, turn on if positive, otherwise off. The minor mode can be turned on only if semantic feature is available and the current buffer was set up for parsing. Return non-nil if the minor mode is enabled. |
| global-semantic-show-dirty-mode &optional arg | Command |
| Toggle global use of semantic-show-dirty-mode. If ARG is positive, enable, if it is negative, disable. If ARG is nil, then toggle. |
| semantic-dirty-token-face | Option |
| Face used to show dirty tokens in semantic-show-dirty-mode. |
| semantic-show-unmatched-syntax-mode &optional arg | Command |
| Minor mode to highlight unmatched-syntax tokens. With prefix argument ARG, turn on if positive, otherwise off. The minor mode can be turned on only if semantic feature is available and the current buffer was set up for parsing. Return non-nil if the minor mode is enabled. |
| global-semantic-show-unmatched-syntax-mode &optional arg | Command |
| Toggle global use of semantic-show-unmatched-syntax-mode. If ARG is positive, enable, if it is negative, disable. If ARG is nil, then toggle. |
| semantic-unmatched-syntax-face | Option |
| Face used to show unmatched syntax. The face is used in semantic-show-unmatched-syntax-mode. |
| global-semantic-auto-parse-mode &optional arg | Command |
| Toggle global use of semantic-auto-parse-mode. If ARG is positive, enable, if it is negative, disable. If ARG is nil, then toggle. |
| semantic-auto-parse-mode &optional arg | Command |
| Minor mode to auto parse buffer following changes. With prefix argument ARG, turn on if positive, otherwise off. The minor mode can be turned on only if semantic feature is available and the current buffer was set up for parsing. Return non-nil if the minor mode is enabled. |
| semantic-auto-parse-no-working-message | Option |
| If non-nil, disable display of the working message during parsing. |
| semantic-auto-parse-idle-time | Option |
| Time in seconds of idle time before auto-reparse. This time should be short enough to ensure that auto-parse will be run as soon as Emacs is idle. |
| semantic-auto-parse-max-buffer-size | Option |
| Maximum size in bytes of buffers automatically re-parsed. If this value is less than or equal to 0, buffers are automatically re-parsed regardless of their size. |
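The modes and options above can be combined in an init file. The fragment below is a sketch under stated assumptions: the numeric values are illustrative rather than recommended defaults, and semantic must already be loaded (for example through semantic-load.el) before the mode commands are called.

```elisp
;; Hypothetical init-file settings; the values are illustrative only.
;; Assumes semantic is already loaded (for example via semantic-load.el).

;; Reparse after 2 seconds of idle time.
(setq semantic-auto-parse-idle-time 2)
;; Do not automatically reparse buffers larger than 200000 bytes.
(setq semantic-auto-parse-max-buffer-size 200000)
;; Suppress the working message during automatic parses.
(setq semantic-auto-parse-no-working-message t)

;; A positive argument enables a mode instead of toggling it.
(global-semantic-auto-parse-mode 1)
(global-semantic-show-dirty-mode 1)
(global-semantic-show-unmatched-syntax-mode 1)
```

The options are set before the modes are enabled so that the modes see the intended values when they start.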
%( <lisp-expression> )%: Settings
%keywordtable: Settings
%languagemode: Settings
%outputfile: Settings
%parsetable: Settings
%put: Settings
%quotemode: Settings
%scopestart: Settings
%setupfunction: Settings
%start: Settings
%token: Settings
bovinate: Compiling
bovinate-debug: Debugging
document: document
document-inline: document
document-insert-defun-comment: document
document-insert-new-file-header: document
global-semantic-auto-parse-mode: minor modes
global-semantic-show-dirty-mode: minor modes
global-semantic-show-unmatched-syntax-mode: minor modes
re-search-backward: senator
re-search-forward: senator
semantic-abbreviate-nonterminal: Token->Text
semantic-adopt-external-members: Nonterminal Sorting
semantic-analyze-current-context: analyzer, Context Analysis
semantic-analyze-possible-completions: analyzer, Context Analysis
semantic-auto-parse-idle-time: minor modes
semantic-auto-parse-max-buffer-size: minor modes
semantic-auto-parse-mode: minor modes
semantic-auto-parse-no-working-message: minor modes
semantic-beginning-of-command: Blocks, Local Context
semantic-beginning-of-context: Blocks, Local Context
semantic-bovinate-debug-set-table: Debugging
semantic-bovinate-toplevel: Compiling
semantic-bovine-toplevel-full-reparse-needed-p: Parser Hooks
semantic-bovine-toplevel-partial-reparse-needed-p: Parser Hooks
semantic-bucketize: Nonterminal Sorting
semantic-cb-new-class-browser: class browser
semantic-cb-speedbar-mode: class browser
semantic-chart-database-size: charts
semantic-chart-nonterminal-complexity-token: charts
semantic-chart-nonterminals-by-token: charts
semantic-clear-toplevel-cache: Compiling
semantic-concise-prototype-nonterminal: Token->Text
semantic-ctxt-current-argument: Derived Context, Local Context
semantic-ctxt-current-assignment: Derived Context, Local Context
semantic-ctxt-current-function: Derived Context, Local Context
semantic-ctxt-current-symbol: Derived Context, Local Context
semantic-ctxt-scoped-types: Local Context
semantic-current-nonterminal: Nonterminals at point
semantic-current-nonterminal-parent: Nonterminals at point
semantic-dirty-token-face: minor modes
semantic-end-of-command: Blocks, Local Context
semantic-end-of-context: Blocks, Local Context
semantic-equivalent-tokens-p: Token Queries
semantic-find-dependency: Token Details
semantic-find-documentation: Token Details
semantic-find-innermost-nonterminal-by-position: Nonterminal Streams
semantic-find-nonterminal: Token Details
semantic-find-nonterminal-by-extra-spec: Nonterminal Streams
semantic-find-nonterminal-by-extra-spec-value: Nonterminal Streams
semantic-find-nonterminal-by-function: Nonterminal Streams
semantic-find-nonterminal-by-function-first-match: Nonterminal Streams
semantic-find-nonterminal-by-name: Nonterminal Streams
semantic-find-nonterminal-by-overlay: Nonterminals at point
semantic-find-nonterminal-by-overlay-in-region: Nonterminals at point
semantic-find-nonterminal-by-position: Nonterminal Streams
semantic-find-nonterminal-by-property: Nonterminal Streams
semantic-find-nonterminal-by-token: Nonterminal Streams
semantic-find-nonterminal-by-type: Nonterminal Streams
semantic-find-nonterminal-standard: Nonterminal Streams
semantic-flex: Lexing
semantic-flex-keyword-get: Keywords
semantic-flex-keyword-p: Keywords
semantic-flex-keyword-put: Keywords
semantic-flex-keywords: Keywords
semantic-flex-map-keywords: Keywords
semantic-get-all-local-variables: Local Variables, Local Context
semantic-get-local-arguments: Local Variables, Local Context
semantic-get-local-variables: Local Variables, Local Context
semantic-ia-complete-symbol: Smart Completion
semantic-ia-complete-symbol-menu: Smart Completion
semantic-imenu-auto-rebuild-directory-indexes: imenu
semantic-imenu-bucketize-file: imenu
semantic-imenu-bucketize-type-parts: imenu
semantic-imenu-buckets-to-submenu: imenu
semantic-imenu-expand-type-parts: imenu
semantic-imenu-index-directory: imenu
semantic-imenu-sort-bucket-function: imenu
semantic-imenu-summary-function: imenu
semantic-name-nonterminal: Token->Text
semantic-nonterminal-abstract: Token Details
semantic-nonterminal-children: Token Details
semantic-nonterminal-external-member-children: Token Details
semantic-nonterminal-external-member-p: Token Details
semantic-nonterminal-external-member-parent: Token Details
semantic-nonterminal-leaf: Token Details
semantic-nonterminal-protection: Token Details
semantic-nonterminal-static: Token Details
semantic-prototype-file: Token->Text
semantic-prototype-nonterminal: Token->Text
semantic-read-function: Nonterminal Completion
semantic-read-symbol: Nonterminal Completion
semantic-read-type: Nonterminal Completion
semantic-read-variable: Nonterminal Completion
semantic-recursive-find-nonterminal-by-name: Nonterminal Streams
semantic-show-dirty-mode: minor modes
semantic-show-unmatched-syntax-mode: minor modes
semantic-speedbar-analysis: Speedbar Analysis
semantic-summarize-nonterminal: Token->Text
semantic-texi-goto-source: document
semantic-texi-update-doc: document
semantic-token-docstring: Token Queries
semantic-token-end: Token Queries
semantic-token-extent: Token Queries
semantic-token-extra-spec: Token Queries
semantic-token-function-args: Token Queries
semantic-token-function-destructor: Token Queries
semantic-token-function-extra-spec: Token Queries
semantic-token-function-extra-specs: Token Queries
semantic-token-function-modifiers: Token Queries
semantic-token-function-parent: Token Queries
semantic-token-function-throws: Token Queries
semantic-token-get: Token Queries
semantic-token-include-system: Token Queries
semantic-token-name: Token Queries
semantic-token-overlay: Token Queries
semantic-token-put: Token Queries
semantic-token-put-no-side-effect: Keywords
semantic-token-start: Token Queries
semantic-token-token: Token Queries
semantic-token-type: Token Queries
semantic-token-type-extra-spec: Token Queries
semantic-token-type-extra-specs: Token Queries
semantic-token-type-modifiers: Token Queries
semantic-token-type-parent: Token Queries
semantic-token-type-parent-implement: Token Queries
semantic-token-type-parent-superclass: Token Queries
semantic-token-type-parts: Token Queries
semantic-token-variable-const: Token Queries
semantic-token-variable-default: Token Queries
semantic-token-variable-extra-spec: Token Queries
semantic-token-variable-extra-specs: Token Queries
semantic-token-variable-modifiers: Token Queries
semantic-uml-abbreviate-nonterminal: Token->Text
semantic-unmatched-syntax-face: minor modes
semantic-up-context: Blocks, Local Context
semanticdb-file-stream: semanticdb
semanticdb-find-nonterminal-by-extra-spec: semanticdb
semanticdb-find-nonterminal-by-extra-spec-value: semanticdb
semanticdb-find-nonterminal-by-function: semanticdb
semanticdb-find-nonterminal-by-name: semanticdb
semanticdb-find-nonterminal-by-name-regexp: semanticdb
semanticdb-find-nonterminal-by-property: semanticdb
semanticdb-find-nonterminal-by-type: semanticdb
semanticdb-mode-hooks: semanticdb
semanticdb-persistent-path: semanticdb
semanticdb-project-roots: semanticdb
senator-complete-symbol: senator
senator-completion-menu-keyboard-popup: senator
senator-copy-token: senator
senator-copy-token-to-register: senator
senator-isearch-toggle-semantic-mode: senator
senator-jump: senator
senator-kill-token: senator
senator-minor-mode: senator
senator-next-token: senator
senator-previous-token: senator
senator-search-backward: senator
senator-search-forward: senator
senator-step-at-start-end-token-ids: senator
senator-step-at-token-ids: senator
senator-yank-token: senator
word: senator
word-search-forward: senator