REX C code generator.

rexcc is a code generator for the C language. It generates C code from regular expressions; the generated code initializes a Deterministic Finite Automaton (DFA) object of type rexdfa_t. The rexcc program reads a user-specified input file that describes the code to generate, and writes the result either to a C file or to standard output.

Input file format

The rexcc input file consists of three sections, separated by a line containing only `%%'.

C code prolog
%%
regular expressions
%%
C code epilog
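
As an illustration of this layout only, the following minimal sketch splits an in-memory input file into its three sections at lines containing only `%%'. It is not rexcc's own parser: the names `struct section' and `split_sections' are hypothetical, and the choice to treat any further `%%' line after the second separator as part of the epilog is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One section of the input: a pointer into the original text plus a length. */
struct section {
	const char *start;
	size_t len;
};

/*
 * Split `text' into prolog, rules and epilog at the first two lines that
 * contain only "%%". Returns 0 on success, or -1 if fewer than two
 * separator lines are present. The sections point into `text'; nothing
 * is copied.
 */
int split_sections(const char *text, struct section out[3])
{
	const char *p = text;
	const char *sec_start = text;
	int idx = 0;

	assert(text != NULL && out != NULL);
	while (*p != '\0' && idx < 2) {
		const char *eol = strchr(p, '\n');
		size_t linelen = eol ? (size_t)(eol - p) : strlen(p);
		const char *next = eol ? eol + 1 : p + linelen;

		if (linelen == 2 && p[0] == '%' && p[1] == '%') {
			/* Close the current section just before the separator line. */
			out[idx].start = sec_start;
			out[idx].len = (size_t)(p - sec_start);
			idx++;
			sec_start = next;
		}
		p = next;
	}
	if (idx < 2)
		return -1;
	/* Everything after the second separator is the epilog. */
	out[2].start = sec_start;
	out[2].len = strlen(sec_start);
	return 0;
}
```

After a successful call, out[0] holds the prolog, out[1] the regular expressions and out[2] the epilog.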

C code prolog

This section is used to include any header files or definitions that are required by the rest of the C code.

regular expressions

This section specifies the regular expressions that are used to generate and initialize the Deterministic Finite Automaton (DFA). It contains a series of regular expression definitions of the form:

userdata	regex

where userdata must be user-defined data of type rexuserdata_t and regex must be a regular expression. The two must be separated by space or tab characters.

C code epilog

This section is used to add any C code that uses the rexdfa_t object generated from the rules specified in the previous section. The generated variable of type rexdfa_t is always named `ccdfa' and is declared static. If you need to access it from outside the generated file, add code in this section that makes such access possible. For example:

rexdfa_t *mydfa = &ccdfa;

Or, using an accessor function:

rexdfa_t *GetMyDfaPtr(void)
{
	return &ccdfa;
}

Example

#include "mydefinitions.h"
#define IDENTIFIER 257

%%
IDENTIFIER      [A-Za-z_][A-Za-z_0-9]*
"keyword"       while|do
256             [ \n\r\t]
%%

/* All userdata used in the previous section can be cast to rexuserdata_t. */

rexdfa_t *get_simple_dfa(void)
{
        return &ccdfa;
}

The userdata specified for each regular expression is used to identify that regular expression when the automaton reaches an accepting state.
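
One way to act on that identification is to dispatch on the userdata value once a match has been reported. The sketch below is only an illustration under stated assumptions: `userdata_t' is a stand-in for rexuserdata_t, assumed here to be an integer type; WHITESPACE is a hypothetical name for the literal 256 used in the example; and `token_name' and `describe_match' are hypothetical helpers. It does not show how the userdata is read from the DFA, and the string-valued userdata "keyword" is left out because its conversion to rexuserdata_t is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for rexuserdata_t; ASSUMED to be an integer type. */
typedef unsigned long userdata_t;

#define WHITESPACE 256	/* hypothetical name for the literal 256 */
#define IDENTIFIER 257	/* same value as in the example prolog */

/* Map the userdata of an accepting state to a readable token name. */
const char *token_name(userdata_t userdata)
{
	switch (userdata) {
	case IDENTIFIER:
		return "identifier";
	case WHITESPACE:
		return "whitespace";
	default:
		return "unknown";
	}
}

/* Write a short description of a match into buf (always NUL-terminated). */
void describe_match(userdata_t userdata, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	snprintf(buf, size, "%s (userdata=%lu)", token_name(userdata), userdata);
}
```

Keeping the numeric values in one header shared by the input file and the consuming code avoids the two drifting apart.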

Building the generated code

The code generated by rexcc does not need to be linked with the REX library, but it does include the header file rexdfa.h. This header defines the DFA-related structures used by the generated code and provides macros for accessing the states and substates of the DFA. You must add the directory containing rexdfa.h to your compiler's include search path.

List of macros:

Example

rexcc parameters

# rexcc [OPTIONS] <filename>
 OPTIONS:
	-o <cfile>               Output .c file.
	-d                       Dump regular expressions.
	-D                       Dump DFA states.
	-N                       Dump NFA states.
	-s                       Include substates.
	-t                       Display statistics.
	-v                       Display version information.
	-h, --help               Display this help.