Finite State Automata and Lexing-Parsing
Finite State Machines & lexing-parsing
My descriptor language parser applies top-down recursive descent using a custom implementation of regular expression.
Patterns can be specified with a subset of the Basic Regular Expression language (BRE) and are adapted to a node representation.
var nfa = new Fsa();
nfa.Build("(a|b|c){1,3}", (int)Token1);
nfa.Build("int|long",     (int)Token2);
nfa.Build("[A-Za-z0-9]+", (int)Token3);
...Furthermore, several Nondeterministic Finite Automata (NFAs) can be formed from several regex expressions to match a corresponding set of tokens in a language.
All possible tokens which can be matched by the state machine are knit into one NFA, which can be used to match tokens as-is (using non-deterministic breadth-first evaluation of the NFA)
var (token, match) = nfa.Search(source, offset);
or further optimized using DFA conversion and equivalence partioning.
var dfa = nfa.ConvertToDfa().MinimizeDfa(); var (token, match) = dfa.Search(source, offset);
Usage of the machine will resemble that of tools such as (F)lex. Matched tokens can be passed off to a parsing algorithm (e.g., LALR(1)), or a developer may opt to implement recursive descent parsing by manually analyzing tokens.
Perhaps the token "if" is found, so the following code is invoked to match a statement in source:
var stream = new TokenStream()
{
	Grammar = dfa,
	Source = inputSource
};
stream.Seek(startOffset);
var ifDto = new IfDto();
// "if" "(" {predicate expression} ")"
if (stream.Poll() != (int)Token.If)
{
	throw new Exception("Expected 'if'");
}
if (stream.Poll() != (int)Token.LParens)
{
	throw new Exception("Expected '('");
}
ifDto.Predicate = ExpressionGrammar.Match(stream);
if (stream.Poll() != (int)RParens)
{
	throw new Exception("Expected ')'");
}
// "{" {statement} [, ...] "}" | {statement}
ifDto.Body = StatementGrammar.MatchBody(stream);
// ... [ "else" {failure body} ]
More grammar can follow at any level of complexity supported by the chosen parsing methodology.