Core Single Parsers

The core single parsers are all generic on the type of input they accept. For a list of additional parsers which are related strictly to parsing strings with character input sequences, see the String/Character Parsers page. In addition, there are several specialty parsers related to parsing common patterns from programming languages, which also take character input sequences. You can find these in the Programming Parsers page.

Throughout the descriptions of these parsers, examples will be shown where one parser is functionally or logically equivalent to a combination of other parsers. This is done to give multiple possible ways to understand some of the trickier concepts.

The best way to access these core parsers is through the static factory methods. Add this to the top of your C# file:

using ParserObjects;
using static ParserObjects.ParserMethods<char>;

(Replace <char> with whatever your input type is.)

Declaration Styles

There are two basic styles of declaring parsers. The first is to use the static factory methods to create parsers:

using static ParserObjects.ParserMethods<char>;

var parser = List(
    Any()
);

The second is to use monadic extension methods to combine them:

using static ParserObjects.ParserMethods<char>;

var parser = Any().List();

In a few cases there are tuple syntaxes available as well. These will be noted in the appropriate sections.

Do not use parser class names directly from the ParserObjects.Parsers namespace. These class names are not designed for easy discovery or use, and they may change between releases to better describe what they are and how they operate. The function and method names described above will stay the same between releases, even if their implementations change.

Fundamental Parser Types

These parsers represent the theoretical core of the parsing library. To use these, import the methods you’re using (replace <char> with whatever input type you are using):

using ParserObjects;
using static ParserObjects.ParserMethods<char>;

Any Parser

The Any parser matches any single input value and returns it directly. It consumes one item of input, and only fails when the sequence is at the end.

var anyParser = Any();

It is functionally equivalent to the Match predicate parser (described below), though simpler and faster:

var anyParser = Match(_ => true);

Bool Parser

The Bool parser invokes a parser and returns true if the inner parser succeeds, false otherwise. It is useful if you want to know whether something matches, but don’t care what the result value is, or if you want to convert IParser<TInput> to IParser<TInput, bool>.

var parser = Bool(innerParser);

Chain Parser

The Chain parser invokes an initial parser to obtain a prefix value, then uses that prefix value to select the next parser to invoke.

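// Static factory method form
var parser = Chain(initial, result => {
    if (!result.Success)
        return new HandleFailureParser();
    if (result.Value == 'a')
        return new AParser();
    if (result.Value == 'b')
        return new BParser();
    // Fallback for any other value, so every code path returns a non-null parser
    return new HandleFailureParser();
});

// Equivalent extension method form
var parser = initial.Chain(result => {
    if (!result.Success)
        return new HandleFailureParser();
    if (result.Value == 'a')
        return new AParser();
    if (result.Value == 'b')
        return new BParser();
    // Fallback for any other value, so every code path returns a non-null parser
    return new HandleFailureParser();
});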

The Chain parser will throw an InvalidOperationException if the callback method returns a null value.

Chain With Parser

The ChainWith parser is related to the Chain parser but uses a fluent syntax for selecting the next parser.

var parser = ChainWith(initial, config => config
    .When(x => x == 'a', new AParser())
    .When(x => x == 'b', new BParser())
);

Choose Parser

The Choose parser invokes an initial parser to parse a prefix value without consuming any input, then uses that prefix value to select the next parser to invoke.

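// Static factory method form
var parser = Choose(initial, result => {
    if (!result.Success)
        return new HandleFailureParser();
    if (result.Value == 'a')
        return new AParser();
    if (result.Value == 'b')
        return new BParser();
    // Fallback for any other value, so every code path returns a parser
    return new HandleFailureParser();
});

// Equivalent extension method form
var parser = initial.Choose(result => {
    if (!result.Success)
        return new HandleFailureParser();
    if (result.Value == 'a')
        return new AParser();
    if (result.Value == 'b')
        return new BParser();
    // Fallback for any other value, so every code path returns a parser
    return new HandleFailureParser();
});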

The Choose parser is implemented using the Chain parser internally and is equivalent to a combination of the Chain and None parsers:

var parser = initial
    .None()
    .Chain(result => ...);

Combine Parser

The Combine parser takes a list of parsers, parses each in sequence, and returns a list of object results. You can transform or filter these results as appropriate for your application.

var parser = Combine(p1, p2, p3, ...);

Empty Parser

The Empty parser consumes no input and always returns success, even when the input sequence is at its end. The result carries only a default value, with no meaningful content.

var parser = Empty();

End Parser

The End parser returns success if the stream is at the end, failure otherwise. It consumes no input and returns no value.

var parser = End();

Examine Parser

The Examine parser allows inserting callbacks before or after any other parser, and is primarily used for debugging. You can also use the Examine parser to make adjustments to the input stream or state data before the parse, and augment the result after a parse.

var parser = inner.Examine(
    before: c => { ... }, 
    after: c => { ... }
);

Fail Parser

The Fail parser returns failure unconditionally. It can be used to explicitly insert failure conditions into your parser graph, to provide error messages which are more helpful than the default error messages, or to serve as a placeholder for replacement operations. The Fail parser has an output type so it can be inserted into places in your parser graph that expect an output type to be specified.

var parser = Fail<char>("helpful error message");
var parser = Fail("helpful error message");

If the output type is not specified, it defaults to the input type.

First Parser

The First parser takes a list of parsers. Each parser is attempted in order, and the result is returned as soon as any parser succeeds. If none of the parsers succeed, the First parser fails. The First parser can also be written as an extension method on a tuple of parsers.

var parser = First(
    parser1, 
    parser2,
    parser3
);
var parser = (parser1, parser2, parser3).First();

The tuple variant of this parser is limited to 9 child parsers. The other variants can take any number of child parsers.

Function Parser

The Function parser takes a callback function to perform the parse. The callback takes success and fail arguments, which are functions to generate the correct result object with filled-in metadata. It will automatically rewind the input sequence on failure, so you do not need to clean up manually. It will also automatically report the correct number of consumed input tokens, so you do not need to track it yourself.

var parser = Function((t, success, fail) => {
    // for success
    return success("ok");

    // for failure
    return fail("parse failed");
});

The Function parser is very similar to the Sequential parser. Both use a callback to execute the parse. The Function parser is almost completely free from structure and does not assume that the parse is performed internally using IParser instances. The Sequential parser, on the other hand, provides a state object which should be used to perform parses, and expects that the parsing will be done internally using IParser instances. The Function parser is used internally to implement several of the other parser types in this list.

List Parser

The List parser attempts to parse the inner parser repeatedly until it fails, and returns an enumeration of the results. The List parser takes optional minimum and maximum values to control the number of items matched. If you specify a minimum, the list will fail unless at least that number of items has been matched. If you do not specify a minimum, the list may return success when no items are matched, with an empty list as the result. If a maximum is specified, the list will continue matching only until that maximum is reached, and then it will stop.

var parser = List(innerParser);
var parser = List(innerParser, 3, 5);

// same as List(innerParser, minimum: 1);
var parser = List(innerParser, true);

var parser = innerParser.List();
var parser = innerParser.List(3, 5);

// Same as innerParser.List(minimum: 1);
var parser = innerParser.List(true);

If the inner parser returns success but consumes zero input, the List parser will break the loop and return only a single item. If a minimum number is set, the List parser will loop only until the minimum value and then break, returning success with a list with the correct number of items. This is a precaution to prevent the list parser from getting into an infinite loop when no input is being consumed.
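The loop guard described above can be sketched in plain C#. This is a minimal illustration and not the library's actual implementation: the names ListLoopSketch, Step and ParseList are hypothetical, and the inner parser is modelled as a simple delegate over a string rather than an IParser.

```csharp
using System.Collections.Generic;

public static class ListLoopSketch
{
    // Hypothetical stand-in for an inner parser: given the input and a start
    // position, it reports success, the parsed value, and how many items it consumed.
    public delegate (bool Success, T Value, int Consumed) Step<T>(string input, int position);

    // Repeats `inner` until it fails or `maximum` is reached. After a
    // zero-length success the loop continues only while the minimum is unmet,
    // so it cannot spin forever on an inner parser that consumes nothing.
    public static (bool Success, List<T> Items, int Position) ParseList<T>(
        Step<T> inner, string input, int minimum = 0, int maximum = int.MaxValue)
    {
        var items = new List<T>();
        var position = 0;
        while (items.Count < maximum)
        {
            var (success, value, consumed) = inner(input, position);
            if (!success)
                break;
            items.Add(value);
            position += consumed;
            // Zero-length match: stop as soon as the minimum is satisfied.
            if (consumed == 0 && items.Count >= minimum)
                break;
        }
        // On failure, report position 0 to model rewinding the input.
        return items.Count >= minimum
            ? (true, items, position)
            : (false, new List<T>(), 0);
    }
}
```

With an inner step that always succeeds without consuming input, ParseList returns a single item when no minimum is given, and exactly three items when minimum is 3, which mirrors the behaviour described for the List parser.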

Match Predicate Parser

The MatchPredicateParser examines the next input item and returns it if it matches a predicate.

var parser = Match(c => ...);

None Parser

The None parser evaluates an inner parser and then rewinds the input sequence to ensure no data has been consumed.

var parser = None(Any());
var parser = Any().None();

Optional Parser

The Optional parser attempts to invoke the inner parser, but returns success no matter the result. The Optional parser takes a callback argument to return a default value if the parse fails. If the default value callback is not provided, the Optional parser will return an IOption object which will report on success or failure of the inner parser.

var parser = Optional(innerParser);
var parser = Optional(innerParser, () => defaultValue);

var parser = innerParser.Optional();
var parser = innerParser.Optional(() => defaultValue);

The Optional parser is functionally equivalent to a combination of First and Produce parsers:

var parser = First(
    innerParser,
    Produce(() => defaultValue)
);

Peek Parser

The Peek parser peeks at the next value of input, but does not consume it. It returns failure when the input sequence is at end, success otherwise.

var parser = new PeekParser<char>();
var parser = Peek();

This parser is functionally equivalent to a combination of the Any and None parsers:

var parser = Any().None();

Predict Parser

The Predict parser peeks at a lookahead value in the input stream, and uses that value to determine what parser to invoke next.

var parser = Predict(config => config
    .When(c => c == 'a', new AParser())
    .When(c => c == 'b', new BParser())
);

If no matching value is found, the Predict parser returns failure. The Predict parser is implemented internally using the Chain parser and the Peek parser. It is logically equivalent to the following, but with nicer syntax:

var parser = Peek().Chain(r => ...);

Produce Parser

The Produce parser produces a value but consumes no input. It always returns success.

var parser = Produce(() => "abcd");
var parser = Produce((input, data) => "abcd");

The Produce parser may be used to construct synthetic values at parse time. It can return a constant value or create a new value on every call; the value will not be cached. It may look at and consume input from the input sequence. It may use values from the contextual state data.

The simple case of the Produce parser is functionally equivalent to a combination of the Empty and Transform parsers:

var parser = Empty().Transform(_ => "abcd");

Rule Parser

The Rule parser attempts to execute a list of parsers in order, and then returns a combined result. If any parser in the list fails, the input is rewound and the whole parser fails. You can also create rule parsers by using the .Rule() extension method on a Tuple or ValueTuple of parser objects, which may be cleaner to read and write in some situations.

var parser = Rule(
    parser1, 
    parser2, 
    parser3, 
    (r1, r2, r3) => ...
);
var parser = (parser1, parser2, parser3).Rule((r1, r2, r3) => ...);

The Rule() method and tuple variants are both limited to 9 parsers at most. If you need to combine the results of more than 9 parsers, use the Combine parser instead.

Separated List

The SeparatedList parser is similar to a List parser except the items have separators between them. Like the List parser, the Separated List may take minimum and maximum values to control how many items are matched.

var parser = SeparatedList(item, separator);
var parser = SeparatedList(item, separator, 3, 5);

// Same as SeparatedList(item, separator, minimum: 1);
var parser = SeparatedList(item, separator, true);

This parser is implemented as a combination of several other parser types including List, Rule, First and Combine.

Sequential Parser

The Sequential parser allows turning a parser graph into a block of sequential code. This allows you to use procedural logic to aid in parsing and to set breakpoints between parsers to get maximum debuggability. Some grammars are best parsed using a stack or other mechanism, instead of the recursive descent algorithm used by ParserObjects, so the Sequential parser allows you to use those algorithms instead. The downside is that the Sequential Parser does not work with some features like BNF stringification or .Replace()/.ReplaceChild() operations.

var parser = Sequential(t =>
{
    // The leading word selects how the value after the colon is parsed
    var type = t.Parse(Word());
    if (type == "decimal")
    {
        var colon = t.Parse(Match(':'));
        var value = t.Parse(Integer());
        return value;
    }
    if (type == "hex")
    {
        var colon = t.Parse(Match(':'));
        var value = t.Parse(HexadecimalInteger());
        return value;
    }
    // Unrecognized type: fall back to a default value
    return 0;
});

The t object assists in performing the parse. It handles errors by causing the whole Sequential parser to fail if any of the child parsers fail.

Synchronize Parser

The Synchronize parser allows entering panic mode when a parse fails. In panic mode, the parser will discard tokens to get back to a known “good” state, before attempting the parse again. This is useful for cases where you want to report all syntax errors to the user, not just the first error.

var parser = Synchronize(inner, x => x == ';');
var parser = inner.Synchronize(x => x == ';');

Once you define your parser, you can check to see if there are any errors. If the parser eventually succeeds, the successful result will also be available:

var result = parser.Parse(...);
var allErrors = result.TryGetData<ErrorList>();
var successResult = result.TryGetData<IResult>();

You can use the list of errors to report problems back to the user.
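The panic-mode idea can be sketched independently of the library. The example below is an illustration under stated assumptions, not how Synchronize is implemented: PanicModeSketch and ParseAll are hypothetical names, and the grammar (runs of digits, each terminated by ';') is invented for the example. On a bad statement it records an error, discards input up to and including the next ';', and resumes parsing.

```csharp
using System.Collections.Generic;

public static class PanicModeSketch
{
    // Hypothetical grammar for illustration: each statement is a run of
    // digits terminated by ';'. Returns every value that parsed, plus one
    // error message per statement that did not.
    public static (List<int> Values, List<string> Errors) ParseAll(string input)
    {
        var values = new List<int>();
        var errors = new List<string>();
        var position = 0;
        while (position < input.Length)
        {
            var start = position;
            var value = 0;
            while (position < input.Length && char.IsDigit(input[position]))
            {
                value = value * 10 + (input[position] - '0');
                position++;
            }
            var ok = position > start
                && position < input.Length
                && input[position] == ';';
            if (ok)
            {
                values.Add(value);
                position++; // consume the ';'
                continue;
            }
            // Panic mode: record the error, then discard input up to and
            // including the next ';' before trying the next statement.
            errors.Add($"Invalid statement at position {start}");
            while (position < input.Length && input[position] != ';')
                position++;
            if (position < input.Length)
                position++;
        }
        return (values, errors);
    }
}
```

For the input "12;x9;34;" this sketch collects the values 12 and 34 and reports a single error for the statement starting at position 3, rather than stopping at the first failure.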

Try Parser

The Try parser catches user-thrown exceptions from within the parse and handles them. When an exception is caught, the input sequence is rewound to the location where the Try parser began.

var parser = Try(innerParser, ex => {...}, bubble: true);

The second parameter is a callback to allow examining the exception when it is received. This can be a useful place to set a breakpoint during debugging. The third parameter bubble tells whether to rethrow the exception (true) or to handle the exception and return a failure result (false).

You can get information about the exception thrown from the result, if you set bubble: false:

var result = parser.Parse(...);
var exception = result.TryGetData<Exception>();

Matching Parsers

These parsers help to simplify matching of literal patterns.

Match Pattern Parser

The MatchPatternParser takes a literal list of values, and attempts to match these against the input sequence. If all input items match, in order, the values will be returned as a list. (Some of the examples below take advantage of the fact that a string is an IEnumerable<char> to simplify the code.)

var parser = Match(new char[] { 'a', 'b', 'c', 'd' });
var parser = Match("abcd");

This is functionally equivalent (though faster and more succinct) to a combination of the Rule and MatchPredicate parsers:

var parser = Rule(
    Match(c => c == 'a'),
    Match(c => c == 'b'),
    Match(c => c == 'c'),
    Match(c => c == 'd'),
    (a, b, c, d) => new [] { a, b, c, d }
);

Trie Parser

The Trie parser uses a trie to find the longest match in a list of possible literal sequences. This is a useful optimization for keyword and operator literals, where individual patterns may have overlapping prefixes. The ParserObjects library provides IReadOnlyTrie<TKey, TResult> and IInsertableTrie<TKey, TResult> abstractions for this purpose.

var parser = Trie(trie);
var parser = trie.ToParser();
var parser = Trie(trie => trie.Add(...));
var parser = MatchAny("value", "value2", "value3");

The MatchAny parser is implemented using the Trie mechanism internally, and works only on char input, string output scenarios.
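To show why a trie helps with overlapping prefixes, here is a minimal longest-match sketch. It is not the library's IReadOnlyTrie or IInsertableTrie implementation: the TrieSketch class and its Add and TryLongestMatch members are hypothetical names, and the sketch only handles non-empty string patterns over char input.

```csharp
using System.Collections.Generic;

public sealed class TrieSketch
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsTerminal;
    }

    private readonly Node _root = new Node();

    // Inserts a pattern one character at a time, sharing common prefixes.
    public TrieSketch Add(string pattern)
    {
        var node = _root;
        foreach (var c in pattern)
        {
            if (!node.Children.TryGetValue(c, out var next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsTerminal = true;
        return this;
    }

    // Walks the trie along the input and remembers the last terminal node
    // seen, so the longest stored pattern starting at `position` wins.
    public bool TryLongestMatch(string input, int position, out string match)
    {
        var node = _root;
        var bestLength = 0;
        for (var i = position; i < input.Length; i++)
        {
            if (!node.Children.TryGetValue(input[i], out var next))
                break;
            node = next;
            if (node.IsTerminal)
                bestLength = i - position + 1;
        }
        match = bestLength > 0 ? input.Substring(position, bestLength) : "";
        return bestLength > 0;
    }
}
```

With the patterns "=", "==", "===" and "=>" added, matching against "==>" yields "==": the walk passes through the shorter "=" on the way, but keeps going and reports the longest pattern that fully matches.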

Transforming Parsers

These parsers exist to transform results from one form to another.

Transform Parser

The Transform parser transforms the output of an inner parser. If the inner parser fails, the Transform parser fails. If the inner parser succeeds, the Transform parser returns the transformed result.

var parser = Transform(innerParser, r => ...);
var parser = innerParser.Transform(r => ...);

Transform Result Parser

The TransformResult parser has an opportunity to transform the entire result, including all metadata, and it operates even when the result is a failure result.

var parser = TransformResult(inner, (state, result) => { ... });

Transform Error Parser

The TransformError parser is implemented using the TransformResult parser, but the callback only executes when the result is a failure. It can be used, for example, to provide a better error message.

var parser = TransformError(parser, (state, errorResult) => { ... });

Recursive Parsers

These parsers exist to help simplify certain recursion scenarios, especially in parsing equations and mathematical expressions. They are not helpful in all recursion scenarios.

Left Apply Parser

The LeftApplyParser is a parser for left-associative parsing. The left value is parsed first, and its value is applied to the right-side production rule. The value of the right parser then becomes the new left value, and the process repeats until the right parser no longer matches. The pseudo-BNF for it is:

self := <self> <right> | <item>

var parser = LeftApply(
    itemParser, 
    left => Rule(
        left,
        ...
    )
);

Right Apply Parser

The RightApply parser is for right-associative recursion. It is conceptually similar to the LeftApply parser, but with right-recursion instead. It parses an item and then attempts to parse a separator followed by a recursion to itself. The pseudo-BNF for it is:

self := <item> (<middle> <self>)?

var parser = RightApply(item, middle, (l, m, r) => ...);

This same recursive functionality can be reproduced by a combination of Deferred, First, and Rule:

IParser<char, string> parserCore = null;
var parser = Deferred(() => parserCore);
parserCore = First(
    Rule(
        item, 
        middle,
        parser,
        (l, m, r) => ...
    ),
    item
);
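The difference between left-associative and right-associative grouping can be seen without any parsers at all. The sketch below is an illustration only: AssociativitySketch, FoldLeft and FoldRight are hypothetical helpers that combine an already-parsed, non-empty list of items in the two orders suggested by the pseudo-BNF rules above.

```csharp
using System;
using System.Collections.Generic;

public static class AssociativitySketch
{
    // Left-associative grouping, as in: self := <self> <right> | <item>
    // Each combined result becomes the new left value.
    public static T FoldLeft<T>(IReadOnlyList<T> items, Func<T, T, T> combine)
    {
        var left = items[0];
        for (var i = 1; i < items.Count; i++)
            left = combine(left, items[i]);
        return left;
    }

    // Right-associative grouping, as in: self := <item> (<middle> <self>)?
    // The right-hand side is fully resolved by recursion before combining.
    public static T FoldRight<T>(IReadOnlyList<T> items, Func<T, T, T> combine, int start = 0)
    {
        if (start == items.Count - 1)
            return items[start];
        return combine(items[start], FoldRight(items, combine, start + 1));
    }
}
```

For the items 8, 3, 2 and subtraction as the combining function, FoldLeft computes (8 - 3) - 2 = 3, while FoldRight computes 8 - (3 - 2) = 7. This is why subtraction and division are usually parsed left-associatively, and operators such as exponentiation right-associatively.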

Pratt Parser

The Pratt parser is an implementation of the Pratt parsing algorithm, which may be particularly helpful with parsing mathematical expressions.

var parser = Pratt(config => { ... });

For detailed information about configuring and using the Pratt parser, see the Pratt Parser page. It may be simpler to use in many situations than the LeftApply and RightApply parsers are.

Referencing Parsers

These parsers exist to help with referencing issues, to help resolve circular dependencies or decide on which parser to use at parse-time.

Create Parser

The Create parser creates a parser at parse time using information available in the current parse state. The Create parser looks similar to the Deferred parser, though it has a few important semantic differences: the create callback takes the ParseState, and it cannot be used with find/replace operations. The Create parser is expected to create new parser instances at different times, so it is not considered to have “children”. This means that the parser returned by the Create parser will not be visible to Visitors.

var parser = new CreateParser<TInput, TOutput>(state => { ... });
var parser = Create(state => { ... });

Deferred Parser

The Deferred parser references another parser and resolves the reference at parse time instead of at declaration time. This allows your parser to handle recursion and circular references. The parser returned from the Deferred parser is expected by the system to be the same throughout the entire parse and may be cached after first access. Because the parser returned by Deferred is expected to be static and available at any time after the parser graph is created, the parser can be used with find/replace operations and should correctly work with BNF stringification.

var parser = new DeferredParser<TInput, TOutput>(() => targetParser);
var parser = Deferred(() => targetParser);

Replaceable Parser

The Replaceable parser references an inner parser and invokes it transparently. However, the replaceable parser allows the inner parser to be replaced in-place without cloning. This is useful in cases where you want to make modifications to the parser tree without creating a whole new tree.

var parser = Replaceable(innerParser);
var parser = innerParser.Replaceable();

If an inner parser is not explicitly specified, the inner parser will be a Fail parser. These two lines are equivalent:

var parser = Replaceable<TOutput>();
var parser = Replaceable(Fail<TOutput>());

It is extremely helpful to name your replaceable parsers so you can quickly find and replace values by name.