Superpower 3.2.2-dev-00214

Superpower NuGet Version

A parser combinator library based on Sprache. Superpower generates friendlier error messages through its support for token-driven parsers.

Logo

What is Superpower?

The job of a parser is to take a sequence of characters as input, and produce a data structure that's easier for a program to analyze, manipulate, or transform. From this point of view, a parser is just a function from string to T - where T might be anything from a simple number, a list of fields in a data format, or the abstract syntax tree of some kind of programming language.

Just like other kinds of functions, parsers can be built by hand, from scratch. This is-or-isn't a lot of fun, depending on the complexity of the parser you need to build (and how you plan to spend your next few dozen nights and weekends).

Superpower is a library for writing parsers in a declarative style that mirrors the structure of the target grammar. Parsers built with Superpower are fast, robust, and report precise and informative errors when invalid input is encountered.

Usage

Superpower is embedded directly into your C# program, without the need for any additional tools or build-time code generation tasks.

dotnet add package Superpower

The simplest text parsers consume characters directly from the source text:

// Parse any number of capital 'A's in a row
var parseA = Character.EqualTo('A').AtLeastOnce();

The Character.EqualTo() method is a built-in parser. The AtLeastOnce() method is a combinator, that builds a more complex parser for a sequence of 'A' characters out of the simple parser for a single 'A'.

Superpower includes a library of simple parsers and combinators from which more sophisticated parsers can be built:

TextParser<string> identifier =
    from first in Character.Letter
    from rest in Character.LetterOrDigit.Or(Character.EqualTo('_')).Many()
    select first + new string(rest);

var id = identifier.Parse("abc123");

Assert.Equal("abc123", id);

Parsers are highly modular, so smaller parsers can be built and tested independently of the larger parsers that use them.

Tokenization

Along with text parsers that consume input character-by-character, Superpower supports token parsers.

A token parser consumes elements from a list of tokens. A token is a fragment of the input text, tagged with the kind of item that fragment represents - usually specified using an enum:

public enum ArithmeticExpressionToken
{
    None,
    Number,
    Plus,

A major benefit of driving parsing from tokens, instead of individual characters, is that errors can be reported in terms of tokens - unexpected identifier `frm`, expected keyword `from` - instead of the cryptic unexpected `m`.

Token-driven parsing takes place in two distinct steps:

  1. Tokenization, using a class derived from Tokenizer<TKind>, then
  2. Parsing, using a function of type TokenListParser<TKind>.
var expression = "1 * (2 + 3)";

// 1.
var tokenizer = new ArithmeticExpressionTokenizer();
var tokenList = tokenizer.Tokenize(expression);

// 2.
var parser = ArithmeticExpressionParser.Lambda; // parser built with combinators
var expressionTree = parser.Parse(tokenList);

// Use the result
var eval = expressionTree.Compile();
Console.WriteLine(eval()); // -> 5

Assembling tokenizers with TokenizerBuilder<TKind>

The job of a tokenizer is to split the input into a list of tokens - numbers, keywords, identifiers, operators - while discarding irrelevant trivia such as whitespace or comments.

Superpower provides the TokenizerBuilder<TKind> class to quickly assemble tokenizers from recognizers, text parsers that match the various kinds of tokens required by the grammar.

A simple arithmetic expression tokenizer is shown below:

var tokenizer = new TokenizerBuilder<ArithmeticExpressionToken>()
    .Ignore(Span.WhiteSpace)
    .Match(Character.EqualTo('+'), ArithmeticExpressionToken.Plus)
    .Match(Character.EqualTo('-'), ArithmeticExpressionToken.Minus)
    .Match(Character.EqualTo('*'), ArithmeticExpressionToken.Times)
    .Match(Character.EqualTo('/'), ArithmeticExpressionToken.Divide)
    .Match(Character.EqualTo('('), ArithmeticExpressionToken.LParen)
    .Match(Character.EqualTo(')'), ArithmeticExpressionToken.RParen)
    .Match(Numerics.Natural, ArithmeticExpressionToken.Number)
    .Build();

Tokenizers constructed this way produce a list of tokens by repeatedly attempting to match recognizers against the input in top-to-bottom order.

Writing tokenizers by hand

Tokenizers can alternatively be written by hand; this can provide the most flexibility, performance, and control, at the expense of more complicated code. A handwritten arithmetic expression tokenizer is included in the test suite, and a more complete example can be found here.

Writing token list parsers

Token parsers are defined in the same manner as text parsers, using combinators to build up more sophisticated parsers out of simpler ones.

class ArithmeticExpressionParser
{
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Add =
        Token.EqualTo(ArithmeticExpressionToken.Plus).Value(ExpressionType.AddChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Subtract =
        Token.EqualTo(ArithmeticExpressionToken.Minus).Value(ExpressionType.SubtractChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Multiply =
        Token.EqualTo(ArithmeticExpressionToken.Times).Value(ExpressionType.MultiplyChecked);
        
    static readonly TokenListParser<ArithmeticExpressionToken, ExpressionType> Divide = 
        Token.EqualTo(ArithmeticExpressionToken.Divide).Value(ExpressionType.Divide);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Constant =
            Token.EqualTo(ArithmeticExpressionToken.Number)
            .Apply(Numerics.IntegerInt32)
            .Select(n => (Expression)Expression.Constant(n));

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Factor =
        (from lparen in Token.EqualTo(ArithmeticExpressionToken.LParen)
            from expr in Parse.Ref(() => Expr)
            from rparen in Token.EqualTo(ArithmeticExpressionToken.RParen)
            select expr)
        .Or(Constant);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Operand =
        (from sign in Token.EqualTo(ArithmeticExpressionToken.Minus)
            from factor in Factor
            select (Expression)Expression.Negate(factor))
        .Or(Factor).Named("expression");

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Term =
        Parse.Chain(Multiply.Or(Divide), Operand, Expression.MakeBinary);

    static readonly TokenListParser<ArithmeticExpressionToken, Expression> Expr =
        Parse.Chain(Add.Or(Subtract), Term, Expression.MakeBinary);

    public static readonly TokenListParser<ArithmeticExpressionToken, Expression<Func<int>>>
        Lambda = Expr.AtEnd().Select(body => Expression.Lambda<Func<int>>(body));
}

Error messages

The error scenario tests demonstrate some of the error message formatting capabilities of Superpower. Check out the parsers referenced in the tests for some examples.

ArithmeticExpressionParser.Lambda.Parse(new ArithmeticExpressionTokenizer().Tokenize("1 + * 3"));
     // -> Syntax error (line 1, column 5): unexpected operator `*`, expected expression.

To improve the error reporting for a particular token type, apply the [Token] attribute:

public enum ArithmeticExpressionToken
{
    None,

    Number,

    [Token(Category = "operator", Example = "+")]
    Plus,

Performance

Superpower is built with performance as a priority. Less frequent backtracking, combined with the avoidance of allocations and indirect dispatch, mean that Superpower can be quite a bit faster than Sprache.

Recent benchmark for parsing a long arithmetic expression:

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Windows
Processor=?, ProcessorCount=8
Frequency=2533306 ticks, Resolution=394.7411 ns, Timer=TSC
CLR=CORE, Arch=64-bit ? [RyuJIT]
GC=Concurrent Workstation
dotnet cli version: 1.0.0-preview2-003121

Type=ArithmeticExpressionBenchmark  Mode=Throughput  

Method Median StdDev Scaled Scaled-SD
Sprache 283.8618 µs 10.0276 µs 1.00 0.00
Superpower (Token) 81.1563 µs 2.8775 µs 0.29 0.01

Benchmarks and results are included in the repository.

Tips: if you find you need more throughput: 1) consider a hand-written tokenizer, and 2) avoid the use of LINQ comprehensions and instead use chained combinators like Then() and especially IgnoreThen() - these allocate fewer delegates (closures) during parsing.

Examples

Superpower is introduced, with a worked example, in this blog post.

Example parsers to learn from:

  • JsonParser is a complete, annotated example implementing the JSON spec with good error reporting
  • DateTimeTextParser shows how Superpower's text parsers work, parsing ISO-8601 date-times
  • IntCalc is a simple arithmetic expresion parser (1 + 2 * 3) included in the repository, demonstrating how Superpower token parsing works
  • Plotty implements an instruction set for a RISC virtual machine
  • tcalc is an example expression language that computes durations (1d / 12m)

Real-world projects built with Superpower:

  • Serilog.Expressions uses Superpower to implement an expression and templating language for structured log events
  • The query language of Seq is implemented using Superpower
  • seqcli extraction patterns use Superpower for plain-text log parsing
  • PromQL.Parser is a parser for the Prometheus Query Language

Have an example we can add to this list? Let us know.

Getting help

Please post issues to the issue tracker, or tag your question on StackOverflow with superpower.

The repository's title arose out of a talk "Parsing Text: the Programming Superpower You Need at Your Fingertips" given at DDD Brisbane 2015.

Showing the top 20 packages that depend on Superpower.

Packages Downloads
DotNetEnv
A .NET library to load environment variables from .env files
25

.NET 6.0

  • No dependencies.

.NET 8.0

  • No dependencies.

.NET Standard 2.0

Version Downloads Last updated
3.2.2-dev-00214 7 05/22/2026
3.2.1 21 04/30/2026
3.2.1-dev-00211 28 04/29/2026
3.2.1-dev-00210 23 04/29/2026
3.2.0 26 04/27/2026
3.2.0-dev-00207 25 04/27/2026
3.2.0-dev-00206 25 04/27/2026
3.2.0-dev-00204 26 04/27/2026
3.2.0-dev-00203 26 04/27/2026
3.1.1-dev-00257 24 04/27/2026
3.1.1-dev-00201 23 04/27/2026
3.1.0 26 04/27/2026
3.1.0-dev-00250 24 04/27/2026
3.1.0-dev-00247 23 04/27/2026
3.1.0-dev-00245 21 04/27/2026
3.1.0-dev-00241 26 04/27/2026
3.1.0-dev-00239 26 04/27/2026
3.0.0 27 04/27/2026
3.0.0-dev-00236 26 04/27/2026
3.0.0-dev-00233 24 04/27/2026
3.0.0-dev-00231 23 04/27/2026
3.0.0-dev-00229 26 04/27/2026
3.0.0-dev-00222 27 04/27/2026
3.0.0-dev-00221 23 04/27/2026
3.0.0-dev-00220 25 04/27/2026
3.0.0-dev-00219 22 04/27/2026
3.0.0-dev-00215 24 04/27/2026
3.0.0-dev-00213 28 04/27/2026
3.0.0-dev-00210 23 04/27/2026
3.0.0-dev-00206 21 04/27/2026
3.0.0-dev-00203 23 04/27/2026
3.0.0-dev-00201 23 04/27/2026
3.0.0-dev-00196 24 04/27/2026
3.0.0-dev-00193 27 04/27/2026
3.0.0-dev-00189 22 04/27/2026
2.3.1-dev-00185 21 04/27/2026
2.3.1-dev-00184 24 04/27/2026
2.3.0 23 04/27/2026
2.3.0-dev-00172 26 04/27/2026
2.2.0 26 04/27/2026
2.2.0-dev-00170 24 04/27/2026
2.2.0-dev-00169 22 04/27/2026
2.2.0-dev-00161 25 04/27/2026
2.1.1-dev-00159 25 04/27/2026
2.1.1-dev-00157 24 04/27/2026
2.1.0 25 04/27/2026
2.1.0-dev-00155 24 04/27/2026
2.1.0-dev-00135 24 04/27/2026
2.1.0-dev-00132 23 04/27/2026
2.1.0-dev-00128 25 04/27/2026
2.1.0-dev-00126 25 04/27/2026
2.1.0-dev-00124 24 04/27/2026
2.0.1-dev-00123 22 04/27/2026
2.0.1-dev-00120 25 04/27/2026
2.0.0 25 04/27/2026
2.0.0-dev-00116 23 04/27/2026
2.0.0-dev-00115 21 04/27/2026
2.0.0-dev-00114 25 04/27/2026
2.0.0-dev-00111 27 04/27/2026
2.0.0-dev-00109 23 04/27/2026
2.0.0-dev-00107 24 04/27/2026
2.0.0-dev-00103 23 04/27/2026
2.0.0-dev-00102 24 04/27/2026
2.0.0-dev-00101 23 04/27/2026
2.0.0-dev-00100 23 04/27/2026
2.0.0-dev-00093 22 04/27/2026
2.0.0-dev-00091 23 04/27/2026
2.0.0-dev-00089 24 04/27/2026
1.1.1-dev-00086 21 04/27/2026
1.1.1-dev-00081 24 04/27/2026
1.1.1-dev-00079 24 04/27/2026
1.1.0 25 04/27/2026
1.1.0-dev-00065 22 04/27/2026
1.1.0-dev-00063 21 04/27/2026
1.1.0-dev-00061 23 04/27/2026
1.1.0-dev-00059 22 04/27/2026
1.0.3-dev-00051 23 04/27/2026
1.0.2 26 04/27/2026
1.0.1 26 04/27/2026
1.0.1-dev-00044 22 04/27/2026
1.0.0 24 04/27/2026
1.0.0-dev-00038 23 04/27/2026
1.0.0-dev-00037 20 04/27/2026
1.0.0-dev-00036 24 04/27/2026
1.0.0-dev-00035 23 04/27/2026
1.0.0-dev-00034 24 04/27/2026
1.0.0-dev-00033 23 04/27/2026
1.0.0-dev-00032 23 04/27/2026
1.0.0-dev-00031 25 04/27/2026
1.0.0-dev-00030 23 04/27/2026
1.0.0-dev-00029 24 04/27/2026
1.0.0-dev-00028 21 04/27/2026
1.0.0-dev-00027 23 04/27/2026
1.0.0-dev-00026 20 04/27/2026
1.0.0-dev-00025 24 04/27/2026
1.0.0-dev-00024 22 04/27/2026
1.0.0-dev-00023 24 04/27/2026
1.0.0-dev-00022 23 04/27/2026
1.0.0-dev-00021 19 04/27/2026
1.0.0-dev-00020 23 04/27/2026
1.0.0-dev-00019 22 04/27/2026
1.0.0-dev-00018 26 04/27/2026
1.0.0-dev-00017 23 04/27/2026
1.0.0-dev-00016 23 04/27/2026
1.0.0-dev-00014 19 04/27/2026
1.0.0-dev-00013 22 04/27/2026
1.0.0-dev-00012 25 04/27/2026
1.0.0-dev-00010 25 04/27/2026
1.0.0-dev-00009 24 04/27/2026
1.0.0-dev-00008 25 04/27/2026
1.0.0-dev-00007 24 04/27/2026
1.0.0-dev-00006 20 04/27/2026
1.0.0-dev-00005 23 04/27/2026
1.0.0-dev-00004 24 04/27/2026
1.0.0-dev-00003 23 04/27/2026
1.0.0-dev-00002 21 04/27/2026