The PEGTL includes several parts that go beyond the core library functionality.
They are included both for convenience and to show how certain things can be done with the PEGTL.
All feedback is highly welcome, in particular whether more sub-rules to serve as attachment points for actions are required.
Similarly, if you have written a grammar with the PEGTL that might be generally useful, you are welcome to contribute it for inclusion in future versions.
For all questions and remarks contact us at taocpp(at)icemx.net.
- Core ABNF rules according to RFC 5234, Appendix B.
- Ready for production use.
- Constants for ASCII letters.
- Shortens `string<'f','o','o'>` to `string<f,o,o>`.
- Ready for production use.
- Superseded by `TAO_PEGTL_STRING()`.
- HTTP 1.1 grammar according to RFC 7230.
- Has been used successfully but is still considered experimental.
- Grammars and actions for PEGTL-input-to-integer conversions.
- Limit the nesting depth of rules when parsing a grammar to prevent stack overflows.
- Can be applied selectively at specific rules to reduce overhead.
- See `src/test/pegtl/contrib_limit_depth.cpp`.
- JSON grammar according to RFC 7159 (for UTF-8 encoded JSON only).
- Ready for production use.
- Functions to handle exceptions that (might) contain nested exceptions.
- See Parse Tree.
- Grammar rules to parse Lua-style long (or raw) string literals.
- Ready for production use.
- Contains an optimised version of `rep< N, string< Cs... > >`:
- Rule `ascii::rep_string< N, Cs... >`.
- Contains an optimised version of `rep_min_max< Min, Max, ascii::one< C > >`:
- Rule `ascii::rep_one_min_max< Min, Max, C >`.
- Allows parsing rules separated by a separator.
- Rule `separated_seq< S, A, B, C, D >` is equivalent to `seq< A, S, B, S, C, S, D >`.
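The equivalence stated for `separated_seq` can be illustrated at the type level. The following is a minimal sketch in plain C++ that does not use the PEGTL; `seq`, `prepend`, `interleave` and the empty placeholder rules are stand-ins written for this illustration, and the actual implementation in the library may look quite different.

```cpp
#include <type_traits>

// Stand-in for a sequence rule; only the resulting type matters here.
template< typename... Rules >
struct seq {};

// Hypothetical helper that puts rule R in front of an existing seq<...>.
template< typename R, typename Seq >
struct prepend;

template< typename R, typename... Rules >
struct prepend< R, seq< Rules... > >
{
   using type = seq< R, Rules... >;
};

// Hypothetical helper that inserts the separator S between the rules.
template< typename S, typename... Rules >
struct interleave;

template< typename S >
struct interleave< S >
{
   using type = seq<>;
};

template< typename S, typename R >
struct interleave< S, R >
{
   using type = seq< R >;
};

template< typename S, typename R0, typename R1, typename... Rules >
struct interleave< S, R0, R1, Rules... >
{
   using tail = typename interleave< S, R1, Rules... >::type;
   using type = typename prepend< R0, typename prepend< S, tail >::type >::type;
};

template< typename S, typename... Rules >
using separated_seq = typename interleave< S, Rules... >::type;

// Placeholder rules and separator, named as in the text above.
struct A {};
struct B {};
struct C {};
struct D {};
struct S {};

static_assert( std::is_same< separated_seq< S, A, B, C, D >, seq< A, S, B, S, C, S, D > >::value,
               "the separator is inserted between consecutive rules" );
```

The `static_assert` compiles only if both types are identical, so the check is performed entirely by the compiler.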
Utility function `to_string<>()` that converts template classes with arbitrary sequences of characters as template arguments into a `std::string` that contains these characters.
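A conversion of this kind can be written with a partial specialisation that matches any class template whose arguments are all characters. The sketch below is self-contained and does not include any PEGTL header; `chars`, `to_string_impl` and `to_string_example` are hypothetical names chosen for the illustration, and the library's own code may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a rule class that carries characters as template arguments.
template< char... Cs >
struct chars {};

template< typename T >
struct to_string_impl;

// Matches every class template instantiated with a pack of chars.
template< template< char... > class X, char... Cs >
struct to_string_impl< X< Cs... > >
{
   static std::string get()
   {
      // The extra '\0' keeps the array non-empty when the pack is empty.
      const char data[] = { Cs..., '\0' };
      // Passing the length explicitly preserves embedded NUL characters.
      return std::string( data, sizeof...( Cs ) );
   }
};

template< typename T >
std::string to_string()
{
   return to_string_impl< T >::get();
}

// Usage example.
inline void to_string_example()
{
   assert( ( to_string< chars< 'f', 'o', 'o' > >() == "foo" ) );
   assert( to_string< chars<> >().empty() );
}
```

The doubled parentheses in the first `assert` are needed because the commas in the template argument list would otherwise be read as macro argument separators.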
- See Tracer.
This file contains helpers to unescape JSON, C, and similar escape sequences.
- Utility functions frequently needed to unescape escape-sequences.
- Action classes that perform unescaping of escape-sequences.
- URI grammar according to RFC 3986.
- This is still experimental.
Reads a file with an ABNF (RFC 5234)-style grammar and converts it into corresponding PEGTL rules in C++. Some extensions and restrictions compared to RFC 5234:
- As we are defining PEGs, the alternations are now ordered (`sor<>`).
- The and- and not-predicates from PEGs have been added as `&` and `!`, respectively.
- A single LF is also accepted as line ending.
- C++ identifiers are formed by replacing the dashes in rulenames with underscores.
- Reserved identifiers (keywords, ...) are rejected.
- Numerical values must fit into the corresponding C++ data type.
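The identifier rules from the list above can be sketched as a small function. This is not the code used by the converter; the function name `to_cpp_identifier` is hypothetical, and the keyword list is deliberately abbreviated to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Turns an ABNF rulename such as "field-name" into a C++ identifier.
// Throws std::invalid_argument when the result is a C++ keyword.
inline std::string to_cpp_identifier( const std::string& rulename )
{
   assert( !rulename.empty() );

   // Abbreviated list for this sketch, a real tool needs all reserved words.
   static const std::set< std::string > keywords = {
      "and", "bool", "break", "case", "char", "class", "do", "double", "else",
      "for", "if", "int", "long", "not", "or", "return", "struct", "while"
   };

   std::string result = rulename;
   std::replace( result.begin(), result.end(), '-', '_' );

   if( keywords.count( result ) != 0 ) {
      throw std::invalid_argument( "rulename '" + rulename + "' is a reserved C++ identifier" );
   }
   return result;
}
```

With this sketch `to_cpp_identifier( "field-name" )` yields `field_name`, while a rulename like `class` is rejected with an exception.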
A small example that provokes the grammar analysis to find problems.
A calculator with all binary operators from the C language that shows
- how to use stack-based actions to perform a calculation on-the-fly during the parsing run, and
- how to build a grammar with a run-time data structure for arbitrary binary operators with arbitrary precedence and associativity.
In addition to the binary operators, round brackets can be used to change the evaluation order. The implementation uses `long` integers as data type for all calculations.
$ build/src/example/pegtl/calculator "2 + 3 * -7" "(2 + 3) * 7"
-19
35
In this example the grammar takes second place to the infrastructure for the actions required to actually evaluate the arithmetic expressions. The basic approach is "shift-reduce", which is very close to a stack machine, a model often well suited to PEGTL grammar actions: some actions merely push something onto a stack, while other actions apply functions to the objects on the stack, usually reducing its size.
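The shift-reduce idea can be shown in isolation from any grammar. The following sketch is not the code of the calculator example; `op_info`, `stack_machine` and their member functions are hypothetical names, only three operators are registered, all of them are treated as left-associative, and brackets are left out. In a parser, `number()` and `binary()` would be called from the actions attached to the corresponding rules.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <vector>

// Run-time description of one binary operator.
struct op_info
{
   int priority;                   // A higher value binds more tightly.
   long ( *apply )( long, long );  // Function applied when reducing.
};

inline long op_add( const long l, const long r ) { return l + r; }
inline long op_sub( const long l, const long r ) { return l - r; }
inline long op_mul( const long l, const long r ) { return l * r; }

class stack_machine
{
public:
   stack_machine()
   {
      m_table[ '+' ] = op_info{ 1, op_add };
      m_table[ '-' ] = op_info{ 1, op_sub };
      m_table[ '*' ] = op_info{ 2, op_mul };
   }

   // Shift: push an operand onto the value stack.
   void number( const long value )
   {
      m_values.push_back( value );
   }

   // Reduce pending operators that bind at least as tightly, then shift the new one.
   void binary( const char op )
   {
      const auto it = m_table.find( op );
      if( it == m_table.end() ) {
         throw std::invalid_argument( "unknown operator" );
      }
      while( !m_ops.empty() && ( m_ops.back().priority >= it->second.priority ) ) {
         reduce();
      }
      m_ops.push_back( it->second );
   }

   // Reduce everything that is still pending and return the result.
   long finish()
   {
      while( !m_ops.empty() ) {
         reduce();
      }
      assert( m_values.size() == 1 );
      const long result = m_values.back();
      m_values.clear();
      return result;
   }

private:
   // Pops two operands and one operator, pushes the result.
   void reduce()
   {
      assert( m_values.size() >= 2 );
      const long r = m_values.back();
      m_values.pop_back();
      const long l = m_values.back();
      m_values.pop_back();
      const op_info op = m_ops.back();
      m_ops.pop_back();
      m_values.push_back( op.apply( l, r ) );
   }

   std::map< char, op_info > m_table;
   std::vector< long > m_values;
   std::vector< op_info > m_ops;
};
```

Feeding the tokens of `2 + 3 * -7` as `number( 2 )`, `binary( '+' )`, `number( 3 )`, `binary( '*' )`, `number( -7 )` and then calling `finish()` returns `-19`, the same value the calculator prints for this input. Because the operators live in a table that is filled at run time, adding an operator or changing its priority does not require touching the reduction logic.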
Examples of grammars for regular, context-free, and context-sensitive languages.
Two simple examples for grammars that parse different kinds of CSV-style file formats.
Minimal parser-style "hello world" example from the Getting Started page.
Shows one approach to implementing an indentation-aware language with a very small subset of Python.
Shows how to use the custom error messages defined in `json_errors.hpp` with the `<tao/pegtl/contrib/json.hpp>` grammar to parse command line arguments as JSON data.
Extends `json_parse.cpp` by parsing JSON files into a generic JSON data structure.
Shows how to use a simple custom control to create some parsing statistics while parsing JSON files.
Parses all files passed on the command line with a slightly experimental grammar that should correspond to the Lua 5.3 lexer and parser.
Shows how to implement a custom parsing rule with the simplified calling convention.
A small example which shows how to create a parse tree for a given grammar using `<tao/pegtl/contrib/parse_tree.hpp>`.
The example shows how to choose which rules will produce a parse tree node, which rules will store the content, and how to add additional transformations to the parse tree to transform it into an AST-like structure or to simplify it.
The output is in DOT format and can be converted into a graph.
$ build/src/example/pegtl/parse_tree "(2*a + 3*b) / (4*n)" | dot -Tsvg -o parse_tree.svg
The above will generate an SVG file with a graphical representation of the parse tree.
Experimental grammar that parses Protocol Buffers (`.proto3`) files.
See PEGTL issue 55 and the source code for a description.
Grammar for a toy-version of S-expressions that shows how to include other files during a parsing run.
Simple example that adds a list of comma-separated `double`s read from `std::cin`.
Simple example that shows how to parse with a symbol table.
Uses the building blocks from `<tao/pegtl/contrib/unescape.hpp>` to show how to actually unescape a string literal with various typical escape sequences.
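To make clear what unescaping a string literal involves, here is a minimal stand-alone sketch in plain C++. It does not use the building blocks from the PEGTL, where the work is split across grammar rules and actions; the functions `hex_value`, `unescape` and `unescape_example` are hypothetical, and only a small subset of C-style escape sequences is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts one hexadecimal digit to its numeric value.
inline int hex_value( const char c )
{
   if( ( c >= '0' ) && ( c <= '9' ) ) {
      return c - '0';
   }
   if( ( c >= 'a' ) && ( c <= 'f' ) ) {
      return c - 'a' + 10;
   }
   if( ( c >= 'A' ) && ( c <= 'F' ) ) {
      return c - 'A' + 10;
   }
   throw std::invalid_argument( "invalid hexadecimal digit" );
}

// Replaces \n, \t, \r, \\, \" and \xHH by the characters they stand for.
inline std::string unescape( const std::string& in )
{
   std::string out;
   out.reserve( in.size() );
   for( std::size_t i = 0; i < in.size(); ++i ) {
      if( in[ i ] != '\\' ) {
         out += in[ i ];  // Ordinary character, copied unchanged.
         continue;
      }
      if( ++i == in.size() ) {
         throw std::invalid_argument( "incomplete escape sequence" );
      }
      switch( in[ i ] ) {
         case 'n': out += '\n'; break;
         case 't': out += '\t'; break;
         case 'r': out += '\r'; break;
         case '\\': out += '\\'; break;
         case '"': out += '"'; break;
         case 'x':
            // Exactly two hexadecimal digits must follow the 'x'.
            if( in.size() - i < 3 ) {
               throw std::invalid_argument( "incomplete \\x escape sequence" );
            }
            out += static_cast< char >( hex_value( in[ i + 1 ] ) * 16 + hex_value( in[ i + 2 ] ) );
            i += 2;
            break;
         default:
            throw std::invalid_argument( "unknown escape sequence" );
      }
   }
   return out;
}

// Usage example.
inline void unescape_example()
{
   assert( unescape( "a\\tb" ) == "a\tb" );
   assert( unescape( "say \\\"hi\\\"" ) == "say \"hi\"" );
}
```

A grammar-based solution has the advantage that malformed escape sequences are reported with the position where parsing failed, which this hand-written loop does not attempt.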
Shows how to use the included tracer control, here together with the URI grammar from `<tao/pegtl/contrib/uri.hpp>`.
When invoked with one or more URIs as command line arguments, it attempts to parse the URIs while printing trace information to `std::cerr`.
This document is part of the PEGTL.
Copyright (c) 2014-2023 Dr. Colin Hirsch and Daniel Frey
Distributed under the Boost Software License, Version 1.0
See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt