The recovery of structure from flat sequences of input data is a problem that almost all programs need to solve. Computer Science has developed a wide array of declarative languages for describing the structure of languages, usually based on the context-free grammar formalism, and there exist parser generators that produce efficient parsers for these descriptions. However, when faced with a problem involving parsing, most programmers opt for ad-hoc hand-coded solutions, or use parser combinator libraries to construct parsing functions.
This paper develops a hybrid approach, treating grammars as collections of active right-hand sides, indexed by a set of non-terminals. Active right-hand sides are built using the standard monadic parser combinators and allow the consumed input to affect the language being parsed, thus allowing for the precise description of the realistic languages that arise in programming.
We carefully investigate the semantics of grammars with active right-hand sides, not just from the point of view of language acceptance but also in terms of the generation of parse results. Ambiguous grammars may generate exponentially, or even infinitely, many parse results and these must be efficiently represented using Shared Packed Parse Forests (SPPFs). A particular feature of our approach is the use of Reynolds-style parametricity to ensure that the language that grammars describe cannot be affected by the representation of parse results.