<p><a href="http://bentnib.org/">Bob Atkey's blog</a></p>
<h1><a href="http://bentnib.org/posts/2016-10-10-off-the-beaten-track.html">Off the Beaten Track 2017: Call for Talk Proposals</a></h1>
<p>This year I am the programme chair
for
<a href="http://conf.researchr.org/track/POPL-2017/OBT-2017">Off the Beaten Track 2017</a>!
This will be held on 21st January 2017, co-located with POPL 2017 in
Paris, France.</p><h2>Background</h2>
<p>Programming language researchers have the principles, tools,
algorithms and abstractions to solve all kinds of problems, in all
areas of computer science. However, identifying and evaluating new
problems, particularly those that lie outside the typical core PL
problems we all know and love, can be a significant challenge. This
workshop’s goal is to identify and discuss problems that do not often
show up in our top conferences, but where programming language
research can make a substantial impact. We hope fora like this will
increase the diversity of problems that are studied by PL researchers
and thus increase our community’s impact on the world.</p><p>While many workshops associated with POPL have become more like
mini-conferences themselves, this is an anti-goal for OBT. The
workshop will be informal and structured to encourage discussion. We
are at least as interested in problems as in solutions.</p><h2>Scope</h2>
<p>A good submission is one that outlines a new problem or an
interesting, underrepresented problem domain. Good submissions may
also remind the PL community of problems that were once in vogue but
have not recently been seen in top PL conferences. Good submissions do
not need to propose complete or even partial solutions, though there
should be some reason to believe that programming languages
researchers have the tools necessary to search for solutions in the
area at hand. Submissions that seem likely to stimulate discussion
about the direction of programming language research are encouraged.</p><p>Use your imagination. It's hard to imagine how a paper that discusses
programming languages could be considered out of scope. If in doubt,
ask the program chair.</p><h2>Previous OBTs</h2>
<p>2017 marks the sixth year of OBT and its co-location with POPL. The
previous five workshops were:</p><ul><li><a href="http://conf.researchr.org/track/POPL-2016/OBT-2016-talks">OBT 2016</a>, St. Petersburg, Florida, USA</li><li><a href="http://www.cs.rice.edu/~sc40/obt15/">OBT 2015</a>, Mumbai, India</li><li><a href="http://popl-obt-2014.cs.brown.edu/">OBT 2014</a>, San Diego, USA</li><li><a href="http://goto.ucsd.edu/~rjhala/OBT2013/">OBT 2013</a>, Rome, Italy</li><li><a href="http://www.cs.princeton.edu/~dpw/obt/">OBT 2012</a>, Philadelphia, USA</li></ul>
<h2>Important Dates</h2>
<ul><li>10th November 2016: Submission deadline</li><li>8th December 2016: Notification</li><li>(18th December 2016: POPL early registration)</li><li>21st January 2017: Workshop</li></ul>
<h2>Submission</h2>
<p>Please submit your talk proposal via EasyChair:</p><p><a href="https://easychair.org/conferences/?conf=obt2017">https://easychair.org/conferences/?conf=obt2017</a></p><p>All submissions should be in PDF format, two pages or less, in at
least 10pt font, printable on A4 and on US Letter paper. Authors are
welcome to include links to multimedia content such as YouTube videos
or online demos. Reviewers may or may not view linked documents; it is
up to authors to convince the reviewers to do so.</p><p>For each accepted submission, one of the authors will give a talk at
the workshop. The length of the talk will depend on the submissions
received and how the program committee decides to assemble the
program.</p><p>Reviewing of submissions will be very light. Authors should not expect
a detailed analysis of their submission by the program
committee. Accepted submissions will be posted as is on this web
site. By submitting a document, you agree that if it is accepted, it
may be posted and you agree that one of the co-authors will attend the
workshop and give a talk there. There will be no revision process and
no formal publication.</p><h2>Organisers</h2>
<p>General chair:</p><ul><li>Lindsey Kuper, Intel Labs, USA</li></ul>
<p>Programme chair:</p><ul><li>Robert Atkey, University of Strathclyde, UK</li></ul>
<p>Programme committee:</p><ul><li>Ekaterina Komendantskaya, Heriot-Watt University, UK</li><li>Chris Martens, North Carolina State University, USA</li><li>Tomas Petricek, University of Cambridge, UK</li><li>Wren Romano, Google Inc., USA</li><li>Mary Sheeran, Chalmers University of Technology, Sweden</li><li>KC Sivaramakrishnan, University of Cambridge, UK</li><li>Wouter Swierstra, Utrecht University, Netherlands</li></ul>
<p>Posted 10th October 2016.</p>
<h1><a href="http://bentnib.org/posts/2016-04-12-authenticated-data-structures-as-a-library.html">Authenticated Data Structures, as a Library, for Free!</a></h1>
<p>Let's assume that you're querying some database stored in the cloud
(i.e., on someone else’s computer). Being of a sceptical mind, you
worry about whether the answers you get back are from the database
you expect. Or is the cloud lying to you?</p><p>Authenticated Data Structures (ADSs) are a proposed solution to this
problem. When the server sends back its answers, it also sends back a
“proof” that the answer came from the database it claims. You, the
client, verify this proof. If the proof doesn't verify, then you’ve
got evidence that the server was lying. If the proof does verify, then
there is a guarantee that the server’s response is legitimate (usually
up to the possibility of a hash collision).</p><p>This all seems great, but doesn’t address the question of how anyone
might build an ADS, and, crucially, prove that it has the security of
unforgeable (up to hash collisions) verification. A brute-force way to
implement an ADS is for the client to retain a copy of the database,
and to check the server’s results against the copy. But then what
would be the point of the server?</p><p><a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle Trees</a> are the
original example of an ADS. Merkle trees solve the problem of needing
a complete copy of the database by having the client store a <em>hash</em> of
the database. The server’s proof is then verified against the
hash. But what if we want a data structure that isn’t a tree? Will we
have to implement our own ADS? How will we know that we have done it
correctly? Implementing a completely new ADS has three problems. We
need: to invent a way to do the authentication, a proof that
authentication has been done correctly, and a proof that we have
correctly implemented it.</p><p>Andrew Miller, Michael Hicks, Jonathan Katz, and Elaine Shi have a
solution to this meta-problem in their POPL 2014 paper
<a href="http://www.cs.umd.edu/~mwh/papers/gpads.pdf">Authenticated Data Structures, Generically</a>,
which describes a new language “Lambda-Auth” for implementing
correct-by-construction ADSs.</p><p>Miller et al. describe a programming language with special constructs
for writing ADSs. Once you’ve written an implementation of your data
structure, you annotate the implementation with ‘authentication’
markers that indicate points in the structure that act as
authentication checkpoints. On the client (verifier) side, each
checkpoint becomes a hash code representing the real data, which will
be checked against the proof sent by the server. On the server
(prover) side, each checkpoint is an indicator of where to generate a
piece of proof. The key insight of Miller et al. is that the client
and server run <em>the same code</em>, just with two different
interpretations of what the authentication checkpoints mean. They are
then able to prove, <em>for all programs</em>, that the server and client
sides will always agree, and that proofs of authentication are
unforgeable (up to hash collisions). The authors wrote a
<a href="http://www.pl-enthusiast.net/2014/06/11/authenticated-data-structures-generically/">PL Enthusiast blog post</a>
with a gentle introduction, and
<a href="http://www.pl-enthusiast.net/2014/06/23/ads-generically-part-ii/">a second part</a>
with more detail.</p><p>In this post, I'll show that in a language with sufficiently powerful
abstraction facilities (in this case <a href="http://ocaml.org">OCaml</a>), it is
possible to implement Miller et al.’s solution as a <em>library</em> within
the language, with no need to create a new language, or to alter an
existing language’s implementation.</p><p>Moreover, although I won’t give any details in this blog post, I claim
that the correctness proof given by Miller et al. will be an instance
of
<a href="http://bentnib.org/fomega-parametricity.html">parametricity for higher-kinded types</a>. I’ll
give a few sketchy details at the end.</p><h2>Authenticated Data Structures as a Library</h2>
<p>To get started, I need to isolate a description of the bits and pieces
we need to be able to write authenticated data structures, taking
Miller et al.’s paper as a guide. OCaml provides module signatures as
a handy way of collecting together requirements for the existence of
certain types and functions. I'll call the types and functions we need
for making ADSs an <code>AUTHENTIKIT</code>: a kit for implementing authenticated
data structures:</p><pre><code><span class="ocamlkeyword">module</span> <span class="ocamlkeyword">type</span> AUTHENTIKIT <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">sig</span></code></pre>
<p>The first thing is to postulate the existence of the type constructor
<code>auth</code> that represents the type of authenticated values. This is the
OCaml rendering of the “•” type constructor used in Miller et al.’s
language for representing authenticated values. Note that I haven’t
committed to any implementation of this type: it is left completely
abstract in the <code>AUTHENTIKIT</code> interface:</p><pre><code> <span class="ocamlkeyword">type</span> 'a auth</code></pre>
<p>The second thing we’ll need is a way of describing abstract authenticated computations:
computations that are either generating proofs or verifying proofs. In
Miller et al.’s setup, authenticated computations are built in to the
language. However, OCaml comes with its own fixed notion of
computation, so we have to use a monad to layer our own notions on
top:</p><pre><code> <span class="ocamlkeyword">type</span> 'a authenticated_computation
<span class="ocamlkeyword">val</span> return <span class="ocamlsymbol">:</span> 'a <span class="ocamlsymbol">-></span> 'a authenticated_computation
<span class="ocamlkeyword">val</span> <span class="ocamlsymbol">(</span><span class="ocamlsymbol">>>=</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">:</span> 'a authenticated_computation <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span>'a <span class="ocamlsymbol">-></span> 'b authenticated_computation<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span>
'b authenticated_computation</code></pre>
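<p>As a warm-up for the implementations below, it may help to see the kind of monad this signature describes, in isolation. The following is my own sketch, not code from the post: a writer monad over string lists, where <code>return</code> produces no output and <code>>>=</code> threads the value through while accumulating the log. The prover's authenticated computations will turn out to work just like this, with proof steps in place of strings (the names <code>writer</code> and <code>tell</code> are invented for the sketch):</p>

```ocaml
(* A writer monad over string lists: [return] produces an empty log, and
   [>>=] runs both computations and concatenates their logs. *)
type 'a writer = string list * 'a

let return (a : 'a) : 'a writer = ([], a)

let ( >>= ) ((log, a) : 'a writer) (f : 'a -> 'b writer) : 'b writer =
  let (log', b) = f a in
  (log @ log', b)

(* Emit one log entry as a "side effect" of the computation. *)
let tell (msg : string) : unit writer = ([msg], ())

let example : int writer =
  tell "step 1" >>= fun () ->
  tell "step 2" >>= fun () ->
  return 42
(* example = (["step 1"; "step 2"], 42) *)
```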
<p>The third thing I’ll need is a way of proving that the types of data
that we want to authenticate are “authenticatable”. This will
essentially mean that they are serialisable to a string representation
in a way that is suitable for proof construction and verification. The
requirement to be serialisable is hidden in Miller et al.’s formalism
because they assume that all data in the language is
serialisable. This is a somewhat fishy assumption (how do you
serialise and deserialise a function without breaking abstraction
boundaries?), so here I’m being a bit more formal, and requiring the
programmer to build evidence of serialisability using the following
combinators tucked away in a submodule of any <code>AUTHENTIKIT</code>
implementation:</p><pre><code> <span class="ocamlkeyword">module</span> Authenticatable <span class="ocamlsymbol">:</span> <span class="ocamlkeyword">sig</span>
<span class="ocamlkeyword">type</span> 'a evidence
<span class="ocamlkeyword">val</span> auth <span class="ocamlsymbol">:</span> 'a auth evidence
<span class="ocamlkeyword">val</span> pair <span class="ocamlsymbol">:</span> 'a evidence <span class="ocamlsymbol">-></span> 'b evidence <span class="ocamlsymbol">-></span> <span class="ocamlsymbol">(</span>'a <span class="ocamlsymbol">*</span> 'b<span class="ocamlsymbol">)</span> evidence
<span class="ocamlkeyword">val</span> sum <span class="ocamlsymbol">:</span> 'a evidence <span class="ocamlsymbol">-></span> 'b evidence <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">[</span>`left <span class="ocamlkeyword">of</span> 'a <span class="ocamlsymbol">|</span> `right <span class="ocamlkeyword">of</span> 'b<span class="ocamlsymbol">]</span> evidence
<span class="ocamlkeyword">val</span> string <span class="ocamlsymbol">:</span> string evidence
<span class="ocamlkeyword">val</span> int <span class="ocamlsymbol">:</span> int evidence
<span class="ocamlkeyword">end</span></code></pre>
<p>Finally, I postulate the existence of the <code>auth</code> and <code>unauth</code>
functions as Miller et al. do. The difference here is that I have
explicitly requested evidence that the data type involved is
authenticatable. Also, the <code>unauth</code> function returns a computation in
our authenticated computation monad, indicating that this is where
some of the work of authentication will
happen.</p><pre><code> <span class="ocamlkeyword">val</span> auth <span class="ocamlsymbol">:</span> 'a Authenticatable<span class="ocamlsymbol">.</span>evidence <span class="ocamlsymbol">-></span> 'a <span class="ocamlsymbol">-></span> 'a auth
<span class="ocamlkeyword">val</span> unauth <span class="ocamlsymbol">:</span> 'a Authenticatable<span class="ocamlsymbol">.</span>evidence <span class="ocamlsymbol">-></span> 'a auth <span class="ocamlsymbol">-></span> 'a authenticated_computation
<span class="ocamlkeyword">end</span></code></pre>
<h3>An Authenticated Data Structure: Merkle Trees</h3>
<p>Before I describe the prover and verifier implementations of the
<code>AUTHENTIKIT</code> module type, I’ll give an example of an authenticated
data structure implementation. I’ll do a basic Merkle tree, with the
following interface:</p><pre><code><span class="ocamlkeyword">module</span> <span class="ocamlkeyword">type</span> MERKLE <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">functor</span> <span class="ocamlsymbol">(</span>A <span class="ocamlsymbol">:</span> AUTHENTIKIT<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span> <span class="ocamlkeyword">sig</span>
<span class="ocamlkeyword">open</span> A
<span class="ocamlkeyword">type</span> path <span class="ocamlsymbol">=</span> <span class="ocamlsymbol">[</span>`L <span class="ocamlsymbol">|</span> `R<span class="ocamlsymbol">]</span> list
<span class="ocamlkeyword">type</span> tree <span class="ocamlsymbol">=</span> <span class="ocamlsymbol">[</span>`left <span class="ocamlkeyword">of</span> string <span class="ocamlsymbol">|</span> `right <span class="ocamlkeyword">of</span> tree <span class="ocamlsymbol">*</span> tree<span class="ocamlsymbol">]</span> auth
<span class="ocamlkeyword">val</span> make_leaf <span class="ocamlsymbol">:</span> string <span class="ocamlsymbol">-></span> tree
<span class="ocamlkeyword">val</span> make_branch <span class="ocamlsymbol">:</span> tree <span class="ocamlsymbol">-></span> tree <span class="ocamlsymbol">-></span> tree
<span class="ocamlkeyword">val</span> retrieve <span class="ocamlsymbol">:</span> path <span class="ocamlsymbol">-></span> tree <span class="ocamlsymbol">-></span> string option authenticated_computation
<span class="ocamlkeyword">val</span> update <span class="ocamlsymbol">:</span> path <span class="ocamlsymbol">-></span> string <span class="ocamlsymbol">-></span> tree <span class="ocamlsymbol">-></span> tree option authenticated_computation
<span class="ocamlkeyword">end</span></code></pre>
<p>The first thing to note about this module signature is that its
implementations are parameterised by implementations of
<code>AUTHENTIKIT</code>. This will be characteristic of any authenticated data
structure we implement in this style: we need the
freedom to instantiate the implementation with alternative
<code>AUTHENTIKIT</code>s to get the prover and verifier sides of the system.</p><p>The second thing to note is that the data structure interface is a
very lightly annotated version of a completely boring binary tree
interface. We have types for trees and paths, and operations to
construct trees, query trees, and update trees. The only differences
from a normal interface are that the tree type is annotated with an
extra <code>auth</code> type constructor, indicating how the tree is augmented
with authentication information, and the <code>authenticated_computation</code>
type constructors on the <code>retrieve</code> and <code>update</code> operations,
indicating that they are authenticated operations.</p><p>The implementation is likewise a straightforward implementation of a
binary tree in OCaml, with a couple of extra annotations for the
authentication. The <code>MERKLE</code> module signature states that
implementations are parameterised by <code>AUTHENTIKIT</code>s, so we declare the
<code>Merkle</code> implementation like so:</p><pre><code><span class="ocamlkeyword">module</span> Merkle <span class="ocamlsymbol">:</span> MERKLE <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">functor</span> <span class="ocamlsymbol">(</span>A <span class="ocamlsymbol">:</span> AUTHENTIKIT<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span> <span class="ocamlkeyword">struct</span>
<span class="ocamlkeyword">open</span> A</code></pre>
<p>The <code>open A</code> introduces all the members of the <code>AUTHENTIKIT</code> into
scope, so we don't have to qualify any names. The first thing to do
is to fulfil the promise of definitions for the types <code>path</code> and
<code>tree</code> that we gave in the interface:</p><pre><code> <span class="ocamlkeyword">type</span> path <span class="ocamlsymbol">=</span> <span class="ocamlsymbol">[</span>`L<span class="ocamlsymbol">|</span>`R<span class="ocamlsymbol">]</span> list
<span class="ocamlkeyword">type</span> tree <span class="ocamlsymbol">=</span> <span class="ocamlsymbol">[</span>`left <span class="ocamlkeyword">of</span> string <span class="ocamlsymbol">|</span> `right <span class="ocamlkeyword">of</span> tree <span class="ocamlsymbol">*</span> tree <span class="ocamlsymbol">]</span> auth</code></pre>
<p>Next, we construct some evidence that the body of the tree type is
authenticatable, so we will be able to use <code>auth</code> and <code>unauth</code> on
trees.</p><pre><code> <span class="ocamlkeyword">let</span> tree <span class="ocamlsymbol">:</span> <span class="ocamlsymbol">[</span>`left <span class="ocamlkeyword">of</span> string <span class="ocamlsymbol">|</span> `right <span class="ocamlkeyword">of</span> tree <span class="ocamlsymbol">*</span> tree<span class="ocamlsymbol">]</span> Authenticatable<span class="ocamlsymbol">.</span>evidence <span class="ocamlsymbol">=</span>
Authenticatable<span class="ocamlsymbol">.</span><span class="ocamlsymbol">(</span>sum string <span class="ocamlsymbol">(</span>pair auth auth<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span></code></pre>
<p>The <code>make_leaf</code> and <code>make_branch</code> functions build leaves and branches
using the appropriate constructors. </p><pre><code> <span class="ocamlkeyword">let</span> make_leaf s <span class="ocamlsymbol">=</span>
auth tree <span class="ocamlsymbol">(</span>`left s<span class="ocamlsymbol">)</span>
<span class="ocamlkeyword">let</span> make_branch l r <span class="ocamlsymbol">=</span>
auth tree <span class="ocamlsymbol">(</span>`right <span class="ocamlsymbol">(</span>l<span class="ocamlsymbol">,</span>r<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span></code></pre>
<p>To query and modify our authenticated data structure, we have the two
functions <code>retrieve</code> and <code>update</code>. First, the implementation of <code>retrieve</code>, which
takes a path and a tree and returns the data identified by that path,
if it exists. </p><pre><code> <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> retrieve path t <span class="ocamlsymbol">=</span>
unauth tree t <span class="ocamlsymbol">>>=</span> <span class="ocamlkeyword">fun</span> t <span class="ocamlsymbol">-></span>
<span class="ocamlkeyword">match</span> path<span class="ocamlsymbol">,</span> t <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">[</span><span class="ocamlsymbol">]</span><span class="ocamlsymbol">,</span> `left s <span class="ocamlsymbol">-></span> return <span class="ocamlsymbol">(</span>Some s<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> `L<span class="ocamlsymbol">::</span>path<span class="ocamlsymbol">,</span> `right <span class="ocamlsymbol">(</span>l<span class="ocamlsymbol">,</span>r<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span> retrieve path l
<span class="ocamlsymbol">|</span> `R<span class="ocamlsymbol">::</span>path<span class="ocamlsymbol">,</span> `right <span class="ocamlsymbol">(</span>l<span class="ocamlsymbol">,</span>r<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span> retrieve path r
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span><span class="ocamlsymbol">,</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> return None</code></pre>
<p>As I mentioned above, this is really nothing more than a standard
binary tree search implementation, obfuscated by the need to unwrap
the authenticated input, and the additional <code>>>=</code> and <code>return</code>s needed
to track the authenticated computation. The <code>update</code> implementation is
likewise close to a standard binary tree update function:</p><pre><code> <span class="ocamlkeyword">let</span> <span class="ocamlkeyword">rec</span> update path v t <span class="ocamlsymbol">=</span>
unauth tree t <span class="ocamlsymbol">>>=</span> <span class="ocamlkeyword">fun</span> t <span class="ocamlsymbol">-></span>
<span class="ocamlkeyword">match</span> path<span class="ocamlsymbol">,</span> t <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">[</span><span class="ocamlsymbol">]</span><span class="ocamlsymbol">,</span> `left <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span>
return <span class="ocamlsymbol">(</span>Some <span class="ocamlsymbol">(</span>make_leaf v<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> `L<span class="ocamlsymbol">::</span>path<span class="ocamlsymbol">,</span> `right <span class="ocamlsymbol">(</span>l<span class="ocamlsymbol">,</span>r<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span>update path v l <span class="ocamlsymbol">>>=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> None <span class="ocamlsymbol">-></span> return None
<span class="ocamlsymbol">|</span> Some l' <span class="ocamlsymbol">-></span> return <span class="ocamlsymbol">(</span>Some <span class="ocamlsymbol">(</span>make_branch l' r<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> `R<span class="ocamlsymbol">::</span>path<span class="ocamlsymbol">,</span> `right <span class="ocamlsymbol">(</span>l<span class="ocamlsymbol">,</span>r<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span>update path v r <span class="ocamlsymbol">>>=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> None <span class="ocamlsymbol">-></span> return None
<span class="ocamlsymbol">|</span> Some r' <span class="ocamlsymbol">-></span> return <span class="ocamlsymbol">(</span>Some <span class="ocamlsymbol">(</span>make_branch l r'<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span>
return None</code></pre>
<p>Finally, an <code>end</code> to complete the definition of <code>Merkle</code>:</p><pre><code> <span class="ocamlkeyword">end</span></code></pre>
<p>Just to emphasise again, this implementation is essentially the same
as a standard binary tree implementation written in OCaml. In the
code, we didn’t have to mention anything to do with authenticated data
structures, except for placing <code>auth</code> and <code>unauth</code> in the correct
places. And even if we get in a muddle doing that, the OCaml type
checker will helpfully inform us where we’ve made a mistake. One
wrinkle is that I've had to give up writing in “direct style” and had
to write in monadic style.</p><h3>The Prover</h3>
<p>The <code>Merkle</code> module is not yet ready for use. To use it we need an
implementation of the <code>AUTHENTIKIT</code> interface. The different
implementations of <code>AUTHENTIKIT</code> will correspond to the different
semantics in Miller et al.’s presentation: the prover and the
verifier.</p><p>The first implementation of <code>AUTHENTIKIT</code> I'll give is the prover.</p><h4>Proofs</h4>
<p>The prover constructs proofs, which I'll represent as lists of JSON
values, using the
<a href="https://opam.ocaml.org/packages/ezjsonm/ezjsonm.0.4.3/">Ezjsonm</a>
library.</p><pre><code><span class="ocamlkeyword">type</span> proof <span class="ocamlsymbol">=</span> Ezjsonm<span class="ocamlsymbol">.</span>value list</code></pre>
<p>We will also need to compute cryptographic hashes of JSON values,
which I'll do using the
<a href="https://opam.ocaml.org/packages/cryptokit/cryptokit.1.10/">Cryptokit</a>
library. I'm using the SHA1 algorithm here, but obviously any secure
hashing algorithm would work. Two details are worth noting: Cryptokit
hash objects are not safely reusable across calls, so we create a
fresh one on each call; and the value is wrapped in a singleton array
because <code>Ezjsonm.to_string</code> expects a JSON document rather
than a bare value.</p><pre><code><span class="ocamlkeyword">let</span> hash_json json_value <span class="ocamlsymbol">=</span>
  Cryptokit<span class="ocamlsymbol">.</span>hash_string <span class="ocamlsymbol">(</span>Cryptokit<span class="ocamlsymbol">.</span>Hash<span class="ocamlsymbol">.</span>sha1 <span class="ocamlsymbol">(</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
    <span class="ocamlsymbol">(</span>Ezjsonm<span class="ocamlsymbol">.</span>to_string <span class="ocamlsymbol">(</span>`A <span class="ocamlsymbol">[</span>json_value<span class="ocamlsymbol">]</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span></code></pre>
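<p>If you want to experiment without installing Cryptokit or Ezjsonm, a standard-library stand-in behaves analogously. This is my own sketch, not code from the post: <code>Digest</code> gives MD5, which is fine for playing around but not collision-resistant enough for real authentication:</p>

```ocaml
(* Stdlib stand-in for hash_json: MD5 via Digest instead of SHA1 via
   Cryptokit, applied to an already-serialised string instead of an
   Ezjsonm value.  For experimentation only. *)
let hash_string (s : string) : string =
  Digest.to_hex (Digest.string s)

let () =
  (* Equal serialisations give equal hashes... *)
  assert (hash_string {|["left","a"]|} = hash_string {|["left","a"]|});
  (* ...different serialisations give different hashes... *)
  assert (hash_string {|["left","a"]|} <> hash_string {|["left","b"]|});
  (* ...and the digest is fixed-size: MD5 is 16 bytes = 32 hex chars. *)
  assert (String.length (hash_string "") = 32)
```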
<h4>Prover Implementation</h4>
<p>The module <code>Prover</code> implements the <code>AUTHENTIKIT</code> signature, plus the
additional knowledge that a) authenticated computations generate
proofs as well as values, and b) we have a <code>get_hash</code> function
that returns the hashed representation of any authenticated value. The
prover’s interface is captured by its module signature:</p><pre><code><span class="ocamlkeyword">module</span> Prover <span class="ocamlsymbol">:</span> <span class="ocamlkeyword">sig</span>
<span class="ocamlkeyword">include</span> AUTHENTIKIT <span class="ocamlkeyword">with</span> <span class="ocamlkeyword">type</span> 'a authenticated_computation <span class="ocamlsymbol">=</span> proof <span class="ocamlsymbol">*</span> 'a
<span class="ocamlkeyword">val</span> get_hash <span class="ocamlsymbol">:</span> 'a auth <span class="ocamlsymbol">-></span> string
<span class="ocamlkeyword">end</span> <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">struct</span></code></pre>
<p>First, we implement the prover’s view of authenticated values by
defining its implementation of the <code>auth</code> type. The prover sees an
authenticated value as a pair of an underlying value <code>x</code> and a hash of
<code>x</code>’s representation:</p><pre><code> <span class="ocamlkeyword">type</span> 'a auth <span class="ocamlsymbol">=</span> 'a <span class="ocamlsymbol">*</span> string
<span class="ocamlkeyword">let</span> get_hash <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span>h<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span> h</code></pre>
<p>It will be up to the rest of the functions in this module to maintain
the invariant that the second half of the pair will always be the hash
code of the serialised representation of the first half.</p><p>Authenticated computations on the prover side are represented using a
specialisation of the standard Writer monad, which collects a proof
(list of JSON values) as a side-effect. This implementation is quite
slow, because I have used lists, but could be sped up by using a
better data structure.</p><pre><code> <span class="ocamlkeyword">type</span> 'a authenticated_computation <span class="ocamlsymbol">=</span>
proof <span class="ocamlsymbol">*</span> 'a
<span class="ocamlkeyword">let</span> return a <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">(</span><span class="ocamlsymbol">[</span><span class="ocamlsymbol">]</span><span class="ocamlsymbol">,</span> a<span class="ocamlsymbol">)</span>
<span class="ocamlkeyword">let</span> <span class="ocamlsymbol">(</span><span class="ocamlsymbol">>>=</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">(</span>prf<span class="ocamlsymbol">,</span>a<span class="ocamlsymbol">)</span> f <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> <span class="ocamlsymbol">(</span>prf'<span class="ocamlsymbol">,</span>b<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span> f a in
<span class="ocamlsymbol">(</span>prf <span class="ocamlsymbol">@</span> prf'<span class="ocamlsymbol">,</span>b<span class="ocamlsymbol">)</span></code></pre>
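<p>To substantiate the "better data structure" remark: one standard fix, sketched below under my own invented names, is to represent the proof as a difference list, so that the append in <code>>>=</code> becomes function composition and costs O(1), with the real list materialised only once at the end:</p>

```ocaml
(* Sketch: a proof-accumulating monad where the proof is a difference
   list (a function that prepends its steps onto a continuation list),
   avoiding the O(n) list append on every bind. *)
type proof_steps = string list -> string list

type 'a computation = proof_steps * 'a

let return (a : 'a) : 'a computation = ((fun k -> k), a)

let ( >>= ) ((prf, a) : 'a computation) (f : 'a -> 'b computation) : 'b computation =
  let (prf', b) = f a in
  ((fun k -> prf (prf' k)), b)   (* composition instead of append *)

let tell (step : string) : unit computation = ((fun k -> step :: k), ())

(* Materialise the accumulated proof once, at the end. *)
let run ((prf, a) : 'a computation) = (prf [], a)

let () =
  let c = tell "s1" >>= fun () -> tell "s2" >>= fun () -> return 7 in
  assert (run c = (["s1"; "s2"], 7))
```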
<p>The prover views authenticatable values as ones that can be
serialised to JSON. We represent the evidence that
such serialisation is possible as a function from values to
JSON values.</p><pre><code> <span class="ocamlkeyword">module</span> Authenticatable <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">struct</span>
<span class="ocamlkeyword">type</span> 'a evidence <span class="ocamlsymbol">=</span> 'a <span class="ocamlsymbol">-></span> Ezjsonm<span class="ocamlsymbol">.</span>value
<span class="ocamlkeyword">let</span> auth <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span>h<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span>
`String h
<span class="ocamlkeyword">let</span> pair a_serialiser b_serialiser <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span>b<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span>
`A <span class="ocamlsymbol">[</span>a_serialiser a<span class="ocamlsymbol">;</span> b_serialiser b<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">let</span> sum a_serialiser b_serialiser <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `left a <span class="ocamlsymbol">-></span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> a_serialiser a<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">|</span> `right b <span class="ocamlsymbol">-></span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"right"</span><span class="ocamlsymbol">;</span> b_serialiser b<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">let</span> string s <span class="ocamlsymbol">=</span> `String s
<span class="ocamlkeyword">let</span> int i <span class="ocamlsymbol">=</span> `String <span class="ocamlsymbol">(</span>string_of_int i<span class="ocamlsymbol">)</span>
<span class="ocamlkeyword">end</span></code></pre>
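<p>To make the serialisation format concrete, here is a standalone sketch of this module that can be run on its own. It uses bare polymorphic variants in place of <code>Ezjsonm.value</code> (an assumption made purely so the sketch has no dependencies):</p>

```ocaml
(* Standalone sketch of the prover's Authenticatable module, using bare
   polymorphic variants in place of Ezjsonm.value (an assumption: the
   real module is tied to Ezjsonm). *)
module Authenticatable_sketch = struct
  let auth (_a, h) = `String h
  let pair a_serialiser b_serialiser (a, b) =
    `A [a_serialiser a; b_serialiser b]
  let sum a_serialiser b_serialiser = function
    | `left a -> `A [`String "left"; a_serialiser a]
    | `right b -> `A [`String "right"; b_serialiser b]
  let string s = `String s
  let int i = `String (string_of_int i)
end

let () =
  let open Authenticatable_sketch in
  (* a (string, int) pair serialises to a two-element JSON array *)
  assert (pair string int ("x", 3) = `A [`String "x"; `String "3"]);
  (* sums are tagged with "left"/"right" *)
  assert (sum string int (`left "a") = `A [`String "left"; `String "a"])
```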
<p>In the <code>auth</code> case, we only serialise the hash code, not the
underlying value. This ensures that the prover does not end up sending
whole data structures back to the client. In the <code>int</code> case, I'm
serialising OCaml’s <code>int</code> values as strings, to avoid complications
arising from JSON’s use of floating point representations for numbers.</p><p>Now the <code>auth</code> and <code>unauth</code> functions. Creation of authenticated
values means pairing the value with its hashed serialised
representation:</p><pre><code> <span class="ocamlkeyword">let</span> auth serialiser a <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span> hash_json <span class="ocamlsymbol">(</span>serialiser a<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span></code></pre>
<p>Extracting the underlying value from an authenticated value has the
“side effect” of producing a step in the proof, which is the JSON
representation of the value we expect to see:</p><pre><code> <span class="ocamlkeyword">let</span> unauth serialiser <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span> h<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">(</span><span class="ocamlsymbol">[</span>serialiser a<span class="ocamlsymbol">]</span><span class="ocamlsymbol">,</span> a<span class="ocamlsymbol">)</span></code></pre>
<p>Finally, we complete the definition of <code>Prover</code> with an <code>end</code>:</p><pre><code><span class="ocamlkeyword">end</span></code></pre>
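<p>Before trying the full Merkle tree, the prover core can be exercised in isolation. The following is a sketch, not the article’s code: it substitutes the stdlib’s <code>Digest</code> (MD5) and a naive JSON printer for <code>hash_json</code>, whose real definition appears earlier in the post:</p>

```ocaml
(* Miniature of the prover core. Assumption: we stand in stdlib MD5
   (Digest) and a naive JSON printer for the article's hash_json. *)
let rec json_to_string = function
  | `String s -> "\"" ^ s ^ "\""
  | `A js -> "[" ^ String.concat "," (List.map json_to_string js) ^ "]"

let hash_json j = Digest.string (json_to_string j)

(* auth pairs a value with the hash of its serialisation; unauth emits
   the serialisation as a one-step proof *)
let auth serialiser a = (a, hash_json (serialiser a))
let unauth serialiser (a, _h) = ([serialiser a], a)

let () =
  let string s = `String s in
  let (v, h) = auth string "a" in
  assert (v = "a" && h = hash_json (`String "a"));
  let (proof, v') = unauth string (v, h) in
  assert (proof = [`String "a"] && v' = "a")
```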
<h4>Trying out the Prover</h4>
<p>We get a prover-side Merkle tree implementation by instantiating the
<code>Merkle</code> module with the <code>Prover</code> implementation of <code>AUTHENTIKIT</code>:</p><pre><code><span class="ocamlkeyword">module</span> Merkle_Prover <span class="ocamlsymbol">=</span> Merkle <span class="ocamlsymbol">(</span>Prover<span class="ocamlsymbol">)</span></code></pre>
<p>Let’s make a little prover-side tree, in the OCaml REPL:</p><pre><code><span class="ocamlprompt">#</span> <span class="ocamlkeyword">let</span> tree <span class="ocamlsymbol">=</span>
Merkle_Prover<span class="ocamlsymbol">.</span><span class="ocamlsymbol">(</span>make_branch
<span class="ocamlsymbol">(</span>make_branch <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"a"</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"b"</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">(</span>make_branch <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"c"</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"d"</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlprompt">;;</span>
<span class="ocamlkeyword">val</span> tree <span class="ocamlsymbol">:</span> Merkle_Prover<span class="ocamlsymbol">.</span>tree <span class="ocamlsymbol">=</span> <span class="ocamlabstr"><abstr></span></code></pre>
<p>The returned value is abstract, because it is really a pair of the
underlying data and its hash code. Using the prover’s <code>get_hash</code>
function, we can get the hash code of the tree, which is what the
verifier will use to authenticate the prover’s actions:</p><pre><code><span class="ocamlprompt">#</span> <span class="ocamlkeyword">let</span> code <span class="ocamlsymbol">=</span> Prover<span class="ocamlsymbol">.</span>get_hash tree<span class="ocamlprompt">;;</span>
<span class="ocamlkeyword">val</span> code <span class="ocamlsymbol">:</span> string <span class="ocamlsymbol">=</span> <span class="ocamlstringconst">".z\129w\199J\224\\\254\220\bo\246W\158\243S\029\177\190"</span></code></pre>
<p>We can also ask the prover to do queries on the tree. For example, if
we ask for the value at position “left, left”, it returns the result
“a”, and a proof that the verifier will be able to use to check that
this result actually came from the tree with the hash code above.</p><pre><code><span class="ocamlprompt">#</span> <span class="ocamlkeyword">let</span> proof<span class="ocamlsymbol">,</span> result <span class="ocamlsymbol">=</span> Merkle_Prover<span class="ocamlsymbol">.</span>retrieve <span class="ocamlsymbol">[</span>`L<span class="ocamlsymbol">;</span>`L<span class="ocamlsymbol">]</span> tree<span class="ocamlprompt">;;</span>
<span class="ocamlkeyword">val</span> proof <span class="ocamlsymbol">:</span> proof <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">[</span>`A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"?\250m&,\251\129\031\r\252QJ\001\141|d}\242\016l"</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"i?B\230p\158D\201\248\145\000\1400p\224\018\023\219\1935"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"X\140\005\028\146\1891L\224\246\224\229\201\018o\b\187\163\240\160"</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"\223\231\194\230\1362=\157\187\226;?\143>\127\248\014;\201\254"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"a"</span><span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">val</span> result <span class="ocamlsymbol">:</span> string option <span class="ocamlsymbol">=</span> Some <span class="ocamlstringconst">"a"</span></code></pre>
<p>This proof looks quite long; in this case it is several times the
size of the original database, so it might seem that Merkle trees do
not save us anything. In general, however, the proof size is
logarithmic in the size of the tree and so will be much smaller than
sending the entire tree.</p><h3>The Verifier</h3>
<p>We’ve got a prover generating query responses and proofs, but this is
a bit useless if we don’t have a verifier to check them.</p><p>Our verifier is another implementation of the <code>AUTHENTIKIT</code> signature,
with the additional information that authenticated computations are
now functions that consume proofs and return either a value (and
possibly some left-over proof) or return failure. Also, we specify
that, on the verifier side, an authenticated value is represented as
just its hash code (represented as an OCaml string).</p><pre><code><span class="ocamlkeyword">module</span> Verifier <span class="ocamlsymbol">:</span> <span class="ocamlkeyword">sig</span>
<span class="ocamlkeyword">include</span> AUTHENTIKIT
<span class="ocamlkeyword">with</span> <span class="ocamlkeyword">type</span> 'a authenticated_computation <span class="ocamlsymbol">=</span>
proof <span class="ocamlsymbol">-></span> <span class="ocamlsymbol">[</span> `Ok <span class="ocamlkeyword">of</span> proof <span class="ocamlsymbol">*</span> 'a <span class="ocamlsymbol">|</span> `ProofFailure <span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">and</span> <span class="ocamlkeyword">type</span> 'a auth <span class="ocamlsymbol">=</span> string
<span class="ocamlkeyword">end</span> <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">struct</span></code></pre>
<p>The basic idea behind the verifier is that, given a hash code and a
proof, it checks the hash code against the proof, and then uses the
hash code to rebuild the parts of the data structure that will be
explored by the program.</p><p>We start by fulfilling our statement that the verifier’s view of
authenticated values is as OCaml strings:</p><pre><code> <span class="ocamlkeyword">type</span> 'a auth <span class="ocamlsymbol">=</span> string</code></pre>
<p>And that the verifier’s version of authenticated computations is as a
“parser” of proofs. Note how the definitions here are very similar to
the definitions used for parser combinators. In some sense,
verification is a process of “parsing” the proof sequence supplied by
the prover.</p><pre><code> <span class="ocamlkeyword">type</span> 'a authenticated_computation <span class="ocamlsymbol">=</span>
json list <span class="ocamlsymbol">-></span> <span class="ocamlsymbol">[</span>`Ok <span class="ocamlkeyword">of</span> proof <span class="ocamlsymbol">*</span> 'a <span class="ocamlsymbol">|</span> `ProofFailure<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">let</span> return a <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">fun</span> proof <span class="ocamlsymbol">-></span> `Ok <span class="ocamlsymbol">(</span>proof<span class="ocamlsymbol">,</span> a<span class="ocamlsymbol">)</span>
<span class="ocamlkeyword">let</span> <span class="ocamlsymbol">(</span><span class="ocamlsymbol">>>=</span><span class="ocamlsymbol">)</span> c f <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">fun</span> prfs <span class="ocamlsymbol">-></span>
<span class="ocamlkeyword">match</span> c prfs <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> `ProofFailure <span class="ocamlsymbol">-></span> `ProofFailure
<span class="ocamlsymbol">|</span> `Ok <span class="ocamlsymbol">(</span>prfs'<span class="ocamlsymbol">,</span>a<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">-></span> f a prfs'</code></pre>
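<p>To see the parser analogy standalone, here is a sketch of the same monad with proof items modelled as plain strings (an assumption, for brevity). A primitive that consumes one proof item plays the role of a parser’s “next token”, and <code>>>=</code> threads the remaining proof along:</p>

```ocaml
(* The verifier monad on its own: a computation is a function consuming
   a list of proof items, exactly like a parser consuming tokens.
   Assumption: proof items are plain strings here, for brevity. *)
type 'a vc = string list -> [ `Ok of string list * 'a | `ProofFailure ]

let return a : 'a vc = fun proof -> `Ok (proof, a)

let ( >>= ) (c : 'a vc) (f : 'a -> 'b vc) : 'b vc =
  fun prfs ->
    match c prfs with
    | `ProofFailure -> `ProofFailure
    | `Ok (prfs', a) -> f a prfs'

(* consume one proof item, as a parser consumes one token *)
let next : string vc = function
  | [] -> `ProofFailure
  | p :: ps -> `Ok (ps, p)

let () =
  let two = next >>= fun a -> next >>= fun b -> return (a ^ b) in
  (* consumes two items, leaves the rest of the proof *)
  assert (two ["x"; "y"; "z"] = `Ok (["z"], "xy"));
  (* running out of proof items is a failure, as in unauth below *)
  assert (two ["only"] = `ProofFailure)
```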
<p>The verifier’s view of authenticatable values is slightly more involved
than the prover’s, because it needs to be able to serialise and
deserialise values to and from JSON. There isn’t anything particularly
special going on here, but we do have to be careful to ensure that
this implementation uses the same format as the prover’s. (An obvious
improvement is to make the <code>Prover</code> and <code>Verifier</code> implementations
share an implementation of this sub-module.)</p><pre><code> <span class="ocamlkeyword">module</span> Authenticatable <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">struct</span>
<span class="ocamlkeyword">type</span> 'a evidence <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">{</span> serialise <span class="ocamlsymbol">:</span> 'a <span class="ocamlsymbol">-></span> Ezjsonm<span class="ocamlsymbol">.</span>value
<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">:</span> Ezjsonm<span class="ocamlsymbol">.</span>value <span class="ocamlsymbol">-></span> 'a option
<span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">let</span> auth <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> serialise h <span class="ocamlsymbol">=</span> `String h
<span class="ocamlkeyword">and</span> deserialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `String s <span class="ocamlsymbol">-></span> Some s
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None
in
<span class="ocamlsymbol">{</span> serialise<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">let</span> pair a_s b_s <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> serialise <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span>b<span class="ocamlsymbol">)</span> <span class="ocamlsymbol">=</span>
`A <span class="ocamlsymbol">[</span>a_s<span class="ocamlsymbol">.</span>serialise a<span class="ocamlsymbol">;</span> b_s<span class="ocamlsymbol">.</span>serialise b<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">and</span> deserialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `A <span class="ocamlsymbol">[</span>x<span class="ocamlsymbol">;</span>y<span class="ocamlsymbol">]</span> <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span><span class="ocamlkeyword">match</span> a_s<span class="ocamlsymbol">.</span>deserialise x<span class="ocamlsymbol">,</span> b_s<span class="ocamlsymbol">.</span>deserialise y <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> Some a<span class="ocamlsymbol">,</span> Some b <span class="ocamlsymbol">-></span> Some <span class="ocamlsymbol">(</span>a<span class="ocamlsymbol">,</span>b<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span>
None
in
<span class="ocamlsymbol">{</span> serialise<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">let</span> sum a_s b_s <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> serialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `left a <span class="ocamlsymbol">-></span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> a_s<span class="ocamlsymbol">.</span>serialise a<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">|</span> `right b <span class="ocamlsymbol">-></span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"right"</span><span class="ocamlsymbol">;</span> b_s<span class="ocamlsymbol">.</span>serialise b<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">and</span> deserialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> x<span class="ocamlsymbol">]</span> <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span><span class="ocamlkeyword">match</span> a_s<span class="ocamlsymbol">.</span>deserialise x <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> Some a <span class="ocamlsymbol">-></span> Some <span class="ocamlsymbol">(</span>`left a<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"right"</span><span class="ocamlsymbol">;</span> y<span class="ocamlsymbol">]</span> <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span><span class="ocamlkeyword">match</span> b_s<span class="ocamlsymbol">.</span>deserialise y <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> Some b <span class="ocamlsymbol">-></span> Some <span class="ocamlsymbol">(</span>`right b<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span>
None
in
<span class="ocamlsymbol">{</span> serialise<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">let</span> string <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> serialise s <span class="ocamlsymbol">=</span> `String s
<span class="ocamlkeyword">and</span> deserialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `String s <span class="ocamlsymbol">-></span> Some s
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None
in
<span class="ocamlsymbol">{</span> serialise<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">let</span> int <span class="ocamlsymbol">=</span>
<span class="ocamlkeyword">let</span> serialise i <span class="ocamlsymbol">=</span> `String <span class="ocamlsymbol">(</span>string_of_int i<span class="ocamlsymbol">)</span>
<span class="ocamlkeyword">and</span> deserialise <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">function</span>
<span class="ocamlsymbol">|</span> `String i <span class="ocamlsymbol">-></span> <span class="ocamlsymbol">(</span><span class="ocamlkeyword">try</span> Some <span class="ocamlsymbol">(</span>int_of_string i<span class="ocamlsymbol">)</span> <span class="ocamlkeyword">with</span> Failure <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None<span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> None
in
<span class="ocamlsymbol">{</span> serialise<span class="ocamlsymbol">;</span> deserialise <span class="ocamlsymbol">}</span>
<span class="ocamlkeyword">end</span></code></pre>
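<p>The property we need from each <code>evidence</code> value is a round-trip law: <code>deserialise (serialise v) = Some v</code>, so that the verifier rebuilds exactly the value the prover serialised. The following standalone sketch (with a minimal stand-in for <code>Ezjsonm.value</code>, an assumption) checks this law for the <code>int</code> and <code>pair</code> combinators:</p>

```ocaml
(* Standalone check of the round-trip law the verifier relies on.
   Assumption: a minimal recursive variant stands in for Ezjsonm.value. *)
type json = [ `String of string | `A of json list ]
type 'a evidence =
  { serialise : 'a -> json
  ; deserialise : json -> 'a option }

let int =
  let serialise i = `String (string_of_int i)
  and deserialise = function
    | `String s -> (try Some (int_of_string s) with Failure _ -> None)
    | _ -> None
  in
  { serialise; deserialise }

let pair a_s b_s =
  let serialise (a, b) = `A [a_s.serialise a; b_s.serialise b]
  and deserialise = function
    | `A [x; y] ->
      (match a_s.deserialise x, b_s.deserialise y with
       | Some a, Some b -> Some (a, b)
       | _ -> None)
    | _ -> None
  in
  { serialise; deserialise }

let () =
  let e = pair int int in
  (* round trip: deserialise after serialise recovers the value *)
  assert (e.deserialise (e.serialise (1, 2)) = Some (1, 2));
  (* malformed input is rejected rather than misread *)
  assert (int.deserialise (`String "nope") = None)
```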
<p>Now we get to the crucial <code>auth</code> and <code>unauth</code> functions. Creation of
authenticated values on the verifier side is nothing more than
serialising them and computing their hash code:</p><pre><code> <span class="ocamlkeyword">open</span> Authenticatable
<span class="ocamlkeyword">let</span> auth auth_evidence a <span class="ocamlsymbol">=</span>
hash_json <span class="ocamlsymbol">(</span>auth_evidence<span class="ocamlsymbol">.</span>serialise a<span class="ocamlsymbol">)</span></code></pre>
<p>Finally, we get to the actual proof checking. The <code>unauth</code> function is
supplied with <em>a)</em> a (de)serialiser for the type of value to be
produced; <em>b)</em> a hash code for the expected value; and <em>c)</em> a proof
which will be used to reconstitute the actual value. If we have run
out of proof elements, then we fail immediately. Otherwise, we check
that the hash code of the first item in the proof is the same as our
required code. If so, we deserialise the proof item and return it with
the remainder of the proof. If the hash codes do not match, then the
verifier reports that proof checking has failed.</p><pre><code> <span class="ocamlkeyword">let</span> unauth auth_evidence h proof <span class="ocamlsymbol">=</span> <span class="ocamlkeyword">match</span> proof <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">[</span><span class="ocamlsymbol">]</span> <span class="ocamlsymbol">-></span> `ProofFailure
<span class="ocamlsymbol">|</span> p<span class="ocamlsymbol">::</span>ps when hash_json p <span class="ocamlsymbol">=</span> h <span class="ocamlsymbol">-></span>
<span class="ocamlsymbol">(</span><span class="ocamlkeyword">match</span> auth_evidence<span class="ocamlsymbol">.</span>deserialise p <span class="ocamlkeyword">with</span>
<span class="ocamlsymbol">|</span> None <span class="ocamlsymbol">-></span> `ProofFailure
<span class="ocamlsymbol">|</span> Some a <span class="ocamlsymbol">-></span> `Ok <span class="ocamlsymbol">(</span>ps<span class="ocamlsymbol">,</span> a<span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">|</span> <span class="ocamlsymbol">_</span> <span class="ocamlsymbol">-></span> `ProofFailure</code></pre>
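<p>The hash comparison in that guard is what makes cheating detectable. Here is the same check in a runnable miniature, with proof items modelled as plain strings and stdlib MD5 standing in for <code>hash_json</code> (both assumptions):</p>

```ocaml
(* Miniature of the verifier's unauth: accept the next proof item only
   if it hashes to the expected code. Assumptions: proof items are plain
   strings and hash_json is stdlib MD5. *)
let hash_json s = Digest.string s

let unauth h proof =
  match proof with
  | [] -> `ProofFailure
  | p :: ps when hash_json p = h -> `Ok (ps, p)
  | _ -> `ProofFailure

let () =
  let h = hash_json "a" in
  (* the honest prover's proof item verifies... *)
  assert (unauth h ["a"] = `Ok ([], "a"));
  (* ...but a tampered item hashes differently, so checking fails *)
  assert (unauth h ["b"] = `ProofFailure);
  (* and running out of proof also fails *)
  assert (unauth h [] = `ProofFailure)
```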
<p>As with the prover, the <code>Verifier</code> module ends with an <code>end</code>:</p><pre><code><span class="ocamlkeyword">end</span></code></pre>
<h4>Trying out the Verifier</h4>
<p>Above, we used the <code>Merkle_Prover</code> module to generate a small Merkle
tree and run a query on it. We can now verify the prover’s execution
of <code>retrieve</code>. Recall that we extracted a hash code representing the
tree from the prover’s representation:</p><pre><code><span class="ocamlkeyword">val</span> code <span class="ocamlsymbol">:</span> string <span class="ocamlsymbol">=</span> <span class="ocamlstringconst">".z\129w\199J\224\\\254\220\bo\246W\158\243S\029\177\190"</span></code></pre>
<p>Let’s assume that this code was conveyed to the client somehow, and
the client trusts that it is an accurate representation of the tree it
wants to query.</p><p>Now the client/verifier asks the server/prover to perform a query. The
server sends back the result and a proof, which we computed above:</p><pre><code><span class="ocamlkeyword">val</span> proof <span class="ocamlsymbol">:</span> proof <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">[</span>`A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"?\250m&,\251\129\031\r\252QJ\001\141|d}\242\016l"</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"i?B\230p\158D\201\248\145\000\1400p\224\018\023\219\1935"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span> `String <span class="ocamlstringconst">"X\140\005\028\146\1891L\224\246\224\229\201\018o\b\187\163\240\160"</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"\223\231\194\230\1362=\157\187\226;?\143>\127\248\014;\201\254"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"a"</span><span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">val</span> result <span class="ocamlsymbol">:</span> string option <span class="ocamlsymbol">=</span> Some <span class="ocamlstringconst">"a"</span></code></pre>
<p>We can now instantiate the verifier-side Merkle tree with
<code>module Merkle_Verifier = Merkle (Verifier)</code>, and use it to verify the prover’s proof against
<code>code</code>:</p><pre><code><span class="ocamlprompt">#</span> Merkle_Verifier<span class="ocamlsymbol">.</span>retrieve <span class="ocamlsymbol">[</span>`L<span class="ocamlsymbol">;</span>`L<span class="ocamlsymbol">]</span> code proof<span class="ocamlprompt">;;</span>
<span class="ocamlsymbol">-</span> <span class="ocamlsymbol">:</span> <span class="ocamlsymbol">[</span> `Ok <span class="ocamlkeyword">of</span> proof <span class="ocamlsymbol">*</span> string option <span class="ocamlsymbol">|</span> `ProofFailure <span class="ocamlsymbol">]</span> <span class="ocamlsymbol">=</span> `Ok <span class="ocamlsymbol">(</span><span class="ocamlsymbol">[</span><span class="ocamlsymbol">]</span><span class="ocamlsymbol">,</span> Some <span class="ocamlstringconst">"a"</span><span class="ocamlsymbol">)</span></code></pre>
<h4>Attempting to trick the verifier</h4>
<p>Let’s try simulating an attempt by the prover to trick the verifier by
running the query against a different tree. First we create a tree
with the same shape, but with different values in it:</p><pre><code><span class="ocamlkeyword">let</span> other_tree <span class="ocamlsymbol">=</span>
Merkle_Prover<span class="ocamlsymbol">.</span><span class="ocamlsymbol">(</span>make_branch
<span class="ocamlsymbol">(</span>make_branch <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"A"</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"B"</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span>
<span class="ocamlsymbol">(</span>make_branch <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"C"</span><span class="ocamlsymbol">)</span> <span class="ocamlsymbol">(</span>make_leaf <span class="ocamlstringconst">"D"</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlsymbol">)</span><span class="ocamlprompt">;;</span></code></pre>
<p>Now we run the “left, left” query on this alternative tree, and we get
back a result and a proof:</p><pre><code><span class="ocamlprompt">#</span> <span class="ocamlkeyword">let</span> proof<span class="ocamlsymbol">,</span> result <span class="ocamlsymbol">=</span> Merkle_Prover<span class="ocamlsymbol">.</span>retrieve <span class="ocamlsymbol">[</span>`L<span class="ocamlsymbol">;</span>`L<span class="ocamlsymbol">]</span> other_tree<span class="ocamlprompt">;;</span>
<span class="ocamlkeyword">val</span> proof <span class="ocamlsymbol">:</span> proof <span class="ocamlsymbol">=</span>
<span class="ocamlsymbol">[</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"Q\217G\000\246\238!\248\212\127\194\184\179>\017zW\0182("</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"0\239\238\002b4\172\145\127\143\002@-=g\179\197\022\154|"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"right"</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"\157\134\234N\234QS\136\165\196\0038\133j\018uY\133\030\005"</span>
<span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"\2272|\223M\230\241\132\167\029-\016\141\221\005nx\239\190\184"</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">;</span> `A <span class="ocamlsymbol">[</span>`String <span class="ocamlstringconst">"left"</span><span class="ocamlsymbol">;</span> `String <span class="ocamlstringconst">"A"</span><span class="ocamlsymbol">]</span>
<span class="ocamlsymbol">]</span>
<span class="ocamlkeyword">val</span> result <span class="ocamlsymbol">:</span> bytes option <span class="ocamlsymbol">=</span> Some <span class="ocamlstringconst">"A"</span></code></pre>
<p>We send the result and proof back to the sceptical client, who
verifies the proof against their hash code:</p><pre><code><span class="ocamlprompt">#</span> Merkle_Verifier<span class="ocamlsymbol">.</span>retrieve <span class="ocamlsymbol">[</span>`L<span class="ocamlsymbol">;</span>`L<span class="ocamlsymbol">]</span> code proof<span class="ocamlprompt">;;</span>
<span class="ocamlsymbol">-</span> <span class="ocamlsymbol">:</span> <span class="ocamlsymbol">[</span> `Ok <span class="ocamlkeyword">of</span> proof <span class="ocamlsymbol">*</span> bytes option <span class="ocamlsymbol">|</span> `ProofFailure <span class="ocamlsymbol">]</span> <span class="ocamlsymbol">=</span> `ProofFailure</code></pre>
<p>It fails! The cheating server has been thwarted.</p><h2>Correctness from Parametricity</h2>
<p>Of course, the examples above don’t prove that the server/prover can
never cheat. For their calculus, Miller et al. provide a proof that
the prover and verifier implementations satisfy the following
property, which I’ve stated informally here:</p><blockquote><p>If the
verifier accepts a proof (i.e., returns <code>Ok</code>), then either the
proof came from the prover, or it is the result of a hash
collision.</p></blockquote>
<p>If we assume that hash collisions are unlikely (Miller et al. make
this assumption more formal), then we get the security property we
want: up to the possibility of hash collision, we can spot cheating
clouds.</p><p>I claim, but I’m not going to prove here, that this property is
provable as an instance of the
<a href="http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps">“Free Theorem”</a>
we get from the fact that the <code>Merkle</code> implementation is parameterised
by <code>AUTHENTIKIT</code> implementations. The full proof involves
<a href="http://www.mpi-sws.org/~dreyer/papers/f-ing/journal.pdf">F-ing Modules</a>
to translate the use of modules and functors into the higher-kinded
calculus System Fω, my
<a href="http://bentnib.org/fomega-parametricity.html">Relational Parametricity for Higher Kinds</a>
for the higher-kinded version of free theorems, and Katsumata’s
<a href="http://www.kurims.kyoto-u.ac.jp/~sinya/paper/csl05-69.pdf">A Semantic Formulation of TT-lifting and Logical Predicates for Computational Metalanguage</a>
to handle the authenticated computations monads.</p>http://bentnib.org/posts/2016-04-12-authenticated-data-structures-as-a-library.htmlTue, 12 Apr 2016 00:00:00 +0000Slides and notes for my OBT “Generalising Abstraction” talkhttp://bentnib.org/posts/2016-01-26-obt.html<p>Here are the slides I used for my keynote talk for the afternoon at
this year’s
<a href="http://conf.researchr.org/track/POPL-2016/OBT-2016-talks">Off the Beaten Track</a>
workshop in St. Petersburg, Florida, USA. Many thanks to
<a href="http://composition.al/">Lindsey Kuper</a> for inviting me to give a talk
and for organising it all.</p><p class="displaylink"><a href="http://bentnib.org/docs/generalising-abstraction.pdf"><img alt="Slides for “Generalising Abstraction” talk" src="http://bentnib.org/thumbnails/slides-generalising-abstraction.png"></a></p><p>The talk wasn’t recorded, but here is a collection of notes to go with
the things I talked about:</p><ul><li><a href="http://people.cs.vt.edu/~kafura/CS6604/Papers/Alternative-Perspectives-to-CT.pdf">The Abstract is ‘an Enemy’</a>
is a paper by Alan F. Blackwell, Luke Church, and Thomas Green,
written in response to Jeannette Wing’s highly influential
<a href="http://www.cs.cmu.edu/afs/cs/usr/wing/www/publications/Wing06.pdf">Computational Thinking</a>. Wing
argues, amongst other things, that abstraction is a key feature of
the set of techniques that computer scientists use when thinking
about problems, and that people working in other disciplines could
profitably make use of some of the abstraction tools developed by
computer scientists. Blackwell et al. disagree with Wing’s
emphasis on abstraction, noting that it can lead to system designs
that de-emphasise and dehumanise users in favour of simple and
tractable models designed for programmers. They also argue that
abstraction can lead to excessive literalism, conflation of
the goals of a system with the goal of “the perfect abstraction”,
and mathematical abstraction as a tool for theoreticians to hide
behind when defending their theories. Admittedly, I put this slide
up to get a cheap laugh from their abstract, but I do think the
paper is worth reading in its own right.</li></ul>
<ul><li>The “Pedestrian or Equestrian” sign is from Stanley Park in
Vancouver. I went over the equestrian side. The combination of the
pine forest visuals and the water dripping from the ceiling in the
room where OBT was held made for quite the Vancouverean
experience.</li></ul>
<ul><li>Jean-Yves Girard’s
<a href="http://iml.univ-mrs.fr/~girard/0.ps.gz">Locus Solum</a> is the paper
that introduces Ludics, a logical formalism whose details persist
in eluding my understanding. An interesting feature of the paper
is the extensive glossary covering a large portion of logic,
semantics, proof theory, and computer science in a unique way
(Appendix A: A pure waste of paper).</li></ul>
<ul><li>Barbara Liskov and Stephen Zilles’
<a href="http://web.cs.iastate.edu/~hridesh/teaching/362/07/01/papers/p50-liskov.pdf">Programming with Abstract Data Types</a>
was the first paper to really nail down the idea of an <em>abstract
data type</em> as a structured thing described through a set of
operations, instead of the more negative definition in terms of
information hiding. This led to the design of the language
<a href="http://www.cs.virginia.edu/~weimer/615/reading/liskov-clu-abstraction.pdf">CLU</a>,
led by Liskov, and to a great amount of work in algebraic
specification. <a href="https://www.youtube.com/watch?v=GDVAHA0oyJU">The Power of Abstraction</a>
is an excellent and informative talk by Liskov covering the history
of how she invented abstract data types.</li></ul>
<ul><li><a href="http://www.cse.chalmers.se/edu/year/2010/course/DAT140_Types/Reynolds_typesabpara.pdf">Types, Abstraction, and Parametric Polymorphism</a>
by John Reynolds is the founding paper of a <em>semantic</em> theory of
abstraction, now called parametricity. Christopher Strachey’s
<a href="http://itu.dk/courses/BPRD/E2009/fundamental-1967.pdf">original definition of parametricity</a>
was informal, phrased in terms of parametrically polymorphic
functions being “uniformly defined” with respect to their type
parameter. There are many ways to formalise this notion. We could
interpret it strictly as meaning that there is a single
implementation, ignoring the type parameter, as is done in
<a href="http://www.sciencedirect.com/science/article/pii/0304397590901517">PER models</a>
that interpret quantification over all types as
intersection. Reynolds chooses to interpret parametricity as
preservation of relations, generalising the invariance under
homomorphism idea from algebra. This is formalised in the
Abstraction Theorem which appears in this paper. As he notes, the
Abstraction Theorem is closely related to the “fundamental”, or
“basic”, lemma of Logical Relations, first used for the
lambda-calculus by Gordon Plotkin for proving
<a href="http://homepages.inf.ed.ac.uk/gdp/publications/logical_relations_1973.pdf">definability results</a>. I
believe the origin of logical relations goes back to Robin Gandy,
who used them (or at least logical relations that are equivalence
relations) to ensure extensionality in models of calculi with
higher-order functions.</li></ul>
<ul><li>John Harrison’s
<a href="http://www.cl.cam.ac.uk/~jrh13/papers/wlog.pdf">Without Loss of Generality</a>
is a genuinely practical paper for reducing the effort involved in
proving geometric theorems in an interactive theorem
prover. Harrison takes inspiration from the way that
mathematicians reduce problems to simpler ones by exploiting
symmetry. The idea is that “without loss of generality” one
can consider a special case that soundly represents all of the
instances of the general case. The paper presents an extension to
the HOL Light theorem prover that pattern matches on goals
containing geometric statements and reduces them to simpler goals
using symmetry.</li></ul>
<ul><li>The quote “<em>A rose by any other name would smell as sweet</em>” is
from <em>Romeo and Juliet</em> by William Shakespeare. According to
Wikipedia, this play about a catastrophic lack of invariance
under renaming was written some time between 1591 and 1594.</li></ul>
<ul><li>Andy Pitts’ work on Nominal Sets is collected in his book
<a href="http://www.cambridge.org/9781107017788">Nominal Sets: Names and Symmetry in Computer Science</a>. The
quote on the slide is taken from the introductory notes
<a href="http://www.cl.cam.ac.uk/~amp12/papers/nomt/nomt.pdf">Nominal Techniques</a>. The
essence of nominal sets is that everything ought to be invariant
under the swapping of names; it is only the relationships between
names that matter.</li></ul>
<ul><li>The reflexive graph model for System F is originally due to
<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.51.9972">Edmund Robinson and Giuseppe Rosolini</a>,
who showed that it is possible to take existing denotational
models of System F (i.e., typed lambda-calculus with “for all”
types) and complete them to relationally parametric ones by
considering reflexive graphs whose sets of nodes and edges are
denotations of types in the original model. I used this approach
to give a relationally parametric model of System Fω (i.e.,
System F plus type-level functions) in
<a href="http://bentnib.org/fomega-parametricity.html">Relational Parametricity for Higher Kinds</a>. A
similar approach had been used by Ryu Hasegawa in
<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.57.1003">Relational limits in general polymorphism</a>. Neil
Ghani, Patricia Johann, and myself used the reflexive graph
approach to give a relationally parametric model of dependent
types in
<a href="http://bentnib.org/dtt-parametricity.html">A Relationally Parametric Model of Dependent Type Theory</a>. The
model I presented in the talk is essentially the same as the one
in that paper, except that the reflexive graphs are presented in
“indexed-style”, rather than “display-style”.</li></ul>
<ul><li>Dimension Types seem quite popular at the moment, with libraries
encoding them popping up for many different
languages. However, most of these systems concentrate on the
<a href="https://www.youtube.com/watch?v=AJQ3TM-p2QI">computer says “no”</a>
aspect. They are only interested in the type system preventing
things, and womble on about Mars rockets going missing and so
forth, like using a different programming language would have
fixed that. Andrew Kennedy’s
<a href="http://research.microsoft.com/en-us/um/people/akenn/units/RelationalParametricityAndUnitsOfMeasure.pdf">Relational Parametricity and Units of Measure</a>
is much more insightful because it shows how unit-correctness
actually yields interesting results, including scale invariance
results, type isomorphisms, and indefinability results. This was
really the point of my talk: abstraction yields properties.</li></ul>
<ul><li>Philip Wadler dubbed some of the consequences of type abstraction
that arise from Reynolds’ work
<a href="http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps">“Theorems for Free!”</a>,
because the theorem is derived from the type alone, instead of by
inspection of the program, which will often be more complex. This has led to
many paper titles that end with “... for free!” when they use
parametricity to prove something.</li></ul>
<ul><li>Distance-indexed types, where two objects are related if they are
within a certain distance, are originally from
<a href="http://www.cis.upenn.edu/~bcpierce/papers/dp.pdf">Distance makes the Types Grow Stronger</a>
by Jason Reed and Benjamin C. Pierce. The intended application
there was to
<a href="http://research.microsoft.com/en-us/projects/databaseprivacy/dwork.pdf">differential privacy</a>,
where programs (interpreted as queries on a database) are
constrained to have limited dependence on small changes to the
input. Roughly speaking, limiting the precise dependence of the
output on the exact input values ensures that the privacy of the
people whose information is in the database is protected. The
Reed-Pierce paper presents a <em>linear</em> type system that treats
information leakage as a resource that needs to be conserved. The
model I presented in the talk doesn’t mention linearity anywhere
at all. However, it can be encoded in the reflexive graph model by
treating the Reed-Pierce resource-limited types as types indexed
by <code>Dist</code>, and building up the other connectives using the same
techniques as the Day-construction models of Bunched Implications.</li></ul>
<ul><li>Geometric-group indexed types first appeared in
<a href="http://bentnib.org/algebraic-indexed.html">Abstraction and Invariance for Algebraically-indexed Types</a>
by myself, Patricia Johann, and Andrew Kennedy. They are a
generalisation of the unit-indexed types from Andrew’s previous
work, but allow for a richer set of transformations that the
program can be invariant under. We looked at two other forms of
“algebraically-indexed types”: one for information flow, and one
for distance indexing. The indefinability results from Andrew’s
work carry over to this more general setting too.</li></ul>
<ul><li>The classical mechanics types, terms, and examples are taken from
my paper
<a href="http://bentnib.org/conservation-laws.html">From Parametricity to Conservation Laws, via Noether’s Theorem</a>. Emmy
Noether’s Theorem is a result in the Calculus of Variations that
shows that it is possible to derive conserved properties of the
stationary solutions of variational problems from continuous
symmetries of their actions, and in some cases to go back
again. This theorem is very important in physics. Many physical
theories, not just in classical mechanics, are described in terms
of stationary solutions of variational problems. Noether’s theorem
gives a way to deduce properties of solutions without having to
explicitly solve the system. In computer science terms, roughly
speaking, Noether’s Theorem shows how to derive time-global
invariants over runs of a system from time-local invariants of the
single-step behaviour of the system (though in Noether’s case,
there are no “single-steps” because everything is continuous). My
paper showed that it was possible to derive the local symmetries
as consequences of free theorems, which can then be plugged into
Noether’s theorem. Note that I <strong>didn’t</strong> show that parametricity
implies Noether’s theorem, though the idea doesn’t necessarily
seem too outlandish, if one could give a suitable account of
continuous change. Noether’s original paper
<a href="http://arxiv.org/pdf/physics/0503066.pdf">Invariant Variation Problems</a>
is available in translation. I found the book
<a href="http://www.springer.com/gb/book/9780387950006">Applications of Lie Groups to Differential Equations</a>
by <a href="http://www.math.umn.edu/~olver/">Peter Olver</a> extremely
helpful in understanding the basic theorem, and its
generalisations, which were also proved by Noether. The book
<a href="http://www.amazon.co.uk/Noethers-Wonderful-Theorem-Dwight-Neuenschwander/dp/0801896940">Emmy Noether’s Wonderful Theorem</a>
gives a readable account of the background and how the theorem is
applied in physics applications.</li></ul>
<ul><li>The non-example of Higher Order Abstract Syntax is from my paper
<a href="http://bentnib.org/syntaxforfree.html">Syntax for Free: Representing Syntax with Binding using Parametricity</a>,
which proved that the (denotation of the) type <code>forall a. ((a → a)
→ a) → (a → a → a) → a</code> is isomorphic to the set of closed
lambda-terms. This result appears to require Kripke-indexed
relations, which are not included in the simple reflexive graph
model I used in the talk. After my talk, Ambrus Kaposi mentioned
some ideas to me of how to incorporate Kripke-indexed relations,
which I need to follow up.</li></ul>
<ul><li>The final section of the talk reported on unpublished work in
progress on <em>internalising</em> relational parametricity into the type
theory. Using the reflexive graph model at a meta-theoretical
level is all very well, but it would be extremely useful to be
able to prove invariance results internally, just as one can prove
equalities between programs internally in dependent type
theories. The approach I presented here is based on the
observation that “relatedness” is sort of like a weakened version
of equality in Martin-Löf type theory (MLTT). (For simplicity I
only considered MLTT with equality reflection.) The resulting
system is something like Ambrus Kaposi and Thorsten Altenkirch’s
<a href="http://www.cs.nott.ac.uk/~txa/publ/ctt.pdf">Cubical Type Theory</a>,
but differs in how it considers equality within the universe
<code>U</code> of small
types. <a href="http://publications.lib.chalmers.se/publication/230735">Jean-Philippe Bernardy, Thierry Coquand, and Guilhem Moulin</a>
have published a syntax and model for internalised
parametricity. Their system has unary, but higher-dimensional,
relations, and does not (as far as I understand) have an explicit
relationship with equality.</li></ul>
<p>After the talk there were a few questions, which were written down on
5x3 index cards and read out by the session chair. This style of
questions is an innovation by Lindsey for OBT to try to encourage
people who wouldn’t normally ask questions to ask, and to share time
more equally between questions. My impression was that possibly more
questions were asked than I would normally have expected. Also, I got
to keep the cards with the questions on:</p><p><img alt="Coloured Index cards with the questions below written on them" src="http://bentnib.org/docs/obt-questions.jpg"></p><p>The questions, and my answers, are listed below. These answers are
with the benefit of hindsight and a long plane journey to ruminate on
“what I should have said”, so they aren’t exactly what I said at the
time.</p><ol><li><p><strong>How is that related to homotopy type theory/univalence?</strong> This
is an especially natural question given the formal similarities
between the higher-dimensional models for both. I used to think
that relationally parametric type theory would be a “baby”
version of univalent type theory, where the composable,
reversible paths between objects are replaced by a weaker notion
of relatedness. However, I now think that relational
parametricity and homotopy type theory are orthogonal. Relational
Parametricity is agnostic with respect to the notion of equality
it uses, and so it is possible to speak about relationally
parametric (extensional type theory | type theory with equality
reflection | intensional type theory | observational type theory
| univalent type theory). However, thinking a bit more, it might
be that Homotopy Type Theory is a sub-system of
relationally parametric type theory, where relatedness just so
happens to be slightly richer.</p></li><li><p><strong>What ever came out of John Harrison’s “WLOG” work? Any useful
tools for automatic theorem proving?</strong> I don’t know about tools
for automatic theorem proving, though I think that it would be
interesting (I vaguely remember that constraint solving systems
often involve the use of symmetry to cut down the search space;
the name
<a href="https://scholar.google.com/citations?user=pJB7cjoAAAAJ&hl=en">Karen Petrie</a>
springs to mind). Harrison’s original work is now part of the HOL
Light theorem prover, in the
<a href="https://github.com/jrh13/hol-light/blob/c44fc7909eb68d68afa33b9bd0dc66c467bf481e/Multivariate/wlog.ml">Multivariate</a>
library.</p></li><li><p><strong>Have you looked at Lorentz invariance?</strong> This is in reference
to how all the examples I showed of mechanics systems were
invariant under Galilean transformations, which don’t account for
the effects of special relativity. Lorentz invariance does
account for special relativity. I haven’t looked at Lorentz
invariance explicitly, but I think that it can be handled by the
system I presented.</p></li><li><p><strong>Where can I get your slides and/or papers you used as
examples?</strong> This page.</p></li><li><p><strong>How about Calculus?</strong> At the time I interpreted this question
as being about the special type constructor <code>C^\infty</code> I used to
internalise smooth functions for the purposes of classical
mechanics models, and I muttered something about Synthetic
Differential Geometry. Looking back, I think it probable that the
questioner was actually asking about the relationship between the
“theory of change” model I talked about here, and differential
calculus as a theory of continuous change. I think that this is a
really interesting question, which might have a lot to say about
incremental computation, bidirectional computation and reversible
computation. The paper
<a href="http://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf">A theory of changes for higher-order languages — incrementalizing λ-calculi by static differentiation</a>
gives an account of a static “differentiation” process for
lambda-calculus terms for the purposes of doing incremental
computation. Gabriel Scherer noted a similarity between that work
and relational parametricity, which I wrote a
<a href="http://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.html">blog post</a>
about.</p></li></ol>http://bentnib.org/posts/2016-01-26-obt.htmlTue, 26 Jan 2016 00:00:00 +0000The Incremental λ-Calculus and Relational Parametricityhttp://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.html<p>Back in February, the paper
<a href="http://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf">A theory of changes for higher-order languages — incrementalizing λ-calculi by static differentiation</a>
by Cai, Giarrusso, Rendel, and Ostermann, was posted to
<a href="http://lambda-the-ultimate.org/node/5115">Lambda-the-Ultimate</a>. The
poster, gasche, made the following parenthetical comment at the end of
the L-t-U post:</p><blockquote><p>(The program transformation seems related to the program-level
parametricity transformation. Parametricity abstract over equality
justifications, differentiation on small differences.)</p></blockquote>
<p>In this post, I’m going to substantiate gasche’s conjecture by showing
how (a very slight simplification of) the theory of changes presented
in the paper is formally <em>the same thing as</em> relational
parametricity. Where Reynolds’ relational parametricity for System F
is a theory of how programs act under change of data representation
(<em>i.e.,</em> change of types), Cai <em>et al.</em>’s theory of changes is a theory
of how programs act under change of values.</p><h2>A Theory of Changes, and Change Structures</h2>
<p>Cai <em>et al.</em> are interested in translating purely functional programs
that map values of type <code>A</code> to values of type <code>B</code> into programs that
map changes in values of type <code>A</code> to changes in values of type
<code>B</code>. The intention is that the translated version of the program that
maps changes to changes will more efficiently handle incremental
updates to the input than doing a full recomputation.</p><p>In any attempt to define such a translation, you are going to need to
define what you mean by “changes in the values of type <code>A</code>”. To this
end, Cai <em>et al.</em> define the following notion of <em>change structure</em>. A
change structure (for a particular type) consists of:</p><ol><li>A set <code>A</code> of the possible values of this type;</li><li>For each value <code>a ∈ A</code>, a set <code>∂A(a)</code> of possible changes that can
be applied to the value <code>a</code>.</li><li>An operation <code>⊕</code> that takes a value <code>a ∈ A</code> and a change <code>δa ∈
∂A(a)</code> and produces a value <code>a ⊕ δa ∈ A</code> that represents the
result of applying the change <code>δa</code> to the value <code>a</code>.</li><li>An operation <code>⊖</code> that takes an <code>a ∈ A</code> and a <code>b ∈ A</code> and produces
a change <code>b ⊖ a ∈ ∂A(a)</code> that represents the change required to
get from <code>a</code> to <code>b</code>.</li></ol>
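<p>As a concrete sketch (in Python, with function names of my own invention, not from the paper), the integers form a change structure in this sense: a change to an integer is itself an integer, <code>⊕</code> is addition, and <code>⊖</code> is subtraction:</p>

```python
# Sketch of a change structure for the integers: ∂A(a) = Z for every
# value a, ⊕ is addition, and ⊖ is subtraction. Names are mine.

def apply_change(a, da):
    """a ⊕ δa: apply the change δa to the value a."""
    return a + da

def diff(b, a):
    """b ⊖ a: the change in ∂A(a) that takes a to b."""
    return b - a

# One of the laws from Section 2 of the paper: a ⊕ (b ⊖ a) = b.
for a in range(-3, 4):
    for b in range(-3, 4):
        assert apply_change(a, diff(b, a)) == b
```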
<p>These operations are subject to several laws that are laid out in
Section 2 of the paper.</p><p>Requiring the existence of the <code>⊖</code> operation for every change
structure seems to be quite strong to me. By requiring <code>⊖</code>, we are
requiring that for <em>any</em> two values <code>a</code> and <code>b</code>, there must be at
least one change that connects them, and for some reason we have
chosen this change out of possibly many choices as “special”. Just by
requiring the existence of a change that translates us from <code>a</code> to
<code>b</code>, we are making a commitment that the “space” of <code>A</code> values is
connected in some way, which may not be justified in some
applications. What if there are states that we can get into that are
not get-out-able? What if we only want to consider changes that
append new data?</p><p>As far as I can see, the <code>⊖</code> operation is only used in the paper to
define the zero change <code>0 ∈ ∂A(a)</code> for every value <code>a ∈ A</code>, and then
is never mentioned again. To me, the existence of a zero change for
every value seems much easier to justify than a general <code>⊖</code>
operation. Therefore, I propose the following mild generalisation of
change structure, which only differs from Cai <em>et al.</em>’s
definition in the final point:</p><ol><li>A set <code>A</code> of possible values;</li><li>For each value <code>a ∈ A</code>, a set <code>∂A(a)</code> of possible changes that can
be applied to the value <code>a</code>.</li><li>An operation <code>⊕</code> that takes a value <code>a ∈ A</code> and a change <code>δa ∈
∂A(a)</code> and produces a value <code>a ⊕ δa ∈ A</code> that represents the
result of applying the change <code>δa</code> to the value <code>a</code>.</li><li>For every <code>a ∈ A</code>, a <em>zero change</em> <code>0 ∈ ∂A(a)</code>.</li></ol>
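<p>To illustrate why dropping <code>⊖</code> might matter, here is a sketch (in Python, names mine) of a structure satisfying this generalised definition for append-only lists: a change is a list of new elements to append, the zero change is the empty list, and there is no plausible <code>⊖</code>, since no change can ever remove data:</p>

```python
# Sketch of a generalised change structure (no ⊖) for append-only
# lists: a change to a list `a` is a suffix of new elements, so
# values that are not extensions of `a` are unreachable from `a`.

class AppendOnly:
    def apply(self, a, da):
        """a ⊕ δa: append the new elements to the list."""
        return a + da

    def zero(self, a):
        """The zero change at a: append nothing."""
        return []

cs = AppendOnly()
log = ["boot", "connect"]
assert cs.apply(log, ["query"]) == ["boot", "connect", "query"]
assert cs.apply(log, cs.zero(log)) == log   # a ⊕ 0 = a
```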
<p>Change structures allow Cai <em>et al.</em> to define the notion of a
<em>derivative</em> of a function <em>f</em>. The idea is that if <code>f</code> is the
original function, then <code>f'</code> is a function that can compute changes in
the output, given the original input value and the change in the input
value.</p><p>Formally, given a pair of change structures, <code>(A, ∂A, ⊕, 0)</code> and <code>(B,
∂B, ⊕, 0)</code>, and a function <code>f : A → B</code>, Cai <em>et al.</em> define a
derivative of <code>f</code> to be a function <code>f' : (a : A) → ∂A(a) → ∂B(f a)</code>
such that: <em>i)</em> <code>f(a ⊕ δa) = (f a) ⊕ (f' a δa)</code>; and <em>ii)</em> <code>f' a 0 =
0</code>. (Notes: point <em>ii</em> is actually a lemma in the paper, which I think
is provable if you have <code>⊖</code>, but which I have to state explicitly here; also, the
paper calls <code>f'</code> <em>the</em> derivative, but in general it isn’t unique for
a given <code>f</code>.)</p><p>Given these definitions, we can define a category of change
structures: the objects are change structures, and the morphisms are
pairs <code>(f,f')</code> of functions <code>f</code> between the sets of values, and
derivatives <code>f'</code>. Looking at change structures as a category, we can
see that the middle four pages of Cai <em>et al.</em>’s paper, from
Section 2.2 to the end of Section 3, essentially boil down to proving
that this category is cartesian closed, and so can model the simply
typed λ-calculus.</p><h2>Relational Parametricity via Reflexive Graphs</h2>
<p>Relational parametricity is usually introduced in terms of <em>logical
relations</em>. A logical relation is a relation between two
interpretations of the same syntactic type <code>A</code> that is defined by
looking at the “logical” structure of the type <code>A</code> (that is: looking
at the “logical” constructors <code>→</code>, <code>+</code>, <code>×</code> and so on used to
construct <code>A</code>). The general idea is that we relate two interpretations
of the same type <code>A</code> to capture the idea of “change of representation”
for that type.</p><p>An alternative approach to relational parametricity, which ends up
being equivalent when we are just talking about types, uses <em>reflexive
graphs</em>. This approach is originally due to Robinson and Rosolini in
their
<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.51.9972">“Reflexive Graphs and Parametric Polymorphism” paper</a> from 1994. The
idea is to abstract logical relations away from types and relations,
and to look at just the bare structure of what we are defining. A
reflexive graph consists of:</p><ol><li>A set <code>O</code>, which we will call abstract <em>objects</em>.</li><li>A set <code>R</code>, which we will call abstract <em>relations</em> or <em>edges</em>.</li><li>A function <code>src : R → O</code> (“source of an edge”) that maps each edge
to its source object.</li><li>A function <code>tgt : R → O</code> (“target of an edge”) that maps each edge
to its target object.</li><li>A function <code>refl : O → R</code> (“reflexive edge”) that maps each object
<code>o</code> to a reflexive edge with source <code>o</code> and target <code>o</code>.</li></ol>
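<p>A small concrete sketch (in Python, names mine): a finite reflexive graph whose edges are the pairs of a binary relation containing the diagonal, with <code>src</code> and <code>tgt</code> the two projections and <code>refl</code> picking out the diagonal edges:</p>

```python
# Sketch of a finite reflexive graph: objects O, edges R (pairs of
# objects), src/tgt are the two projections, and refl maps each
# object to its diagonal edge.

O = {0, 1, 2}
R = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 2)}

def src(edge): return edge[0]
def tgt(edge): return edge[1]
def refl(o):   return (o, o)

# refl(o) really is an edge with source o and target o.
for o in O:
    assert refl(o) in R
    assert src(refl(o)) == o and tgt(refl(o)) == o
```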
<p>Conceptually, we can think of the objects in <code>O</code> as the actual data we
compute with, and the edges in <code>R</code> as telling us about abstract
relationships between pairs of data items. A reflexive graph morphism
will tell us how to map data to data and relationships to
relationships, in such a way that the two mappings agree.</p><p>Formally, a <em>morphism</em> between two reflexive graphs <code>(O₁, R₁, src₁, tgt₁,
refl₁)</code> and <code>(O₂, R₂, src₂, tgt₂, refl₂)</code> consists of a pair of
functions <code>f : O₁ → O₂</code>, mapping objects to objects, and <code>r : R₁ →
R₂</code>, mapping relations to relations, such that the three operations
<code>src</code>, <code>tgt</code> and <code>refl</code> are preserved.</p><p>As with change structures, we can make reflexive graphs and reflexive
graph morphisms into a category, which is cartesian closed for general
reasons (specifically, the category of reflexive graphs is a presheaf
category, which are always cartesian closed).</p><p>I think it is useful to consider reflexive graphs because they are a
larger world where normal logical relations live, alongside many other
interesting things. We can define a particular reflexive graph that
captures the idea of “relation between two interpretations of a type”,
but we can also go on to define other reflexive graphs to represent
other things, like <a href="http://bentnib.org/fomega-parametricity.html">higher kinds</a>, or
<a href="http://bentnib.org/conservation-laws.html">geometric transformation groups</a>, and, as
I’ll demonstrate below, Cai <em>et al.</em>’s change structures.</p><p>In more detail, the reflexive graph that captures the idea of
“relation between two interpretations of a type” is defined as having
objects that are (small) sets, edges that are triples <code>(A,B,R)</code>, where
<code>A</code> and <code>B</code> are small sets, and <code>R</code> is a binary relation between <code>A</code> and
<code>B</code>. We define <code>src(A,B,R) = A</code>, <code>tgt(A,B,R) = B</code>. The “reflexive
edge” for each small set is defined to be the equality relation:
<code>refl(A) = (A,A,{(a,a) | a ∈ A})</code>. Making the reflexive edge from
every set to itself be equality captures the <em>Identity Extension</em>
property identified by Reynolds.</p><p>For interpreting System F, and the higher kinded variant System Fω, we
interpret <em>kinds</em> (i.e., <code>*</code>, <code>* → *</code>, <code>(* → *) → *</code> and so on) as
reflexive graphs, <em>types</em> (possibly with free variables) as morphisms
of reflexive graphs, and <em>terms</em> as transformations between morphisms
of reflexive graphs. See my paper
<a href="http://bentnib.org/fomega-parametricity.html">Relational Parametricity for Higher Kinds</a>
for more details.</p><p>When interpreting <em>dependent types</em> in a relationally parametric
model, we get to make a simplification: we unify the interpretations
of types and kinds. Now all types are interpreted as (families of)
reflexive graphs, and terms are interpreted as (families of) reflexive
graph morphisms. The type of small types, <code>U</code>, is interpreted using
the reflexive graph <code>Type</code> I gave the definition of above. See the
paper
<a href="http://bentnib.org/dtt-parametricity.html">A Relationally Parametric Model of Dependent Type Theory</a>,
that I wrote with Neil and Patty, for more details.</p><p>So, the category of reflexive graphs is the natural place for
interpreting relational parametricity for type systems that are more
expressive than plain System F.</p><h2>Change Structures and Reflexive Graphs are equivalent</h2>
<p>I’m now going to show that Cai <em>et al.</em>’s change structures (or
at least my slight variant) and reflexive graphs are the same
thing. More precisely, the categories involved are equivalent, but
here I’ll just stick to showing how to convert a change structure to
a reflexive graph and back.</p><p>Given a change structure <code>(A, ∂A, ⊕, 0)</code>, we define a reflexive graph
as follows:</p><ol><li>The set of objects is the set <code>A</code>.</li><li>The set of relations is the set consisting of pairs <code>(a, δa)</code>,
where <code>a ∈ A</code> and <code>δa ∈ ∂A(a)</code>.</li><li>The source map is defined as <code>src(a, δa) = a</code>.</li><li>The target map is defined as <code>tgt(a, δa) = a ⊕ δa</code>.</li><li>The reflexive map is defined as <code>refl(a) = (a,0)</code>.</li></ol>
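<p>This direction of the translation is short enough to sketch directly (in Python, names mine), instantiated here with the change structure on the integers where <code>⊕</code> is addition and the zero change is the integer <code>0</code>:</p>

```python
# Sketch of the change-structure-to-reflexive-graph translation:
# objects are values, an edge is a pair (a, δa), src projects out the
# value, tgt applies the change, and refl pairs a value with its
# zero change.

def graph_of_changes(apply_change, zero):
    src  = lambda edge: edge[0]                         # src(a, δa) = a
    tgt  = lambda edge: apply_change(edge[0], edge[1])  # tgt(a, δa) = a ⊕ δa
    refl = lambda a: (a, zero(a))                       # refl(a) = (a, 0)
    return src, tgt, refl

# Instantiate with the integers: a change is an integer, ⊕ is +.
src, tgt, refl = graph_of_changes(lambda a, da: a + da, lambda a: 0)

assert src((5, 3)) == 5 and tgt((5, 3)) == 8
assert refl(7) == (7, 0) and tgt(refl(7)) == 7
```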
<p>Conversely, given a reflexive graph <code>(O, R, src, tgt, refl)</code>, we
define a change structure as follows:</p><ol><li>The set of values is <code>O</code>.</li><li>For each value <code>o ∈ O</code>, the set of changes at <code>o</code> is <code>∂O(o) = { r
∈ R | src(r) = o }</code> — <em>i.e.</em>, the set of relations with source
<code>o</code>.</li><li>The <code>⊕</code> operation is defined as <code>o ⊕ r = tgt(r)</code> — <em>i.e.</em>, we take
the target of a relation whose source is <code>o</code>.</li><li>For each <code>o ∈ O</code>, the <code>0</code> change in <code>∂O(o)</code> is <code>refl(o)</code>.</li></ol>
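<p>The reverse direction can be sketched in the same way (in Python, names mine), here applied to a small finite reflexive graph:</p>

```python
# Sketch of the reflexive-graph-to-change-structure translation:
# values are the objects, the changes at o are the edges with source
# o, ⊕ takes such an edge to its target, and the zero change at o
# is refl(o).

def changes_of_graph(R, src, tgt, refl):
    deltas = lambda o: {r for r in R if src(r) == o}  # ∂O(o)
    apply_change = lambda o, r: tgt(r)                # o ⊕ r = tgt(r)
    zero = refl                                       # 0 at o is refl(o)
    return deltas, apply_change, zero

# A small graph: edges are pairs and refl is the diagonal.
R = {(0, 0), (1, 1), (0, 1)}
deltas, apply_change, zero = changes_of_graph(
    R, src=lambda e: e[0], tgt=lambda e: e[1], refl=lambda o: (o, o))

assert deltas(0) == {(0, 0), (0, 1)}
assert apply_change(0, (0, 1)) == 1
assert apply_change(0, zero(0)) == 0                  # o ⊕ 0 = o
```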
<p>If we start with either a change structure or a reflexive graph then
translating to the other and back again will always give us something
that is isomorphic to what we started with. After a bit more work, we
can see that this will give us an equivalence of categories.</p><h2>So what?</h2>
<p>So we’ve got an equivalence of categories: one intended for
interpreting incremental recomputation, and one intended for
interpreting relational parametricity. What does this give us?</p><ol><li><p>Since the category of reflexive graphs can interpret dependent
type theory (<a href="http://bentnib.org/dtt-parametricity.html">see here</a>), the fact that
the categories of change structures and reflexive graphs are
equivalent means that we automatically have a notion of static
differentiation of programs in the sense of Cai <em>et al.</em> for
dependently typed languages.</p></li><li><p>I think this equivalence gives us a way to <em>think</em> about
relational parametricity. Relational parametricity (and “theorems
for free”) is sometimes characterised as holding because programs
“cannot inspect their type parameters”, or “don’t know what type
is being used”. But the equivalence with change structures
suggests to me that we should instead think of parametricity as a theory of how
programs act under changes of their parameters. This viewpoint is
explicit in Kennedy’s
<a href="http://research.microsoft.com/en-us/um/people/akenn/units/RelationalParametricityAndUnitsOfMeasure.pdf">Relational Parametricity and Units of Measure</a>
from 1997, where he treats polymorphism over dimensions as
invariance under scaling. However, the “theory of change”
viewpoint does not seem to be a common way to think about
parametricity or polymorphism in the literature in general.</p></li><li><p>I hope that a unified viewpoint on parametricity <em>and</em> change
structures may lead to interesting advances in both. As one
example: what about having additional structure on the set of
changes? Could we arrange changes into a partial order to say
that one change was “bigger” than another? This might give us a
way of stating that an incremental recomputation gives us the
<em>least</em> change in the output for a given change in the input. Then
what does that mean for parametricity? Another example: if the
derivative function maps changes in the input to changes in the
output, then can we formulate what is required of a function that
maps changes back? (Fun exercise: show that “well-behaved lenses”
are equivalent to change structure morphisms between certain
change structures, where the derivative function partially applied
to a value — <code>f' a : ∂A(a) → ∂B(f a)</code> — always has an inverse.)</p></li></ol>http://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.htmlThu, 23 Apr 2015 00:00:00 +0000Slides for “An Algebraic Approach to Typechecking and Elaboration”http://bentnib.org/posts/2015-04-19-algebraic-approach-typechecking-and-elaboration.html<p>Two months ago, at the
<a href="http://www.dcs.gla.ac.uk/research/spls/">Scottish Programming Languages Seminar</a>,
February 2015 Strathclyde Edition, I gave a talk entitled “<em>An
Algebraic Approach to Typechecking and Elaboration</em>”. Several people
have asked me to put the slides online, so here they are:</p><p><p class="displaylink"><a href="http://bentnib.org/docs/algebraic-typechecking-20150218.pdf"><img alt="Thumbnail of slides for "An Algebraic Approach to Typechecking and Elaboration" talk" src="http://bentnib.org/thumbnails/slides-algebraic-typechecking-20150218.png"></a></p></p><p>The point of the talk was to present an algebraic approach to
specifying type systems: algebraic in the sense of algebraic theories
with operations and equations. I reckon this presentation leads
naturally to a unification of the way that typed functional languages
like ML and Haskell elaborate their source language into explicitly
typed core languages, and the tactic languages in interactive theorem
provers like Coq and Isabelle.</p><p><strong>Update:</strong> (2015-05-20) I gave this talk again at the
<a href="http://staff.computing.dundee.ac.uk/frantisekfarka/tiap/">Workshop on Type Inference and Automated Proving</a>
in Dundee, and it was
<a href="https://www.youtube.com/watch?v=R5NMX8FBlWU">recorded (link to video)</a>.</p><p><strong>Update:</strong> (2015-09-03) I gave this talk a third time at the
<a href="http://users-cs.au.dk/birke/hope-2015/">Higher Order Programming with Effects (HOPE)</a>
workshop colocated with ICFP 2015 in Vancouver, and it was again
<a href="https://www.youtube.com/watch?v=ypU3j6Wpkoo">recorded (link to video)</a>.</p>http://bentnib.org/posts/2015-04-19-algebraic-approach-typechecking-and-elaboration.htmlSun, 19 Apr 2015 00:00:00 +0000Propositions as Filenames, Builds as Proofs: The Essence of Makehttp://bentnib.org/posts/2015-04-17-propositions-as-filenames-essence-of-make.html<p>The <a href="http://en.wikipedia.org/wiki/Make_%28software%29"><code>make</code></a> program
is a widely used tool for building files from existing files,
according to a set of build rules specified by the user. It is usually
used to compile executable programs from source code, but can also be
used for many other jobs where a bunch of things are generated from
other things, like this website, for example.</p><p>Many alternatives to <code>make</code> have been proposed. Motivations for
replacing <code>make</code> range from a desire to replace <code>make</code>'s very
Unix-philosophy inspired domain-specific language for describing build
rules (Unix-philosophy in the sense that it often
<a href="https://pragprog.com/the-pragmatic-programmer/extracts/coincidence">works by coincidence</a>,
but falls over if you do something exotic, like have filenames with
spaces in, or have an environment variable with the “wrong” name), or
<code>make</code>'s slowness at some tasks, or a perception that <code>make</code> doesn't
treat the <code>make</code>-alternative implementor's favourite programming
language with the special treatment it so obviously
deserves.</p><p>Nevertheless, I think that <code>make</code> (or at least the GNU variant I am
most familiar with) has a core essence that can be profitably extracted
and analysed.</p><h2>The Essence of <code>make</code></h2>
<p><strong>The essence of <code>make</code> is this</strong>: <code>make</code> is an implementation of
<em>constructive logic programming</em>, using the following instantiation of
the
“<a href="http://homepages.inf.ed.ac.uk/wadler/papers/propositions-as-types/propositions-as-types.pdf">Propositions-as-X</a>”
paradigm:</p><ul><li><p>Atomic propositions are <em>filenames</em>. The filenames <code>main.c</code>,
<code>main.o</code> and <code>myprogram</code> are all examples of atomic propositions in
<code>make</code>'s logic. For <code>make</code>, the idea of “well-formed formula” from
traditional logic means “doesn't have spaces in”.</p></li></ul><ul><li>Compound propositions are <em>build rules</em>. A build rule that states
that <code>myprogram</code> can be built from <code>main.o</code> and <code>module.o</code> is a
statement that the atomic propositions <code>main.o</code> and <code>module.o</code>
imply <code>myprogram</code>. Pattern rules like <code>%.o: %.c; gcc -o $@ -c $<</code>
are <em>universally quantified</em> compound propositions: this says
that, for all <em>x</em>, the atomic proposition <em>x</em><code>.c</code> implies the atomic
proposition <em>x</em><code>.o</code>. Static pattern rules are essentially a form of
bounded quantification.</li></ul>
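<p>To make the universal quantification concrete, here is a small hypothetical Haskell sketch (emphatically not <code>make</code>'s actual implementation) of how matching a goal filename against a pattern rule's conclusion instantiates the quantified variable “<code>%</code>”:</p>

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- An atomic proposition is a filename; "%" marks the quantified variable.
type Target = String

-- A rule: the premises imply the conclusion, with a command as "proof".
data Rule = Rule { premises :: [Target], conclusion :: Target, command :: String }

-- Match a concrete goal (e.g. "main.o") against a pattern (e.g. "%.o"),
-- returning the instantiation of "%" -- the "stem", written "$*" in make.
matchPattern :: Target -> Target -> Maybe String
matchPattern pat goal =
  case break (== '%') pat of
    (pre, '%' : post)
      | pre `isPrefixOf` goal
      , post `isSuffixOf` drop (length pre) goal
      , length goal >= length pre + length post ->
          Just (take (length goal - length pre - length post)
                     (drop (length pre) goal))
    _ -> Nothing

-- Instantiating a rule at a goal substitutes the stem for "%" throughout.
instantiate :: Rule -> Target -> Maybe Rule
instantiate r goal = do
  stem <- matchPattern (conclusion r) goal
  let subst = concatMap (\c -> if c == '%' then stem else [c])
  Just (r { premises = map subst (premises r), conclusion = goal })
```

<p>Matching the goal <code>main.o</code> against the pattern <code>%.o</code> instantiates the variable to the stem <code>main</code>, so instantiating the rule <code>%.o: %.c</code> at that goal yields the premise <code>main.c</code>.</p>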
<p> Note that the form of compound propositions allowed is extremely
restricted, even by the standards of logic programming: we are
allowed at most one universal quantifier, which quantifies over
space-less strings; the rest of the proposition must be of the form
“<code>f1</code> and <code>f2</code> and ... and <code>fn</code> implies <code>g</code>”, <em>and</em> if there is a
quantifier, the variable must appear in the goal formula <code>g</code>. This
format corresponds to a restricted form of
<a href="http://en.wikipedia.org/wiki/Horn_clause">Horn Clauses</a>, as used in
normal logic programming.</p><p>If we stopped here, then <code>make</code> would not be any more than an
extremely restricted form of Prolog. But what makes <code>make</code> special is
that it implements a <em>constructive</em> logic: it generates proofs, or
evidence, for the propositions it proves.</p><ul><li><p>Proof, or evidence, of an atomic proposition <code>somefile</code> is the
<em>content of an actual file <code>somefile</code> in the filesystem</em>. Some
evidence is provided by the user, in the form of source
files. Evidence for deduced atomic propositions, e.g., <code>.o</code> files,
is generated by the proofs for compound propositions:</p></li><li><p>Proof of a compound proposition “<code>x</code> and <code>y</code> implies <code>z</code>” is a
<em>command</em> to run that will generate the proof of the atomic
proposition <code>z</code> from the proofs of the atomic propositions <code>x</code> and
<code>y</code>. For pattern rules, this proof is parameterised by the
instantiation of the universally quantified variable. For some
reason, in <code>make</code>, the universally quantified variable is written
as “<code>%</code>” in the proposition, and “<code>$*</code>” in the proof.</p></li></ul>
<h2>What <code>make</code> does</h2>
<p>Using this mapping between logic and <code>make</code>, I see <code>make</code> as
conceptually performing three tasks when it is told to use <code>Makefile</code>
to generate the target <code>myprogram</code>.</p><ol><li>It executes the <code>Makefile</code>, expanding out variables. This generates
a collection of build rules.</li><li>It constructs a proof of the atomic proposition <code>myprogram</code> using
backward-chaining proof search from the goal, via the build rules
(aka compound propositions), back to the evidence for atomic
propositions provided by the user in the file system. In
traditional logic, this proof would be represented using a tree,
but obvious efficiency gains can be had by exploiting sharing and
representing it as a directed acyclic graph.</li><li>It <em>executes</em> the proof to generate the evidence of the atomic
proposition <code>myprogram</code>. The evidence for the provability of
<code>myprogram</code> is a file <code>myprogram</code> in the filesystem, generated by
the proofs of the build rules and source files its proof depends
on. This step can often be made more efficient by reusing existing
pieces of evidence if the evidence they were built from hasn't
changed.</li></ol>
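<p>Steps 2 and 3 can be sketched in a few lines of Haskell. This is a toy model, not how GNU <code>make</code> is implemented: pattern rules are omitted, proofs are trees rather than shared DAGs, there is no cycle detection, and the rule and command names are hypothetical.</p>

```haskell
import qualified Data.Map as Map

type Target = String

-- A build rule: the premises imply the conclusion (the map key),
-- and the command is the "proof" of that implication.
data Rule = Rule { premises :: [Target], command :: String }

-- A proof of an atomic proposition: either a source file supplied by
-- the user, or a rule application with subproofs for each premise.
data Proof = Source Target
           | Apply Target String [Proof]
           deriving Eq

-- Step 2: backward-chaining proof search from the goal to the sources.
search :: Map.Map Target Rule -> [Target] -> Target -> Maybe Proof
search rules sources goal =
  case Map.lookup goal rules of
    Just (Rule ps cmd) -> Apply goal cmd <$> traverse (search rules sources) ps
    Nothing
      | goal `elem` sources -> Just (Source goal)
      | otherwise           -> Nothing

-- Step 3: "executing" the proof is a post-order traversal that runs each
-- command after the commands for its premises (here we just list them).
plan :: Proof -> [String]
plan (Source _)       = []
plan (Apply _ cmd ps) = concatMap plan ps ++ [cmd]
```

<p>With rules saying that <code>main.o</code> follows from <code>main.c</code> and <code>myprogram</code> from <code>main.o</code>, searching for <code>myprogram</code> yields a proof tree whose plan runs the compile command before the link command. The reuse of unchanged evidence mentioned above would correspond to pruning subproofs whose files are already up to date.</p>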
<p>The GNU <code>make</code> manual's description combines the last two steps into
one “run-the-build” step, and in practice this is what a realistic
implementation ought to do. (And the first step is, in GNU <code>make</code>'s
reality, more complex because <code>make</code> can rebuild included files and
restart itself, but I'm glossing over that for now.)</p><h2>So what?</h2>
<p>I think that there are real benefits to seeing <code>make</code>-like systems as
implementations of constructive logic programming:</p><ol><li><p>I believe that seeing <code>make</code>-like systems as a form of constructive
logic programming elucidates the differences between some of the
<code>make</code> alternatives that have been proposed. For instance, I think
that the <a href="http://martine.github.io/ninja/">Ninja</a> system
essentially gets its speed-ups by caching some of the results of
the proof search step by storing the expansions of all of the
universally quantified build rules that are needed. The
<a href="http://omake.metaprl.org/index.html">OMake</a> system allows for
targets to dynamically depend on dependencies listed in generated
files, via “scanner” dependencies. I think this corresponds to
proof search in a <em>dependently-typed</em> logic that allows
propositions to depend on the generated evidence of other
propositions.</p></li><li><p>We can start to look at <code>make</code>'s restrictions through the lens of
logic programming, and start to think about more expressive build
tools:</p><ul><li>Why are build rules restricted to at most one universal
quantifier? What would we gain by allowing unrestricted Horn
clauses? What if the universally quantified variables didn't
have to appear in the goal formula, as in most other logic
programming languages?</li><li>Build rules that generate multiple files are Horn clauses with
conclusions that are conjunctions (ands) of atomic
propositions. GNU <code>make</code> and others make multiple targets
difficult, but from a logical point-of-view, there is no
problem.</li><li>What rules does <code>make</code> use to resolve the choice between
multiple proofs of the one atomic proposition? Could we have
build systems that produce sets of proofs for each proposition?
Can this be used to do multi-platform builds? Can we assign
weightings to build rules so that <code>make</code> picks the overall
“best” proof/build strategy?</li><li><code>make</code> implements a “top-down” approach to evaluating its logic
program. Why not implement a “bottom-up” evaluation too?
This would enable us to ask questions like “what can be built
using these rules and these source files?”. This might finally
enable decent TAB-completion for <code>make</code> at the command line, and
IDE introspection capabilities.</li><li>Can logics that incorporate forms of (sound) circular reasoning
be used to do build jobs that require iteration until a
fixpoint? Can we use fixpoint logic to work around the tragedy
of LaTeX's “Label(s) may have changed. Rerun to get
cross-references right.”?</li><li>Do all atomic propositions have to be filenames? Why not URLs?
Why not proof-irrelevant ephemeral propositions?</li><li>Can the connection between logic programming and relational
databases be used? Can we use a <code>make</code>-like tool to query a
database, and generate reports?</li><li>Can we automatically augment the proof graphs that <code>make</code>
implicitly generates to add provenance information?</li></ul></li></ol>
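<p>To illustrate just one of these questions, here is a hypothetical Haskell sketch of the “bottom-up” evaluation: starting from the source files, repeatedly apply any rule whose premises are already known to be buildable, until a fixpoint is reached. This is forward chaining, the evaluation order familiar from Datalog, and it answers “what can be built at all?” without running a single command.</p>

```haskell
import qualified Data.Set as Set

type Target = String

-- A rule: the premises together imply the conclusion.
-- (Commands are irrelevant to the "what can be built?" question.)
data Rule = Rule { premises :: [Target], conclusion :: Target }

-- Forward chaining to a fixpoint: everything buildable from the sources.
buildable :: [Rule] -> [Target] -> Set.Set Target
buildable rules sources = go (Set.fromList sources)
  where
    go known
      | known' == known = known
      | otherwise       = go known'
      where
        known' = known `Set.union` Set.fromList
          [ conclusion r | r <- rules, all (`Set.member` known) (premises r) ]
```

<p>A query like this is exactly what TAB-completion would need: the set of targets that can be asked for, given the rules and the source files currently on disk.</p>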
<p>Interesting stuff, I think.</p>http://bentnib.org/posts/2015-04-17-propositions-as-filenames-essence-of-make.htmlFri, 17 Apr 2015 00:00:00 +0000POPL Slideshttp://bentnib.org/posts/2014-01-29-popl-slides.html<p>I gave two talks at <a href="http://popl.mpi-sws.org/2014/">POPL 2014</a>, back
to back. This was pretty frightening beforehand, but seemed to go
alright.</p><p>Here are the slides:</p><h2>From Parametricity to Conservation Laws, via Noether's Theorem</h2>
<p><p class="displaylink"><a href="http://bentnib.org/docs/conservation-laws-20140124.pdf"><img alt="Thumbnail of slides for "From Parametricity to Conservation Laws, via Noether's Theorem" talk" src="http://bentnib.org/thumbnails/slides-conservation-laws-20140124.png"></a></p></p><h2>A Relationally Parametric Model of Dependent Type Theory</h2>
<p><p class="displaylink"><a href="http://bentnib.org/docs/dtt-parametricity-20140124.pdf"><img alt="Thumbnail of slides for "A Relationally Parametric Model of Dependent Type Theory"" src="http://bentnib.org/thumbnails/slides-dtt-parametricity-20140124.png"></a></p></p>http://bentnib.org/posts/2014-01-29-popl-slides.htmlWed, 29 Jan 2014 00:00:00 +0000One Done, Two Submittedhttp://bentnib.org/posts/2013-07-17-one-done-two-submitted.html<p>One paper finished, two new ones submitted.</p><h2>Productive Coprogramming</h2>
<p><a href="https://personal.cis.strath.ac.uk/conor.mcbride">Conor</a> and I have
just submitted the final version of "Productive Coprogramming with
Guarded Recursion" to the publishers. Looking forward to ICFP in
Boston!</p><p><p class="displaylink"><a href="http://bentnib.org/productive.html"><img alt="Thumbnail of "Productive Coprogramming" paper" src="http://bentnib.org/thumbnails/doc-productive.png"></a></p></p><h2>Pair of Papers Pertaining to Parametricity</h2>
<h3>Dependent Types</h3>
<p>With <a href="https://personal.cis.strath.ac.uk/neil.ghani">Neil</a> and
<a href="https://personal.cis.strath.ac.uk/patricia.johann">Patty</a>, we've
constructed a relationally parametric model of impredicative and
predicative dependent type theories. Highlights: <em>(1)</em> we prove the
existence of initial algebras for <em>all</em> indexed functors; and <em>(2)</em> the
model is constructed using reflexive graphs, a kind of cut-down
version of groupoids.</p><p><p class="displaylink"><a href="http://bentnib.org/dtt-parametricity.html"><img alt="Thumbnail of "A Relationally Parametric Model of Dependent Type Theory" paper" src="http://bentnib.org/thumbnails/doc-dtt-parametricity.png"></a></p></p><h3>Classical Mechanics</h3>
<p>Just by myself, I've also done a paper on using relational
parametricity to prove symmetry properties of Lagrangians in classical
mechanics, so that you can derive conservation laws via Noether's
theorem. The main point is to show that parametricity isn't just for
computer programs; physics looks like it is ripe with opportunities
for applications of theorems for free.</p><p><p class="displaylink"><a href="http://bentnib.org/conservation-laws.html"><img alt="Thumbnail of "From Parametricity to Conservation Laws, via Noether's Theorem" paper" src="http://bentnib.org/thumbnails/doc-conservation-laws.png"></a></p></p>http://bentnib.org/posts/2013-07-17-one-done-two-submitted.htmlWed, 17 Jul 2013 00:00:00 +0000Productive Coprogramming with Guarded Recursionhttp://bentnib.org/posts/2013-03-29-productive-coprogramming.html<p><a href="https://personal.cis.strath.ac.uk/conor.mcbride/">Conor McBride</a> and
I have just submitted a new paper to ICFP. In it, we attempt to use
Nakano-style guarded recursion to write productive coprograms.</p><p>This is an elaboration of Conor's
<a href="http://www.e-pig.org/epilogue/?p=186">blog post</a> and the
<a href="http://bentnib.org/posts/2011-11-14-productive-programmer.html">slides</a> I posted here a
while ago.</p><p>Here's a link to the paper, and the abstract:</p><p><p class="displaylink"><a href="http://bentnib.org/productive.html"><img alt="Productive Coprogramming with Guarded Recursion" src="http://bentnib.org/thumbnails/doc-productive.png"></a></p></p><blockquote><p>Total functional programming offers the beguiling vision that, just
by virtue of the compiler accepting a program, we are guaranteed
that it will always terminate. In the case of programs that are not
intended to terminate, e.g., servers, we are guaranteed that
programs will always be productive. Productivity means that, even if
a program generates an infinite amount of data, each piece will be
generated in finite time. The theoretical underpinning for
productive programming with infinite output is provided by the
category theoretic notion of final coalgebras. Hence, we speak of
coprogramming with non-well-founded codata, as a dual to programming
with well-founded data like finite lists and trees.</p><p>Systems that offer facilities for productive coprogramming, such as
the proof assistants Coq and Agda, currently do so through syntactic
guardedness checkers. Syntactic guardedness checkers ensure that all
self-recursive calls are guarded by a use of a constructor. Such a
check ensures productivity. Unfortunately, these syntactic checks
are not compositional, and severely complicate coprogramming.</p><p>Guarded recursion, originally due to Nakano, is tantalising as a
basis for a flexible and compositional type-based approach to
coprogramming. However, as we show, by itself, guarded recursion is
not suitable for coprogramming due to the fact that there is no way
to make finite observations on pieces of infinite data. In this paper,
we introduce the concept of <em>clock variables</em> that index Nakano’s
guarded recursion. Clock variables allow us to “close over” the
generation of infinite data, and to make finite observations,
something that is not possible with guarded recursion alone.</p></blockquote>http://bentnib.org/posts/2013-03-29-productive-coprogramming.htmlFri, 29 Mar 2013 00:00:00 +0000Theorems for Freehttp://bentnib.org/posts/2012-11-07-theorems-for-free.html<p>I gave a talk last night at the
<a href="http://www.edlambda.co.uk">Ed Lambda</a>, the Edinburgh functional
programming meetup, on “Theorems for Free”. This was (I hope) a fairly
high-level talk about how free theorems are derived, and some
extensions to other kinds of polymorphism that I've worked on
recently. Here are the slides I used:</p><p><p class="displaylink"><a href="http://bentnib.org/docs/theorems-for-free-20121106.pdf"><img alt="Slides for "Theorems for free!" talk" src="http://bentnib.org/thumbnails/slides-theorems-for-free-20121106.png"></a></p></p><p>I didn't include much in the way of references to the literature in
the talk, but the main paper to look at if you are interested is Phil
Wadler's classic
<a href="http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps">Theorems for Free!</a>
(warning: PostScript!).</p><p>Thanks to Rob Stewart for organising things!</p>http://bentnib.org/posts/2012-11-07-theorems-for-free.htmlWed, 07 Nov 2012 00:00:00 +0000Abstraction and Invariance for Algebraically Indexed Typeshttp://bentnib.org/posts/2012-11-07-algebraically-indexed-types.html<p><a href="https://personal.cis.strath.ac.uk/patricia.johann/">Patricia Johann</a>,
<a href="http://research.microsoft.com/en-us/um/people/akenn/">Andrew Kennedy</a>
and I have a new paper that will be presented at POPL in January! This
paper is an extension of Andrew's POPL'97 paper on interpreting
dimension types in terms of scaling invariance. Here's a link to the
paper and the abstract:</p><p><p class="displaylink"><a href="http://bentnib.org/algebraic-indexed.html"><img alt="Thumbnail of paper "Abstraction and Invariance for Algebraically Indexed Types" src="http://bentnib.org/thumbnails/doc-algebraic-indexed.png"></a></p></p><blockquote><p> Reynolds' relational parametricity provides a powerful way to reason
about programs in terms of invariance under changes of data
representation. A dazzling array of applications of Reynolds' theory
exists, exploiting invariance to yield “free theorems”,
non-inhabitation results, and encodings of algebraic datatypes.
Outside computer science, invariance is a common theme running
through many areas of mathematics and physics. For example, the area of
a triangle is unaltered by rotation or flipping. If we scale a
triangle, then we scale its area, maintaining an invariant
relationship between the two. The transformations under which
properties are invariant are often organised into groups, with the
algebraic structure reflecting the composability and invertibility
of transformations.</p><p> In this paper, we investigate programming languages whose types are
indexed by algebraic structures such as groups of geometric
transformations. Other examples include types indexed by
principals--for information flow security--and types indexed by
distances--for analysis of analytic uniform continuity
properties. Following Reynolds, we prove a general Abstraction
Theorem that covers all these instances. Consequences of our
Abstraction Theorem include free theorems expressing invariance
properties of programs, type isomorphisms based on invariance
properties, and non-definability results indicating when certain
algebraically indexed types are uninhabited or only inhabited by
trivial programs. We have fully formalised our framework and most
examples in Coq.</p></blockquote>http://bentnib.org/posts/2012-11-07-algebraically-indexed-types.htmlWed, 07 Nov 2012 00:00:00 +0000Interleaving Data and Effectshttp://bentnib.org/posts/2012-09-06-interleaving-data-and-effects.html<p>Patricia Johann, Neil Ghani, Bart Jacobs and myself have just
submitted a paper on interleaving pure data types with effects. This
is a much more detailed version of the
<a href="http://bentnib.org/posts/2012-01-06-streams.html">blog post</a>
I wrote back in January on reasoning about stream processing with
effects.</p><p>Here is the abstract:</p><blockquote><p> The study of programming with and reasoning about inductive
datatypes such as lists and trees has benefited from the simple
categorical principle of initial algebras. In initial algebra
semantics, each inductive datatype is represented by an initial
<em>f</em>-algebra for an appropriate functor <em>f</em>. The initial algebra
principle then supports the straightforward derivation of
definitional principles and proof principles for these datatypes.
This technique has been expanded to a whole methodology of
structured functional programming, often called origami programming.</p><p> In this article, we show how to extend initial algebra semantics
from pure inductive datatypes to inductive datatypes interleaved
with computational effects. Inductive datatypes interleaved with
effects arise naturally in many computational settings. For example,
incrementally reading characters from a file generates a list of
characters interleaved with input/output actions. Straightforward
application of initial algebra techniques to effectful datatypes
leads to unnecessarily complicated reasoning, because the pure and
effectful concerns must be considered simultaneously. We show how
these concerns can be separated using the
abstraction of initial <em>f</em>-and-<em>m</em>-algebras, where the functor <em>f</em>
describes the pure part of a datatype, and the monad <em>m</em> describes
the interleaved effects. Because initial <em>f</em>-and-<em>m</em>-algebras are
the analogue for the effectful setting of initial <em>f</em>-algebras, they
support the extension of the standard definitional and proof
principles to the effectful setting. Because initial
<em>f</em>-and-<em>m</em>-algebras separate pure and effectful concerns, they
support the direct transfer of definitions and proofs from the pure
setting to the effectful setting.</p><p> Initial <em>f</em>-and-<em>m</em>-algebras are originally due to Filinski and
Støvring, and were subsequently generalised to arbitrary
categories by the authors of this article. In this article, we aim
to introduce the concept of initial <em>f</em>-and-<em>m</em>-algebras to a
general functional programming audience.</p></blockquote>
<p>The actual paper is available <a href="http://bentnib.org/interleaving.html">here</a>.</p>http://bentnib.org/posts/2012-09-06-interleaving-data-and-effects.htmlThu, 06 Sep 2012 00:00:00 +0000Relational Parametricity for Higher Kindshttp://bentnib.org/posts/2012-09-05-relational-parametricity-for-higher-kinds.html<p>I just gave a talk at CSL 2012 on "Relational Parametricity for Higher
Kinds". In the paper I explain how to extend the usual relationally
parametric models of polymorphic types to handle higher kinded types,
like the ones found in Haskell and Scala. As a consequence, you get
encodings of things like equality types, higher-kinded existential
types and higher-kinded initial algebras, with nice reasoning
principles.</p><p>The slides:</p><p><p class="displaylink"><a href="http://bentnib.org/docs/relparamfomega-csl2012-slides.pdf"><img alt="Thumbnail of slides for "Relational Parametricity for Higher Kinds" talk" src="http://bentnib.org/thumbnails/slides-relparamfomega-csl2012-slides.png"></a></p></p><p>The paper, with abstract, is available <a href="http://bentnib.org/fomega-parametricity.html">here</a>.</p>http://bentnib.org/posts/2012-09-05-relational-parametricity-for-higher-kinds.htmlWed, 05 Sep 2012 00:00:00 +0000Reasoning about Stream Processing with Effectshttp://bentnib.org/posts/2012-01-06-streams.html<p>It is a truth, universally acknowledged, that any programming
technique must be in want of a reasoning principle.</p><p>Stream processing in Haskell is very much in the air at the moment,
what with <a href="http://okmij.org/ftp/Streams.html">Iteratees</a> (as embodied
in the <a href="http://hackage.haskell.org/package/enumerator">Enumerator</a>
library),
<a href="http://www.yesodweb.com/blog/2012/01/conduits-conduits">Conduits</a> and
probably some more that I don't know about.</p><p><a href="http://personal.cis.strath.ac.uk/~patricia">Patty Johann</a>,
<a href="http://personal.cis.strath.ac.uk/~ng">Neil Ghani</a>,
<a href="http://www.cs.ru.nl/~bart/">Bart Jacobs</a> and I have recently had the
paper
<a href="http://bentnib.org/induction-with-effects.html">"Fibrational Induction Meets Effects"</a>
accepted to <a href="http://www.itu.dk/research/fossacs-2012/">FoSSaCS 2012</a>
that turns out to be very relevant to discussing and reasoning about
the kinds of stream processing problems that these Haskell libraries
seek to resolve.</p><p>In the paper we build upon the work of
<a href="http://www.diku.dk/hjemmesider/ansatte/stovring/papers/icfp07.pdf">Andrzej Filinski and Kristian Støvring</a>
on induction principles for data structures that interleave pure data
with effects, where the effects are defined in terms of some
monad. Filinski and Støvring gave an induction principle for such
interleaved effectful data types in a small programming language with
effects and a fixed notion of data type and predicate. In our FoSSaCS
paper, we have generalised this to a more general categorical setting,
and also generalised the notion of predicate that can be considered
(so you can have proof-relevant predicates, or Kripke predicates, for
instance).</p><p>Our paper is quite technical, since we need the structure of
fibrations and so on to properly attain the right level of
generality. Nevertheless, the core principle is relatively simple
and revolves around a recursion scheme for pure data interleaved with
monadic effects (which generalises the one given by Filinski and
Støvring). This turns out to have direct application to reasoning
about the processing of streaming data.</p><h2>Effectful Streams</h2>
<p>Effectful streams generate potentially infinite amounts of data by
executing effects as they do so. For example, an effectful stream may
produce data by reading it from a network socket. An effectful stream
is nothing more than an interleaving of some monad "<code>m</code>" with news of
whether the stream has stopped or has yielded an element. This can be
expressed by the following Haskell declarations of a pair of mutually
recursive types:</p><pre><code>newtype Stream m a = Stream { forceStream :: m (StreamStep m a) }
data StreamStep m a
= StreamEmit a (Stream m a)
| StreamStop</code></pre><p>Constructing streams with no (new) effects is accomplished by just
using the <code>return</code> of our chosen monad, and the appropriate
constructor:</p><pre><code>nil :: Monad m => Stream m a
nil = Stream $ return StreamStop
cons :: Monad m => a -> Stream m a -> Stream m a
cons a s = Stream $ return (StreamEmit a s)</code></pre><p>Given a value <code>stream</code> of type <code>Stream m a</code> we can kick it with
<code>forceStream</code> to give us a monadic action that will yield either
<code>StreamEmit a moreStream</code> or <code>StreamStop</code>. The stream gets to execute
some effect in the monad <code>m</code> before telling us whether the stream has
ended or not. It can use this monadic effect to do things like read
from files, consult random number sources or use it as a side channel
to report errors.</p><p>The stream <code>append</code> function shows how a stream can be interrogated by
a recursive function:</p><pre><code>append :: Monad m => Stream m a -> Stream m a -> Stream m a
append s1 s2 = Stream $ do
streamStep <- forceStream s1
case streamStep of
StreamEmit a s1' -> forceStream (cons a (append s1' s2))
StreamStop -> forceStream s2</code></pre><p>Streams look a lot like normal Haskell lists, except that rather than
the ambient Haskell effect of "possibly a non-terminating black hole",
we have the effects described by the monad <code>m</code> (plus the non-optional
"possibly a non-terminating black hole" effect). For instance, we can
define a stream that emits characters read from a <code>Handle</code> until it
hits the end of file:</p><pre><code>ofHandle :: Handle -> Stream IO Char
ofHandle handle = loop
where
loop = Stream $ do
isEOF <- hIsEOF handle
if isEOF then do hClose handle
return StreamStop
else do c <- hGetChar handle
return (StreamEmit c loop)</code></pre><h2>Stream Readers</h2>
<p><code>Stream</code>s constitute the supply-side of data processing. We can use a
similar pattern to treat the consumption side. <code>Reader</code>s are defined
as follows, and look very similar to
<a href="http://okmij.org/ftp/Streams.html">Iteratees</a>:</p><pre><code>newtype Reader a m b = Reader { forceReader :: m (ReaderStep a m b) }
data ReaderStep a m b
= ReaderRead (Maybe a -> Reader a m b)
| ReaderEmit b</code></pre><p>Similar to <code>Stream</code>s, a value of type <code>Reader a m b</code> can be prodded
with <code>forceReader</code>. It will then tell us (after some monadic effect)
whether it wants to read some data (<code>ReaderRead</code>) or if it has
finished reading and wants to output some data (<code>ReaderEmit</code>). The
<code>Maybe</code> type constructor in <code>ReaderRead</code> is intended to indicate
whether or not end-of-stream has been reached.</p><p>The obvious thing to do now is to connect up a <code>Stream</code> and a <code>Reader</code>
to feed the data from one into the other. There are actually (at
least) two different ways of doing this, depending on whether we
execute the effects of the stream before the reader, or the other way
round. Executing the stream before the reader gives the following
definition:</p><pre><code>(|>|) :: Monad m => Stream m a -> Reader a m b -> m b
s |>| r = do
streamStep <- forceStream s
case streamStep of
StreamEmit a s' -> do
readerStep <- forceReader r
case readerStep of
ReaderRead k -> s' |>| k (Just a)
ReaderEmit b -> return b
StreamStop -> do
runOnNothing r</code></pre><p>If the stream stops early by returning <code>StreamStop</code>, then we use the
following function <code>runOnNothing</code> to feed <code>Nothing</code> to a <code>Reader</code>
until it yields a value:</p><pre><code>runOnNothing :: Monad m => Reader a m b -> m b
runOnNothing r = do
readerStep <- forceReader r
case readerStep of
ReaderRead k -> runOnNothing (k Nothing)
ReaderEmit b -> return b</code></pre><h2>Generic Interleaved Data and A Recursion Scheme</h2>
<p>I will now show that the <code>Stream</code> and <code>Reader</code> types may be
generalised to a common pattern. By exploiting this common pattern, we
obtain a powerful recursion principle for data interleaved with
effects. This recursion principle also comes equipped with a reasoning
principle, which allows us to prove things about functions that
recurse over data interleaved with effects.</p><p>The common shape of the <code>Stream</code> and <code>Reader</code> types is captured by the
following pair of Haskell type declarations. All the parts specific to
<code>Stream</code>s or <code>Reader</code>s have been abstracted out into the argument <code>f
:: * -> *</code> (which we will assume is an instance of the <code>Functor</code>
type class).</p><pre><code>data Step m f = Step (f (D m f))
newtype D m f = D { force :: m (Step m f) }</code></pre><p>A value of type <code>D m f</code> is therefore an interleaving of the effects
described by <code>m</code> with the pure data described by <code>f</code>. The function
<code>force</code> plays the part of the <code>forceStream</code> and <code>forceReader</code>
functions defined above.</p><p>Effectful data can be constructed and deconstructed by the
following functions:</p><pre><code>construct :: Monad m => f (D m f) -> D m f
construct x = D $ return (Step x)

deconstruct :: Monad m => D m f -> m (f (D m f))
deconstruct d = do Step x <- force d
                   return x</code></pre><p>When using <code>deconstruct</code> there may be some effects to execute before
we get access to the underlying pure data described by <code>f</code>.</p><p>To recover the <code>Stream</code> type, we simply define the appropriate <code>f</code> to
describe the two things that <code>Stream</code>s are allowed to do: emit values
and cease to be.</p><pre><code>data StreamStep a x
  = StreamEmit a x
  | StreamStop
  deriving Functor

type Stream m a = D m (StreamStep a)</code></pre><p>In the new <code>StreamStep</code> type, the type parameter <code>x</code> indicates the
hole where the next step of the recursion is placed.</p><p>The <code>nil</code> and <code>cons</code> constructors can then be defined in terms of
<code>construct</code> and the appropriate part of the <code>StreamStep</code> type:</p><pre><code>nil :: Monad m => Stream m a
nil = construct StreamStop
cons :: Monad m => a -> Stream m a -> Stream m a
cons a s = construct (StreamEmit a s)</code></pre><p>The benefit of re-expressing the <code>Stream</code> type in this way is that we
can now define a powerful recursion scheme on values of type <code>D m f</code>,
including <code>Stream m a</code>, and use this to define functions like <code>append</code>
and <code>(|>|)</code>. The interesting part is that this recursion scheme comes
equipped with a reasoning principle, which allows us to prove things
about our functions. To properly define the recursion scheme we first
need to know what an Eilenberg-Moore algebra for a monad is.</p><h3>Eilenberg-Moore Algebras</h3>
<p>For any monad <code>m</code>, an
<a href="http://en.wikipedia.org/wiki/Eilenberg-Moore_algebra#Algebras_for_a_monad"><code>m</code>-Eilenberg-Moore algebra</a>
consists of a pair of a type <code>a</code> and a function <code>h :: m a -> a</code>,
satisfying some laws that state that <code>h</code> interacts nicely with the
structure of the monad. I think of the existence of an Eilenberg-Moore
algebra structure on <code>a</code> as roughly stating that the type <code>a</code> is
effectful in the sense that it can have additional effects "prepended"
on to it.</p><p>The existence of an Eilenberg-Moore algebra structure for a type can
be expressed as a Haskell type class, <code>EMAlgebra</code>:</p><pre><code>class (Functor m, Monad m) => EMAlgebra m a where
  algebra :: m a -> a</code></pre><p>Of course we cannot state the laws, so we shall just commit to
promising that they hold. The type class mechanism also ties us to
giving at most one Eilenberg-Moore structure for each pair of a monad
<code>m</code> and a type <code>a</code>, where in fact there may be many, but this is a
small price to pay for the convenience that type classes provide.</p><p>Every type of the form <code>m a</code> for some monad <code>m</code> is obviously
effectful, so we can give it an Eilenberg-Moore algebra structure
using the function <code>join :: m (m a) -> m a</code> defined in the
<code>Control.Monad</code> module:</p><pre><code>instance (Functor m, Monad m) => EMAlgebra m (m a) where
  algebra = join</code></pre><p>This is the free Eilenberg-Moore algebra for the monad <code>m</code> and the
type <code>a</code>.</p><p>The property of having an Eilenberg-Moore algebra is also preserved by
the construction of function types. If <code>b</code> has an Eilenberg-Moore
structure with respect to <code>m</code>, then so does <code>a -> b</code> for any <code>a</code>:</p><div class="annotation"><span class="time">2012-01-07 13:15</span>
Unfortunately, this instance and the previous one overlap, so GHC
needs <code>-XIncoherentInstances</code> to handle it. This seems to work
for the examples here, but probably a better solution is needed in
general.
</div>
<pre><code>instance (EMAlgebra m b) => EMAlgebra m (a -> b) where
  algebra x a = algebra $ do f <- x; return (f a)</code></pre><p>Finally, every one of our interleaved data and effects types carries a
default Eilenberg-Moore structure, achieved by prepending the new
effect before the first effect of the data:</p><pre><code>instance (Functor m, Monad m) => EMAlgebra m (D m f) where
  algebra x = D $ x >>= force</code></pre><p>Note that we now have two algebra structures on <code>D m f</code>: the
<code>m</code>-Eilenberg-Moore structure defined here, and the <code>f</code>-algebra
structure defined by the <code>construct</code> function above.</p><div class="annotation"><span class="time">2012-01-07 13:15</span>
Fixed a typo here, thanks to ehird in the comments below for spotting
it.</div>
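<p>To make the Eilenberg-Moore structure on <code>D m f</code> concrete, here is a hedged, self-contained sketch using the writer-like monad <code>((,) [String])</code>, so that "prepending an effect" is observable as a pure log. The names <code>one</code>, <code>noisy</code> and <code>observe</code> are mine, for illustration only:</p>

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
-- Hedged sketch of the EMAlgebra instance for `D m f`: the prepended
-- effect runs before the data's own first effect.
data Step m f = Step (f (D m f))
newtype D m f = D { force :: m (Step m f) }

construct :: Monad m => f (D m f) -> D m f
construct x = D (return (Step x))

class Monad m => EMAlgebra m a where
  algebra :: m a -> a

-- prepend an effect before the first effect of the data
instance Monad m => EMAlgebra m (D m f) where
  algebra x = D (x >>= force)

data StreamStep a x = StreamEmit a x | StreamStop

type Log    = [String]
type Stream = D ((,) Log) (StreamStep Int)

-- a one-element stream that logs when it is forced
one :: Stream
one = D (["forcing"], Step (StreamEmit 1 (construct StreamStop)))

-- `algebra` runs the new effect, then the stream's own first effect
noisy :: Stream
noisy = algebra (["prepended"], one)

observe :: Stream -> (Log, Maybe Int)
observe s = case force s of
  (w, Step (StreamEmit a _)) -> (w, Just a)
  (w, Step StreamStop)       -> (w, Nothing)
```

<p>Forcing <code>noisy</code> produces the log entries in the order <code>"prepended"</code> then <code>"forcing"</code>: the new effect really is prepended, not appended.</p>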
<h3>A Recursion Scheme</h3>
<p>A basic recursion scheme is the <em>catamorphism</em>, which allows us to
eliminate an element of a recursive type <code>Rec f</code> using an algebra <code>f a
-> a</code>, where <code>Rec f</code> is defined as follows:</p><pre><code>data Rec f = In (f (Rec f))</code></pre><p>The catamorphism function, or to use its common name <code>fold</code>, has the
following type:</p><pre><code>fold :: Functor f => (f a -> a) -> Rec f -> a</code></pre><p>A function of type <code>f a -> a</code> is the structure part of an
<code>f</code>-algebra. <code>f</code>-algebras are similar to Eilenberg-Moore algebras,
except that they do not need to satisfy any laws (because we do not
assume that <code>f</code> is a monad).</p><p>The <code>fold</code> function itself satisfies the following law (which can also
be taken to be the definition):</p><pre><code>fold h (In x) = h (fmap (fold h) x)</code></pre><p>and is the <em>unique</em> function of type <code>Rec f -> a</code> to do so. This
uniqueness property can be used to reason about functions built from
<code>fold</code>.</p><p>To define functions that eliminate our interleaved effects and data
types <code>D m f</code>, we could just use the fact that they are equivalent to
the form <code>m (Rec (f :.: m))</code>, where <code>- :.: -</code> indicates composition of
functors, and use <code>fold</code> on the type <code>Rec (f :.: m)</code>. However, in
doing this we are forced to consider the pure and effectful parts of
our data simultaneously.</p><p>A better approach, which is derivable from the basic <code>fold</code> operator,
is to eliminate values of type <code>D m f</code> using
<code>f</code>-and-<code>m</code>-algebras. That is, we assume that the result type <code>a</code> has
an <code>f</code>-algebra structure <code>f a -> a</code> <em>and</em> an Eilenberg-Moore
structure <code>m a -> a</code>. In this way, we can separate the pure and
effectful parts of our recursion. By making use of the <code>EMAlgebra</code>
type class defined above, we do not need to explicitly mention the
effectful part at all.</p><p>Our new <code>effectfulFold</code> combinator has the following type, and I have
given the direct Haskell implementation. It is an interesting exercise
to see how it can be defined in terms of the <code>fold</code> combinator on <code>Rec
(f :.: m)</code>.</p><div class="annotation"><span class="time">2012-01-07 13:15</span>
Thanks to <a href="http://www.reddit.com/r/haskell/comments/o5ioy/reasoning_about_stream_processing_with_effects/c3estki">spacespacecomma</a>
on reddit for pointing out that the original definition here didn't
type check. I was missing the applications of
<code>force</code>.</div>
<pre><code>effectfulFold :: (Functor f, EMAlgebra m a) =>
                 (f a -> a)
              -> D m f
              -> a
effectfulFold h = algebra . fmap loop . force
  where
    loop (Step x) =
      h $ fmap algebra $ fmap (fmap loop) $ fmap force $ x</code></pre><p>As we prove in the
<a href="http://bentnib.org/induction-with-effects.html">paper</a>,
generalising Filinski and Støvring's proof, functions defined using
<code>effectfulFold</code> satisfy the following two properties. First, they
preserve <code>f</code>-algebras, taking the <code>f</code>-algebra <code>construct</code> to the
supplied <code>f</code>-algebra <code>h</code>:</p><pre><code>effectfulFold h (construct x) = h (fmap (effectfulFold h) x)</code></pre><p>Moreover, they preserve <code>m</code>-Eilenberg-Moore algebras, taking the
default Eilenberg-Moore structure on <code>D m f</code> to the implicitly
provided Eilenberg-Moore structure on <code>a</code>:</p><pre><code>effectfulFold h (algebra x) = algebra (fmap (effectfulFold h) x)</code></pre><p>Analogously to <code>fold h</code>, <code>effectfulFold h</code> is the <em>unique</em> function
satisfying these two properties. In the paper, we use uniqueness to
derive an induction principle for data interleaved with monadic
effects, but here we can use uniqueness directly to reason about
functions defined using <code>effectfulFold</code>.</p><h3>Using the Recursion Scheme</h3>
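<p>As a warm-up, here is a hedged, self-contained sketch of <code>effectfulFold</code> at work: summing a stream whose effects live in the writer-like monad <code>((,) [String])</code>. The result type <code>(Log, Int)</code> is <code>m Int</code>, so the free Eilenberg-Moore algebra (<code>join</code>) handles the effectful part; <code>emitLogged</code> and <code>sumStream</code> are my names, not the post's:</p>

```haskell
{-# LANGUAGE DeriveFunctor, MultiParamTypeClasses, FlexibleInstances #-}
-- Hedged sketch: effectfulFold separates the pure part (the f-algebra h)
-- from the effectful part (the implicit Eilenberg-Moore algebra).
import Control.Monad (join)

data Step m f = Step (f (D m f))
newtype D m f = D { force :: m (Step m f) }

construct :: Monad m => f (D m f) -> D m f
construct x = D (return (Step x))

class Monad m => EMAlgebra m a where
  algebra :: m a -> a

-- the free Eilenberg-Moore algebra on m a
instance Monad m => EMAlgebra m (m a) where
  algebra = join

effectfulFold :: (Functor f, Functor m, EMAlgebra m a)
              => (f a -> a) -> D m f -> a
effectfulFold h = algebra . fmap loop . force
  where loop (Step x) = h (fmap (effectfulFold h) x)

data StreamStep a x = StreamEmit a x | StreamStop deriving Functor
type Stream m a = D m (StreamStep a)

type Log = [String]

-- a stream cell that logs as it is forced
emitLogged :: Int -> Stream ((,) Log) Int -> Stream ((,) Log) Int
emitLogged n s = D (["emit " ++ show n], Step (StreamEmit n s))

-- the f-algebra adds each element; StreamStop returns 0
sumStream :: Stream ((,) Log) Int -> (Log, Int)
sumStream = effectfulFold h
  where h (StreamEmit n rest) = fmap (n +) rest
        h StreamStop          = return 0
```

<p>Summing <code>emitLogged 1 (emitLogged 2 (emitLogged 3 (construct StreamStop)))</code> gives <code>6</code> together with the three log entries in emission order: the effects are threaded through by <code>algebra</code> while <code>h</code> only ever sees pure structure.</p>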
<p>The <code>append</code> function on streams that I defined by hand above can be
re-expressed using the <code>effectfulFold</code> combinator:</p><pre><code>append :: (Functor m, Monad m) => Stream m a -> Stream m a -> Stream m a
append s1 s2 = effectfulFold h s1
  where h (StreamEmit a s) = cons a s
        h StreamStop       = s2</code></pre><p>By its definition in terms of <code>effectfulFold</code> we know that <code>append</code>
preserves the Eilenberg-Moore structure on its first argument. This
means that the following equation holds for <code>x :: m (Stream m a)</code>:</p><pre><code>append (algebra x) s2 = algebra (fmap (\s -> append s s2) x)</code></pre><p>Note that <code>append</code> does not preserve the Eilenberg-Moore structure on
its second argument. The Eilenberg-Moore structure for streams
prepends effects on to the stream, but these effects will not be
executed by <code>append</code> until <em>after</em> all the effects in the first stream
have been executed.</p><p>Also by its definition in terms of <code>effectfulFold</code>, we know that
<code>append</code> satisfies the following equations with respect to the "pure"
part of streams:</p><pre><code>append (construct StreamStop) s2 = s2
append (construct (StreamEmit a s1)) s2 = construct (StreamEmit a (append s1 s2))</code></pre><p>We can now use the uniqueness of functions defined by <code>effectfulFold</code>
to see that <code>append</code> is associative. We have two
functions of type <code>Stream m a -> Stream m a</code> that we wish to prove
equivalent:</p><pre><code>\s1 -> append s1 (append s2 s3)</code></pre><p>and</p><pre><code>\s1 -> append (append s1 s2) s3</code></pre><p>where the first one is defined directly in terms of
<code>effectfulFold</code>. If we can show that the second function obeys the
same properties as the first, i.e. that it preserves the
<code>m</code>-Eilenberg-Moore algebra and the <code>f</code>-algebra used in the definition
of <code>append</code>, then by uniqueness we will have shown that they are the
same function.</p><p>It is easy to check that the second function preserves the
Eilenberg-Moore structure on <code>Stream m a</code>, since it is just the
composition of two functions that do so. So the meat of the proof is
in showing that it preserves the <code>f</code>-algebra structure. This splits
into two cases, one for <code>StreamStop</code> and one for <code>StreamEmit</code>. For the
former, we must show that</p><pre><code>append (append (construct StreamStop) s2) s3 = append s2 s3</code></pre><p>but this follows directly from the properties we already know about
<code>append</code>. The second case is a little harder. We must show that</p><pre><code>  append (append (construct (StreamEmit a s1)) s2) s3
=
  construct (StreamEmit a (append (append s1 s2) s3))</code></pre><p>This follows by repeated application of our knowledge about how
<code>append</code> operates on input of the form <code>construct (StreamEmit a s1)</code>.</p><p>Thus we have used the uniqueness property of functions defined using
<code>effectfulFold</code> to show that <code>append</code> is associative.</p><h2>Back to Readers</h2>
<p>Just as the <code>Stream</code> type can be defined in terms of the generic <code>D m
f</code> type, the <code>Reader</code> type can be defined in the same way:</p><pre><code>data ReaderStep a b x
  = ReaderRead (Maybe a -> x)
  | ReaderEmit b
  deriving Functor

type Reader a m b = D m (ReaderStep a b)</code></pre><p>The <code>runOnNothing</code> function can be defined in terms of
<code>effectfulFold</code>, by recursing on the <code>Reader</code> argument:</p><pre><code>runOnNothing :: (Functor m, Monad m) => Reader a m b -> m b
runOnNothing = effectfulFold f
  where f (ReaderRead k) = k Nothing
        f (ReaderEmit b) = return b</code></pre><p>Likewise, the <code>(|>|)</code> function that connects a stream with a reader
can be defined in terms of <code>effectfulFold</code>, by recursion on the stream
argument, and using the Eilenberg-Moore structure on the type <code>Reader
a m b -> m b</code>:</p><pre><code>(|>|) :: (Monad m, Functor m) => Stream m a -> Reader a m b -> m b
(|>|) = effectfulFold f
  where f (StreamEmit a h) r = do
          readerStep <- deconstruct r
          case readerStep of
            ReaderRead k -> h (k (Just a))
            ReaderEmit b -> return b
        f StreamStop r = runOnNothing r</code></pre><p>The <code>(|>|)</code> function executes the effects of the stream before the
effects of the reader, letting the stream dictate how events
proceed. An alternative is to execute the effects of the reader first,
by defining the function in terms of recursion on the reader argument,
again using <code>effectfulFold</code>:</p><pre><code>(|>>|) :: (Monad m, Functor m) => Stream m a -> Reader a m b -> m b
s |>>| r = effectfulFold f r s
  where f (ReaderRead k) s = do
          streamStep <- deconstruct s
          case streamStep of
            StreamEmit a s' -> k (Just a) s'
            StreamStop      -> k Nothing nil
        f (ReaderEmit b) s = return b</code></pre><p>There are obviously many other possibilities for connecting <code>Stream</code>s
to <code>Reader</code>s. If the <code>Stream</code> finishes early, we could return the
uncompleted <code>Reader</code> rather than passing it on to
<code>runOnNothing</code>. Likewise, if the <code>Reader</code> emits a value before the whole
<code>Stream</code> has been read, we could return the leftover stream instead of
just dropping it on the floor. In any case, all these possibilities are
definable in terms of <code>effectfulFold</code>.</p><p><code>Reader</code>s are an instance of the <code>Monad</code> type class as the following
Haskell declaration witnesses:</p><div class="annotation"><span class="time">2012-01-07 13:15</span>
Thanks again to <a href="http://www.reddit.com/r/haskell/comments/o5ioy/reasoning_about_stream_processing_with_effects/c3estki">spacespacecomma</a>
on reddit for pointing out that this doesn't directly work. You need
to wrap the <code>Reader</code> type in a <code>newtype</code> to get it to
work. I'll leave it as-is to give the general idea.</div>
<pre><code>instance Monad m => Monad (Reader a m) where
  return b = construct (ReaderEmit b)
  c >>= k = effectfulFold f c
    where f (ReaderRead k') = construct (ReaderRead k')
          f (ReaderEmit b)  = k b</code></pre><p>Using the uniqueness property of <code>effectfulFold</code> it is possible to
prove that this instance really does satisfy the monad laws. In fact,
it is possible to show that <code>Reader a m</code> is the sum of the monad <code>m</code>
with the free monad on the functor <code>f x = Maybe a -> x</code>. This is an
instance of a fact originally observed by
<a href="http://homepages.inf.ed.ac.uk/gdp/publications/Comb_Effects_Jour.pdf">Hyland, Plotkin and Power</a>
(Theorem 4).</p><h2>Processors</h2>
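<p>Before moving on, the pieces so far can be exercised end to end. The following is a hedged, self-contained sketch assembling the definitions above, with a stream fed into a reader via the <code>effectfulFold</code>-based <code>(|&gt;|)</code>; the writer-like monad <code>((,) [String])</code> makes the effect order observable, and <code>ofList</code> and <code>sumReader</code> are my illustrative names:</p>

```haskell
{-# LANGUAGE DeriveFunctor, MultiParamTypeClasses, FlexibleInstances #-}
-- Hedged sketch: the fold-based stream/reader connection, run in a
-- pure writer-like monad so the result can be inspected directly.
import Control.Monad (join)

data Step m f = Step (f (D m f))
newtype D m f = D { force :: m (Step m f) }

construct :: Monad m => f (D m f) -> D m f
construct x = D (return (Step x))

deconstruct :: Functor m => D m f -> m (f (D m f))
deconstruct d = fmap (\(Step x) -> x) (force d)

class Monad m => EMAlgebra m a where
  algebra :: m a -> a

instance Monad m => EMAlgebra m (m a) where
  algebra = join

instance EMAlgebra m b => EMAlgebra m (a -> b) where
  algebra x a = algebra (fmap ($ a) x)

effectfulFold :: (Functor f, Functor m, EMAlgebra m a)
              => (f a -> a) -> D m f -> a
effectfulFold h = algebra . fmap loop . force
  where loop (Step x) = h (fmap (effectfulFold h) x)

data StreamStep a x = StreamEmit a x | StreamStop deriving Functor
type Stream m a = D m (StreamStep a)

data ReaderStep a b x = ReaderRead (Maybe a -> x) | ReaderEmit b deriving Functor
type Reader a m b = D m (ReaderStep a b)

runOnNothing :: (Functor m, Monad m) => Reader a m b -> m b
runOnNothing = effectfulFold f
  where f (ReaderRead k) = k Nothing
        f (ReaderEmit b) = return b

(|>|) :: (Functor m, Monad m) => Stream m a -> Reader a m b -> m b
(|>|) = effectfulFold f
  where f (StreamEmit a h) r = do readerStep <- deconstruct r
                                  case readerStep of
                                    ReaderRead k -> h (k (Just a))
                                    ReaderEmit b -> return b
        f StreamStop r = runOnNothing r

type Log = [String]

-- a stream of the given list, logging each emission
ofList :: [Int] -> Stream ((,) Log) Int
ofList []       = construct StreamStop
ofList (n : ns) = D (["emit " ++ show n], Step (StreamEmit n (ofList ns)))

-- a reader that sums its inputs until end-of-stream
sumReader :: Int -> Reader Int ((,) Log) Int
sumReader acc = construct (ReaderRead k)
  where k (Just n) = sumReader (acc + n)
        k Nothing  = construct (ReaderEmit acc)
```

<p>With these definitions, <code>ofList [1,2,3] |&gt;| sumReader 0</code> produces the emission log and the sum <code>6</code>, matching the behaviour of the hand-written connector.</p>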
<p>If <code>Stream</code>s produce data and <code>Reader</code>s consume data, then what goes
in between? <code>Processor</code>s! A <code>Processor</code> is an instance of the same
interleaved data and effects pattern as <code>Stream</code>s and <code>Reader</code>s:</p><pre><code>data ProcessorStep a b x
  = ProcessorRead (Maybe a -> x)
  | ProcessorEmit b x
  | ProcessorStop
  deriving Functor

type Processor a m b = D m (ProcessorStep a b)</code></pre><p>A <code>Processor</code>, once kicked using <code>force</code>, can either ask for more
input (<code>ProcessorRead</code>), emit some output (<code>ProcessorEmit</code>) with a
promise to carry on, or signal end of stream
(<code>ProcessorStop</code>). <code>Processor</code>s are intended to fulfil the same role
as <code>Enumeratees</code> or <code>Conduit</code>s as intermediaries that manipulate data
in some way. They are also very similar to the StreamProcessor
representation defined by
<a href="http://personal.cis.strath.ac.uk/~ng/papers/ghani-lmcs09.pdf">Ghani, Hancock and Pattinson</a>.</p><p>With a few helper functions:</p><pre><code>processorRead :: Monad m => (Maybe a -> Processor a m b) -> Processor a m b
processorRead k = construct (ProcessorRead k)
processorStop :: Monad m => Processor a m b
processorStop = construct ProcessorStop
processorEmit :: Monad m => b -> Processor a m b -> Processor a m b
processorEmit b proc = construct (ProcessorEmit b proc)</code></pre><p>we can define a processor that filters:</p><pre><code>filter :: Monad m => (a -> Maybe b) -> Processor a m b
filter h = processorRead getInput
  where
    getInput Nothing = processorStop
    getInput (Just a) =
      case h a of
        Nothing -> filter h
        Just b  -> processorEmit b (filter h)</code></pre><p>It is also possible to define several combinators that compose
<code>Stream</code>s with <code>Processor</code>s, <code>Processor</code>s with <code>Reader</code>s and
<code>Processor</code>s with <code>Processor</code>s in a similar manner to the
<a href="http://www.yesodweb.com/blog/2012/01/conduits-conduits">Conduits library</a>. We
have the same choices as with the composition of <code>Stream</code>s and <code>Reader</code>s
in terms of the ordering of effects. The key point is that they may
all be defined in terms of <code>effectfulFold</code>, and reasoned about using
the universal property. For instance, we can prove that composition of
<code>Processor</code>s is associative, which opens the door to justified
optimisation principles for chains of stream processors.</p><h3>The Co-Inductive View</h3>
<p>In the above I have taken an inductive view on <code>Stream</code>s, <code>Reader</code>s
and so on. In Haskell, each recursive type is simultaneously an
inductive and co-inductive type. We can use this change-of-viewpoint
to give another way of defining interleaved data and effects in terms
of unfolds. The following function <code>unfold</code> takes a seed state of type
<code>s</code> and an evolution function of type <code>s -> m (f s)</code> and generates a
value of type <code>D m f</code>.</p><div class="annotation"><span class="time">2012-01-07 13:15</span> Was missing an additional
<code>Monad m</code> constraint here.</div>
<pre><code>unfold :: (Monad m, Functor f) => s -> (s -> m (f s)) -> D m f
unfold s step = loop s
  where loop s = D $ do
          f <- step s
          return (Step (fmap loop f))</code></pre><p>The <code>ofHandle</code> stream defined above can be re-expressed in terms of
<code>unfold</code> as follows:</p><pre><code>ofHandle :: Handle -> Stream IO Char
ofHandle handle = unfold () step
  where
    step () = do
      isEOF <- hIsEOF handle
      if isEOF then do hClose handle
                       return StreamStop
               else do c <- hGetChar handle
                       return (StreamEmit c ())</code></pre><p>In fact, we could change the representation of our interleaved data
and effect type to be directly expressed in terms of <code>unfold</code>s:</p><pre><code>data D m f where
  D :: s -> (s -> m (f s)) -> D m f</code></pre><p>This representation makes it harder to define the <code>effectfulFold</code>
function and its associated reasoning principle. On the other hand, it
does allow for fusion of chained sequences of <code>Stream</code>s, <code>Processor</code>s
and <code>Reader</code>s in a similar manner to the
<a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.7401">Stream Fusion</a>
(<a href="http://dl.acm.org/citation.cfm?id=1291199">alternative ACM paywall link</a>)
paper. Hopefully, this may allow for sequences of stream processing
functions to be fused together and open up more possibilities for
optimisation. But this remains to be seen.</p>http://bentnib.org/posts/2012-01-06-streams.htmlFri, 06 Jan 2012 00:00:00 +0000A Type Checker that knows its Monad from its Elbowhttp://bentnib.org/posts/2011-12-14-type-checker.html<p>I've been hacking a bit on
<a href="https://github.com/bobatkey/foveran">Foveran</a> lately. The main new
thing that I've added is the integration of the monad laws and functor
laws into the definitional equality. These additions were inspired by
some suggestions of Conor, and a post on
<a href="http://www.e-pig.org/epilogue/?p=504">Epilogue</a> by Pierre from some
time back. I've taken a different implementation approach to the one
sketched in that blog post. The scheme I am using is Filinski's
<a href="http://www.diku.dk/hjemmesider/ansatte/andrzej/papers/NbEftCLC-abstract.html">normalisation by evaluation for the computational lambda-calculus</a>,
with some little extensions.</p><h2>Descriptions of Indexed Functors form a (Relative) Monad</h2>
<p>To describe (most of) its own data-types, Foveran implements the
indexed descriptions (<code>IDesc</code>) data-type from the paper <a href="http://personal.cis.strath.ac.uk/~dagand/papers/levitation.pdf">The Gentle Art
of
Levitation</a>. Objects
of type <code>IDesc I</code> are codes for functors from <code>(I -> Set)</code> to <code>Set</code>,
where <code>I</code> is a <code>Set</code>. The <code>IDesc I</code> type has the following
constructors:</p><pre><code>“IId” : I -> IDesc I
“K” : Set -> IDesc I
_“×”_ : IDesc I -> IDesc I -> IDesc I
“Sg” : (A : Set) -> (A -> IDesc I) -> IDesc I
“Pi” : (A : Set) -> (A -> IDesc I) -> IDesc I</code></pre><p>The codes are all given their actual meanings by the <code>semI</code> operator
I'll talk about below. But without really thinking about what the
codes mean, one can easily see that there is a natural substitution
operation on <code>IDesc</code>s, simply because an object <code>D : IDesc I</code> is
really just a
<a href="http://blog.sigfpe.com/2010/01/monads-are-trees-with-grafting.html">tree with holes</a>
labelled with <code>I</code>s where the <code>“IId”</code>s are. Such a substitution operation
would have type:</p><pre><code>bind : (I J : Set) -> IDesc I -> (I -> IDesc J) -> IDesc J</code></pre><p>This is pretty easy to implement: just recurse down the first <code>IDesc
I</code> argument until you hit an <code>“IId” i</code> and then apply the <code>i</code> to the
provided function. If I just implement <code>bind</code>, though, the type
checker doesn't get to know that it satisfies the (relative) monad
laws. (Side note: <code>IDesc</code> itself has type <code>Set -> Set 1</code>, so it is a
<a href="http://www.cs.ioc.ee/~james/papers/Relative_Monads.pdf">relative monad</a>).</p><p>So I've built-in the <code>bind</code> operation into the Foveran system and
given it the following syntax:</p><pre><code>bind x <- D1 in D2</code></pre><p>where <code>x</code> is bound in <code>D2</code>.</p><p>When the Foveran type checker normalises terms of type <code>IDesc I</code>, it
knows that they have one of two kinds of normal form: either they are
constructed from one of the constructors above, and all the
sub-<code>IDesc I</code>s are in normal form; or they are of the form <code>bind x <-
D1 in D2</code> where <code>D1</code> is stuck and <code>D2</code> is in normal form. By
representing all objects of type <code>IDesc I</code> in this way, the normaliser
is able to normalise terms up to the monad laws, as I'll now
demonstrate.</p><p>If the normaliser is given a variable <code>D : IDesc I</code> with no
definition, or some other stuck term, then its normal form is <code>bind x
<- D in “IId” x</code>. So I automatically get the right-unit law for
(relative) monads to hold definitionally in the Foveran type theory.</p><p>The internal implementation of the <code>bind</code> operator rearranges terms to
always get things into the form described above. So nested <code>bind</code>
applications get linearised:</p><pre><code>  bind x <- (bind y <- D1 in D2) in D3
=
  bind y <- D1 in bind x <- D2 in D3</code></pre><p>This implements the associativity laws of (relative) monads.</p><p>Finally, <code>bind</code> knows what to do with any of the constructors above for
building up <code>IDesc I</code> objects: it just commutes round them until it
gets to an <code>“IId” i</code>, where it performs the actual substitution:</p><pre><code>  bind x <- “IId” i in D
=
  D[i/x]</code></pre><p>where <code>D[i/x]</code> means <code>D</code> with <code>i</code> substituted for <code>x</code>. So the
left-unit law for (relative) monads holds definitionally. This is less
interesting though, since this law always holds even if I define
<code>bind</code> myself rather than building it into the type theory.</p><h2>The Interpretation of Descriptions...</h2>
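<p>Before turning to the interpretation, the substitution structure of the previous section can be modelled concretely. The following is a hedged Haskell sketch (not Foveran itself): descriptions become trees whose <code>“IId”</code> leaves are holes, and <code>bind</code> is grafting. Only the first-order codes are modelled, with the <code>Set</code> arguments of <code>“K”</code> simplified to string labels, so that equality is decidable and the laws can be tested directly:</p>

```haskell
{-# LANGUAGE DeriveFunctor #-}
-- Hedged Haskell model of IDesc's substitution structure: a tree with
-- grafting. “Sg” and “Pi” are omitted since their function arguments
-- have no decidable equality.
data IDesc i
  = IId i                     -- “IId”: a hole, labelled by an index
  | K String                  -- “K” A, with the Set argument as a label
  | Times (IDesc i) (IDesc i) -- _“×”_
  deriving (Eq, Show, Functor)

-- grafting: replace each “IId” hole by a new description
bind :: IDesc i -> (i -> IDesc j) -> IDesc j
bind (IId i)       f = f i
bind (K a)         _ = K a
bind (Times d1 d2) f = Times (bind d1 f) (bind d2 f)
```

<p>On this model the three (relative) monad laws discussed above hold by construction: <code>bind (IId i) f = f i</code> (left unit), <code>bind d IId = d</code> (right unit), and nested <code>bind</code>s reassociate.</p>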
<p>Given an object <code>D</code> of type <code>IDesc I</code> and a type <code>X</code> with a free
variable <code>i : I</code>, the interpretation of <code>D</code> at <code>X</code> is given by the
special type former <code>semI[D, i. X]</code>, which obeys the following rules:</p><pre><code>semI[“IId” e, i. X]   = X[e/i]
semI[“K” A, i. X]     = A
semI[D1 “×” D2, i. X] = semI[D1, i. X] × semI[D2, i. X]
semI[“Sg” A D, i. X]  = (a : A) × semI[D a, i. X]
semI[“Pi” A D, i. X]  = (a : A) -> semI[D a, i. X]</code></pre><p>Given the description of the normal forms of the <code>IDesc I</code> that I
described above, it would seem odd that there is no clause for <code>semI</code>
on terms constructed from <code>bind</code>. So let us add one:</p><pre><code>semI[bind x <- D1 in D2, i. X] = semI[D1, x. semI[D2, i. X]]</code></pre><p>(remember that <code>x</code> may be free in <code>D2</code>). By integrating this law into
the normaliser, I immediately get that the semantics of composed
descriptions of functors is definitionally equal to the composition of
the semantics of the descriptions. In short: the normaliser, and hence
the definitional equality, knows that <code>semI</code> is a (relative) monad
morphism!</p><h2>... is a Functor</h2>
<p>I've been saying that the <code>IDesc</code> type describes functors, but the
<code>semI</code> operator only tells you what happens on objects. I could
easily define a <code>mapI</code> function to implement an action on morphisms:</p><pre><code>mapI : (I : Set) ->
       (D : I -> IDesc I) ->
       (X Y : I -> Set) ->
       ((i : I) -> X i -> Y i) ->
       semI[D, i. X] -> semI[D, i. Y]</code></pre><p>If I just implement this, though, I am in the same position as with
the <code>bind</code> function above: the definitional equality doesn't know that
the functor laws hold. So I've added the following special operator to
the Foveran type theory:</p><pre><code>mapI[D, i. X, i. Y, f, x]</code></pre><p>where <code>D : IDesc I</code> (the <code>I</code> gets inferred), <code>X</code> and <code>Y</code> are types
each with a free <code>I</code> variable <code>i</code>, <code>f : (i : I) -> X[i] -> Y[i]</code>, and
<code>x : semI[D, i. X]</code>. The whole thing has type
<code>semI[D, i. Y]</code>. (There's a lot more that could be inferred here, even
without full-on type reconstruction, but this is what I've implemented
at the moment).</p><p>Now I have to pull several tricks to get the functor laws to hold
definitionally. The main one is ensuring that the only normal forms of
type <code>semI[D, i. Y]</code> <em>when <code>D</code> is stuck</em> are of the form
<code>mapI[D, i. X, i. Y, f, x]</code>.</p><p>So, given a stuck term, like a variable <code>x</code>, of type
<code>semI[D, i. Y]</code>, the normaliser always expands it to a map of the
(<code>I</code>-indexed) identity functor over <code>x</code>:</p><pre><code>x = mapI[D, i. Y, i. Y, λi y. y, x]</code></pre><p>This ensures that the identity preservation law always holds
definitionally, similarly to the case for the right-unit law of the
relative monad <code>IDesc</code> above.</p><p>As one might imagine, the rest of the implementation of <code>mapI</code>
proceeds by recursion on the structure of the <code>IDesc I</code> argument,
until it gets to a <code>bind</code>, whereupon it does something special:</p><div class="annotation"> <span class="time">2011-12-15 14:00</span>: Thanks
to Andrea Vezzosi in the comments for pointing out the mistake in the
old version here. I also found a bigger mistake in the types. I think
the new version is correct.</div>
<pre><code>  mapI[bind x <- D1 in D2, i. Y, i. Z, f
      , mapI[D1, x. semI[D2 x, i. X], x. semI[D2 x, i. Y], g, z]]
=
  mapI[D1, x. semI[D2 x, i. X], x. semI[D2 x, i. Z]
      , λi x. mapI[D2 i, i. Y, i. Z, f, g i x]
      , z]</code></pre><p>Note that the last argument on the top line must have this form, by
the normal forms of stuck <code>IDesc</code>s and the normal forms of values of
type <code>semI</code>.</p><p>This rule ensures that <code>mapI</code> both satisfies the preservation of
composition, and tracks the monad morphism behaviour of <code>semI</code>.</p><h2>Making use: Inductive Types with Strictly Positive Parameters</h2>
<p>There's an example of the use of the extended definitional equality in
the file
<a href="https://github.com/bobatkey/foveran/blob/master/tests/parameters.fv"><code>parameters.fv</code></a>
in the Foveran github repo. I'll give a quick overview below. Thanks
to
<a href="http://www.cis.strath.ac.uk/cis/staff/index.php?uid=72830">Stevan Andjelkovic</a>
for the idea.</p><h3>Codes for Inductive Types with Parameters</h3>
<p>The goal of the file is to give an account of inductive data types
with parameters in such a way that I can define the functor laws
generically (the <code>IDesc</code> codes above cannot, alas, define inductive
data-types). For example, the type of lists of some type <code>A</code> uses the
type <code>A</code> in a particular way that ensures that I can always define a
generic map operation.</p><p>So I define a new type of codes of types indexed by some set <code>I</code>, with
parameters indexed by some other set <code>P</code>:</p><pre><code>IDescP : Set -> Set -> Set 1
IDescP P I = I -> IDesc (P + I)</code></pre><p>To realise these codes as actual inductive types, I must compose them
with something that handles the extra <code>P +</code> by instantiating it with
the parameters:</p><pre><code>IDescP:fork : (P I : Set) -> (P -> Set) -> P + I -> IDesc I
IDescP:fork P I X x =
  case x with { inl p. “K” (X p); inr i. “IId” i }</code></pre><p>The complete codes are built using the following definition, which
uses the <code>bind</code> operator to compose the descriptions:</p><pre><code>IDescP:code : (P I : Set) -> IDescP P I -> (P -> Set) -> I -> IDesc I
IDescP:code P I D X =
  λi. bind x <- D i in IDescP:fork P I X x</code></pre><p>Finally, they can be turned into actual <code>I</code>-indexed types, given the
<code>P</code>-indexed family of parameters:</p><pre><code>muP : (P I : Set) -> IDescP P I -> (P -> Set) -> I -> Set
muP P I D X = µI I (IDescP:code P I D X)</code></pre><h3>Defining Generic Map</h3>
<p>The generic map function is now defined as follows (<code>cata</code> is the
generically defined catamorphism function, derived from the built-in
induction on inductive types):</p><pre><code>map : (P I : Set) ->
(D : IDescP P I) ->
(X Y : P -> Set) ->
((p : P) -> X p -> Y p) ->
(i : I) -> muP P I D X i -> muP P I D Y i
map P I D X Y f =
cata I (IDescP:code P I D X)
« muP P I D Y
, λi x. construct
(mapI[D i, x. semI[IDescP:fork P I X x, i. muP P I D Y i]
, x. semI[IDescP:fork P I Y x, i. muP P I D Y i]
, λx. case x with { inl p. f p; inr i. λx. x }
, x])
»</code></pre><p>This looks complicated, but that is mainly because <code>mapI</code> does not yet
infer as many of its arguments as it could. What is important is
what I haven't written. In the <code>λi x. construct</code>... bit, the variable
<code>x</code> has type <code>semI[IDescP:code P I D X i, i. muP P I D Y i]</code>. By the
definition of <code>IDescP:code</code> this <em>normalises</em> to
<code>semI[D i, x. semI[IDescP:fork P I X x, i. muP P I D Y i]]</code> and I can
easily apply <code>mapI</code>.</p><p>If the normaliser didn't know that <code>semI</code> was a relative monad
morphism, then <code>x</code>'s type would be
<code>semI[bind x <- D i in IDescP:fork P I X x, i. muP P I D Y i]</code> and I
would've had to manually decompose into the two parts, apply the
<code>mapI</code> and then recompose. By making the definitional equality
stronger, I can get away with writing less code.</p><h2>What other Laws can be built-in?</h2>
<p>What other laws can be built-in to the definitional equality? I
believe that Conor wants to build the concept of free monad over a
functor into <a href="http://www.e-pig.org/">Epigram</a>, so that the monad laws
automatically hold for all (for some intensionally defined version of
"all") free monads. In the full-blown
<a href="http://personal.cis.strath.ac.uk/~dagand/papers/levitation.pdf">levitation</a>
scheme, <code>IDesc</code> would itself just be one free monad among many, and
the laws would automatically hold for it. I think the <code>semI</code> and
<code>mapI</code> laws would still have to be added specially, though.</p><p>There are also subsets of the functors described by <code>IDesc</code> codes that
have special properties. If one stays away from codes that introduce
"data" (i.e. codes that describe containers with no shapes) then it is
automatically an
<a href="http://www.soi.city.ac.uk/~ross/papers/Applicative.html">applicative functor</a>. The
<code>lift</code>/<code>All</code> construction used for describing generic induction is the
leading example of this type. Dually, if one stays away from codes
that introduce a choice of positions, one automatically gets the
(nameless?) dual of an applicative functor (e.g. the <code>Somewhere</code>
modality). Maybe it would be interesting to let the relevant laws hold
automatically in these cases as well?</p>http://bentnib.org/posts/2011-12-14-type-checker.htmlWed, 14 Dec 2011 00:00:00 +0000How to be a Productive Programmerhttp://bentnib.org/posts/2011-11-14-productive-programmer.html<p>On Friday, I gave a talk at the <a href="http://www.dcs.gla.ac.uk/research/spls/Nov11/index.html">Scottish Programming Languages
Seminar (SPLS) at
Heriot-Watt</a>. Many
thanks to Greg Michaelson for organising everything and giving me time
to speak.</p><p>I've put the slides I used <a href="http://bentnib.org/docs/productive-spls1111-slides.pdf">on-line as a PDF
file</a>
(with two small fixes, see below).</p><p class="displaylink"><a href="http://bentnib.org/docs/productive-spls1111-slides.pdf"><img alt="Thumbnail of slides for &quot;How to be a Productive Programmer&quot; talk" src="http://bentnib.org/thumbnails/slides-productive-spls1111-slides.png"></a></p><p>The talk presents an extension of a Nakano-style typed λ-calculus with
a delay modality for guarded recursion. In short, this means:</p><ul><li>The type system presented has a type-former <code>▷</code> so that for any type
<code>A</code>, there is another type <code>▷ A</code> representing an <code>A</code> delayed by one
step in the current time stream.</li><li>Unlike Nakano's calculus, where the delay modality is presented
using sub-typing rules, I have followed Conor's idea and presented
the delay modality using applicative functor rules. I think this
makes it a little easier to program with, and certainly easier to
experiment with in an existing language like Haskell.</li><li>The really exciting new thing is the addition of <em>clock variables</em>,
which allows for multiple time streams to be in play at once. This
means it is possible to look at “all” of a potentially infinite
computation and extract useful information from it. In the slides I
presented the <code>take</code> and <code>replaceMin</code> functions as examples of this.</li></ul>
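<p>As a small illustration of the applicative-functor presentation, here is a minimal Haskell sketch. The names below (<code>Delay</code>, <code>fixD</code>, <code>takeS</code>) are my own illustrative assumptions rather than anything from the slides, and Haskell's type system cannot actually enforce guardedness or the clock discipline, so this only shows the shape of the interface:</p><pre><code>-- "Delay a" plays the role of ▷ a: an a available one step later.
-- NB: Haskell cannot enforce that recursive calls are guarded.
newtype Delay a = Delay a

-- The applicative-functor interface to the delay modality.
pureD :: a -> Delay a
pureD = Delay

apD :: Delay (a -> b) -> Delay a -> Delay b
apD (Delay f) (Delay x) = Delay (f x)

-- Guarded fixpoint: the recursive call is only available "later".
fixD :: (Delay a -> a) -> a
fixD f = f (Delay (fixD f))

-- Streams whose tails are guarded by the delay modality.
data Stream a = Cons a (Delay (Stream a))

nats :: Stream Integer
nats = go 0 where go n = Cons n (Delay (go (n + 1)))

-- With clock variables one can observe a whole stream; in plain
-- Haskell this is just ordinary, unchecked recursion:
takeS :: Int -> Stream a -> [a]
takeS 0 _                   = []
takeS n (Cons x (Delay xs)) = x : takeS (n - 1) xs</code></pre><p>Here <code>takeS 3 nats</code> evaluates to <code>[0,1,2]</code>; the point of the clock-variable calculus is to permit such observations while still guaranteeing productivity.</p>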
<p>As I mentioned during the talk, this is based on Conor's blog post
<a href="http://www.e-pig.org/epilogue/?p=186">Time flies like an Applicative
Functor</a> from over two years
ago.</p><p>The hoped-for outcome is a nice way of programming with codata in a
functional language like Haskell. Still more hopefully, this will
extend to systems with dependent types too, though I am still
confused about where the clock variables ought to live.</p><p>I fixed two typos that were present in the copy of the slides that I used
in the talk:</p><ul><li><p>On slide 18 (page 42 in the PDF), I had “Time Files like an
Applicative Functor” as the title of Conor's blog post. Obviously,
time is rubbish at filing, or at least not as good as applicative
functors are.</p></li><li><p>On slide 28 (page 57 in the PDF), I had the semantics of data type
descriptions as an endofunction on some undefined curly D thing. It
ought to be the powerset of the semantic domain.</p></li></ul>http://bentnib.org/posts/2011-11-14-productive-programmer.htmlMon, 14 Nov 2011 00:00:00 +0000On Structural Recursion II: Folds and Inductionhttp://bentnib.org/posts/2011-04-28-folds-and-induction.html<p><a href="http://bentnib.org/posts/2011-04-22-structural-recursion.html">Last time</a> I talked
about the background on defining structurally recursive functions in
Type Theory and why you might want it. The key point is that
structural recursion is driven by the data that is being analysed, as
opposed to just doing its own thing with a side-proof that it always
terminates.</p><p>The goal here is to come up with a self-contained and syntax-free
definition of a dependently-typed structural recursion principle for a
type <code>A</code>. To start with, this will be purely for normal structural
recursion over inductively defined types. In future posts I will
extend this to other types by combining existing structural induction
principles.</p><p>In this post I'll cover how to present structural recursion and
induction in a generic way. The first is probably well known to most
people reading this, being just the standard stuff about initial
<code>F</code>-algebras, but hopefully the second will be new.</p><p>The generic approach to structural induction I'll describe below
underlies how <a href="http://www.e-pig.org/">Epigram 2</a> and my own language
<a href="https://github.com/bobatkey/foveran/">Foveran</a> implement structural
induction in a generic way. More information on Epigram's
representation of data types is available in the paper <a href="http://personal.cis.strath.ac.uk/~dagand/papers/levitation.pdf">The Gentle Art
of
Levitation</a>.</p><p>My last post led to a <a href="http://www.reddit.com/r/haskell/comments/gv6ze/structural_recursion/">very interesting discussion on
reddit</a>,
which quickly left the measly content of my original post behind. Just to
clarify some of the points that came up in that discussion, I am only
talking here about structural recursion/induction over inductively
defined types---by which I mean types that arise as carriers of initial
<code>F</code>-algebras---not co-inductive types (though the general technique
for handling structural induction I'll describe below does extend to
co-induction, at least in a categorical setting).</p><h2>What is Structural Recursion?</h2>
<p>Structural recursion is often presented as a restriction. In Coq or
Agda, the programmer enters a recursive definition, and the system
inspects it to make sure it fits the template of acceptable
definitions. This "generate and test" approach naturally leads most
people to think that the system is just being stupid when it refuses
to accept their definition.</p><p>An alternative is for the system to be up-front about the definitions
it is prepared to accept. Instead of writing recursive definitions and
having them rejected, the programmer uses special-purpose recursion
combinators to write their programs. This is the basic idea behind
recursion schemes, as in the classic paper <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125">Functional Programming
with Bananas, Lenses and Barbed
Wire</a>. An
important extra benefit of using recursion schemes, as pointed out in
that paper, is that they often also come with reasoning principles.</p><p>I am interested in going beyond normal recursion schemes, and doing
<em>dependent</em> recursion schemes as well. This allows us to use
structural recursion as a method of reasoning as well as a method of
definition, and to intertwine the two if we feel like it. First
though, I'll go over the standard stuff about how to treat the most
basic recursion scheme at a general level.</p><h3>Folds, Catamorphisms and F-Algebras</h3>
<p>We can represent structural recursion on lists (with elements from a
set <code>A</code>) using the right fold function:</p><pre><code>foldr : (B : Set) → B → (A → B → B) → List → B</code></pre><p>which satisfies the following two equations:</p><pre><code>foldr B n c nil = n
foldr B n c (cons a as) = c a (foldr B n c as)</code></pre><p>As I mentioned
<a href="http://bentnib.org/posts/2011-04-22-structural-recursion.html">last time</a>, we can read
these off directly as computation rules: when the implementation of
<code>foldr</code> sees <code>nil</code> in its last argument, it knows to use its "nil"
parameter; and when it sees <code>cons a as</code> in its last argument it knows
to use the "cons" parameter. Thus we have a method for computation,
driven by the structure of the data we are processing.</p><p>The well-known category-theoretic generalisation of this makes use of
<code>F</code>-algebras. The basic idea is that the constructors of our inductive
type are encoded as a functor <code>F : Set → Set</code>. For instance, the
constructors of lists, with elements from some set <code>A</code> are encoded
using the functor (I've only defined it here on objects, but the
action on morphisms is obvious):</p><pre><code>ListF : Set → Set
ListF X = 1 + A × X</code></pre><p>The first two arguments of the <code>foldr</code> function can now be captured as
a function <code>f : ListF B → B</code>. A pair of a set <code>B</code> and such a function
<code>f : F B → B</code> is called an <code>F</code>-algebra. Given two <code>F</code>-algebras <code>(k₁ :
F B → B)</code> and <code>(k₂ : F C → C)</code>, we define <code>F</code>-algebra morphisms to be
functions <code>h : B → C</code> such that <code>h ○ k₁ = k₂ ○ F h</code>. It is not too
difficult to see that <code>F</code>-algebras and their morphisms form a
category, called <code>F</code>-Alg.</p><p>For quite a large class of functors <code>F</code>, <code>F</code>-Alg has a (unique up-to
isomorphism) initial object. Spelling this out, we have an <code>F</code>-algebra
<code>(in : F µF → µF)</code> such that for any other <code>F</code>-algebra <code>(k : F B → B)</code>
there is a <em>unique</em> <code>F</code>-algebra morphism from <code>(in : F µF → µF)</code> to
<code>(k : F B → B)</code>.</p><p>For the lists example, an initial algebra for the functor <code>ListF</code> is
the set of lists with elements from the set <code>A</code>, with <code>(in : ListF
(List A) → List A)</code> defined simply as the construction of lists using
<code>nil</code> and <code>cons</code>. By initiality, if we have a <code>ListF</code>-algebra <code>(k :
ListF B → B)</code> we get a function of type <code>List A → B</code>. Since it is a
<code>ListF</code>-algebra morphism it satisfies the two equations for <code>foldr</code> above, so
we can interpret <code>foldr</code>.</p><p>More generally, we can interpret specifications of inductive types as
functors <code>F</code>, the inductive types themselves as the carriers of
initial algebras and structural recursion on them as making use of the
initiality property to generate homomorphisms. Note that not every <code>F</code>
you can write down necessarily has an initial algebra; for instance,
<code>F X = (X → 2) → 2</code> doesn't have an initial algebra in the category of
sets and functions.</p><p>In the recursion schemes literature, folds are referred to as
<em>catamorphisms</em>. Edward Kmett has written <a href="http://knol.google.com/k/catamorphisms">a comprehensive Knol on
catamorphisms</a>. He also has a
<a href="http://comonad.com/reader/2009/recursion-schemes/">field guide</a> to
recursion schemes, for both data and codata.</p><h2>What is Structural Induction?</h2>
<p>From a high level, (constructive) structural induction is just
structural recursion with fancier types. And from a computation
viewpoint this is fairly accurate: computation still proceeds by
looking at data and then deciding what to do next. But we still need
to work out what those types should be.</p><h3>Dependent Structural Recursion (a.k.a. Structural Induction)</h3>
<p>In pure Type Theory (i.e. without extensions for pattern matching and
recursive definitions), structural recursion is presented in terms of
elimination principles. Continuing the example of lists, we are
provided with a pair of constructors <code>nil : List</code> and <code>cons : A → List
→ List</code> and an eliminator:</p><pre><code>elimList : (P : List → Set) →
(P nil) →
((a : A) → (l : List) → P l → P (cons a l)) →
(l : List) → P l</code></pre><p>This looks very similar to the type of the <code>foldr</code> function above. The
difference is that the return type depends on the input list, and this
dependency is carried through in the types of the "nil" and "cons"
parameters (the second and third parameters).</p><p>As with the <code>foldr</code> function, <code>elimList</code> satisfies a pair of
equations, which are almost identical, except for some extra
information being passed in the <code>cons</code> case.</p><pre><code>elimList P n c nil = n
elimList P n c (cons a as) = c a as (elimList P n c as)</code></pre><p>In presentations of type theory, the construction of elimination
principles is usually defined syntactically by looking at the types of
the constructors. See, for example, <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.74">Dybjer's presentation of
inductive
families</a>. This
seems a little ad-hoc: one might hope for a generic way of extracting
dependent eliminators for inductive types just from their
presentations as initial <code>F</code>-algebras. Happily, this is possible.</p><h3>Structural Induction for initial <code>F</code>-algebras</h3>
<p>Given a functor <code>F : Set → Set</code> that has an initial algebra <code>(in : F
µF → µF)</code> we first need to derive the type of the generic elimination
principle for <code>µF</code>. The generic fold operator for <code>µF</code> has this type:</p><pre><code>fold : (B : Set) → (F B → B) → µ F → B</code></pre><p>Looking at the similarities between the <code>foldr</code> function on lists and
the <code>elimList</code> eliminator, we can guess that a generalisation should
look like this for general <code>(in : F µF → µF)</code>:</p><pre><code>elimF : (P : µF → Set) →
((x : F µF) → lift F P x → P (in x)) →
(x : µF) → P x</code></pre><p>I'll attempt to explain this type with reference to the type of
<code>elimList</code>. The first parameter, <code>P : µF → Set</code> is the dependent type
we are eliminating to, and is exactly the same. The second parameter
is the "algebra" component. It is a function that takes a value of
type <code>F µF</code>. In the case of lists, this would either be <code>inl *</code>
(representing nil) or <code>inr (a,l)</code> (representing cons). So, similarly to
<code>F</code>-algebras representing the multiple parameters to <code>fold</code>, we are
encoding all the cases into one. The second argument of the algebra
component is a little trickier. I have assumed a functor <code>lift F : (µF
→ Set) → (F µF → Set)</code> that takes <code>P</code> and <code>x</code> and produces some
set. This part somehow encodes the "inductive hypothesis" for this
structural recursion principle. I'll explain this further
below. Finally, the "algebra" part ought to return a value of type <code>P
(in x)</code>, corresponding to the <code>P nil</code> and <code>P (cons a l)</code> parts of the
<code>elimList</code> type.</p><p>Given such an "algebra" parameter, <code>elimF</code> promises to take any
element <code>x</code> of <code>µF</code> and return a value of <code>P x</code>.</p><h4>Lifting <code>F</code></h4>
<p>I need to give a definition for <code>lift F : (µF → Set) → (F µF → Set)</code>
now. In the case of lists, we can see that the following definition
gives the right answer:</p><pre><code>lift F P (inl *) = ⊤
lift F P (inr (a,l)) = P l</code></pre><p>Unfolding this definition in the type of <code>elimF</code> above, it is easy to
see that it is an encoding of the original type of <code>elimList</code>.</p><p>But how do we do this in general for any <code>F</code>? <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.7400">Hermida and
Jacobs</a>
did this back in the nineties for polynomial <code>F</code> (i.e. <code>F</code>s
constructed from constants, sums, products and the identity functor)
by induction on the structure:</p><pre><code>lift Id P x = P x
lift (K A) P a = ⊤
lift (F × G) P (x,y) = (lift F P x) × (lift G P y)
lift (F + G) P (inl x) = lift F P x
lift (F + G) P (inr y) = lift G P y</code></pre><p>Note there are two cases for <code>F + G</code> depending on which component the
<code>(F + G) µH</code> argument is in. It is easy to see that <code>lift F</code> is also a
functor, so it has an action on morphisms between <code>µF → Set</code> objects.</p><p><em>Aside</em> Note that this definition doesn't depend on the type of <code>P</code>
being <code>µ F → Set</code>. We can actually give <code>lift F</code> the type <code>(A → Set) →
(F A → Set)</code> for any <code>A</code>. So we can extend <code>lift F</code> to be a functor on
the category <code>Fam(Set)</code>, where objects are pairs <code>(A : Set, P : A →
Set)</code>. This extra generality helps when proving that structural
induction is derivable from structural recursion, and for further
applications like doing <a href="http://bentnib.org/inductive-refinement.html">refinement of inductive
types</a>. Logical relations lovers will
also see that this is the definition of the unary logical relation
derived from the type scheme generated by a polynomial functor.</p><p>An important property of <code>lift F</code> is that it is
<em>truth-preserving</em>. For any <code>x : F µF</code>, there is a (unique) value of
<code>lift F (λy. ⊤) x</code>. I read this as saying that <code>x : F µF</code> completely
determines the shape of <code>lift F (λy. ⊤) x</code>. We can use this to derive
a useful function, using the functoriality of <code>lift F</code>:</p><pre><code>all : (P : µF → Set) →
(x : F µF) →
(k : (y : µF) → P y) →
lift F P x</code></pre><p>Knowing that <code>lift F</code> is truth preserving turns out to be just enough
to prove that the <code>elimF</code> rule above holds, and also that the
following equation holds:</p><pre><code>elimF P h (in x) = h x (all P x (elimF P h))</code></pre><p>Spelling this out for the case of lists, it is pretty easy to see that
this gives us back the original elimination rule for lists. It is also
true that <code>elimF P h : (x : µF) → P x</code> is the <em>unique</em> function that
satisfies this equation, at least in the categorical setting. See
<a href="http://personal.cis.strath.ac.uk/~patricia/csl2010.pdf">Neil, Patty and Clem's
paper</a> for the
category-theoretic proof that this works. The proof only demands that
we have an initial algebra for <code>F</code> and a truth preserving lifting. As
a hint of the proof, note that if we regard <code>lift F</code> as an endofunctor
on <code>Fam(Set)</code> then the premises of the <code>elimF</code> function amount to
taking a <code>(lift F)</code>-algebra with carrier <code>(µF, P)</code> to a morphism <code>(x :
µF) → K₁ µF x → P x</code>, where <code>K₁ µF x = ⊤</code>. The proof consists of
showing that <code>K₁ µF</code> is the carrier of the initial <code>(lift F)</code>-algebra.</p><h4>Generalising to all <code>F</code></h4>
<p>I have only defined <code>lift F</code> for polynomial <code>F</code> so far, but it is
possible to characterise <code>lift F</code> for any arbitrary <code>F</code>, as hinted by
Hermida and Jacobs and fleshed out by Neil, Patty and Clem. Assuming
we had extensional equality, we could define <code>lift F</code> as:</p><pre><code>lift F P x = Σy : F (Σz : µF. P z). F π₁ y = x</code></pre><p>where <code>Σx : A. B x</code> denotes a dependent pair type, and <code>π₁</code> is the
first projection. It is fairly easy to see that this is always
truth-preserving, and that it coincides with the definition above for
polynomial functors. This is particularly nice because it gives an
extensional (i.e. we don't care about how <code>F</code> is constructed)
definition of what the lifting is. It is also possible to give a
generic description that works in any fibration with the right
structure, so we can generalise to indexed types, types over
categories of domains and so on.</p><p>When actually implementing this in a intensional type theory, like in
Epigram 2 or Foveran, it is better, both because of the lack of
extensional equality and for pragmatic reasons, to define the lifting
directly in terms of the descriptions of the functors. However, the
extensional definition does give a guide as to the form that <code>lift F</code>
ought to take when we move beyond polynomial functors.</p><h2>Can we generalise Structural Induction?</h2>
<p>My goal in the next post is to show that structural induction on the
structure of an inductively defined type is only one form of
structural induction. Generalising from the set up above, my current
thought for what a "structural induction principle" for some <code>A : Set</code>
should look like consists of:</p><ul><li>Another set <code>B</code>, which records the "sub-structure" of <code>A</code></li><li>A constructor <code>k : B → A</code></li><li>A predicate transformer <code>H : (A → Set) → (B → Set)</code></li><li>A witness to truth-preservation <code>tr : (b : B) → H (λa. ⊤) b</code></li><li>An eliminator <code>elim : (P : A → Set) → ((b : B) → H P b → P (k b)) → (a : A) → P a</code></li></ul>
<p>satisfying the equality</p><pre><code>elim P h (k b) = h b (all P b (elim P h))</code></pre><p>where <code>all</code> is derived from the witness to truth-preservation.</p><p>I should say that I'm not totally fixed on the details. My current
thought is that I might need multiple constructors, or that more
structure of <code>H</code> is required. But I'm pretty sure that an explicit
description of how elements of the data type we are analysing are
constructed is essential to any description of <em>structural</em> induction,
because it allows us to state the computation rule. My plan is to push
through examples and see what is required.</p><p>Previous work on this includes Nils Anders Danielsson's definition of
<a href="http://www.cse.chalmers.se/~nad/repos/lib/src/Induction.agda">Recursor</a>
in the Agda standard library, but this doesn't include a notion of
constructor, or a computation rule. Conor also has a definition called
<code>Below</code> which is pretty much identical.</p><p>Obviously structural induction on an inductively defined type fits
into this schema, with <code>B = F µF</code>, and <code>H = lift F</code>. It is not so
obvious that more general ones do. In the next few posts I'll cover
how to generate new structural induction principles from old ones. For
those of you that want to read ahead you can look at an outdated
implementation, using an old definition of structural induction
principle in the <a href="https://github.com/bobatkey/foveran/blob/master/tests/inductors-descript.fv">Foveran
repository</a>.</p>http://bentnib.org/posts/2011-04-28-folds-and-induction.htmlThu, 28 Apr 2011 00:00:00 +0000On Structural Recursionhttp://bentnib.org/posts/2011-04-22-structural-recursion.html<p>This is the first in (hopefully) a series of blog posts on defining an
algebra of structural induction principles in type theory, borrowing
inspiration from category theory. This is joint work with <a href="http://personal.cis.strath.ac.uk/~patricia">Patty
Johann</a> and <a href="http://personal.cis.strath.ac.uk/~ng">Neil
Ghani</a>, and builds on work and
ideas of many, notably <a href="http://personal.cis.strath.ac.uk/~conor">Conor
McBride</a>. In this post, I'll
just explain the background.</p><h2>Structural Recursion</h2>
<p>Structural recursion is a fundamental part of the definition of
functions in Type Theory, and also in functional programming
languages. A standard example is that of <code>length</code> on lists (in Haskell
syntax):</p><pre><code>length :: [a] -> Int
length [] = 0
length (x:xs) = 1 + length xs</code></pre><p>Execution of <code>length</code> proceeds by looking at the constructor — either
<code>[]</code> or <code>_:_</code> — and then choosing a course of action. In the nil case,
it evaluates to <code>0</code>, in the cons case it recursively evaluates
<code>length</code>, on the sub-term exposed by the pattern matching, and then
adds <code>1</code> to it.</p><p>Definition by structural recursion has the following two features:</p><ul><li>It is always terminating, because we only ever call the function
again on <em>smaller</em> elements of the inductively defined type.</li><li>It provides an obvious way of computing. The process is: look at the
constructor, do an action. We can define what it means to <em>run</em>
functions, and hence use Type Theory as a programming language.</li></ul>
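<p>Both features can be seen in miniature by rewriting <code>length</code> with Haskell's <code>foldr</code> combinator; this is my own illustrative sketch, not code from the post:</p><pre><code>-- length via foldr: the recursion pattern is supplied once by the
-- combinator, so the definition is structurally recursive by
-- construction, and computation is still driven by the constructors.
lengthF :: [a] -> Int
lengthF = foldr (\_ n -> 1 + n) 0</code></pre><p>On <code>[]</code> this computes to <code>0</code>; on <code>x:xs</code> it computes to <code>1 + lengthF xs</code>, exactly the equations of the explicit definition above.</p>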
<p>The first of these, while a nice-to-have feature in a functional
programming setting, is essential in Type Theory, since we need
totality of defined functions to retain decidability of type checking.</p><h2>Nothing’s Ever Perfect</h2>
<p>Definition by structural recursion is, at least on the surface,
extremely restrictive. The restriction to recursive calls only on
terms that are immediate sub-terms of the current argument is
especially restrictive. The usual way to demonstrate this
restrictiveness is the functional not-in-place "quicksort":</p><pre><code>quicksort :: [a] -> [a]
quicksort [] = []
quicksort (x:xs) = quicksort l ++ x : quicksort h
where (l,h) = partition x xs</code></pre><p>For any reasonable definition of <code>partition</code>, the lists <code>l</code> and <code>h</code>
will be sufficiently shorter than <code>x:xs</code> to ensure termination, but
naïve structural recursion cannot see this.</p><p>To get around this kind of limitation, people have tried several
different approaches.</p><h3>General Recursion</h3>
<p>Give up on termination and just use general recursion. This is the
approach taken by most functional programming languages. General
recursion can also be encoded in Type Theory if you have co-inductive
types, as shown by <a href="http://www.lmcs-online.org/ojs/viewarticle.php?id=55&layout=abstract">Venanzio
Capretta</a>.</p><h3>Termination Checkers</h3>
<p>Instead of requiring that everything be defined using structural
recursion, bolt another termination checker on to the type
checker. This is the approach taken by <a href="http://wiki.portal.chalmers.se/agda/pmwiki.php">Agda
2</a>, and to a lesser
extent by <a href="http://coq.inria.fr">Coq</a>.</p><p>This has the advantage that natural looking function definitions can
be written, much as they would be written in a functional language
with general recursion. The downside is that the termination checker
is something separate to the core type theory, abrogating the claim of
type theory to be a "checkable language of evidence" (I got this
description from Conor McBride).</p><h3>Structural Recursion on Accessibility Predicates</h3>
<p>Exploit structural recursion by making use of a general purpose
inductively defined type <code>Acc</code> that allows us to encode recursion on
arbitrary well-founded relations (in Agda syntax):</p><pre><code>data Acc (x : A) : Set where
acc : (∀ y → y < x → Acc y) → Acc x</code></pre><p>Functions are defined by structural recursion on a special <code>Acc x</code>
argument. To actually call a function that requires such an argument
we need to provide a proof of <code>Acc x</code> for our chosen relation
<code>_<_</code> and our argument <code>x</code>. See <a href="http://blog.ezyang.com/2010/06/well-founded-recursion-in-agda/">Edward Z. Yang’s blog
post</a>
for more information on doing this in Agda. As far as I am aware, this
technique goes back to Bengt Nordström’s paper <a href="http://www.cse.chalmers.se/~bengt/papers/genrec.ps">Terminating General
Recursion</a>.</p><p>In this technique, we are defining the function by structural
recursion on the <code>Acc x</code> argument, so it is this argument that drives
the computation. Since we probably want to erase this argument at
run-time, this creates a gap between the semantics of functions at
compile-time and functions at run-time.</p><p>A variant on this method is to generate a new inductively-defined
predicate for each recursive function that you want to define. This is
the <a href="http://www.cs.nott.ac.uk/~vxc/publications/General_Recursion_MSCS_2005.pdf">Bove-Capretta method</a>.</p><h2>Can we regain a slavish devotion to structure?</h2>
<p>In this series of blog posts, I’m going to explore a different
technique that allows more complex structural recursion principles to
be built up from basic structural recursion on inductively defined
datatypes, in such a way that the data we are analysing still drives
the computation. This idea goes back to (I think) <a href="http://www.springerlink.com/content/y3h26w6274664m6m/">Eduardo
Giménez</a>
(apologies for the paywall link), who used it to justify Coq’s
termination checker. The ideas were also promoted by McBride and
McKinna’s <a href="http://strictlypositive.org/view.ps.gz">The view from the
left</a>, where the systematic
use of eliminators for defining functions in type theory was
investigated. I’ll also be taking ideas from category-theoretic
presentations of structural recursion, specifically Hermida and
Jacobs’ <a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.7400">original
paper</a>
and Neil Ghani, Patricia Johann and Clement Fumex’s <a href="http://personal.cis.strath.ac.uk/~patricia/csl2010.pdf">generalisation of
it</a>.</p><p>In the next post I’ll discuss how structural recursion is encoded in
type theory and category theory.</p>http://bentnib.org/posts/2011-04-22-structural-recursion.htmlFri, 22 Apr 2011 00:00:00 +0000