This post contains a development of an Agda proof that, for the Call-by-Name (CBN) λ-calculus, the number of steps needed to reduce a term to weak head normal form (WHNF) in the Krivine Abstract Machine (KAM) is equal to the size of the term's typing derivation in a non-idempotent intersection type system.

The converse also holds: if a term reduces to WHNF in the KAM, then it has a typing derivation whose size is equal to the number of steps it takes to reduce the term. Therefore, typability characterises termination, quantitatively.

The main result manifests as a “quantitative” subject reduction lemma: not only is typing preserved by each reduction step, but the size of the typing derivation goes down by exactly one in each step.

Non-idempotent intersection types can also be seen as a syntactic presentation of a denotational semantics of the untyped λ-calculus in the category Rel of sets and relations, for a particular reflexive object. The fact that subject reduction (and expansion) have a quantitative aspect yields a proof of adequacy for this semantics that doesn't rely on logical relations (I haven't proved this in this file though).

The quantitative nature of non-idempotent intersection types was originally observed for head normalisation by Daniel de Carvalho in *Execution Time of lambda-Terms via Denotational Semantics and Intersection Types*.

```
module non-idempotent-intersection-types where
```

First we need some things from the Agda standard library:

```
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Data.Nat.Properties using (+-assoc; +-comm; +-identityʳ; suc-injective)
open import Data.Fin using (Fin; zero; suc; _≟_)
open import Data.Product using (Σ-syntax; _×_; _,_; proj₁; proj₂)
open import Data.List using (List; []; _∷_; _++_; [_])
import Relation.Binary.PropositionalEquality as Eq
open Eq using (_≡_; refl; trans; cong; subst; sym; _≢_; cong₂)
open Eq.≡-Reasoning using (begin_; _≡⟨⟩_; step-≡; _∎)
```

The syntax of the programming language we will look at is the same as the usual untyped λ-calculus. I've opted for an intrinsically well-scoped de Bruijn representation:

```
data term : ℕ → Set where
  `_  : ∀ {n} → Fin n → term n
  ƛ   : ∀ {n} → term (suc n) → term n
  _·_ : ∀ {n} → term n → term n → term n

infixl 20 _·_
infix 40 `_
```

The standard way to give an operational semantics to the untyped λ-calculus is to give a small step reduction relation where the main rule is β-reduction:

```
(ƛ t) · t' ---> t [ t' ]
```

However, our goal here is to measure the complexity of programs via typing, and counting the number of β-reductions is not a very convincing way of doing this. Substitution does a non-constant amount of work, depending on how big `t` is and how often the `0`th variable is used in it.

Instead, we want to use a semantics that performs a constant amount of work at each step. One way to do this is to use an abstract machine to interpret programs. Abstract machines break the complex process of β-reduction down into smaller steps, so they are more convincing ways to measure complexity.

The abstract machine we shall use here is the Krivine Abstract Machine (KAM), which implements reduction of λ terms to weak head normal form.

Rather than using substitution, the KAM uses *closures*: pairs of a
term and an *environment* that assigns a closure to each variable in
the term. As this description suggests, closures and environments are
mutually defined:

```
mutual
  data env : ℕ → Set where
    nil  : env zero
    _,-_ : ∀ {n} → env n → clo → env (suc n)

  clo : Set
  clo = Σ[ n ∈ ℕ ] env n × term n
```

Note that environments are built by extending to the right.

The second ingredient of the KAM is stacks of closures. Stacks are used to remember the context within the program that we are currently executing. Compare with the congruence rule used in the definition of the CBN small-step operational semantics:

```
M --> M'
---------------------
M · N --> M' · N
```

In the rule, the context is the `N` term, and in general we might have to go under several `N`s to get to the part we are currently executing. The KAM uses a stack of closures to represent this:

```
data stk : Set where
  emp  : stk
  _-,_ : clo → stk → stk

infixr 10 _-,_
```

Stacks are built by extending to the left.

A *configuration* of the KAM is a pair of a closure (the current focus of the computation) and a stack representing the evaluation context. We use the special notation `⟨_⋆_⟩` for configurations.

```
record configuration : Set where
  constructor ⟨_⋆_⟩
  field
    closure : clo
    stack   : stk

infix 20 ⟨_⋆_⟩
```

The reduction semantics of the KAM is given by the following four rules, which are driven by the shape of the term in the first component of the configuration:

```
data _⇒_ : configuration → configuration → Set where
  grab : ∀ {n η c π} →
         ⟨ suc n , η ,- c , ` zero ⋆ π ⟩ ⇒ ⟨ c ⋆ π ⟩
  skip : ∀ {n η c i π} →
         ⟨ suc n , η ,- c , ` (suc i) ⋆ π ⟩ ⇒ ⟨ n , η , ` i ⋆ π ⟩
  push : ∀ {n η t s π} →
         ⟨ n , η , t · s ⋆ π ⟩ ⇒ ⟨ n , η , t ⋆ (n , η , s) -, π ⟩
  pop  : ∀ {n η t c π} →
         ⟨ n , η , ƛ t ⋆ c -, π ⟩ ⇒ ⟨ suc n , (η ,- c) , t ⋆ π ⟩

infixr 10 _⇒_
```

The rules `grab` and `skip` handle the execution of variables by incrementally looking them up in the current environment. One could also define a KAM that has a single rule for variable lookup (looking arbitrarily deep into the environment in a single step), but we chose to break it down into individual steps here.

The rule `push` executes an application `t · s` by pushing the argument `s` onto the stack (accompanied by its environment). Execution will then carry on in the function position (compare to the congruence rule given above for a small step operational semantics).

Finally, the rule `pop` executes a λ abstraction `ƛ t` by popping the current argument from the stack and adding it to the current environment. Combined with the first two rules, this performs a delayed β-reduction.

To start the KAM, we need to take programs that we want to execute and turn them into configurations:

```
init : term 0 → configuration
init t = ⟨ 0 , nil , t ⋆ emp ⟩
```

Because we have defined our KAM using intrinsically scoped syntax, it is not possible for the KAM to get stuck by encountering a variable that is not in scope. The only way the machine can get stuck is by ending up with a λ-abstraction paired with an empty stack. In this case we can say that the original term has been reduced to its weak head normal form (WHNF). It is also possible that a term never reduces to a final configuration at all -- not all terms finish reducing under the Call-by-Name semantics. To identify which terms reduce, we make an inductive definition that also records how many steps it takes for a term to reduce:

```
data reduces : ℕ → configuration → Set where
  stop : ∀ {n} η t → reduces 0 ⟨ n , η , ƛ t ⋆ emp ⟩
  step : ∀ {n p q} → p ⇒ q → reduces n q → reduces (suc n) p
```

So if we have a value of type `reduces n p`, then we know that the configuration `p` reduces to a WHNF in `n` steps.
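As a quick sanity check (the definitions `I` and `I·I-reduces` are my own additions, not part of the development), self-application of the identity function reaches WHNF in exactly three steps:

```
I : term 0
I = ƛ (` zero)

-- (λx. x) (λx. x): push the argument, pop it into the
-- environment, then grab it back out of the environment.
I·I-reduces : reduces 3 (init (I · I))
I·I-reduces = step push (step pop (step grab (stop nil (` zero))))
```

Note that no β-reduction is ever performed on the syntax: the final configuration holds the closure for `I` that was pushed in the first step.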

Since the selection of rules in the KAM is driven by the shape of the term being executed, execution in the KAM is deterministic:

```
deterministic : ∀ {p q q'} → p ⇒ q → p ⇒ q' → q ≡ q'
deterministic grab grab = refl
deterministic skip skip = refl
deterministic push push = refl
deterministic pop  pop  = refl
```

Consequently, any configuration that reduces does so in a unique number of steps:

```
reduces-det : ∀ {p m n} → reduces m p → reduces n p → m ≡ n
reduces-det (stop η t) (stop .η .t) = refl
reduces-det (step s1 r1) (step s2 r2) with deterministic s1 s2
... | refl = cong suc (reduces-det r1 r2)
```

Non-idempotent intersection types assign types to terms according to their precise runtime behaviour. Usually, type systems assign types that are over-approximations of the possible behaviours of a program. Intersection types (whether idempotent or not) assign types that track precisely the behaviour a program will have (in a particular context). Indeed, as we will see below, the system we present here precisely tracks the termination behaviour of programs. This precision does make typechecking undecidable.

The types of our system are defined as:

```
data type : Set where
  ⋆   : type
  _↦_ : List type → type → type

infixr 30 _↦_
```

We can interpret the types of the system as kinds of “observations” or “behaviour descriptions” that we can make of a program. Per the definition of `type`, we have two kinds of observation in our system:

- The observation `⋆` means that we can observe that the term will reduce to a value (a λ abstraction), but gives us no further information. To observe that a complete program reduces to WHNF, we will seek typings that type the whole program with the observation `⋆`.
- The observation `σ ↦ τ` pairs a list of observations `σ` with a single observation `τ`. Making this observation means that the term will reduce to a function that, when applied to an argument that exhibits all the behaviours in `σ`, exhibits the observation `τ`.

For example, the observation `[⋆] ↦ ⋆` is one that we can assign to the identity function: we can observe that the output is a value, as long as we can make the observation that the input is a value.

The lists of types `σ` are the *intersection types* of our system. They indicate that a particular term must have all of the behaviours listed, so it must be in the intersection of all the possible behaviours: `σ = τ₁ ∧ τ₂ ∧ ... ∧ τₙ`. Idempotent intersection type systems disregard multiple occurrences of the same behaviour (i.e. `τ ∧ τ = τ`) in an intersection: they only track whether a behaviour is present or not. Non-idempotent intersection type systems, in contrast, track the number of occurrences of each behaviour. In turn, this tracks the number of times a function uses its arguments. It is this quantitative aspect that will allow us to use non-idempotent intersection types to measure the reduction complexity of programs.

To see how the rejection of idempotency allows tracking of the number of uses, consider the following two terms (written using explicit names):

```
λf. λx. f x
```

and

```
λf. λx. f (f x)
```

When applied to the identity function `λx.x` and any value `λz.t`, these terms both reduce to a value. We can see this by the types that the system we define below will assign to them. The first one gets the following type:

```
⊢ λf. λx. f x ⦂ [[⋆] ↦ ⋆] ↦ [⋆] ↦ ⋆
```

In words, this says "if the first argument yields values when given values, and the second argument is a value, then the result is a value". The second term is assigned a similar type, but here we can see the quantitative nature of the system coming into play:

```
⊢ λf. λx. f (f x) ⦂ [[⋆] ↦ ⋆, [⋆] ↦ ⋆] ↦ [⋆] ↦ ⋆
```

Notice that the type now states that the first argument will be used twice, because two behaviours appear in the first intersection. This precisely tracks the structure of the term, which does indeed use `f` twice. If we assumed idempotency, then the two types would be equivalent, and we would be tracking the *qualitative* behaviour, but not the *quantitative* behaviour.

To assign types to open terms, we need typing contexts. Typing contexts assign to each free variable position a list of possible behaviours. Typing contexts grow to the right.

```
data ctx : ℕ → Set where
  nil  : ctx zero
  _,-_ : ∀ {n} → ctx n → List type → ctx (suc n)
```

We need the following two combinators for typing contexts. The first, `empty`, is the typing context where every variable is assumed to have no behaviours. This is used to type terms that do not use any of the variables in scope.

```
empty : ∀{n} → ctx n
empty {zero}  = nil
empty {suc n} = empty {n} ,- []
```

The `_+++_` combinator “zips” two typing contexts to combine their possible behaviours. We will use this to create contexts that combine the observations required by the two subterms of an application term.

```
_+++_ : ∀{n} → ctx n → ctx n → ctx n
nil +++ nil = nil
(Γ₁ ,- σ₁) +++ (Γ₂ ,- σ₂) = (Γ₁ +++ Γ₂) ,- (σ₁ ++ σ₂)
```
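As a throwaway check (this anonymous equation is my own, not part of the development), zipping two singleton contexts concatenates the behaviour lists pointwise:

```
-- zipping two contexts for one variable concatenates the
-- lists of behaviours assigned to that variable
_ : ((nil ,- [ ⋆ ]) +++ (nil ,- [ ⋆ ])) ≡ (nil ,- (⋆ ∷ ⋆ ∷ []))
_ = refl
```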

Non-idempotent intersection type systems are sensitive to the number of occurrences of each behaviour in an intersection, but not to their order. We will model this by requiring equivalence only up to permutation at various places. We define a permutation between two lists via sequences of swappings:

```
module permutations {X : Set} where

  data _⋈_ : List X -> List X -> Set where
    nil  : [] ⋈ []
    skip : ∀ {x l₁ l₂} -> l₁ ⋈ l₂ -> (x ∷ l₁) ⋈ (x ∷ l₂)
    swap : ∀ {x y l} -> (x ∷ y ∷ l) ⋈ (y ∷ x ∷ l)
    ⋈-trans : ∀ {l₁ l₂ l₃} -> l₁ ⋈ l₂ -> l₂ ⋈ l₃ -> l₁ ⋈ l₃

  ⋈-refl : ∀ {l} -> l ⋈ l
  ⋈-refl {[]} = nil
  ⋈-refl {x ∷ l} = skip ⋈-refl

open permutations
```
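For instance (an illustrative check of my own), a single `swap` witnesses that a two-element intersection can be reordered:

```
-- swapping the two elements of a two-element list of types
_ : (⋆ ∷ ([ ⋆ ] ↦ ⋆) ∷ []) ⋈ (([ ⋆ ] ↦ ⋆) ∷ ⋆ ∷ [])
_ = swap
```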

We use permutations to define when two typing contexts are pointwise permutations of each other. I.e., when two contexts have the same number of variables, but the observations at each variable are permuted.

```
data _⋈ctx_ : ∀ {n} → ctx n → ctx n → Set where
  nil  : nil ⋈ctx nil
  _,-_ : ∀ {n}{Γ₁ Γ₂ : ctx n}{σ₁ σ₂} →
         Γ₁ ⋈ctx Γ₂ → σ₁ ⋈ σ₂ → (Γ₁ ,- σ₁) ⋈ctx (Γ₂ ,- σ₂)

⋈ctx-refl : ∀ {n}{Γ : ctx n} → Γ ⋈ctx Γ
⋈ctx-refl {Γ = nil} = nil
⋈ctx-refl {Γ = Γ ,- σ} = ⋈ctx-refl ,- ⋈-refl
```

We can now define the rules of our system. First, we define what it means for a variable (represented as an Agda value of type `Fin n`) to be well typed in a context:

```
data _⊢v_⦂_ : ∀ {n} → ctx n → Fin n → type → Set where
  zero : ∀ {n τ} → (empty {n} ,- [ τ ]) ⊢v zero ⦂ τ
  suc  : ∀ {n τ i}{Γ : ctx n} → Γ ⊢v i ⦂ τ → (Γ ,- []) ⊢v (suc i) ⦂ τ
```

The key point here is that the context for a variable must require no behaviours for any variable that is not the one being selected, and must require exactly one behaviour for the one being selected. Thus one use of a variable corresponds to one observation.

The typing rules for terms are mutually defined with a rule for typing a single term against an intersection type. I will explain the rules after the definition.

```
mutual
  data _⊢_⦂_ : ∀ {n} → ctx n → term n → type → Set where
    var : ∀ {n τ i}{Γ : ctx n} →
          Γ ⊢v i ⦂ τ →
          Γ ⊢ ` i ⦂ τ
    lam : ∀ {n σ τ t}{Γ : ctx n} →
          (Γ ,- σ) ⊢ t ⦂ τ →
          Γ ⊢ ƛ t ⦂ (σ ↦ τ)
    app : ∀ {n σ τ s t}{Γ Γ₁ Γ₂ : ctx n} →
          Γ₁ ⊢ s ⦂ (σ ↦ τ) →
          Γ₂ ⊢ t ⦂' σ →
          (Γ₁ +++ Γ₂) ⋈ctx Γ →
          Γ ⊢ (s · t) ⦂ τ
    lam⋆ : ∀ {n t} → empty {n} ⊢ ƛ t ⦂ ⋆

  data _⊢_⦂'_ : ∀ {n} → ctx n → term n → List type → Set where
    nil : ∀ {n}{t : term n} → empty ⊢ t ⦂' []
    _,~_∷_ : ∀ {n t σ τ}{Γ₁ Γ₂ Γ : ctx n} →
             Γ₁ ⊢ t ⦂ τ →
             (Γ₁ +++ Γ₂) ⋈ctx Γ →
             Γ₂ ⊢ t ⦂' σ →
             Γ ⊢ t ⦂' (τ ∷ σ)
```

The rule `var` uses the variable typing judgement to assign types to terms that are variables. The rule `lam` states that a λ abstraction can be assigned the type `σ ↦ τ` if the body can be assigned the type `τ` under the assumption that the `0`th variable has the intersection type `σ`. The `app` rule states that if the term `s` can be observed to act like a function mapping intersections of behaviours `σ` to behaviours `τ`, and the term `t` exhibits *all* the behaviours in `σ` (using the auxiliary judgement `_⊢_⦂'_`), then the whole application term exhibits the behaviour `τ`. Note that the two terms are typed in separate contexts that are combined -- so the uses of variables in `s` and `t` are tracked separately. The rule `lam⋆` is used to assign the observation that λ abstractions are values (`⋆` means that the term is a value) to any λ abstraction.
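To illustrate `lam⋆` (the name `δ-value` is my own), note that even `λx. x x`, whose self-application famously diverges, is itself a λ abstraction and hence a value:

```
-- λx. x x is a λ abstraction, so it can be observed to be a
-- value with ⋆ (note that empty {0} computes to nil)
δ-value : nil ⊢ ƛ (` zero · ` zero) ⦂ ⋆
δ-value = lam⋆
```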

The auxiliary judgement `Γ ⊢ t ⦂' σ` is used to ensure that a single term `t` can exhibit all the possible behaviours in `σ`. Note that each individual behaviour in `σ` must be observable under a separate typing context. All the typing contexts are summed up (up to permutation) to give the complete requirements on the context.

This completes the definition of the type assignment system. We demonstrate it using the two terms from above. These are defined as follows:

```
-- λf. λx. f x
example₁ : term 0
example₁ = ƛ (ƛ (` suc zero · ` zero))

-- λf. λx. f (f x)
example₂ : term 0
example₂ = ƛ (ƛ (` suc zero · (` suc zero · ` zero)))
```

To give typing derivations for these, we will save a bit of typing by using the following derived rule for application when we apply a function that will only use its argument once:

```
app₁ : ∀ {n Γ₁ Γ₂ τ₁ τ₂}{s t : term n} →
       Γ₁ ⊢ s ⦂ [ τ₁ ] ↦ τ₂ →
       Γ₂ ⊢ t ⦂ τ₁ →
       (Γ₁ +++ (Γ₂ +++ empty)) ⊢ s · t ⦂ τ₂
app₁ s-ok t-ok = app s-ok (t-ok ,~ ⋈ctx-refl ∷ nil) ⋈ctx-refl
```

With this derived rule to help us, the first example can be assigned the type we gave above like so:

```
typing₁ : nil ⊢ example₁ ⦂ [ [ ⋆ ] ↦ ⋆ ] ↦ [ ⋆ ] ↦ ⋆
typing₁ = lam (lam (app₁ (var (suc zero)) (var zero)))
```

And the second one like so:

```
typing₂ : nil ⊢ example₂ ⦂ ([ ⋆ ] ↦ ⋆ ∷ [ ⋆ ] ↦ ⋆ ∷ []) ↦ [ ⋆ ] ↦ ⋆
typing₂ = lam (lam (app₁ (var (suc zero)) (app₁ (var (suc zero)) (var zero))))
```

In the second one we see how the type reveals that the term uses its first argument twice.

Our main result below is that typing is preserved by execution
*quantitatively*. Therefore, we need to be able to compute the sizes
of typing derivations. This is done by the following three functions:

```
v-size : ∀ {n}{Γ : ctx n}{t τ} → Γ ⊢v t ⦂ τ → ℕ
v-size zero = 1
v-size (suc d) = 1 + v-size d

mutual
  size : ∀ {n}{Γ : ctx n}{t τ} → Γ ⊢ t ⦂ τ → ℕ
  size (var x) = v-size x
  size (lam d) = 1 + size d
  size (app d₁ d₂ _) = 1 + size d₁ + size' d₂
  size lam⋆ = 0

  size' : ∀ {n}{Γ : ctx n}{t σ} → Γ ⊢ t ⦂' σ → ℕ
  size' nil = 0
  size' (d ,~ _ ∷ ds) = size d + size' ds
```

We have been a little selective in what we have counted. We have added `1` for every construct that can cause the KAM to make a step: whether a variable lookup (`zero` or `suc`), an application, or a λ abstraction that we expect to have some interesting behaviour. In particular, a use of `lam⋆` is counted as `0` because it represents values that will not be inspected any further, and so will not cause the KAM to do any work.

The `size'` function adds up all the derivations used in checking an intersection type. This means that we accurately account for all the possible ways that a single term will be used.
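As a sanity check (these anonymous equations are my own additions), we can ask Agda to compute the sizes of the two example derivations given earlier. `typing₁` pays for two `lam`s, one `app`, a two-step lookup of `f`, and a one-step lookup of `x`; `typing₂` additionally pays for the second application and the second lookup of `f`:

```
-- 2 (lams) + 1 (app) + 2 (lookup of f) + 1 (lookup of x) = 6
_ : size typing₁ ≡ 6
_ = refl

-- 2 (lams) + 2 (apps) + 2 × 2 (lookups of f) + 1 (lookup of x) = 9
_ : size typing₂ ≡ 9
_ = refl
```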

We now extend the non-idempotent intersection typing discipline from terms to configurations of the KAM. This is required so that we can state our main lemmas.

Closures are typed by ensuring that the term contained in them is well typed for some context, and that the associated environment contains closures that are well typed according to that context. Because contexts contain intersection types, we need an auxiliary judgement that checks that a closure is well typed for all types in an intersection. These three judgements (closures, environments, closures with intersection types) need to be mutually defined:

```
mutual
  data ⊢c_⦂_ : (c : clo) (τ : type) → Set where
    t-clo : ∀ {n t τ}{η : env n}{Γ : ctx n} →
            Γ ⊢ t ⦂ τ →
            ⊢e η ⦂ Γ →
            ⊢c (n , η , t) ⦂ τ

  data ⊢e_⦂_ : ∀ {n} → env n → ctx n → Set where
    nil  : ⊢e nil ⦂ nil
    _,-_ : ∀{n}{η : env n}{Γ : ctx n}{c}{σ} →
           ⊢e η ⦂ Γ →
           ⊢c c ⦂' σ →
           ⊢e (η ,- c) ⦂ (Γ ,- σ)

  data ⊢c_⦂'_ : (c : clo) (σ : List type) → Set where
    nil : ∀ {c} → ⊢c c ⦂' []
    _∷_ : ∀ {c τ σ} → ⊢c c ⦂ τ → ⊢c c ⦂' σ → ⊢c c ⦂' (τ ∷ σ)
```

We also extend the measurement of sizes of derivations to well typed closures and environments. Notice that these derivations do not contribute themselves to the size: we only sum up the sizes of the contained term typing derivations.

```
mutual
  c-size : ∀ {c τ} → ⊢c c ⦂ τ → ℕ
  c-size (t-clo t e) = size t + e-size e

  e-size : ∀ {n}{η : env n}{Γ} → ⊢e η ⦂ Γ → ℕ
  e-size nil = 0
  e-size (d ,- c) = e-size d + c'-size c

  c'-size : ∀ {c σ} → ⊢c c ⦂' σ → ℕ
  c'-size nil = 0
  c'-size (c ∷ c') = c-size c + c'-size c'
```

Stacks are typed with two types `τ₁ ⊸ τ₂`, meaning that if the observation `τ₁` can be made of the computation happening under the stack, then the observation `τ₂` can be observed at the "top level":

```
data ⊢s_⦂_⊸_ : (π : stk) (τ₁ τ₂ : type) → Set where
  emp  : ∀ {τ} → ⊢s emp ⦂ τ ⊸ τ
  _-,_ : ∀ {c π σ τ₁ τ₂} →
         ⊢c c ⦂' σ →
         ⊢s π ⦂ τ₁ ⊸ τ₂ →
         ⊢s (c -, π) ⦂ (σ ↦ τ₁) ⊸ τ₂
```

Again, we measure the size of a stack typing derivation by adding up the sizes of the typing derivations contained within:

```
s-size : ∀ {π τ₁ τ₂} → ⊢s π ⦂ τ₁ ⊸ τ₂ → ℕ
s-size emp = 0
s-size (c -, π) = c'-size c + s-size π
```

Well typed KAM configurations are constructed by pairing closures that can generate a behaviour `τ₁` with stacks that can turn that behaviour into a behaviour `τ` at the top level. The size of a configuration is measured by summing up the sizes of the typing derivations included in the closure and the stack.

```
data ⊢p_⦂_ : configuration → type → Set where
  cfg : ∀ {c π τ₁ τ} →
        ⊢c c ⦂ τ₁ →
        ⊢s π ⦂ τ₁ ⊸ τ →
        ⊢p ⟨ c ⋆ π ⟩ ⦂ τ

cfg-size : ∀ {p τ} → ⊢p p ⦂ τ → ℕ
cfg-size (cfg c π) = c-size c + s-size π
```

We can now prove that the initial configuration of the KAM for a given program is well typed if the program is:

```
init-typed : ∀{t τ} → nil ⊢ t ⦂ τ → ⊢p init t ⦂ τ
init-typed d = cfg (t-clo d nil) emp
```

The size of this initial typing derivation is also equal to the size of the original typing derivation:

```
init-typed-size : ∀{t τ} → (d : nil ⊢ t ⦂ τ) →
                  cfg-size (init-typed d) ≡ size d
init-typed-size {τ = τ} d =
  begin
    size d + e-size nil + s-size {τ₁ = τ} emp
  ≡⟨⟩
    size d + 0 + 0
  ≡⟨ +-identityʳ (size d + 0) ⟩
    size d + 0
  ≡⟨ +-identityʳ (size d) ⟩
    size d
  ∎
```

We now connect the KAM semantics to the non-idempotent intersection type system. This will take the form of three key lemmas:

- *Subject Reduction* (a.k.a. preservation), which says that if a configuration of the KAM is well typed, and steps to another configuration, then that configuration is also well typed. The twist here is that this lemma is also quantitative: the size of the typing derivation after one step is exactly one smaller than it was before. This will allow us to prove that all typable terms reduce.
- *Subject Expansion*, the converse of subject reduction. If a configuration is well typed and was the result of a step of the KAM, then the predecessor was also well typed. Subject expansion does not usually hold for type systems, but is characteristic of intersection type systems due to the way they describe what a program *must* do, instead of what it *may* do. Subject expansion allows us to prove a completeness result: if a program reduces to WHNF, then we can step it backwards to recover a typing derivation for the initial state.
- *Progress*: if a configuration is well typed, and its derivation has non-zero size, then the KAM can take a step. Thus we can interpret the size of a derivation as the amount of computation that a program has left to perform.

Before we can prove subject reduction, we will need some auxiliary lemmas about the type system. These mostly say that typing is preserved under permutation, splitting, or joining of intersection types. We also prove that these preservations are size preserving, anticipating the quantitative subject reduction lemma in the next section.

First, we need that if a closure `c` can be well typed for (i.e., exhibits the behaviour of) all the types in `σ₁`, then it is well typed for any permutation of `σ₁`. Also, this does not affect the size of the typing derivation. We prove this by induction on the permutation applied:

```
clo-permute : ∀ {c σ₁ σ₂} → ⊢c c ⦂' σ₁ → σ₂ ⋈ σ₁ → ⊢c c ⦂' σ₂
clo-permute d nil = d
clo-permute (x ∷ d) (skip p) = x ∷ clo-permute d p
clo-permute (x ∷ x₁ ∷ d) swap = x₁ ∷ x ∷ d
clo-permute d (⋈-trans p p₁) = clo-permute (clo-permute d p₁) p

clo-permute-size : ∀ {c σ₁ σ₂} →
                   (d : ⊢c c ⦂' σ₁) → (p : σ₂ ⋈ σ₁) →
                   c'-size d ≡ c'-size (clo-permute d p)
clo-permute-size d nil = refl
clo-permute-size (c ∷ d) (skip p) =
  cong (λ □ → c-size c + □) (clo-permute-size d p)
clo-permute-size (x ∷ (x₁ ∷ d)) swap =
  trans (sym (+-assoc (c-size x) (c-size x₁) (c'-size d)))
        (trans (cong (λ □ → □ + c'-size d) (+-comm (c-size x) (c-size x₁)))
               (+-assoc (c-size x₁) (c-size x) (c'-size d)))
clo-permute-size d (⋈-trans p p₁) =
  trans (clo-permute-size d p₁)
        (clo-permute-size (clo-permute d p₁) p)
```

Preservation of closure typing under permutation extends to preservation of environment typing under pointwise permutation of typing contexts, and this is also size preserving:

```
env-permute : ∀ {n}{η : env n}{Γ₁ Γ₂ : ctx n} →
              ⊢e η ⦂ Γ₁ → Γ₂ ⋈ctx Γ₁ → ⊢e η ⦂ Γ₂
env-permute nil nil = nil
env-permute (d ,- x) (p ,- x₁) = env-permute d p ,- clo-permute x x₁

env-permute-size : ∀ {n}{η : env n}{Γ₁ Γ₂ : ctx n} →
                   (d : ⊢e η ⦂ Γ₁) → (p : Γ₂ ⋈ctx Γ₁) →
                   e-size d ≡ e-size (env-permute d p)
env-permute-size nil nil = refl
env-permute-size (d ,- x) (p ,- x₁) =
  cong₂ _+_ (env-permute-size d p) (clo-permute-size x x₁)
```

If a closure `c` exhibits all the behaviours in the big intersection `σ₁ ++ σ₂`, then it exhibits them separately. Moreover, the sizes of the two separate typing derivations sum to the original:

```
clo-split : ∀ {c σ₁ σ₂} → ⊢c c ⦂' (σ₁ ++ σ₂) → ⊢c c ⦂' σ₁ × ⊢c c ⦂' σ₂
clo-split {σ₁ = []} d = nil , d
clo-split {σ₁ = x ∷ σ₁} (x₁ ∷ d) =
  let d₁ , d₂ = clo-split d in x₁ ∷ d₁ , d₂

clo-split-size : ∀ {c σ₁ σ₂} →
                 (d : ⊢c c ⦂' (σ₁ ++ σ₂)) →
                 c'-size d ≡ c'-size (clo-split {σ₁ = σ₁} d .proj₁)
                           + c'-size (clo-split {σ₁ = σ₁} d .proj₂)
clo-split-size {σ₁ = []} d = refl
clo-split-size {σ₁ = x ∷ σ₁} (x₁ ∷ d) =
  trans (cong (λ □ → c-size x₁ + □) (clo-split-size {σ₁ = σ₁} d))
        (sym (+-assoc (c-size x₁) _ _))
```

As a consequence of `clo-split`, we can prove that if an environment exhibits the behaviours in two contexts combined, then it exhibits them separately. We will use this to distribute well-typedness of environments between the two parts of an application term in the `push` rule of the KAM.

```
env-split : ∀ {n} {η : env n} {Γ₁ Γ₂} →
            ⊢e η ⦂ (Γ₁ +++ Γ₂) → ⊢e η ⦂ Γ₁ × ⊢e η ⦂ Γ₂
env-split {n = zero} {nil} {nil} {nil} nil = nil , nil
env-split {n = suc n} {η ,- x} {Γ₁ ,- x₁} {Γ₂ ,- x₂} (d ,- x₃) =
  (env-split d .proj₁ ,- clo-split x₃ .proj₁) ,
  (env-split d .proj₂ ,- clo-split x₃ .proj₂)
```

The splitting of typing derivations by `env-split` preserves sizes, but to prove this we first need a lemma about addition:

```
interchange : ∀ a b c d → (a + b) + (c + d) ≡ (a + c) + (b + d)
interchange a b c d =
  begin
    (a + b) + (c + d)
  ≡⟨ +-assoc a b (c + d) ⟩
    a + (b + (c + d))
  ≡⟨ cong (λ □ → a + □) (sym (+-assoc b c d)) ⟩
    a + ((b + c) + d)
  ≡⟨ cong (λ □ → a + (□ + d)) (+-comm b c) ⟩
    a + ((c + b) + d)
  ≡⟨ cong (λ □ → a + □) (+-assoc c b d) ⟩
    a + (c + (b + d))
  ≡⟨ sym (+-assoc a c (b + d)) ⟩
    (a + c) + (b + d)
  ∎
```

With this lemma, we can prove that splitting an environment typing derivation is size preserving:

```
env-split-size : ∀ {n} {η : env n} {Γ₁ Γ₂} →
                 (d : ⊢e η ⦂ (Γ₁ +++ Γ₂)) →
                 e-size d ≡ e-size (env-split d .proj₁) + e-size (env-split d .proj₂)
env-split-size {n = zero} {nil} {nil} {nil} nil = refl
env-split-size {n = suc n} {η ,- x} {Γ₁ ,- σ₁} {Γ₂ ,- σ₂} (d ,- c) =
  begin
    e-size d + c'-size c
  ≡⟨ cong₂ _+_ (env-split-size d) (clo-split-size {x}{σ₁} c) ⟩
    (e-size (env-split d .proj₁) + e-size (env-split d .proj₂)) +
    (c'-size (clo-split {x}{σ₁} c .proj₁) + c'-size (clo-split {x}{σ₁} c .proj₂))
  ≡⟨ interchange (e-size (env-split d .proj₁)) (e-size (env-split d .proj₂))
                 (c'-size (clo-split {x}{σ₁} c .proj₁))
                 (c'-size (clo-split {x}{σ₁} c .proj₂)) ⟩
    e-size (env-split d .proj₁) + c'-size (clo-split {x}{σ₁} c .proj₁) +
    (e-size (env-split d .proj₂) + c'-size (clo-split {x}{σ₁} c .proj₂))
  ∎
```

We also have nullary versions of `clo-split` and `env-split`: whenever a closure is typed to have no behaviours, or an environment consists of closures that have no required behaviours, the sizes of their derivations are zero:

```
c'-empty : ∀ {c} → (d : ⊢c c ⦂' []) → c'-size d ≡ 0
c'-empty nil = refl

e-empty : ∀ {n}{η : env n} → (d : ⊢e η ⦂ empty) → e-size d ≡ 0
e-empty nil = refl
e-empty (d ,- c) =
  begin
    e-size d + c'-size c
  ≡⟨ cong₂ _+_ (e-empty d) (c'-empty c) ⟩
    0
  ∎
```

The final piece we need before proving subject reduction is the following derived typing rule for constructing typings of closures against intersections from typings of terms against intersections, assuming an appropriately typed environment. This uses the `env-split` and `env-permute` lemmas.

```
mk-clo : ∀ {n}{η : env n}{Γ : ctx n}{s : term n}{σ} →
         ⊢e η ⦂ Γ →
         Γ ⊢ s ⦂' σ →
         ⊢c n , η , s ⦂' σ
mk-clo {σ = []} η-ok nil = nil
mk-clo {σ = τ ∷ σ} η-ok (x ,~ p ∷ s-ok) =
  t-clo x (env-split (env-permute η-ok p) .proj₁) ∷
  mk-clo (env-split (env-permute η-ok p) .proj₂) s-ok
```

This derived rule is again size preserving, using the fact that `env-split` and `env-permute` are size preserving:

```
mk-clo-size : ∀ {n}{η : env n}{Γ : ctx n}{s : term n}{σ} →
              (η-ok : ⊢e η ⦂ Γ) → (s-ok : Γ ⊢ s ⦂' σ) →
              size' s-ok + e-size η-ok ≡ c'-size (mk-clo η-ok s-ok)
mk-clo-size {σ = []} η-ok nil = e-empty η-ok
mk-clo-size {σ = τ ∷ σ} η-ok (x ,~ p ∷ s-ok) =
  begin
    size x + size' s-ok + e-size η-ok
  ≡⟨ cong (λ □ → size x + size' s-ok + □) (env-permute-size η-ok p) ⟩
    size x + size' s-ok + e-size (env-permute η-ok p)
  ≡⟨ cong (λ □ → size x + size' s-ok + □) (env-split-size (env-permute η-ok p)) ⟩
    size x + size' s-ok +
      (e-size (env-split (env-permute η-ok p) .proj₁) +
       e-size (env-split (env-permute η-ok p) .proj₂))
  ≡⟨ interchange (size x) (size' s-ok)
                 (e-size (env-split (env-permute η-ok p) .proj₁))
                 (e-size (env-split (env-permute η-ok p) .proj₂)) ⟩
    size x + e-size (env-split (env-permute η-ok p) .proj₁) +
      (size' s-ok + e-size (env-split (env-permute η-ok p) .proj₂))
  ≡⟨ cong (λ □ → size x + e-size (env-split (env-permute η-ok p) .proj₁) + □)
          (mk-clo-size (env-split (env-permute η-ok p) .proj₂) s-ok) ⟩
    size x + e-size (env-split (env-permute η-ok p) .proj₁) +
      c'-size (mk-clo (env-split (env-permute η-ok p) .proj₂) s-ok)
  ∎
```

Using the lemmas in the previous section, we can now prove the subject reduction property for the non-idempotent intersection type system. The proof consists of taking each possible KAM step in turn, pattern matching on the possible starting configurations' typing derivations, and constructing the corresponding typing derivation for the ending configuration:

```
subject-reduction : ∀ {p q τ} → p ⇒ q → ⊢p p ⦂ τ → ⊢p q ⦂ τ
subject-reduction grab (cfg (t-clo (var zero) (_ ,- (c ∷ nil))) π-ok) =
  cfg c π-ok
subject-reduction skip (cfg (t-clo (var (suc x)) (η-ok ,- _)) π-ok) =
  cfg (t-clo (var x) η-ok) π-ok
subject-reduction push (cfg (t-clo (app t s p) η-ok) π-ok) =
  cfg (t-clo t (env-split (env-permute η-ok p) .proj₁))
      (mk-clo (env-split (env-permute η-ok p) .proj₂) s -, π-ok)
subject-reduction pop (cfg (t-clo (lam t) η-ok) (c -, π-ok)) =
  cfg (t-clo t (η-ok ,- c)) π-ok
```

In most of the cases, the final typing derivation is constructed from parts of the initial one. In the application case, we need to use `env-split` to distribute the typing derivation of the environment (`η-ok`) between the closures for the function and its argument.

We now prove that subject reduction is *quantitative*: the typing
derivation generated by subject reduction is always of size one
smaller than the initial one. This is a consequence of the fact that
most of the operations we perform on typing derivations in the subject
reduction proof are size preserving, except for the fact that we
remove the top level term former, which accounts for the decrementing
by one.

```
sr-size : ∀ {p q τ} → (step : p ⇒ q) → (d : ⊢p p ⦂ τ) →
          cfg-size d ≡ suc (cfg-size (subject-reduction step d))
sr-size grab (cfg (t-clo (var zero) (η-emp ,- (c ∷ nil))) π-ok) =
  begin
    suc (e-size η-emp + (c-size c + 0) + s-size π-ok)
  ≡⟨ cong (λ □ → suc (□ + (c-size c + 0) + s-size π-ok)) (e-empty η-emp) ⟩
    suc (0 + (c-size c + 0) + s-size π-ok)
  ≡⟨⟩
    suc ((c-size c + 0) + s-size π-ok)
  ≡⟨ cong (λ □ → suc (□ + s-size π-ok)) (+-identityʳ (c-size c)) ⟩
    suc (c-size c + s-size π-ok)
  ∎
sr-size skip (cfg (t-clo (var (suc x)) (η-ok ,- c')) π-ok) =
  begin
    suc (v-size x + (e-size η-ok + c'-size c') + s-size π-ok)
  ≡⟨ cong (λ □ → suc (v-size x + (e-size η-ok + □) + s-size π-ok)) (c'-empty c') ⟩
    suc (v-size x + (e-size η-ok + 0) + s-size π-ok)
  ≡⟨ cong (λ □ → suc (v-size x + □ + s-size π-ok)) (+-identityʳ (e-size η-ok)) ⟩
    suc (v-size x + e-size η-ok + s-size π-ok)
  ∎
sr-size push (cfg (t-clo (app t s p) η-ok) π-ok) =
  begin
    suc (size t + size' s + e-size η-ok + s-size π-ok)
  ≡⟨ cong (λ □ → suc (size t + size' s + □ + s-size π-ok))
          (env-permute-size η-ok p) ⟩
    suc (size t + size' s + e-size (env-permute η-ok p) + s-size π-ok)
  ≡⟨ cong (λ □ → suc (size t + size' s + □ + s-size π-ok))
          (env-split-size (env-permute η-ok p)) ⟩
    suc (size t + size' s +
          (e-size (env-split (env-permute η-ok p) .proj₁) +
           e-size (env-split (env-permute η-ok p) .proj₂)) +
         s-size π-ok)
  ≡⟨ cong (λ □ → suc (□ + s-size π-ok))
          (interchange (size t) (size' s)
                       (e-size (env-split (env-permute η-ok p) .proj₁))
                       (e-size (env-split (env-permute η-ok p) .proj₂))) ⟩
    suc (size t + e-size (env-split (env-permute η-ok p) .proj₁) +
          (size' s + e-size (env-split (env-permute η-ok p) .proj₂)) +
         s-size π-ok)
  ≡⟨ cong (λ □ → suc (size t + e-size (env-split (env-permute η-ok p) .proj₁) + □ + s-size π-ok))
          (mk-clo-size (env-split (env-permute η-ok p) .proj₂) s) ⟩
    suc (size t + e-size (env-split (env-permute η-ok p) .proj₁) +
         c'-size (mk-clo (env-split (env-permute η-ok p) .proj₂) s) +
         s-size π-ok)
  ≡⟨ cong (λ □ → suc □)
          (+-assoc (size t + e-size (env-split (env-permute η-ok p) .proj₁))
                   (c'-size (mk-clo (env-split (env-permute η-ok p) .proj₂) s))
                   (s-size π-ok)) ⟩
    suc (size t + e-size (env-split (env-permute η-ok p) .proj₁) +
          (c'-size (mk-clo (env-split (env-permute η-ok p) .proj₂) s) + s-size π-ok))
  ∎
sr-size pop (cfg (t-clo (lam t) η-ok) (c -, π-ok)) =
  begin
    suc (size t + e-size η-ok + (c'-size c + s-size π-ok))
  ≡⟨ cong (λ □ → suc □) (+-assoc (size t) (e-size η-ok) (c'-size c + s-size π-ok)) ⟩
    suc (size t + (e-size η-ok + (c'-size c + s-size π-ok)))
  ≡⟨ cong (λ □ → suc (size t + □)) (sym (+-assoc (e-size η-ok) (c'-size c) (s-size π-ok))) ⟩
    suc (size t + ((e-size η-ok + c'-size c) + s-size π-ok))
  ≡⟨ cong suc (sym (+-assoc (size t) (e-size η-ok + c'-size c) (s-size π-ok))) ⟩
    suc (size t + (e-size η-ok + c'-size c) + s-size π-ok)
  ∎
```

When we combine these two lemmas with the progress lemma we prove below, we will be able to prove by induction on the size of the typing derivation that the execution of the KAM terminates. This is the content of the soundness lemma below.

A feature of intersection type systems is that they often enjoy subject expansion as well as subject reduction. The type information is a precise enough description of a term's behaviour that we are always able to reconstruct typing derivations backwards through execution.

To prove subject expansion we will need the inverse to the closure and environment splitting lemmas we proved for subject reduction:

clo-join : ∀ {c σ₁ σ₂} → ⊢c c ⦂' σ₁ → ⊢c c ⦂' σ₂ → ⊢c c ⦂' (σ₁ ++ σ₂)
clo-join {σ₁ = []} nil c₂ = c₂
clo-join {σ₁ = τ ∷ σ₁} (x ∷ c₁) c₂ = x ∷ clo-join c₁ c₂

env-join : ∀ {n} {η : env n} {Γ₁ Γ₂} → ⊢e η ⦂ Γ₁ → ⊢e η ⦂ Γ₂ → ⊢e η ⦂ (Γ₁ +++ Γ₂)
env-join {n = zero} {nil} {nil} {nil} nil nil = nil
env-join {n = suc n} {η ,- c} {Γ₁ ,- σ₁} {Γ₂ ,- σ₂} (e₁ ,- c₁) (e₂ ,- c₂) = env-join e₁ e₂ ,- clo-join c₁ c₂

We also need the fact that every environment is well typed if the types we use make no demands:

env-empty : ∀ {n}{η : env n} → ⊢e η ⦂ empty
env-empty {η = nil} = nil
env-empty {η = η ,- x} = env-empty ,- nil

Together, these allow us to prove that the derived rule for building closures typed with intersection types (`mk-clo`) is invertible:

unmk-clo : ∀ {n}{η : env n}{s : term n}{σ} → ⊢c n , η , s ⦂' σ → Σ[ Γ ∈ ctx n ] (⊢e η ⦂ Γ × Γ ⊢ s ⦂' σ)
unmk-clo {σ = []} nil = empty , env-empty , nil
unmk-clo {σ = τ ∷ σ} (t-clo {Γ = Γ} s-τ η-ok ∷ d) with unmk-clo d
... | Γ' , η-ok' , s-σ = (Γ +++ Γ') , env-join η-ok η-ok' , (s-τ ,~ ⋈ctx-refl ∷ s-σ)

The proof of subject expansion now has a similar structure to the proof of subject reduction:

subject-expansion : ∀ {p q τ} → p ⇒ q → ⊢p q ⦂ τ → ⊢p p ⦂ τ
subject-expansion grab (cfg c-ok π-ok) = cfg (t-clo (var zero) (env-empty ,- (c-ok ∷ nil))) π-ok
subject-expansion skip (cfg (t-clo (var x) x₂) x₁) = cfg (t-clo (var (suc x)) (x₂ ,- nil)) x₁
subject-expansion push (cfg (t-clo {Γ = Γ} t-ok η-ok) (c-ok -, π-ok)) with unmk-clo c-ok
... | Γ' , η-ok' , s-ok = cfg (t-clo (app t-ok s-ok ⋈ctx-refl) (env-join η-ok η-ok')) π-ok
subject-expansion pop (cfg (t-clo t (x₂ ,- x₃)) x₁) = cfg (t-clo (lam t) x₂) (x₃ -, x₁)

As is usual, progress only holds for complete programs. In our case, that means configurations that are typed with the observation `⋆`, meaning that we expect the configuration to terminate with a value. Our formulation of progress is quantitative: rather than examining the configuration `p` to see whether it is a value or not, we rely on measuring the amount of “potential” left in the configuration by measuring its size. If this is non-zero, then the KAM will make progress.

progress : ∀ {p} → (d : ⊢p p ⦂ ⋆) → cfg-size d ≢ 0 → Σ[ q ∈ configuration ] p ⇒ q
progress (cfg (t-clo (var zero) (η-emp ,- (c ∷ _))) π-ok) _ = _ , grab
progress (cfg (t-clo (var (suc x)) (η-ok ,- nil)) π-ok) _ = _ , skip
progress (cfg (t-clo (lam t) η-ok) (c -, π-ok)) _ = _ , pop
progress (cfg (t-clo (app s t p) η-ok) π-ok) _ = _ , push
progress (cfg (t-clo lam⋆ η-ok) emp) r rewrite e-empty η-ok with r refl
... | ()

Putting all our results together, we can now prove that non-idempotent intersection types are a sound and complete characterisation of termination in the CBN λ-calculus, and moreover the size of the typing derivation is exactly the number of steps taken to terminate.
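To make the correspondence concrete, here is a hand-worked trace for the self-application of the identity function. This is only a sketch: the application is written `_·_` below, which may not match the constructor name used earlier in the file, and the trace is worked out by hand rather than machine-checked.

```agda
-- Sketch, not machine-checked: id = ƛ (var zero), a closed identity term.
-- Starting from the initial configuration for id · id, the KAM reaches
-- WHNF in exactly three steps:
--
--   ⟨ 0 , nil , (id · id) ⋆ emp ⟩
--     ⇒ (push)  push a closure for the argument id onto the stack
--     ⇒ (pop)   move that closure into the environment, enter var zero
--     ⇒ (grab)  look up var zero, restarting with the closure for id
--
-- The final configuration holds ƛ (var zero) with an empty stack, i.e.
-- a WHNF. Soundness and completeness then tell us that every derivation
-- d : nil ⊢ id · id ⦂ ⋆ has size d ≡ 3.
```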

We first do soundness: if a closed term `t` has a typing derivation `d` that shows that `⋆` is a possible observation of `t`, then `t` will reduce in the KAM to WHNF in `size d` steps:

soundness : ∀ {t} → (d : nil ⊢ t ⦂ ⋆) → reduces (size d) ⟨ 0 , nil , t ⋆ emp ⟩
soundness {t} d =
  subst (λ □ → reduces □ ⟨ 0 , nil , t ⋆ emp ⟩)
        (init-typed-size d)
        (run (init-typed d) (cfg-size (init-typed d)) refl)
  where
    helper : ∀ {k n} → n ≡ suc k → n ≢ 0
    helper refl ()

    final : ∀ {p} → (d : ⊢p p ⦂ ⋆) → cfg-size d ≡ 0 → reduces 0 p
    final (cfg (t-clo (var zero) _) π-ok) ()
    final (cfg (t-clo (var (suc _)) _) π-ok) ()
    final {⟨ n , η , ƛ t ⋆ emp ⟩} (cfg (t-clo lam⋆ x₁) emp) e = stop η t

    run : ∀ {p} → (d : ⊢p p ⦂ ⋆) → ∀ k → cfg-size d ≡ k → reduces k p
    run d zero e = final d e
    run d (suc k) e with progress d (helper e)
    ... | q , s = step s (run (subject-reduction s d) k (suc-injective (trans (sym (sr-size s d)) e)))

The bulk of the work in this lemma is performed by the `run` function, which uses induction on the size of the typing derivation to repeatedly call `progress` to get the next step, `subject-reduction` to preserve typability, and `sr-size` to show that the size of the derivation goes down.

The converse of soundness is completeness: every term that reduces to WHNF has a typing derivation. This is proved by induction on the reduction sequence: in the base case, all WHNFs are typable by the `lam⋆` rule, and in the step case we use subject expansion. We also know, by soundness and the determinism of the length of reduction sequences, that the computed typing derivation has the same size as the length of the reduction.

completeness : ∀ {t n} → reduces n ⟨ 0 , nil , t ⋆ emp ⟩ → Σ[ d ∈ nil ⊢ t ⦂ ⋆ ] (size d ≡ n)
completeness {t} r = d , reduces-det (soundness d) r
  where
    config-typed : ∀ {p k} → reduces k p → ⊢p p ⦂ ⋆
    config-typed (stop η t) = cfg (t-clo lam⋆ env-empty) emp
    config-typed (step s r) = subject-expansion s (config-typed r)

    extract : ∀ {t} → ⊢p ⟨ 0 , nil , t ⋆ emp ⟩ ⦂ ⋆ → nil ⊢ t ⦂ ⋆
    extract (cfg (t-clo x nil) emp) = x

    d : nil ⊢ t ⦂ ⋆
    d = extract (config-typed r)

Inspecting the proof of subject expansion, we can also see that the computed typing derivation never uses a non-identity permutation, indicating that we might be able to use a variant of non-idempotent intersection types to predict the ordering of usage, as well as quantity of usage.

A possible application of this type system would be to proving that optimisations actually optimise. If we have an optimisation that replaces a term `M` with a term `N`, and for every typing `Γ ⊢ M ⦂ τ` we can get a typing `Γ ⊢ N ⦂ τ` that is smaller, then any context that uses `M` can replace it with `N`, preserving meaning, and take fewer steps to terminate.
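For closed whole programs, that claim could be stated as follows. This is a hypothetical sketch only, not part of the development: `Optimises` is a name invented here, and `_≤_` would need to be imported from `Data.Nat`.

```agda
-- Hypothetical statement, not proved in this development. It specialises
-- the claim above to closed programs observed at ⋆.
Optimises : term 0 → term 0 → Set
Optimises M N =
  (d : nil ⊢ M ⦂ ⋆) → Σ[ d' ∈ nil ⊢ N ⦂ ⋆ ] (size d' ≤ size d)

-- Combined with soundness and completeness, Optimises M N would imply
-- that N reaches WHNF in no more KAM steps than M does.
```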