Well-definedness and observational equivalence for inductive-coinductive programs

Basold, Henning; Hansen, Helle Hvid

DOI: 10.1093/logcom/exv091
Publication date: 2019
Document version: Accepted author manuscript
Published in: Journal of Logic and Computation

Citation (APA): Basold, H., & Hansen, H. H. (2019). Well-definedness and observational equivalence for inductive-coinductive programs. Journal of Logic and Computation, 29(4), 419-468. https://doi.org/10.1093/logcom/exv091


Well-Definedness and Observational Equivalence for Inductive-Coinductive Programs

Henning Basold and Helle Hvid Hansen

Abstract

In this paper, we investigate well-definedness and observational equivalence for programs of mixed inductive and coinductive types. The programs we consider are terms of a variation on the copattern language introduced by Abel et al. These notions are defined by means of test formulae which combine structural congruence for inductive types and modal logic for coinductive types. Tests also correspond to certain evaluation contexts. We define a program to be well-defined if it is strongly normalising under all tests, and two programs are observationally equivalent if they satisfy the same tests.

We show that observational equivalence is strong enough to ensure that least and greatest fixed point types are initial algebras and final coalgebras, respectively. This yields inductive and coinductive proof principles for reasoning about program behaviour. On the other hand, we argue that observational equivalence does not identify too many terms, by showing that tests induce a topology that coincides with the widely accepted topology used to define computable functions on coinductive types as continuous functions.

As one would expect, observational equivalence is, in general, undecidable, but in order to develop some practically useful heuristics we provide coinductive techniques for establishing observational normalisation and observational equivalence, along with up-to techniques for enhancing these methods.

“Everything that is observed in the whole of nature must be able to be deduced from it.”

— B. Spinoza, Principles of Cartesian Philosophy

1 Introduction

There is a growing interest in programming with mixed inductive-coinductive types [2, 3, 14, 16, 17, 20, 21, 33, 34], especially in combination with specifications of programs by means of (co)recursive equations in the style of behavioural differential equations [43]. We address two fundamental questions about such programs: First, when is a program well-defined, i.e., when does it have a well-defined semantics? Second, when are two programs considered behaviourally equivalent? Both questions are particularly important in type-based theorem provers, as well-definedness ensures logical consistency [31], and a strong program equivalence eases reasoning [6].

For programs specified by systems of (co)recursive equations, well-definedness amounts to existence and uniqueness of solutions. For purely corecursive equations well-definedness is often referred to as productivity, and it is usually ensured by requiring that the equations are of a certain syntactic shape. Examples include the (abstract) GSOS format of [46] and the guardedness condition of [5]. Such syntactic conditions are easy to check, and are therefore used in theorem provers like Coq, cf. [17]. The problem, as argued in e.g. [1], is that they are very restrictive and not easy to extend in a compositional manner.

Equivalences of terms in (lazy) functional languages have been considered in e.g. [4, 6, 19, 23, 24, 32, 35, 37]. We discuss their relation to the present work in more detail below. The main difference to mention now is that coinductive types are not considered there. The only work we are aware of on equivalences in type systems with mixed inductive and coinductive types is by Plotkin & Abadi [36] and Hasegawa [22], which use an encoding of fixed point types in the polymorphic λ-calculus. We contribute to this line of research by investigating program equivalence in type systems in which inductive and coinductive types are first-class citizens. To the best of our knowledge, this has not been done before. In this paper, we define notions of productivity and behavioural equivalence in a simply typed, lazy language with least and greatest fixed point types. This language is a variation of the one in [3] where copatterns are introduced to program on infinite data. A copattern equation is essentially a behavioural differential equation which also allows pattern matching.

Our perspective is coalgebraic [42], that is, based on the notion of observable behaviour and bisimilarity. Informally, two programs are behaviourally equivalent if they cannot be distinguished by any observation, and a term is productive if every observation terminates with a well-defined result. We formalise this by defining test formulae that capture sequences of observations on programs. These tests correspond to certain evaluation contexts, and are defined in a manner similar to the testing logic of [44, Sec. 6.2] and the coalgebraic modal logics for Kripke-polynomial functors in [40, 26]. For terms of inductive type, tests inspect terms by matching on constructors and testing subterms. On coinductive types, tests are essentially modal formulas. We call a program observationally normalising (or productive) if it is strongly normalising under all tests, and two programs are observationally equivalent if they satisfy the same tests. For terms of function type, this amounts to requiring that on observationally normalising inputs, the outputs are observationally equivalent.

We argue for the suitability of observational equivalence by showing that identifying observationally equivalent functions yields coinduction and induction principles for reasoning about programs. More precisely, we construct a category T↓ that has types as objects and terms of function type modulo observational equivalence as arrows. On this category we define functors from types such that least and greatest fixed point types are initial algebras and final coalgebras, respectively (Thm. 4.11). Thus observational equivalence is sufficiently strong to capture the relevant coalgebraic and algebraic notions of equivalence. On the other hand, we argue that observational equivalence does not identify too many terms by showing that tests induce a topology (Prop. 4.16) that coincides with the widely accepted topology used to define computable functions on coinductive types as continuous functions. This, together with the categorical properties, provides evidence that observational equivalence is a good semantical notion.

As one would expect, observational equivalence is, in general, undecidable, which we prove by reduction from the Post correspondence problem. An immediate question is whether it becomes decidable over a restricted part of the language, which we answer positively by showing that observational equivalence is decidable over the "data fragment" (i.e., function-free fragment). In order to develop some practically useful heuristics for the full language, we provide coinductive techniques for establishing observational normalisation and observational equivalence. These techniques are obtained by defining a transition system of terms on which observational equivalence coincides with bisimilarity, and observational normalisation is a coinductive predicate. An advantage of these coalgebraic characterisations is, moreover, that they facilitate the use of up-to techniques that lead to a significant simplification of proofs and are compositional. Specifically, we provide up-to convertibility and up-to evaluation techniques, and we demonstrate with an example how up-to techniques can be derived for program definitions in a format that generalises the simple SOS format [29].

Related Work. The idea of categorical data types and unique mapping properties has previously been proposed by, e.g., Hagino [20] and Jacobs [25, Sec. 2.3]. These approaches are centred around postulating the equations that ensure the unique mapping properties. The hard part is to decide term equality induced by these equations. This can be achieved, e.g., by casting the equations into a strongly normalising and confluent rewriting system, cf. Ghani [15] and Curien & Cosmo [11]. In contrast, we get unique mapping properties by taking observational equivalence as equality, and we obtain (co)inductive proof principles for reasoning about term equivalence.

Type systems with inductive and coinductive types have been studied via encoding into the (impredicative) polymorphic λ-calculus, see e.g. Geuvers [14]. Using this approach, Plotkin & Abadi [36] and Hasegawa [22] have shown that parametricity schemes induce an equivalence which yields initial algebras and final coalgebras. We avoid such an encoding into an impredicative language, and reason in the “native” language.

Program equivalence in a dependent type theory is formalised in Altenkirch et al. [6] via a propositional equality type called observational equality. This equality type is defined inductively over types in a manner similar to ours, but they do not need to require (an analogue of) observational normalisation as their term language contains only restricted forms of λ-abstraction and recursion which ensures that all terms are strongly normalising. In particular, their system has no coinductive types. We, on the other hand, have a more flexible term language which allows definitions via systems of copattern equations.

Our notion of observational equivalence is similar to the contextual equivalence of Plotkin [37] and Milner [32], since test formulae can be interpreted as certain program contexts. Similar in spirit is also Zeilberger's observational equivalence [47], as he considers programs to be equivalent if they yield the same result in all environments. Interestingly, he starts by considering only the result that programs diverge, and then observes that, in order to get a useful equivalence, he needs to have a second possible result. This is similar to the fact that our test language has two basic tests, namely the everywhere true and the everywhere false test. Without these, we would get a trivial equivalence as well.

The observational congruence of Pitts [35] characterises contextual equivalence for an extension of PCF as the greatest relation with certain properties. One of these properties (called compatibility) is an inductive predicate on relations, hence observational congruence is different from our coinductive notion of equivalence. We can show that observational equivalence is contained in Pitts' observational congruence when suitably adapted to our setting.

For inductive types and abstractions, observational equivalence is also closely related to the notion of applicative bisimilarity in the lazy λ-calculus by Abramsky [4], which compares terms by reducing to a weak head normal form (with a λ in head position) and comparing the results of function application. It is possible to extend applicative bisimilarity in a natural way to all our terms, but the resulting notion will differ from observational equivalence. In particular, applicative bisimilarity is weaker on diverging terms, as also discussed in [19].

Our work is closest in spirit to the work by Howe [23, 24] and Gordon [19]. Howe [23, 24] studies coinductive proof principles for reasoning about equivalence in a general class of lazy languages with binding, and discusses applications in intuitionistic type theory and the Nuprl theorem prover. Unique mapping properties are not considered. Gordon [19] gives a bisimulation characterisation of contextual equivalence in the language FPC, and studies up-to techniques. Again, coinductive types are not part of FPC, and unique mapping properties are not investigated.

Outline. The rest of the paper is structured as follows. In Section 2, we introduce types, terms and a reduction relation. We show moreover that this reduction relation is confluent, which allows us to define convertibility of terms. In Section 3, we define tests, observational normalisation and observational equivalence, and prove some basic properties. In Section 4, we define the term category T↓, and show that least and greatest fixed point types are initial algebras and final coalgebras, respectively, for functors defined from types, thus obtaining induction and coinduction principles for programs. We also study the properties of the topology associated with observational equivalence. In Section 5, we define the coalgebra of terms on which observational equivalence is bisimilarity and observational normalisation is a coinductive predicate, and we establish various up-to techniques that enhance these coinductive proof principles to become practical and useful. In Section 6, we prove the undecidability of observational equivalence, and decidability over the function-free fragment. We finish with concluding remarks and discuss directions for future research in Section 7.

2 Types and Terms

In this section we introduce our type system, the terms inhabiting these types, and a reduction relation on terms. In Abel et al. [3], a functional language was defined in which copatterns are used to specify coinductive data. Our language is based hereon with one modification: We split fixed point types from sum and product types, which eases the categorical reasoning.

2.1 Type Syntax

We assume given a countable set of type variables X, Y, . . . . Types are defined relative to a type context Ξ which is a finite sequence of type variables that contains all free variables. More precisely, the type syntax is defined by the following rules together with the usual weakening, contraction and exchange rules.

    ──────────         Ξ ` A    Ξ ` B         Ξ ` A    Ξ ` B
     Ξ, X ` X          ───────────────        ───────────────
                          Ξ ` A + B              Ξ ` A × B

    ` A    Ξ ` B          Ξ, X ` A                Ξ, X ` A
    ─────────────         ──────────              ──────────
     Ξ ` A → B            Ξ ` µX.A                Ξ ` ν X.A

A type is said to be closed if it is defined in an empty context, and we will simply write ` A instead of ∅ ` A. The set of all closed types is denoted by Ty, and unless stated otherwise, types are assumed to be closed, which will be the case in most of the paper. The exception is Sec. 4, where we need to make the context explicit.

We refer to sum and least fixed point types as inductive types, and to product, function and greatest fixed point types as coinductive types. Typical examples are the one type 1 := ν X.X, natural numbers Nat := µX.1 + X; streams over A given by Str A := ν X.A × X; and “left fair streams” LFair A B := ν X.µY.(A × X + B × Y ), which are infinite sequences over A interleaved by finite lists over B.
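For readers who prefer a concrete programming notation, the following Haskell sketch (ours, not part of the paper's formal development; Haskell does not separate least from greatest fixed points, so the encodings only approximate the intended (co)inductive readings) mirrors these example types.

    -- Approximate Haskell encodings of the example types (illustration only).
    data Nat = Zero | Succ Nat                         -- Nat   = muX. 1 + X

    data Stream a = Cons { hd :: a, tl :: Stream a }   -- Str A = nuX. A x X

    -- LFair A B = nuX. muY. (A x X + B x Y):
    -- infinitely many A's, interleaved with finite runs of B's.
    data LFair a b     = Out (LFairStep a b)
    data LFairStep a b = Emit a (LFair a b)            -- A x X: produce an A, continue
                       | Delay b (LFairStep a b)       -- B x Y: a finite prefix of B's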

Note that we can only construct strictly positive types by these rules, that is, types where type variables never occur left of an →. There are two reasons for restricting ourselves to strictly positive types. The first is that we can only guarantee the correctness of Def. 3.1 for strictly positive types, not for positive or let alone arbitrary types. The second is that we otherwise could form the type µX.X → X, allowing us to type any λ-term (see [7, Sec. 9.3]). This would cause fixed points to coincide, preventing an interpretation over categories like Set or, more generally, over algebraically non-compact categories.

We consider types to be equal up to α-equivalence, i.e., renaming of bound variables. See [7, Sec. 7] for a discussion of stronger notions of type equality.

2.2 Terms and Type Judgment

Next, we define the syntax and typing of terms, patterns and copatterns. Formal typing judgments are provided in Def. 2.1 below, but we first give an informal description and introduce terminology.

The term language has constructors for inductive types and destructors for coinductive types. For example, κ1, κ2 (to be seen as coprojections) are constructors for the coproduct, and dually, π1, π2 (projections) are destructors of the product. The constructor α should be thought of as the initial algebra α : A[µX.A/X] → µX.A. For example, considering the type Nat, for a term n : Nat, α(κ2 n) : Nat represents the successor n + 1. Dually, we see the destructor ξ as the structure map of the final coalgebra ξ : ν X.A → A[ν X.A/X]. For example, if s : Str A then ξ s : A × Str A represents the pair consisting of the head and tail of s. A variable context Γ consists of a sequence of variable declarations x : A. A declaration block Σ consists of a sequence of declarations of the form f : A = D, which expresses that the symbol f has type A and its behaviour is defined by the declaration body D, where D consists of a sequence of copattern equations q 7→ t. Copatterns q are described below. An rlet-expression rlet Σ in t allows us to use declarations from Σ inside the term t. The name "rlet" should be read as "recursive let", emphasising that the declarations are mutually recursive within one binding block. Finally, our term language contains a generalised form of λ-abstraction λD. What usually would be written as λx.t is expressed in our language as λ{· x 7→ t}.

As usual, patterns are used to define functions on inductive types by pattern matching on the arguments, e.g., as in Haskell. Copatterns, on the other hand, are used to define results of applying destructors. Syntactically, copatterns are certain contexts (i.e., terms with a hole ·). Note that inside a copattern q, we can use patterns p to match on function arguments. The typing judgment Γ `cop q : A ⇒ B of copatterns means that a context that expects a term of type A and matches q results in a term of type B. The definition of a symbol by a mutually recursive set D of copattern equations is closely related to Behavioural Differential Equations [43], which are used to specify elements of final coalgebras. For example, the rule (Ty-D) is used to assign type Str A to the term λ{hd · 7→ t1; tl · 7→ t2} for terms t1 : A and t2 : Str A, using that `cop hd · : Str A ⇒ A and `cop tl · : Str A ⇒ Str A, where hd = π1 ξ and tl = π2 ξ. Another example is given by the canonical term of type 1, which we define as ⟨⟩ := rlet f : 1 = {ξ · 7→ f} in f and which represents a single state with a self-loop. Yet another example, using λ-abstraction, is given by λ{π1 · 7→ t1; π2 · 7→ t2}, which defines the pair (t1, t2).

Definition 2.1. Let Var and Sig be countable sets of term variables and signature symbols. We write

• Γ; Σ ` t : A if t is a well-formed term of type A in the variable context Γ using the declarations in Σ;

• Γ; Σ `bdy D : A if D is a well-formed declaration body of type A in the variable context Γ using the declarations in Σ;

• Σ1 `dec Σ2 if the declaration block Σ2 is well-formed using Σ1;

• Γ `pat p : A if p is a well-formed pattern on the type A that binds variables in Γ;

• Γ `cop q : A ⇒ B if q is a well-formed copattern that binds variables in Γ such that, if a context e matches q, then applying e to a term of type A results in a term of type B.

Well-formed, well-typed terms, and declaration bodies and declaration blocks are given by the rules in Fig. 1 together with the usual weakening, contraction and exchange rules. Well-formed and well-typed (co)patterns are given in Fig. 2.

    x ∈ Var                          f ∈ Sig
    ─────────────────────            ─────────────────────────
    Γ, x : A; Σ ` x : A              Γ; Σ, (f : A = D) ` f : A

    Γ; Σ ` t : Ai   (i = 1, 2)       Γ; Σ ` t : A[µX.A/X]
    ──────────────────────────       ─────────────────────
    Γ; Σ ` κi t : A1 + A2            Γ; Σ ` α t : µX.A

    Γ; Σ ` t : A1 × A2   (i = 1, 2)  Γ; Σ ` t : ν X.A
    ───────────────────────────────  ───────────────────────
    Γ; Σ ` πi t : Ai                 Γ; Σ ` ξ t : A[ν X.A/X]

    Γ; Σ ` t1 : A → B    Γ; Σ ` t2 : A       Γ; Σ `bdy D : A
    ──────────────────────────────────       ───────────────
    Γ; Σ ` t1 t2 : B                         Γ; Σ ` λD : A

    Γi `cop qi : B ⇒ Ai    Γ, Γi; Σ ` ti : Ai    (1 ≤ i ≤ n)
    ──────────────────────────────────────────────────────── (Ty-D)
    Γ; Σ `bdy {q1 7→ t1; . . . ; qn 7→ tn} : B

    ∅; Σ1, Σ2 `bdy D : A  for all (f : A = D) ∈ Σ2      Σ1 `dec Σ2    Γ; Σ1, Σ2 ` t : A
    ───────────────────────────────────────────────     ────────────────────────────────
    Σ1 `dec Σ2                                           Γ; Σ1 ` rlet Σ2 in t : A

Figure 1: Well-formed terms and declaration blocks

Often, we will simplify the notation and write definitions in a Haskell-like style, in that we separate type declaration from the declaration body, and replace the hole · by the symbol we define.


    ────────────────         Γ `pat p : Ai                 Γ `pat p : A[µX.A/X]
    x : A `pat x : A         ─────────────────────         ────────────────────
                             Γ `pat κi p : A1 + A2         Γ `pat α p : µX.A

    ────────────────         Γ1 `cop q : A ⇒ (B → C)    Γ2 `pat p : B
    ∅ `cop · : A ⇒ A         ──────────────────────────────────────────
                             Γ1, Γ2 `cop q p : A ⇒ C

    Γ `cop q : A ⇒ A1 × A2         Γ `cop q : A ⇒ ν X.B
    ──────────────────────         ──────────────────────────
    Γ `cop πi q : A ⇒ Ai           Γ `cop ξ q : A ⇒ B[ν X.B/X]

Figure 2: Well-formed patterns and copatterns

Notation 2.2. We will use the following abbreviations:

Types:
    0 := µX.X        1 := νX.X        Bool := 1 + 1

Terms:
    ⟨⟩ := rlet f : 1 = {ξ · 7→ f} in f
    tt := κ1 ⟨⟩        ff := κ2 ⟨⟩
    0 := α(κ1 ⟨⟩)      n + 1 := α(κ2 n)
    hd t := π1(ξ t)    tl t := π2(ξ t)
    λx.t := λ{· x 7→ t}

(Co)Patterns:
    0 := α(κ1 )        n + 1 := α(κ2 n)
    hd q := π1(ξ q)    tl q := π2(ξ q)

Example 2.3. We start with a very simple example. The stream ones = (1, 1, 1, . . .) is defined as the unique solution to the following stream differential equation [43]:

hd(x) = 1, tl(x) = x. We define the stream ones as the following term:

rlet f : Str Nat = { hd · 7→ 1; tl · 7→ f } in f

Example 2.4. As an example of a mixed inductive-coinductive definition, we define a function H of type A with A = Str Nat → Str Nat → Str Nat that maps streams s and t to a stream r with r(n) = s( Σ_{i=0}^{n} t(i) ). The formal definition of H is,

    H := rlet g : A = { · s t 7→ f (π1(ξ t)) s t }
              f : Nat → A = { π1(ξ(· α(κ1 x) s t)) 7→ π1(ξ s)
                              π2(ξ(· α(κ1 x) s t)) 7→ g s (π2(ξ t))
                              (· α(κ2 n) s t) 7→ f n (π2(ξ s)) t }
         in g

In the Haskell-like notation, the example becomes

    H : A
    H s t = f (hd t) s t

    f : Nat → A
    hd (f 0 s t) = hd s
    tl (f 0 s t) = H s (tl t)
    f (n + 1) s t = f n (tl s) t

The combination of patterns and copatterns is demonstrated in the declaration of f, which uses pattern matching on the first argument, followed by specifying ⟨hd, tl⟩ (f 0 s t).
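The same definition can be transcribed almost literally into Haskell (a sketch of ours, reusing the Stream encoding with selectors hd and tl; the head and tail equations for f 0 s t are combined into a single Cons, and Int stands in for Nat).

    -- Transcription of Example 2.4 (sketch; assumes n >= 0).
    data Stream a = Cons { hd :: a, tl :: Stream a }

    h :: Stream Int -> Stream Int -> Stream Int
    h s t = f (hd t) s t

    f :: Int -> Stream Int -> Stream Int -> Stream Int
    f 0 s t = Cons (hd s) (h s (tl t))   -- hd (f 0 s t) = hd s ; tl (f 0 s t) = h s (tl t)
    f n s t = f (n - 1) (tl s) t         -- f (n+1) s t  = f n (tl s) t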

2.3 Reduction Relation

As in [2, 3], the operational semantics of a program is obtained by applying its defining equations D as rewrite rules. More precisely, we define a reduction relation based on contraction of terms in context.

Definition 2.5. Evaluation contexts e are defined by the following grammar:

    e ::= · | e t | πi e | ξ e

For a term r we write e[r] for plugging r into the hole · of e, yielding a term. If e[r] is typeable for any r of type A, we say that e is a context on type A.
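To make the shape of evaluation contexts concrete, here is a small Haskell sketch (ours, over a deliberately simplified and untyped term syntax; the Term constructors are not the paper's) of contexts and the plugging operation e[r].

    -- Sketch: evaluation contexts and plugging over a simplified term syntax.
    data Term = Var String | App Term Term | Proj Int Term | Xi Term
              | LambdaD [(String, Term)]     -- crude stand-in for lambda D
      deriving Show

    -- e ::= . | e t | pi_i e | xi e
    data ECtx = Hole | EApp ECtx Term | EProj Int ECtx | EXi ECtx

    -- plug e r computes e[r]: fill the hole of e with the term r
    plug :: ECtx -> Term -> Term
    plug Hole        r = r
    plug (EApp e t)  r = App (plug e r) t
    plug (EProj i e) r = Proj i (plug e r)
    plug (EXi e)     r = Xi (plug e r)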

These evaluation contexts allow us to define when a rewrite rule can fire. Given a copattern q and a substitution σ, we can obtain an evaluation context q[σ] by substituting σ(x) for a free variable x with unique occurrence in q. We say that a copattern q matches an evaluation context e, if there is a substitution σ with q[σ] = e. We use matching to define contraction of terms.

Definition 2.6. Given a term t : A and an evaluation context e, we say that the term e[t] contracts to a term t′ using declarations in Σ, if e[t] ≻Σ t′ can be derived using the following rules:

    qi[σ] = e    (qi 7→ ti) ∈ D          e[λD] ≻Σ t′    (f : A = D) ∈ Σ
    ───────────────────────────          ───────────────────────────────
    e[λD] ≻Σ ti[σ]                       e[f] ≻Σ t′


Note that contraction only happens at the top-level of a term, and that Σ is only necessary in the second rule, where a symbol is contracted.

Example 2.7. Let Σ = {g : A1 = D1, f : A2 = D2} be the declaration block used to define H in Ex. 2.4, and for u, v : Str Nat let e1 = · u v 0 and e2 = hd e1 be evaluation contexts applicable to H. Observe that e1 neither asks for the head nor for the tail of f u v 0. This is where the laziness of the reduction relation lies: e1[f] does not reduce since e1[λD2] does not. On the other hand, we have hd(· s t 0)[u/s, v/t] = e2 and hd(· s t 0) 7→ hd s ∈ D2, thus e2[λD2] ≻Σ hd u and e2[f] ≻Σ hd u.

Using contraction, we define a reduction relation −→Σ for each Σ as the union of reduction relations −→kΣ, k ∈ N, where −→kΣ is a relation on terms of rlet-nesting depth k. More precisely, as in [2, 3], −→kΣ is the compatible closure of ≻kΣ outside of declaration blocks, ≻0Σ = ≻Σ, and ≻k+1Σ is defined inductively by

    t −→kΣ1,Σ2 t′
    ─────────────────────────────────
    rlet Σ2 in t ≻k+1Σ1 rlet Σ2 in t′

    rlet Σ2 in (α t) ≻k+1Σ1 α(rlet Σ2 in t)        rlet Σ2 in (κi t) ≻k+1Σ1 κi(rlet Σ2 in t)

    ξ(rlet Σ2 in t) ≻k+1Σ1 rlet Σ2 in (ξ t)        πi(rlet Σ2 in t) ≻k+1Σ1 rlet Σ2 in (πi t)

    (rlet Σ2 in t) s ≻k+1Σ1 rlet Σ2 in (t s)

Terms of the form rlet Σ in t without further declarations in t can be translated into the language of [3] in a way that preserves the reduction relation; we therefore inherit that −→0Σ preserves types. Moreover, we get that a reduction step is always possible on non-normalised terms, if they are well-covering, a notion we briefly describe now.

A sequence Q of copatterns covers a type A, written A C| Q, if the copatterns in Q are exhaustive and non-overlapping, that is, if for every (large enough) evaluation context e there is exactly one copattern in Q that matches e. For example, the three copatterns in Ex. 2.4 cover the type Str Nat. Non-examples are the sequence (hd · ; tl · ; hd ·), as these copatterns are overlapping, and (hd(· 0) ; tl(· 0)), as this sequence is not exhaustive. The covering relation is formally introduced in Appendix A, and is based on that in [3, Sec. 5.2].

We call a term t or a declaration block Σ well-covering, if for every declaration body {q17→ t1; . . . ; qn7→ tn} of type A in t or Σ, we have A C| (qi)i=1,...,n.

Definition 2.8. For a declaration block Σ and a type A, we denote by ΛΣ(A) the set of all closed, well-covering terms t such that ∅; Σ ` t : A. We denote by ΛΣthe union of all ΛΣ(A) for A ∈ Ty. If Σ is empty, we omit Σ and simply write Λ(A) and Λ.

Using the notion of well-covering, we can ensure confluence, even in the presence of (co)pattern matching. We prove this using the work by Cirstea and Faure [10], which provides conditions on pattern matching algorithms that ensure confluence. Their last condition, called H2, is the content of the following lemma, which we state here as we will use it later.

Lemma 2.9. Let A be a type, Q a set of typed copatterns with A C| Q and e an evaluation context on A. If there is a q ∈ Q and a σ with q[σ] = e and e −→ e′, then there is a σ′ with q[σ′] = e′ and σ −→ σ′, where the reduction relation is lifted to substitutions point-wise.

Theorem 2.10. On ΛΣ(A), −→Σ is confluent for any well-covering Σ and, in particular, −→ is confluent on Λ(A).

Proof. See Appendix A.

We finish this section with some basic facts about convertibility and strong normalisation. We define convertibility of terms, as usual, by

    t1 ≡ t2  iff  ∃t3. t1 −→∗ t3 and t2 −→∗ t3.

We note that confluence ensures that ≡ is transitive, hence an equivalence relation. A term t is strongly normalising, denoted t ↓, if there is no infinite reduction sequence starting at t. The set of all strongly normalising terms is denoted by SN. Well-covering terms of inductive type in SN enjoy the crucial property that they can be reduced to (weak) head normal form (WHNF), i.e., to a form with a constructor in head position, see [3]. Concretely, for t ∈ Λ(A) there is a t′ with t ≡ κi t′ if A = B1 + B2, and t ≡ α t′ if A = µX.B.

3 Observational Equivalence and Observational Normalisation

In this section, we define formally a notion of observational equivalence between terms using test formulae. Test formulae are typed and defined inductively in a manner closely related to the testing logic considered in [44, Sec. 6.2], and the many-sorted coalgebraic modal logics for polynomial functors studied, e.g., in [26] and [30, Sec. 2.1.2]. Just as predicates on a set X can be seen as functions X → 2, tests on type A are given an interpretation as terms of type A → Bool. Tests on inductive types inspect terms by matching on the (weak) head normal form, whereas tests on coinductive types can be thought of as modal formulae. In particular, tests on a function type amount to feeding an argument to a function and testing the result. Here, in order to guarantee that tests always yield a result, we must require that the function arguments are strongly normalising in every evaluation context, a property we call observational normalisation, and which can be understood as a formal definition of productivity in our language.

Definition 3.1 (Observational normalisation and tests). Let A ∈ Ty, i.e., A is a closed, strictly positive type. We simultaneously define tests on A, their interpretation as terms of type A → Bool, and the set of observationally normalising terms ONA of type A.

For a type A, we say that ϕ is a test (formula) on A if we can derive ϕ : A with the rules given below. The set of all test formulae on A is denoted by TestsA.

    ─────       ─────       ϕ1 : A1    ϕ2 : A2       ϕ : A[µX.A/X]
    ⊤ : A       ⊥ : A       ───────────────────      ──────────────
                            [ϕ1, ϕ2] : A1 + A2       α⁻¹ ϕ : µX.A

    ϕ : A[ν X.A/X]       ϕ : Ai                   ϕ : B    v ∈ ONA
    ──────────────       ─────────────────        ────────────────
    [ξ] ϕ : ν X.A        [πi] ϕ : A1 × A2         [v] ϕ : A → B


The interpretation of tests on a type A as terms is given by the map ⟦−⟧A : TestsA → Λ(A → Bool), which is defined inductively as follows.

    ⟦⊤⟧A = λx.tt                                                  for all A ∈ Ty
    ⟦⊥⟧A = λx.ff                                                  for all A ∈ Ty
    ⟦[ϕ1, ϕ2]⟧A1+A2 = λ{κ1 x 7→ ⟦ϕ1⟧A1(x) ; κ2 x 7→ ⟦ϕ2⟧A2(x)}
    ⟦α⁻¹ ϕ⟧µX.A = λ(α x).⟦ϕ⟧A[µX.A/X](x)
    ⟦[ξ] ϕ⟧ν X.A = λx.⟦ϕ⟧A[ν X.A/X](ξ x)
    ⟦[πi] ϕ⟧A1×A2 = λx.⟦ϕ⟧Ai(πi x)
    ⟦[v] ϕ⟧A→B = λf.⟦ϕ⟧B(f v)

Finally, we define the set ONA of observationally normalising terms of type A by

    ONA = {t : A | ∀ϕ : A. (⟦ϕ⟧ t) ∈ SN}.
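To give a concrete feel for tests and their interpretation as Boolean-valued functions, the following Haskell sketch (ours) specialises the test language to streams of naturals, collapsing [π1][ξ] and [π2][ξ] into head and tail modalities as the paper does; non-termination of the evaluator on a non-productive stream is exactly where observational normalisation matters.

    -- Sketch: tests on Str Nat, interpreted as Boolean-valued functions.
    data Stream a = Cons { hd :: a, tl :: Stream a }

    data NatTest = NTop | NBot | IsZero | IsSucc NatTest   -- tests on Nat (here Int)
    data StrTest = STop | SBot | Hd NatTest | Tl StrTest   -- [hd] phi and [tl] phi

    evalNat :: NatTest -> Int -> Bool
    evalNat NTop       _ = True
    evalNat NBot       _ = False
    evalNat IsZero     n = n == 0
    evalNat (IsSucc p) n = n > 0 && evalNat p (n - 1)

    evalStr :: StrTest -> Stream Int -> Bool
    evalStr STop   _ = True
    evalStr SBot   _ = False
    evalStr (Hd p) s = evalNat p (hd s)    -- may diverge if s is not productive
    evalStr (Tl p) s = evalStr p (tl s)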

We illustrate with an example how observational normalisation should be seen as a formal definition of productivity.

Example 3.2. Let x and even be given by

    x : Str Nat
    hd x = 1
    tl x = even x

    even : Str Nat → Str Nat
    hd (even s) = hd s
    tl (even s) = even (tl (tl s))

It is known that x is not productive. To see that x is not observationally normalising, we apply the evaluation context e = hd(tl(tl ·)) to x and find that

    e[x] −→ hd (tl (even x)) −→ hd (even (tl (tl x))) −→ hd (tl (tl x)) = e[x].

Hence there is a diverging reduction sequence starting at e[x], thus x ∉ ON.
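The same behaviour can be reproduced in Haskell (a sketch of ours, using the Stream encoding from before): forcing hd (tl (tl x)) loops, mirroring the cyclic reduction sequence above.

    -- Sketch: the non-productive definition of Example 3.2 in Haskell.
    data Stream a = Cons { hd :: a, tl :: Stream a }

    x :: Stream Int
    x = Cons 1 (evens x)                        -- hd x = 1 ; tl x = evens x

    evens :: Stream a -> Stream a
    evens s = Cons (hd s) (evens (tl (tl s)))   -- hd/tl equations of `even`

    -- hd (tl (tl x)) does not terminate:
    --   tl (tl x) = tl (evens x) = evens (tl (tl x)) = ...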

Due to the mutual dependency between tests and observationally normalising terms, it is perhaps not immediately clear that their definitions are correct. In order to convince the reader, we show that the induction in the definition of tests in Def. 3.1 is well-founded. We define the order ord(A) of a type A inductively as follows.

    ord(X) = 0
    ord(A1 ◦ A2) = max{ord(A1), ord(A2)},     ◦ ∈ {+, ×}
    ord(ρX.A) = ord(A),                       ρ ∈ {µ, ν}
    ord(A → B) = max{ord(A) + 1, ord(B)}
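A direct transcription of ord into Haskell, over a small datatype of strictly positive types (the Ty constructors below are ours), might look as follows.

    -- Sketch: the order of a type, following the equations above.
    data Ty = TVar String
            | Sum Ty Ty | Prod Ty Ty | Fun Ty Ty
            | Mu String Ty | Nu String Ty

    ord :: Ty -> Int
    ord (TVar _)   = 0
    ord (Sum a b)  = max (ord a) (ord b)
    ord (Prod a b) = max (ord a) (ord b)
    ord (Mu _ a)   = ord a
    ord (Nu _ a)   = ord a
    ord (Fun a b)  = max (ord a + 1) (ord b)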

The key property of strictly positive types is that the order of a fixed point type does not increase when unfolding.

Lemma 3.3. Let A be a type with a free type variable X that occurs only in strictly positive position. We have:

ord(A[ρX.A/X]) ≤ ord(ρX.A), ρ ∈ {µ, ν}.

Proof. Let ord(A) = n, and B be a type such that X occurs only strictly positively and ord(B) ≤ n. We prove by induction in B that ord(B[ρX.A/X]) ≤ n.


In the base case B = Y, we get ord(B[ρX.A/X]) = ord(ρX.A) = n if Y = X and ord(B[ρX.A/X]) = ord(B) = 0 ≤ n if Y ≠ X.

For the induction step:

• If B = B1 ◦ B2, then ord(Bi) ≤ max{ord(B1), ord(B2)} = ord(B) ≤ n and the induction hypothesis applies. Thus ord(Bi[ρX.A/X]) ≤ n, hence ord(B[ρX.A/X]) = max{ord(B1[ρX.A/X]), ord(B2[ρX.A/X])} ≤ n.

• If B = ρY.C for ρ ∈ {µ, ν}, then ord(B[ρX.A/X]) = ord(C[ρX.A/X]) ≤ n, since ord(C) = ord(B) ≤ n and the induction hypothesis applies again.

• The interesting case is B = C → D. Since X occurs only strictly positively, we have that C[ρX.A/X] = C, thus ord(C[ρX.A/X]) = ord(C) ≤ ord(B) − 1. By induction, we have that ord(D[ρX.A/X]) = k ≤ n as in the other cases and, by combining these arguments, we find ord(B[ρX.A/X]) = max{ord(C) + 1, k} ≤ n.

Thus, by induction, we have ord(B[ρX.A/X]) ≤ n for any B with ord(B) ≤ n = ord(A) and X occurring only in strictly positive position. Using B = A, we get the desired result.

The order of types allows us to stratify the inductive definition in Def. 3.1. By definition of ord and Lem. 3.3, we have that tests used in the premises of the rules in Def. 3.1 are on types of order less than or equal to that of the types used in the conclusion. In particular, for tests on function types A → B, we have that ord(A) < ord(A → B), allowing us to define tests on A → B since TestsA and ONA are fully defined already. Thus, the induction in Def. 3.1 is well-founded.

To ease readability, we use the notation ON = ⋃A∈Ty ONA. Note that ON ⊆ SN by definition, and so all terms in ON have a WHNF.

Remark 3.4. We could have called terms in ON persistently strongly normalising, in analogy to the similar notion of typed λ-calculus [7, Sec. 17.2], but we favour the name observationally normalising since it fits our setting better.

Definition 3.5 (Test satisfaction, observational equivalence). Let t ∈ Λ(A). We say that t satisfies a test ϕ : A, if t ⊨A ϕ holds, given as follows.

    t ⊨A ⊤                      always holds
    t ⊨A ⊥                      never holds
    t ⊨A1+A2 [ϕ1, ϕ2]           iff ∃t′. t ≡ κi t′ and t′ ⊨Ai ϕi
    t ⊨µX.B α⁻¹ ϕ               iff ∃t′. t ≡ α t′ and t′ ⊨B[µX.B/X] ϕ
    t ⊨ν X.B [ξ] ϕ              iff ξ t ⊨B[ν X.B/X] ϕ
    t ⊨A1×A2 [πi] ϕ             iff πi t ⊨Ai ϕ
    t ⊨B→C [v] ϕ                iff t v ⊨C ϕ.

Two terms t1, t2 ∈ Λ(A) are observationally equivalent, written t1 ≡Aobs t2, if they satisfy the same tests:

    t1 ≡Aobs t2  iff  ∀ϕ : A. t1 ⊨A ϕ ⇔ t2 ⊨A ϕ.

If ti= rlet Σ in ti for some Σ, then we define


For a context Γ, we say that a substitution σ is ON-Γ-closing, if dom(Γ) ⊆ dom(σ) and for every x ∈ dom(Γ), σ(x) : Γ(x) is a closed ON-term. We denote by Subst(Γ) the set of all ON-Γ-closing substitutions. We say that two open terms t1, t2 with Γ ` t1, t2 : A are observationally equivalent, if for all ON-Γ-closing σ we have that t1[σ] ≡Aobs t2[σ], and denote this by Γ ` t1 ≡Aobs t2. Analogously, we define ONAΓ to be the set of all terms t such that t[σ] ∈ ONA for any ON-Γ-closing substitution σ.

In what follows, we will frequently omit the type sub- and superscripts and simply write ⊨, ⟦−⟧ and ≡obs when the typing can be inferred from the context.

Remark 3.6. We have given an interpretation of tests on A as terms of type A → Bool; this is similar to the notion of observation from λ-calculus [7, 3.5], but there observations are any terms of function type, whereas we restrict to special terms in order to identify more terms. Moreover, we can restrict to Boolean-valued observations, since one can show that allowing any type B as observation output yields the same notion of observational equivalence. This connection with λ-calculus is one reason for the name "observational equivalence"; another is that our notion is an instance of the coalgebraic notion of observable behaviour, as we will see in Sec. 5.

The requirement in Def. 3.1 that arguments to functions are closed and observationally normalising is crucial, since without it we can distinguish terms that should be equated. For example, let f = λx. α x and g = λ{α x 7→ x} : µX.A → A[µX.A/X], and let Ω be the divergent term Ω = rlet ω : µX.A = {· 7→ ω} in ω (of type µX.A). This term does not have a WHNF, hence f (g Ω) ≡ α(g Ω) ≢ Ω ≡ id Ω. This means that, if we allowed Ω as a function argument, the test [Ω] ⊤ would distinguish f ◦ g and id, thus f ◦ g ≢obs id. However, we want f : A[µX.A/X] → µX.A to be an initial algebra and so f has to be the inverse of g modulo observational equivalence. Hence we cannot allow terms like Ω as arguments.
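A rough Haskell analogue (ours, with roll/unroll for a recursive data type) of this phenomenon: on a divergent argument, the composite of roll and unroll is strictly more defined than the identity, which is what a test with a divergent argument would observe.

    -- Sketch: why divergent arguments must be excluded.
    data Mu f = In (f (Mu f))      -- roll:   f = \x -> In x
    unroll :: Mu f -> f (Mu f)     -- unroll: g (In x) = x
    unroll (In x) = x

    omega :: Mu f
    omega = omega                  -- the divergent term Omega

    isRolled :: Mu f -> Bool       -- "has a constructor in head position?"
    isRolled (In _) = True

    -- isRolled (In (unroll omega))  evaluates to True  (In _|_ is in WHNF)
    -- isRolled (id omega)           diverges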

A pair of terms that is, rightfully, distinguished by observational equivalence is H const0 and H const1, where consta is the stream everywhere equal to a. For example, the formula [const0] [hd] ϕ=1, where ϕ=1 : Nat tests for equality to 1, distinguishes them. On the other hand, proving that two terms are observationally equivalent is typically done by induction on tests.

The following lemma shows that the definition of observational equivalence of open terms makes sense.

Lemma 3.7. For all x : A ` ti : B with i = 1, 2, we have

    x : A ` t1 ≡Bobs t2  ⇔  λx.t1 ≡A→Bobs λx.t2.     (1)

The next lemma states a number of fundamental properties that we need for showing that types and terms indeed form a category in the next section.

Lemma 3.8. Observational equivalence ≡obs has the following properties.

(i) Substitutivity: if Σ ` t1 ≡Aobs t2 and Σ; x : A ` r : B with t1, t2 ∈ ONA and r ∈ ONBx:A, then Σ ` r[t1/x] ≡Bobs r[t2/x].

(ii) ≡obs is an equivalence relation.

(iii) ≡obs is a congruence on closed terms in ON.


(v) ≡obs strictly contains convertibility (i.e., ≡ ⊂ ≡obs).

(vi) ≡obs implies extensionality for terms of function type: if t1, t2 : A → B and t1 v ≡obs t2 v for all v : A in ON, then t1 ≡obs t2.

(vii) ≡obs contains η-equivalence: λx.t x ≡obs t, x ∉ fv(t).

Most of these properties are straightforward to prove; only substitutivity (i) requires a bit more work. The proof uses the rather technical Lem. 3.9 below, which is formulated more generally than needed in the proof of Lem. 3.8, because its proof then becomes easier and we can reuse it later in Sec. 4.2. To ease the formulation of the lemma, we use conjunctions of tests. Given tests ϕ1, ϕ2 : A, their conjunction ϕ1 ∧ ϕ2 is satisfied by t : A if t ⊨ ϕi for both i = 1 and i = 2.

Lemma 3.9. Let f ∈ ONAΓ and let τ ∈ Subst(Γ). For all tests ϕ : A with f[τ] ⊨ ϕ there are conjunctive tests {ψx}x∈dom(Γ) such that

(i) τ(x) ⊨ ψx for all x ∈ dom(Γ) and

(ii) for all σ ∈ Subst(Γ), if σ(x) ⊨ ψx for all x ∈ dom(Γ), then f[σ] ⊨ ϕ.

Proof. We only sketch the proof. Since f is in ONAΓ, ⟦ϕ⟧(f[τ]) is strongly normalising, thus there is a finite reduction sequence ⟦ϕ⟧(f[τ]) −→∗ N to a normal form N. We obtain the family {ψx} by induction in this reduction sequence. In this induction, two cases have to be distinguished: either τ(x) is contracted within an evaluation context e, in which case ψx is extended by the modalities given by e, or τ(x) is used as a function argument in a contraction. In the latter case, we reduce the corresponding function to λD or a symbol g, and extend ψx by the pattern of D or g that matched τ(x). Note that λD and g are similar to what Abramsky [4] calls principal weak head normal forms.

Proof of Lemma 3.8. (i) To show that r[t1/x] ≡obs r[t2/x], we show that both terms simultaneously satisfy any test ϕ : B.

First, assume that r[t1/x] ⊨ ϕ. Lemma 3.9 gives us that for τ = [t1/x] there is a test ψ : A, using the context Γ = x : A, such that t1 ⊨ ψ and, if t2 ⊨ ψ, then r[t2/x] ⊨ ϕ. Since t1 ≡obs t2, both terms satisfy the same tests, thus t2 satisfies ψ and r[t2/x] ⊨ ϕ.

Vice versa, r[t2/x] ⊨ ϕ implies, in the same way, that r[t1/x] ⊨ ϕ. Summarising, the terms r[t1/x] and r[t2/x] satisfy the same tests, hence are observationally equivalent.

(ii) This follows immediately from ≡ being an equivalence.

(iii) Since we are only interested in closed terms, a context is just a term with a free variable, hence (i) and (ii) apply.

(iv) This is just the combination of (i) and Lem. 3.7.

(v) The inclusion is proved by induction in tests and (ii). It is strict by, e.g., item (vi).

(vi) Trivially by the shape of tests on A → B.

(vii) Let t : A → B be a term with x ∉ fv(t). For every v ∈ ONA, we have (λx.(t x)) v ≡ t v and hence by (v) and (vi) the equivalence λx.(t x) ≡obs t follows.


Finally, we note that ≡obs is not trivial, that is, the equational theory ≡obs is consistent according to [8, Def. 2.1.30].

Proposition 3.10. The equational theory ≡obs is consistent.

Proof. The terms tt, ff : Bool are not observationally equivalent, as they are distinguishable by the test [⊤, ⊥]. Hence ≡obs is consistent.

4 Semantic Properties of Observational Equivalence

We now investigate the semantic implications of taking observational equivalence as equality on terms.

In the first part, we show that by identifying observationally equivalent function terms, we obtain (co)inductive proof principles for reasoning about programs. These proof principles are organised in a category T↓ in which the objects are types and the arrows are (well-formed), observationally normalising terms of type A → B modulo observational equivalence. We show that this category has coproducts, is Cartesian closed, and that µ-types are initial algebras and ν-types are final coalgebras. The proof principles are thus given by the unique mapping properties of the corresponding structures. In the last part, we show that tests induce a topology on T↓ which makes functions continuous, and in the case of streams we recover the standard prefix topology.

4.1 Category of Observationally Equivalent Terms

We start by briefly recalling the basic definitions of (co)algebras and the unique mapping properties of initial algebras and final coalgebras. For more details, we refer to [42, 27, 14]. Let C be a category, and F : C → C an endofunctor. An F-algebra is a C-arrow a : F A → A in C. A homomorphism of F-algebras from a : F A → A to a′ : F A′ → A′ is a C-arrow f : A → A′ such that a′ ◦ F(f) = f ◦ a. An initial F-algebra is an F-algebra β : F(µF) → µF with the property that for every F-algebra a : F A → A there is a unique homomorphism ā from β to a. We call ā the inductive extension of a.

Dually, an F-coalgebra is a C-arrow c : X → F X, and a homomorphism of F-coalgebras from c : X → F X to c′ : X′ → F X′ is a C-arrow f : X → X′ such that f ◦ c = c′ ◦ F(f). A final F-coalgebra ω : νF → F(νF) is such that for any F-coalgebra c : C → F(C), there is a unique homomorphism c̃ from c into ω, called the coinductive extension of c.

These unique mapping properties give rise to proof principles. For initial algebras, we get the familiar induction principle. Dually, the coinduction principle can be used to show that two functions h1, h2, whose codomain is the carrier of a final coalgebra, are equal by showing that they are homomorphisms, as illustrated in the following diagram.

    [Diagram: c : X → F X, ω : νF → F(νF), h1, h2 : X → νF; the square ω ◦ hi = F(hi) ◦ c commutes for i = 1, 2.]

The coalgebra c can be seen as a system of corecursive equations, and the diagram then states that h1 and h2 are both solutions to these equations. In our setting, h1 and h2 are programs that define some computation. For example, h1 could be an abstract description and h2 a concrete implementation. In our category, equality will be observational equivalence, and hence we can use the coinduction principle to show that the implementation is observationally equivalent to the specification.
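As a rough Haskell illustration (ours, not from the paper; Haskell cannot enforce the uniqueness part of the principle), the coinductive extension of a stream coalgebra c is the usual unfold, and any h satisfying the homomorphism equations computes the same stream.

    -- Sketch: streams as a final coalgebra of F X = A x X, and the
    -- coinductive extension (anamorphism) of a coalgebra c : X -> (A, X).
    data Stream a = Cons { hd :: a, tl :: Stream a }

    unfold :: (x -> (a, x)) -> x -> Stream a
    unfold c x = let (a, x') = c x in Cons a (unfold c x')

    -- Example: c n = (n, n + 1).  Any h with hd (h n) = n and
    -- tl (h n) = h (n + 1) computes the same stream as `unfold c`.
    nats :: Int -> Stream Int
    nats = unfold (\n -> (n, n + 1))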


We first note that it is possible to define a basic category with types as objects and function terms modulo observational equivalence as arrows. However, this category will not have the desired properties. In order to ensure the existence of coproducts and initial algebras, we must restrict arrows to observationally normalising terms. This restriction is possible since one can easily check that observationally normalising function terms are closed under composition. Moreover, in order to obtain an initial object, we must extend our notion of well-covering to include so-called trivial types, as we explain now.

A good candidate for the initial object is 0 = µX.X with the term λ{} : µX.X → A (abstraction without cases) as the unique map for any object A. However, the term λ{} is not well-covering by the definition of the covering relation in Def. A.1. The reason is that, if we allowed the term λ{} : A → B for arbitrary types A, then, given a term t : A, the application (λ{}) t cannot be reduced to a WHNF and the computation would get stuck. However, if we can ensure that there are no closed terms of type A, then this is not a problem, as the application (λ{}) t could never occur. This is captured by the following definition of trivial types.

Definition 4.1. A type A is called trivial, if A = µX.X or, for trivial types A1, A2, C, if A = A1 × A2, A = A1 + A2, A = B → C or A = µY.C.

The notion of a trivial type is known from domain theoretic models for λ-calculus with fixed point types, where the denotation of trivial types is isomorphic to that of 1 [7, Sec. 10.3]. This is reflected by Lem. 4.2.(i).

Lemma 4.2. Let A be a trivial type. Then

(i) |Λ(A)/≡obs| = 1, and

(ii) Λ(A) ∩ ON = ∅.

The second part of this lemma gives us that there are no observationally normalising terms of trivial type. Hence, since we restrict ourselves in the following to terms in ON, we can safely allow the term λ{} without losing the existence of WHNFs, and thus extend the covering relation from Sec. 2 such that A C| ∅ whenever A is a trivial type. With this definition, λ{} : µX.X → A is well-covering.

We can now finally define our category of interest.

Definition 4.3. Let Λ(A) denote the set of all closed terms of type A that are well-covering with respect to the extended well-covering relation. For A ∈ Ty, we let

T (A) = (Λ(A) ∩ ON)/≡obs.

We define T↓ to be the category in which objects are types from Ty and arrows are given by HomT↓(A, B) = T(A → B). The identity arrows are the equivalence classes of idA = λx.x and composition is given by composition of representatives: g ◦ f = λx.g(f x).

Note that we can find a WHNF for terms in Λ(A) ∩ ON, since, by Lem. 4.2.(ii), there is no term in ON to which λ{} can be applied. Also, note that composition is well-defined due to Lem. 3.7.

Notation 4.4. In what follows, we will implicitly pick representatives for the observational equivalence classes in T (A). For example, when we say that a term t : C → D is an arrow in T↓, we mean the observational equivalence class of which t is a representative.


Theorem 4.5. The category T↓ has all finite coproducts and is Cartesian closed.

Proof. The final object in T↓ is given by the type 1 = ν X.X and for any type A, the final arrow !A : A → 1 is !A = λx.⟨⟩. Uniqueness is ensured as all terms of type 1 are observationally equivalent to ⟨⟩.

The binary product of types A1, A2 is the type A1 × A2 together with the projections πi : A1 × A2 → Ai, i = 1, 2. If we are given arrows fi : B → Ai for i = 1, 2, we define the product arrow ⟨f1, f2⟩ : B → A1 × A2 to be

    ⟨f1, f2⟩ = λx.λ{π1 · 7→ f1 x ; π2 · 7→ f2 x}.

This arrow clearly fulfils πi ◦ ⟨f1, f2⟩ ≡ λx.πi(⟨f1, f2⟩ x) ≡ λx.(fi x) ≡obs fi. Moreover, it is unique with this property modulo observational equivalence, i.e., if f : B → A1 × A2 is such that πi ◦ f ≡obs fi, then f ≡obs ⟨f1, f2⟩:

    f ≡obs λx.f x
      ≡obs λx.λ{π1 · 7→ π1(f x) ; π2 · 7→ π2(f x)}
      ≡obs λx.λ{π1 · 7→ (π1 ◦ f) x ; π2 · 7→ (π2 ◦ f) x}
      ≡obs λx.λ{π1 · 7→ f1 x ; π2 · 7→ f2 x}
      = ⟨f1, f2⟩

We take 0 = µX.X as initial object; for any type A, the initial arrow !A : 0 → A is !A = λ{}. Since there are no observationally normalising terms of type 0 (by Lem. 4.2.(ii)), the only tests on 0 → A are ⊤ and ⊥, hence all terms of type 0 → A are observationally equivalent, and thus !A is unique.

Next, the binary coproduct of A1, A2 is given by A1 + A2 with inclusions κi : Ai → A1 + A2, i = 1, 2. For terms fi : Ai → B we can form the case distinction [f1, f2] : A1 + A2 → B by [f1, f2] = λ{κ1 x 7→ f1 x ; κ2 x 7→ f2 x}, which factors through the inclusions: [f1, f2] ◦ κi ≡obs fi. By normalisation we find that [f1, f2] is again unique with this property. To this end, let O ∈ Obs(A1 + A2 → B) be any observation, then

    O f ≡ O′ (f u)                    O′ ∈ Obs(B), u : A1 + A2 closed, u ∈ ON
        ≡ O′ (f (κi u′))              u′ : Ai, by normalisation
        ≡ O′ ((f ◦ κi) u′)            λx.O′ (x u′) ∈ Obs(Ai → B)
        ≡ O′ (fi u′)
        ≡ O′ (([f1, f2] ◦ κi) u′)
        ≡ O′ ([f1, f2] u)
        ≡ O [f1, f2]

So f ≡obs [f1, f2] and hence A1 + A2 is the coproduct of A1, A2.

Finally, we show that T↓ is Cartesian closed. The exponential functor (−)B : T↓ → T↓ is, as expected, given by CB = B → C and tB = λf.t ◦ f. We need to show (−) × B ⊣ (−)B, which we do by giving a natural isomorphism ρ : Hom((−) × B, (−)) ≅ Hom((−), (−)B). We define ρA,C : Hom(A × B, C) → Hom(A, CB) to be ρA,C(f) = λa.λb.f (a, b), and its inverse to be ρ⁻¹A,C(g) = λx.g (π1 x) (π2 x).

These are well-defined mappings in the sense that f ≡obs f′ implies that ρA,C(f) ≡obs ρA,C(f′), and the same for ρ⁻¹, hence these maps respect equivalence classes.

Naturality of ρ is given as follows. Let t1 : A′ → A and t2 : C → C′; we need to show that ρA′,C′(Hom(t1 × B, t2)(f)) = Hom(t1, tB2)(ρA,C(f)). Unfolding the definitions, we get

    Hom(t1, tB2)(ρA,C(f)) = Hom(t1, tB2)(λa.λb.f (a, b))
                          = (λh.t2 ◦ h) ◦ (λa.λb.f (a, b)) ◦ t1
                          ≡ λx. (λh.t2 ◦ h)((λa.λb.f (a, b)) (t1 x))
                          ≡ λx. (λh.t2 ◦ h)(λb.f (t1 x, b))
                          ≡ λx. t2 ◦ (λb.f (t1 x, b))
                          ≡ λx.λy. t2 (f (t1 x, y))

and

    ρA′,C′(Hom(t1 × B, t2)(f)) = ρA′,C′(t2 ◦ f ◦ (t1 × idB))
                               = λa.λb.(t2 ◦ f ◦ (t1 × idB)) (a, b)
                               ≡ λa.λb. t2 (f (t1 a, b))

which are the same modulo renaming. Thus ρ is a natural transformation.

It remains to be proved that ρ and ρ⁻¹ are indeed inverses of each other. The direction ρA,C ◦ ρ⁻¹A,C = id is easy:

    ρA,C(ρ⁻¹A,C(g)) = λa.λb.ρ⁻¹A,C(g) (a, b)
                    ≡ λa.λb.g (π1 (a, b)) (π2 (a, b))
                    ≡ λa.λb.g a b
                    ≡obs g

Showing ρ⁻¹A,C ◦ ρA,C = id is slightly more complicated. First, we notice that for all u ∈ ONA×B we have that (π1 u, π2 u) ≡obs u. This implies that for all f ∈ ONA×B→C we get f (π1 u, π2 u) ≡obs f u by substitutivity (Lemma 3.8), which in turn implies that λx.f (π1 x, π2 x) ≡obs f. This gives us

    ρ⁻¹A,C(ρA,C(f)) = λx.ρA,C(f) (π1 x) (π2 x)
                    ≡ λx.f (π1 x) (π2 x)
                    ≡obs f

by the above discussion. Thus ρ⁻¹A,C ◦ ρA,C = id and each ρA,C is an isomorphism.

Since ρ⁻¹ is the inverse of ρ, it is natural as well, thus (−) × B is left-adjoint to (−)B and T↓ is Cartesian closed.

Next we prove that we can define for each type A, satisfying certain conditions, a functor FA on T↓, and that fixed point types are initial algebras or final coalgebras for such functors. These two results are proved by mutual induction, since initial algebras and final coalgebras are used to define functors from fixed point types, and conversely, functoriality is needed to obtain initial algebras and final coalgebras for smaller types.

First, we define FA on objects, which is independent of the mutual induction.

Definition 4.6 (Functors from types, on objects). For a type A in a context X1, . . . , Xn, we define a map on objects of product categories

    FA : (T↓)n → T↓,    FA(C1, . . . , Cn) = A[C1/X1, . . . , Cn/Xn].


Recall that arrows (C1, . . . , Cn) → (D1, . . . , Dn) in a product category are tuples (t1, . . . , tn) with ti : Ci → Di. We denote such a tuple by t⃗ and, analogously, we denote objects by C⃗ = (C1, . . . , Cn). If n = k + 1 and C⃗ = (C1, . . . , Ck), then we denote by FA^C⃗ : T↓ → T↓ the mapping FA^C⃗(D) = FA(C⃗, D) = FA[C⃗/X⃗](D). Moreover, we use the following notation:

    µFA^C⃗ = µX.A[C⃗/X⃗],      αFA^C⃗ = λx. α x : FA^C⃗(µFA^C⃗) → µFA^C⃗,
    νFA^C⃗ = ν X.A[C⃗/X⃗],     ξFA^C⃗ = λx. ξ x : νFA^C⃗ → FA^C⃗(νFA^C⃗).

Finally, if k = 0, then we drop the superscript C⃗ and simply write FA.

We start the mutual induction by defining the action of FA on arrows.

Lemma 4.7 (Functors from types, on arrows). For A a type in a context X1, . . . , Xn, we define FA on an arrow t⃗ : C⃗ → D⃗ inductively as follows.

    FXi(t1, . . . , tn) = ti
    FA+B(t⃗) = FA(t⃗) + FB(t⃗)
    FA×B(t⃗) = FA(t⃗) × FB(t⃗)
    FA→B(t⃗)(f) = FB(t⃗) ◦ f
    FµX.A(t⃗) = (αFA^D⃗ ◦ FA(t⃗, idµFA^D⃗))‾
    Fν X.A(t⃗) = (FA(t⃗, idνFA^C⃗) ◦ ξFA^C⃗)~

The bar and tilde superscripts denote inductive and coinductive extensions with respect to αFA^C⃗ and ξFA^D⃗, respectively.

Proof. We proceed by induction in A. In the base case A = Xi, we note that FXi is the ith projection from the product category, hence a functor.

To prove the induction step, Thm. 4.5 is used to show the functor laws for the compound types A + B, A × B and A → B. The case for the fixed point types was essentially proved in e.g. [28]. We provide the details for convenience. Consider the fixed point type ν X.A. By induction hypothesis, FA : (T↓)n → T↓ is a functor. By Lem. 4.9 (see below), the functors FA^C⃗ and FA^D⃗ have as final coalgebras ξFA^C⃗ and ξFA^D⃗, and we take Fν X.A(t⃗) to be the coinductive extension of

    νFA^C⃗ ──ξFA^C⃗──→ FA(C⃗, νFA^C⃗) ──FA(t⃗, id)──→ FA(D⃗, νFA^C⃗).

The functor laws follow from uniqueness, see e.g. [28]. The case µX.A is treated analogously.

Remark 4.8. Def. 4.6 and Lem. 4.7 can be proved more generally for types A in which variables Xi occur in either negative position or in positive position. However, since we restrict to strictly positive types, the current formulation suffices.

The following lemma is needed in the induction step for fixed points in the proof of Lem. 4.7.

Lemma 4.9. For any type A with a single free variable X, if the associated FA defined in Def. 4.6 and Lem. 4.7 is a functor, then αFA is the initial algebra and ξFA the final coalgebra of FA.

Proof. Clearly, the algebra and coalgebra structures have the correct types FA(µX.A) → µX.A and ν X.A → FA(ν X.A), respectively. Well-covering and observational normalisation are evident, as well. For terms a : FA C → C and c : C → FA C, we define the terms

    ā : µFA → C                       c̃ : C → νFA
    ā (α x) = (a ◦ FA ā) x            ξ (c̃ x) = (FA c̃ ◦ c) x.

Clearly, both ā and c̃ are closed, typeable and well-covering. With a bit of effort, one can prove that they are also in ON, hence they represent arrows in T↓.
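In Haskell-like notation (a sketch of ours, using generic fixed points Mu and Nu; the names fold and unfold are ours), the copattern equations defining ā and c̃ are the familiar fold and unfold.

    -- Sketch: the inductive extension (bar a) and coinductive extension
    -- (tilde c) of Lemma 4.9, written as fold/unfold over generic fixed points.
    data Mu f = In (f (Mu f))            -- alpha : F (mu F) -> mu F
    data Nu f = Nu { out :: f (Nu f) }   -- xi    : nu F -> F (nu F)

    -- bar a (alpha x) = (a . F (bar a)) x
    fold :: Functor f => (f c -> c) -> Mu f -> c
    fold a (In x) = a (fmap (fold a) x)

    -- xi (tilde c x) = (F (tilde c) . c) x
    unfold :: Functor f => (c -> f c) -> c -> Nu f
    unfold c x = Nu (fmap (unfold c) (c x))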

We must show that ā and c̃ are well-defined, i.e., that they are independent of the choice of representative, and that they are in fact (co)inductive extensions. We achieve this by proving that the terms ā and c̃ fulfil the homomorphism equations, and then applying Lem. 4.10 (below).

First, c̃ is an FA-coalgebra homomorphism since

    ξFA ◦ c̃ ≡ λx. ξ(c̃ x) ≡ λx.(FA c̃ ◦ c) x ≡ λx.(FA c̃)(c x) ≡ FA c̃ ◦ c.

For any other representative c′ with c′ ≡obs c, we get that c̃′ is a homomorphism as well, which we use to show that c̃′ ≡obs c̃ as follows. For any non-trivial test ψ : C → ν X.A, we must have that ψ = [v] [ξ] ψ′ for some v : C and test ψ′ : FA(ν X.A). Applying Lem. 4.10.1 to c and the test ϕ = [v] ψ′ : C → FA(ν X.A), we obtain a term t : C → FA(ν X.A) such that t ⊨ ϕ ⇔ ξFA ◦ c̃ ⊨ ϕ ⇔ c̃ ⊨ ψ and t ⊨ ϕ ⇔ ξFA ◦ c̃′ ⊨ ϕ ⇔ c̃′ ⊨ ψ. Since this holds for any ψ, we have c̃′ ≡obs c̃.

In the same way, we show that c̃ is unique up to observational equivalence, by applying Lem. 4.10.1 to any other FA-coalgebra homomorphism f from c to ξFA. Note that in this case the choice of representative for f does not matter.

Analogously, ā is well-defined and an FA-algebra homomorphism by an application of Lem. 4.10.2. So ā and c̃ are the unique extensions of a and c, respectively. Hence αFA is the initial algebra and ξFA is the final coalgebra of FA.

The following lemma is used in the proof of Lem. 4.9.

Lemma 4.10. Let A be a type with a single free variable X and assume that FA as defined in Def. 4.6 and Lem. 4.7 is a functor.

1. For every FA-coalgebra term c : C → FA C and test ϕ : C → FA(ν X.A), there is a term t : C → FA(ν X.A), such that for any FA-coalgebra homomorphism f : C → ν X.A from c to ξFA, we have t ⊨ ϕ ⇔ ξFA ◦ f ⊨ ϕ.

2. For every FA-algebra term a : FA C → C and test ϕ : FA(µX.A) → C, there is a term t : FA(µX.A) → C, such that for every FA-algebra homomorphism f : µX.A → C from αFA to a, we have t ⊨ ϕ ⇔ f ◦ αFA ⊨ ϕ.

Proof. We only sketch the proof for the first item. Since f is an FA-coalgebra homomorphism, we have ξFA ◦ f ⊨ ϕ ⇔ FA f ◦ c ⊨ ϕ, hence it suffices to prove the existence of a t with t ⊨ ϕ ⇔ FA f ◦ c ⊨ ϕ. We do this by proving the following claim.

For any subexpression (or ingredient) B of A, any term c′ : C → FB C and any test ψ : C → FB(ν X.A) there is a t : C → FB(ν X.A), such that for any FA-coalgebra homomorphism f from c to ξFA the equivalence t ⊨ ψ ⇔ FB f ◦ c′ ⊨ ψ holds. The result then follows by taking c′ = c and ψ = ϕ. The claim can be proved by induction in ψ.


From Lem. 4.7 and Lem. 4.9 the main result of this section follows.

Theorem 4.11. For all types A with a single free variable, the functor FA has an initial algebra with carrier µX.A, and a final coalgebra with carrier ν X.A.

We demonstrate Lem. 4.7 and Lem. 4.9 on a mixed fixed point type.

Example 4.12. Recall the type LFair A B = ν X.µY.A × X + B × Y. The type L = A × X + B × Y with free variables X, Y induces a functor FL : T↓ × T↓ → T↓ by FL(C, D) = A × C + B × D. Since for any type C, the type µY.A × C + B × Y is the carrier of an initial algebra of FL^C = FL(C, −) (cf. Lem. 4.9), the construct FµY.L as defined in Lem. 4.7 is a functor. Now, we find by Lem. 4.9 that FµY.L has a final coalgebra, whose carrier is LFair A B.

We illustrate coinduction as proof principle by an elaborate example.

Example 4.13. In Ex. 2.4 we defined a map H : Str Nat → Str Nat → Str Nat of mixed inductive-coinductive type. We also gave a direct definition by explicitly indexing into streams. In this example, we show that both definitions give rise to the same map.

The basic step is to make the coinduction principle applicable. Since streams are a final coalgebra, we redefine H to H1 : Str Nat × Str Nat → Str Nat by H1 x = H (π1 x) (π2 x), using that T↓ is Cartesian closed. Now the codomain of H1 is a final coalgebra and the corresponding coinduction principle is applicable.

Let us give the formal definition of the version of H that uses explicit indexing. To this end, we define a higher derivative on streams.

  ∂ : Nat → Str A → Str A
  ∂ 0 s = s
  ∂ (k + 1) s = ∂ k (tl s)

Moreover, we need a map that sums the first k entries of a stream:

  Σ≤ : Nat → Str Nat → Nat
  Σ≤_0 s = 0
  Σ≤_{k+1} s = (hd s) + (Σ≤_k (tl s)).

Using these maps, we can define the alternative version of H with explicit indexing by:

  H2 : Str Nat × Str Nat → Str Nat
  H2 (s, t) = toStr (g s t)

  g : Str Nat → Str Nat → (Nat → Nat)
  g s t n = hd (∂ (Σ≤_{n+1} t) s)

This uses one part of the isomorphism A^Nat ≅ Str A:

  toStr : (Nat → A) → Str A
  hd (toStr h) = h 0
  tl (toStr h) = toStr (λn. h (n + 1)).
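
For readers who prefer executable notation, the following Haskell sketch mirrors these definitions. It is only an illustration under our own names (Stream, deriv, sumLeq, toStr, g, h2) and uses Int in place of Nat.

  data Stream a = Cons { hd :: a, tl :: Stream a }

  -- Higher derivative: drop the first k elements of a stream.
  deriv :: Int -> Stream a -> Stream a
  deriv 0 s = s
  deriv k s = deriv (k - 1) (tl s)

  -- Sum of the first k entries of a stream of numbers.
  sumLeq :: Int -> Stream Int -> Int
  sumLeq 0 _ = 0
  sumLeq k s = hd s + sumLeq (k - 1) (tl s)

  -- One half of the isomorphism (Nat -> a) ~ Stream a.
  toStr :: (Int -> a) -> Stream a
  toStr h = Cons (h 0) (toStr (\n -> h (n + 1)))

  -- The indexing-based definition.
  g :: Stream Int -> Stream Int -> Int -> Int
  g s t n = hd (deriv (sumLeq (n + 1) t) s)

  h2 :: (Stream Int, Stream Int) -> Stream Int
  h2 (s, t) = toStr (g s t)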

With all this set up, we can show that H1 ≡_obs H2. This is where the universal property of the final coalgebra comes into play: we exhibit a coalgebra c such that both H1 and H2 are coalgebra homomorphisms from c into the final coalgebra. Hence, by uniqueness, they must be equal.

We use a coalgebra on the type C = Str Nat × Str Nat given by c : C → Nat × C with c(s, t) = (hd r, (r, tl t)), where r = ∂ (hd t) s.
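
Continuing in the same spirit, the coalgebra c and the unique map it induces into the final coalgebra of streams can be sketched in Haskell as follows, again with our own names cOf, unfold and hViaUnfold, and repeating the Stream and deriv definitions so that the fragment is self-contained. By finality, any coalgebra homomorphism from c, in particular H1 and H2, must coincide with the unfolding of c.

  data Stream a = Cons { hd :: a, tl :: Stream a }

  deriv :: Int -> Stream a -> Stream a
  deriv 0 s = s
  deriv k s = deriv (k - 1) (tl s)

  -- The (Nat x -)-coalgebra on C = Stream Int x Stream Int, following
  -- c(s, t) = (hd r, (r, tl t)) with r = deriv (hd t) s.
  cOf :: (Stream Int, Stream Int) -> (Int, (Stream Int, Stream Int))
  cOf (s, t) = let r = deriv (hd t) s in (hd r, (r, tl t))

  -- The unique coalgebra homomorphism into the final coalgebra of streams.
  unfold :: (c -> (a, c)) -> c -> Stream a
  unfold step x = let (a, x') = step x in Cons a (unfold step x')

  -- By finality, both versions of H must be observationally equal to:
  hViaUnfold :: (Stream Int, Stream Int) -> Stream Int
  hViaUnfold = unfold cOf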

First, we show that H1 is a coalgebra homomorphism. To do this, we need the following intermediate result: for all n, s and t we have that

  f 0 (∂ n s) t ≡ f n s t,    (2)

which is proved by induction on n. Using this result, we can show ξ ◦ H1 ≡_obs (id × H1) ◦ c by a simple calculation. Letting r = ∂ (hd t) s, we have

  (id × H1)(c(s, t)) ≡ (id × H1)(hd r, (r, tl t))
                     ≡ (hd r, H1(r, tl t))
                     ≡ (hd r, H r (tl t))          Def. H1
                     ≡_obs ξ(f 0 r t)              Def. f (both cases for 0 combined)
                     ≡ ξ(f (hd t) s t)             by (2)
                     ≡ ξ(H s t)                    Def. H
                     ≡ ξ(H1(s, t))

Hence, ξ ◦ H1 ≡_obs (id × H1) ◦ c by extensionality (Lem. 3.8.(vi)), and H1 is a homomorphism from c to ξ.

To show that H2 is a homomorphism as well, we again need two intermediate results:

  hd r ≡ hd(H2 s t)                   (3)
  H2(r, tl t) ≡_obs tl(H2 s t)        (4)

First, we establish the equation for the head of H2, that is, equation (3):

  hd(H2 s t) ≡ hd(toStr(g s t))
             ≡ g s t 0
             ≡ hd (∂ (Σ≤_1 t) s)
             ≡ hd (∂ (hd t) s)        by hd t + 0 ≡ hd t
             ≡ hd r

To establish (4), we need two further intermediate results. The first shows that ∂ is linear in its first argument; it can be proved by induction on n.

  ∂ (n + m) s ≡ ∂ m (∂ n s)           (5)

The second result is concerned with the evaluation of g at n + 1:

  g s t (n + 1) ≡ hd (∂ (Σ≤_{n+2} t) s)
                ≡ hd (∂ ((hd t) + Σ≤_{n+1} (tl t)) s)       Def. of sum
                ≡ hd (∂ (Σ≤_{n+1} (tl t)) (∂ (hd t) s))     by (5)
                ≡ g (∂ (hd t) s) (tl t) n                   (6)

Using these two equations, we can establish (4):

  tl(H2 s t) ≡ tl(toStr(g s t))
             ≡ toStr(λn. g s t (n + 1))
             ≡ toStr(λn. g (∂ (hd t) s) (tl t) n)           by (6)
             ≡_obs toStr(g (∂ (hd t) s) (tl t))             by extensionality
             ≡ H2(∂ (hd t) s, tl t)
             ≡ H2(r, tl t)

This concludes the intermediate results, and we can finally show that H2 is a homomorphism:

  (id × H2)(c(s, t)) ≡ (id × H2)(hd r, (r, tl t))
                     ≡ (hd r, H2(r, tl t))
                     ≡ (hd(H2 s t), H2(r, tl t))            by (3)
                     ≡_obs (hd(H2 s t), tl(H2 s t))         by (4)
                     ≡_obs ξ(H2 s t)

So ξ ◦ H2 ≡_obs (id × H2) ◦ c follows from extensionality.

Summarising, both H1 and H2 are homomorphisms from c to ξ, and thus, by uniqueness, they are observationally equivalent. Hence, up to currying, H is observationally equivalent to H2, which is given by explicit indexing.

4.2 Topological Properties

In this section we show that tests induce, in a natural way, a topology such that arrows in T↓ are continuous. This result relates our work to the constructive view that computable functions are continuous functions, cf. [12]. We also show that on streams and other coinductive types, this topology coincides with the commonly used prefix-induced topology. This tells us that observational equivalence does not identify too much.

Definition 4.14 (Topology on T(A)). Given a term t ∈ T(A) and a test ϕ : A, we define

  U_ϕ(t) = {t' ∈ T(A) | t ⊨ ϕ ⇔ t' ⊨ ϕ}.

The topology Θ_A on T(A) is generated by the subbase

  U(A) = {U_ϕ(t) | ϕ : A, t ∈ T(A)}.

Note that the definition of U_ϕ(t) is independent of the choice of representative t, by the definition of ≡_obs.

We give a topological interpretation of T↓ by defining a functor F : T↓ → Top to the category of topological spaces and continuous maps as follows:

  F(A) = (T(A), Θ_A)            for all types A ∈ Ty,
  F(t) = t̂ : T(A) → T(B)        for all t ∈ T(A → B),        (7)

where t̂ evaluates t on arguments, that is, for s ∈ T(A), t̂(s) = t s. Note that t̂ is well-defined due to Lem. 3.7, and F preserves identity and composition since id s ≡ s and (t2 ◦ t1) s ≡ t2 (t1 s) = t̂2(t̂1(s)). It remains to show that for terms t, t̂ is continuous.

Lemma 4.15. For any t ∈ T(A → B), the map t̂ : T(A) → T(B) is continuous with respect to Θ_A and Θ_B.

Proof. Let U = U_ϕ(t0) ∈ U(B) be a subbasic open set. We show that for each s in t̂←(U), the t̂-preimage of U, there is an open set V_s ⊆ t̂←(U) containing s. It follows that t̂←(U) = ⋃_s V_s is open, hence t̂ is continuous.

We have s ∈ t̂←(U) iff t s ∈ U_ϕ(t0) iff U = U_ϕ(t s). By letting f = t x, we have that f ∈ ON_B^{x:A} and f[s/x] ⊨ ϕ, and we obtain from Lem. 3.9 a conjunctive test ψ : A such that s ⊨ ψ, and for all s' ∈ ON_A, s' ⊨ ψ implies that f[s'/x] ⊨ ϕ. In terms of open sets, this implies that U_ψ(s) ⊆ t̂←(U). Furthermore, U_ψ(s) is open since it is the intersection of the subbasic opens U_{ψ_i}(s) for the conjuncts ψ_i of ψ, and hence continuity of t̂ follows by taking V_s = U_ψ(s).

Proposition 4.16. The map F defined in (7) is a functor F : T↓ → Top.

We finish the topological investigation by relating the topology Θ_A to classical topologies. To this end, let us, for a term s : A and a type B, denote by ŝ application to s, that is, ŝ : (A → B) → B with ŝ = λf. (f s). We characterise the topology Θ_A as "extreme" for the corresponding canonical maps.

Theorem 4.17. The topologies Θ_{A1×A2}, Θ_{ν X.A} and Θ_{A→B} are initial with respect to {π̂1, π̂2}, ξ̂ and {ŝ | s ∈ ON_A}, respectively (that is, the coarsest topology making these maps continuous). On the other hand, Θ_{A1+A2} and Θ_{µX.A} are final (i.e., the finest topology) for {κ̂1, κ̂2} and α̂, respectively.

As an example, let A be a type such that Θ_A is discrete, for example A = Nat. It is easy to see that in this case the topology on Str A is induced by the usual prefix metric, given by d(x, y) = 2^{−k} with k = min{n | hd (tl^n x) ≢ hd (tl^n y)}.
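
As a small executable illustration of this metric (not part of the paper's development), the following Haskell sketch approximates d(x, y) on streams over a type with decidable equality. The names Stream and prefixDist are ours, and the search is cut off at a given depth so that the function also terminates on streams that agree everywhere up to that depth.

  data Stream a = Cons { hd :: a, tl :: Stream a }

  -- prefixDist bound x y approximates d(x, y) = 2^(-k), where k is the first
  -- position at which x and y differ; if no difference is found within
  -- 'bound' positions, 0 is returned as an approximation of the distance.
  prefixDist :: Eq a => Int -> Stream a -> Stream a -> Double
  prefixDist bound = go 0
    where
      go k x y
        | k >= bound   = 0
        | hd x /= hd y = 2 ** negate (fromIntegral k)
        | otherwise    = go (k + 1) (tl x) (tl y)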

5 Proof Techniques

In this section, we present coinductive methods for proving that two terms are observationally equivalent, and for proving that a term is observationally normalising. We do so by defining a transition system on terms such that observational equivalence coincides with bisimilarity; hence observational equivalence of two terms can be proved by establishing a bisimulation relation containing them. On the same transition system, we show that observational normalisation is a coinductive predicate, so that it can be proved by establishing a subset of strongly normalising terms that is closed under transitions. Moreover, we provide a number of up-to techniques that enhance these proof methods.

5.1 Terms as Transition System

We define a transition structure in which the successors of a term t are the terms (modulo reductions) that a test can inspect in order to determine whether it is satisfied by t. For example, a term t of type C = A1 + A2 can either be reduced to a term of the form κi t' for some i and term t' : Ai, or t does not have a WHNF. In the first case, the successors of t are all those t' such that t ↠ κi t' (these are the terms that will be inspected by tests). In the second case, no further inspection is possible, and there is no outgoing transition from t.

Formally, the transition structure on terms is a coalgebra in the category of families of sets indexed by the types of our type system. Let I be a set; the category Set^I has as objects families of sets X = {X_i}_{i∈I} indexed by I, and as morphisms f : X → Y families of functions {f_i : X_i → Y_i}_{i∈I}.

The branching type is given by a functor F : Set^Ty → Set^Ty such that, for C ∈ Ty, F(X)_C consists of sets indexed by the types of subterms that can be inspected by tests on type C. Formally, F is defined by:

  F(X)_C = ∐_{i∈{1,2}} P(X_{Ai}) + {∗}      if C = A1 + A2
           P(X_{A[µX.A/X]}) + {∗}           if C = µX.A
           ∏_{i∈{1,2}} P(X_{Ai})            if C = A1 × A2
           P(X_{A[ν X.A/X]})                if C = ν X.A
           P(X_B)^{ON_A}                    if C = A → B

where P(−) is the covariant powerset functor, and for a set U we denote by (−)^U the function space functor. F acts on morphisms in the obvious way. The component for coproducts was explained above; the other components can be understood similarly, following Def. 3.5.

Let Λ = {Λ(A)}_{A∈Ty} and define the coalgebra δ : Λ → F(Λ) on terms by

  δ_{A1+A2}(t)    = ι_i({t' : Ai | t ↠ κi t'})          if ∃t'. t ↠ κi t'
                  = ∗                                    otherwise
  δ_{µX.A}(t)     = {t' : A[µX.A/X] | t ↠ α t'}          if ∃t'. t ↠ α t'
                  = ∗                                    otherwise
  δ_{A1×A2}(t)(i) = {t' : Ai | πi t ↠ t'}
  δ_{ν X.A}(t)    = {t' : A[ν X.A/X] | ξ t ↠ t'}
  δ_{A→B}(t)(u)   = {t' : B | t u ↠ t'}

Note that δ is well-defined on terms of sum type by confluence.

Example 5.1. We give two examples of the structure δ, the first for an inductive and the second for a coinductive type. Since δ is a transition system, we use the common way of displaying such systems by nodes and arrows. A node is labelled by the term it represents. Transition arrows carry a label that indicates how the successor was obtained.

1. We denote by n the representation of a natural number as a term, that is, 0 = α(κ1 ⟨⟩) and n + 1 = α(κ2 n). We display below some of the transitions starting in 1.

   [Transition diagram: 1 --α--> κ2 0 --κ2--> 0 --α--> κ1 ⟨⟩ --κ1--> ⟨⟩, with a ξ-labelled transition at ⟨⟩.]

Of course, there are also terms of type Nat that do not have a WHNF. For example, let Σ be the definition block Σ = (f : Nat = {· ↦ f}), in which case we have δ_Nat(rlet Σ in f) = ∗.

2. The second example is H. We denote by o, z : Str Nat the streams that are constantly 1 and 0, respectively. We show a part of δ starting at H where o and z are used as arguments.


   [Transition diagram: starting from H, application transitions lead via H o to H o z; hd- and tl-labelled transitions then pass through terms such as f 0 o z and H o (tl z), and through the streams o, z and nats.]

Remark 5.2. Note that F only distinguishes between inductive and coinductive types:

  F(X)_C ≅ ∐_{i∈{1,2}} P(X_{Ai}) + {∗}             if C = A1 + A2
           ∐_{i∈{∗}} P(X_{A[µX.A/X]}) + {∗}        if C = µX.A
           ∏_{i∈{1,2}} P(X_{Ai})                   if C = A1 × A2
           ∏_{i∈{∗}} P(X_{A[ν X.A/X]})             if C = ν X.A
           ∏_{i∈ON_A} P(X_B)                       if C = A → B

This is interesting because it makes clear how the duality between inductive and coinductive types arises in the term coalgebra: observations on inductive types are given by reduction to WHNF, if possible, followed by removing the constructor in head position. Observations on coinductive types, on the other hand, are simply given by applying the destructors of the corresponding type.

5.2 Observational Equivalence as Bisimilarity

In this section, we establish that observational equivalence is the largest bisimulation on δ. We use the definition of bisimulation in terms of relation lifting, see e.g. [45]. For a family X ∈ Set^Ty, we denote by Rel(X) the poset category of relation families R = {R_A}_{A∈Ty} with R_A ⊆ X_A × X_A, where the order is given by: R ⊑ R' iff R_A ⊆ R'_A for each A ∈ Ty. The lifting of F at Λ is defined as the functor F̄ : Rel(Λ) → Rel(F(Λ)), where F̄(R) is the image of the map ⟨F(π1), F(π2)⟩ : F(R) → F(Λ) × F(Λ), and π_i : R → Λ, i = 1, 2, are the projections of R. That is, for all (U1, U2) ∈ F(Λ)_A × F(Λ)_A,

  (U1, U2) ∈ F̄(R)_A ⇔ ∃U ∈ F(R)_A : U_k = (F π_k)_A(U) for k = 1, 2.

A bisimulation on δ is a relation R ∈ Rel(Λ) such that R ⊑ Φ(R), where Φ(R) = (δ × δ)^{−1}(F̄(R)), i.e.,

  Φ(R)_A = {(t1, t2) ∈ Λ(A)² | (δ_A(t1), δ_A(t2)) ∈ F̄(R)_A}.

We can now formulate the main result of this section.

Theorem 5.3. The largest bisimulation on δ is ≡_obs.
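
To convey the flavour of the resulting proof technique in executable form, here is a Haskell sketch (our own illustration, with names Stream, bisimStep and bisimUpTo) that checks a stream bisimulation up to a finite depth. The paper's technique is, of course, a proof principle on the term coalgebra δ, not a terminating algorithm, so a bounded check can refute bisimilarity but only gives finite evidence for it.

  data Stream a = Cons { hd :: a, tl :: Stream a }

  type Rel a = Stream a -> Stream a -> Bool

  -- One bisimulation step: related streams must have equal heads and
  -- tails that are again related.
  bisimStep :: Eq a => Rel a -> Rel a
  bisimStep r s t = hd s == hd t && r (tl s) (tl t)

  bisimUpTo :: Eq a => Int -> Stream a -> Stream a -> Bool
  bisimUpTo 0 _ _ = True
  bisimUpTo n s t = bisimStep (bisimUpTo (n - 1)) s t

  -- Example: two definitions of the stream 0, 1, 2, ... agree on the
  -- first 1000 observations.
  natsFrom :: Int -> Stream Int
  natsFrom n = Cons n (natsFrom (n + 1))

  natsViaMap :: Stream Int
  natsViaMap = Cons 0 (mapS (+ 1) natsViaMap)
    where mapS f (Cons x xs) = Cons (f x) (mapS f xs)

  check :: Bool
  check = bisimUpTo 1000 (natsFrom 0) natsViaMap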
