Hostname: page-component-586b7cd67f-r5fsc Total loading time: 0 Render date: 2024-11-25T10:29:23.920Z Has data issue: false hasContentIssue false

ANF preserves dependent types up to extensional equality

Published online by Cambridge University Press:  16 September 2022

PAULETTE KORONKEVICH
Affiliation:
University of British Columbia, Vancouver, British Columbia, Canada (e-mail: [email protected])
RAMON RAKOW
Affiliation:
University of British Columbia, Vancouver, British Columbia, Canada (e-mail: [email protected])
AMAL AHMED
Affiliation:
Northeastern University, Boston, MA 02115, USA (e-mail: [email protected])
WILLIAM J. BOWMAN
Affiliation:
University of British Columbia, Vancouver, British Columbia, Canada (e-mail: [email protected])
Rights & Permissions [Opens in a new window]

Abstract

Many programmers use dependently typed languages such as Coq to machine-verify high-assurance software. However, existing compilers for these languages provide no guarantees after compiling, nor when linking after compilation. Type-preserving compilers preserve guarantees encoded in types and then use type checking to verify compiled code and ensure safe linking with external code. Unfortunately, standard compiler passes do not preserve the dependent typing of commonly used (intensional) type theories. This is because assumptions valid in simpler type systems no longer hold, and intensional dependent type systems are highly sensitive to syntactic changes, including compilation. We develop an A-normal form (ANF) translation with join-point optimization—a standard translation for making control flow explicit in functional languages—from the Extended Calculus of Constructions (ECC) with dependent elimination of booleans and natural numbers (a representative subset of Coq). Our dependently typed target language has equality reflection, allowing the type system to encode semantic equality of terms. This is key to proving type preservation and correctness of separate compilation for this translation. This is the first ANF translation for dependent types. Unlike related translations, it supports the universe hierarchy, and does not rely on parametricity or impredicativity.

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

1 Introduction

Dependently typed languages such as Coq, Agda, Idris, and F* allow programmers to write full-functional specifications for a program (or program component), implement the program, and prove that the program meets its specification. These languages have been widely used to build formally verified high-assurance software including the CompCert C compiler (Leroy, Reference Leroy2009), the CertiKOS operating system kernel (Gu et al., Reference Gu, Koenig, Ramananandro, Shao, Wu, Weng, Zhang and Guo2015; Gu et al., Reference Gu, Shao, Chen, Wu, Kim, SjÖberg and Costanzo2016), and cryptographic primitives (Appel, Reference Appel2015) and protocols (Barthe et al., Reference Barthe, Grégoire and Zanella-Béguelin2009).

Unfortunately, even these machine-verified programs can misbehave when executed due to errors introduced during compilation and linking. For example, suppose we have a program component S written and proven correct in a source language like Coq. To execute S, we first compile S from Coq to a component T in OCaml. If the compiler from Coq to OCaml introduces an error, we say that a miscompilation error occurs. Now T contains an error despite S being verified. Because S and T are not whole programs, T will be linked with some external (i.e., not verified in Coq) code C to form the whole program P. If C violates the original specification of S, then we say a linking error occurs and P may contain safety, security, or correctness errors.

A verified compiler prevents miscompilation errors, since it is proven to preserve the run-time behavior of a program, but it cannot prevent linking errors. Note that linking errors can occur even if S is compiled with a verified compiler, since the external code we link with, C, is outside of the control of either the source language or the verified compiler. Ongoing work on CertiCoq (Anand et al., Reference Anand, Appel, Morrisett, Paraskevopoulou, Pollack, Bélanger, Sozeau and Weaver2017) seeks to develop a verified compiler for Coq, but it cannot rule out linking with unsafe target code. One can develop simple examples in Coq that, once compiled to OCaml and linked with an unverified OCaml component, jump to an arbitrary location in memory—despite the Coq component being proven memory safe. We provide an example of this in the supplementary materials.

To rule out both miscompilation and linking errors, we could combine compiler verification with type-preserving compilation. A type-preserving compiler preserves types, representing specifications, into a target typed intermediate language (IL). The IL uses type checking at link time to enforce specifications when linking with external code, essentially implementing proof-carrying code (Necula, Reference Necula1997). After linking in the IL, we have a whole program, so we can erase types and use verified compilation to machine code. To support safe linking with untyped code, we could use gradual typing to enforce safety at the boundary between the typed IL and untyped components (Ahmed, Reference Ahmed2015; Lennon-Bertrand et al., Reference Lennon-Bertrand, Maillard, Tabareau and Tanter2022). Applied to Coq, this technique would provide significant confidence that the executable program P is as correct as the verified program S.

We consider this particular application of type preservation important for compiler verification and for protecting the trusted computing base when working in a dependently typed language, but type preservation has other long studied applications. Type-preserving compilers can provide a lightweight verification procedure using type-directed proof search (Chlipala, Reference Chlipala2007), can be better equipped to handle optimizations and changes than a certified compiler (Tarditi et al., Reference Tarditi, Morrisett, Cheng, Stone, Harper and Lee1996; Chen et al., Reference Chen, Hawblitzel, Perry, Emmi, Condit, Coetzee and Pratikaki2008), and can support debugging internal compiler passes (Peyton Jones, Reference Peyton Jones1996).

Unfortunately, dependent-type preservation is difficult because compilation that preserves the semantics of the program may disrupt the syntax-directed typing of the program. This is particularly problematic with dependent types since expressions from a program end up in specifications expressed in the types. As the compiler transforms the syntax of the program, the expressions in types can become “out of sync” with the expressions in the compiled program.

To preserve dependent typing, the type system of the IL needs access to the semantic equivalences relied upon by compilation. Past work has used strong axioms, including parametricity and impredicativity, to recover these equivalences (Bowman et al., Reference Bowman, Cong, Rioux and Ahmed2018). However, this approach does not support all dependent type system features, particularly predicative universe hierarchies (which are anti-impredicative) or ad hoc polymorphism and type quotation (which is anti-parametric) (Boulier et al., Reference Boulier, Pédrot and Tabareau2017). Dependently typed languages, such as Coq, restrict impredicative quantification as it is inconsistent with other features and axioms, such as excluded middle and large elimination. Restricting impredicativity restricts type-level abstraction, so such languages often add a predicative universe hierarchy—abstractions over types live in a higher universe than the parameter, and so on. Higher universes enable recovering the ability to abstract over types and the types of types, etc., which is useful for generic programming and abstract mathematics. This means reliance on parametricity and impredicativity restrict which source programs can be compiled and thus cannot be applied in practice. To scale to a realistic dependently typed language such as Coq, we must encode these semantic equivalences in the type system without restricting core features.

To preserve dependent types, the compiler IL must encode the semantic equivalences relied upon by compilation. In this work, we do this using extensionality, which allows the type system to consider two types (or two terms embedded in a type) definitionally equivalent if the program contains an expression representing a proof that the two types (or terms) are equal. Using this feature, the translation can insert hints for the type system about which terms it can assume to be equivalent and provide proofs of those facts elsewhere to discharge these assumptions. This approach scales to higher universes and other features that prior work could not handle and does so without relying on parametricity or impredicativity in the target IL. Relying on extensionality has one key downside: decidable type checking becomes more complex. We discuss how to mitigate this downside in Section 7.

We present a dependent-type-preserving translation to A-normal form (ANF), a compiler intermediate representation that makes control flow explicit and facilitates optimizations (Sabry & Felleisen, Reference Sabry and Felleisen1992; Flanagan et al., Reference Flanagan, Sabry, Duba and Felleisen1993). The source of this translation is ECC, the Extended Calculus of Constructions with dependent elimination of booleans and natural numbers. ECC represents a significant subset of Coq. The translation supports all core features of dependency, including higher universes, without relying on parametricity or impredicativity, in contrast to prior work (Bowman et al., Reference Bowman, Cong, Rioux and Ahmed2018; Cong & Asai Reference Cong and Asai2018a). This ensures that the translation works for existing dependently typed languages and that we can reuse existing work on ANF translations, such as join-point optimization. This work provides substantial evidence that dependent-type preservation can, in theory, scale to the practical dependently typed languages currently used for high-assurance software.

Our translation targets our typed IL $\mathrm{CC}_{e}^{A}$, the ANF-restricted extensional Calculus of Constructions. $\mathrm{CC}_{e}^{A}$ features a machine-like semantics for evaluating ANF terms, and we prove correctness of separate compilation with respect to this machine semantics. To support the type-preserving translation of dependent elimination for booleans, $\mathrm{CC}_{e}^{A}$ uses two extensions to ECC: it records propositional equalities in typing derivations and applies equality reflection to access these equalities. These extensions make type checking undecidable. We discuss how to recover decidability in Section 7.

Our ANF translation is useful as a compiler pass, but it also provides insights into dependent-type preservation and ANF translations. The target IL is designed to express and check semantic equivalences that the translation relies on for correctness. This lesson likely extends to other translations in a type-preserving compiler for dependent types.

We also develop a new proof architecture for reasoning about the ANF translation. This proof architecture is useful when ANF is achieved through a translation rather than a reduction system. Our translation is indexed by a continuation, a program with a hole, to build up the translated term. Thus, the type of the translated term can only be determined when the hole is filled with a well-typed term. Mirroring the translation, we use the type of the continuation to build up a proof of type-preservation.

This paper includes key definitions and proof cases; extended figures and proofs are available in Appendix 1.

2 Main ideas

ANF 101. A-normal form (ANF)Footnote 1 is a syntactic form that makes control flow explicit in the syntax of a program (Sabry & Felleisen, Reference Sabry and Felleisen1992; Flanagan et al., Reference Flanagan, Sabry, Duba and Felleisen1993). ANF encodes computation (e.g., reducing an expression to a value) as a sequence of primitive intermediate computations composed through intermediate variables, similar to how all computation works in an assembly language.

For example, to reduce , which projects the second element of a pair, we need to describe the evaluation order and control flow of each language primitive. We evaluate to a value, and then, we project out the second component. ANF makes control flow explicit in the syntax by decomposing into the series of primitive computations that the machine must execute, sequenced by . Roughly, we can think of the ANF translation as . However, could also be a deeply nested expression in the source language. In general, the ANF translation reassociates all the intermediate computations from so there are no nested expressions, and we end up with .

Once in ANF, it is simple to formalize a machine semantics to implement evaluation. Each -bound computation is some primitive machine step, performing the computation and binding the value to . In this way, control flow has been compiled into data flow. The machine proceeds by always reducing the left-most machine step, which will be a primitive operation with values for operands. For a lazy semantics, we can instead delay each machine step and begin forcing the inner-most body (right-most expression).

Why Translation Disrupts Typing and How to Fix it. The problem with dependent-type preservation has little to do with ANF itself and everything to do with dependent types. Transformations which ought to be fine are not because the type theory is so beholden to details of syntax. This is essentially the problem of commutative cuts in type theory (Herbelin, Reference Herbelin2009; Boutillier, Reference Boutillier2012). Transformations that change the structure (syntax) of a program can disrupt dependencies. By dependency, we mean an expression whose type and evaluation depends on a sub-expression . We call a sub-expression such as depended upon. These dependencies occur in dependent elimination forms, such as application, projection, and if expressions. Transforming a dependent elimination can disrupt dependencies.

For example, the dependent type of a second projection of a dependent pair is typed as follows:

Notice that the depended upon sub-expression is copied into the type, indicated by the solid line arrow. Dependent pairs can be used to define refinement types, such as encoding a type that guarantees an index is in bounds: . Then, the second projection represents an explicit proof about first projection. Unfortunately, transforming the expression can easily change its type.

For example, suppose we have the nested expression , and we want to -bind all intermediate computations (which, incidentally, ANF does).

This is not well typed, even with the following dependent-rule (treated as syntactic sugar for dependent application):

The problem now is the dependent elimination is -bound and then used, changing the type to . This means the equality is missing,Footnote 2 but is needed to type check this term (indicated by the dotted line arrow). This fails since expects , but is applied to . The typing rule essentially forgets that, by the time happens, the machine will have performed the step of computation , forcing to take on the value of , so it ought to be safe to assume that in the types. When type checking in linearized machine languages, we need to record these machine steps throughout the typing derivation.

The above explanation applies to all dependent eliminations of negative types, such as dependently typed functions and compound data structures (modeled as $\Pi$ and $\Sigma$ types).

An analogous problem occurs with dependent elimination of positive types, which include data types eliminated through branching such as booleans with expressions. In the dependent typing rule for , the types of the branches learn whether the predicate of the is either . This allows the type system to statically reflect information from dynamic control flow.

Now consider an expression and a function . The expression is well typed, but if we want to push the application into the branches to make behave a little more like goto (like the ANF translation does), this result is not well typed.

This transformation fails when type checking the applications of to the branches and . The function is expecting an argument of type but is applied to arguments of type and . The type system cannot prove that is equal to both and . In essence, this transformation relies on the fact that, by the time the expression executes, the machine has evaluated to (and, analogously, to in the other branch), but the type system has no way to express this.

We design a type system in which we can express these intuitions and recover type preservation of these kinds of transformations. We use two extensions to ECC: one for negative types ($\Pi$ and $\Sigma$ in ECC) and one for positive types (booleans and natural numbers in ECC).

For negative types, it suffices to use definitions (Severi & Poll Reference Severi and Poll1994), a standard extension to type theory that changes the typing rule for to thread equalities into sub-derivations and resolve dependencies. The relevant typing rule is:

The highlighted part, , is the only difference from the standard dependent typing rule. This definition is introduced when type checking the body of the and can be used to solve type equivalence in sub-derivations, instead of only in the substitution in the “output” of the typing rule. While this is an extension to the type theory, it is a standard extension that is admissible in any Pure Type System (PTS) (Severi & Poll Reference Severi and Poll1994) and is a feature already found in dependently typed languages such as Coq. With this addition, the transformation of type checks in the target language by recording the definition while type checking the body of a expression.

For positive types, we record a propositional equality between the term being eliminated and its value. For booleans, we need the following typing rule for .Footnote 3

The two highlighted portions of the rule are modifications over the standard typing rule. This rule introduces a propositional equality between the term that appears in the calling context’s type to the value known in the branches. This represents an assumption that and are equal in the type system and allows pushing the context surrounding the expression into the branches.

Like definitions, propositional equalities thread “machine steps” into the typing derivations of and . In contrast, these equalities are accessed through an additional type equivalence rule , which states that an equivalence holds between two terms if some proof of equality exists between them.

Our earlier example expression now type checks using the modified typing rule for and . We thread the equivalence (respectively ) which allows to be applied to (respectively ) at the correct type.

Equality reflection is not a perfect solution, as adding this type equivalence rule can introduce undecidable type checking. While equality reflection allows definitions like to be subsumed by a propositional equality , we use definitions when sufficient (such as for negative types) to avoid undecidable type checking. We discuss how to recover decidability of $\mathrm{CC}_{e}^{A}$ in Section 7; however, this would distract us from the ANF translation in the present work.

Formalizing Type-Preserving ANF Translation. Despite these simple extensions to $\mathrm{CC}_{e}^{A}$, formalizing the ANF type-preservation argument is still tricky. In the source, looking at an expression such as , we do not know whether the expression is embedded in a larger context. To formalize the ANF translation, it helps to have a compositional syntax for translating and reasoning about the types of an expression and the unknown context.

To make the translation compositional, we index the ANF translation by a target language (non-first-class) continuation representing the rest of the computation in which a translated expression will be used.Footnote 4 A continuation is a program with a hole (single linear variable) $[\cdot]$ and can be composed with a computation , written , to form a program . Keeping continuations non-first-class ensures that continuations must be used linearly and avoids control effects, which cause inconsistency with dependent types (Barthe & Uustalu, Reference Barthe and Uustalu2002; Herbelin, 2005). In ANF, there are only two continuations: either $[\cdot]$ or . Using continuations, we define ANF translation for $\Sigma$ types and second projections as follows. We use as shorthand for translating with an empty continuation, .

This allows us to focus on composing the primitive operations instead of reassociating bindings.

For compositional reasoning, we develop a type system for continuations. The key typing rule is the following.

The type of continuations describes that the continuation must be composed with the term of type , and the result will be of type . Note that this type allows us to introduce the definition via the type, before we know how the continuation is used.Footnote 5 We discuss this rule further in Section 4, and how continuation typing is not an extension to the target type theory.

The key lemma to prove type preservation is the following.

Lemma 2.1

The intuition behind this lemma shows that type preservation follows if every time we construct a continuation , we show that is well typed. The translation starts with an empty continuation , which is trivially well typed. We systematically build up the translation of in by recurring on sub-expressions of and building up a new continuation . If each time we can show that this is well typed, then by induction, the whole translation is type preserving. This lemma is indexed by an additional environment and answer type . Intuitively, this is because it is the continuation , not the expression , that dictates the final answer type of the translation, and the environment will provide some additional equalities and definitions in under which we can type check the transformed .

3 Source: ECC with definitions

Our source language, ECC, is Luo’s Extended Calculus of Constructions (ECC) (Luo, Reference Luo1990) extended with dependent elimination of booleans, natural numbers with the recursive eliminator, and definitions (Severi & Poll, Reference Severi and Poll1994). This language is a subset of CIC and is based on the presentation given in the Coq reference manualFootnote 6; Timany & Sozeau, (Reference Timany and Sozeau2017) give a recent account of the metatheory for the CIC, including all the features of ECC. We typeset ECC in a . We present the syntax of ECC in Figure 15. ECC extends the Calculus of Constructions (CC) (Coquand & Huet, Reference Coquand and Huet1988) with $\Sigma$ types (strong dependent pairs) and an infinite predicative hierarchy of universes. There is no explicit phase distinction; that is, there is no syntactic distinction between terms, which represent run-time expressions, and types, which classify terms. However, we usually use the meta-variable to evoke a term and the meta-variables and to evoke a type. The language includes one impredicative universe, , and an infinite hierarchy of predicative universes . The syntax of expressions includes names , universes , dependent function types , functions , application , dependent pair types , dependent pairs , first and second projections of dependent pairs, dependent let , the boolean type , the boolean values , dependent if , the natural number type , the natural number values and , and the recursive eliminator for natural numbers . For brevity, we omit the type annotation on dependent pairs, as in . Note that bound definitions do not include type annotations; this is not standard, but type checking is still decidable (Severi & Poll, Reference Severi and Poll1994), and it simplifies our ANF translation. As is standard, we assume uniqueness of names and consider syntax up to $\alpha$-equivalence.

The following table summarizes the judgments for ECC and the new notation introduced, in particular the notation for definitions in contexts and substitution of variables for expressions.

In Figure 2, we give the reductions for ECC, which are entirely standard. We extend reduction to conversion by defining to be the reflexive, transitive, compatible closure of reduction $\triangleright$. The conversion relation is used to compute equivalence between types, but we can also view it as the operational semantics for the language. We define as the evaluation function for whole programs using conversion, which we use in our compiler correctness proof.

Fig. 1: ECC syntax.

Fig. 2: ECC dynamic semantics (excerpt).

In Figure 3, we define definitional equivalence (or just equivalence) as conversion up to $\eta$-equivalence. We use the notation for equivalence, eliding the environment when it is obvious or unnecessary. We also define cumulativity (subtyping) , to allow types in lower universes to inhabit higher universes.

Fig. 3: ECC equivalence and subtyping (excerpt).

We define the type system for ECC in Figure 4, which is mutually defined with well-formedness of environments. The typing rules are entirely standard for a dependent type system. Note that types themselves, such as have types (called universes), and universes also have types which are higher universes. In [AX-PROP], the type of is , and in [Ax-Type], the type of each universe is the next higher universe . Note that we have impredicative function types in , given by [Prod-Prop]. For this work, we ignore the versus distinction used in some type theories, such as Coq’s, although adding it causes no difficulty. Note that the rules for second projection, [Snd], let, [Let], if, [If], and the eliminator for natural numbers [ElimNat] substitute sub-expressions into the type system. These are the key typing rules that introduce difficulty in type-preserving compilation for dependent types.

Fig. 4: ECC typing (excerpt).

4 Target: ECC with ANF support

Our target language, $\mathrm{CC}_{e}^{A}$, is a variant of ECC with a modified typing rule for dependent that introduces propositional equalities between terms and equality reflection for accessing assumed equalities. The type system of $\mathrm{CC}_{e}^{A}$ is not particularly novel as its type theory is adapted from the extensional Calculus of Constructions (Oury, Reference Oury2005). While $\mathrm{CC}_{e}^{A}$ supports ANF syntax, the full language is not ANF-restricted; it has the same syntax as ECC. We do not restrict the full language because maintaining ANF while type checking adds needless complexity, especially when checking for equality since conversion often breaks ANF. Instead, we show that our compiler generates only ANF restricted terms in $\mathrm{CC}_{e}^{A}$ and define a separate ANF-preserving machine-like semantics for evaluating programs in ANF. We typeset $\mathrm{CC}_{e}^{A}$ in a ; in later sections, we reserve this font exclusively for the ANF restricted $\mathrm{CC}_{e}^{A}$. This ability to break ANF locally to support reasoning is similar to the language $F_J$ of Maurer et al. (Reference Maurer, Downen, Ariola and Peyton Jones2017), which does not enforce ANF syntactically, but supports ANF transformation and optimization with join points.

We can imagine the compilation process as either (1) generating ANF syntax in $\mathrm{CC}_{e}^{A}$ from ECC or (2) as first embedding ECC in $\mathrm{CC}_{e}^{A}$ and then rewriting $\mathrm{CC}_{e}^{A}$ terms into ANF. In Section 5, we present the compiler as process (1), a compiler from ECC to ANF $\mathrm{CC}_{e}^{A}$. In this section, we develop most of the supporting metatheory necessary for ANF as intra-language equivalences and process (2) may be a more helpful intuition.

We give the ANF syntax for $\mathrm{CC}_{e}^{A}$ in Figure 5(a). We impose a syntactic distinction between values which do not reduce, computations which eliminate values and can be composed using continuations , and configurations which represent the machine configurations executed by the ANF machine semantics. We add the identity type and to enable preserving dependent typing for expressions and the join-point optimization, further described in Section 6. We do not require an elimination form for identity types; they are instead used via the restricted form of equality reflection in the equivalence judgment. A continuation is a program with a hole and is composed with a computation to form a configuration . For example, . Since continuations are not first-class objects, we cannot express control effects. Note that despite the syntactic distinctions, we still do not enforce a phase distinction—configurations (programs) can appear in types. Finally in Figure 5(b), we give the full non-ANF syntax, denoted by meta-variables . As done with ECC, we usually use the meta-variable to evoke a term and the meta-variables to evoke a type.

Fig. 5: $\mathrm{CC}_{e}^{A}$ syntax.

To summarize, the meaning of the judgments of $\mathrm{CC}_{e}^{A}$ is the same as ECC. However, $\mathrm{CC}_{e}^{A}$ has additional syntax for the identity type.

We give the new typing rules in Figure 6. The rules for the identity type are standard. The key change in $\mathrm{CC}_{e}^{A}$ is in the typing rule for . The typing rule for introduces a propositional equality into the typing environment for each branch. These record a machine step: the machine will have reduced to before jumping to the first branch and reduced to before jumping to the second branch. This is necessary to support the type-preserving ANF transformation of .

Fig. 6: $\mathrm{CC}_{e}^{A}$ typing (excerpt).

We require a new definition of equivalence to support extensionality, as shown in Figure 7. The rule is used to determine two terms are equivalent if there is a proof of their equality. In particular, we use this rule to access equalities in the context such as those introduced in the typing rule for . We add congruence rules for all syntactic forms including the identity type. The rules Footnote 7 and state that equivalence is preserved under substitution and the standard reduction defined for ECC. We also include $\eta$-equivalence as defined for ECC and $\eta$-equivalences for booleans.

Fig. 7: $\mathrm{CC}_{e}^{A}$ equivalence (excerpt).

Fig. 8: Composition of configurations.

To ensure reduction preserves ANF, we define composition of a continuation and a configuration , Figure 24, typically called renormalization in the literature (Sabry & Wadler Reference Sabry and Wadler1997; Kennedy Reference Kennedy2007). In ANF, all continuations are left associated, so the standard definition of substitution does not preserve ANF, unlike in alternatives such as CPS or monadic form. Note that $\beta$-reduction takes an ANF configuration but would naßvely produce . Substituting the term , a configuration, into the continuation could result in the non-ANF term . When composing a continuation with a configuration, , we unnest all continuations so they remain left associated. Note that these definitions rely on our uniqueness-of-names assumption.

Digression on composition in ANF. In the literature, the composition operation is usually introduced as renormalization, as if the only intuition for why it exists is “well, it happens that ANF is not preserved under $\beta$-reduction.” It is not mere coincidence; the intuition for this operation is composition, and having a syntax for composing terms is not only useful for stating $\beta$-reduction, but useful for all reasoning about ANF. This should not come as a surprise—compositional reasoning is useful. The only surprise is that the composition operation is not the usual one used in programming language semantics, that is, substitution. In ANF, as in monadic normal form, substitution can be used to compose any expression with a value, since names are values and values can always be replaced by values. But substitution cannot just replace a name, which is a value, with a computation or configuration. That would not be well typed. So how do we compose computations with configurations? We can use , as in , which we can imagine as an explicit substitution. In monadic form, there is no distinction between computations and configurations, so the same term works to compose configurations. But in ANF, we have no object-level term to compose configurations or continuations. We cannot substitute a configuration into a continuation , since this would result in the non-ANF (but valid monadic) expression . Instead, ANF requires a new operation to compose configurations: . This operation is more generally known as hereditary substitution (Watkins et al., Reference Watkins, Cervesato, Pfenning and Walker2003), a form of substitution that maintains canonical forms. So we can think of it as a form of substitution, or, simply, as composition.

In Figure 9, we present the call-by-value (CBV) evaluation semantics for ANF $\mathrm{CC}_{e}^{A}$ terms. The semantics is only for the run-time evaluation of configurations; during type checking, we continue to use the conversion relation defined in Section 3. The semantics is essentially standard, but recall that $\beta$-reduction produces a configuration which must be composed with the existing continuation . To maintain ANF, the natural number eliminator for the case must first let-bind the intermediate application and recursive call before plugging the final application into the existing continuation .

Fig. 9: $\mathrm{CC}_{e}^{A}$ evaluation.

4.1 Dependent continuation typing

The ANF translation manipulates continuations as independent entities. To reason about them, and thus to reason about the translation, we introduce continuation typing, defined in Figure 10. The type of a continuation expresses that this continuation expects to be composed with a term equal to the configuration of type and returns a result of type when completed. Normally, is equivalent to some computation , but it must be generalized to a configuration to support typing expressions. This type formally expresses the idea that is depended upon in the rest of the computation. The [K-Bind] rule ensures we have access to the specific needed to type body of the continuation and is similar to the rule [T-Cont] used in the type-preserving CPS translation by Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018). The rule [K-Empty] has an arbitrary term , since an empty continuation $[\cdot]$ has no “rest of the program” that could depend on anything.

Fig. 10: $\mathrm{CC}_{e}^{A}$ continuation typing.

Intuitively, what we want from continuation typing is a compositionality property—that we can reason about the types of configurations (generally, for configurations ) by composing the typing derivations for and . To get this property, a continuation type must express not merely the type of its hole , but which term will be bound in the hole. We see this formally from the typing rule [Let] (the same for $\mathrm{CC}_{e}^{A}$ as for ECC in Section 3), since showing that is well typed requires showing that , that is, requires knowing the definition . If we omit the expression from the type of a continuation, we know there are some configurations that we cannot type check compositionally. Intuitively, if all we knew about was its type, we would be in exactly the situation of trying to type check a continuation that has abstracted some dependent type that depends on the specific into one that depends on an arbitrary . We prove that our continuation typing is compositional in this way, Lemma 4.1 (Continuation Cut).

Note that the result of a continuation type cannot depend on the term that will be plugged in for the hole, that is, for a continuation does not depend on . To see this, first note that the initial continuation must be empty and thus cannot have a result type that depends on its hole. The ANF translation will take this initial empty continuation and compose it with intermediate continuations . Since composing any continuation with any continuation results in a new continuation with the final result type , then the composition of any two continuations cannot depend on the type of the hole.

To prove that continuation typing is not an extension to the type system—that is, is admissible—we prove Lemmas 4.1 and 4.3, that plugging a well-typed computation or configuration into a well-typed continuation results in a well-typed term of the expected type.

We first show Lemma 4.1 (Continuation Cut), which is simple. This lemma tells us that our continuation typing allows for compositional reasoning about configurations whose result types do not depend on .

Lemma 4.1 (Continuation Cut)

Proof. By cases on .

  1. Case: , trivial.

  2. Case:

We must show that , which follows directly from [Let] since, by the continuation typing derivation, we have that .

Continuation typing seems to require that we compose a continuation syntactically with , but we will need to compose with some . It is preferable to prove this as a lemma instead of building it into continuation typing to get a nicer induction property for continuation typing. The proof is essentially that substitution respects equivalence.

Lemma 4.2 (Continuation Cut Modulo Equivalence) If , and , then .

Proof. By cases on the structure of .

  1. Case: Trivial.

  2. Case:

It suffices to show that: If then .

Note that anywhere in the derivation that is used, it must be used essentially as: . We can replace any such use by to construct a derivation for .

Some presentations of context typing (Gordon, Reference Gordon1995)Footnote 8, in non-dependent settings, use a rule like the following.

This suggests we could define continuation typing as follows.

That is, instead of adding separate rules [K-Empty] and [K-Bind], we define a well-typed continuation to be one whose composition with the expected term in the hole is well typed. Then, Lemma 4.1 (Continuation Cut) is definitional rather than admissible. This rule is somewhat surprising; it appears very much like the definition of [Cut], except the computation being composed with the continuation comes from its type, and the continuation remains un-composed in what we would consider the output of the rule.

The presentations are equivalent, but it is less clear how [K-Type] is related to the definitions we wish to focus on. It is exactly the premises of [K-Bind] that we need to recover type-preservation for ANF, so we choose the presentation with [K-Bind]. However, the rule [K-Type] is more general in the sense that the continuation typing does not need any changes as the definition of continuations change.

The final lemma about continuation typing is also the key to why ANF is type preserving for . The heterogeneous composition operations perform the ANF translation on expressions by composing the branches with the current continuation, reassociating all intermediate computations. The following lemma states that if a configuration is well typed and equivalent to another (well-typed) configuration under an extended environment , and the continuation is well typed with respect to , then the composition is well typed. We include the additional environment , as this environment contains information about new variables introduced through the composition with , such as definitions and propositional equalities. We choose to include an explicit extra environment to avoid needing to prove tedious properties about individual environments. Thus, the proof is simple, except for the case, which essentially must prove that ANF is type preserving for .

Lemma 4.3 (Heterogeneous Continuation Cut) If such that and , then .

Proof. By induction on .

  1. Case: , by Lemma 4.2.

  2. Case:

    1. Our goal follows by the induction hypothesis applied to the sub-expression if we can show (1) and (2) under a context . (1) follows by the following equations and transitivity of $\equiv:$:

      For (2), note that from our premises.

  3. Case:

    1. These follow by the induction hypotheses applied to the sub-expressions and if we can show (1) and (2) under a context (analogously for . For (1), note that by [$\equiv$-Reflect] and the typing rule [Var].

      For (2), note that from our premises and weakening. By [$\equiv$-Subst], we know , which follows by [$\equiv$-Reflect] and the typing rule [Var].

4.2 Metatheory

To demonstrate that the new typing rule for is consistent, we develop a syntactic model of $\mathrm{CC}_{e}^{A}$ in extensional CIC (eCIC). We also prove subject reduction for $\mathrm{CC}_{e}^{A}$. Subject reduction is valuable property for a typed IL to model various optimizations through the equational theory of the language, such as modeling inlining as $\beta$-reduction. If subject reduction holds, then all these optimizations are type preserving.

4.2.1 Consistency

Our syntactic model essentially implements the new rule using the convoy pattern (Chlipala, Reference Chlipala2013), but leaves the rest of the propositional equalities and equality reflection essentially unchanged. Each expression is modeled as an expression that returns a function expecting a proof that the predicate is equal to in the first branch and in the second branch. The function, after receiving that proof, executes the code in the branch. The model expression is then immediately applied to refl, the canonical proof of the identity type. We could eliminate equality reflection using the translation of Oury (Reference Oury2005), but this is not necessary for consistency.

The essence of the model is given in Figure 11. There is only one interesting rule, corresponding to our altered dependent if rule. The model relies on auxiliary definitions in CIC, including subst, if-eta1 and if-eta2, whose types are given as inference rules in Figure 11. Note that the model for $\mathrm{CC}_{e}^{A}$’s is not valid ANF, so it does not suffice to merely use the convoy pattern if we want to take advantage of ANF for compilation.

Fig. 11: $\mathrm{CC}_{e}^{A}$ model in eCIC (excerpt).

We show this is a syntactic model using the usual recipe, which is explained well by Boulier et al. (Reference Boulier, Pédrot and Tabareau2017): we show the translation from $\mathrm{CC}_{e}^{A}$ to eCIC preserves equivalence, typing, and the definition of $\perp$ (the empty type). This means that if $\mathrm{CC}_{e}^{A}$ were inconsistent, then we could translate the proof of $\perp$ into a proof of in eCIC, but no such proof exists in eCIC, so $\mathrm{CC}_{e}^{A}$ is consistent.

We use the usual definition of as , and the same in eCIC. It is trivial that the definition is preserved.

Lemma 4.4 (Model Preserves Empty Type)

The essence of showing both that equivalence is preserved and that typing is preserved is in showing that the auxiliary definitions in Figure 11 exist and are well typed. We give these definitions in Coq in the supplementary materials, and the equivalences hold and are well typed without any additional axioms.

Lemma 4.5 (Model Preserves Equivalence)

Lemma 4.6 (Model Preserves Typing)

Consistency tells us that there does not exist a closed term of the empty type . This allows us to interpret types as specifications and well-typed terms as proofs.

Theorem 4.7 (Consistency) There is no such that

4.2.2 Syntactic metatheory

$\mathrm{CC}_{e}^{A}$ satisfies all the usual syntactic metatheory properties, but the main property of interest for developing our typed IL is subject reduction. We present the proof of subject reduction, which follows the structure presented by Luo (Reference Luo1990).

Theorem 4.8 (Subject Reduction)

Proof. By induction on

The proof of subject reduction relies on some additional lemmas. In particular, the proof relies on subject reduction for single step reduction (in the [Red-Trans] case), and a lemma called Luo calls context replacement (in the [Red-Cong-Lam] and [Red-Cong-Let] cases).

Lemma 4.9 (Single Step Subject Reduction) .

Proof. By cases on . Most cases are straightforward, except for $\rhd_{\beta}$ and $\rhd_{\zeta}$, which rely on Lemmas 4.12 and 4.13, respectively.

Lemma 4.10 (Context Replacement) If and , then .

Proof. By (mutual) induction on .

Because our typing contexts also contain definitions, we must also be able to replace a definition expression with an equivalent expression. This is required specifically for the [Red-Cong-Let] case in the proof of subject reduction.

Lemma 4.11 (Context Definition Replacement) If and , then .

Proof. By (mutual) induction on .

Finally, the proof of single step subject reduction relies on a lemma Luo calls Cut, named after the cut rule in sequent calculus. We must also prove an additional similar lemma due to definitions in our context, specifically for the $\rhd_{\delta}$ case.

Lemma 4.12 (Cut) If and then .

Proof. By (mutual) induction on .

Lemma 4.13 (Cut With Definitions) .

Proof. By (mutual) induction on .

The proofs of both versions of context replacement and Cut are done by mutual induction, because typing in $\mathrm{CC}_{e}^{A}$ is mutually defined with well-formedness of contexts. Typing also relies on subtyping in the [Conv] case, subtyping relies on equivalence, and equivalence now relies on typing due to the equality reflection rule [$\equiv$-Reflect]. The mutual reliance of all these judgments requires that these lemmas be proven by mutual induction. The lemma statements above are not the full statement of the lemma, which we omit for brevity, but the corollaries are of interest. For example, the full lemma definition for Lemma 4.10 is as follows:

Lemma 4.14 (Context Replacement (full definition))

The full definitions for Lemmas 4.11, 4.12, and 4.13 are included in Appendix 1.

4.3 Correctness of ANF evaluation

In $\mathrm{CC}_{e}^{A}$, we have an ANF evaluation semantics for run time and a separate definitional equivalence and reduction system for type checking. We prove that these two coincide: running in our ANF evaluation semantics produces a value definitionally equivalent to the original term.

When computing definitional equivalence, we end up with terms that are not in ANF and can no longer be used in the ANF evaluation semantics. This is not a problem—we could always ANF translate the resulting term if needed—but can be confusing when reading equations. When this happens, we wrap terms in a distinctive boundary such as and . The boundary indicates the term is not normal, that is, not in A-normal form. The boundary is only meant to communicate with the reader; formally, .

The heart of the correctness proof is actually naturality, a property found in the literature on continuations and CPS that essentially expresses freedom from control effects (e.g., (Thielecke Reference Thielecke2003) explain this well). Lemma 4.15 is the formal statement of naturality in ANF: composing a term with its continuation in ANF is equivalent to running to a value and plugging the result into the hole of the continuation . Formally, this states that composing continuations in ANF is sound with respect to standard substitution. The proof follows by straightforward equational reasoning.

Lemma 4.15 (Naturality)

Proof By induction on the structure of .

Next we show that our ANF evaluation semantics are sound with respect to definitional equivalence. To do that, we first show that the small-step semantics are sound. Then, we show soundness of the evaluation function.

Lemma 4.16 (Small-step soundness) If .

Proof By cases on . Most cases follow easily from the ECC reduction relation and congruence, except for $\mapsto_{\beta}$ which requires Lemma 4.15. We give representative cases.

Theorem 4.17 (Evaluation soundness)

Proof. By induction on the length n of the reduction sequence given by . Note that, unlike conversion, the ANF evaluation semantics have no congruence rules.

5 ANF translation

The ANF translation is presented in Figure 12. The translation is standard, defined inductively over syntax and indexed by a current continuation. The continuation is used when translating a value and is composed together “inside-out” the same way continuation composition is defined in Section 4. When translating a value such as , we plug the value into the current continuation and recursively translate the sub-expressions of the value if applicable. For non-values such as application, we make sequencing explicit by recursively translating each sub-expression with a continuation that binds the result and performs the rest of the computation.

Fig. 12: Naïve ANF translation.

We prove that the translation always produces syntax in ANF (Theorem 5.1). The proof is straightforward.

Theorem 5.1 (ANF) For all and , for some .

Our goal is to prove type preservation: if is well typed in the source, then is well typed at a translated type in the target. But to prove type preservation, we must also preserve the rest of the judgmental and syntactic structure that the dependent type system relies on. The type judgment is defined mutually inductively only with well-formedness of environments; all other judgments relied on are defined inductively. We proceed top-down, starting from our main theorem, in order to motivate where each lemma comes into play. The full proof of type preservation is omitted for brevity but is included in Appendix 1.

Type-preservation is stated below. We do not prove it directly. Since the ANF translation is indexed by a continuation , we need a stronger induction hypothesis to reason about the type of the continuation .

Theorem 5.2 (Type Preservation) If

Proof. By Lemma 5.3, it suffices to show that , which follows by [K-Empty].

To see why we need a stronger induction hypothesis, consider the case of ANF translation: . The induction hypothesis for Theorem 5.2 proves that , but this cannot be used to show that the translation of with the new continuation is well typed. We need the induction hypothesis to also include typing information for the new continuation that expects the translated sub-expression . Note the new continuation also composes the original continuation with a target computation . Intuitively, the composition is well typed because it appears in a context where is equivalent to the translation of the source expression .

We abstract this pattern into Lemma 5.3, which takes both a well-typed source term and a well-typed continuation as input, just like the translation. The ANF translation takes an accumulator (the continuation ) and builds up the translation in an accumulator as a procedure from values to ANF terms. The lemma builds up a proof of correctness as an accumulator as well. The accumulator is a proposition that if the computation it receives is well typed, then composing the continuation with the computation is well typed. Formally, we phrase this as: if we start with a well-typed term , and a well-typed continuation , then translating with results in a well-typed term. The continuation expects a term that is the translation of the source expression directly, under an extended environment . Intuitively, this extended environment contains information about new variables introduced through the ANF translation, such as definitions and propositional equivalences. This lemma and its proof are main contributions of this work.

Lemma 5.3.

Proof. The proof is by induction on the mutually defined judgments and . Cases [Ax-Prop], [Snd], [ElimNat], and [If] are given, as they are representative.

Note that for all sub-expressions, of type holds by instantiating the induction hypothesis with the empty continuation . The general structure of each case is similar; we proceed by induction on a sub-expression and prove the new continuation is well typed by applications of [K-Bind]. Proving the body of is well typed requires using Lemma 4.3. This requires proving that is composed with a configuration equivalent to , and showing their types are equivalent. To show is equivalent to $[\!\![\texttt{e}]\!\!]$, we use the Lemmas 5.4 and 4.15 to allow for composing configurations and continuations. Additionally, since several typing rules substitute sub-expressions into the type system, we use Lemma 5.5 in these cases to show the types of and are equivalent. Each non-value proof case focuses on showing these two equivalences.

  1. Case: [Ax-Prop]

    1. We must show that . This follows from Lemma 4.2.

  2. Case: [Snd] Following our general structure, we must show (1) in an environment where and (2) . For goal (1), we focus on :

  1. For goal (2), since by Lemma 5.5, we focus on showing . Focusing on , we have:

  1. Case: [ElimNat] Following our general structure, we must show (1) in an environment where

  2. For goal (1), we focus on :

  1. For goal (2), since by Lemma 5.5, and by the definition , we conclude using [$\equiv$-Subst].

  2. Case: [If] Following our general structure, we must show (1) in an environment where and (2) (and analogously for where ). Focusing on in goal (1), we have:

  1. For goal (2), by Lemma 5.5. Since we conclude with [$\equiv$-Subst]

Key steps in the proof require reasoning about equivalence of terms. To prove equivalence of terms, we need to know their syntax so a structural equivalence rule applies, which is not true of terms of the form . To reason about this term, we need to show an equivalence between the compiler and ANF-composition, so we can reason instead about , a definition we can unroll and we know is correct with respect to evaluation via Lemma 4.15 (Naturality). We prove this equivalence, Lemma 5.4 next. This essentially tells us that our compiler is compositional, that is, respects separate compilation and composition in the target language. We can either first translate a program under continuation and then compose it with a continuation , or we can first compose the continuations and and then translate under the composed continuation.

Lemma 5.4 (Compositionality)

Proof. By induction on the structure of . All value cases are trivial. The cases for non-values are all similar, following by definition of composition for continuations or configurations.

Corollary 5.4.1. .

Similarly, at times we need to reason about the translation of terms or types that have a variable substituted, such as . We often use induction on a judgment , which gives us the structure of the expression but not much of the structure of the type . The type may have a term substituted, as , but we have no information about the structure of itself. Since we cannot reason about directly, we cannot obtain the final term after substitution, and thus, we do not know the syntax of the compilation of this arbitrary term. We show another kind of compositionality with respect to this substitution, Lemma 5.5, which shows that substitution is equivalent to composing via continuations. Since standard substitution does not preserve ANF, this lemma does not equate terms in ANF, but $\mathrm{CC}_{e}^{A}$ terms that are not normal. We again mark this shift with the boundary term $\mathcal{NN}()$.

Lemma 5.5 (Substitution).

Our type system relies on a subtyping judgment, so we must show subtyping is preserved. The proof is uninteresting, except insofar as it is simple, while it seems to be impossible in prior work for CPS translation (Bowman et al., Reference Bowman, Cong, Rioux and Ahmed2018).

Lemma 5.6 If .

Subtyping relies on type equivalence (but not vice versa), so we must also show the equivalence judgment is preserved. This lemma is also useful as a kind of compiler correctness property, ensuring that our notion of program equivalence (since types and terms are the same) is preserved through compilation.

Lemma 5.7 If

Since equivalence is defined in terms of reduction and conversion, we must also show reduction and conversion are preserved up to equivalence in the target language. This is convenient, since reduction and conversion also define the dynamic semantics of programs. These proofs correspond to a forward simulation proof up to target language equivalence and also imply compiler correctness. The proofs are straightforward; intuitively, ANF is just adding a bunch of $\zeta$-reductions.

Lemma 5.8. If

Lemma 5.9. If

Proof. By induction on the structure of .

As a result of proving that ANF preserves the judgmental and syntactic structure that a dependent type system relies on, we can also easily prove correctness of separate compilation (with respect to the ANF evaluation semantics). To do this, we must define linking and a specification of when outputs are related across languages, independent of the compiler.

We define an independent specification relating observation across languages, which allows us to understand the correctness theorem without reading the compiler. We define the relation to compare ground values in Figure 13.

Fig. 13: Separate compilation definitions.

We define linking as substitution with well-typed closed terms and define a closing substitution with respect to the environment (also in Figure 13). Linking is defined by closing a term such that with a substitution , written . Any is valid for if it maps each to a closed term of type . For definitions in , we require that if , then , that is, the substitution must map to a closed version of its definition . We lift the ANF translation to substitutions.

Correctness of separate compilation says that we can either link then run a program in the source language semantics, that is, using the conversion semantics, or separately compile the term and its closing substitution then run in the ANF evaluation semantics. Either way, we get equivalent terms. The proof is straightforward from the compositionality properties and forward simulations we proved for type preservation.

Theorem 5.10 (Correctness of Separate Compilation) If , (and a base type) and then .

Proof. The following diagram commutes, because $\equiv$ corresponds to $\approx$ on base types (or ), the translation commutes with substitution, and preserves equivalence.

6 Join-point optimization

Recall from Figure 8 that the composition of a continuation with an configuration, , duplicates in the branches: . Similarly, the ANF translation in Figure 12 performs the same duplication when translating expressions. This can cause exponential code duplication, which is no problem in theory but is a problem in practice.

$\mathrm{CC}_{e}^{A}$ supports implementing the join-point optimization, which avoids this code duplication. We modify the ANF translation from Section 5 to use the definition in Figure 14 and prove it is still correct and type preserving. Note that the new translation requires access to the type from the derivation. We can do this either by defining the translation by induction over typing derivations or (preferably) modifying the syntax of to include as a type annotation, similar to dependent pairs in ECC or dependent case analysis in Coq.

Fig. 14: Join-point optimized ANF translation.

Lemma 6.1.

Proof. By induction.

  1. Case: [If] We proceed by induction on the branch sub-expressions and ensure their corresponding continuation is well typed. This means the application of the join point must be on well-typed arguments of (1) and (2) (analogously for in the branch). (1) follows from the equivalence , which has been shown before in Lemma 5.3. (2) follows if we show in an environment where . By [$\equiv$-Cong-Equiv], we must show , which by definition is , previously shown in Lemma 5.3.

7 Future and related work

We first discuss possible extensions to the source language. We sketch adding recursion and inductive data types to our source language ECC and conjecture that our proof technique can scale to these features. However, extending the source with extensional equality seems more difficult. We then discuss ways to efficiently type check $\mathrm{CC}_{e}^{A}$ terms constructed from the ANF translation. We conclude by discussing related work as alternatives to the ANF translation, in particular the CPS translation, Call-by-Push-Value, and monadic form. These alternatives either fail to scale to higher universes, or fail to compile to a language as “low-level” as $\mathrm{CC}_{e}^{A}$ in ANF.

7.1 Costs of type-preserving compilation

Compared to other approaches to solving linking errors, such as PCC (Necula, Reference Necula1997) which requires finding and encoding witnesses of safety proofs, type-preserving compilation is a lightweight and effective technique (Morrisett et al., Reference Morrisett, Walker, Crary and Glew1999; Xi & Harper Reference Xi and Harper2001; Shao et al., Reference Shao, Trifonov, Saha and Papaspyrou2005).

Unfortunately, one major disadvantage is the compilation time. TIL (Tarditi et al., Reference Tarditi, Morrisett, Cheng, Stone, Harper and Lee1996), a type-preserving compiler for Standard ML, was found to be eight times slower than the non-type-preserving compiler SML/NJ. Chen et al., (Reference Chen, Hawblitzel, Perry, Emmi, Condit, Coetzee and Pratikaki2008) found that their type-preserving compiler for Microsoft’s Common Intermediate Language (CIL) was 82.8% slower than their base compiler, mostly due to type inference and writing type information to object files. We consider long compilation times a relatively small price to pay for the safety guarantees provided for the compiled program, particularly given the cost of unsafety (Stecklein et al., Reference Stecklein, Dabney, Dick, Haskins, Lovell and Moroney2004; Briski et al., Reference Briski, Chitale, Hamilton, Pratt, Starr, Veroulis and Villard2008).

It is also unclear that this cost is necessary. Shao et al., (Reference Shao, League and Monnier1998) show that careful attention to representation of typed intermediate language nearly eliminated compile time overhead introduced in the FLINT compiler compared to the SML/NJ compiler. The focus in that work is with respect to large types, which is particularly relevant for dependent-type preservation.

However, annotations overhead with dependent types can be far larger. For example, annotation size (including both types and proofs) was 6x time larger than the code size in CompCert (Leroy, Reference Leroy2009), and one extension to CompCert doubled that (Stewart et al., Reference Stewart, Beringer, Cuellar and Appel2015). If this sort of difference between annotation size and code size is typical, it could be a huge problem. It is not clear that is typical; Stewart et al., (Reference Stewart, Beringer, Cuellar and Appel2015) note that the doubling was in part due to duplication of definitions which should be unnecessary.

Removing inference from the type system could significantly improve compile time (Chen et al., Reference Chen, Hawblitzel, Perry, Emmi, Condit, Coetzee and Pratikaki2008), but could increase code size and require more annotations. This time/space tradeoff has been studied in the context of LF (Necula & Rahul, Reference Necula and Rahul2001; Sarkar et al., Reference Sarkar, Pientka and Crary2005). Adapting these techniques to dependently typed languages in the style of Coq would likely be necessary for practical implementation of dependent-type preservation, but the issues are very similar to that of LF.

7.2 Recursive functions and match

Our source language ECC still lacks some key features of a general purpose dependently typed language. In particular, our language does not have a construct for defining recursive functions and a construct for eliminating general inductive datatypes. These features require a termination condition to ensure termination of programs; thus, building a type-preserving compiler for a language with these features would require the compiler preserve the termination condition as well. We could preserve a syntactic termination condition, as used by Coq; however, this requires a significantly more complex target guard condition. Bowman & Ahmed (Reference Bowman and Ahmed2018) provide an example of preserving the guard condition through the closure conversion pass, where the guard condition must essentially become a data-flow analysis in order to track the flow of recursive calls through closures. This would greatly increase the complexity of the target type system and possibly affect type checking performance. This suggests that using a syntactic condition is undesirable for a dependently typed intermediate language. An alternative could be compiling this syntactic guard condition to sized types, but adding sized types to Coq is still a work in progress (Chan & Bowman, Reference Chan and Bowman2020).

However, assuming we have preserved the guard condition through the ANF translation with the aforementioned possible complexities, we conjecture that the technique presented in this paper scales to these language features. In particular, proving that the ANF translation of is type preserving would be very similar to the technique used for dependent . Suppose we have added a syntactic form and typing rule:

The target language would have a similar syntactic construct as a configuration . Any proofs of lemmas over configurations should be updated to include the construct. The ANF translation would first translate the scrutinee and then push into each branch of the . In order to prove this translation type preserving, we would change the target typing rule for to include a propositional equality when checking each branch. That is, the premises in the source rule above would change to . The target typing rule threads the equality when checking each branch, to record that the scrutinee has reduced to a particular case of the match .

Threading an equality into the branches proves type preservation in the same way as with statements with the equality . The continuation originally expects something of type but then is pushed into the branches and translated with something of type . The equality in the context introduced by the target typing rule helps resolve the discrepancy between the substitution and .

Assuming the termination condition can be preserved through the ANF translation, proving that ANF is type-preserving for would be similar to proving type preservation for functions. Given the typing rule for and the ANF translation:

We could show that the ANF translation is type preserving by Lemma 4.2 and show that the target fix expression is well typed. This is easy to show using the target [Fix] typing rule, assuming the guard condition is preserved, and by induction on .

7.3 Preserving extensional equality

A natural question is whether we can preserve extensional equality through ANF translation. Since the target language already supports this, it seems trivial to preserve, no?

Our study so far suggests that adding new features with the same pattern of dependency and judgmental structural does not take much effort. Adding new positive or negative types is not significantly more complicated if the translation already supports positive and negative types. Adding recursion does not seem difficult since the dependency is not new, although it adds a new judgment that must be preserved.

By contrast, extensional equality is a radical change to the judgmental structure of the source language. Currently, in our source language, the typing and equivalence judgments are separately defined—typing depends on equivalence, but not vice versa. Adding extensional equality typically makes these two judgment mutually dependent. This completely changes the structure of the type preservation proof—as we discussed in Section 5, we stage our proof by first showing reduction is preserved, then equivalence, etc, and finally that typing is preserved. This structure follows the structure of the source typing derivations. Because adding extensional equality changes the structure of source typing derivations, this entire proof could no longer be staged and would need to be mutually inductive.

Past authors have not had much success in solving these large mutual inductive proofs and instead avoid them. Barthe et al., (Reference Barthe, Hatcliff and SØrensen1999) discuss this problem as choose to use a Curry-style type system to avoid the problem. Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018) discuss it also, but choose to use untyped equivalence and untyped reduction to avoid it. We follow the latter approach, but that is not tenable if we extend the source language with extensional equability.

We have considered alternative variants of extensional equality. For example, consider the following idea for a simplification of the Rule [$\equiv$-Reflect] which avoids mutual dependency.

In this version of extensional equality, we avoid dependence on the typing judgment by restricting extensionality to apply only when we have a proof that we can use in the current context. Unfortunately, this breaks subject reduction, and if equivalence is untyped, it can lead to unsoundness. Since the only purpose was to break the dependence between equivalence and typing, and subject reduction is important for a compiler IL, we end up with the standard reflection rule.

That said, these issues about typed equivalence are only about complicating the proof technique, not the translation. Extensional equivalence does not introduce any new dependencies that ANF can disrupt, so it seems likely that ANF would still be type preserving, but it will almost certainly be difficult to prove.

7.4 Recovering decidability

Our target language $\mathrm{CC}_{e}^{A}$ is extensional type theory, which is well known to have undecidable type checking. An IL with undecidable type checking affects our type-check-then-link approach, even though linking may not necessarily occur at $\mathrm{CC}_{e}^{A}$. This is because the decidability of type checking of earlier ILs may affect the type checking of later ILs in the compiler, where linking is to be done. If some information is lost resulting in undecidability, then that information is unlikely to be regained later.

There are then two main approaches to avoid undecidable type checking. One is to find ways to recover decidability for the subset of $\mathrm{CC}_{e}^{A}$ constructed by the compiler. The other is to remove equivalence reflection from $\mathrm{CC}_{e}^{A}$ and attempt to recover the proof of type preservation, which is much more difficult and not ideal for compilation.

Recovering Decidability for $\mathrm{CC}_{e}^{A}$. Intuitively, the target terms in $\mathrm{CC}_{e}^{A}$ constructed by the compiler should be decidable. This is because the terms are constructed from terms in a decidable source language. In principle, the translation could make use of this fact and insert whatever annotations are necessary into the target term to ensure it can be decidably checked in an extensional type theory. Determining small, suitable annotations for compilation is the main subject of future work.

One approach for annotating the compiled term is a technique from proof-carrying code (PCC). In a variant of PCC by Necula & Rahul (Reference Necula and Rahul2001), the inference rules are represented as a higher-order logic program, and the proof checker is a non-deterministic logic interpreter. The proofs are made deterministic by recording non-deterministic choices as a bit-string. Our compiler could be modified to produce a similar bit-string encoding the non-deterministic choices. The type checker could then be modified to interpret the encoding to ensure type checking is decidable.

If all else fails, we can recover decidability to translating typing derivations rather than syntax. Donning our propositions-as-types hat once again, we can obtain the desired typing derivation by using the proof of type preservation. The proof can be viewed as a function from source typing derivations to target typing derivations. The target typing derivation can then be translated to CIC with additional axioms and be decidably checked (Oury, Reference Oury2005). However, this would require shipping the type preservation proof with the compiler, which might be undesirable.

Using a Target Language Without Equivalence Reflection. We could use the convoy pattern (Chlipala, Reference Chlipala2013) to compile if expressions, as we did to show consistency of $\mathrm{CC}_{e}^{A}$ in Section 4.2.1. However, there are two issues with this approach: (1) the convoy pattern is not in ANF and (2) the resulting code generated just to show types are preserved (i.e., code not related to execution) is extremely large.

One could attempt to reduce the code size generated by the convoy pattern by using type safe coercions. Consider the addition of the following coercion, :

Naive additions of such coercions, however, result in a non-type-preserving translation. For example, the coercions could be inserted into the translation for as follows:

The type preservation proof fails in the branches of the , despite the coercion. This is because our judgmental equivalence cannot determine that the expression expected by the continuation, , is equivalent to the expression . Equality reflection was key in proving this equivalence in our proof of type preservation.

7.5 CPS translation

ANF is favored as a compiler intermediate representation, although not universally. Cong et al., (Reference Cong, Osvald, Essertel and Rompf2019) include a thorough summary of the arguments for and against both CPS and ANF as intermediate representations.

Most recent work on CPS translation of dependently typed languages focuses on expressing control effects (Miquey, Reference Miquey2017; Pédrot, Reference Pédrot2017; Cong & Asai, Reference Cong and Asai2018a, 2018b), which we discuss further in Section 7.7. When expressing control effects with dependent types, it is necessary for consistency to prevent certain dependencies from being expressed (Barthe & Uustalu, Reference Barthe and Uustalu2002; Herbelin, 2005), so these translations do not try to recover dependencies in the way we discuss in Section 2.

Two CPS translations exist that do try to recover dependencies in the absence of effects, but fail to scale to a predicative universe hierarchy. Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018) present a type-preserving CPS translation for the Calculus of Constructions. They add a special form and typing rule for the application of a computation to a continuation which essentially records a machine step and is essentially similar to the typing rule. They also add a non-standard equality rule that essentially corresponds to Lemma 4.15 (Naturality). Cong & Asai (Reference Cong and Asai2018a) extend the translation of Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018) to dependent pattern matching using essentially the same typing rule for as we do in $\mathrm{CC}_{e}^{A}$.

Unfortunately, this rule relies on interpreting all functions as parametric and in a single computationally relevant impredicative universe. Both of these restrictions are a problem; Boulier et al., (Reference Boulier, Pédrot and Tabareau2017) give an example of a type theory that is anti-parametric because it enables ad hoc polymorphism and type quotation. It is simpler to demonstrate the problems that occur when relying on impredicativity.

Impredicativity is inconsistent with large elimination, that is, eliminating a type to a term (Huet, Reference Huet1986; Chan, Reference Chan2021)Footnote 9 This means the CPS translation cannot be used for a consistent type theory with large elimination, while our ANF translation can. For a concrete example of what kind of expressiveness is regained, consider the following definition of the type of arbitrary arity functions.

This definition lets us build well-typed multi-arity functions over natural numbers. For example, the type of the n-ary summation function is . But, this requires large elimination and would be inconsistent in an impredicative universe and thus cannot be supported by the CPS translation of Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018). The desire for this sort of definitions makes a predicative universe hierarchy almost, but not quite, universal in dependently typed languages. Cedille is a notable exception, instead trying to take full advantage of impredicativity (Stump & Jenkins, Reference Stump and Jenkins2018).

7.6 Call-by-push-value and monadic form

Call-by-push-value (CBPV) is similar to our ANF target language and to CPS target languages. In essence, CBPV is a $\lambda$ calculus in monadic form suitable for reasoning about call-by-value (CBV) or call-by-name (CBN), due to explicit sequencing of computations (Levy, Reference Levy2012). It has values, computations, and continuations, as we do, and has compositional typing rules (which inspired much of our own presentation). The structure of CBPV is of useful for modeling effects; all computations should be considered to carry an arbitrary effect, while values do not.

Work on designing a dependent call-by-push-value (dCBPV) runs into some of the same design issues that we see in ANF (Ahman, Reference Ahman2017; Vákár, Reference Vákár2017; Pédrot & Tabareau, Reference Pédrot and Tabareau2019), but critically avoids the central difficulties introduced in Section 2. The reason is essentially that monadic form is more compositional than ANF, so dependency is not disrupted in the same way.

Recall from Section 4 that our definition of composition was entirely motivated by the need to compose configurations and continuations. In monadic form generally, there is no distinction between computation and configurations, and let is free to compose configurations. This means that configurations can return intermediate computations, instead of composing the entire rest of the continuation inside the body of a let. The monadic translation of , which is problematic in ANF, is given below and is easily type preserving.

Note that since let can bind the “configuration” , the typing rule [Let] and the compositionality lemma suffice to show type preservation, without any reasoning about definitions. In fact, we do not even need definitions for monadic form; we only need a dependent result type for let. A similar argument applies to the monadic translation of dependent if: since we can still nest if on the right-hand side of a let, the difficulties we encounter in the ANF translation are avoided.

The dependent typing rule for let without definitions is essentially the rule given by Vákár (Reference Vákár2017), called the dependent Kleisli extension, to support the CBV monadic translation of type theory into dCBPV, and the CBN translation with strong dependent pairs. Vákár (Reference Vákár2017) observes that without the dependent Kleisli extension, CBV translation is ill-defined (not type preserving), and CBN only works for dependent elimination of positive types. This is the same as the observation made independently by Bowman et al., (Reference Bowman, Cong, Rioux and Ahmed2018) that type-preserving CBV CPS fails for $\Pi$ types, in addition to the well-known result that the CBN translation failed for $\Sigma$ types (Barthe & Uustalu Reference Barthe and Uustalu2002).

Recently, Pédrot & Tabareau (Reference Pédrot and Tabareau2019) introduce another variant of dependent call-by-push-value dubbed $\delta$CBPV and discuss the thunkability extension in order to develop a monadic translation to embed from CC$_\omega$ into $\delta$CBPV. Thunkability expresses that a computation behaves like a pure computation, and so it can be depended upon. This justifies the addition of an equation to their type theory that is similar to our Lemma 4.15, but should hold even when the language is extended with effects. The authors note that the thunkability extensions seems to require that the target of the thunkable translation be an extensional type theory.

Monadic form has been studied for compilation (Benton et al., Reference Benton, Kennedy and Russell1998; Benton & Kennedy, Reference Benton and Kennedy1999). In these instances, the compiler simplifies from monadic form to ANF as a series of intra-language rewrites. Once in ANF, that is, a form that is normal with respect to the commuting conversions within monadic form, code generation is greatly simplified since all nesting has been eliminated. This more practical implementation technique, not strictly adhering to ANF until necessary, is also used in GHC (Maurer et al., Reference Maurer, Downen, Ariola and Peyton Jones2017) and Chez Scheme.Footnote 10

These commutative conversions are called commutative cuts in the dependent type theory literature (Herbelin Reference Herbelin2009; Boutillier Reference Boutillier2012). Formally, the problem of commutative cuts can be phrased as: Is the following transformation type preserving?

ANF necessarily performs this transformation, as we have shown.

These same commuting conversion do not preserve typing in a standard dependently typed language, but do in our target language $\mathrm{CC}_{e}^{A}$. While we study this through the ANF translation directly, the more practical application of this work is likely in the design of the target language, which is designed to support ANF, rather than in the ANF translation itself. In this view, the ANF translation can be seen as a proof technique that all intermediate rewrites from monadic form to ANF are support in the target language.

7.7 CPS for control effects in dependent types

Most work on CPS for dependent types is primarily concerned with integrating control effects and dependent types, not with type-preserving compilation. Therefore, these CPS translations necessarily make different design choices than we describe in Section 2. For example, we want to avoid any restriction on the source language. However, one must necessarily restrict certain dependencies to integrate control effects and dependent types, at least for the effectful parts of the language. For our translation, naturality is important, but it is non-goal for effectful terms, or is a goal only up to a boundary. We briefly describe some of these translations.

To justify soundness of a dependent sequent calculus, Miquey (Reference Miquey2017) present a CPS translation based on double negation. Sequent calculi make co-values (essentially, continuation) explicit in the language and thus express control effects. Miquey (Reference Miquey2017) develop their CPS translation and avoid relying on parametricity and impredicativity. They rely on modified source type system to track equalities when checking co-values that depend on expression, which is essentially similar to the definitions in our continuation typing. They use delimited continuations to enforce naturality at certain boundaries, to constrain the scope of effects. They also rely on the negative-elimination-free (NEF) restriction (Herbelin, Reference Herbelin2012), which disallows proofs that contain certain kinds of computations, such as arbitrary dependent projection from $\Sigma$ types. If we want to compile existing languages, such as Coq, which have significant code bases, we cannot admit restrictions like the NEF restriction.

7.8 Non-dependently typed ANF translation

There has not been much work on type preservation for simply typed (or, more specifically, non-dependently typed) ANF translation. Most of the type preservation literature focuses on CPS translation (Shao, Reference Shao1997; Morrisett et al., Reference Morrisett, Walker, Crary and Glew1999; Thielecke, Reference Thielecke2003; Kennedy, Reference Kennedy2007; Ahmed & Blume, Reference Ahmed and Blume2011; Perconti & Ahmed, Reference Perconti and Ahmed2014; Patterson et al., Reference Patterson, Perconti, Dimoulas and Ahmed2017).

This is probably because ANF translation without dependent types is, essentially, not interesting. CPS translation transforms every expression, making explicit a representation of computation and continuation. This requires a (typed) encoding that captures all the properties of interest and changes the types of every term change. This makes type preservation non-trivial and raises interesting questions about exactly type computations and continuations should be assigned. This can be particularly interesting when considering a compiled component interacting with a hostile context (Ahmed & Blume, Reference Ahmed and Blume2011).

By contrast, ANF essentially just reorders existing terms and binds them to names without introducing any new objects into the syntax. No types change at all, so long as types do not depend on the syntax of terms.

A cursory study of non-dependently typed ANF appears in the technical appendix by Ahmed & Blume, (Reference Ahmed and Blume2011), companion to Ahmed & Blume, (Reference Ahmed and Blume2011), which briefly studies ANF translation of System F. They show that, in System F, ANF is type preserving and fully abstract, since System F in ANF is actually just a subset of System F. The proof is not given in detail, however.

Another tangential study of type-preserving ANF appears in the work of Saffrich & Thiemann (Reference Saffrich and Thiemann2020), which use ANF as part of a proof to show that imperative session types are subsumed by a functional session types API. The correctness and type-preservation of ANF translation is similarly considered trivial in the text since ANF does not affect typing, although they provide a detailed proof of type preservation in the appendix.

8 Conclusion

We develop a type-preserving ANF translation for ECC—a significant subset of Coq including dependent functions, dependent pairs, dependent elimination of booleans, natural numbers, and the infinite hierarchy of universes—and prove correctness of separate compilation with respect to a machine semantics for the target language. The translation, these proofs, and our proof technique for ANF are main contributions. This translation provides strong evidence that type-preserving compilation can support all of dependent type theory. It gives insights into type preservation for dependent types in general and into related work on type-preserving control flow transformations.

Acknowledgments

We gratefully acknowledge Youyou Cong, Max S. New, Hugo Herbelin, Matthias Felleisen, Greg Morrisett, Simon Peyton Jones, Paul Downen, Andrew Kennedy, Brian LaChance, Danel Ahman, Carlo Angiuli, and the many anonymous reviewers for their time in discussing ANF, CPS and related problems during the course of this work. Thank you all. We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC), funding reference number RGPIN-2019-04207. Cette recherche a été financée par le Conseil de recherches en sciences naturelles et en génie du Canada (CRSNG), numéro de référence RGPIN-2019-04207

Conflicts of interest

None.

1. Appendix: Full proof of type preservation and extended figures

Lemma 1.1

  1. 1.

  2. 2. .

Proof. The proof is by induction on the mutually defined judgments and . Cases [Ax-Prop], [Lam], [App], [Snd], [ElimNat], and [If] are given, as they are representative.

Note that for all sub-expressions of type holds by instantiating the induction hypothesis with the empty continuation . The general structure of each case is similar; we proceed by induction on a sub-expression and prove the new continuation is well typed by applications of [K-Bind]. Proving the body of is well typed requires using Lemma 4.3. This requires proving that is composed with a configuration equivalent to , and showing their types are equivalent. To show is equivalent to , we use the Lemma 5.4 and Lemma 4.15 to allow for composing configuations and continuations. Additionally, since several typing rules subsitute sub-expressions into the type system, we use Lemma 5.5 in these cases to show the types of and are equivalent. Each non-value proof case focuses on showing these two equivalences.

  1. Case: [Ax-Prop] We must show that . This follows from Lemma 4.2.

  2. Case: [Lam] Must show . This follows from Lemma 4.2 if we can show . This follows from [Lam] if we can show . This follows by induction on the judgment with the empty continuation .

  3. Case: [App] Must show that .

  4. Let . Our conclusion follows by induction on if we show . By [K-Bind], we must show and . The first goal follows from induction on with the empty continuation . The second goal also follows by induction on , but we must show :. By [K-Bind] again, we must show (which again follows by induction with the empty continuation) and . This follows from Lemma 4.3 if we can show (1) (2) and (3) . By $\vartriangleright_{\delta}$ and [$\equiv$-Step], goal (1) changes to . We have previously shown that and . Using these facts with [App], we derive . We have that by Lemma 5.5 and derive our conclusion by [Conv]. By $\vartriangleright_{\delta}$ and [$\equiv$-Step], goal (3) changes to . We find that the left-hand side of the equivalence can be converted as well:

    Then goal (2) follows from combining the fact that and : .
  5. Case: [Snd]

  6. Must show . This follows by the induction hypothesis for the sub-derivation if we can show By [K-Bind], it suffices to show (1) and (2) . Goal (1) follows by the induction hypothesis for with the well-typed empty continuation . Goal (2) follows by Lemma 4.3 if we can show (3) , (4) , and (5) .

  7. By $\rhd_{\delta}$, we are trying to show . We have previously shown that . Using this fact with [Snd], we derive that . We can derive our goal by [Conv] if we can show that is equivalent to . By Lemma 5.5, we have that is equivalent to . Focusing on , we have:

    Finally, is equivalent to by [$\equiv$-Subst].
  8. By $\rhd_{\delta}$ and [$\equiv$-Step], we are trying to show . Focusing on , we have:

    Combining goals (3) and (4), we can show goal (5).
  9. Case: [If]

  10. Must show Let . Our conclusion follows by induction on if we show . By [K-Bind], we must show (1) and (2) . Goal (1) follows from induction on with the empty continuation . By [If] in goal (2), we focus on showing the new sub-goal as it is the most interesting. This follows by Lemma 4.3 if we can show (3) , (4) and (5) . Goal (3) follows from induction with the empty continuation and [Conv] if we can show that . By Lemma 5.5, we are trying to show . By [$\equiv$-Reflect] and [Var], as well as equivalence by $\rhd_{\delta}$, we can derive that under the extended environment , and we conclude with [$\equiv$-Subst]. Focusing the right-hand side of goal (4), we have:

    Combining goals (3) and (4), we can show goal (5).
  11. Case: [ElimNat]

  12. Must show . This can be proven using the techniques in the [App] case: induction on subexpressions and showing that each continuation is well typed. The body of the final expression must be shown to be of type , which can be proven by Lemma 4.2 if we show This can be done by focusing on the left-hand side of the equivalence:

Lemma 1.2 (Context Replacement (full definition))

  1. 1. .

  2. 2.

  3. 3.

  4. 4.

  5. 5.

  6. 6.

Lemma 1.3 (Context Definition Replacement (full definition))

  1. 1.

  2. 2.

  3. 3.

  4. 4.

  5. 5.

  6. 6.

Lemma 1.4 (Cut (full definition))

  1. 1.

  2. 2.

  3. 3.

  4. 4.

  5. 5.

  6. 6.

Lemma 1.5 (Cut With Definitions (full definition))

  1. 1.

  2. 2.

  3. 3.

  4. 4.

  5. 5.

  6. 6.

Fig. A15: ECC syntax.

Fig. A16: ECC dynamic semantics (excerpt).

Fig. A17: ECC congruence conversion rules.

Fig. A18: ECC equivalence and subtyping

Fig. A19: ECC typing

Fig. A20: ECC well-formed environments.

Fig. A21: $\mathrm{CC}_{e}^{A}$ syntax.

Fig. A22: $\mathrm{CC}_{e}^{A}$ equivalence

Fig. A23: $\mathrm{CC}_{e}^{A}$ evaluation.

Fig. A24: Composition of configurations.

Fig. A25: $\mathrm{CC}_{e}^{A}$ continuation typing.

Fig. A26: $\mathrm{CC}_{e}^{A}$ typing

Footnotes

1 The A in A-normal form has no further meaning. The name originates in a study of the (informal notion of) administrative redexes in CPS and formalizes these as a set A of source-to-source rewrites. The name A-normal form refers to the normal form with respect to the A reductions; the form is A normal. We emphasize this point because it is useful to think of ANF as syntactic form normal with respect to particular set of rewrites and not necessarily the output of a particular translation; we revisit this perspective in Section 7.

2 This missing equality is added to most modern proof assistants, as explained further in the section.

3 This rule essentially implements the convoy pattern (Chlipala Reference Chlipala2013). This pattern is common in dependent types; for example, transformation into the convoy pattern is used by Sozeau (Reference Sozeau2008) to implement the Program vernac in Coq to simplify reasoning with equality in function definitions.

4 This is how ANF translation is implemented in Scheme by Flanagan et al. (Reference Flanagan, Sabry, Duba and Felleisen1993), although their formal model is as a reduction system. An alternative implementation technique avoids continuations and instead returns a list of new declarations, similar to the allocation pass of Morrisett et al. (Reference Morrisett, Walker, Crary and Glew1999).

5 This is essentially a singleton type, but we avoid explicit encoding with singleton types to focus on the intuition—machine steps—and avoid complicating the IL syntax.

6 The Coq reference manual, https://coq.inria.fr/distrib/current/refman/language/cic.html, accessed 2021-07-07.

7 Formally, this rule is admissible. However, we make it explicit as we tried several various simplifications of extensionality in which it is not admissible and find it useful to mark this requirement. We discuss on in Section 7.

8 This presentation of context typing goes back at least to Gordon (Reference Gordon1995), and has seen much use in the literature on operational approaches to full abstraction, logical relations, and secure compilation. Pitts (Reference Pitts1997) gives a summary of some of the early work and their application to program equivalence, particularly observation equivalence.

9 The idea for this proof is due to Huet, (Reference Huet1986); however, he seems to have left the proof as an exercise to the reader. The idea seems quite well understood in the type theory folklore, but Chan, (Reference Chan2021) formalizes the full proof in Agda.

References

Ahman, D. (2017) Fibred Computational Effects. Ph.D. thesis. University of Edinburgh. Available at: http://arxiv.org/abs/1710.02594.Google Scholar
Ahmed, A. (2015) Verified Compilers for a Multi-language World. In Summit on Advances in Programming Languages (SNAPL). 10.4230/LIPIcs.SNAPL.2015.15.Google Scholar
Ahmed, A. & Blume, M. (2011) An equivalence-preserving CPS translation via multi-language semantics. In International Conference on Functional Programming (ICFP). 10.1145/2034773.2034830.CrossRefGoogle Scholar
Ahmed, A. & Blume, M. (2011) An Equivalence-Preserving CPS Translation via Multi-Language Semantics (Technical Appendix). Technical report. Available at: http://www.ccs.neu.edu/home/amal/papers/epc-tr.pdf.CrossRefGoogle Scholar
Anand, A., Appel, A. W., Morrisett, G., Paraskevopoulou, Z., Pollack, R., Bélanger, O. S., Sozeau, M. & Weaver, M. (2017) CertiCoq: A verified compiler for Coq. In International Workshop on Coq for Programming Languages (CoqPL). Available at: http://www.cs.princeton.edu/ appel/papers/certicoq-coqpl.pdf.Google Scholar
Appel, A. W. (2015) Verification of a cryptographic primitive: SHA-256. ACM Trans. Program. Lang. Syst. (TOPLAS). 37(2). 10.1145/2701415.CrossRefGoogle Scholar
Barthe, G., Grégoire, B. & Zanella-Béguelin, S. (2009) Formal certification of code-based cryptographic proofs. In Symposium on Principles of Programming Languages (POPL). 10.1145/1480881.1480894.CrossRefGoogle Scholar
Barthe, G., Hatcliff, J. & SØrensen, M. H. B. (1999) CPS translations and applications: The cube and beyond. Higher-Order Symb. Comput. 12(2). 10.1023/a:1010000206149.CrossRefGoogle Scholar
Barthe, G. & Uustalu, T. (2002) CPS translating inductive and coinductive types. In Workshop on Partial Evaluation and Semantics-based Program Manipulation (PEPM). 10.1145/509799.503043.Google Scholar
Benton, N. & Kennedy, A. (1999) Monads, effects and transformations. Electron. Notes Theoret. Comput. Sci. 26, 320. 10.1016/s1571-0661(05)80280-4.CrossRefGoogle Scholar
Benton, N., Kennedy, A. & Russell, G. (1998) Compiling standard ML to Java bytecodes. In International Conference on Functional Programming (ICFP). 10.1145/289423.289435.CrossRefGoogle Scholar
Boulier, S., Pédrot, P. & Tabareau, N. (2017) The next 700 syntactical models of type theory. In Conference on Certified Programs and Proofs (CPP). 10.1145/3018610.3018620.Google Scholar
Boutillier, P. (2012) A relaxation of Coq’s guard condition. Journées Francophones des langages applicatifs (JFLA). Available at: https://hal.archives-ouvertes.fr/hal-00651780.Google Scholar
Bowman, W. J., Cong, Y., Rioux, N. & Ahmed, A. (2018) Type-preserving CPS translation of $\Sigma$ and $\Pi$ types is not not possible. Proc. ACM Program. Lang. (PACMPL). 2(POPL). 10.1145/3158110.Google Scholar
Briski, K. A., Chitale, P., Hamilton, V., Pratt, A., Starr, B., Veroulis, J. & Villard, B. (2008) Minimizing Code Defects to Improve Software Quality and Lower Development Costs. Development Solutions. IBM. Available at: ftp://ftp.software.ibm.com/software/rational/info/do-more/RAW14109USEN.pdf.Google Scholar
Chan, J. (2021) An Analysis of An Analysis of Girard’s Paradox. Available at: https://web.archive.org/web/20220429152422/https://ionathan.ch//2021/11/24/inconsisten- cies.html.Google Scholar
Chan, J. & Bowman, W. J. (2020) Practical sized typing for coq. Available at: https://arxiv.org/abs/1912.05601.Google Scholar
Chen, J., Hawblitzel, C., Perry, F., Emmi, M., Condit, J., Coetzee, D. & Pratikaki, P. (2008) Type-preserving compilation for large-scale optimizing object-oriented compilers. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/1375581.1375604.CrossRefGoogle Scholar
Chlipala, A. (2007) A certified type-preserving compiler from lambda calculus to assembly language. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/1250734.1250742.CrossRefGoogle Scholar
Chlipala, A. (2013) Certified Programming with Dependent Types - A Pragmatic Introduction to the Coq Proof Assistant. MIT Press. Available at: http://adam.chlipala.net/cpdt/.Google Scholar
Cong, Y. & Asai, K. (2018a) Handling delimited continuations with dependent types. Proc. ACM Program. Lang. (PACMPL). 2(ICFP). 10.1145/3236764.CrossRefGoogle Scholar
Cong, Y. & Asai, K. (2018b) Shifting and resetting in the calculus of constructions. In International Symposium on Trends in Functional Programming (TFP). Available at: https://sites.google.com/site/youyoucong212/tfp-2018.Google Scholar
Cong, Y., Osvald, L., Essertel, G. M. & Rompf, T. (2019) Compiling with continuations, or without? whatever. PACMPL 3(ICFP), 79:179:28. 10.1145/3341643.Google Scholar
Coquand, T. & Huet, G. (1988) The calculus of constructions. Inf. Comput. 76(2–3). 10.1016/0890-5401(88)90005-3.CrossRefGoogle Scholar
Flanagan, C., Sabry, A., Duba, B. F. & Felleisen, M. (1993) The essence of compiling with continuations. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/155090.155113.CrossRefGoogle Scholar
Gordon, A. D. (1995) Bisimilarity as a theory of functional programming. In Eleventh Annual Conference on Mathematical Foundations of Programming Semantics, MFPS 1995, Tulane University, New Orleans, LA, USA, March 29–April 1, 1995. Elsevier, pp. 232252. 10.1016/S1571-0661(04)80013-6.CrossRefGoogle Scholar
Gu, R., Koenig, J., Ramananandro, T., Shao, Z., Wu, X. n., Weng, S.-c., Zhang, H. & Guo, Y. (2015) Deep specifications and certified abstraction layers. In Symposium on Principles of Programming Languages (POPL). 10.1145/2775051.2676975.CrossRefGoogle Scholar
Gu, R., Shao, Z., Chen, H., Wu, X. N., Kim, J., SjÖberg, V. & Costanzo, D. (2016) CertiKOS: An extensible architecture for building certified concurrent OS kernels. In Symposium on Operating Systems Design and Implementation (OSDI). Available at: https://www.usenix.org/conference/osdi16/technical-sessions/presentation/gu. Herbelin, H. (2005) On the degeneracy of $\Sigma$-types in presence of computational classical logic. In International Conference on Typed Lambda Calculi and Applications. 10.1007/11417170_16.Google Scholar
Herbelin, H. (2009) On a few open problems of the calculus of inductive constructions and on their practical consequences. Updated 2010. Available at: https://web.archive.org/web/20181125182737/http://pauillac.inria.fr/ herbelin/talks/cic.pdf.Google Scholar
Herbelin, H. (2012) A constructive proof of dependent choice, compatible with classical logic. In Symposium on Logic in Computer Science (LICS). 10.1109/lics.2012.47.CrossRefGoogle Scholar
Huet, G. (1986) Formal Structures for Computation and Deduction. Technical report. Available at: http://web.archive.org/web/20220408113829/http://pauillac.inria.fr/ huet/PUBLIC/Formal_Structures.ps.gz.Google Scholar
Kennedy, A. (2007) Compiling with continuations, continued. In International Conference on Functional Programming (ICFP). 10.1145/1291220.1291179.CrossRefGoogle Scholar
Lennon-Bertrand, M., Maillard, K., Tabareau, N. & Tanter, E. (2022) Gradualizing the calculus of inductive constructions. ACM Trans. Program. Lang. Syst. 10.1145/3495528.CrossRefGoogle Scholar
Leroy, X. (2009) A formally verified compiler back-end. J. Autom. Reas. 43(4). 10.1007/s10817-009-9155-4.Google Scholar
Levy, P. B. (2012) Call-By-Push-Value. Available at: https://www.worldcat.org/oclc/7330961929.Google Scholar
Luo, Z. (1990) An Extended Calculus of Constructions. Ph.D. thesis. University of Edinburgh. Available at: http://www.lfcs.inf.ed.ac.uk/reports/90/ECS-LFCS-90-118/.Google Scholar
Maurer, L., Downen, P., Ariola, Z. M. & Peyton Jones, S. (2017) Compiling without continuations. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/3062341.3062380.CrossRefGoogle Scholar
Miquey, é. (2017) A classical sequent calculus with dependent types. In European Symposium on Programming (ESOP). 10.1007/978-3-662-54434-1_29.CrossRefGoogle Scholar
Morrisett, G., Walker, D., Crary, K. & Glew, N. (1999) From system F to typed assembly language. ACM Trans. Program. Lang. Syst. (TOPLAS). 21(3). 10.1145/319301.319345.CrossRefGoogle Scholar
Necula, G. C. (1997) Proof-carrying code. In Symposium on Principles of Programming Languages (POPL). 10.1145/263699.263712.CrossRefGoogle Scholar
Necula, G. C. & Rahul, S. P. (2001) Oracle-based checking of untrusted software. In Conference Record of POPL 2001: The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, London, UK, January 17–19, 2001. ACM, pp. 142154. 10.1145/360204.360216.CrossRefGoogle Scholar
Oury, N. (2005) Extensionality in the calculus of constructions. In Theorem Proving in Higher Order Logics, 18th International Conference, TPHOLs 2005, Oxford, UK, August 22–25, 2005, Proceedings. Springer, pp. 278293. 10.1007/11541868_18.CrossRefGoogle Scholar
Patterson, D., Perconti, J., Dimoulas, C. & Ahmed, A. (2017) FunTAL: Reasonably mixing a functional language with assembly. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/3062341.3062347.CrossRefGoogle Scholar
Pédrot, P. (2017) A parametric CPS to sprinkle CIC with classical reasoning. In Workshop on Syntax and Semantics of Low-Level Languages. Available at: https://web.archive.org/web/20220122222238/https://www.cs.bham.ac.uk/zeilbern/lola2017/abstracts/LOLA_2017_paper_5.pdf.Google Scholar
Pédrot, P.-M. & Tabareau, N. (2019) The fire triangle: How to mix substitution, dependent elimination, and effects. In Symposium on Principles of Programming Languages (POPL). 10.1145/3371126.Google Scholar
Perconti, J. T. & Ahmed, A. (2014) Verifying an open compiler using multi-language semantics. In European Symposium on Programming (ESOP). 10.1007/978-3-642-54833-8_8.CrossRefGoogle Scholar
Peyton Jones, S. L. (1996) Compiling Haskell by program transformation: A report from the trenches. In European Symposium on Programming (ESOP). 10.1007/3-540-61055-3_27.Google Scholar
Pitts, A. M. (1997) Operationally-based theories of program equivalence. In Semantics and Logics of Computation, Dybjer, P. & Pitts, A. M. (eds), Publications of the Newton Institute. Cambridge University Press, pp. 241298. 10.1017/CBO9780511526619.007.CrossRefGoogle Scholar
Sabry, A. & Felleisen, M. (1992) Reasoning about programs in continuation-Passing style. In LISP and Functional Programming (LFP). 10.1145/141478.141563.Google Scholar
Sabry, A. & Wadler, P. (1997) A reflection on call-by-value. ACM Trans. Program. Lang. Syst. (TOPLAS) 19(6). 10.1145/267959.269968.CrossRefGoogle Scholar
Saffrich, H. & Thiemann, P. (2020) Relating functional and imperative session types. Available at: https://arxiv.org/abs/2010.08261.Google Scholar
Sarkar, S., Pientka, B. & Crary, K. (2005) Small proof witnesses for LF. In International Conference Logic Programming (ICLP). 10.1007/11562931_29.CrossRefGoogle Scholar
Severi, P. & Poll, E. (1994) Pure type systems with definitions. In International Symposium Logical Foundations of Computer Science (LFCS). 10.1007/3-540-58140-5_30.CrossRefGoogle Scholar
Shao, Z. (1997) An Overview of the FLINT/ML Compiler. Available at: https://web.archive.org/web/20161125002746/http://cs.bc.edu/ muller/TIC97//Shao.ps.gz.Google Scholar
Shao, Z., League, C. & Monnier, S. (1998) Implementing typed intermediate languages. In International Conference on Functional Programming (ICFP). 10.1145/289423.289460.CrossRefGoogle Scholar
Shao, Z., Trifonov, V., Saha, B. & Papaspyrou, N. (2005) A type system for certified binaries. ACM Trans. Program. Lang. Syst. (TOPLAS) 27(1). 10.1145/1053468.1053469.CrossRefGoogle Scholar
Sozeau, M. (2008) Un environnement pour la programmation avec types dépendants. (An Environment for Programming with Dependent Types). Ph.D. thesis. Orsay, France: University of Paris-Sud. Available at: https://tel.archives-ouvertes.fr/tel-00640052.Google Scholar
Stecklein, J. M., Dabney, J., Dick, B., Haskins, B., Lovell, R. & Moroney, G. (2004) Error Cost Escalation through the Project Life Cycle. Technical report. NASA. Available at: https://ntrs.nasa.gov/search.jsp?R=20100036670.Google Scholar
Stewart, G., Beringer, L., Cuellar, S. & Appel, A. W. (2015) Compositional CompCert. In Symposium on Principles of Programming Languages (POPL). 10.1145/2676726.2676985.CrossRefGoogle Scholar
Stump, A. & Jenkins, C. (2018) Syntax and Semantics of Cedille. Available at: http://arxiv.org/pdf/1806.04709v3.Google Scholar
Tarditi, D., Morrisett, G., Cheng, P., Stone, C., Harper, R. & Lee, P. (1996) TIL: A type-directed optimizing compiler for ML. In International Conference on Programming Language Design and Implementation (PLDI). 10.1145/231379.231414.CrossRefGoogle Scholar
Thielecke, H. (2003) From control effects to typed continuation passing. In Symposium on Principles of Programming Languages (POPL). 10.1145/640128.604144.CrossRefGoogle Scholar
Timany, A. & Sozeau, M. (2017) Consistency of the predicative calculus of cumulative inductive constructions (pCuIC). arXiv preprint arXiv:1710.03912. Available at: https://arxiv.org/abs/1710.03912.Google Scholar
Vákár, M. (2017) In Search of Effectful Dependent Types. Ph.D. thesis. Oxford University. Available at: http://arxiv.org/abs/1706.07997.Google Scholar
Watkins, K., Cervesato, I., Pfenning, F. & Walker, D. (2003) A concurrent logical framework: The propositional fragment. In International Workshop on Types for Proofs and Programs (TYPES). 10.1007/978-3-540-24849-1_23.Google Scholar
Xi, H. & Harper, R. (2001) A dependently typed assembly language. In International Conference on Functional Programming (ICFP). 10.1145/507635.507657.CrossRefGoogle Scholar
Figure 0

Fig. 1: ECC syntax.

Figure 1

Fig. 2: ECC dynamic semantics (excerpt).

Figure 2

Fig. 3: ECC equivalence and subtyping (excerpt).

Figure 3

Fig. 4: ECC typing (excerpt).

Figure 4

Fig. 5: $\mathrm{CC}_{e}^{A}$ syntax.

Figure 5

Fig. 6: $\mathrm{CC}_{e}^{A}$ typing (excerpt).

Figure 6

Fig. 7: $\mathrm{CC}_{e}^{A}$ equivalence (excerpt).

Figure 7

Fig. 8: Composition of configurations.

Figure 8

Fig. 9: $\mathrm{CC}_{e}^{A}$ evaluation.

Figure 9

Fig. 10: $\mathrm{CC}_{e}^{A}$ continuation typing.

Figure 10

Fig. 11: $\mathrm{CC}_{e}^{A}$ model in eCIC (excerpt).

Figure 11

Fig. 12: Naïve ANF translation.

Figure 12

Fig. 13: Separate compilation definitions.

Figure 13

Fig. 14: Join-point optimized ANF translation.

Figure 14

Fig. A15: ECC syntax.

Figure 15

Fig. A16: ECC dynamic semantics (excerpt).

Figure 16

Fig. A17: ECC congruence conversion rules.

Figure 17

Fig. A18: ECC equivalence and subtyping

Figure 18

Fig. A19: ECC typing

Figure 19

Fig. A20: ECC well-formed environments.

Figure 20

Fig. A21: $\mathrm{CC}_{e}^{A}$ syntax.

Figure 21

Fig. A22: $\mathrm{CC}_{e}^{A}$ equivalence

Figure 22

Fig. A23: $\mathrm{CC}_{e}^{A}$ evaluation.

Figure 23

Fig. A24: Composition of configurations.

Figure 24

Fig. A25: $\mathrm{CC}_{e}^{A}$ continuation typing.

Figure 25

Fig. A26: $\mathrm{CC}_{e}^{A}$ typing

Submit a response

Discussions

No Discussions have been published for this article.