I’d like to announce a result that might be interesting: an interpreter for a very small dependent type system in Coq, assuming uniqueness of identity proofs (UIP). Because it assumes UIP, it’s not immediately compatible with HoTT, but it seems straightforward to port it to HoTT by explicitly restricting the semantics to h-sets.
In his earlier blog post, Mike Shulman pointed out that a good foundational system for math should be able to handle its own metatheory. In particular, a foundational type theory should be able to interpret itself, modulo size issues. This would be the analogue of set theory defining its own standard model. Mike also pointed out that if HoTT can interpret even a small fragment of itself, then we have a definition of semisimplicial types, by interpreting a syntax representation of the type of n-truncated semisimplicial types, for all n. The same approach seems able to solve many other coherence problems that arise with non-truncated objects.
Unfortunately, but perhaps unsurprisingly, defining an interpreter also seems to run into coherence problems. Mike had them, and he discussed them in his post. I also ran into coherence problems, which is why I resorted to UIP. More on that below. But before you even get up to the issue of coherence, there’s a tricky aspect to defining an interpreter that I should discuss, since it’s the problem that I do seem to have solved.
Dependent Types Systems are in a Knot:
An annoying thing about type theory is that everything you’re allowed to say has to make sense. I find it annoying because when I’m trying to figure out something new, I am typically unable to make sense at first; I have to play around with some plausible-seeming nonsense until I understand all the details to get right before it technically makes sense. Type theory does not provide a setting for this early development to take place. Luckily there’s paper and mental imagery for that.
When writing an interpreter, the annoyance of everything needing to make sense becomes an actual challenge. From a naïve programming point of view, interpreters should be easy: you analyze the syntax, and in each case you build the appropriate form of object. Done. But in a type system, you’re not building anything unless the expression to build it is well-typed. Interpreted objects will have various types, so to interpret a term as an object of the appropriate type, we need to first interpret that type. And to interpret open terms and types, we need an environment to interpret the variables. But an environment is an element of a context, considered as a type. So we need to interpret contexts first, so that we can interpret terms and types as maps out of the appropriate context. But types have terms in them, and contexts have types in them, so there’s no right order in which to separately define interpreters for contexts, types, and terms, one after another. And the usual style of mutual recursive definition can’t handle one mutually defined function being used in the type of another one. Dependent type theory is tied up in this crazy knot.
It seems that we haven’t yet figured out how to untie this knot, because apparently no one has yet written an interpreter for dependent type theory within dependent type theory. Not if the input is truly just plain old syntax and typing derivations (“raw syntax”).
The language to interpret was designed to be the easiest-possible language that still has n-truncated semisimplicial types. There is one universe, and pi types. The universe is Tarski-style, in that there’s an
El operator to lift universe elements to types. But there are no constructors for type codes. There is application, but no lambda. Also, function terms can only be applied to variables. Thus, a term is just the application of a head variable to a list of argument variables.
The language has no redexes, so there is no need for reduction rules or equality rules. Capture-avoiding substitution is nonetheless defined as usual, and is used in the typing rules. Well actually, substitution is a bit different from usual, since we only need to substitute variables for variables.
The language is interpreted, feature-for-feature into Coq. In other words, we use a fragment of Coq’s type theory as the model of the object language. No setoids or logical relations or anything. Contexts are interpreted as left-nested sigmas, judgments (with subject omitted) are interpreted as (dependent) function types, and so a derivation is interpreted as a function. This seems to be the sensible notion of “standard interpretation” for a Church-style dependent type system. It’s analogous to the tagless interpreters already known for simply-typed languages.
The development of the interpreter itself is done, but it could use prettying up, and I haven’t done anything with it yet. The code is available online. I would welcome any constructive criticism (Classical criticism isn’t good enough. 😉 ), ideas for simple improvements, discussion of related work, etc. I would especially welcome discussions about what’s going wrong that’s preventing us from working with coherence of non-truncated types, and what to do about it.
How it Works:
My interpreter is based on a strongly-typed representation of dependently-typed terms and types; I’ll explain the background. Recall that type theory is tied in a knot. Most of the successful attempts to partially untie the knot use inductive-inductive definitions of the language to interpret. Mike spent most of his post talking about trying to use this approach for HoTT and semisimplicial types. The elimination rule for inductive-inductive types basically gives you (almost?) exactly what you want in order to interpret contexts, types, and terms mutually. But inductive-inductive types are not regular syntax types. They’re kind of like syntax, but terms in this style are intrinsically well-typed, rather than being proven well-typed with a separate derivation. The unfortunate result is that constructing an inductive-inductive representation of an object is about as tricky as just directly constructing the object. And because of that, it seems hard just to interpret from raw syntax to inductive-inductive representations. And of course, it’s hard to define inductive-inductive representations of semisimplicial types.
My interpreter doesn’t use that kind of strongly-typed representation. A major nuisance with the inductive-inductive approach is the need for explicit conversions between represented types that are judgmentally equal in the object language. In fact, for HoTT, this seems to be more than just a nuisance because now we have to worry about judgmental coherences to keep these explicit “judgmental transports” from getting in the way. And yes, you get an infinite regress where you seem to need ever higher coherences.
A Totally Outrageous Solution:
A different kind of strongly-typed representation was developed in Conor McBride’s paper Outrageous but Meaningful Coincidences. The “Outrageous” approach is essentially to use the metalanguage’s types to classify the object language’s terms. Amazingly, it works: you get another way to represent intrinsically well-typed terms that doesn’t use judgmental equality in the object language, effectively using the metalanguage’s judgmental equality instead. This does away entirely with any issue of judgmental coherence. And this appropriation of the metalanguage equality just so happens to be exactly what we want for a standard interpretation. Conor’s paper used inductive-recursive types to develop the representation, but there’s a trick for imitating certain inductive-recursive types with inductive families. Inductive families are available in Coq, so I decided to try writing an interpreter in Coq, based on Conor’s outrageous structures. Indeed, the interpreter I came up with uses many inductive families based on Conor’s type definitions.
We still have a knot to untie, since we’re starting from raw syntax. How do inductive families help us? Well we need to figure out types for the interpretation functions such that we can actually try to define them by recursion. Using the type interpreter in the type of the term interpreter is too circular because types have terms in them. What if we try to just interpret terms as pairs? Some type with some element of that type? The induction hypothesis would be too weak; things won’t fit together.
In set theory, a function is just a relation that’s functional. What I did was first define interpretation relations, then prove that for all inputs, there’s at most one output (determinism), then prove that for all inputs, there’s at least one output (totality). So basically I defined some functional relations. The actual type theoretic functions are hiding in the constructive totality proofs. This may seem cumbersome and unhelpful, but in fact it’s merely cumbersome. The relations give us the strengthened induction hypotheses we need to make things fit together. Almost. It didn’t solve coherence.
Inductive-recursive types, and especially their inductive family imitations, are kind of like inductively-defined subsets. The outrageous representation is inductively defining the subset of the metalanguage’s objects that are expressible in the object language. The represented terms are the proofs that the corresponding object is expressible. We can forget the proof, just keeping the object; that’s how outrageous terms are “interpreted”. To get from outrageous representations to the interpretation relations we need, we simply add an index for the corresponding raw syntax. That is, our outrageous terms are now proofs that some raw term is interpreted as a certain object of the metalanguage. We do not need to deal with type derivations here; the relation only deals with well-typed raw syntax. In fact, the definition of the interpretation relations looks an awful lot like the typing rules. Only when proving totality do we need to use derivations.
So in summary, the interpreter is totality proofs for a relation-ized outrageous representation of the language.
Odds and Ends:
In fact, why do we actually need to prove determinism? What we really want is the constructive totality proofs. Well, we can get through many of the cases of totality without needing determinism. The induction hypothesis gives us what we need. But the cases that correspond to typing rules which involve a type equality check get stuck. For my subset, that’s just the application rule, in order to check the argument type against the function domain. We are stuck because we have two hypotheses telling us what the same syntactic type interprets to, and we need to know that those interpretations are the same, which is exactly what determinism would tell us.
Typing variables requires weakening a type from the context, and typing application requires substitution. Handling this requires a semantic notion of substitutions, and proofs that interpretation respects substitutions. In other words, interpreting a syntactically substituted raw term is the same as interpreting the original term, then semantically substituting. (This would be a commutative square, but the interpreter is still just a relation.) These proofs take the form of an implementation of substitutions for the outrageous representation. Although I eventually got this working fine, it’s actually the most complicated part of the development.
For me, the least intuitive aspect of the outrageous representation is that it wraps up the metalanguage types in a universe construction. Why would you need a universe construction only to then wrap the types up in another, more specific layer of structure for expressibility? In fact there are several simplifications you could make to the outrageous representation if you were thinking of it merely as intrinsically valid syntax. But it turns out you need all that structure for various lemmas ultimately used by the totality proofs. (I don’t know how Conor anticipated the need for it all.) In particular, the universe construction is vital for getting anywhere with the determinism proofs. Without it, the determinism proofs’ induction hypotheses are too weak, because you can’t do enough with equalities on constructed types. For example, given equal function types, we would like equal domains and codomains (injectivity). And given equal (type * term) pairs, when the types are the universe, we would like the terms to be equal as types (disjointness).
The Coherence Problem:
Unfortunately, the application case causes trouble again in the determinism proof, and this time I couldn’t think of any trick. The problem is once again that the domain and argument type need to agree. But this time it’s a coherence problem. We actually have two reasons why those types are connected, and various other paths we need to contract lie over one or the other. But we don’t have the coherence between the type paths that would allow as to contract everything in order to solve the goal with reflexivity. This is where I assumed UIP to finish the proof.
The universe construction is not enough to tell us about an arbitrary type because of type variables. Semisimplicial types are full of type variables, and the problem is to allow arbitrary types for those variables. To avoid demanding any additional structure on abstract types, the
El constructor just lifts a plain
Type into the universe construction. We have no way of analyzing a path in an abstract type; we need any paths we deal with to be coherent “by construction”.
But the annoying thing is: they are coherent by construction! All the interpreter started with was raw syntax; we constructed all the paths ourselves. And since we didn’t use any HoTT axioms, we’re consistent with UIP, so we couldn’t possibly have constructed disconnected parallel paths. Sadly, intentional type theory (ITT) lacks the ability to express strict equality, so in our induction hypotheses, we’ve forgotten that we built everything to fit together on the nose.
All type families in ITT respect weak equality, so with induction predicates we can only characterize things up to weak equality. Judgmental equality, on the other hand, is strict. In ITT, we only have judgmental equalities expressing computation by unfolding of definitions. But this definitional equality ought to be enough in some sense. We’re defining an interpreter, and it builds things that fit together on the nose by definition. The problem seems to be that we need to know this while we’re still defining it, to handle the tough cases. ITT doesn’t seem to provide any way to use the defining equations of a recursive function while defining it, to help use the recursive call. So it seems to me like ITT currently has no way to define a standard interpretation function. But I have no rigorous argument.
For solving the coherence problem, we don’t have to wait until we understand what’s going wrong in ITT. We can extend the system to ensure that the desired constructions can be carried out. Here are some ideas I know of:
If I understand correctly, one of the motivations for Vladimir Voevodsky’s proposed HTS is that it would overcome the kind of coherence problems that come up with semisimplicial types. Hopefully it would allow non-truncated interpreters to be defined as well. The reason why HTS is expected to solve the problem was discussed on the HoTT Google group. HTS provides a universe of non-fibrant types, where type families need not respect weak equality. In this universe is a strict equality type. This type can be used in an induction predicate to remember that a construction fits together on the nose while you build it. But the finishing move is to somehow get a fibrant function out of the induction even though strict equality is not fibrant. HTS will allow this because many inductive types, including the type of natural numbers, are fibrant and also allowed to eliminate into non-fibrant types. Once the induction is done, the strict equalities can be forgotten, yielding a fibrant function.
Adding a whole ‘nother universe is a pretty heavy solution just to get some new constructions, important as they may be. But the more serious problem is that, if Mike’s concerns are correct, non-fibrant elimination for nats is not justified by enough models of HoTT. The ideal solution to the coherence problem will allow a construction of semisimplicial types (and other things) that works uniformly for as many models as possible. It’s not yet clear that non-fibrant elimination accomplishes that.
Limits of Towers:
On the Google group, subject “Infinitary type theory”, Mike has asked about infinitary type theory, proposing infinitary rules that axiomatize limits of externally-defined towers. Because the definitions are external, they wouldn’t run into the usual coherence problems of certain internal towers. However, Mike doesn’t have a specific proposal yet for finitely presenting the infinitary terms, AFAIK. The details could turn out to be complex. If you want to know more about this idea, ask Mike!
Assuming the existence of a sufficiently complete standard interpretation function could do the trick. It would be kind of a nuisance though, since you’d need to define your coherent structures as syntax, and prove them well-typed to interpret them. Also, internal interpretation functions would be awfully complicated things for all models to have to provide. Jason Gross has pointed out a paper describing a primitive interpreter for Agda. It’s not ready-made for HoTT, but that’s the kind of thing I’m thinking of.
It was Mike’s observation that the problems of interpreting dependent type systems and defining semisimplicial types (both in HoTT) are related, and both unsolved. If it weren’t for his post, I wouldn’t be here thinking about HoTT.
The cleverest parts of my interpreter are adapted from Conor McBride’s paper (Outrageous but Meaningful Coincidences). You should read it! I have not discussed the outrageous representation in much detail. Also, many of the ingredients of the outrageous representation have their own story, and Conor provides the citations.
I’ve been talking with others about the interpreter problem on and off for several months, mainly on the comments area of Mike’s post. Mike and Jason Gross have been especially active in this discussion. Thanks to this I had a lot of fun, and learned a few things.
I’m wondering what I should do next, in the collective/aggregate opinion of the HoTT community. Although the main reason I wrote an interpreter was not for HoTT, I am interested in HoTT, and I would like to help if it isn’t too big a detour, or if someone would hire me.
Even without solving the coherence problems, interpretation functions would be useful because they are necessary to fully formalize models of HoTT, working in HoTT or any other dependent type system. They are also useful for dependently typed generic programming, and for justifying powerful forms of reflective proof systems, which is what got me interested in the problem.
Here are some ideas for what I might do:
- Spend some time fixing up style issues with the current code, in case people have many such issues.
- Prove things (what things?) about the interpreter.
- Define functions (what functions?) using the interpreter.
- Port the interpreter to HoTT Coq, using explicit h-sets.
- Expand the subset of type theory handled by the interpreter, sticking to the standard semantics.
- Try to generalize the interpreter to use other models for the semantics.
- Try an alternative approach to interpreters based on taking a limit of restricted interpreters.
Request for Advice:
I am hoping to publish a paper about this kind of interpreter, but I’ve never published papers before. I would really appreciate opinions and advice on how to go about this, like:
- What journals or conferences would be good for this formalized interpreter written in Coq? How do I pick one?
- Is this interpreter a significant enough result to be published?
- How do I go about tracking down the papers I should cite?
- What nasty surprises do I need to deal with to publish something?