In this post I will argue that, improving on previous work of Awodey-Frey-Speight, (higher) inductive types can be defined using impredicative encodings with their full dependent induction principles — in particular, eliminating into all type families without any truncation hypotheses — in ordinary (impredicative) Book HoTT without any further bells or whistles. But before explaining that and what it means, let me review the state of the art.

Categorically, a (higher) inductive type is an initial object in some category. For instance, the coproduct of $A$ and $B$ is the initial object of the category of types $X$ equipped with maps $A \to X$ and $B \to X$; and the circle is the initial object of the category of types $X$ equipped with a point $x : X$ and an equality $x = x$. A (H)IT is therefore a kind of “colimity” thing, while an “impredicative encoding” of it is a way to construct it (or an approximation thereof) using *limits* instead — specifically, *large* limits, which requires a universe closed under impredicative quantification (i.e. $\Pi$-types whose domain is not necessarily an element of the universe).

The most basic such encodings (usually associated with System F) approximate the initial object of a category by *the product of all objects of that category*. For instance, the System F coproduct is the product of all types $X$ equipped with maps $A \to X$ and $B \to X$, which in type-theoretic syntax becomes

$$A + B \;:\equiv\; \prod_{X : U} (A \to X) \to (B \to X) \to X.$$

Such impredicative encodings of ordinary inductive types are well-known in type theory, and back when we were first discovering HITs I blogged here about the fact that they can also be encoded impredicatively. However, these basic encodings have the problem that they don’t satisfy the “uniqueness principle” or $\eta$-conversion, and (equivalently) they don’t support a “dependent eliminator” or “induction principle”, only the non-dependent “recursion principle”. In categorical language, the *product* of all objects of a category, even if it exists, is only a weak initial object.
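To make the System F encoding concrete, here is a sketch in TypeScript, whose generic function types stand in (only as an illustration, not a faithful model of impredicativity) for the $\forall$ quantifier; the names `Coprod`, `inl`, `inr`, and `cases` are mine, not from the post:

```typescript
// System F-style coproduct: the product over all X of (A -> X) -> (B -> X) -> X.
type Coprod<A, B> = <X>(l: (a: A) => X, r: (b: B) => X) => X;

const inl = <A, B>(a: A): Coprod<A, B> => (l, _r) => l(a);
const inr = <A, B>(b: B): Coprod<A, B> => (_l, r) => r(b);

// The non-dependent recursion principle is just application;
// there is no dependent induction principle, matching the weak initiality
// of the encoding described above.
const cases = <A, B, X>(c: Coprod<A, B>, l: (a: A) => X, r: (b: B) => X): X =>
  c(l, r);
```

For instance, `cases(inl<number, string>(3), n => n + 1, s => s.length)` computes to `4` by unfolding the application.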

Last year, Sam Speight blogged here about a way to fix this problem (paper here with Awodey and Frey), by defining a (higher) inductive type to be the *limit of the identity functor* of the category in question, and appealing to this theorem. Such a limit, if constructed out of products and equalizers, appears as a subobject of the product of all objects; and if we “compile out” this definition in type theory, it adds additional assertions that the System F encoding data is *natural*. For instance, the coproduct becomes

$$A + B \;:\equiv\; \sum_{\alpha : \prod_{X} (A \to X) \to (B \to X) \to X}\;\; \prod_{X, Y}\; \prod_{f : X \to Y}\; \prod_{i : A \to X}\; \prod_{j : B \to X}\; f(\alpha_X\, i\, j) = \alpha_Y\, (f \circ i)\, (f \circ j).$$

This approach is very nice, but unfortunately it only works for h-sets, i.e. 0-types. That is, given two sets $A$ and $B$, it produces a coproduct set $A + B$, which has a dependent induction principle that can only eliminate into other sets. Categorically, the point is that the construction of limits out of products and equalizers works in a 1-category, but not in a higher category. We can boost up the dimension one at a time by adding additional coherence conditions — the paper ends with an example constructed as a 1-type with a dependent induction principle that can eliminate into other 1-types (but not higher types) — but this approach offers no hope of an induction principle into types that are not $n$-truncated for any finite $n$. If we could solve the “problem of infinite objects” (perhaps by working in a context like two-level type theory) maybe we could add “all the coherence conditions at once”, although the path algebra would probably get quite hairy.

Surprisingly (to me), it turns out that there is a different solution, which works in (impredicative) Book HoTT without any bells or whistles. Recall that theorem about initial objects as limits of identity functors, and note that it’s proven using this lemma, which says that if $\ell$ is an object of a category equipped with a natural transformation from the constant functor at $\ell$ to the identity functor (i.e. a cone consisting of maps $\theta_X : \ell \to X$ for each object $X$, such that $f \circ \theta_X = \theta_Y$ for every morphism $f : X \to Y$), and moreover $\theta_\ell$ is the identity morphism, then $\ell$ is initial. Can we use this lemma directly?

You might think this is no better, since even obtaining a fully-coherent $\infty$-natural transformation already involves infinitely many coherence conditions. However, inspecting the *proof* of this lemma, we see that it will work in HoTT even for an “incoherent” natural transformation, essentially because the definition of contractibility is a propositions-as-types translation of being a singleton set. That is, to show that $\ell$ is initial, we want to show that each hom-type $\hom(\ell, X)$ is contractible. We take its center to be $\theta_X$, and then we must show that any $f : \ell \to X$ is equal to $\theta_X$. But by incoherent naturality, we have $f \circ \theta_\ell = \theta_X$, which is equal to $f$ since $\theta_\ell = 1_\ell$.
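Spelled out, the uniqueness half of contractibility is a short chain of equalities, using only one naturality square and the hypothesis $\theta_\ell = 1_\ell$:

```latex
f \;=\; f \circ 1_\ell \;=\; f \circ \theta_\ell \;=\; \theta_X
\qquad \text{for any } f : \ell \to X .
```

No higher coherence of $\theta$ is ever invoked, which is the whole point.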

This is great, because the 0-type version of the Awodey-Frey-Speight construction, if written down with the full universe in place of the universe of sets, already comes with an incoherent cone over the identity functor. So “all” we need to do is ensure $\theta_\ell = 1_\ell$. (Note that this is a version of the $\eta$-rule for an inductive type.)

Here’s the trick. Note that the proof of the lemma shows $\ell$ is initial by applying the naturality property in a case where one of the objects is $\ell$ itself. Let’s try to prove $\theta_\ell$ is the identity by applying that same naturality property in the case where the *morphism* is $\theta_\ell$ itself. This gives us $\theta_\ell \circ \theta_\ell = \theta_\ell$. In other words, although $\theta_\ell$ may not (yet) be the identity, it’s already the next best thing: an *idempotent*. And in the words of Robinson and Rosolini,

No category theorist can see an idempotent without feeling the urge to split it.

This is especially true when we were hoping that that idempotent would turn out to be an identity, since splitting it is a way to “make” it an identity. And in fact it’s not hard to show with categorical algebra that if $r : \ell \to \ell'$ and $s : \ell' \to \ell$ are a splitting of $\theta_\ell$ (so that $s \circ r = \theta_\ell$ and $r \circ s = 1_{\ell'}$), where $\theta$ is a cone over the identity functor, then $\ell'$ is initial. For we have another cone $\theta'$, whose $X$-component is $\theta_X \circ s$. And we have $\theta'_{\ell'} = 1_{\ell'}$, using naturality and the splitting equations, so the lemma applies. Note that this argument also uses no higher coherence, and so should work just fine for an incoherent natural transformation.
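The key computation, that the new cone’s component at $\ell'$ is an identity, uses one naturality square (for the morphism $r$) and both splitting equations:

```latex
\theta'_{\ell'}
  \;=\; \theta_{\ell'} \circ s
  \;=\; r \circ \theta_\ell \circ s
  \;=\; r \circ (s \circ r) \circ s
  \;=\; (r \circ s) \circ (r \circ s)
  \;=\; 1_{\ell'} .
```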

What’s left is to split the idempotent in type theory. Fortunately, I wrote a paper a few years ago (original blog posts here and here) about precisely this problem, inspired by analogous results of Lurie in higher category theory. It turns out that an arbitrary “incoherent idempotent” (a map $f : X \to X$ equipped with an equality $f \circ f = f$) may not be splittable, but as soon as the “witness of idempotency” satisfies *one* additional coherence condition (not the infinite tower of such one might expect), the map can be split. And in our situation, we can obtain this one additional coherence condition for the naturality triangle by using the 1-type version of the Awodey-Frey-Speight construction. (Actually, we can even omit their unit condition; all we need is the composition coherence for pseudonaturality.)
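At the set level, where no extra coherence is needed, splitting an idempotent is easy to illustrate: the splitting object is the image of the map, the retraction is the map itself, and the section is the inclusion. A minimal sketch (the names `split` and `Retract` are mine, purely illustrative):

```typescript
// An idempotent e : X -> X satisfies e(e(x)) = e(x) for all x.
type Idempotent<X> = (x: X) => X;

// A splitting of e through Y: maps r : X -> Y and s : Y -> X
// with s ∘ r = e and r ∘ s = id on Y.
interface Retract<X, Y> {
  r: (x: X) => Y;
  s: (y: Y) => X;
}

// At the set level, e splits through its own image: take r = e and s = inclusion.
// Then s(r(x)) = e(x), and r(s(y)) = e(y) = y whenever y is in the image of e.
function split<X>(e: Idempotent<X>): Retract<X, X> {
  return { r: e, s: (y) => y };
}
```

The higher-categorical content of the post is exactly that, above the set level, the equations `s ∘ r = e` and `r ∘ s = id` must themselves be witnessed by coherent paths, and one extra coherence on the witness of `e ∘ e = e` suffices to produce them.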

I’ve verified in Coq that this works for coproducts, using `type-in-type` for impredicativity. (I don’t think Coq’s built-in `impredicative-set` option is compatible with the HoTT library, which has tried to excise Coq’s built-in smallest universe `Set` as much as possible.) Most of the proof is quite easy; the longest part (40 lines of path algebra) is deducing the idempotence coherence condition from the naturality coherence condition. As expected, we only get typal computation rules for the induction principle — although we do have definitional computation rules for the *recursion* principle (i.e. the non-dependent eliminator). More precisely, there is *a* non-dependent eliminator (a special one, not just the dependent eliminator specialized to a non-dependent motive) that satisfies a definitional computation rule.

I’m working on coding it up for higher inductive types as well, but the path algebra gets quite annoying. Further bulletins as events warrant.

To conclude, let me point out that I find this a very satisfying answer to the question with which I ended my second idempotents post:

… [For equivalences] we also have a “fully coherent” notion … that is a retract of a type of “partially coherent” objects…. Are there any other structures that behave like this? Are there any other “fully coherent” gadgets that we can obtain by splitting an idempotent on a type of partially-coherent ones?

The answer is yes: a fully-coherent impredicative encoding can be obtained by splitting an idempotent on a partially-coherent one.

Hmm, apparently WordPress posted this for me even though I wasn’t quite ready yet. I intended to add the remark that this construction of (H)ITs requires that we already have the natural numbers, which are used in the construction of idempotent splitting. So we can’t use it to define the natural numbers impredicatively. I also don’t know how generally it will work for other recursive HITs: I don’t immediately see an obstacle, but if it works for “LS section 9” then that would mean that impredicativity is more powerful than I ever realized.

Oh, and I also meant to add that we need function extensionality. Of course that was already true for the Awodey-Frey-Speight version, but it’s “even more true” here because idempotent-splitting also requires function extensionality.

Univalence, however, is not required.

Wow – fantastic!

This should open up all kinds of possibilities.

Yes — here’s one additional, slightly different, possibility that just occurred to me. Suppose we have a subuniverse $U_S$ that we’d like to show is reflective. If it consists of the $f$-local types for some $f$, then we can express this as a HIT localization and impredicatively encode the latter. But in fact, as soon as $U_S$ is closed under limits (which is always true for the $f$-local types), we should be able to reason more directly, starting from the naive impredicative encoding

$$L A \;:\equiv\; \prod_{B : U_S} (A \to B) \to B,$$

refining it a la AFS:

$$L A \;:\equiv\; \sum_{\alpha : \prod_{B : U_S} (A \to B) \to B}\;\; \prod_{B, C : U_S}\; \prod_{g : B \to C}\; \prod_{k : A \to B}\; g(\alpha_B\, k) = \alpha_C\, (g \circ k),$$

adding an extra level of coherence, then splitting an idempotent. So in impredicative HoTT, every subuniverse closed under limits should be reflective (i.e. we have the “dream adjoint functor theorem”).

Yup – this should also work the same way for algebras for endofunctors, since they’re also closed under limits, which are created by the forgetful functor. We do the set-level W-types this way in our paper, but now it should also work for arbitrary (polynomial) endofunctors. Similarly, I guess we can also prove the “dream fixed-point theorem”:

every endofunctor has a (“least”) fixed point.

In particular, the dream adjoint functor theorem should allow us to localize at large families of morphisms, which ordinary HITs do not. For instance, in this way we should be able to construct the hypercompletion modality, as well as homotopical localization and completion with respect to homology and cohomology theories. The latter are standard tools of classical algebraic topology, but so far we have not managed to construct them in HoTT; classically one only finds a small generating set by a cellular cardinality argument.

Very nice! About the classical situation: we don’t know whether we can localize with respect to an arbitrary cohomology theory, unless we assume Vopěnka’s principle. (Some references are at https://ncatlab.org/nlab/show/cohomology+localization) Is there a relationship between impredicativity and such large cardinal axioms?

However, there is a serious issue remaining with this approach to HITs (which may be related to Dan’s question about cohomological localization). A HIT constructed impredicatively with this construction can eliminate into types of unbounded h-level, but still only ones that belong to the impredicative universe. In particular, it can’t eliminate into the impredicative universe itself. However, in order to prove much of anything about a HIT (e.g. with encode-decode) we need it to eliminate into *some* univalent universe!

I think I recall that it’s inconsistent for one impredicative universe to contain another one. In impredicative Coq, the impredicative universe is the smallest universe; is there any other choice? Is it consistent for an impredicative universe to contain a predicative one? If not, then we may not actually be able to prove very much about these impredicative HITs.

Fiddling around with impredicative universes is one thing I eventually want to try. My intuition is that what’s really going on is you need to keep straight which function types are for functional *programs*, and which are for functional *relations*. With Coq’s `-impredicative-set`, `Set`’s function types are for programs, and `Type`’s function types are for anafunctions. I think you could put “predicative” universes inside impredicative `Set` as long as you realize that the functions still can’t be anafunctions, because you want cumulativity.

My hope was to use modes to enforce program/relation, instead of universe level, and give *all* the universes impredicative quantifiers! Caveat: The impredicative quantifiers themselves need to be anafunctions, but their elements cannot be.

But this may not be the only/best way to think about it. It may also be dead wrong, since I haven’t actually worked out details. Also, this thinking is all rather assemblies-ish. My impression was that toposes don’t play nice with proof-relevant impredicative quantifiers. And HoTT is supposed to be topos-like.

On the other hand, what I’m calling “anafunctions” are just those things that work like anafunctions from the metalanguage’s perspective. With honestly predicative HoTT, the functions all internally behave like anafunctions, but externally, they can be plain old programs. Maybe that can work to some extent with an impredicative HoTT too.

An impredicative universe can (and does) contain a predicative one: from the type of natural numbers one can construct a universe of finite sets in the usual way. Then one can even make it univalent by forming the “pregroupoid” of finite sets, and then using an impredicative encoding to make the groupoid quotient (aka Rezk completion), which will be (the indexing type of) a univalent universe with Sigma-, Pi-, and Id-types.

Hmm… but the groupoid quotient is a HIT, so it seems that the only way to characterize its identity types (and show that your universe is univalent) is if we could *already* eliminate from it into a univalent universe?

Let me get caught up – there’s already a lot going on here:

– yes the groupoid quotient is a HIT, but the idea is to make it by an impredicative encoding, like you can make a set-quotient by an impredicative encoding (I don’t want to write it out here, but I think you can imagine how it goes: homming out of the quotient has a universal property that can be unwound in the impredicative encoding).

– I don’t know what you mean by “to characterize the identity types you need to eliminate into a univalent universe”. Maybe you are not thinking of doing it impredicatively, but instead by forming the usual HIT? Either way, we do have the big impredicative universe which we might as well assume is univalent.

– we can separate the issue of impredicative encoding of groupoid quotients from making a predicative universe inside an impredicative one. Which part are you not happy with?

Yes, I assumed you meant to make the groupoid quotient by an impredicative encoding; that shouldn’t be a problem. Such a groupoid quotient would have the right universal property, but to show that it actually defines a univalent universe would still require characterizing its identity types (to prove univalence), presumably using an encode-decode along the lines of Step 3 of the second proof of Theorem 9.9.5 in the book. We can assume the big impredicative universe is univalent, but that doesn’t help because the groupoid quotient, like all the other impredicatively defined HITs, can’t eliminate into the big impredicative universe, only to types inside of it.

Put differently: if all we know is that we have one univalent impredicative universe (and perhaps other larger univalent universes containing it, but no HITs in the ordinary sense), then there’s nothing to prevent all types *inside* that impredicative universe from being sets. In that case the impredicatively defined HITs would still have the right universal property for eliminating into *other* types in the impredicative universe, which is all you get out of their construction; but globally, they would be more like the 0-truncation of the HITs we really want.

Yeah, to actually do a predicative universe construction inside the type theory, I’m pretty sure you need large elimination.

That’s the big remaining problem with the impredicative encodings of this post: that they still don’t provide large elimination.

I don’t really see the problem. As we say in the paper, *some* of the things that conventional wisdom says require large elims can actually be done differently. For example, we get the dependent elim from the simple elim via the eta rules. So are you saying that the result might fail to be a *universe* (how could that be — it’s a universe for everything that it classifies), or that it might fail to be univalent — i.e. to have the universal property of a groupoid quotient? The latter holds if the impredicative encoding works right to give the eta rules, which I think we agree is ok.

So it seems that the issue is just whether it has the right universal property with regard to large types, like the universe itself — which of course it won’t necessarily have. But then the question is again, for what do we need that?

I’m saying it can fail to be univalent. The universal property of a HIT groupoid quotient is different from saying that the quotient is univalent. The universal property is a “mapping out” property; it doesn’t tell you what the identity types of the object are, and in particular not about univalence, until you have some kind of large elims. Just like the unit type has the universal property of $S^1$ in the category of sets; the universal property of $S^1$ only distinguishes it from the unit type once you have large elims into a univalent universe.
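The point about $S^1$ can be seen by computing its mapping space into a set: a map out of the circle is a point together with a loop at that point, and in a set every loop is trivial, so

```latex
(S^1 \to A) \;\simeq\; \sum_{a : A} (a = a) \;\simeq\; A \;\simeq\; (\mathbf{1} \to A)
\qquad \text{whenever } A \text{ is a set,}
```

which is exactly the universal property that the unit type also has.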

Being univalent for a pre-groupoid just means that it is a 1-type and its pre-groupoid structure is its identity type. The universal property implies that the quotient is a 1-type, so it does indeed say something about the identity types — namely that they are (at most) sets. Are you saying that we don’t know that the quotient actually is a *proper* 1-type because we don’t know whether there are any (other) proper 1-types? We don’t need the universe for this.

That’s right: for all we know, the “groupoid quotient” might actually be a 0-type. I suppose technically we might not need a universe, but univalence is the only axiom I know of that guarantees that non-0-types exist, and encode-decode into a univalent universe (and related arguments) is the only way I know of to prove that a particular type is not a 0-type and characterize its identity types.

I can see how this argument might be modified to show in a hypothetical model that a universe of “modest $\infty$-groupoids” contains a univalent universe of finite sets.

However, I think a univalent universe containing only finite sets is of limited usefulness. In particular, in order to show something like $\Omega S^1 = \mathbb{Z}$ we need to eliminate from $S^1$ into a univalent universe that contains $\mathbb{Z}$.

I was just giving an example to answer your question of whether it is consistent to have a (univalent) predicative universe inside an impredicative (univalent) one. If you now want a bigger one, we can look for that, too : – )

Anyway, I think it’s better to have the impredicative universe at the “top”, rather than at the bottom of a hierarchy of predicative universes.

Maybe if we built an impredicative universe of modest $\infty$-groupoids based on a PCA of inaccessible cardinality? (-:O

Maybe … I haven’t considered such things. But I don’t see any reason why we can’t have a small predicative universe containing the natural numbers in the big impredicative one.

That’s what I was trying to imagine how to construct. An obvious set-theoretic point of view is that a universe containing the natural numbers must be inaccessibly large, so I was trying to imagine how such a thing could sit inside a universe of modest things.

Quite-comfortable predicative universes work with modest sets based on a countable PCA. This is essentially what the semantics of Nuprl’s CTT has. I think this is also how universes work in Feferman’s explicit mathematics.

The trick is to have type codes as programs of the PCA, separately have “semantic types” as modest sets or whatever, then relate them by a partial interpretation function. The partial interpretation is outside the PCA, but the type codes give you a way to handle types from within. Since the predicative universes really are predicative, you can just define them one after another, using the partial interpretation functions as the modest sets for the universes themselves.

From a size perspective, having all sorts of pretoposes as universes doesn’t seem problematic, but the programs of the PCA aren’t really relations.

Oh, I think I see how that might work; thanks! And Harper et al.’s computational cubical type theory is built on a PCA like that too, right? So their PER model, with all its univalent universes, ought to live inside an impredicative universe of modest objects in an ambient “realizability $\infty$-topos”. Nice!

Well, computational cubical type theory seems more like a PCA-based *model category* for a predicative HoTT. So in terms of size, you’re OK, but I don’t know offhand how to get the predicative universes to agree with the impredicative one about paths. Maybe it’s easy, or hard, or impossible.

Do we even know for sure that a proof-relevant, impredicative universe is consistent with, say, propositional truncation? (Including the consequent unique choice principle?) I repeat that my impression was that a proof-relevant impredicative universe doesn’t work in a topos. But I’m fuzzy on the details. Maybe the situation is that proof-relevant quantifiers range only over projective objects?

The modest sets in the assemblies on a PCA are an example of a proof-relevant, impredicative universe in a setting with propositional truncation, since the category of assemblies is regular.

Yes, but the notion of proposition in assemblies is different from HoTT: you can’t prove unique choice. Well, semantically I’d rather think that’s because the notion of function is different. Unique choice collapses functions and anafunctions.

I have no idea what you mean. There’s no non-trivial HoTT in assemblies, because everything is a 0-type. The notion of a proposition in a regular category is defined by image factorization. There’s no room for anything to be “different” here. This is an example of an impredicative, proof relevant universe in a setting with propositional truncation (period). The only way around that is by interpreting something in a non-standard way.

Nothing forces you to use functions rather than anafunctions just because you’re talking about assemblies or PERs. The computational type theory people seem to prefer functions, but I believe that if you define modest sets inside a realizability topos, the notion of “function” you get is an anafunction, and they’re still impredicative.

The question was whether it’s consistent to have a proof-relevant, impredicative universe in a setting with propositional truncation. The example I gave shows that it is.

My response was to Matt, not to you. I agree that your example shows that; I was trying to explain why this example behaves differently than he seems to expect assemblies to behave.

To reply to both of you, maybe I should back up and state some assumptions: We want an impredicative universe, whose type theoretic axiomatization is the same as for the PERs/modest sets within the category of assemblies. But we also want propositional truncation, whose axiomatization is like in Book HoTT.

So my concern is whether that actually has a model anyone knows of. By subsetting the impredicative universe to its h-props, it seems like you should get a subobject classifier. But I thought the category of assemblies doesn’t have a subobject classifier, and so we may be confusing what we have a model for.

In assemblies or PERs/modest sets, you have a distinction between functions and anafunctions. With a subobject classifier, you have unique choice and you don’t have a distinction between functions and anafunctions. Yes of course, with assemblies you can use anafunctions, but to use assemblies, you need to chuck out unique choice.

Maybe the confusing part is that in Book HoTT, we didn’t need to do anything special to get unique choice. But I think that just means Book HoTT’s propositional truncation is not a correct axiomatization of bracket types for assemblies.

In the internal type theory of a realizability topos, which is the model Steve is talking about, there is an impredicative subobject classifier (an impredicative universe of all hprops) and also a proof-relevant impredicative universe of modest sets. The morphisms between the modest sets (and indeed between all objects, which are rather more general) in this model are, by definition, what one might call “anafunctions”. Unique choice holds globally; the model doesn’t know anything about the “non-ana functions”. However, I believe that not every hprop is modest: the impredicative universe of all hprops is not a subtype of the proof-relevant impredicative universe of modest sets.

This column is getting pretty skinny, but I’ll give it a try:

To show that the modest sets are closed under subobjects in the topos, we can use the characterization of the modest sets as a class of objects right orthogonal to a certain map. An easy diagram chase shows that such a class is closed under subobjects.

Oh, okay, that’s nice to know. I guess the only caveat then is that the subobject classifier itself is not modest (which would of course be impossible, since one impredicative universe cannot contain another one).

Steve referred to the category of assemblies on some PCA. I knew most of what you say about the corresponding realizability topos. The part that concerns me is whether the modest sets still count as impredicative in the right way, as a universe for the whole topos, rather than just the assemblies. The CIC-style axiomatization of an impredicative universe is supposed to be for the category of assemblies. I don’t know if it works for the topos. (But maybe it does. That’s just what I’ve been worrying about this whole time.)

I believe this is the point of the Hyland-Robinson-Rosolini paper *The discrete objects in the effective topos*.

Moving out from Mike’s last, very skinny, comment about whether large elim into a univalent universe is needed:

Ok, now I think I see your point — it’s not that we *need* a univalent universe, just that the known proof uses one. I agree with that, but I don’t think it’s a problem. The modest Kan cubical assemblies, e.g., include plenty of higher types.

Okay, so you are thinking about models. I must have misunderstood your first post about the groupoid quotient; I thought you were proposing that we could *construct* a univalent universe of finite sets inside an impredicative one, inside of type theory. I am certainly convinced by your argument that such things should exist in models.

Maybe the discussion moved off in a different direction, but the original question was whether it is *consistent* to have a predicative (and perhaps univalent) universe inside an impredicative one. That’s what the example was supposed to show — not that it’s *provable* that there are such things.

So by impredicative Book HoTT, it looks like you *don’t* mean Book HoTT with propositional resizing. For that system, maybe I’m being thick, but it seems like none of this would actually be impredicative, since the quantifiers are not yielding propositions. So it probably wouldn’t work.

So does impredicative Book HoTT mean same axioms, but with `-impredicative-set`? I don’t understand the technical reason to test this with `-type-in-type`. Doesn’t impredicative `Set` support everything that regular `Set` does?

Right, propositional resizing is a totally different kind of “impredicativity”. (They overlap in permitting impredicative definitions of propositional truncation, though, as well as things like truncated disjunction and existential quantifiers.)

As I said in the post, “I don’t think Coq’s built-in `impredicative-set` option is compatible with the HoTT library, which has tried to excise Coq’s built-in smallest universe `Set` as much as possible.”

Maybe you’re asking for more details on that last point? I don’t completely understand what’s going on, but (aside from the annoyance of having to add `: Set` annotations everywhere) it seems universe polymorphism doesn’t play well with impredicative-set. For instance, the universe-polymorphic funext axiom

`funext : forall {A : Type} {P : A -> Type} (f g : forall x : A, P x), (forall x, f x = g x) -> (f = g)`

(I’m simplifying a bit from how it’s actually defined) is parametrized by *three* universes: one for `A` to belong to (the first occurrence of `Type` above), one for `P` to belong to (the second occurrence of `Type` above), and a third one for `forall x : A, P x` to belong to, which has to be at least as big as both of the first two. If we then try to instantiate this with `A := Set : Type`, so the first universe is strictly bigger than `Set`, while `P : Set -> Set`, so that the second universe is equal to `Set`, the constraint that the third universe has to be at least as big as *both* of the first two universes doesn’t go away, even though under impredicative-set, *in this particular instance* the third universe doesn’t actually have to be as large as the first one. In other words, the universe polymorphism algorithm doesn’t notice that instantiating particular universes to `Set` can remove universe constraints.

Possibly I could work around this by asserting a separate funext axiom for `Set`-valued type families. I haven’t tried, but it seems likely that I would have to duplicate significant parts of `Types/Forall.v`, which seems tedious and boring.

By the way, universe polymorphism also doesn’t seem to play perfectly well with `type-in-type`: I still get universe inconsistencies. Seems like I should never get any universe inconsistencies under `type-in-type`! I haven’t tracked down the cause of this in any more detail, though; putting some things in `Set` explicitly seems to work around the problem.

Whoops. Forgot this site only has HTML formatting.

That already sounds odd. I’m no expert in Coq universe polymorphism, but semantically, I don’t see a need to parameterize by the universe of `forall x : A, P x`, since the minimal one can be computed. This computation ought to take into account whether the formation rule is impredicative. But I take it that you tried it, and something went wrong, for some reason.

Prior to universe polymorphism, I thought the universe inference algorithm only generated a universe level variable when you literally uttered `Type`.

I’m no expert in the theory of universe polymorphism, but my understanding is that Coq’s universes don’t actually include a “max” operation, so wherever you want to say “max(i,j)” you actually have to give an arbitrary k and assert that i <= k and j <= k.

In fact (repeat caveats about non-expertise) it seems to me that there are at least some reasons to want this behavior: I think it “makes cumulativity admissible” in some sense. That is, if `forall x:A, P x` is only asserted to live in the max of the universes of A and P, then to make it also live in *larger* universes one would need an explicit cumulativity rule. But if by definition it lives in *any* universe larger than those of A and P, then you don’t need an explicit cumulativity rule.
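In rule form (my paraphrase, not Coq’s actual implementation), the max-free behavior amounts to a $\Pi$-formation rule whose conclusion holds at every sufficiently large level, with side constraints in place of a computed maximum:

```latex
\frac{\Gamma \vdash A : \mathcal{U}_i
      \qquad
      \Gamma,\, x : A \vdash P\,x : \mathcal{U}_j}
     {\Gamma \vdash \forall x{:}A,\ P\,x \;:\; \mathcal{U}_k}
\quad (i \le k,\ j \le k)
```

Cumulativity of the $\Pi$-type is then admissible rather than a separate rule: any larger $k'$ also satisfies the constraints.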