Universal properties without function extensionality

A universal property, in the sense of category theory, generally expresses that a map involving hom-sets, induced by composition with some canonical map(s), is an isomorphism. In type theory we express this using equivalences of hom-types. For instance, the universal property of the sum type A+B says that for any X, the induced map

(A+B \to X) \to (A\to X)\times (B\to X)

is an equivalence.

Universal properties are very useful, but since saying that a map between function types is an equivalence requires knowing when two functions are equal, they seem to depend irreducibly on function extensionality (“funext”). Contrary to this impression, in this post I want to talk about a way to define, construct, and use universal properties that does not require function extensionality.

First, however, I should perhaps say a little bit about why one might care. Here are a few reasons:

  1. Maybe you just like to avoid unnecessary assumptions.
  2. Maybe you want to define things that compute, rather than getting stuck on the function extensionality axiom.
  3. Maybe you want to your terms to be free of unnecessary “funext redexes”, making them easier to reason about. Here by a “funext redex” I mean a path obtained by applying funext to a homotopy to get a path between functions, then applying that path at an argument to get a path between function values. (Following the book, by a “homotopy” f\sim g I mean an element of \prod_{x:A} (f x = g x).) Clearly we could just have applied the homotopy in the first place. Once we have a type theory in which paths in function types literally are homotopies, this won’t be a problem, but for now, it makes life easier if we can avoid introducing funext redexes in the first place.
  4. Finally, this is not a very good reason, but I’ll mention it since it was actually what started me looking into this in the first place: maybe you want to declare a coercion that conforms to Coq’s “uniform inheritance condition”, which means that it can’t depend on an extra hypothesis of funext.

Onwards! To begin with, as you may know, type theory already includes an excellent way to express some universal properties without funext, namely induction principles. The induction principle for sum types makes perfect sense, and importantly is usable, without funext; whereas if we assume funext then the induction principle is equivalent (modulo judgmentality of computation) to the universal property.

So why is this not good enough? Well, to prove the universal property (A+B \to X) \to (A\to X)\times (B\to X) from the induction principle for A+B, we need to apply that induction principle not only with motive X, but also with motive being a type of homotopies between functions into X. That’s not a problem here, but in some situations, we may want to express a universal property that doesn’t apply to all type families, so that we can’t assert a general induction principle.

A prime example is the notion of reflective subuniverse. As stated in the book, it involves an equivalence of hom-types: there is a reflector \bigcirc and maps A\to \bigcirc A such that for any type B in the subuniverse, the induced map (\bigcirc A \to B) \to (A\to B) is an equivalence. Like any universal property, this is unprovable and unusable without funext. However, it’s tricky to state an induction principle for \bigcirc A: if we did the obvious thing and ask it to induct into all families of types in the subuniverse, we would get not a reflective subuniverse but the stronger notion of a modality.

Another, closely related, example is the notion of local type. Given a map f:S\to T, a type X is called f-local if the induced map (T\to X) \to (S\to X) is an equivalence. Again, this is unprovable and unusable without funext, but because we’re only talking about a property of a particular type X, we can’t express it as an induction principle.

In considering whether there is any way around this, our first thought might be to take the definition of equivalence, applied to a map such as (\bigcirc A \to B) \to (A\to B) or (T\to X) \to (S\to X), and replace all the occurrences of paths in function types by homotopies. This certainly gives a notion that is equivalent to the original if we have funext, and which at least doesn’t appear to require funext. However, it seems that in order to actually avoid funext, we have to be quite careful in choosing one out of the many definitions of “equivalence”.

To get an idea of where the problem lies, suppose we assert that (-\circ f): (\bigcirc A \to B) \to (A\to B) is an “equivalence” in the following sense: for every g:A\to B there is a \mathrm{rec}(g):\bigcirc A\to B, such that \mathrm{rec}(g) \circ f \sim g, and for any h:\bigcirc A \to B we have h\sim \mathrm{rec}(h\circ f) (plus, of course, a coherence). Now, one of the things we’d like to know, which certainly ought to follow from (-\circ f) being an “equivalence”, is that if g,h:\bigcirc A\to B become the same upon precomposition with f, they should already be the same. We are doing without funext, so “the same” should mean a homotopy, i.e. we would like g\circ f \sim h\circ f to imply g\circ h. We naturally want to write

" g \sim \mathrm{rec}(g\circ f) \sim \mathrm{rec}(h\circ f) \sim h "

but there is a problem, namely that the “rec” operation need not preserve homotopies. It preserves equalities (paths), of course, since everything does, but without funext, we can’t make a homotopy into an equality.

The same problems arise with most of the other definitions of equivalence, but it turns out that there is one of them that works (after a bit of modification). This definition is not one of the better-known ones (in fact, I don’t think it appears in the book at all): a map f:X\to Y is an equivalence if and only if (1) f is split surjective, i.e. there is a g:Y\to X such that f\circ g \sim \mathrm{id}, and (2) for all x,y:X, the induced map \mathrm{ap}_f : (x=y) \to (f x = f y) is also split surjective. Moreover, this is even a good definition of equivalence, i.e. if we assume funext then it is an hprop. (There is, I believe, no hope of any definition of equivalence being an hprop without funext, since any notion of equivalence will involve functions as part of its data.) Proving this is a nice exercise.

This definition is kind of cute because it is a direct enhancement of the propositions-as-types translation of “f is surjective and injective”. The PAT notion of “surjective” is split surjectivity, while the ordinary PAT notion of “injective” would say that for all x,y:X we have (f x = f y) \to (x = y); we simply add the assumption that the latter map is a section of \mathrm{ap}_f. (I don’t remember who first pointed this definition out to me.)

Let’s take this definition and funext-reduce it in the case of a precomposition map (-\circ f): (\bigcirc A \to B) \to (A\to B). Split surjectivity, with paths made into homotopies, says that for any g:A\to B there is a \mathrm{rec}(g):\bigcirc A\to B, such that \mathrm{rec}(g) \circ f \sim g. This is exactly what we had before: maps A\to B extend along f. However, making paths into homotopies in the second clause says that given h,k:\bigcirc A \to B and a homotopy p : h\circ f \sim k \circ f, we have a homotopy \mathrm{rec}(p):h\circ k such that \mathrm{rec}(p)\circ f = p. Thus, it exactly solves the problem we saw before!

We are not quite golden yet, however. What if we want to extend this to higher homotopies? That is, suppose we have two homotopies p,q:h\sim k and a 2-homotopy p\circ f \sim q\circ f; can we conclude that p\sim q? This definition seems insufficient to show that without funext. Since we are assuming directly that functions and homotopies extend, maybe we need to assume directly that all higher homotopies extend as well. But is that possible?

Looking back at our chosen definition of equivalence, we are struck by the fact that it treats paths and points in an entirely analogous way. Moreover, clauses (1) and (2) are completely independent; neither depends on the other! Therefore, it admits a very simple recursive extension to all higher paths. We define what it means for a map f:X\to Y to be n-path-split by recursion on n as follows:

  • Every map is (uniquely) 0-path-split.
  • f is (n+1)-path-split if (1) it is split surjective, and (2) for all x,y:X, the induced map \mathrm{ap}_f : (x=y) \to (f x = f y) is n-path-split.

Thus, our previous notion of equivalence is what we now call “2-path-split”. It’s not hard to show by induction that (assuming funext) “n-path-split” is a good notion of equivalence whenever n\ge 2. Moreover, we can put them all together: say that f is ∞-path-split if it is n-path-split for all n. This is also a good notion of equivalence (since by funext, the dependent product of hprops is an hprop).

The notions of n-path-split as n goes to infinity are closely related to the higher notions of coherent adjoint equivalence. We expect, from homotopy theory, that if instead of cutting off the definition of half-adjoint equivalence after one 2-path coherence, we added also the second 2-coherence and one 3-coherence, or both 2- and 3-coherences and a 4-coherence, etc., we would get good notions of equivalence. At each step we add two new data; if we added only one, we would no longer have a good notion. The tower of path-splitness is entirely analogous, but easier to express and work with internally, because the two new data we add at each step don’t depend on the data added at any previous step; they depend only on the original map f.

If we now funext-reduce ∞-path-splitness in the case of a precomposition map, we get a notion of universal property, expressed completely in terms of homotopies, that can deal with all higher homotopies as well. Given f:A\to B, we define what it means for a type family C:B\to \mathcal{U} to be n-extendable along f:

  • Every C is (uniquely) 0-extendable along f.
  • C is (n+1)-extendable along f if (1) for every g:\prod_{a:A} C(f(a)) we have a \mathrm{rec}(g) : \prod_{b:B} C(b) such that \mathrm{rec}(g) \circ f \sim g, and (2) for all h,k:\prod_{b:B} C(b), the family \lambda b . (h b = k b) is n-extendable.

Similarly, we have ∞-extendable families. We can now express an equivalence such as (-\circ f): (\bigcirc A \to B) \to (A\to B) by saying that the constant family \lambda (x:\bigcirc A). B is ∞-extendable along f:A\to \bigcirc A. Note that we’ve come almost full circle: this is very close to the induction principles we began with, except that we’re only allowing “induction” into families of a very special recursively defined sort.

If we assume funext, then ∞-extendability is equivalent to saying that (-\circ f) is ∞-path-split, and hence to saying that it’s an equivalence. However, both directions of the equivalence between ∞-extendability and ∞-path-splitness require funext, and in the absence of funext, ∞-extendability seems to be more useful.

I’ve formalized reflective subuniverses in Coq using this definition of the universal property, and the theory turns out to be entirely free of funext. I don’t have a proof that funext is never needed, but the evidence so far is clear.

To end with, let me say something about localization. Recall from above that given f:S\to T, we define a type X to be f-local if (T\to X) \to (S\to X) is an equivalence. Since this is a precomposition map, we can ask instead that it is ∞-extendable, to get a notion of “f-local” that’s usable without funext. Now the localization of an arbitrary type X is supposed to be its reflection into f-local types, i.e. an f-local type L_f X with a map X\to L_f X such that for any f-local Y, we have an equivalence (L_f X \to Y) \to (X\to Y).

We’ve known for a long time that L_f X can be constructed as a HIT. Simply take a definition of equivalence, funext reduce it, and make it into constructors of L_f X. The easiest one to do this with is “bi-invertibility” (having both a section and a retraction), since it doesn’t involve higher paths; we get


Inductive L_f X : Type :=
| to_local : X \to L_f X
| local_e1 : \forall (g : S \to L_f X), T \to L_f X
| local_s1 : \forall (g : S \to L_f X) (s : S), local_e1 g (f s) = g s
| local_e2 : \forall (g : S \to L_f X), T \to L_f X
| local_r2 : \forall (h : T \to L_f X) (t : T), local_e2 (h \circ f) t = h t.

However, the proof that L_f X is f-local and has the right universal property has always required funext. But now, with ∞-extendable maps, we can avoid this: instead of the latter four constructors above, we instead add a constructor asserting that the constant family \lambda (x:T). L_f X is ∞-extendable along f:S\to T.

This is admittedly a different kind of constructor than we usually see for HITs; it’s parametrized over natural numbers and it has a different “shape” for each natural number. But there seems to be no problem formulating the induction principle and working with it. And semantically, we should be able to replace this constructor by an infinite number of more ordinary ones and then apply the usual methods.

In particular, note that this constructor inserts paths of all dimensions. Have we seen a HIT like that before? I don’t remember for sure, but I can’t think of any right now.

Postscript: Everything in this post has been formalized in the HoTT Coq library. Here is the pull request.

This entry was posted in Code, Foundations, Higher Inductive Types. Bookmark the permalink.

16 Responses to Universal properties without function extensionality

  1. Arnaud Spiwack says:

    This looks very interesting: I really didn’t think it was possible. But I’m puzzled about a technical step. How can you, in your words, “mak[e] paths into homotopies” without using functional extensionality? What do you replace “ap” with then?

    • jessemckeown says:

      ap is fine! it’s isequiv_ap that Mike is avoiding here.

    • Mike Shulman says:

      Actually, isequiv_ap is also fine; it doesn’t require function extensionality to define ap or to show that ap-ing an equivalence is an equivalence. The technical way to “turn a path into a homotopy”, in the sense of being given a path f = g and getting out a homotopy f \sim g (or in the notation of the Coq library, f == g) is what the book calls happly and the Coq library calls ap10 — and that also doesn’t require funext! Funext is only required for the reverse direction, being given f \sim g and getting out f = g, which the book calls funext and the library calls path_arrow.

      However, none of those is what I meant by “make paths into homotopies”. What I meant is taking a known definition and syntactically replacing paths by homotopies in it to obtain a new definition. That’s not an operation in the theory, it’s a metatheoretic construction. And there, indeed, the central problem is that there is no replacement for ap. If we have a definition involving a path p : f = g which applies it to an argument to obtain happly p x : f x = g x, then under our paths-to-homotopies translation p gets replaced by a homotopy p : forall x, f x = g x , and happly p x is simply replaced by p x. However, if our definition involves ap R p where R is some “functional” defined on the function type, such as rec : (A -> B) -> (O A -> B) (the asserted inverse of precomposition with the unit \eta_A : A \to \bigcirc A), then there is nothing to replace ap R p by if we replace p by a homotopy.

      However, this is exactly the problem that the notion of ∞-extendability solves. If a precomposition map (-\circ\eta_A) : (\bigcirc A \to B) \to (A\to B) is ∞-extendable, then the n=2 part of that definition says that for any two maps f,g:\bigcirc A\to B and homotopy p: f\circ \eta_A \sim g\circ \eta_A, there is a homotopy f\sim g that yields p when precomposed with \eta_A.

      • Arnaud Spiwack says:

        I have yet to understand precisely how n-path-splitness specialises to n-extendability.

        That being said, are you implying that the n-path-splitness uses equality in an irreductible manner, but that some of its specialisations may refer to homotopies instead. If so, this approach, besides being really nice, happens to lose some modularity with respect to using functional extensionality. Am I correct?

        • Mike Shulman says:

          Yes, n-path-splitness, like other definitions of “g:X\to Y is an equivalence”, does refer to equalities in X and Y irreducibly. When we specialize it to “(-\circ f) : (B\to C) \to (A\to C) is an equivalence”, i.e. we take X := B\to C and Y := A\to C and g := (-\circ f), then of course the result refers to equalities in B\to C and A\to C. But in that case, we can replace those equalities by homotopies to obtain a different notion, which I’m calling n-extendability, that is equivalent to n-path-splitness if we have funext, but in the absence of funext is more useful.

          I’m not sure what you mean by “modularity”, but maybe this answers your question?

          • Arnaud Spiwack says:

            Yes, I believe it does.

            It is indeed less modular not to have functional extensionality in that you don’t have a one-size-fits-all definition which you can apply in a generic way: you have to design every instance of a universal property “by hand”, so to speak. On the other hand the derivation is rather systematic, which is interesting, and, of course, makes you wonder if it can be automated.

  2. It is really interesting to be able avoid function extensionality by changing the definition of equivalence, and it is nice that, in particular, if I understand correctly, (0) you don’t need funext to prove that “f is an equivalence” in this sense is a proposition . I presume that, then, (1) to show that this notion of equivalence is equivalent to any of the standard ones you will need funext, and, I ask (2) if one defines univalence using this notion of equivalence you propose, will we still get that UA implies funext?

    • Mike Shulman says:

      Oh no, I’m very sorry that I gave that impression. You do definitely need funext to prove that “f is an equivalence” in this sense is a proposition, because this notion of of equivalence (like any notion of equivalence) involves functions as part of its data. I’ll edit the post to make that clear.

      As for the other questions, if you’re talking about the notion of equivalence for general maps (not just precomposition maps) which I called “n-path-split for n\ge 2“, then you don’t need funext to show that it is logically equivalent to any other notion of equivalence. Once you have funext, therefore, it follows that it is actually equivalent, since all of them are hprops. This is the same as the situation for comparing all the other definitions of equivalence. And I think it doesn’t matter much, for purposes of proving funext from univalence, which definition of equivalence you take.

      • Thanks for the clarification regarding (0), and hence (1), and for editing the post to account for that in the new parenthetical remark.

        I am not yet convinced about (2), however (and this is not to be seen as criticism of your very interesting “neutral” notion of equivalence).

        To begin with, you can’t run the proof of the book that UA->funext for your version of equivalence (which is really equivalence conditionally to funext), because the fact that your notion of equivalence is an abstract notion of equivalence in the sense of the book depends on funext (which is what we want to prove (from UA)). The point is that if you don’t know that funext holds, you can’t claim that Shulman-equivalence is equivalence (you do know, however, that it is implied by equivalence). Hence if you define Shulman-UA in the obvious way by mimicking the usual definition of UA with usual equivalence replaced by Shulman-equivalence, in MLTT *without* funext, I don’t see how the arguments we know would give that Shulman-UA implies funext. This is not to say that Shulman-UA doesn’t imply funext: it may well be the case that it actually does. But the arguments of the book cannot be directly reused, as they stand, to conclude this.

        • Mike Shulman says:

          Actually, the fact that any definition of equivalence is an abstract notion of equivalence in the sense of the book (i.e. it is an hprop that is logically equivalent to qinv) depends on funext. None of the book’s definitions can be shown to be hprops without funext. So the situation with path-split maps is no different from any of them, and the proof that UA implies funext should work just as well.

          Is there a reason you don’t want to use the terms I defined like “n-path-split”? Saying “Shulman-equivalence” is, among other things, ambiguous, becuase I mentioned several new-ish notions of equivalence in the post. One family of them, the n-path-split and ∞-path-split ones, have really the same status as all the other notions of equivalence: they are logically equivalent to all the others without funext, and with funext they are hprops and therefore actually equivalent to the others.

          The other family, the n-extendable and ∞-extendable pairs, apply only to precomposition maps; they’re not a “notion of equivalence” in the general sense (even with funext) because they just don’t make sense for arbitrary functions. In particular, UA can’t be stated with these notions because UA is about all equivalences, not just precomposition ones. In the special case of precomposition maps where it makes sense, extendability is equivalent with funext to other notions of equivalence, while without funext it is not (as far as I can tell) even logically equivalent to them; in fact, both directions of the implication seem to require funext.

          • Martin Escardo says:

            This reply is after a short private interaction with Mike, in which he showed me how to see that qinv(f) iff 2-path-split(f) with a nice short and clever argument for the left-to-right direction, which, crucially, doesn’t use funext (the other is immediate, again without using funext).

            What we don’t have, without funext, is that isProp(2-path-split f). But luckily this turns out not to matter for the proof that 2-path-split-UA implies funext.

            This is not easily visible from the proof given in the book. But it is immediate from the concise original proof given by Voevodsky in Coq, informally renderered at http://www.pitt.edu/~krk56/UA_implies_FE.pdf

            That proof doesn’t depend on the particular notion of equivalence used (although officially it uses the notion given by the contractibility of all fibers). Moreover, crucially, it doesn’t depend on the fact that this notion is hpropositionally-valued.

            The only thing it relies on is that, whatever notion “equiv” of equivalence is chosen, it satisfies equiv(f) iff qinv(f). It doesn’t ever use the assumption isProp(equiv f), or the particular nature of “equiv” other than the (incomplete) specification given by equiv(f) iff qinv(f).

            So the situation with the notion of equivalence “2-path-split” is very pleasing: Mike can prove what he wants, without using funext, without being able to assert that 2-path-split really amounts to equivalence. But that doesn’t matter: (1) if funext holds, it is indeed equivalent to any of the standard notions, and, moreover, (2) if we use that notion of equivalence to define UA, then UA shows that this notion of equivalence is equivalent to any of the standard ones.

            So, perhaps, a better abstract notion of equivalence is this: “e” is a notion of equivalence iff (1) e(f) iff qinv(f), and (2) UA(e) implies isProp(e(f)).

            At least this is what is happening for the notion of equivalence e = 2-path-split.

            • Mike Shulman says:

              Yes, that’s the same thing that happens with every general notion of equivalence; they’re all logically equivalent to each other without funext, and therefore any of them suffices for UA -> Funext.

              I agree that the proof of UA -> Funext in the book is a bit confusing about its assumptions. The reason for that, I would say, is that it doesn’t really fit in the book at all. (-: In general, we made a conscious decision in the book not to spend any effort tracking which theorems require funext and UA; we take the point of view that we are studying only one formal system, and funext and UA are part of that system. Metatheoretic questions about how much we can do without those assumptions are interesting, and important in other contexts, but for purposes of exposition and to avoid confusing the newcomer, we decided not to mess around with sometimes assuming axioms and sometimes not.

              Nevertheless, we also decided for some reason to include a proof that UA implies funext. I’m not sure exactly why; maybe just because it’s considered an important result? Unsurprisingly, this makes for a slightly awkward exposition in that section. Maybe we can improve it in the second edition.

              • Martin Escardo says:

                Just one quick comment: for the notion of equivalence “equiv” originally considered by Vladimir, we do have, in MLTT without any further assumptions, in particular without funext, that equiv(f) iff qinv(f) and that isProp(equiv(f)). It is the second fact that fails (or rather is undecided) in MLTT for your notion of equivalence “2-path-split”. From what I understand in your blog post, you explicitly wanted to work in MLTT, rather than HoTT, which you successfully accomplished, to prove what you wanted. My comment is just an analysis of what is going on.

  3. Mike Shulman says:

    Here’s a thought that occurred to me this morning, along the lines of Arnaud’s suggestion of automation. Consider iterated dependent products of the form

    \prod_{x_1:A_1} \prod_{x_2:A_2(x_1)} \cdots \prod_{x_n:A_n(x_1,\dots,x_{n-1})} B(x_1,\dots,x_n),

    A map from a type of this form to another one

    \prod_{y_1:C_1} \prod_{y_2:C_2(y_1)} \cdots \prod_{y_m:C_m(y_1,\dots,y_{m-1})} D(y_1,\dots,y_m)

    can be induced by a map of contexts f:(C_1,\dots,C_m) \to (A_1,\dots,A_n) and a map g : B[f] \to D over (C_1,\dots,C_m). We should be able to define ∞-extendability for any map of this sort. When n=m=1 and g=\mathrm{id}_{B[f]} this will be ∞-extendability as defined in the post, applicable to precomposition maps

    (-\circ f) : \Big(\prod_{x:A} B(x)\Big) \to \Big(\prod_{y:C} B(f(y))\Big).

    When n=m=1 and f=\mathrm{id}, it will be a corresponding version for postcomposition maps

    (g \circ -): \Big(\prod_{x:A} B(x)\Big) \to \Big(\prod_{x:A} D(x)\Big).

    And when n=m=0, it will be precisely ∞-path-splitness of an ordinary function B\to D. Moreover, it can also be applied to maps such as currying

    (A_1 \times A_2 \to B) \to (A_1 \to A_2\to B).

    So this achieves some unification. We can also generalize it to include maps such as

    (A_1 + A_2 \to B) \to (A_1 \to B) \times (A_2 \to B)

    and

    (A \to B_1 \times B_2) \to (A \to B_1) \times (A \to B_2)

    by just allowing multiple choices of (C,D,f,g) for a fixed (A,B). Does that yet include “all” universal properties?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s