11
$\begingroup$

Sorry if this question is too basic or vague; I'm new to using proof assistants.

Using a proof assistant, one can construct the type Ordered n t of ordered lists of length n over a type t, provided that t has an ordering. We might then construct an inhabitant sort of the type Vec n t -> Ordered n t, together with proofs of some other properties, for example that contains l v -> contains (sort l) v. In so doing, we would have proved that this sort function fulfills its specification to sort lists.
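For concreteness, here is a rough Lean 4 sketch of the kind of specification I have in mind (specialised to Nat; the names Sorted and OrderedList are only illustrative, not a standard library API):

def Sorted : List Nat → Prop
  | [] => True
  | [_] => True
  | a :: b :: rest => a ≤ b ∧ Sorted (b :: rest)

def OrderedList (n : Nat) := { l : List Nat // l.length = n ∧ Sorted l }

A sort of type Vec n Nat -> OrderedList n, together with a lemma like contains l v -> contains (sort l) v, would then witness the specification.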

But there are sorting algorithms that run in $\mathscr{O}(n^2)$ and some that run in $\mathscr{O}(n\cdot\log n)$ in the worst case. If propositions are types and programs are proofs, can a proof assistant also reason about the runtime complexity of a program / proof? The fact that sort is an inhabitant of a well-defined type implies that the program terminates, but it seems like it could be useful to know more, for example that merge sort runs in $\mathscr{O}(n\cdot\log n)$. From my exposure to type theory, it seems like we always need to study the semantics of how terms in a formal language reduce to other terms in some higher-level meta-language, because such a capacity for self-interpretation often implies unsoundness. So I can imagine that the answer to my question is "no", but I couldn't find anything saying so.

$\endgroup$
3
  • 3
    $\begingroup$ Do you know about existing formalizations of complexity theory? $\endgroup$
    – Trebor
    Commented Feb 6, 2023 at 18:46
  • $\begingroup$ I did not -- this is perfect, thanks! I did some searching on google scholar and got a lot of other unrelated things. If you post this or any other good resources you know of I'll accept. $\endgroup$ Commented Feb 6, 2023 at 19:29
  • $\begingroup$ This may be of interest: "Liquid Resource Types" dl.acm.org/doi/pdf/10.1145/3408988. Though it looks at "resource" complexity which may differ from time complexity, and it's also really dense. They use a type system to prevent programs from exceeding their given resource bounds. $\endgroup$
    – tarzh
    Commented Feb 7, 2023 at 13:50

3 Answers

9
$\begingroup$

Yes and no. It depends on what you mean by "reason about the complexity of programs".

The yes side is easy. Complexity theory is just math, and all math can be formalized. Indeed, there has been work formalizing complexity theory in various proof assistants, including the work that Trebor cited above in Coq, work in Isabelle (and I'm sure other theorem provers as well), and proposed work in Lean. The fact that comparison-based sorting requires Ω(n log n) comparisons in the worst case is certainly a formally provable mathematical fact.

The trickier bit, and I think the bit you are confused about, is: how does one verify that an actual implementation of a sorting algorithm is O(n log n)? And this question holds for any property you want to verify of code. How do we verify that a C compiler is correct, or that Rust code is correct, or that Lean code is correct? There are lots of methods to do this. I'm not an expert on all the approaches, so I won't try to speculate on what they are; others can chime in. But the one approach I am familiar with (and the approach I think you have in mind) is how one typically proves a property of a Lean or Coq function (or code in any dependent type theory language).

Consider this Lean 4 function:

def foo (n : Nat) : Nat := 0

In Lean, it is possible to prove all sorts of properties about it, like

theorem foo_is_constant (n m : Nat) : foo n = foo m := rfl

But now consider these two functions:

def bar (n : Nat) : Nat := (List.range n).mergeSort (· ≤ ·) |>.map (fun _ => 0) |>.foldl (· + ·) 0

def baz (n : Nat) : Nat := (fun _ => 0) (bar n)

Assuming I got the syntax right (which I probably didn't), bar makes the list [0, ..., n-1], sorts it, replaces all the elements with 0, and adds them up. Let's assume the sort algorithm is O(n log n) even when the list is already sorted, and that there are no compiler optimizations. Then bar and baz are O(n log n) and foo is O(1)-ish (or maybe O(n) because of the natural number input?). But because Lean satisfies function extensionality, all three are provably equal. And even if we don't use function extensionality (which is not provable in pure DTT unless you add extra axioms), by beta reduction we still have that foo = baz.

A fundamental property of equality is that equal things have the same properties, and this is provable in almost any DTT system, so if we could express and prove in Lean that foo is O(n), then we could do the same for baz. So while one could prove a mathematical fact about an optimal implementation of a constant function foo, one can't prove anything special about a particular implementation of a constant function in this way.
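To make this concrete, here is a minimal Lean 4 sketch (assuming the definitions above elaborate as written; foo_eq_baz and P are just illustrative names):

-- foo and baz are definitionally equal by δ/β-reduction, so rfl suffices:
theorem foo_eq_baz : foo = baz := rfl

-- and any property of foo then transfers to baz:
example (P : (Nat → Nat) → Prop) (h : P foo) : P baz := foo_eq_baz ▸ h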

So we can't prove runtime complexity results in the same way as other results in Lean, Coq, Agda, etc. We need another way. What is the best way? I don't know, and I don't think it has been worked out. One could certainly develop a model of Lean code and prove properties about that, just like one can develop a model of Rust or C code. One can also just work in computational frameworks like lambda calculus or Turing machines. There may also be fancier tricks which are more practical and useful for programmers writing Lean code, or maybe even new logics which will one day take into account runtime complexity. There may also be published research in this area already, but I'm unaware of the state of the art here.
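For instance, the "model the code" route can be sketched in Lean 4 itself as a deep embedding that pairs evaluation with a cost (a toy example; Expr and evalCost are made-up names, not an existing library):

-- A tiny expression language whose evaluator also reports a cost.
inductive Expr where
  | const : Nat → Expr
  | add   : Expr → Expr → Expr

def evalCost : Expr → Nat × Nat
  | .const n => (n, 1)
  | .add a b =>
    let (va, ca) := evalCost a
    let (vb, cb) := evalCost b
    (va + vb, ca + cb + 1)

Complexity claims then become ordinary theorems about evalCost, at the price of reasoning about the model rather than about Lean functions directly.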

$\endgroup$
2
  • $\begingroup$ Yes I think your point about extensionality really clears things up, thanks so much! So just to make sure I really understand you -- say I have two sorting algorithms. The first one loops through every permutation of the input list and returns the first one that's sorted. The second one does merge sort. When the underlying type is strictly totally ordered, these two functions are extensionally equal. So you can't expect to prove within a formal system satisfying the extensionality axiom that they differ in any way. $\endgroup$ Commented Feb 7, 2023 at 5:11
  • 2
    $\begingroup$ Note that extensionality doesn't stop us from trying to define best_complexity f that characterizes the best possible complexity to compute a given function. $\endgroup$
    – Trebor
    Commented Feb 7, 2023 at 9:47
3
$\begingroup$

To complement Jason Rute's answer: there is indeed a clear difference between two kinds of reasoning about "functions". The first is reasoning about the code of a function or, as in typical complexity-theory work, about some Turing machine. This is just standard mathematics, and can definitely be formalized in most proof assistants without specific difficulties. The second is reasoning directly about the functions of your type theory, which seems to be what you are asking about. Here, things are more complicated.

Indeed, if your system proves function extensionality (which states that functions are equal whenever their values are equal, i.e. something like $\forall f\, g, (\forall x, f\, x = g\, x) \Rightarrow f = g$), then you cannot reason about the intensional content of a function, i.e. how it computes, only about its extension, i.e. what it computes. This is because function extensionality says exactly that functions computing the same results must be identified. Working mathematicians seem to like this, because they usually do not want to think about the intensional content of their functions, so function extensionality is often seen as a nice property to have, and not proving it is seen as a defect.
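As a point of reference, this principle is provable in Lean 4 (via quotients), where it is available as funext:

example (f g : Nat → Nat) (h : ∀ x, f x = g x) : f = g := funext h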

However, function extensionality is not provable in most vanilla dependent type theories (such as those of Agda or Coq). This means that in such a system it is consistent to reason about the intensional content of a function. I do not know of developments of complexity theory going this way, but there has been a lot of work on developing synthetic computability theory in Coq; see for instance Yannick Forster's PhD thesis, or the synthetic undecidability project. Following this line of work, there is currently ongoing research on trying to make sense of a consistent Church's thesis, which would take a function (typically, from $\mathbb{N}$ to $\mathbb{N}$) and return some "code" corresponding to it. Thus, in a type theory without function extensionality, it should definitely be possible to reason about the complexity of functions, for instance by linking this Church's thesis with your standard textbook definition of complexity. However, this has not been achieved yet.

Moreover, this synthetic approach is not only feasible; I believe it is much better for formalizing complexity/computability theory than trying to follow the textbook approach. Indeed, after a while, textbooks tend to take a more high-level view of computation models: nobody wants to (or can) write down an actual Turing machine computing a non-trivial function. The synthetic approach instead lets you use the full power of your language for these definitions, staying much closer to the actual non-formalized account. In a way, this gives you the possibility to go back and forth between the two approaches, using the best side of each when you need it.

$\endgroup$
3
  • 1
    $\begingroup$ How does my example above of foo and baz, which are equal merely by beta reduction relate to your answer? I think beta reduction is really going to mess up any ability to work with complexity of a function in DTT in the “direct way” even in a theory without functional extensionality. Or am I mistaken? $\endgroup$
    – Jason Rute
    Commented Feb 7, 2023 at 11:31
  • $\begingroup$ Yes, you are entirely right: such an internal notion of "complexity" will still need to be stable under the conversion of the system, including in particular β-conversion. So in your example, the "complexity" of baz would probably be O(1). In general, you would be able to reason about the complexity of the normal form of a function, or of an equivalence class under conversion. $\endgroup$ Commented Feb 8, 2023 at 9:46
  • 1
    $\begingroup$ Just a quick update: A type theory validating $\mathrm{CT}_\Sigma$ (and, hence, not funext) has now been obtained by Pierre-Marie Pédrot, see xn--pdrot-bsa.fr/slides/gallinette-10-23.pdf. $\endgroup$ Commented May 22 at 0:17
3
$\begingroup$

My PhD student Hing-Lun Chan mechanised the proof that the AKS primality-testing algorithm is indeed polynomial-time, using a technique that counted "steps" (via a writer monad) while computations were performed.

This approach is simple, but requires the human to check that all computations have been accounted for. This is pretty straightforward for an algorithm as simple as AKS. This flaw comes about because the model is what I'd term a shallow embedding of the algorithm.
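To illustrate the flavour of this approach, here is a toy sketch in Lean 4 (not the HOL4 development from the thesis; the names are made up, and the count is threaded by hand rather than through a literal writer monad):

-- Instrumented insertion into a sorted list: result plus a step count.
def insertCounted (x : Nat) : List Nat → List Nat × Nat
  | [] => ([x], 1)
  | y :: ys =>
    if x ≤ y then
      (x :: y :: ys, 1)
    else
      let (rest, steps) := insertCounted x ys
      (y :: rest, steps + 1)

-- Insertion sort, accumulating the step count of every insertion.
def insertionSortCounted : List Nat → List Nat × Nat
  | [] => ([], 0)
  | x :: xs =>
    let (sorted, s₁) := insertionSortCounted xs
    let (result, s₂) := insertCounted x sorted
    (result, s₁ + s₂)

One then proves bounds about the instrumented function, e.g. that (insertionSortCounted l).2 is at most quadratic in l.length; the caveat above is that nothing forces this count to reflect every computation the unannotated program performs.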

His thesis (from 2019) is available.

$\endgroup$
