
I'm planning on learning Lean/Lean 4, and I was wondering whether a proof assistant system like Lean 4 can be used for general purpose programming?

For example, can it replace a language like Standard ML?

More specifically, are there major limitations for using them for general purpose programming?

First, I have a vague impression from reading tutorials that various proof assistants require/emphasize termination. So does this mean that the proof languages are not Turing complete?

Second, are they good at/OK with supporting GUIs? I don't intend to do heavy graphics programming with proof assistants. Other languages might be better for that. But I am just wondering if there is basic support for writing a stand-alone program with minimalist input/output.

A similar issue is I/O, which some functional programmers consider impure. Is it supported in very pure systems like proof assistants?

In sum, the question is whether it is feasible (or how feasible) to use proof languages for general purpose programming.

  • In Lean at least, you can write programs without proving termination if you wish. I think one main goal of Lean 4 is for it to be usable as a general purpose language. Commented Mar 19, 2022 at 18:47
  • As for GUI tools, you may want to look into widgets in Lean 3. They provide a way to interact with Lean graphically in VS Code (via HTML I believe). They are only in Lean 3 right now, but I've heard there is work on adding it to Lean 4. Also, it doesn't hurt to ask about GUI tools in the Lean4 stream on the Lean zulip. Actually, it looks like there is already a thread.
    – Jason Rute
    Commented Mar 19, 2022 at 23:47
  • Is this question the same as proofassistants.stackexchange.com/questions/74/… Commented Mar 23, 2022 at 0:57
  • I see general-purpose languages on a spectrum, approximately from Python to Go to Rust to Haskell to Idris to Lean4.
    – Asclepius
    Commented Oct 27, 2023 at 23:09

3 Answers


There are many types of proof assistants using many types of foundations. While all of them resemble "code", in the same way that $\LaTeX$ or HTML resembles code, most of them don't resemble programming languages.

One of the main exceptions to this is dependent type theory, which acts simultaneously as a programming language and as a theorem prover. This is pretty cool since it merges the two worlds together. In particular, you can run code to prove theorems with judgmental equality (see below), and you can prove properties about your computer code, such as that a function will never reach a certain value, or that it satisfies a mathematical property. (This is like unit tests, but much better since you are proving the correctness of your code.)
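
For instance, here is a minimal Lean 4 sketch of that idea (the function and theorem names are just for illustration, and it assumes a recent toolchain where the omega tactic is available):

-- An ordinary function...
def double (n : Nat) : Nat := n + n

-- ...and a proved property about it: `double` never returns 3, for any input.
theorem double_ne_three (n : Nat) : double n ≠ 3 := by
  unfold double   -- the goal becomes `n + n ≠ 3`
  omega           -- linear arithmetic over Nat closes it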

Now as you pointed out there are some caveats here. First let me discuss three separate categories of functions, which might provide some clarity.

1. Pure computable functions

In pure dependent type theory, if you avoid axioms, functions are computable. They can in principle, and often in practice, be executed. (I say "in principle" since, for example, the Ackermann function is definable in DTT but hopeless to evaluate on large inputs.)

Also, in pure dependent type theory, functions need to be total. You can't write an infinite loop (say, via an unbounded recursive self-call). Moreover, you have to be able to prove that your function is total. Sometimes this is syntactically obvious, when the recursive call has a particular form and the compiler handles it for you; other times you need to supply a proof directly. Therefore, it wouldn't be possible to implement the function that counts the number of steps the Collatz sequence starting at n takes to reach 1 (unless you first prove the Collatz conjecture). Similarly, you can't define a halting function, so yes, pure DTT code is not Turing complete.
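
To make the contrast concrete, here is a small Lean 4 sketch (the names are hypothetical): the first definition is accepted because the recursive call is structurally smaller, while the commented-out Collatz counter is rejected as written, since Lean can't see, and we can't prove, that it terminates.

-- Accepted: the recursive call is on a structurally smaller argument.
def sumTo : Nat → Nat
  | 0     => 0
  | n + 1 => (n + 1) + sumTo n

-- Rejected without `partial` or a termination proof: showing this terminates
-- on every input is exactly the Collatz conjecture.
-- def collatzSteps (n : Nat) : Nat :=
--   if n ≤ 1 then 0
--   else if n % 2 == 0 then 1 + collatzSteps (n / 2)
--   else 1 + collatzSteps (3 * n + 1)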

In pure dependent type theory there are tricks to work around this totality requirement. The simplest is to carry along a counter ("fuel") which is decremented at each recursive call, returning a dummy value if the counter reaches zero before your function finishes. Then, when executing your code, just start with a really large counter value.
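
A sketch of that fuel trick in Lean 4 (the name and the Option return type are my own choices, not anything canonical):

-- Total by construction: every recursive call consumes one unit of fuel.
def collatzStepsFuel : Nat → Nat → Option Nat
  | _,        1 => some 0   -- reached 1: no further steps needed
  | 0,        _ => none     -- fuel exhausted: give up with a dummy value
  | fuel + 1, n =>
      let next := if n % 2 == 0 then n / 2 else 3 * n + 1
      (collatzStepsFuel fuel next).map (· + 1)

#eval collatzStepsFuel 1000 27   -- `some` of the step count for 27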

In DTT, you can prove 5 + 5 = 7 + 3 by just running 5 + 5 and 7 + 3 until both terminate and the result is the same. This is judgmental equality. In Coq, the kernel is very powerful for this sort of stuff, and if I'm not mistaken this was a major part of how the four color theorem was proved in Coq. It just ran a program inside the kernel to check all the cases.

You can also prove properties about your functions. For example, you can prove that nat.succ n can never equal 0.
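
Both points look like this in Lean 4 syntax (Nat.succ_ne_zero is the corresponding core-library lemma):

-- Judgmental equality: both sides compute to 10, so `rfl` is a complete proof.
example : 5 + 5 = 7 + 3 := rfl

-- A property of code: a successor is never zero.
example (n : Nat) : Nat.succ n ≠ 0 := Nat.succ_ne_zero n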

2. Mathematical functions which are not computable

However, if you are assuming certain axioms, especially the axiom of choice, then a function you construct with those axioms may not be computable, and you can't run it like you would a normal computer program. Lean is good at keeping track of which functions use axioms that make them non-computable.
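
As a small illustration (a sketch with a made-up name): a definition that uses Classical.choose must be marked noncomputable, and Lean will not produce executable code for it.

-- Relies on the choice axiom, so Lean insists on the `noncomputable` marker.
noncomputable def someWitness (h : ∃ n : Nat, n > 5) : Nat :=
  Classical.choose h

-- `#eval someWitness ⟨6, by decide⟩` is rejected: there is no code to run.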

Note, the law of excluded middle doesn't necessarily make one's code non-computable in Lean. This is because of the way Lean handles Props. I believe this means for example (although I might be mistaken on some subtleties here) that you can use a non-constructive proof to show termination of a computable function.

3. Impure (non-mathematical) computable functions

Now, so far I've only talked about pure DTT, or DTT with axioms. This is the setting of DTT-based theorem provers. However, what Lean did starting with Lean 3 is say, "hey, why don't we also use Lean as a meta-programming language for writing tactics and doing other stuff?" The syntax of the meta-programming Lean language is the same as the theorem-proving Lean language. Moreover, one can use pure mathematical Lean functions (assuming they don't use non-computable axioms in their code) for meta-programming as well. (My understanding is that this is different from Coq, which has a separate meta-programming language (or languages) for writing tactics.)

The Lean meta-programming language has some features that pure Lean doesn't. In Lean 3, if you use the meta keyword, you can write unbounded recursion. (Lean 4 uses different keywords, such as partial and unsafe, for this.) This lets you write infinite loops and non-total functions, which makes the Lean 3 meta-language Turing complete.

You are still "encouraged" to write pure code in Lean meta-programming, but it isn't forced. Again, if using the meta keyword (in Lean 3), you have access to the io monad, which (like in Haskell) provides lots of ways to use "impure" non-terminating, non-deterministic, or side-effect-producing code.

Since your meta functions may not be pure or total, you can't prove properties about meta functions in Lean 3, as that would lead to inconsistencies.

Also, in Lean, when you run code as a programming language (and not as part of a judgmental equality in a proof), Lean 3 runs it in a virtual machine instead of the kernel. For example, to compute arithmetic on the natural numbers, Lean (the programming language) uses libraries for fast arbitrary-precision integer arithmetic instead of the unary definition of the natural numbers. (In the kernel, Lean 3 struggles to compute even fib 7, but such a computation is fast in the virtual machine.)
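
Lean 4 exposes the same split with #eval (compiled execution) versus #reduce (kernel reduction); a small sketch:

-- The naive doubly recursive Fibonacci function.
def fib : Nat → Nat
  | 0     => 0
  | 1     => 1
  | n + 2 => fib n + fib (n + 1)

#eval fib 30     -- fast: compiled code with efficient arbitrary-precision arithmetic
#reduce fib 10   -- kernel reduction: fine for tiny inputs, quickly becomes infeasible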

This virtual machine computation, while faster, is less secure. It involves more tricks than Lean's minimalistic kernel for proofs and is a bit more likely to have a bug (probably as likely as a compiler for another early-stage programming language, so it isn't really a huge concern).

Lean 4 as a full programming language

Lean 3's programming language always felt hacky to me. It worked, and I actually did a lot of coding in it as part of the Lean GPTf project, but it didn't feel very user friendly.

Lean 4 however is intended as a full user friendly programming language. It has support for all the things programmers need like IO, arrays, floating point numbers, and hash tables. It also has a lot of syntax choices that I really like. It compiles to C (or maybe LLVM?) which makes it faster than Lean 3's VM execution.

But what Lean 4 offers, over say Haskell, is the power of theorem proving. For one, when working with pure computable functions, a Lean user can prove properties about them. For example, instead of checking that an array access is inside the bounds of the array, one can just prove it, saving the code from having to perform the check every time it accesses the array. Moreover, Lean 4 takes this to the next level by showing that lots of optimizations are possible when using pure code. They have a paper, Counting Immutable Beans, on how they use reference counting instead of garbage collection, which is only possible because of Lean's use of pure functions. They have another paper, Sealing pointer-based optimizations behind pure functions, showing that theorem proving allows one to avoid extra computations when computing on algebraic data types (such as expressions which contain lots of duplicate sub-terms) by directly accessing pointers in a provably safe manner that is indistinguishable from pure code which doesn't know about pointers.
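
As a concrete (hypothetical) version of the array-bounds example, Lean 4 lets you discharge the index check with a proof instead of a run-time test:

-- The caller supplies a proof that the array is long enough, so the access
-- needs no run-time bounds check and cannot panic.
def secondEntry (a : Array Nat) (h : 2 ≤ a.size) : Nat :=
  a[1]'(by omega)   -- `a[i]'proof` takes an explicit proof that `i < a.size`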

Update: I think I treated your question as more theoretical than you meant it. In theory, Lean 4, or any other DTT-based language with support for unsafe/meta/impure non-terminating code with side effects, can compete with Standard ML or Haskell. The only questions are the usability of the language, the availability of libraries, and the size of the user community. It seems to me that the Lean 4 developers aspire to reach such a level, and they have put a lot of work into the usability of Lean 4 as a programming language. But note, Lean 4 is not even officially released yet, so it doesn't have the ecosystem of libraries that, say, Standard ML has. If I were personally going to code a GUI application right now, I wouldn't use Lean 4, just because I don't think it has mature tooling for that yet. But if I wanted to learn a new, cool, bleeding-edge functional programming language that I would interact with via STDIN/STDOUT and text files, then that is a great motivation to learn Lean 4.

Other proof assistants which are programming languages

I don't mean to snub Agda here. It is another DTT language which I believe is intended as a full programming language. I unfortunately know little about it, so I can't say enough about it to provide a good picture of where it fits in.

Also Idris is a programming language using DTT, but again, I don't know enough about it. For example, I'm not sure if it is used as a proof assistant.

Update: Since writing this, others have pointed out in the comments more proof assistants which double as programming languages. For example, Michael Norrish mentioned that not all examples have to be DTT languages: "The Nqthm theorem prover and its descendent ACL2 both use Lisp as their logical language!" Others have also pointed out that proof assistants like PVS let you extract code. Actually, a number of proof assistants like Coq and (I believe) Isabelle have code-extraction capabilities, but it should be pointed out that extracting code to another full-featured programming language like OCaml isn't the same as running it directly, which is what Lean, Agda, and Idris let you do. I would consider this an important distinction.


An aside on purity, totality, and monads

I've been accused in the comments of repeating "ages old false propaganda that is damaging to our industry" by suggesting that totality is in conflict with Turing completeness. So let me clarify.

In functional programming there is a notion of a pure function, namely one which:

  • is deterministic,
  • has no side effects, and
  • doesn't manipulate state (which is a side effect).

DTT and other proof-assistant-like programming languages add to that:

  • is provably total.

This purity, including totality, is clearly an important "gain" for programming language theory as a commenter pointed out.

But on the surface this seems antithetical to modern programming which needs non-deterministic side effects like IO, and needs non-terminating programs with infinite loops or unbounded search.

There are two approaches I've seen which preserve most of the purity of pure programming, but allow non-purity when needed. Lean 4 has both.

One is to have the compiler keep track of pure vs not-pure functions. For example, in Lean 4 if one uses the partial keyword, one can write an infinite loop:

partial def f (x : Nat) (p : Nat -> Bool) : Nat :=
  if p x then
    x
  else
    f (x+1) p

It looks like f (which performs an unbounded search) has type f : Nat -> (Nat -> Bool) -> Nat, but of course this is a "lie" since f may not be total, so f isn't really a function in the mathematical sense. This is fine since Lean keeps track of this (and for example won't let you do the sort of mathematical reasoning on f that would lead to inconsistencies, or let you run this function on the kernel).

The other way to get around purity is with monads. There are a lot of good resources on the Internet about monads (especially in Haskell), but if you want a Lean-3-centric introduction, see the Monad section in Programming in Lean. While monads are quite general and have many use cases, one particular use case is to get around purity. Consider the function:

def findNatLessThan (x : Nat) (p : Nat → Bool) : IO Nat := do
  -- [:x] is notation for the range [0, x)
  for i in [:x] do
    if p i then
      return i -- `return` from the `do` block
  throw (IO.userError "value not found")

This function doesn't return a Nat, but returns IO Nat. IO is a monad which wraps around other types like Nat and provides the following:

  • From the point of view of the user, IO Nat is a "back door" to common non-pure functionality like the ability to print to stdout and raise exceptions.
  • The monadic do notation allows the user to write code in a manner which looks more like procedural programming instead of functional programming. (Note, Lean 4's do notation looks a lot different than that in other functional languages like Lean 3 or Haskell, but the idea is the same.)
  • Once "inside" an IO monad you can't really "get out". In particular, for another function to use the result of this computation, it must also return an IO-wrapped type. Therefore, the entry point of a Lean program must be main : IO Unit or similar (a minimal example is sketched after this list).
  • From the point of view of the type theory, a function returning IO Nat is not actually returning a natural number. It is just returning a "computation" which interacts with "the real world" and may produce a Nat or may not. This is not the sort of computation that the mathematical kernel can run directly, but it is a computation that the Lean compiler can run (because the Lean compiler has additional support for running an IO monad which is outside of the rules of the type theory).
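
Putting the bullets above together, here is a minimal stand-alone Lean 4 program (just a sketch) whose entry point lives in IO and talks to stdin/stdout:

-- Read one line from stdin and echo it back; `main` must live in the IO monad.
def main : IO Unit := do
  let stdin ← IO.getStdin
  let line ← stdin.getLine
  IO.println s!"You said: {line.trim}"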

Similarly, it is possible to use monads to write non-terminating code. I don't know if Lean 4 has this, but Lean 3's io monad has io.iterate and io.forever which let you run infinite loops. Therefore one can write any partial computable function inside an io monad in Lean 3 even though from the outside, the Lean type theory thinks this is a total function (returning an io nat object for example).

Now, can a pure, total programming language be Turing complete? This can probably be interpreted in a few different ways. From the point of view of the type theory, all functions must be total mathematical functions. And more so, a pure DTT kernel can't run a computation that doesn't terminate. You can't write in pure mathematical Coq or Lean a computation for an infinite loop, an unbounded search, or even a DTT kernel (for the same logic) since every computation in a pure DTT kernel must run to completion and provably so.

On the other hand, from the point of view of the end user, if you have a compiler which has extra support for certain monads that let you run infinite loops, then yes. You can use monads written in a total language to express and run all partial computable functions, and the behavior of the language is Turing complete. (I would argue your programming language is no longer total, but that is just arguing over semantics.)

(Mathematically, the results are clear. A universal Turing machine is a partial computable function, so any Turing-complete language needs support for running partial computable functions. Similarly, one can't computably enumerate all and only the total computable functions (a diagonalization result in the same family as Rice's theorem and the halting problem), so one can't have a compiler which accepts only total computable functions but which also accepts every total computable function. However, I don't think the mathematics is in question, just its philosophical and political interpretation.)

    $\begingroup$ "The main exception to this is dependent type theory," is stretching a very long bow. The Nqthm theorem prover and its descendent ACL2 both use Lisp as their logical language! This gives these systems extremely good execution (which they use a great deal). It's also possible to write $\endgroup$ Commented Mar 22, 2022 at 23:31
  • Totality is not in conflict with Turing-completeness.
    – pigworker
    Commented Mar 24, 2022 at 20:31
  • @pigworker isn't provable totality, as in Lean, however in conflict with Turing completeness? One can enumerate all functions provably total in Lean, but by Rice's theorem one can't enumerate all total computable functions. (I could have said this better in my answer however.)
    – Jason Rute
    Commented Mar 24, 2022 at 23:58
  • Maybe said better, if one has a computable compiler/interpreter like Coq that can only compile total functions then by Rice's theorem it can't compile all computable total functions. I would say that does mean "totality is in conflict with Turing completeness".
    – Jason Rute
    Commented Mar 25, 2022 at 0:02
  • No. You have the boot on entirely the wrong foot. And you repeat ages old false propaganda that is damaging to our industry. Every potentially nonterminating computational process is expressible as precisely that - a potentially nonterminating computational process. What you can't do is lie that you have a Nat -> Nat function for, e.g., computing the length of Collatz sequences. The standard "but you might not get an answer" deal is honestly, monadically, expressible. Totality is an expressivity gain, because sometimes you can honestly promise to deliver an answer!
    – pigworker
    Commented Mar 25, 2022 at 0:16

I just want to add a couple of minor footnotes to an excellent answer by Jason Rute.

First of all there's no "the language" of proof assistants — pretty much every proof assistant implements its own language, some fundamentally different from the others.

Which brings us to a question about Lean 4 in particular: yes it was intentionally designed as a dependently typed programming language and thus supports partial functions, IO, FFI and some other practically important bits.

They went so far that it even turned out to be possible to program robots in Lean 4! :D Also, there are libraries like https://github.com/xubaiw/Socket.lean being developed now. That said, it's indeed very early days: Lean 4 still isn't released yet, and we don't have anything remotely close to the mature ecosystem of libraries that OCaml and Haskell have.

Idris is another project with a similar (or even stronger) aim at general-purpose programming. Again, people have started implementing practical libraries for HTTP and JSON, and even GTK+ bindings for GUI programming, and many more, but it is still not close to the Haskell or OCaml level of maturity and breadth.

Thus my personal opinion is yes, dependently typed languages ("proof assistant languages") like Lean 4 or Idris 2 can totally replace an ML-family language for personal or research projects, especially if you'd like to develop a useful library.

  • I have a silly question about Idris. I know it is DTT, but is it used as a proof assistant?
    – Jason Rute
    Commented Mar 24, 2022 at 9:52
  • @JasonRute I guess it completely depends on your definition of a proof assistant, and from this site I've learnt that people have widely different opinions on what counts as one. :) Still Idris separates "impure" fragment from the pure (total) one and provides a number of features supporting interactive proofs. Does that answer your question? :) Commented Mar 24, 2022 at 10:49
  • A more practical version of my question: Are there any libraries of formally verified mathematics or formally verified algorithms in Idris?
    – Jason Rute
    Commented Mar 24, 2022 at 11:03
  • @JasonRute ah, in that sense! :) There was a significant effort at Category Theory formalization in Idris 1, and I see they started an Idris 2 rewrite, but unfortunately the whole thing stalled. In the department of "algorithms" there was a Software Foundations rewrite, but it halted too. On the other hand, I don't really follow recent developments; there might be alternative projects. Though I'm pretty sure we all would notice something of Lean's mathlib size... Commented Mar 24, 2022 at 16:13

You say "language" but really most proof assistants support multiple languages. You would want at least one language to write your algorithm in and another language to write your proofs in.

In dependent-type-theory-based proof assistants like Coq or Agda, these languages can be basically the same, aside from maybe working at different universe levels, but by no means is that a requirement.

For example, the Princeton Verified Software Toolchain translates C code into an abstract syntax tree readable by Coq and provides tooling based on separation logic to prove properties about the program.

In my opinion, tooling for embedding languages inside proof assistants and reasoning about them is still in its early days. It would really help to have features like parametricity and nominal sets.

But this sort of approach has been used for large systems like seL4. I highly doubt any are ready for writing GUIs yet, though.

