9

Formal verification is

the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.

Proof assistants can help with that; it's only a small jump from algorithms to computer source code. Let's say I have some code written in [insert the language of your choice here]: can I use proof assistants to verify whether it's free from bugs? You may assume it's a low-level language if that matters, but I expect it doesn't because almost all programming languages are Turing complete.

  • The source code needs to be annotated to specify the expected results of each part of the program; here are some examples: stackoverflow.com/a/65218443/3648282 and en.wikipedia.org/wiki/ANSI/ISO_C_Specification_Language#Syntax. An unannotated source contains insufficient information to describe its intended operation. For example, given a "Hello World" program, how would you know (how would another program prove) what the output should be, in addition to checking for "bugs"? If it printed "Goodbye", would it be bug-free? -- Not an answer; the comment is to improve the question.
    – Rob
    Commented May 17, 2022 at 15:08
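
For illustration, here is the kind of annotation the comment refers to: a minimal, hypothetical ACSL contract in Frama-C syntax (my own sketch, not taken from the linked examples). The contract, rather than the C body, is what tells a verifier such as Frama-C/WP what "correct" means here.

    #include <limits.h>

    /* Hypothetical sketch: the ACSL contract states what the function is
       supposed to compute, so a verifier has something to check the
       implementation against. The precondition excludes INT_MIN, whose
       negation would overflow. */
    /*@ requires x > INT_MIN;
      @ assigns \nothing;
      @ ensures \result == (x < 0 ? -x : x);
      @ ensures \result >= 0;
      @*/
    int abs_int(int x) {
        return (x < 0) ? -x : x;
    }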

3 Answers

18

An important point to clarify is the meaning of "free of bugs". Correctness only makes sense relative to a specification. Only once you have decided what property you care about (your definition of "correctness") can you start to think about how to prove it.

Safety properties include memory safety (buffer overflows, use after free, and so on), undefined arithmetic (overflow, division by zero), and unhandled exceptions. There are popular approaches that detect or prevent such bugs automatically, via type systems ("well-typed programs do not go wrong") or program analyses (e.g., Infer).
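
For instance, each commented line in the following contrived sketch (my own, not from any real codebase) violates one of those safety properties; this is the kind of defect such tools aim to rule out before the program ever runs:

    #include <limits.h>
    #include <stdlib.h>

    /* Deliberately buggy sketch illustrating the safety properties above. */
    int shaky(int n, int d) {
        int buf[4];
        buf[n] = 1;               /* possible buffer overflow: n may be out of range */
        int *p = malloc(sizeof *p);
        free(p);
        *p = 2;                   /* use after free; malloc may also have returned NULL */
        return (n + INT_MAX) / d; /* possible signed overflow and division by zero */
    }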

Functional correctness is a class of properties that relate your program to a "high-level" specification. For example, you may expect a function int sum(int* x) to compute the sum $\sum_i x_i$; that lets you reason about the function as the mathematical sum, without worrying about how the numbers are laid out in memory. Since there is no hope of full automation for the most expressive specification languages, this is an area where proof assistants have a lot of potential (and it has been an active research area for decades).
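
To make that concrete, a hypothetical contract for such a function could look as follows. This is my own sketch in ACSL/Frama-C syntax; I have added an explicit length parameter n, since the prose signature leaves the length implicit, and I elide the side conditions about overflow of the running sum.

    /* Functional-correctness sketch: the ensures clause relates the C code
       to the mathematical sum via ACSL's \sum extended quantifier, and the
       loop invariant is what a deductive verifier would use to prove it. */
    /*@ requires n >= 0 && \valid_read(x + (0 .. n-1));
      @ assigns \nothing;
      @ ensures \result == \sum(0, n-1, \lambda integer k; x[k]);
      @*/
    int sum(const int *x, int n) {
        int s = 0;
        int i;
        /*@ loop invariant 0 <= i <= n;
          @ loop invariant s == \sum(0, i-1, \lambda integer k; x[k]);
          @ loop assigns i, s;
          @ loop variant n - i;
          @*/
        for (i = 0; i < n; i++)
            s += x[i];
        return s;
    }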

One of the most popular examples is seL4, a microkernel written in C and verified in Isabelle/HOL. In the following excerpt, note again the precision of "against its specification":

seL4's implementation is formally (mathematically) proven correct (bug-free) against its specification, has been proved to enforce strong security properties, and if configured correctly its operations have proven safe upper bounds on their worst-case execution times. It was the world's first operating system with such a proof, and is still the only proven operating system featuring fine-grained capability-based security and high performance. It also has the most advanced support for mixed criticality real-time systems.

--- https://sel4.systems/About/

For another example, VST provides a program logic for C, embedded in the Coq proof assistant. VST is particularly notable because it is formally related to a verified compiler, CompCert: there is a proof that VST's program logic is sound with respect to the same operational semantics that CompCert was proved against. Software Foundations, Volume 2 contains an introduction (in Coq) to Hoare logic, the core idea behind program logics for imperative languages such as VST.
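
As a quick illustration of the idea (my example, not taken from the book): a Hoare triple $\{P\}\ c\ \{Q\}$ claims that if the precondition $P$ holds and the command $c$ terminates, then the postcondition $Q$ holds afterwards, for instance

$$\{\,x = n\,\}\ \ x := x + 1\ \ \{\,x = n + 1\,\}
\qquad\text{and}\qquad
\{\,\mathrm{True}\,\}\ \ \texttt{if}\ x < 0\ \texttt{then}\ x := -x\ \ \{\,x \ge 0\,\}.$$

A program logic such as VST provides rules for deriving triples like these about real C code, against the operational semantics mentioned above.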

A functional language such as Haskell is, at its core, a lambda calculus, much like the proof assistants based on type theory (Coq, Lean, Agda). hs-to-coq banks on that similarity by translating Haskell functions into the proof assistant's language, so you can prove any property expressible there.

10

Sort of. Bugs are a nebulous concept, so it's difficult to prove their absence. Theorem provers can show that certain components of a software program behave exactly as specified, so any remaining bugs must lie elsewhere.

As a practical example, the Csmith tool generates random C programs and looks for differences in how different C compilers compile them.

This tool found errors in GCC and Clang, but no errors in the middle end of CompCert, a compiler verified using Coq. (I think the compiler itself is extracted from Coq too, but I don't know the details.)

However, as mentioned in this paper by the Csmith authors, there were bugs in CompCert that were discovered as a result of this effort.

The following is an excerpt from Finding and Understanding Bugs in C Compilers, page 6 of the PDF; the code snippet has been condensed to one line.

The first silent wrong-code error that we found in CompCert was due to a miscompilation of this function:

int bar (unsigned x) { return -1 <= (1 && x); }

CompCert 1.6 for PowerPC generates code returning 0, but the proper result is 1 because the comparison is signed. This bug and five others like it were in CompCert’s unverified front-end code.
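
(To spell out the arithmetic: per the C standard, (1 && x) is an int equal to 0 or 1, so -1 <= (1 && x) compares two signed ints and is always true. A hypothetical harness like the one below, compiled with two different compilers and diffed, is essentially what Csmith's differential testing automates.)

    #include <stdio.h>

    /* Compile with two compilers and compare the output; both should print
       "1 1", because (1 && x) is an int in {0, 1} and the comparison with
       -1 is signed. */
    int bar(unsigned x) { return -1 <= (1 && x); }

    int main(void) {
        printf("%d %d\n", bar(0u), bar(7u));
        return 0;
    }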

3

I don't mean to be a "doubting Thomas", but I've always been under the impression that proof assistants check that the code implements the specification, and can verify that the specification satisfies certain properties. But that is insufficient to prove the absence of bugs stemming from a faulty specification.
