
Most proof assistants assume that their hardware is correct. However, there are many recorded instances of CPUs outputting incorrect answers for various calculations (especially, but not limited to, floating point calculations).

It's impossible in principle for a computer program to prove its CPU correct: consider a CPU that, if it detects that it's running such a program, instead just outputs a success message. However, most real-world CPU errors are inconsistencies, which can in principle be detected by a program running on the CPU; indeed, most of these errors (when discovered) are demonstrated by presenting computer programs that behave incorrectly. The human brain is not infallible, yet humans are nonetheless capable of noting errors in their own calculations and proofs – even when it's due to mis-memorised arithmetic results, like 7×8=42.
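To make "a program noticing its own substrate's mistakes" concrete, here is a minimal sketch (purely illustrative, not taken from any existing prover): compute the same product by two independent methods and flag any disagreement, much as a person might catch a mis-memorised 7×8 by adding 8 seven times.

```c
#include <stdio.h>

/* Hypothetical self-check: compute 7*8 by two independent routes and
 * flag any disagreement.  One check like this proves very little, but
 * it is the kind of consistency test a program can run on the very
 * hardware it is executing on. */
static int mul_by_repeated_add(int a, int b)
{
    int acc = 0;
    for (int i = 0; i < a; i++)
        acc += b;               /* uses only the adder */
    return acc;
}

int main(void)
{
    int direct  = 7 * 8;                     /* uses the multiplier */
    int checked = mul_by_repeated_add(7, 8); /* uses only addition  */

    if (direct != checked)
        printf("inconsistency: 7*8 gave %d and %d\n", direct, checked);
    else
        printf("7*8 = %d; both methods agree\n", direct);
    return 0;
}
```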

Are there any theorem provers (or similar programs) that mitigate the impact of running on incorrect / faulty hardware? (Perhaps by verifying the correctness of particular aspects of their hardware before relying on it, or by double-checking that an intermediate result doesn't imply contradiction.)

Well-known CPU bugs include:

  • Rowhammer: certain memory access patterns can corrupt other bits of memory.
  • Xbox 360's xdcbt bug: if your code contains a certain sequence of bytes (the xdcbt prefetch instruction), the branch predictor might speculatively execute it and leave your L1 cache inconsistent. (Or it might not. It depends on the pattern of code execution.)
  • Pentium FDIV bug: Intel accidentally zeroed part of the division lookup table, so $\frac{4\,195\,835}{3\,145\,727} = 1.333\color{red}{739\dots}$ instead of the correct $1.333820\dots$ (see the sketch after this list).
  • The x86 trigonometric instructions are very wrong; to do range reduction correctly, the FPU would need to know 128 bits of π, but x86 only knows 68 bits. (This is still the case in modern processors; pretty much everyone implements their own trig in software.)
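For the FDIV entry above, this is roughly the kind of check that exposed the bug (a sketch using the widely circulated test case; on a correct FPU the residual is essentially zero, on a flawed Pentium it comes out around 256):

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Widely circulated FDIV test case: on a flawed Pentium,
     * 4195835/3145727 evaluates to ~1.333739 instead of ~1.333820,
     * so dividing and then multiplying back does not recover x. */
    double x = 4195835.0;
    double y = 3145727.0;
    double residual = x - (x / y) * y;

    if (fabs(residual) > 1.0)
        printf("FDIV bug suspected: residual = %g\n", residual);
    else
        printf("division looks correct: residual = %g\n", residual);
    return 0;
}
```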
  • Imagine you have a calculator that always adds 2 to the output. None of the answers it gives are correct, but you can still use it for mathematics. I don't expect that a proof assistant could work around something like that, but at least flagging up “hey, 2+2 doesn't equal 4, I think something's wrong” might be a good idea in that case. I'm asking whether there are algorithms (or, better still, actual theorem provers) that do this. – wizzwizz4, Feb 8, 2022 at 21:13
  • Reminds me of the trusting trust attack. A theorem prover that outputs executable code is a compiler, and I don't see why a theorem-proving compiler wouldn't be vulnerable to the same problems. Diverse double compiling is the solution for this kind of software hack: dwheeler.com/trusting-trust. Hardware hacked to recognize a theorem prover and fudge itself into being proved correct could maybe be defended against by similar methods? – Commented Feb 9, 2022 at 0:07
  • @Li-yaoXia If a statistical simulation uses a compiler optimisation that's wrong (whether due to the compiler's error, or the CPU executing the code wrong), the statistical simulation's result will be off – but probably not enough to be particularly noticeable. If a theorem prover's JIT optimises something wrong (not due to compiler error – that's been proven correct – but due to the CPU executing the code wrong), the result will be off, possibly by enough to change "true" to "false" (or vice versa). If it's a CPU design error, the proof might reproducibly go wrong. That's bad. – wizzwizz4, Feb 9, 2022 at 0:40
  • @MolossusSpondee It's similar, yes, but not quite the same. Here, if there's a flaw in the execution of the theorem prover, it could prove a newer version of itself valid when it's not. A human's unlikely to do this, but if you had a self-improving theorem prover (for instance, a superoptimizer) then you might have the theorem prover gradually detaching itself from reality, if the CPU doesn't quite match its model of the CPU and it doesn't check that; the principle of explosion is an excellent source of speed optimisations. – wizzwizz4, Feb 9, 2022 at 0:43
  • @GuyCoder I have tried to create a theorem prover, more than once. If you rely on the “deep down” stuff, then they're slow; proving 252616 + 5245281 = 5497897 would take millions of operations, when it can be done in about 5 if you're more sensible about it. RE test-and-set, you might be thinking of something like the classic Xbox 360 CPU bug; you can't always rely on the CPU's “you read and write to RAM” abstractions, so even the really basic design you described isn't foolproof. – wizzwizz4, Feb 9, 2022 at 13:28

1 Answer


Your claim that a theorem prover cannot prove the correctness of the hardware it is running on is not argued sufficiently carefully. Just because one could design a CPU that detects when a particular proof assistant is running on it and lies to it, it does not follow that no proof assistant can verify any CPU.

Firstly, real-world CPUs do not try to trick the software they run, so for that reason alone the scenario you describe does not arise in the real world. But more generally, we are not in the territory of Gödel incompleteness here, because a real-world CPU is not Turing complete: it is "just" a very large finite-state automaton. Finite-state automata can be, and have been, formalized in proof assistants, so there is no theoretical obstacle to verifying their design.
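As a toy illustration of why finiteness is what matters (my own sketch, not a fragment of any real hardware-verification development): a gate-level one-bit full adder can be checked against its arithmetic specification in Lean 4 simply by exhausting all eight input combinations.

```lean
-- A gate-level one-bit full adder, built from xor/and/or gates
-- (`!=` on Bool is exclusive or).
def fullAdder (a b cin : Bool) : Bool × Bool :=
  ((a != b) != cin,                   -- sum bit
   (a && b) || (cin && (a != b)))     -- carry-out bit

def bit (b : Bool) : Nat := if b then 1 else 0

-- Specification, checked by brute force over all 8 input combinations:
-- the two output bits must encode the number a + b + cin.
def specHolds : Bool :=
  [false, true].all fun a =>
    [false, true].all fun b =>
      [false, true].all fun cin =>
        let (s, c) := fullAdder a b cin
        bit s + 2 * bit c == bit a + bit b + bit cin

-- Because the circuit is finite, the proof is just "run the check".
theorem fullAdder_correct : specHolds = true := by decide
```

A real CPU is astronomically larger than this, but it is still finite, which is why hardware model checking and proof-assistant verification of hardware designs are possible at all.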

Of course, there is a question of practicality: can we actually verify the correctness of hardware with a proof assistant? Well, formal verification of hardware is an entire industry in itself, and some of it is carried out with proof assistants, so the answer is definitely positive. I am not an expert in this field, so I do not know whether the current state of the art can verify an entire CPU, but I am sure knowledgeable people will provide examples in the comments.

