9
$\begingroup$

My understanding of most automated theorem provers (which is possibly an incorrect understanding!) is that they start with some premises stored as unprocessed statements, and repeatedly select one to "process", generating different consequences and substitutions. Because the total number of statements is so big, they rely on heuristics to determine which statements are most promising to focus on.

However, I'm under the impression that current theorem provers are weak compared to human mathematicians, and often fail to solve theorems which humans have already proved.

If there were a very strong search heuristic which often selected the very best statement at a given step, would provers reach a superhuman level? Or are there other major challenges here, like limitations in the rules they use to process each statement?

$\endgroup$

2 Answers 2

5
$\begingroup$

Automated theorem provers rely on the resolution algorithm because it is complete. This means that if a proof exists, the prover is guaranteed to find it. Resolution works as follows:

Say you want to prove $\Gamma\vdash \phi$, then set $\mathcal I_0 := \Gamma\cup\{\neg\phi\}$ as your initial set of clauses. The goal is to iteratively saturate all possible derivable clauses (using the resolution rule) until you reach a contradiction:

$$\mathcal I_0 \subset \mathcal I_1\subset\mathcal I_2\subset\ldots\subset\mathcal I_n$$ so that $\bot\in\mathcal I_n$. Then you can reconstruct a proof as a binary tree of depth $n$. The problem, as you already noted, is that the size of $\mathcal I_k$ grows extremely quickly, so it is nearly impossible to find proofs beyond a few steps.

In principle, a perfect heuristic technique would pick out (at most) $2^{n-1}$ pairs in $\mathcal I_0$, (at most) $2^{n-2}$ pairs in $\mathcal I_1$,... and one pair in $\mathcal I_{n-1}$ on which to apply the resolution rule.

If good heuristics (even far from perfect) existed, there would be no limit to what provers could do. The problem is that it's nearly impossible to give good heuristics for a resolution theorem prover.

$\endgroup$
4
  • 2
    $\begingroup$ I agree with you, however, your argument can be simplified a bit by starting with the completeness of linear resolution. $\endgroup$
    – M. Lonardi
    Commented Feb 15, 2022 at 10:50
  • $\begingroup$ Perhaps this follow up question is off-topic, but is there some intuition I can use to understand why making a good heuristic is "nearly impossible"? $\endgroup$
    – Reubend
    Commented Feb 16, 2022 at 8:19
  • 1
    $\begingroup$ @Reubend I don't have a good justification for that comment, other than my experience with resolution provers. The trouble is that resolution proofs typically look completely different from any proof a human would come up with. I think if a good automated proving technique exists, it can't be complete. $\endgroup$
    – Couchy
    Commented Feb 16, 2022 at 18:31
  • 1
    $\begingroup$ @Reubend Maybe I have a justification, however hanc marginis exiguitas non caperet, so I will edit my answer. $\endgroup$
    – M. Lonardi
    Commented Feb 17, 2022 at 0:29
5
$\begingroup$

I used to feel the same way. My dream was a computer that would give me the answers I wasn't able to reach because I wasn't smart enough. Now I no longer hope that such a thing is possible; in fact, maybe I'm happy with it.

Two observations. First: human reasoning works by "deepening", by "clarification"; a computer cannot see what is in your head, you have to explain it to it step by step; however, once you have succeeded, perhaps you no longer need the computer, because you have understood the subtleties of the problem and the problem disappears. Second: in the automation of the game of chess there is a function that evaluates a configuration (heuristic) and a search algorithm based on this heuristic; if the heuristic were "exact", the search would no longer be needed.

In addition, if the heuristics are computationally heavy, the search cannot expand significantly, thus there is a tradeoff between accuracy and computation time. Neural network-based heuristics are computationally heavy, and those are the ones you're probably interested in. There are ATPs based on neural networks, e. g. Enigma (by Jan Jakubův). At the CASC-J10 Enigma performed very well, but did not win; Enigma relies on the "classic" theorem prover E (by Stephan Schulz), which is based on the equational superposition calculus and uses a variant of the Given-Clause Algorithm, the DISCOUNT Loop.

Regarding the possibility of having a "good" heuristic, I have thought of an argument that might convince you of the impossibility of its existence. I define a heuristic as "good" if it finds a proof of $n$ steps in at most $q(n)$ attempts, where $q$ is a polynomial and is known. If I had "good" heuristics, I could solve NP problems in polynomial time: suppose an NP problem has a verification algorithm in at most $p(n)$ steps ($p$ polynomial and known); in that case I can decide in polynomial time the NP problem, because I have answer YES if the heuristic finds a solution in less than $q(p(n))$ attempts and answer NO if the heuristic does not find a solution in less than $q(p(n))$ attempts (the composition of two polynomials is a polynomial).

$\endgroup$
5
  • $\begingroup$ In this case, I think I phrased my question wrong. I'll edit it to clarify. $\endgroup$
    – Reubend
    Commented Feb 13, 2022 at 20:59
  • $\begingroup$ What I was hoping to understand better was to what extent current performance is reliant on the search heuristic, and whether an extremely strong heuristic might be sufficient to achieve performance on the level of human mathematicians. I used the phrase "perfect heuristic" confusingly, which I think is not what I actually meant. $\endgroup$
    – Reubend
    Commented Feb 13, 2022 at 21:01
  • 1
    $\begingroup$ Actually, I think you explained yourself well, however I wanted to respond with a punch line. The sense is that if the heuristics are computationally heavy, the search cannot expand significantly, thus there is a tradeoff between accuracy and computation time for heuristic functions. Neural network-based heuristics are computationally heavy, and those are the ones you're probably interested in. There are ATPs based on neural networks, e. g. Enigma. At the CASC-J10 Enigma performed very well, but did not win. $\endgroup$
    – M. Lonardi
    Commented Feb 14, 2022 at 1:02
  • $\begingroup$ On second thought, I think it's best to edit the response by adding the content of the last comment. $\endgroup$
    – M. Lonardi
    Commented Feb 15, 2022 at 0:22
  • $\begingroup$ You're spot on that I'm curious about neural network based heuristics. The link to Enigma is great, since that will give me a lot of prior art to look at. At the risk of getting off topic here, does the heuristic prune one "main" search, or are there many separate searches (perhaps due to different clauses)? $\endgroup$
    – Reubend
    Commented Feb 16, 2022 at 7:57

Not the answer you're looking for? Browse other questions tagged or ask your own question.