
This paper describes a theorem generator called MetaGen.

MetaGen's purpose is to make larger training sets to train provers, by producing theorems similar to ones a human might select for inclusion in a training set.

The application that I'm interested in is slightly different. I'm interested in initial exploration of a proposed set of axioms and assumptions, which means some kind of automated theorem prover combined with a heuristic for measuring how "good" a potential theorem is (such as its length in FOL, possibly combined with some rules to make implications with false antecedents inexpressible).
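To make that concrete, here is a rough sketch of the sort of heuristic I have in mind (the s-expression encoding, the `->` check, and the vacuity test are all just illustrative assumptions, not a worked-out design):

```python
# Minimal sketch of a length-based "goodness" score for candidate
# theorems. Formulas are nested tuples (s-expressions); the encoding
# and the vacuity check are illustrative assumptions only.

def size(formula):
    """Count the symbols in a formula given as a nested tuple."""
    if isinstance(formula, tuple):
        return sum(size(part) for part in formula)
    return 1

def goodness(theorem, is_false=lambda f: f == "False"):
    """Shorter theorems score higher; implications with a (detectably)
    false antecedent score zero, ruling out vacuous truths."""
    if isinstance(theorem, tuple) and theorem[0] == "->":
        if is_false(theorem[1]):
            return 0.0
    return 1.0 / size(theorem)

# "addition is associative": x + (y + z) = (x + y) + z
assoc = ("=", ("+", "x", ("+", "y", "z")), ("+", ("+", "x", "y"), "z"))
print(goodness(assoc))                   # short, general: scores well
print(goodness(("->", "False", assoc)))  # 0.0: vacuously true
```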

  • What are some known automated theorem generators?
  • Are there any that use a nonclassical logic in the background?
  • What heuristics do they use for theorem goodness?
  • I plan to propose unifying the automated-theorem-proving and atp tags once I have the ability to do so. Commented Feb 15, 2022 at 20:16
  • I wonder if this needs the machine-learning tag. My reasons are as follows: MetaGen uses machine learning. The other similar work either uses ML or is in service of ML. Also, almost any heuristic for "theorem goodness" is likely going to have to use ML, since it is just too vague a concept to be coded by hand. Last, I think this topic would be of interest to those who work on the intersection of ML and theorem proving. – Jason Rute Commented Feb 16, 2022 at 12:36
  • @JasonRute Done. @GuyCoder Yes, the tags should be merged. I don't have a strong opinion on whether the longer or shorter name should be the canonical one. I actually can't propose either as a synonym of the other, which is interesting. Commented Feb 16, 2022 at 16:10
  • @JasonRute "Theorem goodness" was just supposed to be a heuristic that rates the equivalent of "47 is a prime number" as worse than "addition is associative" or "exponentiation is associative", because the former makes a weaker claim. I guess we could call it "informativeness". From what I can tell from the MetaGen paper, they're using similarity to theorems that humans would select as a notion of informativeness, with some additional rules to prevent trivial theorems with false antecedents from being generated. Commented Feb 16, 2022 at 17:54
  • This reminds me of AM and Eurisko. I'm not sure how advanced their heuristics would be compared to a modern theorem prover. This may merit a question of its own. Commented Jul 3, 2023 at 12:22

1 Answer


MetaGen may be unique in that it is a neural theorem generator that generates both statements and proofs.

Nonetheless, there are a number of other papers that do two of those three things: use neural methods, generate statements, or generate proofs.

Generate statements and proofs

INT is a synthetic dataset and theorem-proving environment for testing and developing machine-learning theorem provers. It generates inequalities and equalities over the integers. (I think it is basically using the universal fragment of the theory of ordered fields or something similar.) The proofs are likely constructive, since the generator basically works backwards, applying axioms at random. It isn't as generic as MetaGen, but it is useful for its purpose as a toy theorem-proving environment for testing AI. Also, an INT clone was used as synthetic training data for the recent OpenAI paper on solving inequalities.
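Roughly, the backward-generation idea looks like this (an illustrative sketch with made-up axioms, not INT's actual code):

```python
import random

# Sketch of INT-style backward generation: start from the trivially
# true equality x = x and grow one side by applying random axioms,
# recording each step. Replaying the steps in reverse proves the
# final statement. The axiom set here is made up for illustration.

AXIOMS = [
    ("add_zero",   lambda t: ("+", t, "0")),    # t  ~>  t + 0
    ("mul_one",    lambda t: ("*", t, "1")),    # t  ~>  t * 1
    ("double_neg", lambda t: ("-", ("-", t))),  # t  ~>  -(-t)
]

def generate(depth=3):
    lhs = rhs = "x"
    proof = []
    for _ in range(depth):
        name, rule = random.choice(AXIOMS)
        rhs = rule(rhs)        # rewrite one side with a random axiom
        proof.append(name)     # reversed, this list is the proof
    return ("=", lhs, rhs), proof

theorem, proof = generate()
print(theorem)  # e.g. ('=', 'x', ('+', ('-', ('-', ('*', 'x', '1'))), '0'))
print(proof)    # e.g. ['mul_one', 'double_neg', 'add_zero']
```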

I'm also aware of a number of other AI papers using manually coded synthetic data as a curriculum or just as extra data, again usually in some narrow fragment of mathematics. GPTf (for Metamath) uses synthetic arithmetic facts and ring algebra facts. Towards finding longer proofs uses synthetic statements of Robinson arithmetic, i.e. Peano Arithmetic without induction.

All these works basically use a fairly simple random generator, which works because they operate in a well-behaved fragment of mathematics (and usually the generator produces constructive proofs).

Also, it is possible to generate random chains of rewrites. The closest work I know to this is Mathematical Reasoning in Latent Space which creates datasets using random rewrites in HOL Light (to test if AI models can predict the effect of many rewrites in the future).
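A toy version of random rewrite chains (string rewriting here stands in for the HOL Light rewrites used in the paper):

```python
import random

# Toy random rewrite-chain generator. Each datapoint pairs a start
# term with the rewrites applied and the resulting end term. String
# rewriting here stands in for HOL Light rewrite theorems.

RULES = [
    ("x", "x+0"),   # additive identity, applied right to left (grows)
    ("x+0", "x"),   # additive identity, left to right (shrinks)
    ("x", "x*1"),   # multiplicative identity (grows)
    ("x*1", "x"),   # multiplicative identity (shrinks)
]

def random_chain(term="x", steps=5):
    applied = []
    for _ in range(steps):
        options = [(l, r) for l, r in RULES if l in term]
        lhs, rhs = random.choice(options)   # pick an applicable rule
        term = term.replace(lhs, rhs, 1)    # rewrite leftmost occurrence
        applied.append(f"{lhs} -> {rhs}")
    return applied, term

rewrites, end = random_chain()
print(rewrites, end)  # e.g. (['x -> x*1', 'x*1 -> x', ...], 'x+0')
```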

It's also well known that you can generate random SAT problems.
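For instance, a random 3-SAT generator in the fixed-clause-length model is only a few lines (a clause-to-variable ratio near 4.26 is the usual recipe for hard instances):

```python
import random

# Generate a random 3-SAT instance in the fixed-clause-length model
# and print it in DIMACS CNF format, readable by standard SAT solvers.
# A clause/variable ratio near 4.26 sits close to the satisfiability
# phase transition, where instances tend to be hardest.

def random_ksat(num_vars, num_clauses, k=3):
    clauses = []
    for _ in range(num_clauses):
        chosen = random.sample(range(1, num_vars + 1), k)  # k distinct vars
        clauses.append([v if random.random() < 0.5 else -v for v in chosen])
    return clauses

num_vars, num_clauses = 20, 85
clauses = random_ksat(num_vars, num_clauses)
print(f"p cnf {num_vars} {num_clauses}")
for clause in clauses:
    print(" ".join(map(str, clause)), "0")
```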

Neural statement generators

Transformer language models are good at generating any kind of text, including text in a formal language. First Neural Conjecturing Datasets and Experiments uses a language model to generate Mizar, and shows some success with conjecturing Mizar statements. I don't think they try to generate a conjecture along with a proof, but if they did, it would be an instance of generating synthetic theorems.

Also, Mathematical Reasoning via Self-supervised Skip-tree Training is another paper showing that you can use language models to predict theorem statements. The method is to mask out part of a theorem and have the model guess what is there.
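Schematically, the training-example construction is just span masking (the actual paper masks subtrees of HOL Light s-expressions; this token-level version is a simplification):

```python
import random

# Schematic skip-tree style masking: hide one chunk of a statement
# and train a model to predict it. The actual paper masks subtrees
# of HOL Light s-expressions; masking a token span is a stand-in.

def make_example(tokens):
    i = random.randrange(len(tokens))             # start of masked span
    j = random.randrange(i + 1, len(tokens) + 1)  # end (exclusive)
    context = tokens[:i] + ["<MASK>"] + tokens[j:]
    target = tokens[i:j]
    return " ".join(context), " ".join(target)

statement = "! n . n + 0 = n".split()
context, target = make_example(statement)
print(context, "=>", target)  # e.g. ! n . <MASK> = n => n + 0
```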

The Lean GPTf model (in the PACT paper and the recent OpenAI paper) uses a language model to predict arbitrary tactics, which in some cases include have and show tactics that could be considered as introducing a conjectured lemma. If the theorem prover completes the proof, then I guess it has generated a synthetic theorem.

Similarly, the GPTf paper, instead of predicting lemmas in the environment by name or reference, predicts the lemma statement (and then tries to unify that statement with an existing lemma in the database). Again, this can be viewed as a form of neural conjecturing using language models.

And of course, machine-aided conjecturing is quite an old topic, going back to the early days of AI, when AI was more symbolic and rule-based.

Neural proof generators

There are a number of papers, too many to list here, which use neural or other machine-learning methods to construct proofs given the statement. Many also do premise selection, i.e. selecting which lemmas from the whole library (at that point in the environment) to use in a proof.
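To give a flavor of premise selection, here is a deliberately crude sketch that ranks lemmas by token overlap with the goal (real systems learn embeddings or classifiers; only the shape of the ranking problem carries over):

```python
import math
from collections import Counter

# Toy premise selection: rank library lemmas by cosine similarity of
# their token counts against the goal. Real systems use learned
# embeddings or classifiers rather than raw token counts.

def vec(statement):
    return Counter(statement.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

def select_premises(goal, library, k=2):
    g = vec(goal)
    ranked = sorted(library, reverse=True,
                    key=lambda name: cosine(g, vec(library[name])))
    return ranked[:k]

library = {
    "add_comm":  "a + b = b + a",
    "mul_comm":  "a * b = b * a",
    "add_assoc": "( a + b ) + c = a + ( b + c )",
}
print(select_premises("( x + y ) + z = x + ( y + z )", library))
# ['add_assoc', 'add_comm']
```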

Other connections

Of course, if you have an automatic conjecturer and an automatic theorem prover, you can put them together to generate proved theorems. Your results will likely vary depending on how good each of the systems is.

Some AI for theorem proving papers use reinforcement learning and learn from fragments of half-successful proof attempts. That could be considered a form of theorem generation.

Also, it is possible to mine every subtree of a low-level proof (e.g. a term proof in Lean) from a theorem prover to get a whole bunch of extra data. The PACT paper does this for Lean. Lemma Mining over HOL Light does something similar to get more premises to use for theorem proving.
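In pseudocode, the mining step is just a subtree walk (the nested-tuple proof representation here is invented; PACT extracts from Lean's actual term proofs):

```python
# Sketch of mining every subtree of a low-level proof term for extra
# training data: each subterm, paired with its inferred type, yields
# a new (statement, proof) example. The nested-tuple representation
# is invented for illustration, not Lean's actual term format.

def subtrees(proof_term):
    """Yield every subtree of a proof given as nested tuples."""
    yield proof_term
    if isinstance(proof_term, tuple):
        for child in proof_term[1:]:  # children follow the head symbol
            yield from subtrees(child)

# ("app", f, x) stands for applying proof f to argument x
proof = ("app", ("app", "add_comm", "a"), "b")
for sub in subtrees(proof):
    print(sub)  # each subterm becomes a datapoint once typed
```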

One can think of functions in functional programming as proofs in intuitionistic propositional logic. As such, DreamCoder has a mechanism where it generates functions, using a neural network to guide the construction of the program tree (and the generated functions are used to train the neural function finder). I don't remember whether they also generate the function signatures (which would correspond to the theorem statements).

More answers to your questions

Most of the techniques I've mentioned so far are fairly logic-agnostic. The ones that use datasets, of course, require a constructive dataset if you are interested in generating constructive theorems.

In reinforcement learning for theorem proving, of which there are a number of papers, one constructs a proof using a tree search. Usually there is some sort of value function, learned through reinforcement learning, that guides the search. The two most common value functions are (1) an estimated probability that the current state of the proof is provable (by the agent), or (2) an estimate of how long the proof is (with infinity meaning not provable). One trains a neural network (or other machine-learning model) to estimate those values.
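As a sketch of how such a value function drives the search, a generic best-first loop suffices (here expand and value are stubs standing in for the tactic model and the learned value estimate):

```python
import heapq

# Generic best-first proof search guided by a learned value function.
# `value(state)` stands in for a trained model's estimate that the
# state is provable; `expand(state)` stands in for applying tactics;
# `state.is_proved()` is an assumed interface on proof states.

def best_first_search(root, expand, value, budget=1000):
    counter = 0                                 # tie-breaker for the heap
    frontier = [(-value(root), counter, root)]  # max-heap via negation
    for _ in range(budget):
        if not frontier:
            return None                         # search space exhausted
        _, _, state = heapq.heappop(frontier)   # most promising state
        if state.is_proved():
            return state                        # proof found
        for child in expand(state):             # try each candidate tactic
            counter += 1
            heapq.heappush(frontier, (-value(child), counter, child))
    return None                                 # search budget exhausted
```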

Final comment

Your project seems fairly cutting edge. Good luck, but don't expect to find a lot published doing the exact same things.

