Reinforcement Learning of Theorem Proving

Cezary Kaliszyk

We introduce a theorem proving algorithm that integrates machine learning. Instead of a complete brute-force search, the algorithm runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. Various predictors for estimating the usefulness of proof steps and the likelihood of closing the open tableaux branches will be compared. Complete proof search with backtracking will be replaced by Monte-Carlo tree search with upper confidence bounds. The trained system solves within the same number of inferences over 40% more problems than the human-optimized baseline prover.

Back to the main page