BliStrTune:
Hierarchical Invention of Theorem Proving Strategies
Jan Jakubův & Josef Urban
Automated Reasoning Group @ Czech Technical University in Prague
CPP'17, 16th January 2017, Paris
Talk Roadmap
- E Prover and Proof Search Control
- Parameter Learning and E Protocols
- Hierarchical Invention of E Protocols
- Results: Performance of Invented Protocols
E Prover
by Stephan Schulz
- automated theorem prover for FOL with equality
- predefined --auto-schedule mode
- command line arguments to guide proof search
E Prover
Guiding Proof Search
- term ordering (KBO, LPO, ...)
- literal selection (to perform superpositions on)
- clause selection (to select a given clause)
- axiom relevancy pruning (SInE)
E Prover
Given Clause Loop
1. select an unprocessed given clause
2. make all inferences with the given clause
3. mark the given clause as processed
4. goto 1 unless all clauses are processed
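The loop above can be sketched on a toy calculus. The following is a minimal illustration only, assuming propositional resolution on clauses encoded as sets of integer literals and "smallest clause first" as the selection rule; E itself works on first-order clauses with superposition, and the names `resolvents` and `given_clause_loop` are hypothetical.

```python
def resolvents(c1, c2):
    """All non-tautological propositional resolvents of two clauses."""
    out = set()
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            if not any(-l in res for l in res):  # drop tautologies
                out.add(frozenset(res))
    return out


def given_clause_loop(clauses, max_steps=10_000):
    unprocessed = {frozenset(c) for c in clauses}
    processed = set()
    for _ in range(max_steps):
        if not unprocessed:
            return "saturated"          # nothing left to select
        # 1. select an unprocessed given clause (toy rule: smallest first)
        given = min(unprocessed, key=lambda c: (len(c), sorted(c)))
        unprocessed.remove(given)
        if not given:
            return "proof found"        # the empty clause was derived
        # 2. make all inferences between the given and processed clauses
        for other in processed:
            for r in resolvents(given, other):
                if r != given and r not in processed:
                    unprocessed.add(r)
        # 3. mark the given clause as processed
        processed.add(given)
        # 4. goto 1
    return "gave up"


unsat_result = given_clause_loop([[1], [-1, 2], [-2]])
sat_result = given_clause_loop([[1, 2], [-1, 2]])
```

The selection rule in step 1 is the part that the clause weight functions on the following slides control.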
E Prover
Given Clause Selection
- implemented using clause weight functions
- assign a real-valued weight to each clause
- the clause with the smallest weight is selected
- parametrized combinations possible
1*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300)
4*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5)
8*ConjSymbolWeight(PreferGround,0.2,50,100,5)
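One plausible reading of the integer prefixes (1, 4, 8) is as selection frequencies: each weight function orders its own queue, and the queues are visited round-robin in that ratio. The sketch below illustrates that reading only; the round-robin semantics is an assumption here, and the two toy weight functions on strings are hypothetical stand-ins for ConjTermWeight and ConjSymbolWeight.

```python
import heapq
from itertools import cycle


def selection_order(clauses, weighted_functions):
    """Return clauses in the order they would be selected.

    weighted_functions is a list of (frequency, weight_fn) pairs; each
    weight_fn maps a clause to a number, smaller meaning "select sooner".
    """
    queues = []
    for _, fn in weighted_functions:
        heap = [(fn(c), i) for i, c in enumerate(clauses)]
        heapq.heapify(heap)
        queues.append(heap)
    # Queue q appears `frequency` times in one round of the schedule.
    schedule = [q for q, (freq, _) in enumerate(weighted_functions)
                for _ in range(freq)]
    taken, order = set(), []
    for q in cycle(schedule):
        if len(taken) == len(clauses):
            break
        heap = queues[q]
        while heap:
            _, i = heapq.heappop(heap)
            if i not in taken:          # skip clauses another queue took
                taken.add(i)
                order.append(clauses[i])
                break
    return order


toy_clauses = ["p(a)", "q(f(a),b)", "r", "s(X,Y,Z)"]
prefer_short = len                      # hypothetical weight function
prefer_long = lambda c: -len(c)         # hypothetical weight function
order = selection_order(toy_clauses, [(1, prefer_short), (2, prefer_long)])
```

With frequencies 1 and 2, one clause is taken from the "short first" queue for every two taken from the "long first" queue.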
E Prover Protocol
- cmd-line arguments guiding proof search
- dozens of arguments, thousands of values
-tKBO6 -WSelectComplexG ...
-H'(1*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
4*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
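For tuning, it helps to treat a protocol as data and render the argument string only when the prover is run. A minimal sketch, assuming a protocol consists of just the three options shown above (-t, -W, -H); the `Protocol` class and its field names are hypothetical, and the shell quotes around the -H value are dropped because the arguments are produced as a list.

```python
from dataclasses import dataclass, field


@dataclass
class Protocol:
    term_ordering: str                  # value after -t, e.g. "KBO6"
    literal_selection: str              # value after -W
    # (frequency, weight function) pairs for the -H option
    heuristic: list = field(default_factory=list)

    def to_args(self):
        parts = ",".join(f"{freq}*{fn}" for freq, fn in self.heuristic)
        return [
            f"-t{self.term_ordering}",
            f"-W{self.literal_selection}",
            f"-H({parts})",
        ]


protocol = Protocol(
    term_ordering="KBO6",
    literal_selection="SelectComplexG",
    heuristic=[
        (1, "ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300)"),
        (4, "ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5)"),
        (8, "ConjSymbolWeight(PreferGround,0.2,50,100,5)"),
    ],
)
args = protocol.to_args()
```

Keeping frequencies and weight functions as separate fields makes it easy to vary one while holding the other fixed.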
E Prover Scheduler
- one protocol is not enough
- scheduler = collection of complementary protocols
- protocols divide the available CPU time
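One simple way to pick complementary protocols is a greedy cover over the problems each protocol solves, followed by an equal split of the time limit. This is only an illustrative sketch under those assumptions; the slides do not say how E builds its schedule, and `greedy_schedule` and its inputs are hypothetical.

```python
def greedy_schedule(solved_by, time_limit, max_protocols):
    """Pick protocols that each add the most not-yet-covered problems.

    solved_by maps a protocol name to the set of problems it solves.
    Returns (protocol, time share) pairs with an equal split of time_limit.
    """
    remaining = set().union(*solved_by.values())
    chosen = []
    while remaining and len(chosen) < max_protocols:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(solved_by),
                   key=lambda p: len(solved_by[p] & remaining))
        if not solved_by[best] & remaining:
            break                       # nobody adds anything new
        chosen.append(best)
        remaining -= solved_by[best]
    share = time_limit / len(chosen) if chosen else 0
    return [(p, share) for p in chosen]


schedule = greedy_schedule(
    {"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 2}},
    time_limit=60,
    max_protocols=2,
)
```

Here protocol C is never chosen: everything it solves is already covered by A, so it is not complementary.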
ParamILS
Parameter Learning System
- method for parameter tuning and algorithm configuration
- by Hutter, Hoos, Stützle, Leyton-Brown, Fawcett
- from University of British Columbia (UBC)
- implementation available for download
Using ParamILS
To Invent E Protocols
- given training problems ...
- ... find the best performing protocol
- Problem: Single protocol for all problems?
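The "ILS" in ParamILS stands for iterated local search. The sketch below shows the bare idea on a toy configuration space: local search to a local optimum, a random perturbation, and local search again. It is far simpler than ParamILS itself, and the parameter names, values, and the `score` function (which would be the number of training problems solved) are all hypothetical.

```python
import random


def local_search(cfg, space, score):
    """First-improvement search over one-parameter changes."""
    improved = True
    while improved:
        improved = False
        for name, values in space.items():
            for v in values:
                cand = {**cfg, name: v}
                if score(cand) > score(cfg):
                    cfg, improved = cand, True
    return cfg


def iterated_local_search(space, score, rounds=20, seed=0):
    rng = random.Random(seed)
    best = local_search({n: vs[0] for n, vs in space.items()}, space, score)
    for _ in range(rounds):
        cand = dict(best)
        name = rng.choice(sorted(space))        # perturb one parameter
        cand[name] = rng.choice(space[name])
        cand = local_search(cand, space, score)
        if score(cand) >= score(best):          # accept if not worse
            best = cand
    return best


# Toy stand-in for "run the prover on the training problems".
toy_space = {"x": [0, 1, 2], "y": ["a", "b", "c"]}
toy_target = {"x": 2, "y": "b"}
toy_score = lambda cfg: sum(cfg[k] == toy_target[k] for k in toy_target)
best_cfg = iterated_local_search(toy_space, toy_score)
```

In the real setting each call to `score` is expensive, which is why the size of the parameter space matters so much.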
BliStr: Blind Strategy Maker
by Josef Urban
BliStr: Blind Strategy Maker
- giraffes ~ protocols
- food ~ problems
- the better a giraffe specializes ...
- ... the more it gets fed and evolves
BliStr: Brief Overview
1. evaluate initial protocols on training problems
2. select protocol P to improve
3. improve P on its best-cheap problems
4. evaluate resulting P' and goto 2
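The four steps above can be sketched as a loop. This is a toy illustration, not BliStr's implementation: "best-cheap" is read here as "problems where P is at least as fast as every other protocol and within a cost limit", the rule "pick the untried protocol with the most such problems" is an assumption, and `run` and `improve` are hypothetical callbacks (in BliStr the improvement step is a ParamILS run).

```python
def best_cheap(p, results, problems, cheap_limit):
    """Problems where p is fastest (ties allowed) and solves cheaply."""
    return [q for q in problems
            if results[p][q] is not None
            and results[p][q] <= cheap_limit
            and all(results[o][q] is None or results[p][q] <= results[o][q]
                    for o in results)]


def blistr_loop(protocols, problems, run, improve, iterations, cheap_limit):
    # 1. evaluate initial protocols on training problems
    results = {p: {q: run(p, q) for q in problems} for p in protocols}
    tried = set()
    for _ in range(iterations):
        candidates = [p for p in results if p not in tried
                      and best_cheap(p, results, problems, cheap_limit)]
        if not candidates:
            break
        # 2. select protocol P to improve (assumed selection rule)
        p = max(candidates,
                key=lambda c: len(best_cheap(c, results, problems, cheap_limit)))
        tried.add(p)
        # 3. improve P on its best-cheap problems
        new = improve(p, best_cheap(p, results, problems, cheap_limit))
        # 4. evaluate the resulting P' and go back to step 2
        if new not in results:
            results[new] = {q: run(new, q) for q in problems}
    return results


# Toy world: a protocol is an integer, a problem is an integer, and the
# runtime is their distance (unsolved if the distance exceeds 5).
toy_run = lambda p, q: abs(p - q) if abs(p - q) <= 5 else None
toy_improve = lambda p, probs: sum(probs) // len(probs)
toy_results = blistr_loop([0, 10], [1, 2, 3, 8, 9],
                          toy_run, toy_improve, iterations=3, cheap_limit=3)
```

In the toy run, protocol 0 specializes into protocol 2 and protocol 10 into protocol 8, mirroring the giraffe picture: each protocol evolves toward the problems it already handles best.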
BliStrTune
- BliStr explores a limited E protocol space
- Considers a fixed set of clause weight functions
- Only 12 functions:
ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300)
ConjSymbolWeight(PreferGround,0.2,50,100,5)
...
BliStrTune
- Idea: Extend BliStr to change weight parameters
- Problem: The parameter space becomes too big
- ParamILS does not perform well
- Solution: Use two phases.
- tune global parameters
- tune weight function arguments
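The two-phase idea can be sketched as tuning one group of parameters while the other group stays fixed. A minimal illustration, assuming a greedy per-parameter search as a stand-in for a ParamILS run; the parameter names "freq" and "arg", the toy score, and the `rounds` argument (how often the two phases alternate) are hypothetical.

```python
def tune(cfg, names, space, score):
    """Greedy search restricted to the parameters in `names`.

    Parameters outside `names` keep their current values.
    """
    cfg = dict(cfg)
    for name in names:
        cfg[name] = max(space[name], key=lambda v: score({**cfg, name: v}))
    return cfg


def two_phase(cfg, global_names, fine_names, space, score, rounds=1):
    for _ in range(rounds):
        cfg = tune(cfg, global_names, space, score)  # global tuning phase
        cfg = tune(cfg, fine_names, space, score)    # fine tuning phase
    return cfg


toy_space = {"freq": [1, 3, 8], "arg": [0.1, 0.5, 1.0]}
toy_score = lambda c: -(abs(c["freq"] - 3) + abs(c["arg"] - 0.5))
tuned = two_phase({"freq": 1, "arg": 0.1},
                  global_names=["freq"], fine_names=["arg"],
                  space=toy_space, score=toy_score)
```

Each phase searches a much smaller space than a joint search over all parameters would.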
Global Tuning Phase
-tKBO6 -WSelectComplexG
-H'(3*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
34*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
Fine Tuning Phase
-tKBO6 -WSelectComplexG
-H'(3*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
34*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
Testing Benchmark
- Mizar@Turing division of the CASC'12 competition
- problems exported from Mizar
- 1000 training problems known beforehand
- 400 testing problems in the competition
Results on Training Problems
prover            | solved | V+ (vs. Vampire 4.0)
E (BliStrTune)    | 744    | +9.8%
Vampire 4.0       | 677    | +0%
E (auto-schedule) | 605    | -10.6%
Results on Testing Problems
prover            | solved | V+ (vs. Vampire 4.0)
E (BliStrTune)    | 280    | +5.2%
Vampire 4.0       | 266    | +0%
E (auto-schedule) | 231    | -13.1%
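The V+ column in both result tables is the number of solved problems relative to Vampire 4.0. The small check below recomputes it from the solved counts. Note the assumption: the reported figures match when the percentage is truncated to one decimal rather than rounded (for example 744/677 gives +9.897%, reported as +9.8%); this is inferred from the numbers, not stated in the slides.

```python
def relative_to_baseline(solved, baseline):
    """Percentage difference to the baseline, truncated to one decimal."""
    tenths = int((solved - baseline) * 1000 / baseline)  # int() truncates
    return tenths / 10


training = {"E (BliStrTune)": 744, "Vampire 4.0": 677, "E (auto-schedule)": 605}
testing = {"E (BliStrTune)": 280, "Vampire 4.0": 266, "E (auto-schedule)": 231}

training_vplus = {name: relative_to_baseline(n, training["Vampire 4.0"])
                  for name, n in training.items()}
testing_vplus = {name: relative_to_baseline(n, testing["Vampire 4.0"])
                 for name, n in testing.items()}
```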
Progress on Testing Problems
BliStr vs. BliStrTune