BliStrTune:

Hierarchical Invention of Theorem Proving Strategies

Jan Jakubův & Josef Urban
Automated Reasoning Group @ Czech Technical University in Prague

CPP'17, 16th January 2017, Paris

Talk Roadmap

  1. E Prover and Proof Search Control
  2. Parameter Learning and E Protocols
  3. Hierarchical Invention of E Protocols
  4. Results: Performance of Invented Protocols

E Prover

by Stephan Schulz
  • automated theorem prover for FOL with equality
  • predefined --auto-schedule mode
  • command line arguments to guide proof search

E Prover

Guiding Proof Search
  • term ordering (KBO, LPO, ...)
  • literal selection (to perform superpositions on)
  • clause selection (to select a given clause)
  • axiom relevancy pruning (SInE)

E Prover

Given Clause Loop
  1. select an unprocessed clause as the given clause
  2. make all inferences with the given clause
  3. mark the given clause as processed
  4. goto 1 unless all clauses are processed
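
The four steps above can be sketched on a toy calculus. This is only an illustration, not E's implementation: it assumes propositional resolution in place of superposition, uses clause length as the weight, and the names `resolvents` and `given_clause_loop` are made up for this sketch. It also stops as soon as the empty clause is derived, which the list leaves implicit.

```python
import heapq
from itertools import count

def resolvents(c1, c2):
    """Propositional resolvents of two clauses (frozensets of nonzero ints)."""
    out = []
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            if not any(-l in res for l in res):  # drop tautologies
                out.append(res)
    return out

def given_clause_loop(clauses, weight=len, max_steps=10_000):
    """Return "proof" if the empty clause is derived, "saturated" if no
    unprocessed clause is left, "timeout" after max_steps selections."""
    tie = count()  # tie-breaker so the heap never compares clauses
    seen = {frozenset(c) for c in clauses}
    if frozenset() in seen:
        return "proof"
    unprocessed = [(weight(c), next(tie), c) for c in seen]
    heapq.heapify(unprocessed)
    processed = []
    for _ in range(max_steps):
        if not unprocessed:                       # step 4: nothing left
            return "saturated"
        _, _, given = heapq.heappop(unprocessed)  # step 1: smallest weight
        for other in processed:                   # step 2: all inferences
            for res in resolvents(given, other):
                if not res:                       # empty clause found
                    return "proof"
                if res not in seen:
                    seen.add(res)
                    heapq.heappush(unprocessed, (weight(res), next(tie), res))
        processed.append(given)                   # step 3: mark processed
    return "timeout"
```

For example, `given_clause_loop([[1], [-1, 2], [-2]])` encodes p, p → q and ¬q, which is unsatisfiable, so the loop ends with a proof.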

E Prover

Given Clause Selection
  • implemented using clause weight functions
  • each function assigns a real weight to each clause
  • the clause with the smallest weight is selected
  • parametrized combinations of several functions are possible

     (1*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
      4*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
      8*ConjSymbolWeight(PreferGround,0.2,50,100,5))
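
One way to read the coefficients 1, 4 and 8 is as selection frequencies in a weighted round-robin over several priority queues. That reading is an assumption, not something stated on the slide. The sketch below uses toy weight functions on strings, and `weighted_round_robin` is a hypothetical name.

```python
import heapq
from itertools import cycle

def weighted_round_robin(clauses, heuristics):
    """Yield clauses in selection order.

    heuristics: list of (frequency, weight_function) pairs. Each weight
    function orders its own priority queue; queue i is used `frequency`
    times per round.
    """
    queues = []
    for _freq, weight in heuristics:
        queue = [(weight(c), i) for i, c in enumerate(clauses)]
        heapq.heapify(queue)
        queues.append(queue)
    # e.g. frequencies (2, 1) give the schedule [0, 0, 1]
    schedule = [q for q, (freq, _w) in enumerate(heuristics) for _ in range(freq)]
    selected = set()
    for q in cycle(schedule):
        if len(selected) == len(clauses):
            return
        queue = queues[q]
        # lazily skip clauses already selected through another queue
        while queue[0][1] in selected:
            heapq.heappop(queue)
        _, i = heapq.heappop(queue)
        selected.add(i)
        yield clauses[i]
```

With `[(2, len), (1, lambda c: -len(c))]` the selector picks the two shortest items, then the longest one, and repeats.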
 

E Prover Protocol

  • command-line arguments guiding proof search
  • dozens of arguments, thousands of values

 -tKBO6 -WSelectComplexG ...
 -H'(1*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
     4*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
     8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
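
A protocol like the one above can be kept as structured data and rendered to arguments only when the prover is launched. This is a minimal sketch: the class and field names (`WeightTerm`, `Protocol`, `term_ordering`, `literal_selection`) are hypothetical, only the three argument forms shown on the slide are rendered, and the arguments elided as "..." are not modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeightTerm:
    frequency: int   # coefficient in front of the weight function
    function: str    # e.g. "ConjTermWeight"
    args: tuple      # arguments exactly as they appear on the slide

    def render(self):
        return f"{self.frequency}*{self.function}({','.join(map(str, self.args))})"

@dataclass(frozen=True)
class Protocol:
    term_ordering: str      # rendered after -t
    literal_selection: str  # rendered after -W
    heuristic: tuple        # tuple of WeightTerm, rendered after -H

    def to_args(self):
        terms = ",".join(t.render() for t in self.heuristic)
        # argv-style list: no shell quoting is needed around the -H value
        return [f"-t{self.term_ordering}",
                f"-W{self.literal_selection}",
                f"-H({terms})"]

protocol = Protocol(
    term_ordering="KBO6",
    literal_selection="SelectComplexG",
    heuristic=(
        WeightTerm(1, "ConjTermWeight", ("ConstPrio", 0, 1, 0.1, 18, 400, 50, 300)),
        WeightTerm(4, "ConjTermWeight", ("PreferUnits", 1, 1, 0.1, 100, 9999, 100, 5)),
        WeightTerm(8, "ConjSymbolWeight", ("PreferGround", 0.2, 50, 100, 5)),
    ),
)
```

Keeping frequencies and arguments as separate fields makes it easy for a tuner to change one value without string editing.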
 

E Prover Scheduler

  • one protocol is not enough
  • scheduler = collection of complementary protocols
  • protocols divide the available CPU time
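
A scheduler in this sense can be sketched in a few lines. The equal split of the time limit is an assumption made for simplicity, and `try_protocol` is a hypothetical callback that would run the prover with one protocol and report success.

```python
def run_schedule(protocols, total_time, try_protocol):
    """Give every protocol an equal slice of total_time and return the
    first protocol that succeeds, or None if all of them fail."""
    if not protocols:
        return None
    time_slice = total_time / len(protocols)
    for protocol in protocols:
        if try_protocol(protocol, time_slice):
            return protocol
    return None
```

Complementary protocols pay off here because a problem missed by the first protocol may be solved quickly by the second.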

ParamILS

Parameter Learning System
  • method for parameter tuning and algorithm configuration
  • by Hutter, Hoos, Stützle, Leyton-Brown, Fawcett
  • from University of British Columbia (UBC)
  • implementation available for download

Using ParamILS

To Invent E Protocols
  • given training problems ...
  • ... find the best performing protocol
  • Problem: Single protocol for all problems?
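
ParamILS is based on iterated local search over a discrete configuration space. The following is a much simplified sketch of that idea, not the ParamILS implementation: function names are made up, and `score` is a stand-in that, in the protocol setting, would count training problems solved by a configuration.

```python
import random

def one_exchange_neighbours(config, space):
    """All configurations that differ from config in exactly one parameter."""
    for name, values in space.items():
        for value in values:
            if value != config[name]:
                yield {**config, name: value}

def local_search(config, space, score):
    """First-improvement search until no single change helps."""
    improved = True
    while improved:
        improved = False
        for cand in one_exchange_neighbours(config, space):
            if score(cand) > score(config):
                config, improved = cand, True
                break
    return config

def iterated_local_search(space, score, rounds=20, kicks=2, seed=0):
    """Repeat: perturb the best configuration, re-run local search,
    keep the result if it is at least as good."""
    rng = random.Random(seed)
    start = {name: rng.choice(values) for name, values in space.items()}
    best = local_search(start, space, score)
    for _ in range(rounds):
        cand = dict(best)
        for name in rng.sample(sorted(space), k=min(kicks, len(space))):
            cand[name] = rng.choice(space[name])  # random perturbation
        cand = local_search(cand, space, score)
        if score(cand) >= score(best):
            best = cand
    return best
```

Usage takes a dictionary from parameter names to lists of allowed values, for instance `{"ordering": ["KBO", "LPO"], "freq": [1, 2, 4, 8]}` (hypothetical parameter names).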

BliStr: Blind Strategy Maker

by Josef Urban

  • giraffes ~ protocols
  • food ~ problems
  • the better a giraffe specializes ...
  • ... the more it gets fed and evolves

BliStr: Brief Overview

  1. evaluate initial protocols on training problems
  2. select protocol P to improve
  3. improve P on its best-cheap problems
  4. evaluate resulting P' and goto 2
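
The loop above can be sketched as follows. Several details are assumptions made only for illustration: a protocol's "best" problems are taken to be those it solves at least as fast as any other protocol, the protocol with the most such problems is selected next, and `evaluate` and `improve` are hypothetical callbacks (the latter standing in for a ParamILS run).

```python
def best_problems(name, results):
    """Problems that protocol `name` solves at least as fast as every
    other protocol. results[p][problem] is a runtime, or None if unsolved."""
    best = []
    for problem, runtime in results[name].items():
        if runtime is None:
            continue
        solved_times = [r[problem] for r in results.values()
                        if r.get(problem) is not None]
        if runtime <= min(solved_times):
            best.append(problem)
    return best

def blistr_loop(initial, problems, evaluate, improve, iterations=3):
    # 1. evaluate initial protocols on training problems
    results = {p: evaluate(p, problems) for p in initial}
    tried = set()
    for _ in range(iterations):
        # 2. select protocol P to improve (not yet improved, most best problems)
        candidates = [p for p in results if p not in tried]
        if not candidates:
            break
        chosen = max(candidates, key=lambda p: len(best_problems(p, results)))
        tried.add(chosen)
        # 3. improve P on the problems where it performs best
        new = improve(chosen, best_problems(chosen, results))
        # 4. evaluate the resulting P' and continue with step 2
        if new not in results:
            results[new] = evaluate(new, problems)
    return results
```

Protocols only need to be hashable here, so strings of prover arguments would work as keys.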

BliStr Life

BliStrTune

  • BliStr explores only a limited part of the E protocol space
  • it considers a fixed set of clause weight functions
  • only 12 functions with fixed arguments:

     ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300)
     ConjSymbolWeight(PreferGround,0.2,50,100,5)
     ...

BliStrTune

  • Idea: extend BliStr to also tune the weight function arguments
  • Problem: the parameter space becomes too big
    • ParamILS does not perform well on it
  • Solution: tune in two phases
    1. tune global parameters
    2. tune weight function arguments
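
The two-phase idea can be sketched as tuning one group of parameters while the other group is frozen. This is a toy illustration under stated assumptions: a greedy one-exchange search replaces ParamILS, the caller decides which parameters count as global, `score` is a stand-in for the number of problems solved, and repeating the two phases (`rounds`) is an optional extra.

```python
def tune_subset(config, names, space, score):
    """Greedy search that only changes the parameters listed in `names`."""
    config = dict(config)  # leave the caller's configuration untouched
    improved = True
    while improved:
        improved = False
        for name in names:
            for value in space[name]:
                cand = {**config, name: value}
                if score(cand) > score(config):
                    config, improved = cand, True
    return config

def two_phase_tune(config, global_names, fine_names, space, score, rounds=1):
    for _ in range(rounds):
        # phase 1: tune global parameters, weight function arguments frozen
        config = tune_subset(config, global_names, space, score)
        # phase 2: tune weight function arguments, global parameters frozen
        config = tune_subset(config, fine_names, space, score)
    return config
```

Each phase searches a much smaller space than the full product of all parameter values, which is the point of the split.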

Global Tuning Phase

  • tunes the global parameters; weight function arguments stay fixed

 -tKBO6 -WSelectComplexG
 -H'(3*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
     34*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
     8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
 

Fine Tuning Phase

  • tunes the weight function arguments; global parameters stay fixed

 -tKBO6 -WSelectComplexG
 -H'(3*ConjTermWeight(ConstPrio,0,1,0.1,18,400,50,300),
     34*ConjTermWeight(PreferUnits,1,1,0.1,100,9999,100,5),
     8*ConjSymbolWeight(PreferGround,0.2,50,100,5))'
     

Testing Benchmark

  • Mizar@Turing division of the CASC'12 competition
  • problems exported from Mizar
  • 1000 training problems known beforehand
  • 400 testing problems in the competition

Results on Training Problems

  prover               solved   V+ (relative to Vampire)
  E (BliStrTune)          744   +9.8%
  Vampire 4.0             677   +0%
  E (auto-schedule)       605   -10.6%

Results on Testing Problems

  prover               solved   V+ (relative to Vampire)
  E (BliStrTune)          280   +5.2%
  Vampire 4.0             266   +0%
  E (auto-schedule)       231   -13.1%
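
The V+ column is the relative difference in solved problems with respect to Vampire 4.0. The reported figures match truncation to one decimal place rather than rounding (744 against 677 is about 9.897%), so the sketch below truncates. That is an observation about the numbers, not a documented convention, and the function name is hypothetical.

```python
import math

def relative_to_baseline(solved, baseline):
    """Percentage difference to the baseline, truncated to one decimal."""
    tenths_of_percent = (solved - baseline) * 1000 / baseline
    return math.trunc(tenths_of_percent) / 10

training = {"E (BliStrTune)": 744, "Vampire 4.0": 677, "E (auto-schedule)": 605}
testing = {"E (BliStrTune)": 280, "Vampire 4.0": 266, "E (auto-schedule)": 231}

for table in (training, testing):
    baseline = table["Vampire 4.0"]
    for prover, solved in table.items():
        print(f"{prover:20} {solved:5} {relative_to_baseline(solved, baseline):+.1f}%")
```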

Progress on Testing Problems

BliStr vs. BliStrTune

Future Work

Thank you