[Matryoshka-devel] Evil axioms

Haniel Barbosa hanielbbarbosa at gmail.com
Wed Apr 19 20:42:21 CEST 2017


Pascal Fontaine writes:

> Hi All,
> On 04/19/2017 07:28 PM, Blanchette, J.C. wrote:
>> Hi all,
>> I was about to write an email with five recipients but then it occurred to me that we have this mailing list, which can be used for general discussions as well as for announcements. I hope this is not too much of a problem to whomever I promised the list would be low traffic.
>> The typical "hammer" use case of automatic provers includes hundreds of heuristically selected lemmas as axioms in the generated problems. However, I've often observed that some axioms can be "evil", meaning that a single axiom can be enough to go from "provable within 1 s" to "the prover spins seemingly forever". This is not so surprising in itself (both superposition and SMT are unstable in many ways), but I suspect that this evilness might not always be the result of randomness. Maybe some axioms are just generally evil, and the provers are not always robust enough.
>> For example, it is well known that for superposition provers, axioms with unit equalities between two variables, such as extensionality
>>  (forall x. F x = G x) => F = G
>> are known as being evil, because they can a priori be applied anywhere to rewrite any term into any term; they're extremely "prolific", yielding lots of new clauses. Superposition provers neutralize this by down-prioritizing such inferences, so that such evil axioms are not actually evil in the end. (The price you pay is that when you really need them, they don't kick in; but see also the work by the Vampire guys on extensionality in superposition.)
>> Extensionality is not evil thanks to the down-prioritization (or Vampire's extensionality extension), but there might be other patterns that are evil for prover X or calculus Y. I have some concrete questions below, but I'd be curious about your thoughts or further questions.
>> Q1. Is this concept of "evilness" discussed anywhere in the literature?

Leino had a paper on CAV last year that kinda goes into this:

"Trigger Selection Strategies to Stabilize Program Verifiers"

They mostly talk about local matching loops (a formula generating instances that
trigger new instantiations of the same formula) and how to select triggers in a
way to avoid them.

>> Q2. Do provers have some general robustness mechanisms to deal with evil axioms? Should this be investigated?
>> Q3. Would it make sense to empirically evaluate the evilness of thousands of axioms and to try to identify some patterns? Has anybody ever done anything like that?
> I do not have answers to the above questions.  It is interesting.
> Probably very empirical at first.  A project for a bachelor student?
> In veriT (Haniel, correct me), we do not do anything special.  I think
> Andy is actually very aware of axioms that can go wrong, and has
> mitigating measures for them.
>> For the record, axiom a28 in the attached SMT-LIB problem ruins everything on that problem for the repository version of veriT. I don't know if it's really evil, though, without trying to add it to other easily solvable problems.
> It is probably due to the fact that this is a constant-generating axiom,
> because of the forall v2, which is Skolemized...

That's what I think also.

In veriT we have a mode which avoids some matching loops, but it does not seem
to take effect in this case. I believe the instances of this axiom are
interacting with other parts of the formula and scaping our (anyway kinda naive)
loop detection.

Haniel Barbosa

More information about the matryoshka-devel mailing list