[Matryoshka-devel] Clarifying Lambda-Free Higher-Order Logic
j.c.blanchette at vu.nl
Thu Mar 7 16:37:07 CET 2019
Alexander Steen and Christoph Benzmüller (henceforth S&B) have recently
submitted a manuscript titled "Nitpicking Lambda-Free Higher-Order Logic" to a
leading conference in the field of automated reasoning. Since they have made
their write-up semi-public, by making it available to the 42 members of the
conference's program committee, and I don't want to be accused of scheming
behind the scenes, I want to answer their claims publicly.
In an unfortunate departure with their previous publishing habits (e.g., for
IJCAR 2018), S&B refuse to post their draft online before acceptance, so I
will have to cite the relevant passages.
1. Lambda-free HOL
In the past three years, my colleagues and I have been developing calculi and
provers for a logic we call "lambda-free higher-order logic". The main
theoretical milestone has been a superposition-like calculus that does not
require full monotonicity of the term order, with a prototype implementation
in the Zipperposition prover (Bentkamp et al., IJCAR 2018). On the practical
side, we have extended the E prover's data structures and algorithms for this
logic (Vukmirović et al., TACAS 2019). Two further papers (presented at
FoSSaCS 2017 and CADE 2017) presented corresponding term orders.
Lambda-free (and Boolean-free) HOL is a stepping stone towards full HOL, our
actual aim. The next step is Boolean-free HOL with lambdas. It is the basis of
our draft "Superposition with Lambdas" . Our incremental approach is
motivated by the traditional encodings of HOL into FOL, which rely on (1) an
applicative encoding, transforming f x into @(f, x); (2) a complete set of
combinators, e.g. SKBCI, to represent lambdas; and (3) function symbols, or
"proxies", representing the logical connectives and quantifiers. Lambda-free
HOL eliminates the need for translation step 1; Boolean-free HOL eliminates
the need for step 2 as well.
2. Is lambda-free HOL a "logic"?
S&B note the absence of a Boolean type. Then they observe that without such a
type, there is no "obvious" way to express, say, "rains \/ ~rains".
Yet we are in excellent company here. Standard first-order superposition does
without an interpreted $o or \omicron type. It is easy enough to introduce
single constant "true", of an uninterpreted sort in a typed setting (e.g., TF0
or TF1). We can then write "rains = true \/ rains ~= true". There is no need
for a "false" constant or any axioms, let alone an interpreted $o. This is in
fact how E and other state-of-the-art provers implement predicate symbols, but
also how they are dealt with in Daniel Wand's PhD thesis, in polymorphic FOL.
Lambda-free HOL is a strict extension of the clausal first-order logic
considered in Bachmair and Ganzinger's original superposition papers, or of
Waldmeister's equational logic, neither of which have an $o sort, and to my
knowledge nobody ever questioned that these formalisms deserve to be called
3. Is lambda-free HOL a "higher-order logic"?
S&B also question whether lambda-free HOL deserves to be called a higher-order
logic. In our IJCAR 2018 paper, we wrote
The existence of [a translation like (1) above] explains why lambda-free
higher-order terms are also called "applicative first-order terms."
And in our TACAS 2019 paper:
Our logic corresponds to the intensional lambda-free higher-order logic
(lambdafHOL) described by Bentkamp, Blanchette, Cruanes, and Waldmann [7,
Sect. 2]. Another possible name for this logic would be "applicative
S&B do not quote any of the above passages.
As explained in our CADE 2017 paper, we prefer the HOL terminology because of
the syntax: X a or f X Y does not look first-order. There is ample precedent
in the literature for the lambda-free nomenclature, such as Lifantsev and
Bachmair's "LPO-Based Termination Ordering for Higher-Order Terms without
lambda-Abstraction" (TPHOLs 1998), but also for the "applicative first-order"
view, especially in the term rewriting community.
Herbert Enderton remarked that HOL with Henkin semantics is "nothing but
many-sorted first-order logic with comprehension axioms". One could argue that
the true split between FOL and HOL occurs once we restrict the models to
standard models; everything else is just convenience. Indeed, if one keeps
reading Stewart Shapiro beyond the sentences quoted by S&B, one sees that this
is also where he draws the line, not with the comprenhension axioms. Following
this line of thinking, it is debatable whether Leo-III and Satallax are
I must admit I cannot get excited about this debate, especially that lambda-free
HOL is merely a stepping stone for us (and a very useful one at that).
4. Are Booleans included?
The core of S&B's manuscript, Section 3, revolves around the question as to
whether lambda-free HOL has Booleans. Our IJCAR paper is clear about this, and
their Remark 1 acknowledges this. But then they experiment with our prototype
prover Zipperposition and find that it does not reject problems referring to
$o. They also observe that some of the IJCAR 2018 benchmarks contained $o.
Since these language features are used for evaluating lFHOL, we must
conjecture that a Boolean type o does in fact exist and we therefore
assume this in the remainder.
"Must conjecture"? Really?
My coauthors and I certainly deserve blame on two fronts:
1. Zipperposition, even as a prototype, should have had syntactic checks to
reject problems outside of its fragment.
2. Our IJCAR 2018 paper included some benchmarks we should have excluded we
did not filter out the TPTP problems properly. Our text claimed that we
exclude "problems that mix Booleans and terms", but we missed some. Any $o
that survives preprocessing is then treated as uninterpreted (which puts our
prover at a *disadvantage* in terms of establishing theoremhood compared with
higher-order provers such as Leo-III and Satallax).
And, for sure, our description of the logic could have been (even) clearer. We
will improve all of this in the technical report (in the coming days) and in a
forthcoming journal submission.
Nevertheless, none of this warrants suggesting the presence of an interpreted
Boolean sort when we explicitly stated that it is not present in our logic and
when we do not include any rules to cope with Booleans in our refutationally
complete calculus. The rest of S&B's manuscript, which is to say its bulk,
builds on this false premise. Surprise: Running Zipperposition on benchmarks
that are beyond its advertised fragment yields counterintuitive results.
4. Status of Zipperposition
In our papers, we always state that Zipperposition is a prototype, a word that
strangely enough does not appear in S&B's manuscript. When they refer to
Zipperposition as "the de-facto reference implementation of lFHOL", they are
misrepresenting our work. Our more serious implementation is in E 2.3,
configured with "--enable-ho". It was released in October 2018. E and the
benchmarks we used in the TACAS 2019 paper do not suffer from the flaws noted
above: E has syntactic checks, and the TPTP benchmarks were carefully
The TACAS work was advertised last year on the "matryoshka-devel" mailing
list, to which Steen subscribes, and as users of E S&B likely also receive
emails from Stephan Schulz when new versions of E are released. E's extension
was also described at the Matryoshka 2018 workshop, which Steen attended.
Thus, there is no justification for using Zipperposition as the unique
reference for the logic. The work on E is mischaracterized as "ongoing work",
when it was released over four months ago.
S&B dutifully warn the readers: "Any project interested in integrating
Zipperposition as a trusted black box in other applications or tools, should
rather be careful at the current state of development." How insightful is this
comment, given that Zipperposition's "README.md" file states the following?
Disclaimer: Note that the prover is a research project. Please don't use
it to drive your personal nuclear power plant, nor as a trusted tool for
S&B's comment is gratuitous also because no automatic prover is immune to
bugs. In the past ten years, I or my tools have discovered soundness bugs in
ten other proving systems (Alt-Ergo, Dafny, Leo-II, SPASS, TPS, tptp4X,
Vampire, veriT, Yices, and Z3). I certainly wouldn't use any automatic provers
as a black box, without proof reconstruction, to run my personal nuclear power
plant. I don't see why S&B have to single out Zipperposition in this respect.
Some of the oddities S&B discover when applying Zipperposition to problems
beyond the fragment for which it was designed are related to preprocessing.
It is an intriguing question whether there are more pre-processing
techniques where such a behaviour occurs and which, as a consequence, can
not be employed within lFHOL provers.
But there is no need to speculate about this; reading is enough. The technical
report associated with our TACAS 2019 paper has a section (Section 9) called
"Preprocessing", which in particular describes one optimization, definition
unfolding, that compromises refutational completeness (and hence soundness of
saturations w.r.t. the semantics of lambda-free HOL).
5. Metascientific Approach
S&B conclude with
Our objective has been to rationally reconstruct and assess the language
and logic of lambda-free higher-order logic as introduced in [the IJCAR
2018 paper]. We have been unsuccessful in the sense that we could not
reach a satisfying point of clarification.
And yet it would have been easy for us to clarify any misunderstandings if
they had only picked up the phone or sent us an email. The fact that
Benzmüller was on Skype with me for 101 minutes on 24 January 2019 (where he
helpfully guided me through parts of the literature) and did not use this
opportunity to seek clarifications regrettably speaks volumes about their
intentions. It would have taken me less than five minutes to explain that we
absolutely do not have Booleans in our logic (as Alex Bentkamp already told
him at IJCAR 2018, when Benzmüller asked about Boolean extensionality) and
that the closest thing we have to a reference implementation is E 2.3, not
Zipperposition. They could have spared themselves writing a manuscript and
focus instead on our shared objective, namely, to develop stronger automation
for higher-order logic.
Far from clarifying any issues, S&B's submission raises pseudo-questions and
make unwarranted conjectures, generating more noise than signal. Given the way
in which they selectively quoted and misrepresented our research, it is hard
to believe that they proceeded with the purest intentions. Despite the initial
appearances, their text reads more like a pamphlet than like a
curiosity-driven scientific investigation. The right course of action for them
would be to retract their flawed manuscript. Failing that, I hope they will
have the courage of their convictions, namely: put the PDF of their submission
online and revise their text to include a link to this answer .
Finally, I want to state that this is likely last time I comment on this publicly. I think
I have addressed all the concerned raised by S&B's manuscript and consider this
More information about the matryoshka-devel