[Cryptominisat-devel] CNF permutations and running times

Mate Soos soos.mate at gmail.com
Dim 2 Jan 13:08:41 CET 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Vegard,

On 12/31/2010 07:18 PM, Vegard Nossum wrote:
> I was experimenting a bit with running times of the solver on
> permutations of the same CNF formula. In this case, a permutation
> means that literals within each clause are reordered and that the
> clauses themselves are reordered. I would have expected that the
> solver would use approximately the same amount of time for any
> permutation of the formula, but clearly I was wrong. Here are my
> observations:
> 
> Running the solver 100 times on the very same CNF file gave the
> following results (CPU time):
> 
>  Min.   :10.03
>  1st Qu.:10.15
>  Median :10.33
>  Mean   :10.39
>  3rd Qu.:10.54
>  Max.   :11.22

The solver doesn't randomise anything, thus the running times should be
the same, as the executed number of instructions, logical memory layout,
etc. is the same. Other processes running and similar factors (branch
prediction, physical memory layout, etc.) influences things a bit, though.

> There is little variation so it looks pretty good. However, running
> the solver 100 times, each time on a new random permutation of the
> original CNF file

You could simply have done "--randomise=X". That should take care of
this. BTW, permutating the literals doesn't help, as we sort them, see
addClause() and addXorClause() functions. Permutating the variables
would be meaningful, though. (But maybe --randomise=X is good enough).

> gave these results instead:
> 
>  Min.   : 3.160
>  1st Qu.: 6.202
>  Median : 9.875
>  Mean   :10.218
>  3rd Qu.:12.697
>  Max.   :29.540
> 
> So the mean stayed about the same,

It's by chance that the plain CNF file also gave 10 sec as solving
time. It could have easily been 3.1 sec.

> but minimum and maximum times are
> way out, an order of magnitudes apart. Is this behaviour expected? The
> reason I ask is that I don't like the fact that literal and clause
> order have so much to say for the running time, simply because it
> _shouldn't_ matter for the difficulty of the problem. The problem is
> in all essential ways the same.
> 
> One possible explanation is that in the case of the random
> permutations, the solver simply made different choices, sometimes
> lucky and sometimes unlucky. But why didn't this happen for the first
> experiment (same file 100 times)? In that case, it means that the
> solver's decisions depend on the order of literals and clauses. Is
> that a good thing?

Well, it seems you have just re-discovered something that has been
bothering people for a very-very long time ;) Actually, a difference of
1 order of magnitude is considered not that bad ;)

A _very_ interesting thing that may be true to CryptoMiniSat, though, is
that UNSAT problems may not have such large variations. If this were
true, that would be a breakthrough.

Your problem was a satisfiable (SAT) problem, and it seems logical why
the solving times vary at least a certain amount. Let's suppose that we
have been solving the problem for some time, and we have connected the
dots (variables) well enough. All that the SAT solver is waiting for is
a good guess to set 3-4 variables right, and then it will set all
variables to the right setting to satisfy all clauses. Well, if this is
the case, then some variation is _expected_, right? The thing is, we
have to get to the point of connecting the dots. We _may_ sacrifice some
dot-connecting (i.e. resolution) with some more guessing, and this can
in fact be coded down (though I never did it). However, we can NOT forgo
guessing (see note (***) at end of email). We can forgo dot-connecting:
we just "brute-force" the instance at the SAT level, with BCP.

All right, so much for SAT instances. What happens with UNSAT instances?
Well, we MUST either:

1) go through the whole tree and realise there is no solution

2) do resolution until we reach UNSAT, building a resolution tree with
an empty clause at its root

3) do (2) and then (1)

As you can see, both of these are complete, which should make them
more robust (finding SAT is not complete, because you don't find ALL
satisfying solutions -- if you did, you would actually have the same
properties as solving for UNSAT -- as you would have to prove there are
no more solutions, essentially searching for UNSAT(!)). There is chance
in both of (1) and (2), of course:

(1*) Number (1): the tree can be smaller or larger depending on
  the ordering of the variables

(2*) Number (2): the resolution tree can be smaller or larger depending
  on the order of resolutions

The minimal resolution tree is itself NP-hard I think so it's not
exactly simple to get to, and we cannot expect the SAT solver to solve
an NP-hard problem at the same time as an NP-complete one, just so that
we could enjoy stable running times :D However. Maybe the possible
resolution trees have small variance, and since the SAT solver is
mainly building a random resolution tree instead of going through a
search graph (though it is doing the latter as well), this might mean
the solving time of an UNSAT instance may not vary so much... IF it was
doing things RIGHT, e.g. not dropping important resolved clauses,
concentrating on the part of the instance that could be proved UNSAT,
etc. So, if CryptoMiniSat had a relatively small variance, that would be
a breakthrough as it would mean either:

(A) We are doing the right thing, all the time (what an exaggeration :D

(B) We are doing things so randomly, that even though we are making lots
of mistakes, they average out very smoothly.

I would guess on (B).

> In any case, it seems that in order to get reliable timing experiments
> with several runs on the same or similar formulas, they should be
> randomly permuted before each solver run in order to get "better"
> (closer to the truth) values for the running times.

Yes, this was known before, there were even papers published, I think.
However, I think you could do the same with --randomise=X in CryptoMiniSat.

A more interesting thing than permutation, though is the following. If
your problem is generated from a type of problem, e.g. the AES cipher,
or it is e.g. a type of Sodoku problem with certain fixed parameters
(size, etc.), then it is _far_ better to generate many problems with
parts that are randomly created, e.g. the key is random, or the sodoku
hints are randomly placed, etc. In this case, you would not only get an
average time to solve a specific instance, but a whole range of
instances which you were interested in. After all, the problem in one
CNF file is _very_ specific -- it pertains to one instantiation of a
whole range of problems that you would like to solve. For instance, if
by chance you took the key "all-zero" for an AES problem, encoded it in
CNF, and then gave it to the MiniSat solver, it would typically find the
solution _extremely_ fast, as it by default guesses on 0-s. However, you
were not in fact interested how fast MiniSat solves for an all-zero key:
you were interested in how fast MiniSat solves for a random key, right?
Well, in that case, you have to generate hundreds of randomly generated
instances, and solve them all, then average the runtimes. This is what
Grain-of-Salt did, by the way, and the reasoning I have given in this
paragraph is probably the cause of the very smooth curve that is present
in the SAT paper that to my knowledge no one was able to achieve before.

> The original CNF file, a "shuffle" program, raw data, and histograms
> can be found at http://folk.uio.no/vegardno/shuffle.tar.bz2

Thanks for this, seems like you put quite some work into it. Maybe
trying to solve an UNSAT instance many times would be interesting, too.

Sorry for the overly long email. Maybe it helped clear up things both
for you and me ;)

Bests,

Mate

Note (***): We cannot forgo guessing, as it might happen that there are
multiple solutions. In this case, it might be impossible to resolve
clauses to the point of unitary clauses, for a very good reason. E,g.
there are two solutions:

011101000...
100010111...

Since if varX is 0 in the first solution then it is 1 in the second and
vica-versa, it is impossible to prove that _any_ variable takes _any_
value. In this specific case, the best the SAT solver can do, is to
prove binary clauses (e.g. var1=0 implies var2=1, var3=1, var4=1,
var5=0...). If there are more than 2 solutions, you can probably see
that it can be possible that even binary clauses cannot be proved
between certain variables. In this case, if _all_ possible clauses have
been derived, then the guessing will essentially be selecting a solution
from the solution space -- no backtracking will be needed, but some
guessing will be needed, just to select the solution to be displayed.

PS: Sorry for the overly long email. Maybe I should make a blog post out
of this. It's just a set of random thoughts, reasonings, ideas, and
accumulated practical knowledge. I might be wrong all the way, as
always. Someone proving me wrong is what I enjoy the most, actually.

- -- 
Mate Soos
Security Research Labs
http://www.srlabs.de
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk0gasYACgkQsTOOstKb0jmTeQCeImxXqMOa+QcA8TU2OJ6P9gxf
eggAoIjAdMU5htosESquO1M5a97pDpZs
=ADCB
-----END PGP SIGNATURE-----



More information about the Cryptominisat-devel mailing list