A properly typeset version of this document is available in postscript and in pdf.
If some fonts do not look right on your screen, this might be
fixed by configuring your browser (see the documentation here).
The satisfaction of constrained formulae is a key issue in complexity
theory. Many computational problems are shown to be NPcomplete
through a polynomial mapping onto the KSatisfiability (SAT)
problem. Recently, there has been much interest in a random version of
the KSAT problem defined as follows. Consider N Boolean variables
x_{i}, i=1,...,N. Call clause C the logical OR of K randomly
chosen variables, each of them being negated or left unchanged with
equal probabilities. Then repeat this process by drawing
independently M random clauses C_{l}, l =1 ,... ,M. The
logical AND of all clauses, F, is said to be satisfiable if
there exists a logical assignment to the x's evaluating F to
true, unsatisfiable otherwise.
Numerical experiments have concentrated upon the study of the
probability P_{N} (a , K) that a given F including
M=a N clauses be satisfiable. For large sizes of N, there appears a
remarkable behaviour: P seems to reach unity for a < a
_{c} (K) and vanishes for a > a _{c} (K) [6]. Such
an abrupt threshold behaviour, separating a SAT phase from an UNSAT
one, has indeed been rigourously confirmed for 2SAT, which is in P,
with a _{c} (2) =1 [2, 5]. For larger K ³ 3, KSAT is
in NP and much less is known. The existence of a sharp transition has
not been proven yet but precise estimates of the thresholds have been
found: a _{c} (3) ~ 4.25. Moreover, some lower and upper
bounds to a _{c} (3) (if it exists), a _{l.b.}=3.003 and
a _{u.b.}=4.64 respectively have been established
[4, 3].
The classical approaches to study the SAT phenomenon threshold are both
combinatorial and probabilistic. A statistical physics approach was used in
[8, 9]. Such an approach allows properties to be
predicted. It has
been applied already to random graphs and it has led to
large deviation results for the threshold
phenomenon of random graphs in addition to previously known
results. This approach
seems therefore to be
powerful. However
it proves much harder to apply to the SAT threshold
phenomenon. It yields in particular a surprising
change concerning the proportion of variables fixed in the neighbourhood of the
threshold between 2SAT and 3SAT. This could partly account for the complexity
gap between these two problems. In order to apply the statistical physics
approach, the following
steps were carried out.
First, the energy function corresponding
to the KSAT problem is identified. The logical values of the x's can be
represented by N binary variables S_{i}'s, called spins, through the
onetoone mapping S_{i}=1 (respectively +1) if x_{i} is false
(resp. true). We then encode the random clauses into a M× N
matrix C_{l i} in the following way: C_{l i}=1
(respectively +1) if the clause C_{l} includes x_{i}
(resp. x_{i}), C_{l i}=0 otherwise. Consider now the
costfunction E[C , S] defined as the number of
clauses that are not satisfied by the logical assignment corresponding
to configuration S. The minimum E[C] of E[C ,
S], that is, the lowest number of violated clauses that can be
achieved by the best possible logical assignment [8, 9], is a
random variable which becomes totally concentrated around its mean
value « E[C] » in the large size limit [1]. The
latter is accessible through the knowledge of the averaged logarithm
of the generating function
Z[C
]=
å
S
exp
(
 E[C , S] / T
)
(1)
since
« E[C ] » =  T « log Z[C ] » + O(T^{2})
when the auxiliary parameter T is eventually sent to zero. Being the minimal
number of violated clauses, « E[C ] » equals zero in the SAT
region and is positive in the UNSAT phase, allowing the location of
a _{c}(K).
The calculation of the average value of the logarithm of Z
in (1) is an awkward one. To circumvent this difficulty, we
compute the nth moment of Z for integervalued n and perform
an analytical continuation to real n in order to exploit the identity
« Z[C ] ^{n} » = 1 + n « log Z[C ] » +
O(n^{2}).
The nth moment of Z is obtained by replicating n
times the sum over the spin configuration S and averaging over
the clause distribution [8]
« Z[C
] ^{n} » =
å
S^{1} , S^{2} , ... ,
S^{n}
« exp
æ ç ç è

n
å
a=1
E[C ,
S^{a}] / T
ö ÷ ÷ ø
».
(2)
It is crucial to notice that the averaged term in (2)
depends on the n× N spin replicas only through the 2^{n} occupation
fractions c(s) labelled by the vectors s
with n binary components; c(s) equals the number (divided
by N) of labels i such that S_{i}^{a}=s ^{a}, " a=1,...,n. Taking into account the combinatorial entropy of the labels
i at fixed occupation fractions,
« Z[C ] ^{n} » ~ exp
(NF_{max})
where F_{max} is the maximum over all possible xs
of the functional [8]
F[ { c } ] = 
å
s
c(s) log c(s) + a log
é ê ê ë
å
s_{1} , ... ,
s_{K}
c(s_{1}) ···
c(s_{K})exp
æ ç ç è

1
T
n
å
a=1
K
å
l =1
d [s
a
l
+ 1 ]
ö ÷ ÷ ø
ù ú ú û
.
(3)
The optimisation conditions over F[ { c} ] provide 2^{n} coupled
equations for the cs. Notice that F is a symmetric functional,
that is, invariant under any permutation of the replicas a. A maximum
may thus be sought in the socalled replica symmetric (RS) subspace of
dimension n+1 where c(s) is left unchanged under the
action of the symmetric group. Within the RS subspace, the occupation
fractions may be conveniently expressed as the moments of a
probability distribution P(m) over the range 1£ m£ 1
[8]. Once the number of replicas n is sent to zero, we obtain a
selfconsistent functional equation for the order parameter
P(m) that can be solved numerically.
What is the meaning of the distribution P(m)? Consider a formula
F and all the spin configurations S^{j}, j=1,..., N attaining the minimum E[C ] of the costfunction
E[C,S]. Define then the average Boolean magnetisations
of the spins
m_{i} =
1
N
N
å
j=1
S_{i}^{j},
(4)
over the set of optimal configurations. Call H(C,m) the
histogram of the m_{i}s and H(m) the average of H(C,m) over
the choices of the formulae F. H(m) is a probability
distribution over the interval 1£ m£ 1 giving information about
the resulting constraints on the variables induced by the clauses. It
has been shown that, if the RS solution is the global maximum of
(3) (and not only a local one), H(m) equals the above
mentioned P(m) in the limit of large sizes N® ¥. Therefore,
the order parameter arising in the replica calculation reflects the
``microscopic'' structure of the solutions of the KSAT problem.
Of particular interest are the fully fixed variables, that is the
x_{i}'s such that m_{i} = ± 1. In the following, the fraction of fully
constrained variables will be denoted by g (a , K).
Clearly, g (a , K) vanishes in the SAT region otherwise
the addition of a new clause to F would lead to a
contradiction with a finite probability. Two kinds
of scenarii arise when entering the UNSAT phase. For 2SAT, g
(a , 2) smoothly increases above the threshold
a_{c}(2)=1. For 3SAT (and more generally K³ 3), g
(a , 3) exhibits a discontinuous jump to a finite value g
_{c}~ 0.9 slightly above the threshold. While a_{c}(2)=1
is correctly found, the RS prediction for a_{c}(3)=4.6
exceeds the experimental estimates by 10%. Work is currently under
progress to refine the above calculation and enlarge the subspace
where the global maximum is sought in.
Qualitatively speaking, however, we expect the main
conclusion of this work to be correct: the SAT/UNSAT transition is
accompanied by a smooth (respectively abrupt) change in the structure
of the solutions of the 2SAT (resp. 3SAT) problem. Furthermore, we
conjecture that this discrepancy is responsible for the difference of
typical complexities of both models recently observed in numerical
studies [10]. The typical solving time of search algorithms
displays an easyhardeasy pattern as a function of a with a peak
of complexity close to the threshold. The peak time seems to scale
polynomially with N for the 2SAT problem and exponentially with N
in the 3SAT case. From an intuitive point of view, the search for
solutions ought to be more timeconsuming in the presence of a finite
fraction of fixed variables since the exact determination of
the latter necessarily requires an exhaustive enumeration of the
variables. To test this conjecture, a mixed
2+pmodel has been introduced; it includes a fraction p
(resp. 1p) of clauses of
length two (resp. three) and thus interpolates between the 2SAT
(p=0) and 3SAT (p=1) problems. The RS theory predicts that
the SAT/UNSAT transition becomes abrupt when p > p_{0} =0.41. Precise
numerical simulations support the conjecture that the
polynomial/exponential crossover occurs at the same critical p_{0}. An
additional argument in favour of this conclusion is provided by the
analysis of the finitesize effects on P_{N} (a , K) and the
emergence of some universality for p<p_{0}. A detailed account of
these findings may be found in [7].
Broder (Andrei Z.), Frieze (Alan M.), and Upfal (Eli). 
On the satisfiability and maximum satisfiability of random
3CNF formulas. In Proceedings of the Fourth Annual ACMSIAM
Symposium on Discrete Algorithms (Austin, TX, 1993). pp. 322330. 
New York, 1993.
Chvátal (V.) and Reed (B.). 
Miks gets some (the odds are on his side). In Proceedings of the
33rd IEEE Symposium on Foundations of Computer Science, pp. 620627. 
1992.
Dubois (O.) and Boufkhad (Y.). 
A general upper bound for the satisfiability threshold of random
rSAT formulae. Journal of Algorithms, vol. 24,
1997, pp. 395420.
Frieze (Alan) and Suen (Stephen). 
Analysis of two simple heuristics on a random instance of
kSAT. Journal of Algorithms, vol. 20, n°2,
1996, pp. 312355.
Goerdt (Andreas). 
A threshold for unsatisfiability. Journal of Computer and System
Sciences, vol. 53, n°3, 1996, pp. 469486. 
1994 ACM Symposium on Parallel Algorithms and Architectures (Cape
May, NJ, 1994).
Mitchell (D.), Selman (B.), and Levesque (H. J.). 
Hard and easy distributions of SAT problems. In Proceedings of the American Association for Artificial Intelligence AAAI92,
pp. 459465. 
1992.
Monasson (R.), Zecchina (R.), Kirkpatrick (S.), Selman (B.), and Troyansky
(L.). 
Typicalcase complexity results from a new type of phase
transition. 
In preparation.
Monasson (Rémi) and Zecchina (Riccardo). 
Statistical mechanics of the random Ksatisfiability model. Physical Review E. Statistical Physics, Plasmas, Fluids, and Related
Interdisciplinary Topics. Third Series, vol. 56, n°2,
1997, pp. 13571370.
Selman (Bart) and Kirkpatrick (Scott). 
Critical behavior in the computational cost of satisfiability
testing. Artificial Intelligence, vol. 81, n°12,
1996, pp. 273295. 
Frontiers in problem solving: phase transitions and complexity.
This document was translated from L^{A}T_{E}X by
H^{E}V^{E}A.