Smallest Components in Combinatorial Structures

Daniel Panario

University of Toronto

Algorithms Seminar

February 16, 1998

[summary by Philippe Flajolet]

A properly typeset version of this document is available in postscript and in pdf.

If some fonts do not look right on your screen, this might be fixed by configuring your browser (see the documentation here).

Abstract
The smallest size of components in random decomposable combinatorial structures is studied in a general framework. The results apply to several combinatorial structures in both the labelled and the unlabelled case. Typical examples are the cycle decomposition of permutations and the factorization of polynomials over finite fields into irreducible factors.



1   Introduction

Many types of combinatorial objects decompose as sets of simpler basic objects diversely known as ``prime'', ``irreducible'', or ``connected'' components. For instance, a permutation decomposes as a set of cyclic permutations, a polynomial as a (multi)set of irreducible factors, and a graph as a set of connected components. Such situations are combinatorial analogues of the fact that natural numbers uniquely decompose as products of primes.

Let I be a class of basic objects, F the class of all sets of objects from I, that is
F=Set ( I).
As usual, this schema covers both the labelled case (L) where sets are built upon labelled products, and the unlabelled case (U) where multisets are intended. Enumeration is treated by generating functions [5]. The generating functions (gf's) F(z),I(z) corresponding to F, I, are taken to be either the exponential generating function (egf) in the labelled case or the ordinary generating function in the unlabelled case,
(L):   F(z)
=
 
å
n
Fn
zn
n!
  
I(z)=
 
å
n
In
zn
n!
(U):   F(z)
=
 
å
n
Fnzn  
I(z)=
 
å
n
Inzn,
with Fn,In the number of objects of size n in F, I. Then, the fundamental relations between generating functions are given by the exponential formulæ:
(L):   F(z) =eI(z)
(U):   F(z)
=
¥
Õ
k=1
(1-zk)
-Ik
 
= exp æ
ç
ç
è
I(z)+
1
2
I(z2)+
1
3
I(z3)+··· ö
÷
÷
ø
.
    (1)

The construction covers a number of classical combinatorial structures like permutations (cyclic, general), monic polynomials over a finite field of cardinality q (irreducible, general), functional graphs (connected, general) in either the labelled or the unlabelled case. In fact, the examples just cited all belong to an interesting class called the ``exp-log'' class that was introduced in [4].

Definition 1   A pair ( I, F) is said to have the exp-log property if I(z) has a unique dominant singularity r of the logarithmic type,
I(z)~
 
z®r
a log
1
1-z/r
+c0+O((1-z/r)
e
 
),     (2)
for some e>0, where a is called the multiplier. Accordingly, one has
F(z)~ eI(z)~ c1(1-z)-a,     c1=e
c0
 
.     (3)
It is understood that these expansions should hold in an indented disk of the type required by singularity analysis.

Based on the known facts for integers [12] and on specific combinatorial examples, the following properties are expected to hold true:
1. Prime Number Theorem
The asymptotic density of irreducible objects satisfies
In
Fn
~ (ae
-c0
 
G(a))
1
na
.
2. Gaussian law
The number of irreducible components in a random F-object of size n is asymptotically Gaussian with mean and variance each asymptotic to alog n.
3. Dickman's law
The density of F-object of size n whose largest I-component is of size m=n/u involves a function of which a prototype is the Dickman function r(u) classically defined by the difference-differential equation
r(u)=1  (0£ u£1),     ur'(u)+r(u-1)=0  (u>1).
4. Buchstab's law
The density of F-object of size n whose smallest I-component is of size m=n/u involves a function of which a prototype is the Buchstab function w(u) classically defined by the difference-differential equation
uw(u)=1  (1£ u £ 2),     ( uw(u) ) '=w(u-1)  (u>2).
The Prime Number Theorem for exp-log classes derives immediately from basic singularity analysis theorems. The Gaussian law was established in [4] by means of characteristic functions, thanks to the uniformity afforded by singularity analysis; it is an analogue of the classical Erdös-Kac theorem for the number of prime divisors of integers. The Dickman law is known originally from number theory [12] and it holds as well for the cycle decomposition of permutations [10], its extension to the general framework of exp-log classes being due to Gourdon [7]. The purpose of the talk is precisely to establish for exp-log structures the Buchstab law of smallest components by building upon Gourdon's analysis of largest components.

2   Cycles in Permutations

In its simplest terms the problems are well exemplified by the analysis of smallest and largest cycles in permutations. In an important paper, Shepp and Lloyd [10] established the Dickman law and the Buchstab law for permutations. Their approach is however based on an asymptotic-probabilistic model of permutations as sums of Poisson random variables of rates 1,1/2,1/3,... relayed by nonconstructive Tauberian arguments. Gourdon [7] was instead able to push the analytic approach to its ultimate limits, thereby solving the long-standing Golomb-Knuth conjecture; see [6].

From standard methods of enumerative combinatorics the egf's of permutations with all their cycles of size at most m (P[£ m](z)) or at least m+1 (P[>m](z)) are given by
P
[£ m]
 
(z)
=exp æ
ç
ç
è
z
1
+
z2
2
+...+
zm
m
ö
÷
÷
ø
 
=
1
1-z
exp æ
ç
ç
è
-
zm+1
m+1
-
zm+2
m+2
-··· ö
÷
÷
ø
P[> m](z)
=exp æ
ç
ç
è
zm+1
m+1
+
zm+2
m+2
+··· ö
÷
÷
ø
 
=
1
1-z
exp æ
ç
ç
è
-
z
1
-
z2
2
-...-
zm
m
ö
÷
÷
ø
.
Let Ln and Sn be the random variables that represent the largest cycle and the smallest cycle in a random permutation of size n. Equations (4) and (5) give access to probabilities, as
Pr {Ln£ m}=[zn]P
[£ m]
 
(z),     Pr{Sn> m}=[zn]P[> m](z).

In the analytic perspective, an important rôle is thus played by the decomposition of the logarithm into its partial sum and remainder,
log
1
1-z
=sm(z)+rm(z),     sm(z):=
m
å
k=1
zk
k
,     rm(z):=
 
å
k>m
zk
k
.

Consider now smallest cycles. For any fixed m, singularity analysis at z=1 immediately implies a formula for generalized derangements,
Pn[>m]º [zn] P[>m](z)=e
-Hm
 
+o(1),     (4)
where Hm=1+1/2+...+1/m is the harmonic number and the error term is exponentially small. There is no claim to uniformity, but this argument suggests for m tending to ¥ (at least sufficiently slowly) the approximate formula
Pn[>m] »
e
-g
 
m
.     (5)
Let Sn be length of the smallest cycle in a random permutation of size n. The estimate above suggests that the expectation of Sn satisfies
E[Sn]º
 
å
m³1
Pn[>m] = e
-g
 
log n(1+o(1)),
where the asymptotic estimate matches what is otherwise known about the distribution of Sn. However, an approximation of the form (7) cannot hold all the way up to m=n-1 since
Pn[>(n-1)]=
1
n
,     (6)
corresponding to cyclic permutations. A natural way to reconcile (7) and (8) is to look for a version that is of the form
Pn[>m]»
w(n/m)
m
,     (7)
where one should have w(1)=1 and w(+¥)=e-g. It turns out that an amended form of (9) does hold true with w(u) in (9) being precisely the Buchstab function.

3   The exp-log Class

The main theorem of the talk deals with the general exp-log case. We state it here in the case of a multiplier a=1 where the standard Buchstab function appears. Also, we develop the main ideas in the representative case of the cycle structure of permutations.

Theorem 1   For a random element of size n in an exp-log class F of multiplier a=1, the probability that the smallest component Sn is of size greater than m satisfies
Pr{Sn>m}=
1
m
w æ
ç
ç
è
n
m
ö
÷
÷
ø
+O æ
ç
ç
è
1
m2
+
log n
nm
ö
÷
÷
ø
,
uniformly over the range {0,...,,n-1}.
The proof starts from Cauchy's coefficient formula
Pn[>m]=
1
2ip
ó
õ
 


C
P[>m](z)
dz
zn+1
.     (8)
With the purpose of ``capturing the singularity'', the integration contour is taken to be a circle of radius close to 1, namely e-1/n. Set
z=e-t/n ,
where t ranges from 1-nip to 1+nip. Then z-n normalizes to an exponential et. The form (5) of the gf P[>m](z) involves rm(z) that is none other than a Riemann sum relative to the exponential integral,
E(v):= ó
õ
+¥


v
e-w
dw
w
.
Thus, everything rests on a uniform approximation of the Riemann sum rm(z) by the exponential integral. This is provided by the following key lemma of [6].
Lemma 1  [Gourdon]   One has uniformly for Â(h)>0 and |Á(h)|£p,
rm(e-h)=E(mh)+O æ
ç
ç
è
e-mh
m
ö
÷
÷
ø
.

(The proof of the lemma is based on the integral formula
rm(e-h)=
1
m
ó
õ
+¥


mh
e-s
1
1-e-s/m
  ds,
and the decomposition
1
1-e-z
= æ
ç
ç
è
1
1-e-z
-
1
z
ö
÷
÷
ø
+
1
z
,
where the first term is analytic near z=0.)

Using Lemma 1, one can justify replacing the remainder logarithm in the expression of
[zn](P[>m](z)-1)
by an exponential integral. In this way, one establishes rigourously the chain of approximations
Pn[>m]
=
1
2ip n
ó
õ
1+inp


1-inp
(e
rm(e-t/n)
 
-1) et dt
 
~
1
2ip n
ó
õ
1+i¥


1-i¥
(e
Et))
 
-1) et dt
 
~
1
2ip m
ó
õ
1+i¥


1-i¥
(eE(t)-1) e
t
 
dt,
    (9)
where µ=m/n. (This is easier said than done!)

Now, the form (11) is an inverse Laplace integral evaluated at 1/µ. It can be matched against the Laplace transform of w(u), itself directly derived from the defining difference-differential equation. Thus eventually, the Buchstab law arises from Cauchy's coefficient integral upon using a contour close to the singularity z=1 with a ``renormalization'' that leads to the appearance of a Laplace transform---the transform of Buchstab's function.

The technique adapts gracefully to all exp-log structures with multiplier a=1 since these behave analytically very nearly like permutations. For other multipliers a¹1, a function wa(u) closely related to the Buchstab function must be introduced (work in progress). Finally, like in Gourdon's treatment of largest components, other problems can be dealt with including: (i) local and central limit laws; (ii) distribution estimates for the rth largest component for small fixed r.

4   Applications

The analysis sketched here follows closely a preprint by Panario and Richmond [9] and the related works on largest components [6, 7]. It applies to all exp-log classes. In particular, it specializes to polynomials over finite fields and hence has consequences on the analysis of corresponding algorithms. We may cite here:
  1. The comparative analysis of several halting rules for the Distinct Degree Factorization phase of univariate polynomial factorization in [3], which requires knowledge of the degrees of the two largest irreducible factors.
  2. The analysis of the trial-and-error construction of irreducible polynomials by Ben-Or's algorithms [9], where only partial factorisations are attempted and a candidate polynomial is discarded as soon as its factor of smallest degree has been found.
More generally, the analogy between the prime decomposition of integers and exp-log structures is a striking fact that constitutes a valuable addition to the abstract theory of combinatorial schemas initiated by Soria [11]. (Other general approaches have been recently developed by a variety of authors in a stochastic perspective; see [1, 2, 8].)

References

[1]
Arratia (Richard), Barbour (A. D.), and Tavaré (Simon). -- Random combinatorial structures and prime factorizations. Notices of the American Mathematical Society, vol. 44, n°8, 1997, pp. 903--910.

[2]
Cameron (Peter J.). -- On the probability of connectedness. Discrete Mathematics, vol. 167/168, 1997, pp. 175--187.

[3]
Flajolet (Philippe), Gourdon (Xavier), and Panario (Daniel). -- Random polynomials and polynomial factorization. In Meyer auf der Heide (F.) and Monien (B.) (editors), Automata, Languages, and Programming, Lecture Notes in Computer Science, pp. 232--243. -- 1996. Proceedings of the 23rd ICALP Conference, Paderborn, July 1996. Journal version submitted toSIAM J. Computing and available as INRIA Res. Rep. 3370, 1998, 28 pages.

[4]
Flajolet (Philippe) and Soria (Michèle). -- Gaussian limiting distributions for the number of components in combinatorial structures. Journal of Combinatorial Theory, Series A, vol. 53, 1990, pp. 165--182.

[5]
Goulden (Ian P.) and Jackson (David M.). -- Combinatorial Enumeration. -- John Wiley, New York, 1983.

[6]
Gourdon (Xavier). -- Combinatoire, Algorithmique et Géométrie des Polynômes. -- PhD thesis, École polytechnique, June 1996.

[7]
Gourdon (Xavier). -- Largest component in random combinatorial structures. Discrete Mathematics, vol. 180, n°1-3, 1998, pp. 185--209.

[8]
Hansen (Jennie C.). -- A functional central limit theorem for random mappings. Annals of Probability, vol. 17, n°1, 1989.

[9]
Panario (Daniel) and Richmond (Bruce). -- Analysis of Ben-Or's polynomial irreducibility test. -- Preprint, 1997. 16 pages. Submitted to Random Structures and Algorithms.

[10]
Shepp (L. A.) and Lloyd (S. P.). -- Ordered cycle lengths in a random permutation. Transactions of the American Mathematical Society, vol. 121, 1966, pp. 340--357.

[11]
Soria-Cousineau (Michèle). -- Méthodes d'analyse pour les constructions combinatoires et les algorithmes. -- Doctorat ès Sciences, Université de Paris-Sud, Orsay, July 1990.

[12]
Tenenbaum (Gérald). -- Introduction à la théorie analytique des nombres. -- Institut Élie Cartan, Nancy, France, 1990, vol. 13.

This document was translated from LATEX by HEVEA.