Random Group Automata
Cyril Nicaud
LIAFA, Université Paris 7
Algorithms Seminar
February 21, 2000
[summary by Marianne Durand]
A properly typeset version of this document is available in
postscript and in
pdf.
If some fonts do not look right on your screen, this might be
fixed by configuring your browser (see the documentation here).
Abstract
A group automaton is a complete deterministic automaton such that each
letter of the alphabet acts on the set of states as a
permutation [1, 5]. The aim is to describe an
algorithm for the random generation of a minimal group automaton with
n states. The treatment is largely based on properties of random
permutations and random automata.
1 Properties
A group automaton is a complete deterministic automaton such that each
letter of the alphabet acts on the set of states as a
permutation [1, 5]. We consider a group automaton
A, with states 1, 2, ..., n. The state 1 is the
initial state; the set of final states is denoted by F, the alphabet
by a, b, ..., and the transitions by q_{2}=d(q_{1},a) or
equivalently (q_{1},a,q_{2}).
Figure 1: A group automaton.
Let us recall that two states q_{1} and q_{2} of an automaton are
equivalent, notationally q_{1} ~ q_{2}, if for every word u, the
state d(q_{1},u) belongs to F if and only if d(q_{2},u)
belongs to F. The automaton A is minimal if A
has no distinct equivalent states. The structure properties of group
automata are: the minimal automaton of a group automaton is a group
automaton; the set of group automata is closed under union,
intersection and complementation but it is not closed under star and
product. As each letter acts like a permutation on the set of states,
there cannot exist two transitions (q_{1},a,q) and (q_{2},a,q) with
q_{1} and q_{2} distinct. This means that there is a ``reversibility''
property because when the automaton is in a state q after reading a
word u, it is possible to retrace the path followed.
We are now interested in the connexity of an automaton. An automaton
is connected if for any state q, there is a path joining the initial
state to q. Because of the reversibility property, if a group
automaton is connected then it is strongly connected, which means that
for any states q and q', there is a path from q to q'. A
group automaton is defined by the k permutations coding the
transitions and by the set F, where k is the cardinality of the
alphabet, so there are (2^{n}1)n!^{k} group automata. We show that, if
the alphabet has at least two letters, almost all group automata on
n states are connected. In order to do this we first state the fact
that given two permutations s and a the generated group
ás,añ is almost surely transitive. This can be
shown by a simple combinatorial argument. Take two letters a
and b and consider s_{a} the permutation related to a and
s_{b} the one related to b; then as
ás_{a},s_{b}ñ is almost always transitive the
automaton is almost always connected. We even have an asymptotic
estimate if the alphabet has exactly two letters:
Card(not connected group automata) 

Card(group automata) 

~ 

. 
2 Minimality
We now have to study the minimality of the automaton. An important
theorem is that almost all connected group automata are minimal. The
proof is partially based on the study of the oneletter case: if the
automaton is connected, then as there is only one letter a, the
permutation induced by a is a circular permutation. It is minimal
if it is not stable under a rotation which is equivalent to saying
that the word u=1···d^{k}(1,a)···d^{n1}(1,a) is
not a nontrivial factor of uu. Then in this case by counting the
words corresponding to minimal circular permutations we show that
almost all connected automata are minimal on a oneletter alphabet.
If the alphabet has more than one letter, we observe that for almost
all group automata, there is a letter a such that the permutation
induced by a on the set of states has only one cycle of maximum
length [3]. More precisely, we have the following lemma:
Lemma 1
The probability that a permutation s of size n has more than
two cycles of maximum length is o(1).
Proof.
Let c_{n,m} be the probability that a permutation of size n has
exactly two maximal cycles of size m+1. We note the generating
function C_{m}(z)=å_{n=0}^{¥}c_{n,m}z^{n} and c_{n}=å_{m£
n/2}c_{n,m}. The following equality holds:
C_{m}(z)= 

e^{z}··· e 

=



exp 
( 
r_{m}(z) 
) 
where r_{m}(z)=å_{n>m}z^{n}/n is the remainder of the
generating function of the logarithm. In order to get the coefficient
c_{n,m} we apply Cauchy's formula:
c_{n,m}= 

ó õ 



exp 
( 
r_{m}(z) 
) 

where C is a path around the origin. We choose for this
path a circle around the origin defined by: z=e^{1/n} and
we set z=e^{p/n} for a change of variable. So we have
c_{n,m}= 

ó õ 




e^{p(2m+2)/n} 

2(m+1)^{2} 


dp 
We now need to approximate some of the quantities in the integral, for
this we use a technique and a few lemmas provided in [2].
We first have the relations
r_{m}(e^{p})=E(mp)+O 
æ ç ç è 

ö ÷ ÷ ø 
and


= 

+ 

y 
æ ç ç è 

ö ÷ ÷ ø 
(1) 
with E(x)=ò_{x}^{¥}e^{v}/v dv and
y(z)=1/1e^{z}1/z, and where the error
term O(exp(mp)/M) is moreover uniform over Â(p)>0 and
Á(p)£p.
Property 1
For all a>0, the function e^{aE(u)} is bounded on Â(u)>0.
The relations 1 allow us to write, after we
set µ=m/n:
c_{n,m} 
= 

ó õ 

exp 
æ ç ç è 
E(µ
p)+O 
æ ç ç è 

ö ÷ ÷ ø 
ö ÷ ÷ ø 
æ ç ç è 

+ 

y 
æ ç ç è 

ö ÷ ÷ ø 
ö ÷ ÷ ø 
e^{p} e^{p(2m+2)/n} 

2(m+1)^{2} 

dp 


= 

ó õ 

exp 
( 
E(µ p) 
) 

æ ç ç è 

+ 

y 
æ ç ç è 

ö ÷ ÷ ø 
+O 
æ ç ç è 

ö ÷ ÷ ø 
ö ÷ ÷ ø 
e^{p} e^{p(2m+2)/n} 

2(m+1)^{2} 

dp. 

This rewrites as c_{n,m}=I_{1}+I_{2}+I_{3} where
I_{1} 
= 

ó õ 

exp 
( 
E(µ p) 
) 


e^{p} e^{p(2m+2)/n} 

2(m+1)^{2} 

dp, 

I_{2} 
= 

ó õ 

exp 
( 
E(µ p) 
) 

y 
æ ç ç è 

ö ÷ ÷ ø 
e^{p} e^{p(2m+2)/n} 

2(m+1)^{2} 

dp, 

I_{3} 
= 


ó õ 

exp 
( 
E(µ p) 
) 
O 
æ ç ç è 

ö ÷ ÷ ø 
e^{p} e^{p(2m+2)/n} 

2(m+1)^{2} 

dp. 

To study these three expressions, we use the fact that the quantities
exp(E(µ p)) (Property 1) and e^{p}
e^{p(2m+2)/n} are bounded uniformly on m . This helps us
to give an upper bound for these three expressions: first,
and this approximation is uniform on m. Second
I_{2}= 
ó õ 

O(1) 

y 
æ ç ç è 

ö ÷ ÷ ø 

dp 
as y is also bounded uniformly on m we have
I_{2}= 


ó õ 

O(1) dp=O 
æ ç ç è 

ö ÷ ÷ ø 
. 
Third, as in the case of I_{1}, we obtain
I_{3}= 

ó õ 

O 
æ ç ç è 

ö ÷ ÷ ø 

dp=O 
æ ç ç è 

ö ÷ ÷ ø 
. 
Combining these estimates we obtain
c_{n,m}=O(log n/m^{2}) uniformly on m. The
approximation is going to be useful when m is greater than (n)^{1/2}; otherwise we use the following lemma: Lemma 2 The
probability that a permutation s of size n has a maximal
cycle of length smaller than (n)^{1/2} is o(1).
Proof.
Let p_{n,m} be the probability that a permutation of size n has all its cycles
of size smaller than m. The saddlepoint method gives us an upper bound for
the quantity p_{n,m}. Then we have
p_{n,m}=[z^{n}]e 

£ 

where
l_{m}(z)=z+...+ 

. 
The saddlepoint method drives
us to apply this inequality to the value r=n^{1/3m} chosen
to fit the minimum, which gives
p_{n,m}£


,
so 
p 

£ e 

=o(1). 
The probability that a permutation has two maximal cycles of size m
is bounded by the probability that a permutation has one maximal cycle
of size m. Therefore the probability that a permutation of size n
has two maximal cycles of size smaller than (n)^{1/2} is o(1). So
c_{n}=o(1)+å_{m=(n)}^{1/2}^{m=n/2}c_{n,m}=o(1) by the approximation
c_{n,m}=O(log n/m^{2}). Lemma 1 directly follows
by showing that almost all permutations of size n having at least
two maximal cycles have exactly two maximal cycles.
We define E_{n} as the set of group automata A of
size n that are connected and with the property that there exists
one letter a such that the permutation induced by a has only one
maximal cycle. By Lemma 1, we show that almost all connected group
automata belong to E_{n}. Furthermore, if A
belongs to E_{n} then we can show that the maximal cycle
of s_{a} does not interfere with other cycles, because of their
different cardinalities and so we can use the oneletter case, and say
that this maximal cycle is almost always minimal. As the automaton
considered is connected, this implies that the automaton is minimal.
So we have the following result:
Theorem 1
Almost all group automata are minimal.
Proof.
E_{n}ÌMinimal_{n}ÌConnected_{n}ÌGroup
Automaton_{n}, and we have proved that almost every group automaton is
in E_{n}.
3 Algorithm
This work naturally leads to an algorithm for generating uniformly at
random a minimal connected group automata. Here the cardinality of the
alphabet is bounded. The size of an automaton is the number n of
states of its minimal automaton. The algorithm is:

generate a random group automaton A using a function
returning a random permutation for each letter of the alphabet. The
cost is O(n);

test if AÎ E_{n}, if not use Hopcroft's algorithm
to check if it is minimal. Since Hopcroft is used rarely, the cost
is O(n);
this being done a constant number of time on average, because of the
theorem above.
This yields a linear complexity in the average case, which is better
than the best known algorithm by Hopcroft [4] which has
complexity nlog n.
References
 [1]

Eilenberg (Samuel). 
Automata, languages, and machines. Vol. A. 
Academic Press, New York, 1974, xvi+451p. Pure and
Applied Mathematics, Vol. 58.
 [2]

Gourdon (Xavier). 
Combinatoire, algorithmique et géométrie des
polynômes. 
Thèse, École polytechnique, 1996.
 [3]

Gourdon (Xavier). 
Largest component in random combinatorial structures. In Proceedings of the 7th Conference on Formal Power Series and Algebraic
Combinatorics (NoisyleGrand, 1995), vol. 180, pp. 185209. 
1998.
 [4]

Hopcroft (John). 
An n log n algorithm for minimizing states in a finite automaton.
In Theory of machines and computations (Proc. Internat. Sympos.,
Technion, Haifa, 1971). pp. 189196. 
Academic Press, New York, 1971.
 [5]

Hopcroft (John E.) and Ullman (Jeffrey D.). 
Introduction to automata theory, languages, and computation. 
AddisonWesley Publishing Co., Reading, Mass., 1979,
x+418p. AddisonWesley Series in Computer Science.
This document was translated from L^{A}T_{E}X by
H^{E}V^{E}A.