Distribution of Image Points in Random Mappings

Michèle Soria

Université Paris VI

Algorithms Seminar

November 10, 1996

[summary by Pierre Nicodème]

Abstract

This talk presents a general theorem which can be used to identify the limiting distribution for a class of combinatorial schemata. For example, many parameters in random mappings can be covered in this way.

1 Methods

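We consider the general working scheme "symbolic structures $\mathcal{A}$ or $\{\mathcal{A},\omega\}$ $\to$ generating functions $a(z)$ or $a(u,z)$ $\to$ coefficients $a_n$ or $a_{n,k}$". Then, by Cauchy's formula, we get for structures $\mathcal{A}$

$$a(z) = \sum_{\alpha\in\mathcal{A}} \frac{z^{|\alpha|}}{|\alpha|!} = \sum_{n\ge 0} a_n \frac{z^n}{n!}, \qquad \frac{a_n}{n!} = \frac{1}{2i\pi}\oint a(z)\,\frac{dz}{z^{n+1}}.$$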
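The coefficient extraction by Cauchy's formula can be illustrated numerically. The following is a minimal sketch, not part of the talk: the function name `cauchy_coefficient`, the radius and the number of sample points are illustrative assumptions, and the contour is simply a circle discretized by the trapezoidal rule.

```python
import cmath
from math import factorial

def cauchy_coefficient(f, n, radius=0.5, samples=256):
    """Approximate [z^n] f(z) = (1/(2 i pi)) * contour integral of f(z) dz / z^(n+1).

    With z = radius * exp(i*theta), dz = i z dtheta, so the integral becomes
    the average of f(z) / z^n over the circle (trapezoidal rule).
    """
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * cmath.pi * k / samples)
        total += f(z) / z**n
    return (total / samples).real

# Example: exp(z) is the exponential generating function of sets (a_n = 1),
# so n! * [z^n] exp(z) should be close to 1.
a_5 = factorial(5) * cauchy_coefficient(cmath.exp, 5)
```

The radius only has to stay inside the disc of convergence of the function being inverted; asymptotic analysis replaces this circle by contours that pass close to the dominant singularity.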

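When considering marked structures with parameters $\{\mathcal{A},\omega\}$ (where $\omega$ is a mapping $\mathcal{A}\to\mathbb{N}$), we have

$$a(u,z) = \sum_{\alpha\in\mathcal{A}} u^{\omega(\alpha)} \frac{z^{|\alpha|}}{|\alpha|!} = \sum_{n,k} a_{n,k}\,u^k \frac{z^n}{n!}.$$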

In this case, $a_{n,k}$ can be obtained by a double Cauchy inversion, or by a single Cauchy inversion combined with the continuity theorem for characteristic functions. Table 1 gives some examples of the translation of marked combinatorial structures into generating functions. The mark is represented by the character "·" and translates to the parameter $u$.

Description | Structure | Generating function
Degree at the root in Cayley trees | A = Node × Set(·A) | $a(u,z) = z\exp(u\,a(z))$
Random mappings | G = Set(Cycle(A)) | $g(z) = 1/(1-a(z))$
--- by number of cycles | G = Set(·Cycle(A)) | $g(u,z) = \exp\left(u\log\frac{1}{1-a(z)}\right)$
--- by number of trees | G = Set(Cycle(·A)) | $g(u,z) = 1/(1-u\,a(z))$

Table 1: Some examples of generating functions
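The "Random mappings" row of Table 1 can be checked on truncated power series. The sketch below is not from the talk: it assumes Cayley's count $a_n = n^{n-1}$ of rooted labelled trees and uses hypothetical helper names. It expands $1/(1-a(z))$ and recovers the number $n^n$ of mappings on $n$ points.

```python
from fractions import Fraction
from math import factorial

def cayley_egf(N):
    """Coefficients a_n / n! of the tree function, with a_n = n^(n-1) and a_0 = 0."""
    return [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]

def one_over_one_minus(a, N):
    """Coefficients of g = 1/(1 - a) up to z^N, using g = 1 + a*g (requires a[0] == 0)."""
    g = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        g[n] = sum(a[k] * g[n - k] for k in range(1, n + 1))
    return g

def mapping_counts(N):
    """n! * [z^n] 1/(1 - a(z)) for n = 0..N."""
    g = one_over_one_minus(cayley_egf(N), N)
    return [int(g[n] * factorial(n)) for n in range(N + 1)]

counts = mapping_counts(6)  # expected to agree with n**n for n = 0..6
```

Exact rational arithmetic is used so that the comparison with $n^n$ is an equality rather than a floating-point approximation.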

By a classical theorem about characteristic functions, $(X_n)$ converges weakly to $Y$ if and only if $\varphi_{X_n}(\theta)$ converges to $\varphi_Y(\theta)$ for all $\theta$, with $\varphi_Z(\theta) = E(e^{i\theta Z})$. We also have $a(u,z)=\sum_{n,k} a_{n,k}u^k z^n/n! = \sum_n p_n(u)z^n/n!$, which gives the probability generating function of $X_n$ as $p_n(u)/p_n(1) = \sum_k \Pr(X_n=k)u^k$. We refer to [2] for the concept of (labelled) combinatorial structures and their translation to generating functions.

2 Trees and Random Mappings

A random mapping is an arbitrary mapping $f: \{1,\dots,n\}\to\{1,\dots,n\}$, where every mapping has probability $n^{-n}$. A mapping $f$ can be identified with its functional graph $G_f$, with vertices $\{1,\dots,n\}$ and edges $(i,f(i))$ for $1\le i\le n$. Each component of $G_f$ consists of a cycle, and every cyclic point is the root of a tree.
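The structure of a functional graph can be made concrete with a short sketch. It is an illustration only: the function names are hypothetical, and points are labelled $0,\dots,n-1$ rather than $1,\dots,n$ to match Python indexing.

```python
def cyclic_points(f):
    """Cyclic points of the mapping i -> f[i] on {0, ..., n-1}.

    Every tail has length at most n-1, so the image of the n-th iterate
    of f is exactly the set of points lying on a cycle.
    """
    current = set(range(len(f)))
    for _ in range(len(f)):
        current = {f[i] for i in current}
    return current

def count_components(f):
    """Number of connected components = number of cycles of the functional graph."""
    seen = set()
    count = 0
    for c in cyclic_points(f):
        if c not in seen:
            count += 1
            while c not in seen:   # walk once around the cycle through c
                seen.add(c)
                c = f[c]
    return count

# 0 -> 1 -> 2 -> 0 is a cycle, 4 -> 3 -> 0 is a tail, 5 is a fixed point.
f = [1, 2, 0, 0, 3, 5]
```

For this example, `cyclic_points(f)` is `{0, 1, 2, 5}` and there are two components: the tree hanging off the cyclic point 0 contains the points 3 and 4.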

The basic property for the analysis is that solutions of functional equations usually have an algebraic singularity of square-root type. For trees, we get $a(u,z) = \tau(u,z) - h(u,z)(1-z/\rho(u))^{1/2}$. For sequences of trees, we get an expression of the form $1/(1-a(u,z))$, and for random mappings an expression of the form

$$s(u,z) = \frac{1}{1-T(u,z)+h(u,z)(1-z/\rho(u))^{1/2}}.$$

We recall that when we get an expression of the form $1/(1-uC(z))$, the asymptotic distribution of the corresponding random variable depends on the value $C(\rho_c)$, where $\rho_c$ is the only singularity on the circle of convergence of $C(z)$. If $C(\rho_c)>1$, the limit law is normal; if $C(\rho_c)<1$, the limit law is the derivative of a geometric law; and if $C(\rho_c)=1$, the limit law is Rayleigh.

3 Examples

Leaves.

For Cayley trees, we have $a(u,z) = z e^{a(u,z)} + z(u-1)$; for sequences of trees, $s(u,z)=1/(1-a(u,z))$; and for functional graphs,

$$g(u,z) = \frac{1}{1-z e^{a(u,z)}} = \frac{1}{1-a(u,z)+z(u-1)}.$$

Nodes of arity r.

For trees,

$$a(u,z) = z\left(\sum_{m\neq r}\frac{a^m(u,z)}{m!} + u\,\frac{a^r(u,z)}{r!}\right) = z e^{a(u,z)} + z(u-1)\frac{a^r(u,z)}{r!}.$$

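For sequences of trees, we have $s(u,z)=1/(1-a(u,z))$, and for functional graphs, where a cyclic point already has one preimage on its cycle,

$$g(u,z) = \frac{1}{1-a(u,z)+z(u-1)\left(\dfrac{a^r(u,z)}{r!}-\dfrac{a^{r-1}(u,z)}{(r-1)!}\right)}.$$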

Nodes at distance d from a cycle.

We have the recurrence

$$a_0(u,z) = u\,a(z), \qquad a_{d+1}(u,z) = z e^{a_d(u,z)}.$$

For functional graphs, this gives $g(u,z) = 1/(1-a_d(u,z))$.

Nodes with r pre-images in total.

For trees, we have $a(u,z) = z e^{a(u,z)}+(u-1)a_{r+1}z^{r+1}/(r+1)!$, where $a_{r+1} = (r+1)^r$ is the number of trees of size $r+1$. For functional graphs, we have G = Set(Cycle(A)), which translates to $g(z) = \exp\left(\sum_{p\ge 1}a^p(z)/p\right)$. This gives

$$g(u,z) = \frac{1}{1-z e^{a(u,z)}}\exp\left(\frac{z^r}{r}\sum_{p=1}^{r}(u^p-1)\frac{r^{r-p}}{(r-p)!}\right) = \frac{K(u,z)}{1-a(u,z)+(u-1)a_{r+1}z^{r+1}/(r+1)!}.$$

Nodes d iterated.

(These nodes are at distance $\ge d$ from a leaf.) For trees, we have

$$a_d(u,z) = z u\,e^{a_d(u,z)} - (u-1)\,l_d(z), \quad\text{with}\quad l_0(z)=0,\quad l_{d+1}(z) = z e^{l_d(z)}.$$

For functional graphs, we have, for nodes at distance $\ge d$ from a leaf of their own subtree, $s_d(u,z) = 1/(1-a_d(u,z))$. For nodes at distance $\ge d$ from a leaf, we have

$$g_d(u,z) = \frac{1}{1-u z e^{a_d(u,z)}} = \frac{1}{1-a_d(u,z)-(u-1)\,l_d(z)}.$$

4 A classification for limit laws of random mapping parameters

We begin with a proposition which applies to functional equations of trees.

Proposition 1 Let $F(u,z,a)$ be a power series in three variables with non-negative coefficients and $F(0,0,0) = 0$. Suppose that the system of equations $\{\tau=F(1,\rho,\tau),\ 1=F'_a(1,\rho,\tau)\}$ has positive solutions $\rho$ and $\tau$ such that $F'_z(1,\rho,\tau)\neq 0$ and $F''_{aa}(1,\rho,\tau)\neq 0$. Then the equation $a=F(u,z,a)$ has a solution of the form

$$a(u,z) = \tau(u,z)-h(u,z)(1-z/\rho(u))^{1/2},$$

with $\tau$, $h$, $\rho$ analytic,

$$\tau(1,\rho(1))=\tau(1)\equiv \tau, \qquad \rho(1)=\rho, \qquad h(1,\rho(1))=\left(\frac{2\rho\,F'_z(1,\rho,\tau)}{F''_{aa}(1,\rho,\tau)}\right)^{1/2}.$$
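As a worked check of the proposition, consider unmarked Cayley trees, that is, the assumption $F(u,z,a) = z e^{a}$ (this choice of $F$ is an illustration, not an example given in the talk). The system and the constant $h$ then evaluate to

$$\tau = \rho e^{\tau},\quad 1 = \rho e^{\tau} \;\Longrightarrow\; \tau = 1,\quad \rho = e^{-1},$$

$$F'_z(1,\rho,\tau) = e^{\tau} = e,\quad F''_{aa}(1,\rho,\tau) = \rho e^{\tau} = 1 \;\Longrightarrow\; h(1,\rho) = \left(\frac{2\,e^{-1}\,e}{1}\right)^{1/2} = \sqrt{2},$$

so that $a(z) = 1 - \sqrt{2}\,(1-ez)^{1/2} + \cdots$ near $z = 1/e$, which is the classical square-root expansion of the tree function.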

We arrive at a general theorem, which appears to be the appropriate tool for studying random mappings. We consider a generating function $g(u,z)=\sum_{n,k}g_{n,k}u^k z^n$ that defines random variables $X_n$ with probability distribution $\Pr(X_n=k)=g_{n,k}/g_n$. We assume a local expansion in the neighbourhood of $u=1$, $z=\rho(u)$, of the form

$$g(u,z)=\frac{1}{1-T(u,z)+h(u,z)(1-z/\rho(u))^{1/2}},$$

where $T$, $h$ and $\rho$ are analytic and $T(1,\rho)=1$.

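Theorem 1 With these hypotheses ($T$, $h$, $\rho$ analytic and $T(1,\rho)=1$),

(1) If $\rho'(u) = 0$ and $T'_u(1,\rho)>0$, then $X_n/\sqrt{n}\to \mathcal{R}(\lambda)$, where $\lambda=\frac{1}{2}\left(h(1,\rho)/T'_u(1,\rho)\right)^2$ and $\mathcal{R}(\lambda)$ is the Rayleigh distribution of density $\lambda x \exp(-\lambda x^2/2)$. Moreover $E(X_n)\sim(\pi n/2\lambda)^{1/2}$ and $\mathrm{Var}(X_n) \sim (2-\pi/2)\,n/\lambda$.
(2) If $\rho'(1)\neq 0$ and $T'_u(1,\rho)=0$, then $(X_n-\mu n)/(\sigma^2 n)^{1/2} \to \mathcal{N}(0,1)$, where $\mu = -\rho'(1)/\rho(1)$ and $\sigma^2=\mu^2+\mu-\rho''(1)/\rho(1)$. Moreover $E(X_n)\sim \mu n$ and $\mathrm{Var}(X_n)\sim \sigma^2 n$.
(3) If $\rho'(1) \neq 0$ and $T'_u(1,\rho)\neq 0$, then $(X_n - \mu n)/(\sigma^2 n)^{1/2}\to \mathcal{N}(0,1)* \mathcal{R}(\sigma^2\lambda)$, where $\mu$ and $\sigma$ are defined as in (2), $\lambda$ is defined as in (1), and the star denotes convolution.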
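The mean and variance constants of the Rayleigh case can be checked numerically against the stated density $\lambda x\exp(-\lambda x^2/2)$. The sketch below is only an illustration: the function names, the truncation point and the midpoint rule are assumptions made here, not part of the talk.

```python
from math import exp, pi, sqrt

def rayleigh_density(x, lam):
    """Density lam * x * exp(-lam * x^2 / 2) on x >= 0."""
    return lam * x * exp(-lam * x * x / 2)

def rayleigh_moments(lam, steps=200_000):
    """Midpoint-rule estimates of (total mass, mean, variance)."""
    upper = 12 / sqrt(lam)        # the tail beyond this point is negligible
    h = upper / steps
    m0 = m1 = m2 = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        w = rayleigh_density(x, lam) * h
        m0 += w
        m1 += x * w
        m2 += x * x * w
    return m0, m1, m2 - m1 * m1

mass, mean, var = rayleigh_moments(2.0)
# Compare with the closed forms sqrt(pi / (2*lam)) and (2 - pi/2) / lam.
```

After rescaling by $\sqrt{n}$, these closed forms are the constants in $E(X_n)\sim(\pi n/2\lambda)^{1/2}$ and $\mathrm{Var}(X_n)\sim(2-\pi/2)\,n/\lambda$.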

Remark 1 If $T(1,\rho) \neq 1$, then $(X_n-\mu n)/(\sigma^2 n)^{1/2}\to \mathcal{N}(0,1)$ (except if $\rho'(u) = 0$ and $T(1,\rho)<1$, in which case $X_n \to dG$, the derivative of a geometric law).

The density and characteristic functions in these different cases are as follows.

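R (Rayleigh): $f_{\mathcal{R}(\lambda)}(x)=\lambda x\,e^{-\lambda x^2/2}$, and $\varphi_{\mathcal{R}}(\theta)= 1+i\theta\,(\pi/2)^{1/2}e^{-\theta^2/2}\left(1-i\,\mathrm{erf}(\theta/\sqrt{2})\right)$.
N (Normal): $f_{\mathcal{N}}(x)=\frac{1}{(2\pi)^{1/2}}e^{-x^2/2}$, and $\varphi_{\mathcal{N}}(\theta)=e^{-\theta^2/2}$.
N * R (Normal conv. Rayleigh): $f_{\mathcal{N}*\mathcal{R}}(x)=\frac{e^{-x^2/4}-e^{-x^2/2}}{(2\pi)^{1/2}}+\frac{x\,e^{-x^2/4}}{2\sqrt{2}}\,\mathrm{erf}(x/2)$, and $\varphi_{\mathcal{N}*\mathcal{R}}(\theta)= \varphi_{\mathcal{N}}(\theta)\times \varphi_{\mathcal{R}}(\theta)$.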

Proof (sketch). Let $g(u,z) = \sum_{n\ge 0}p_n(u)z^n/n!$ with $p_n(1)=g_n$. The proof rests on the convergence of the corresponding characteristic functions to (1) $\varphi_{\mathcal{R}}(\theta)$, (2) $e^{-\theta^2/2}$, (3) $e^{-\theta^2/2}\times \varphi_{\mathcal{R}}(\theta)$. For instance, in case (1), the characteristic function $p_n(e^{i\theta/\sqrt{n}})/g_n$ converges to $\varphi_{\mathcal{R}}(\theta)$. The proofs in the different cases make use of Cauchy inversions along suitable contours of the complex plane [1].

5 Applications

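Below, $X_n\to\mathcal{L}$ is shorthand for the convergence in law of the normalized variable $(X_n-\mu n)/(\sigma^2 n)^{1/2}$ to $\mathcal{L}$.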

Leaves.

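We have $a(u,z)=\tau(u,z)-h(u,z)(1-z/\rho(u))^{1/2}$. The system of Proposition 1 reads $\{\tau=\rho e^{\tau}+(u-1)\rho,\ 1=\rho e^{\tau}\}$, which gives $\{\tau(1,\rho)\equiv \tau(1)=1,\ \rho(1)=\rho\}$. Differentiating with respect to $u$ gives $\{\tau'=(\rho e^{\tau})' + \rho+(u-1)\rho',\ 0=(\rho e^{\tau})'\}$, and these last two equations give $\{\tau'(1)=\rho,\ \rho'(1)=-\rho^2\neq 0\}$. For sequences of trees, this gives $\tau(1,\rho)=1$, $\rho'(1)\neq 0$, $\tau'_u(1,\rho)\neq 0$, and therefore $X_n \to \mathcal{N}*\mathcal{R}$. For functional graphs, with $T(u,z)=\tau(u,z)-(u-1)z$, it gives $T(1,\rho)=1$, $\rho'(1)\neq 0$, $T'_u(1,\rho)=\tau'(1)-\rho=0$, and therefore $X_n \to \mathcal{N}$.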
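The mean constant for leaves in functional graphs can be made explicit: with $\rho'(1)=-\rho^2$, case (2) of Theorem 1 gives $\mu=-\rho'(1)/\rho(1)=\rho$, and $1=\rho e^{\tau}$ with $\tau=1$ gives $\rho=1/e$, so $E(X_n)\sim n/e$. The sketch below compares this with a brute-force count over all mappings on a few points. The function names are hypothetical, and the closed form $n(1-1/n)^n$ is the standard linearity-of-expectation argument rather than a formula from the talk.

```python
from fractions import Fraction
from itertools import product
from math import e

def mean_leaves_bruteforce(n):
    """Average number of points without preimage over all n**n mappings."""
    total = 0
    for f in product(range(n), repeat=n):   # f[i] is the image of i
        total += n - len(set(f))            # points outside the image of f
    return Fraction(total, n ** n)

def mean_leaves_formula(n):
    """Each point is missed by all n images with probability (1 - 1/n)**n."""
    return n * Fraction(n - 1, n) ** n

exact = mean_leaves_bruteforce(4)                   # enumerates 256 mappings
ratio = float(mean_leaves_formula(1000)) / 1000     # close to 1/e
```

Since the image points are exactly the non-leaves, the same computation gives the expected number of image points, $n - E(X_n)\sim(1-1/e)\,n$.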

Nodes with in-degree r.

As before, $a(u,z)=\tau(u,z)-h(u,z)(1-z/\rho(u))^{1/2}$. We have $\{\tau = \rho e^{\tau}+\rho(u-1)\tau^r/r!,\ 1=\rho e^{\tau}+\rho(u-1)\tau^{r-1}/(r-1)!\}$. This gives $\tau(1)=1$ and $\rho(1) = \rho$. By differentiation with respect to $u$, we obtain $\tau'(1) = \rho\,(1/r!-1/(r-1)!)$ and $\rho'(1) = -\rho^2/r!\neq 0$. For sequences of trees, we get $\tau(1,\rho)=1$, $\rho'(1)\neq 0$ and, if $r\ge 2$, $\tau'_u(1,\rho)\neq 0$, which implies $X_n \to \mathcal{N}*\mathcal{R}$. If $r=1$, then $\tau'(1)=0$ and the limit law is normal. For functional graphs, we have $T(u,z)=\tau(u,z)-z(u-1)\left(a^r/r!-a^{r-1}/(r-1)!\right)$. We get $T(1,\rho)=1$, $\rho'(1) \neq 0$, and $T'_u(1,\rho)=0$, which implies that $X_n \to \mathcal{N}$.

Nodes at distance d from a cycle.

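We have $a_d(u,z) =\tau_d(u,z)-c_d(u,z)(1-ez)^{1/2}$, with $\tau_0(u,z)=u\,\gamma(z)$, $\tau_d(u,z)=z e^{\tau_{d-1}(u,z)}$, $c_0(u,z)=u\,\kappa(z)$, $c_d(u,z)=\tau_d(u,z)\,c_{d-1}(u,z)$, where $\gamma$ and $\kappa$ are the analytic parts of the expansion $a(z)=\gamma(z)-\kappa(z)(1-ez)^{1/2}$. This gives $\rho'=0$, $\tau_d(1,\rho)=1$, $(\tau_d)'_u(1,\rho)=1$. Applying this result to $g(u,z)=1/(1-a_d(u,z))$, we get $T(1,\rho)=1$, $T'_u(1,\rho)\neq 0$ and $\rho$ constant, which implies that $X_n\to \mathcal{R}$.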
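For $d=0$ the marked nodes are the cyclic points themselves, and the Rayleigh limit can be compared with an exact finite-$n$ distribution. The sketch below is an illustration under stated assumptions: it uses the classical count of mappings with $k$ cyclic points (via Cayley's forest formula, $k\,n^{n-k-1}$ forests with $k$ given roots), which is not taken from the talk, and it takes $h(1,\rho)=\sqrt{2}$ for Cayley trees and $T'_u(1,\rho)=1$, so that $\lambda=1$ and $E(X_n)\sim\sqrt{\pi n/2}$. The function name is hypothetical.

```python
from math import pi, sqrt

def cyclic_point_distribution(n):
    """P(k cyclic points), k = 0..n, for a uniform random mapping on n points.

    Assumes P(k) = (k/n) * prod_{j<k} (1 - j/n), which follows from choosing
    the k cyclic points, a permutation on them, and a forest rooted at them.
    """
    probs = [0.0] * (n + 1)
    falling = 1.0                      # prod_{j<k} (1 - j/n) for the current k
    for k in range(1, n + 1):
        probs[k] = falling * k / n
        falling *= 1 - k / n
    return probs

n = 10_000
probs = cyclic_point_distribution(n)
mean = sum(k * p for k, p in enumerate(probs))
scaled_mean = mean / sqrt(n)           # to be compared with sqrt(pi / 2)
```

The scaled mean approaches $\sqrt{\pi/2}\approx 1.2533$, the mean of the Rayleigh law $\mathcal{R}(1)$, with a correction of order $1/\sqrt{n}$.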

Nodes with r pre-images in total.

(Same method.) For sequences of Cayley trees, we have $X_n\to \mathcal{N}*\mathcal{R}$, and for functional graphs, $X_n\to \mathcal{N}$.

Nodes at distance ³ d from a leaf.

(Same method.) For the distance to a leaf of their own subtree (sequences of Cayley trees), $X_n\to \mathcal{N}*\mathcal{R}$. In the general case, $X_n\to \mathcal{N}$.

Nodes at distance d from a leaf.

(Same method.) If the path contains no cyclic edge, $X_n\to \mathcal{N}*\mathcal{R}$ (except if $d=1$, in which case $X_n\to \mathcal{N}$). If cyclic edges are allowed, then for $d\le 2$ we have $X_n\to \mathcal{N}$. (Conjecture: this last result is true for all $d$.)

References

[1] Drmota (Michael) and Soria (Michèle). Images and preimages in random mappings. SIAM Journal on Discrete Mathematics, vol. 10, n°2, May 1997, pp. 246-269.
[2] Vitter (Jeffrey Scott) and Flajolet (Philippe). Analysis of algorithms and data structures. In van Leeuwen (J.) (editor), Handbook of Theoretical Computer Science, Chapter 9, pp. 431-524. North Holland, 1990.
