Some Properties of the Cantor Distribution

Helmut Prodinger

Technische Universität Wien, Austria

Algorithms Seminar

December 9, 1996

[summary by Julien Clément]

A properly typeset version of this document is available in postscript and in pdf.

If some fonts do not look right on your screen, this might be fixed by configuring your browser (see the documentation here).

Abstract
The Cantor distribution is defined as a random series
1-J
J
 
å
i³1
XiJi,
where J is a parameter and the Xi are random variables that take the values 0 and 1 with probability 1/2. The moments and order statistics are discussed, as well as a ``Fibonacci'' variation. Connections to certain trees and splitting processes are also mentioned.

1   Cantor distribution

1.1   Random series

The Cantor distribution with parameter J (0<J£1/2) was introduced in [5] by the random series
X=
J
J
 
å
i ³ 1
Xi Ji,
where the Xi are independent with the distribution Pr[Xi=0]=Pr[Xi=1]=1/2, and J=1-J. The name stems from the special case J=1/3, since then this process gives exactly those numbers from the interval [0,1] that have a ternary expansion solely consisting of the digits 0 and 2. We might alternatively consider an infinite (random) word w1w2··· over the alphabet {0,1} and a map value, defined by
value(w1 w2 ···)=
J
J
 
å
i ³ 1
wi Ji.

1.2   Moments of the distribution

We abbreviate an=E[Xn]. The aim is to solve the recursion formula (from [5])
an=
1
2(1-Jn)
n-1
å
k=0
æ
è
n
k
ö
ø
J
n-k Jk ak,    a0=1.
Let us introduce the exponential generating function A(z)=ån ³ 0 an zn/n!. The functional equation involving A(z), once solved by iteration, gives
A(z)=
 
Õ
k ³ 0
1+e
J
Jk z
 
2
.
In order to derive an asymptotic equivalent of an, the Poisson generating function B(z)=e-z A(z) has to be considered. Using ``Mellin'' techniques to derive an asymptotic expansion of log B(z) when z tends to infinity and a ``de-poissonization'' argument which suggests the approximation an ~ B(n), one gets
E[Xn]=an= F(log
 
1/J
n)n
-log
 
1/J
2
 
æ
ç
ç
è
1+ O æ
ç
ç
è
1
n
ö
÷
÷
ø
ö
÷
÷
ø
.
The function F(x) is periodic of period 1 and has known Fourier coefficients. The mean of F(x) is for instance
-
1
2log J
ó
õ
¥


0
 
Õ
k ³ 1
1+e
-
J
Jk z
 
2
e
-
J
x
 
x
log
 
1/J
2 -1
 
dx.

1.3   Order statistics

Let us consider n random independent variables Y1,...,Yn from a Cantor distribution. The average value E[min(Y1,...,Yn)] of the smallest value among them is denoted by an. The coefficients an obey the following recursion
(2n-2J)an=
J
+ J
n-1
å
k=1
æ
è
n
k
ö
ø
J ak.
Considering now not exactly the Poisson generating function A(z)=åk³ 0an zn/n! but rather
^
A
 
(z)=
1
ez -1
A(z)=
 
å
n ³ 0
^
a
 
n
zn
n!
,
a simpler equation can be obtained. Indeed, one has
^
A
 
(2z)= J
^
A
 
(z)+
J
ez+1
.
The coefficients a^n can be extracted directly from this equation (equating coefficients of zn/n! on both sides). Going back to the original coefficients an, we have the explicit solution
an=-
J
n-1
å
k=0
æ
è
n
k
ö
ø
Bk+1
k+1
2k+1-1
2k-J
,
where Bn denotes a Bernoulli number. An approach based on Rice's method finally gives an asymptotic equivalent of an
an ~ n
log2 J
 
2 J -1
J log 2
( G(-log2 J) z(-log2 J) + d(log2 n) ) ,
where z(s), G(s) and d(s) denote respectively the Riemann's zeta function, the gamma function and a periodic function with period 1 and a very small amplitude (provided J is not too close to 0).

2   Cantor-Fibonacci distribution

2.1   Fibonacci restriction

The Cantor distribution might be viewed as a mapping value over a set of random words over a binary alphabet. We might also think about restricted words, according to the Fibonacci restriction, that two adjacent letters `1' are not allowed. The set of (finite) Fibonacci words F is given by
F={0,01}* {e+1}.
In the original setting (Cantor distribution) probabilities are simply introduced by saying that each letter of {0,1} can appear with probability 1/2. Here the situation is more complicated. We say that each word of Fibonacci of length m is equally likely. There are Fm+2 such words, with Fm+2 denoting the (m+2)th Fibonacci number. As an example, consider the classical Cantor case with J=1/3 and m=3. Then the values
value(000)=0,    value(001)=
2
27
,    value(010)=
2
9
,    value(100)=
2
3
,    value(101)=
20
27
appear, each with probability 1/5. The generating function F(z) of Fibonacci words, according to their lengths is easily derived from the definition of F above,
F(z)=
1+z
1-z-z2
=
 
å
m³ 0
Fm+2zm.
Note that
Fn=
1
(5)1/2
( an-bn )     with     a=
1+(5)1/2
2
  and   b=
1-(5)1/2
2
.

2.2   Moments of the Cantor-Fibonacci distribution

Let us consider the generating functions
Gn(z):=
 
å
w Î F
( value (w) )
n
 
 
z|w|,
where |w| denotes the length of the Fibonacci word w. The quantity
[zm]Gn(z)
[zm]F(z)
is the nth moment, when considering words of length m. Then we let m tend to infinity to get a limit called Mn (note that taking limits wasn't necessary for the independent original case). The recursion for value, when restricted to Fibonacci words, is
value(0w) =J · value(w)
value(10w)
=
J
+J2· value(w).
These formulae translate almost directly to generating functions according to the recursive definition F=e + 1 +{0,10} F. Thus it gives an explicit recursion formula for the functions Gn(z)
Gn(z)=
1
1-
J
n z-J2n z2
é
ê
ê
ë
J
n z+ z2
n-1
å
i=0
æ
è
n
i
ö
ø
J
n-i J2iGi(z) ù
ú
ú
û
.
Since we only consider the limit for m ® ¥, we can get the asymptotic behaviour noting that both F(z) and Gn(z) have the same dominant singularity at z=1/a and also that it is a simple pole. Consequently, we have (due to a ``pole cancellation'')
Mn=
 
lim
m ® ¥
[zm] Gn(z)
[zm]F(z)
=
 
lim
z ® 1/a
Gn(z)
F(z)
.
Therefore we have the following theorem
Theorem 1   The moments of the Cantor-Fibonacci distribution fulfill the following recursion: M0=0 and for n ³ 1
Mn=
1
a2-a Jn-J2n
n
å
i=1
æ
è
n
i
ö
ø
J
n-i J2iMi.

2.3   The asymptotic behaviour of the moments

A rough estimate shows that Mn » ln. We might infer that l=J+lJ2, so that l=1/1+J. It is not rigourous but we can set
mn:=Mn · (1+J)n
anyway and show that this sequence has nicer properties. As before the recurrence on the coefficients mn and then the exponential generating function m(z)=ån mn zn/n! need to be considered. Finally the Poisson transformed function m^ (z)=e-z m(z) obeys the functional equation
^
m
 
(z)=
e
-
J
z
 
a
^
m
 
(J z)+
1
a2
^
m
 
(J2z).
Because mn ~ m^(n), the next step considers the behaviour of m^(z) for z ® ¥. Using the Mellin transform (and the Mellin inversion formula), we have the following theorem
Theorem 2   The nth moment Mn of the Cantor-Fibonacci distribution has for n ® ¥ the following asymptotic behaviour
Mn= æ
è
1+
J
ö
ø
-n

 
F(-log
 
J
n) n
log
 
J
a
 
æ
ç
ç
è
1+O æ
ç
ç
è
1
n
ö
÷
÷
ø
ö
÷
÷
ø
,
where F(x) is a periodic function with period 1 and known Fourier coefficients. The mean (zeroth Fourier coefficient) is given by
-
1
log J
ó
õ
¥


0
e
-
J
z
 
a
^
m
 
(J z) z
-log
 
J
a -1
 
dz.
Note that here, e-Jz/am^(J z) is merely considered as an auxiliary function. This integral can be computed numerically by replacing m^(J z) by the first few values of its Taylor expansion, which can be obtained through the recursion formula on the coefficient mn. As an example, the classical case J=1/3 gives (apart from small fluctuations),
Mn ~ .6160498 n-.4380178  0.75n.
The fact that in an asymptotic formula the generating function itself, evaluated at a certain point, appears is not at all uncommon in combinatorial analysis.

References

[1]
Flajolet (Philippe), Gourdon (Xavier), and Dumas (Philippe). -- Mellin transforms and asymptotics: harmonic sums. Theoretical Computer Science, vol. 144, n°1-2, 1995, pp. 3--58. -- Special volume on mathematical analysis of algorithms.

[2]
Flajolet (Philippe) and Sedgewick (Robert). -- Mellin transforms and asymptotics: finite differences and Rice's integrals. Theoretical Computer Science, vol. 144, n°1-2, 1995, pp. 101--124. -- Special volume on mathematical analysis of algorithms.

[3]
Grabner (P. J.) and Prodinger (H.). -- Asymptotic analysis of the moments of the Cantor distribution. Statistics & Probability Letters, vol. 26, n°3, 1996, pp. 243--248.

[4]
Knopfmacher (Arnold) and Prodinger (Helmut). -- Explicit and asymptotic formulae for the expected values of the order statistics of the Cantor distribution. Statistics & Probability Letters, vol. 27, n°2, 1996, pp. 189--194.

[5]
Lad (F. R.) and Taylor (W. F. C.). -- The moments of the Cantor distribution. Statistics & Probability Letters, vol. 13, n°4, 1992, pp. 307--310.

[6]
Prodinger (H.). -- The Cantor-Fibonacci distribution. Applications of Fibonacci numbers, 1998. -- To appear.

This document was translated from LATEX by HEVEA.