Dynamical Systems and Average-Case Analysis of General Tries

Dynamical Systems and Average-Case Analysis of General Tries

Brigitte Vallée

GREYC, Université de Caen, France

Algorithms Seminar

June 9, 1997

[summary by Julien Clément]

A properly typeset version of this document is available in postscript and in pdf.

If some fonts do not look right on your screen, this might be fixed by configuring your browser (see the documentation here).

Abstract

The three major parameters of a trie (sometimes called digital tree), number of nodes, path length, and height, are analyzed precisely in a general context where words are emitted by a source associated to a dynamical system. The results can all be stated in terms of two intrinsic characteristics of the source: the entropy and the probability of letter coincidence. These characteristics themselves are linked in a natural way to spectral properties of a Ruelle operator associated to the dynamical system.

1 Probabilistic dynamical sources

1.1 Definitions

A dynamical source, in the context of information theory, is a mechanism which produces infinite words over an alphabet M. Such a system is defined by four elements: (i) an alphabet M included in N, (ii) a quasi partition of I=]0,1[ with intervals I_m, m Î M, (iii) a mapping s: I ® M which is constant over each I_m and equal to m and finally (iv) a mapping T: I ® I which satisfies two properties: the restriction of T to I_m is a real analytic bijection from I_m to I; the mapping T is expansive, i.e. |T'(x)|>1 on I. The words emitted by the source are produced by iterating T and coded thanks to s. The word M(x) of M^¥ (an infinite sequence of symbols), where xÎ I, is formed with the symbols

M(x):=(s(x),s(T(x)),s(T²(x)),...,s(T^k(x)),...).

Each letter of the alphabet is associated to a distinct branch of T or equivalently to a distinct inverse branch of T denoted by h_m. The bijection h_m: I ® I_m coincides with the inverse of the restriction of T to I_m.

1.2 Examples

Figure 1: Graphical representation of the mappings T and s for a source based on the continued fraction expansion (left) and for a memoryless source based on the 5-ary expansion of numbers (right).

A lot of probabilistic dynamical sources can be described in such a framework, including all memoryless sources where letters of the alphabet can be emitted with probabilities {p_i} independently of previous letters. This gives a mapping T where branches are affine. In particular this encompasses the b-ary expansion model of numbers. Markov sources take into account a finite past for producing words and thus generalize memoryless sources. Finally, in a model where the source is based upon the continued fraction expansion of numbers, the alphabet is N and the probability for a character to be emitted depends on all the previous history.

1.3 Fundamental intervals, entropy, coincidence probability

Numbers which share the same prefix expansion m₁,...,m_k form an interval called fundamental interval of depth k which is exactly, with the preceding notations,

m₁,...,m_k

m₁

° ... ° h

m_k

( I).

In a general context, the interval I is endowed with a continuous density f (so that F denotes the associated distribution). Then the word M(x) is produced according to the expansion process (using T and s). In this context, the measure u_h of a fundamental interval I_h associated to h=h_m_₁ ° ... ° h_m_{_k} is

u_h:= |F(h(0))-F(h(1))|,

and plays a crucial role since it is the probability that a word begins with a certain prefix. During the analysis, two quantities relative to the source appear naturally. The entropy h( S,F) and the coincidence probability c( S,F) are defined as the limits

h( S

,F):=

lim

k ® ¥

-1

|h|=k

u_h log u_h, c( S

,F):=

lim

k ® ¥

é
ê
ê
ë

|h|=k

u_h log u_h

ù
ú
ú
û

^1/k.

It is interesting to note that these limits exist and are independent of the distribution F.

2 Tries associated to a general source

Let M Î N be a set of elements called digits, and M^¥ the set of all infinite sequences built over M. For any word produced by the dynamical source M(x)=(s(x),s(Tx),s(T²x),...) (with x Î I), the head and tail functions are defined by

head(M(x))=s(x), tail(M(x))=M(Tx).

Any finite set of infinite words produced by the same source can be written as M(X)={M(x) |x Î X}, and one associates to X a trie, Trie(X), defined by the following recursive rules

(R1) If X={x} has cardinality equal to 1, then Trie(X) consists of a single leaf node that contains M(x).
(R2) If X has cardinality at least 2, then Trie(X) is an internal node represented generically by `o' to which are attached l subtrees, where l=card{s(x) |x Î X} is the number of different head symbols in M(X). Let b₁<...< b_l be these head symbols; Trie(X) is defined by

Trie(X)=á o,Trie(X₁),...,Trie (X

l
)ñ, where X_j={Tx| s(x)=b_j, x Î X}.
The trie Trie(X_j) collects all the suffixes of words that begin with b_j.

2.1 Parameters

The main parameters of a trie are size (number of internal nodes), height and external length path, which is the sum of all links from the root to each leaf.

The model considers here infinite strings emitted independently by the same dynamical source. Rather than considering a fixed number n of strings, a Poisson model of rate n is used, where the number of strings N is also a random variable which strongly concentrates around n. The strong property of independence of this particular model makes the analysis easier and gives access to the expectations of parameters. The expectations of height, size and external path length under a Poisson model of rate n and distribution F over I are respectively

D(n)=

[1-

|h|=k

(1+nu_h)e

-n u_h

], S(n)=

[1-(1+n u_h) e

-n u_h

], P(n)=

n u_h [1-e

-n u_h

2.2 Asymptotics

These quantities are easily recognized as harmonic sums of the form F(x)=å_{k Î K} l_k f(µ_k x) (excepted for the height which needs a small calculation step before). The best tool to analyze the asymptotics of such harmonic sums is the Mellin transform, which leads to locating poles of the associated Dirichlet series L(s)=å_{k Î K} l_k µ_k^s. Here, the key quantity for the analysis of size and path length is the series of fundamental intervals,

L(F,s)=

u_h^s=

|F(h(0))-F(h(1))|^s,

considered for complex values of s. For a general source, it is not always easy (or possible) to locate precisely the singularities. In this case a Tauberian theorem, under some constraints, can be used to extract the asymptotic expansion.

3 Generalized Ruelle operators

3.1 Presentation

The generalized Ruelle operators are derived from the original Ruelle operators, and involve secants of the inverse branches H(u,v):=|(h(u)-h(v))/(u-v)|. The generalized Ruelle operators are defined by

_s[F](u,v)=

|h|=1

(u,v)^s F(h(u),h(s)),

where H^~ is the analytic extension of H and s is a complex parameter. Here the sum is taken over branches of depth 1. If we define the secant L of the distribution F,

L(x,y)=

½
½
½
½

F(x)-F(y)

x-y

½
½
½
½

then the Dirichlet series can be expressed as

L(F,s)=

u_h^s=

|F(h(0))-F(h(1))|^s= (I-G_s)^-1[L^s](0,1).

3.2 Singularities of the quasi inverse (I-G_s)^-1

Singularities of (I-G_s)^-1 are of special interest because these are also singularities of L(F,s). These singularities arise for values of s where G_s has an eigenvalue equal to 1. In particular, there is always a pole at s=1 (easy to prove from the previous form of L(F,s)). One can derive from strong properties of the operator G_s that there are three different cases, called periodic, quasi-periodic and aperiodic, depending on the precise nature of the eigenvalues of G_s on the line Â(s)=1. The operator G_s has also special properties at s=1 and s=2 since the entropy of the source is h( S)=-l'(1) (the derivative of the dominant eigenvalue at s=1), while the coincidence probability is c( S)=l(2).

4 Average-case analysis of general tries

4.1 Analysis of height

The analysis of height is based on estimates of the individual probabilities p_k(n)=Õ_|h|=k (1+n u_h) e^{-n u}_^h followed by a Mellin analysis. This leads to the asymptotic expansion

D(n)=

| log c( S)|

log n +P_F(log n)+ g+A_F+o(1)

4.2 Analysis of size and path length

The operator (I-G_s)^-1 has a simple pole at s=1, and thus gives the main term of the asymptotic expansion. For a general source, a Tauberian theorem can be applied to estimate the contribution of others poles. Finally one has

P(n)=

h( S)

n log n + o(n log n), S(n)=

h( S)

n + o(n).

5 Some important particular cases

5.1 Bernoulli sources

The Bernoulli source considers a finite alphabet M={1,..., r} with probability of emission {p₁,...,p_r} (with p₁+...+p_r=1). In this case the entropy and the coincidence probability have classical expressions

H=-

i=1

p_i log

p_i, c=

i=1

p_i².

5.2 Continued fraction source

The continued fraction expansion of numbers can be considered as a dynamical source over the infinite alphabet M=N. The operator of Ruelle is then called the Ruelle-Mayer operator and is defined by

_s[f](z)=

m ³ 1

(m+z)^s

m+z

The entropy of the source is linked to the so-called Lévy's constant which plays a central rôle in the the analysis of the Euclidean algorithm. The coincidence probability is a constant which intervenes in two-dimensional generalizations of the Euclidean algorithm. These constants are

l'(1)=

p²

6 log 2

, l(2) ~ 0.1994.

References

[1]: Brigitte Vallée. -- Dynamical systems and average-case analysis of general tries. -- Les cahiers du GREYC, 1997.
[2]: Flajolet (Philippe) and Vallée (Brigitte). -- Continued Fraction Algorithms, Functional Operators, and Structure Constants. -- Research Report n°2931, Institut National de Recherche en Informatique et en Automatique, July 1996. 33 pages. Invited lecture at the 7th Fibonacci Conference, Graz, July 1996; to appear in Theoretical Computer Science.
[3]: Hervé Daudé (Philippe Flajolet) and Vallée (Brigitte). -- An Average-case Analysis of the Gaussian Algorithm for Lattice Reduction. -- Research Report n°2798, Institut National de Recherche en Informatique et en Automatique, February 1996. To appear in Combinatorics, Probability and Computing, 1997.
[4]: Jacquet (Philippe) and Szpankowski (Wojciech). -- Analysis of digital tries whith Markovian dependency. IEEE Transactions on Information Theory, vol. 37, n°5, September 1991, pp. 1470--1475.
[5]: Vallée (Brigitte). -- Opérateurs de Ruelle-Mayer généralisés et analyse en moyenne des algorithmes d'Euclide et de Gauss. Acta Arithmetica, vol. LXXXI, n°2, 1997, pp. 101--144.

This document was translated from L^AT_EX by H^EV^EA.

1 Probabilistic dynamical sources

1.1 Definitions

1.2 Examples

1.3 Fundamental intervals, entropy, coincidence probability

2 Tries associated to a general source

2.1 Parameters

2.2 Asymptotics

3 Generalized Ruelle operators

3.1 Presentation

3.2 Singularities of the quasi inverse (I-Gs)-1

4 Average-case analysis of general tries

4.1 Analysis of height

4.2 Analysis of size and path length

5 Some important particular cases

5.1 Bernoulli sources

5.2 Continued fraction source

References

3.2 Singularities of the quasi inverse (I-G_s)^-1