
These pages are edited by Philippe.Flajolet@inria.fr.

**Issue Dated August 2001 (updated July 2003)**

Tatihou poster: original design by Mariola Szpankowska. Brigitte Vallée and others at Tatihou.

The *Seventh Seminar on Analysis of Algorithms* was held
in Tatihou, a truly amazing place. Tatihou is a small island about half a
mile off the coast of Normandy in the (English?) Channel.
The isle is small, but it contains a fort, built in the
seventeenth century by Vauban (a famous French military architect),
that controls the bay. There is also a bird
reserve, a maritime museum, and, of course, a sensational
conference center operated by the local county council.
Brigitte with her team, Ali Akhavi, Jérémie Bourdon,
Julien Clément, was responsible for making this venture possible.

- The main page of the meeting (archive).
- The programme (archive).
- List of abstracts of the 31 talks. Here is the **Booklet of Abstracts in PDF** [alt. PS format].
- List of participants.
- Photo Album prepared by Virginie Collette [still under construction].

Centre's portal; Abelard [left] with Julien Clément, Brigitte Vallée, Ali Akhavi, Julie Vallé, and Benoit Daireaux.

One of the buildings; the Vauban Tower.

There were eight long lectures, of an hour and a quarter each, which provided ample time for speakers to develop their ideas. These were given at the start of each half-day by Hwang, Chauvin, Viola, Szpankowski, Devroye, Pittel, Flajolet, and Schaeffer. Here's a brief summary.

- Hsien Kuei Hwang investigated the method of moments in relation
to the analysis of recursive algorithms of the divide-and-conquer
type. The method of moments is familiar from probability theory:
*for a limit random variable X characterized by its moments (this is the case for most usual distributions, including the Gaussian law), convergence of the moments of random variables X_n to those of X implies convergence in distribution of X_n to X.*

In analysis of algorithms, the X_n are a normalized form (usually centred around the mean, possibly rescaled by a standard deviation) of the cost of an algorithm over inputs of size n. Hsien Kuei examines various examples related to Quicksort, Quickselect, Mergesort, digital trees, leader election (in a distributed environment), and so on. In all these cases, there is a recurrence satisfied by the probability generating function (or, equivalently, the moment generating function) of the X_n. One gets the first two moments by the usual devices. Then, it is often possible to guess--then prove--a form of the limit distribution, as it should approximately satisfy the basic recurrence on probability (or moment) generating functions. Hsien Kuei gives many examples of this approach. The speed of convergence then derives from the Berry-Esseen inequalities (reminder: Berry-Esseen translates distances between characteristic functions into distances between distribution functions). There is here a method of considerable generality which in a sense parallels the recursive method developed by Rösler, Rüschendorf, and Neininger in similar contexts.

Many papers by Hsien Kuei nicely illustrating this method are to be found at his web site; see especially his recent publications.
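The first step of this programme, getting the first two moments from the distributional recurrence, is easy to illustrate concretely. The sketch below (Python; the helper names `qs_dist` and `moments` are mine, not Hwang's) computes the exact distribution of Quicksort comparison counts from the divide-and-conquer recurrence C_n = (n-1) + C_{K-1} + C_{n-K}, with pivot rank K uniform on {1, ..., n}, then extracts mean and variance.

```python
# Exact distribution of Quicksort comparison counts C_n from the
# divide-and-conquer recurrence C_n = (n-1) + C_{K-1} + C_{n-K},
# pivot rank K uniform on {1..n}.  Illustrative sketch only.

def qs_dist(n, memo={0: {0: 1.0}, 1: {0: 1.0}}):
    """Return {cost: probability} for sorting n distinct keys.
    (The mutable-default memo is a deliberate caching trick.)"""
    if n in memo:
        return memo[n]
    dist = {}
    for k in range(1, n + 1):            # pivot rank, probability 1/n
        left, right = qs_dist(k - 1), qs_dist(n - k)
        for a, pa in left.items():
            for b, pb in right.items():
                c = (n - 1) + a + b
                dist[c] = dist.get(c, 0.0) + pa * pb / n
    memo[n] = dist
    return dist

def moments(dist):
    """First two moments 'by the usual devices': mean and variance."""
    m1 = sum(c * p for c, p in dist.items())
    m2 = sum(c * c * p for c, p in dist.items())
    return m1, m2 - m1 * m1

for n in (5, 10, 15):
    mean, var = moments(qs_dist(n))
    print(n, round(mean, 3), round(var, 3))
```

For n = 5 the mean comes out as 7.4, in agreement with the classical formula 2(n+1)H_n - 4n for expected Quicksort comparisons.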

As related reading, I also recommend the study by Boris Pittel, ``Normal convergence problem? Two moments and a recurrence may be the clues'', *Ann. Appl. Probab.* 9 (1999), 1260-1302, corresponding to his talk at AofA'98 in Princeton.

- Brigitte Chauvin helped many of us discover the power of
martingale arguments in the analysis of algorithms and data
structures. Brigitte started her lecture with a comprehensive review
tuned to the needs of analysis of algorithms.
She then proceeded to develop a
martingale argument due to herself and her student Jean Jabbour
which gives truly surprising results regarding the
probable shape of binary search trees. Take a random binary search
tree and associate to it its level polynomial W_n(z), defined by the
fact that the coefficient of z^k in W_n(z)
is the number of external nodes at
level k. This is a random polynomial whose expectation is well known
to equal the product of (k - 1 + 2z)/k for k = 1, ..., n. The big news is that
the normalized sequence of (random) polynomials W_n(z)/E[W_n(z)]
forms a martingale. The consequences are cornucopious: this gives
quantitative properties that hold almost surely when a binary search tree
evolves under successive insertions from the empty tree till the end of
time--a point of view that differs somewhat from what is done usually
in AofA. It also shows that the profile of the tree is in probability
almost Gaussian. This property was known in the expected value sense since
the 1980's (Louchard). Here, the strong form of this property says the
following: simulate a random BST of large size and build the
histogram of depths of external nodes; you will only find bell-shaped
histograms. This property had been conjectured by Sedgewick and myself
when we did some empirical simulations, and much to our astonishment
invariably found bell-shaped images.
Another surprising consequence is that the width of a random BST is in probability of order n/sqrt(log n). It's hard to imagine such a result being accessible by analytic methods alone!
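The level polynomial and its expectation are easy to experiment with. The sketch below (Python; the names `external_profile`, `W`, and `expected_W` are my own) builds random BSTs by inserting random keys, evaluates the random polynomial W_n(z) at a point, and compares the Monte-Carlo average against the product formula for E[W_n(z)].

```python
import random

# Monte-Carlo check of the BST level polynomial
#   W_n(z) = sum_k u_{n,k} z^k   (u_{n,k} = external nodes at level k)
# against the known expectation E[W_n(z)] = prod_{k=1}^{n} (k-1+2z)/k.

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def external_profile(root, depth=0, prof=None):
    """Count external (empty) positions by level."""
    if prof is None:
        prof = {}
    if root is None:
        prof[depth] = prof.get(depth, 0) + 1
        return prof
    external_profile(root.left, depth + 1, prof)
    external_profile(root.right, depth + 1, prof)
    return prof

def W(prof, z):
    return sum(u * z**k for k, u in prof.items())

def expected_W(n, z):
    p = 1.0
    for k in range(1, n + 1):
        p *= (k - 1 + 2 * z) / k
    return p

n, trials, z = 50, 2000, 0.7
random.seed(1)
avg = 0.0
for _ in range(trials):
    root = None
    for key in random.sample(range(10**6), n):
        root = insert(root, key)
    avg += W(external_profile(root), z) / trials
print(avg, expected_W(n, z))   # the two values should be close
```

At z = 1 the polynomial simply counts the n+1 external nodes, which gives a deterministic sanity check of the product formula.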

See the tutorial by Brigitte Chauvin ``Discrete martingales and applications to analysis of algorithms'' [for those who read French].

See the joint paper with Drmota and Jabbour for martingales associated to BSTs.

- Alfredo Viola examined Rabin's irreducibility test for polynomials.
We are talking here of algorithms that operate on polynomials, which are
basic objects of symbolic manipulation systems, but also of
cryptography and of the theory of error-correcting codes.
Following a tradition started by Berlekamp, Knuth (in
volume 2 of
*TAOCP*) showed that we can subject these algorithms to the usual questions: What is the expected complexity? What is the distribution of costs? The subject is currently pursued by Daniel Panario (Carleton University, Ottawa) and Alfredo Viola (Montevideo), and we are beginning to know quite a bit about these algorithms. For instance, the usual polynomial factorization chain (based on elimination of repeated factors, distinct-degree factorization, and Cantor-Zassenhaus equal-degree factorization) has been completely analysed in *Journal of Algorithms* 40(1), 2001, pp. 37-81. The interest of these studies is enhanced by the fact that the algorithms are often probabilistic in nature, so that the randomness assumptions apply to perfection.

In his talk, Alfredo reviewed Rabin's irreducibility test, which answers the question: When is a polynomial irreducible (i.e., not factorable, a sort of ``prime'')? He presented results from the paper by Daniel Panario, Boris Pittel, Bruce Richmond, Alfredo Viola, ``Analysis of Rabin's irreducibility test for polynomials over finite fields'', *Random Structures and Algorithms* 19(3-4): 525-551 (2001), which improves on the earlier *LATIN-1998* conference version.

Alfredo, where's your home page with your publications?
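For readers who have not met it, Rabin's test itself fits in a few lines. Here is a sketch over GF(2) (Python; polynomials encoded as integer bitmasks, all function names mine): f of degree n is irreducible iff f divides x^(2^n) - x and gcd(f, x^(2^(n/p)) - x) = 1 for every prime p dividing n.

```python
# Rabin's irreducibility test over GF(2).  A polynomial is a Python
# int whose bit i is the coefficient of x^i.  Illustrative sketch.

def deg(p):
    return p.bit_length() - 1

def pmod(a, b):
    """Remainder of a modulo b in GF(2)[x]."""
    while deg(a) >= deg(b) >= 0:
        a ^= b << (deg(a) - deg(b))
    return a

def pmulmod(a, b, f):
    """Product a*b reduced mod f (shift-and-xor multiplication)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a = pmod(a << 1, f)
    return pmod(r, f)

def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a

def x_pow_2k_mod(k, f):
    """x^(2^k) mod f, by k successive squarings."""
    r = pmod(0b10, f)                  # the polynomial x
    for _ in range(k):
        r = pmulmod(r, r, f)
    return r

def prime_divisors(n):
    ps, d = [], 2
    while d * d <= n:
        if n % d == 0:
            ps.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.append(n)
    return ps

def rabin_irreducible(f):
    n = deg(f)
    if n <= 0:
        return False
    # f must divide x^(2^n) - x  (minus is xor over GF(2))
    if x_pow_2k_mod(n, f) != pmod(0b10, f):
        return False
    # and gcd(f, x^(2^(n/p)) - x) must be trivial for each prime p | n
    for p in prime_divisors(n):
        if deg(pgcd(f, x_pow_2k_mod(n // p, f) ^ pmod(0b10, f))) > 0:
            return False
    return True
```

For example, `rabin_irreducible(0b10011)` recognizes x^4 + x + 1 as irreducible, and scanning all degree-4 polynomials finds exactly the three irreducible quartics over GF(2).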

- Wojtek Szpankowski
presented a survey talk on
*New and Old Problems in Pattern Occurrences*. Let w and T be sequences over a finite alphabet. We call w the pattern and T the text. The basic questions are: How many times does w occur in T? How long does one have to wait until w occurs in T? In the string matching problem, the pattern is considered to be a string of consecutive symbols (a ``factor''), while in the subsequence matching problem the pattern is a subsequence. We may ask that the pattern occur either exactly or approximately. In all cases, we assume that the pattern is given and the text is random, generated among all possible strings by some source model of sorts, the simplest cases being the uniform model, the Bernoulli, and the Markov sources.

Wojtek showed us how to compute moments, limiting distributions, and large deviations for the number of pattern occurrences and the waiting time. Throughout he uses combinatorial (e.g., symbolic) and analytic methods, based on generating functions (GFs) and singularity analysis. For this dense survey, we refer to Wojtek's recent book and to his web page. I note that the limiting distributions are usually normal, a fact that may arise from moment methods (based on GFs) or from Quasi-Powers approximations associated to multivariate generating functions that are rational functions. This is an area started by Guibas and Odlyzko (who developed the correlation polynomial) in the early 1980's, then pursued by Mireille Régnier, Wojtek, Hosam Mahmoud, and several others in the ensuing decade. A recent unification of seemingly very different source models is due to Brigitte Vallée, our gracious hostess at Tatihou; see her model of ``dynamical sources''.

A zillion publications are available from Wojtek's web pages.
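The Guibas-Odlyzko correlation polynomial mentioned above is concrete enough to compute by hand. As a small sketch (Python; function names are mine), here is the classical consequence for waiting times: over a uniform q-ary source, the expected waiting time for the first occurrence of a pattern w equals the sum of q^k over the overlap lengths k in the correlation set of w.

```python
# Expected waiting time for the first occurrence of a pattern w in an
# i.i.d. uniform random text, via the Guibas-Odlyzko autocorrelation
# (correlation) set.  Illustrative sketch.

def correlation_set(w):
    """Overlap lengths k such that the length-k prefix of w equals
    the length-k suffix of w (the full length len(w) is included)."""
    return [k for k in range(1, len(w) + 1) if w[:k] == w[-k:]]

def expected_waiting_time(w, alphabet_size=2):
    # classical formula: E[T_w] = sum over the correlation set of q^k
    return sum(alphabet_size**k for k in correlation_set(w))

print(expected_waiting_time("HH"))   # 6
print(expected_waiting_time("HT"))   # 4
print(expected_waiting_time("HHH"))  # 14
```

The self-overlapping pattern HH takes longer to appear on average (6 coin flips) than the non-overlapping HT (4 flips), the standard first surprise of this theory.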

Go get Wojtek's book *Average Case Analysis of Algorithms on Sequences*, Wiley, 2001.

Brigitte's Research Pages, including papers on dynamical sources, start from here.

- Luc Devroye surveyed the ubiquitous
*digital tree*, also known as *trie*. This wonderful structure can be approached by exact analytic models (ordinary, exponential, or Poisson generating functions), Mellin transforms, and saddle point methods, including Jacquet and Szpankowski's analytic depoissonization, whenever the source model is simple enough (typically, Bernoulli or Markov). It can also be approached by probabilistic methods, and Luc has been a pioneer in this area. Luc started by recalling a few basic facts, for a binary alphabet and a trie built on i.i.d. equally probable letters, as regards the depth of a random node, the height, and the fill-up level.

He then turned to the *Level Compressed* trie [LC trie] invented a few years ago by Nilsson and Andersson: these two authors created a surprise by showing the efficiency of this compressed structure for managing tables of IP addresses in routers. The idea is to extract the maximum perfect tree present at the root of a regular trie and encode it by an array; then proceed recursively on subtries. When you do this, the resulting structure is such that the depth (= the length of the access path) is with high probability close to log* n, where log* n is the number of times the (binary) logarithm must be applied to n in order to obtain a number at most 1; e.g., log* 65536 = 4. The height is smaller for a Patricia variety of LC tries [one-way branching is avoided] than for a standard LC trie, due to a few long filaments at the bottom of regular tries. For a source of i.i.d. symbols with unequal letter probabilities, the depth goes up to order log log n. On the other hand, and this is a famous result of Luc in the late 1980's, the depth remains close to log2 n for tries built on binary representations of random variables over (0,1) as soon as these admit a probability density.

Luc then went on to discuss various other models of data on which tries may be built: this includes continued fraction digits, a subject researched extensively by Brigitte Vallée and Julien Clément in recent years.
I have unfortunately scanty notes on the rest: Azuma's and Talagrand's inequalities were flying around; Luc also mentioned Werner Schachinger's recent results on Gaussian-ness of trie parameters, and discussed briefly problems arising from partial match queries.
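The contrast between the trie depth (of order log2 n) and the LC-trie access path (of order log* n) is striking enough to be worth a small experiment. The sketch below (Python; `log_star` and `leaf_depth` are my own names, and `leaf_depth` uses the fact that a key's leaf depth in a binary trie is one more than its longest common prefix with any other key, attained at a lexicographic neighbour) compares the two growth rates.

```python
import math
import random

def log_star(n):
    """Iterated binary logarithm: how many times log2 must be
    applied to n before the value drops to 1 or below."""
    count, x = 0, float(n)
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

def leaf_depth(keys):
    """Depth of each key's leaf in the binary trie over `keys`:
    1 + the longest common prefix with any other key (keys distinct)."""
    ks = sorted(keys)
    def lcp(a, b):
        i = 0
        while i < min(len(a), len(b)) and a[i] == b[i]:
            i += 1
        return i
    depths = []
    for i, k in enumerate(ks):
        d = 0
        if i > 0:
            d = max(d, lcp(k, ks[i - 1]))
        if i + 1 < len(ks):
            d = max(d, lcp(k, ks[i + 1]))
        depths.append(d + 1)
    return depths

random.seed(0)
n = 4096
keys = {"".join(random.choice("01") for _ in range(64)) for _ in range(n)}
avg = sum(leaf_depth(list(keys))) / len(keys)
print(avg, math.log2(n), log_star(n))  # trie depth near log2 n; log* n tiny
```

For n = 4096, the average trie depth hovers around log2 n = 12, while log* n is only 4: the whole point of level compression.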

Download a version of "An analysis of random LC tries", which appeared in

*Random Structures and Algorithms*, vol. 15, pp. 359-375, 2001.

- Boris Pittel thoroughly discussed the integer partitioning problem.
This is of great value since the problem is well known to be
NP-complete. In Boris' formulation: you are given n randomly
chosen numbers in the interval between 1 and 2^m and want
to partition them into two subsets, so that the ``discrepancy'' (the
absolute value of the difference of their sums) is minimized. So far,
very few such hard combinatorial problems have succumbed to analytic
approaches, most of the existing results being based on the probabilistic
method.
Say that a partition is perfect if the discrepancy is zero. Boris shows that everything depends on the ratio m/n, and a sharp ``phase transition'' is detected: if m/n < 1, there are almost surely many perfect partitions, while if m/n > 1 almost surely none exists. Boris also shows much finer results describing the transition, which takes place in a narrow region near 1. It is interesting to note that this study started from partly heuristic work by Stephan Mertens (based on the replica method of statistical physics and on Derrida's random energy model), well supported by extensive simulations. I cannot do justice to the richness of the methods here. Let me just say that it starts from a representation by oscillating integrals based on orthogonality on the unit circle; for instance, the number of perfect partitions (signings eps_j = +/-1 with sum eps_j a_j = 0) of a sequence (a_1, ..., a_n) is

(2^n / (2 pi)) Int_{-pi}^{pi} cos(a_1 x) ... cos(a_n x) dx.
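The unit-circle representation (2^n / 2 pi) times the integral of the product of cos(a_j x) can be checked numerically against brute force. A sketch (Python; function names mine): since the integrand is a trigonometric polynomial of degree at most the total sum S, a uniform grid with more than S points evaluates the integral exactly.

```python
import math
from itertools import product

# Check the oscillating-integral count of perfect "partitions"
# (signings eps_j = +/-1 with sum eps_j * a_j = 0):
#   N(a) = (2^n / 2*pi) * Int_{-pi}^{pi} prod_j cos(a_j x) dx.
# For integer a_j the integrand is a trig polynomial of degree
# S = sum(a), so a uniform grid with m > S points is exact.

def perfect_partitions_integral(a):
    n, S = len(a), sum(a)
    m = 2 * S + 1                       # enough grid points for exactness
    total = 0.0
    for j in range(m):
        x = 2 * math.pi * j / m
        total += math.prod(math.cos(ai * x) for ai in a)
    return round(2**n * total / m)

def perfect_partitions_brute(a):
    """Enumerate all 2^n signings and count those with zero sum."""
    return sum(1 for eps in product((-1, 1), repeat=len(a))
               if sum(e * ai for e, ai in zip(eps, a)) == 0)

a = [3, 1, 4, 1, 5, 9, 2, 6, 5]
print(perfect_partitions_integral(a), perfect_partitions_brute(a))
```

The grid trick works because, on m equispaced points, every nonconstant harmonic e^{ikx} with |k| < m averages to zero, so the Riemann sum picks out exactly the constant Fourier coefficient.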

A column in the *American Scientist* serves as a good introduction to the problem.

The STOC-2001 paper by Chayes, Borgs, and Pittel. Related publication in *Random Structures and Algorithms* 19(3-4): 247-288 (2001).

The Mathematics ArXiv has a preliminary version of the related paper ``Phase Diagram for the Constrained Integer Partitioning Problem'', by C. Borgs, J. T. Chayes, S. Mertens, B. Pittel. 62 pages.

- Philippe Flajolet: I talked about ``algebraic analytic asymptotics''.
Gee! Many
problems lead to generating functions that are
*algebraic functions*, that is, roots of polynomial equations P(z, y) = 0. First and foremost, you have objects described by context-free grammars, which lead to systems of polynomial equations, hence algebraic generating functions (by elimination). This covers all sorts of combinatorial families of trees, e.g., the ones defined by restricting the node degrees to a fixed finite set of allowed values, but also lattice paths that nicely decompose, certain geometric diagrams, parenthesis systems, and related formal languages. Next, the ``kernel method'' invented by Fayolle, Malyshev, and Iasnogorodski (in the 1970s) and the related ``quadratic method'' of Tutte (in the 1960s) show that certain functional equations may have (highly nontrivial) algebraic solutions. Then, the following question arises: Given a generating function that is algebraic, can we determine automatically an asymptotic expansion of its coefficients? No suspense: the answer is *Yes!*

Simpler systems have a vectorial form y = Phi(z, y), where y is a vector of GFs and Phi is a vector of polynomials with nonnegative coefficients. In this case, we have at our disposal a classical theorem, which I like to call the Drmota-Lalley-Woods Theorem (in fact the three authors worked independently in the 1990s): it asserts that, assuming total dependence [strong connectivity of the dependency graph], any component generating function must have a dominant singularity of the square-root type. Thus coefficients are composed of asymptotic elements of the form C omega^n n^{-3/2}, with C and omega algebraic numbers. In general, however, a system may well not satisfy these positivity/dependency conditions. Here I show (based on joint work with Bruno Salvy and Cyril Chabaud) that there exist

*decision procedures* for finding the dominant singularities and determining the corresponding singular expansions. (The latter are known to be of the Newton-Puiseux type, that is, they involve only fractional exponents.) The difficulty is one of choosing amongst the various branches that an algebraic function may have--this is sometimes called a ``connection problem''. The net result is an explicit algorithm by which one can find the right expansions for a branch at the origin of an algebraic function specified by enough initial conditions: coefficients of any algebraic function are asymptotically linear combinations of elements of the form C omega^n n^{p/q}, for *computable* numbers p/q (rational) and omega (algebraic).

See Functional Equations, Rational and Algebraic Functions, by Flajolet & Sedgewick, Chapter 8 of

*Analytic Combinatorics*, 2001, 98p. (INRIA Res. Rep. #4103).
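The simplest instance of the square-root-singularity phenomenon is worth seeing numerically. The GF of binary trees satisfies y(z) = 1 + z y(z)^2, a positive algebraic equation, and its coefficients are the Catalan numbers, asymptotic to C omega^n n^{-3/2} with C = 1/sqrt(pi) and omega = 4, both algebraic. A sketch (Python; `catalan` and `ratio` are my names):

```python
import math

# Coefficients of the algebraic GF y(z) = 1 + z*y(z)^2 are the
# Catalan numbers, with square-root-singularity asymptotics
#   catalan(n) ~ 4^n * n^(-3/2) / sqrt(pi).

def catalan(n):
    return math.comb(2 * n, n) // (n + 1)

def ratio(n):
    """catalan(n) divided by its first-order asymptotic estimate
    (computed via logarithms to avoid float overflow)."""
    log_asym = n * math.log(4) - 1.5 * math.log(n) - 0.5 * math.log(math.pi)
    return math.exp(math.log(catalan(n)) - log_asym)

for n in (10, 100, 1000):
    print(n, ratio(n))   # the ratio tends to 1 as n grows
```

The ratio approaches 1 from below (the next term of the expansion is -9/(8n)), a typical Newton-Puiseux correction.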

Slides for this lecture.

- Gilles Schaeffer talked on the enumeration of planar maps.
These are connected planar graphs with an embedding in the plane (or on the
sphere) ``frozen''. Tutte in the 1960s, in an attempt to disprove
what was at the time the 4-colour conjecture, launched a
far-reaching programme destined to enumerate all sorts of planar maps.
(Tutte was hoping to enumerate 4-colourable maps and show that they
are in smaller number than all maps.)
Surgery on maps leads to functional equations for bivariate GFs
enumerating maps according to size (i.e., number of edges, vertices, or
faces, depending on the situation) as
well as an auxiliary parameter. For instance, take a rooted map
(rooted along an oriented edge) and delete the root: you get either
two maps not connected at all to one another or a map with a new face that
combines two other faces. With recording the number of edges and
the degree of the face containing the root, a moment's
reflection shows that the bivariate GF of maps satisfies the
functional equation
*must*be added to the collection of combinatorial objects resorting to analytic combinatorics. I'd expect in the near future some analysis of combinatorial algorithms coming out of such beautiful enumerative results.Gilles Schaeffer's old Home Page at LORIA (he's now at the Polytechnique near Paris).

Mireille Bousquet-Mélou's Publication pages.
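Tutte's functional equation for rooted maps can be checked mechanically. The sketch below (Python; the helpers `mul`, `at_u1`, `div_u_minus_1` are mine, and it assumes the standard form of the equation with z marking edges and u the root-face degree) iterates the equation as a power series in z, with coefficients that are polynomials in u, and compares against Tutte's closed form 2 * 3^n * (2n)! / (n! (n+2)!) for maps with n edges.

```python
from math import factorial

# Iterate Tutte's functional equation for rooted planar maps,
#   M(z,u) = 1 + z u^2 M(z,u)^2 + z u (u M(z,u) - M(z,1)) / (u - 1),
# as a truncated z-series whose coefficients are u-polynomials (dicts
# mapping u-exponent -> integer), and compare M(z,1) with the closed
# form  M_n = 2 * 3^n * (2n)! / (n! * (n+2)!).

N = 6  # truncation order in z

def mul(A, B):
    """Product of two z-series of u-polynomials, truncated at z^N."""
    C = [dict() for _ in range(N + 1)]
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            if i + j > N:
                continue
            for p, ca in a.items():
                for q, cb in b.items():
                    C[i + j][p + q] = C[i + j].get(p + q, 0) + ca * cb
    return C

def at_u1(A):
    """Substitute u = 1, leaving a plain z-series."""
    return [sum(c.values()) for c in A]

def div_u_minus_1(P):
    """Exact synthetic division of a u-polynomial by (u - 1)."""
    out, carry = {}, 0
    for p in range(max(P, default=0), 0, -1):
        carry += P.get(p, 0)
        out[p - 1] = carry
    return out

M = [dict() for _ in range(N + 1)]
M[0] = {0: 1}                                   # start from M = 1
for _ in range(N + 1):                          # each pass fixes one more order
    M2 = mul(M, M)
    m1 = at_u1(M)
    new = [dict() for _ in range(N + 1)]
    new[0] = {0: 1}
    for n in range(1, N + 1):
        t = {p + 2: c for p, c in M2[n - 1].items()}     # z u^2 M^2 part
        num = {p + 1: c for p, c in M[n - 1].items()}    # u*M ...
        num[0] = num.get(0, 0) - m1[n - 1]               # ... minus M(z,1)
        for p, c in div_u_minus_1(num).items():          # z u * quotient
            t[p + 1] = t.get(p + 1, 0) + c
        new[n] = t
    M = new

series = at_u1(M)
closed = [2 * 3**n * factorial(2 * n) // (factorial(n) * factorial(n + 2))
          for n in range(N + 1)]
print(series)
print(closed)
```

Both lists come out as 1, 2, 9, 54, 378, 2916, 24057, the familiar counting sequence of rooted planar maps by edges.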

*To be continued!*