Previous News Next News

These pages are edited by Philippe.Flajolet@inria.fr.

** Issue of October
1997.**

[28/10/97] **Large deviations and Quicksort**.
Given a parameter like the cost function of an algorithm,
mean and variance estimates provide valuable informations on
the distribution. For instance, this is the basis of the second moment
method: if a parameter has a standard deviation much smaller than the
mean, then the distribution is concentrated around the mean.
The book by Motwani and Raghavan,
Randomized Algorithms
provides several illsutrations of this basic fact.

Often tail results stronger than what moment methods provide are available. The probability of being far away from the mean may decay exponentially or even superexponentially fast (in units given by the standard deviation).

In a paper entitled Large Deviations for Quicksort
that appeared in Journal of Algorithms, **21** (1996),
pp. 476-507, McDiarmid and Hayward provide such bounds for the
Quicksort algorithm. Their main theorem provides very precise
quantitative information and they extend it to the classical median of
3, 5, etc, improvements of Quicksort.

There's a recent preprint entitled
"Quicksort algorithm again revisited" by Knessl and
Szpankowski
that deals with a very detailed study of these bounds
and
is available on the web.
What Knessl and Szpankowski do is to start from the basic recurrence
over probability generating functions and proceed via nonlinear integral
equations that approximate them. The method uses matched asymptotic
expansions and the WKB method from applied mathematics.
It is found that the limit distribution (known to
exist by works of Regnier, Hennequin, and Roesler, 1989-1992)
has a left tail that is much
"thinner", *i.e.* doubly exponential than the right tail that is only
simply exponential.

Remember that it is an outstanding open problem to determine whether the limit distribution (see the icon below for the exact distributions at n=20,25,30) can be characterized in terms of classical functions of analysis. At least, we know that the moments of the limit distribution are expressible in terms of values of the zeta function at the integers.

[28/10/97]** Random generation of combinatorial structures.**
Frank Ruskey has joined these pages. Frank has done extensive research
on random generation and
ranking-unranking. You want to
test a conjecture, need to simulate an algorithms on certain inputs of
some sort. How to generate a random instance or list all instances of
a fixed size?
Examples may be permutations of a multiset,
graphs, rooted trees, balanced trees, etc.
The
**Combinatorial Object Server** provides access to a
rich collection of algorithms
interactively on the web.

For combinatorial objects that are nicely decomposable,
the **Combstruct**package is available under
Maple starting with version 5.4. For instance, you can draw at random
an unlabelled binary tree
of size 50 by the command

An excellent introduction to these questions is still the classic book Combinatorial Algorithms by Nijenhuis and Wilf (second edition, Academic Press, 1978). By the way, a related book called Algorithms and Complexity by Herb Wilf is now freely available on the web.

[29/10/97] **Rooted trees.**
Frank Ruskey's pages also contain a lot of information on classical
combinatorial problems. Above is a display of all rooted trees of
size 5 found on his pages.

The counting sequence under *combstruct* can also be obtained
automatically by the simple
sequence of commands (Maple version 5.4):

`
with(combstruct);
RootedTrees:=[T,{T=Prod(Z,Set(T))},unlabelled];
seq(count(RootedTrees,size=j),j=1..25);
`

`
1, 1, 2, 4, 9, 20, 48, 115, 286, 719, 1842, 4766, 12486,
32973, 87811, 235381, 634847, 1721159, 4688676,
12826228, 35221832, 97055181, 268282855, 743724984, 2067174645
`

This just says that we have combinatorial objects (T) defined recursively
as an "atomic" node (Z, by convention) attached to a *multiset*
of similar objects (of type T).

Given an integer sequence like this, it is
always a good idea to check the Encyclopedia of
Integer Sequences [EIS] by Sloane and
Plouffe (Academic Press, 1995. ISBN: 012-558630-2). Sure enough, the
sequence appears there as **M1180**.
One can even access the list (and many more sequences) **
on line**! There, we'll find automatically
for the sequence 1, 1, 2, 4, 9, etc:

`
%N Rooted trees with n nodes.
%F G.f. satisfies: Sum a(n) x^n = Product(1-x^{n+1})^{-a(n)}.
`

and much more!

See also the ** Resouces
pages of AofA** for many other interesting sites.

[29/10/97] **New books**. The new edition of Knuth's the Art of
Computer Proramming is available, as publicized in the July
issue of Research News.

I have also received the latest edition (the third) of Sedgewick's Algorithms in C (Addison-Wesley, 1997; ISBN 0-201-31452-5). What was the earlier book is now split into two volumes (!). The material of the current volume concerns fondamentals, data structures, sorting, and searching, and it has been considerably expanded. I especially enjoy the visual traces corresponding to various algorithms and various data sets. Very clever! New data structures have also found their ways into the book, like skip lists, ternary search tries, randomized binary search trees, and so on. All, as usual, carefully and simply implemented. The back cover suggests that the distribution of the previous editions was around 250,000. I guess that if the new edition is equally successful, Bob will have no trouble buying a beer to all members of this list. (Subscribe!) :-) :-)