Previous News
Next News
Hi! We're opening with a bunch of interesting items.
Hashing with linear probing. One of the events of the 3rd Dagstuhl seminar was the solution of an old research problem by Patricio Poblete and Alfredo Viola. This deals with a complete analysis of hashing with linear probing whose average case analysis was done by Knuth as early as 1962. As Knuth explains himself, this problem has played a very important role in his subsequent works. Poblete and Viola have succeeded in determining moments of the total construction time of an LP table (a sort of "path length"), while even the variance was previously unknown. A research announcement in postscript is available here [ps:37k | ps.gz:11k]. E-mail exchanges (!!) with Don Knuth while many of us were in Dagstuhl in July 1997 also indicate that Don has found an independent solution that is related to Wright's enumeration of connected graphs and inversion in trees. I hope to report later on these developments.
A recurrence in universal coding theory by Wojtek Szpankowski.
Ramanujan's Q-function and the so called ``tree function'' T(z)
have found
applications in hashing, the birthday paradox problem, random mappings,
caching, memory conflicts, and so forth. Recently, several novel applications
of these functions to information theory problems such as linear coding
and universal portfolios were brought to light. In his paper
[ps:136k |
ps.gz:41k],
Szpankowski studies them in the context of universal coding.
His methodology falls under
analytical information theory that was recently applied
successfully to a variety of information theory problems.
Analytic Combinatorics.
This is the title of a future book by Flajolet and Sedgewick.
The overall objective is to provide a unified treatment based on algebra and
analysis of generating functions, whenever there exist "exactly solved
models" in combinatorics. The subject is naturally closely
related to Analysis of Algorithms. Eventually, the book should
comprise ten chapters of which eight exist in preview version. The
latest chapter (123pp)
[
ps: 1451k |
ps.gz:416k ], dated May 1997, is out.
It deals with Multivariate Asymptotics and Limit
Distributions.
Just look up
Flajolet's pages for the whole collection and more.
Hofri's book. The title is
Analysis of Algorithms --
Mathematical Methods, Computational Tools and it's published by
Oxford University Press, 1995 (ISBN: 0-19-509954-0).
You'll find
a page
with preface, table of contents, and errata (there's a 3$ reward!).
The book has chapters on Generating functions, Combinatorial calculus,
Integral transforms, Asymptotic methods, Probabilistic methods,
Permutations, Sorting and Searching, Communication protocols, and Bin
Packing.
[28/7/97] Linear probing strikes again.
Don Knuth's
brand new work on the
analysis of linear probing hashing (see his P158) is available.
We also archive copies here
[tex |
tex.gz |
ps |
ps.gz].
This is strongly related to the July'97 editorial
and the results by Poblete and Viola that are discussed
there, as well as our pet icon.
Look up also Knuth's conclusion!
[29/7/97] Gcd algorithms.
One of the challenging problems in the area of
Seminumerical Algorithms is the theoretical understanding of the gcd
algorithm. This is one of the oldest known
algorithms and its analysis had to wait for more than two
millenia! What seems to be the "ancient" version (subtract/exchange) was,
if my memory serves me right, only analysed by Knuth and Yao in the
1970's or early 1980's.
The classical version had also to wait till
the 1970's (Wirsing, Heilbronn, Dixon). Now, programmers of binary
computers know that the
subtract/shift "binary" gcd performs quite well in practice. Brent in 1976 (a
great paper in a
famous book) did a first "semi-rigorous" analysis of the binary gcd
algorithm but several open problems remain in this orbit: look up
for
Knuth's beta test version of new material (Section 4.5.2) on analysis
of the binary gcd algorithm.
Big bucks and/or glory await you:-)
Roughly, computing the gcd of pairs of random integers in the interval
[1..N]
[by the classical or binary method] leads to expected costs ~c*
log(N)
and everything lies in the determination of
the right constant "c"... People interested in limit distributions may
also find it rewarding to know that the distribution of costs of the
classical algorithm is asymptotically Gaussian, a recent result of
Hensley that is based
on hard functional analysis.
[31/7/97] Quickselect. Quicksort can be adapted
in order to select any element of an unsorted file given by its rank
(using only one-sided recursion).
This is "Quickselect", an algorithm that goes back to Hoare's classic
paper. It has been known for a long time that selection can be achieved
in this way in expected linear time. The distribution of costs has
been investigated by Devroye, as far as tail estimates go, and by
Mahmoud, Modarres, and Smythe who characterized the limiting distribution of
costs when both the input permutation and the rank of the element to
be selected are random. In
this paper, Prodinger and Panholzer analyse "Multiple Quickselect"
where one searches for several elements by rank simultaneously. The
problem is of interest in statistics (cf, Mahmoud at Dagstuhl'97), for
instance in situations where several quantiles (median, quarter
quantiles, etc) need to be determined. From a combinatorial viewpoint,
the problem is equivalent to analysing the location of p random
pebbles on a random binary search tree of size n. Panholzer
and Prodinger determine exactly the mean and variance of the number of
passes and of the number of comparisons. Computer algebra is essential,
as some of their exact expressions have sizes of several
hundred symbols! Our pet dilogarithm even pokes its nose there.
This paper is especially
interesting from a methodological point of view as it is a testimony
of the subtle [inter]play between a computer algebra system and
the solvers of an intricate
combinatorial problem.