Issue of November 1997.
[02/11/97] Knuth's first analysis of an algorithm. Thanks to the web, everybody can now have easy access to an original of Don Knuth's NOTES ON "OPEN" ADDRESSING . The note has 6 pages. It provided for the first time a precise analysis of hashing with linear probing. As Knuth repeatedly indicates, this problem has had a strong influence on his carreer.
There, you'll find for the first time the connection between the analysis of full tables and what was to become later known as the Knuth-Ramanujan function Q(n). Of course, the analysis of sparse tables is also there.
[02/11/97] The Ramanujan-Knuth Q-function. Knuth's note of 1963 concludes with a conjecture about the asymptotics of the Knuth-Ramanujan function (which Knuth proved in 1965). We now know that this function together with a statement of its asymptotic behaviour was in fact already appearing in Ramanujan's first letter to Hardy dated Januray 16, 1913. Here is a version of a paper that provides a recent asymptotic analysis of this function (P. Flajolet, P. Grabner, P. Kirschenhofer, H. Prodinger. In J. Computational and Applied Mathematics, vol. 58 (1), 1995, pp. 103-116). Bruce Berndt has devoted a considerable part of his life to editing Ramanujan's works. See his home page for more about this fascinating adventure.
As we see from Knuth's note, Ramanujan's analysis entails that the total displacement of elements, i.e., the construction cost, for a linear probing hashing table that is full is about N^(3/2).
[02/11/97] Cost distribution in linear probing hashing. In the July issue of Research Notes, we saw that Poblete-Viola and Knuth had independently solved the problem of the determining the variance of the construction cost of linear probing hashing. Knuth's analysis from July 1997 is available on the web here. In July 1997, Poblete and Viola merged forces with Flajolet. As a result, the variance analysis can be in fact extended to a full distributional analysis of linear probing hashing. The report is available on the web.
Here's what goes on. For sparse tables, that is to say tables whose filling ratio a is less than 1, then there are a large number of clusters that tend to be small. Accordingly, the distribution of costs is Gaussian in the asymptotic limit. For full tables, the distribution has a standard deviation that is of the same order as the mean and thus it is rather spread out. In the limit, what we have for this case of full tables is a so-called Airy distribution that is related to our pet icon and involves Airy functions.
For reasons explained in the above papers, the Airy distribution appears in a diversity of combinatorial problems like: (i) path length in trees, (ii) area below random walks and excursions, (iii) inversions in trees, (iv) connectivity in random graphs.
Here's the way clusters are formed in a linear probing table as it fills up (from top to bottom). See Sedgewick's new book for more of such displays (and also better ones!!)
[10/11/97] Knuth's original note in TeX. Here is the TeX
version of Knuth's "Notes on `Open' Addressing". Helmut
Prodinger has very kindly retyped them into TeX for everyone to
see and enjoy.
Versions:
[TeX |
dvi |
postscript |
ps.gz]
[30/11/97] Odlyzko's monograph on "Asymptotic enumeration methods". In 1995, Elsevier published the Handbook of Combinatorics edited by R. L. Graham, M. Groetschel, and L. Lovasz. Andrew Odlyzko has there a survey of asymptotic enumeration methods (Volume 2 Chapter 22, pp. 1063-1229) that is really a thorough monograph of some healthy 166 pages! A preprint version of this inspiring text is on the web, as well as many other goodies that Andrew is offering on his Home Pages. I cannot resist listing the table of contents from this document that is a must for anyone interested in combinatorial asymptotics.
It is amusing to contrast a quotation of Odlyzko (1995)
Under Odlyzko's page, you'll find a lot of relevant papers from the guru of analytic methods. :-) Here are some that are relative to general methods and are offered electronically: