Travelling Waves and the Heigth of Binary Search Trees

Several more or less rigorous approaches suggest that the distribution of the height $H_n$ of binary search trees with $n$ internal nodes is given by ${\bf P}[H_n\le k] = F(\log n - c_1 k- c_2 \log k) + o(1)$, where $c_1 = 0.231\ldots = 1/4.3107\ldots$ and $c_2 = \frac {3c_1}{2(1-c_1)} = 0.407\ldots$\@ $F$ is a (continuous) ``wave'' and $c_1$ the ``velocity''. Interestingly this property is related to the sequence $(y_k(1))_{k\ge 0}$, where $y_k(x) = \sum_n {\bf P}[H_n\le k]\, x^n$ are generating functions inductively given by $y_0(x)\equiv 1$ and \[y_{k+1}(x) = 1 + \int_0^x y_k(t)^2dt.\] For example, if the limit $\lim_{k\to\infty} y_{k+1}(1)/y_k(1)$ exists then ${\bf P}[H_n\le k] = F(\log n - \log y_k(1)) + o(1)$. By applying a result of B.~Reed one gets $ \left| \log y_k(1) - c_1 k - c_2 \log k\right| = O(1)$, \ $(k\to\infty)$. Hence the above relation for the distribution of $H_n$ is really plausible.

Virginie Collette Last modified: Tue Sep 4 18:59:46 CEST 2001