From spa@cs.purdue.edu Fri Jul 25 03:43:47 1997 From: Wojciech Szpankowski Subject: Two Problems on Digital Search Trees Consider a digital search tree (DST) built over $m$ independently generated binary strings with ''$0$'' occurring with` probability $p$ and ''$1$'' with probability $q=1-p$. Limiting Distribution of the Profile. Let us fixed $k$ (that may depend on $m$), and define $B_m^k$ as the number of nodes in the DST at level $k$. This is called the profile of DST. We are interested in obtaining the limiting distribution of $B_m^k$ for any $k$. Aldous and Shields [AS] established it for the symmetric case. In the asymmetric case the limiting distribution is unknown. It is easy to derive a recurrence for $B_m^k$. Define $B_m^k(u)=Eu^{B_m^k}$. Then $$ B_{m+1}^k(u)=\sum_{l=0}^m {m \choose l} p^lq^{m-l} B_l^{k-1}(u) B_{m-l}^{k-1}(u)~, $$ with $B_0^0(u)=1$. Let now $B^k(z,u)=\sum_{m=0}^\infty B_m^k(u) {z^m \over m!}$. Then, the above becomes $$ {\partial B^k(z,u) \over \partial z} = B^{k-1}(pz,u) B^{k-1}(qz,u) $$ with $B^0(z,u)=u(e^z-1)+1$. We conjecture that for $k=O(\log m)$ the limiting distribution of $B_m^k$ is normal. The average profile was established in [LS] Limiting Distribution of the Height. Let $H_n^k$ denote the probability that the {\it height} in a DST with $m$ nodes is smaller equal to $k$. Define $H^k(z)=\sum_{m\geq 0} H_m^k {z^m \over m!}$. Then, one easily see that for $k\geq 1$ $$ {d H^k(z) \over d z} = H^{k-1}(zp) H^{k-1}(zq) $$ with $H^0(z)=1+z$. What is the limiting distribution of the height? From [Pittel] we only know that $H_m /\log m \to 1/\log (\min\{1/p, 1/q\})$ almost surely. [AS] D. Aldous and P. Shields, A Diffusion Limit for a Class of Random-Growing Binary Trees, {\it Probab. Th. Rel. Fields}, 79, 509-542 (1988). {LS] G. Louchard and W. Szpankowski, Average profile and Limiting distribution for a phrase size in the Lempel-Ziv parsing algorithm, {\it IEEE Trans. Information Theory}, 41, 478-488 (1995). [Pittel] B. Pittel, Asymptotic Growth of a Class of random Trees, {\it Annals of Probability}, 13, 414 - 427 (1985). \end{thebibliography}