Philippe Jacquet, {\sc Inria}--Rocquencourt

Analysis of Suffix Trees by String-Ruler Approach

We study in a probabilistic framework the lengths of strings which occur at least twice in a given random word on a finite alphabet. The lengths of repeated substrings correspond to the depths of suffixes stored in the associated suffix tree. Our main finding shows that the depths in a suffix tree are asymptotically distributed in the same manner as the depths in a digital tree that stores independent keys. We prove these results using a novel technique called {\it string-ruler} approach. Our results provide new insights into several algorithms on words and data compression schemes. (Joint work with Wojciech Szpankowski).