Article contents
Concentration of Size and Path Length of Tries
Published online by Cambridge University Press: 24 September 2004
Abstract
We show that, for a certain class of probabilistic models, the number of internal nodes $S_n$ of a trie built from $n$ independent and identically distributed keys is concentrated around its mean, in the sense that $\Var S_n=\Oh(\EE S_n)$. Keys are sequences of symbols which may be taken from varying alphabets, and the choice of the alphabet from which the $k$th symbol is taken, as well as its distribution, may depend on all the preceding symbols. In the construction of the trie we also allow for bucket sizes greater than 1. The property that characterizes our models is the following: there is a constant $C$ such that for any word $v$ that may occur as a prefix of a key the size $S^v_n$ of a trie built from the suffixes of $n$ independent keys conditioned to have the common prefix $v$ has the property $\EE S^v_n\leq Cn$. This class of models contains memoryless and Markovian source models as well as the probabilistic dynamical source models that were recently introduced and thoroughly developed by Vallée [Algorithmica29 (2001) 262–306], in particular the continued fraction source. Furthermore we study the external path length $L_n$, which obeys $\EE L_n=\Oh(n\ln n)$ and $\Var L_n=\Oh(n\ln^2 n)$.
- Type
- Paper
- Information
- Copyright
- © 2004 Cambridge University Press
- 2
- Cited by