Stochastic scrabble: large deviations for sequences with scores

Published online by Cambridge University Press: 14 July 2016

Richard Arratia
Affiliation:
University of Southern California
Pricilla Morris
Affiliation:
University of Southern California
Michael S. Waterman
Affiliation:
University of Southern California

Abstract

A derivation of a law of large numbers for the highest-scoring matching subsequence is given. Let $X_k$, $Y_k$ be i.i.d. $q = (q(i))_{i \in S}$ letters from a finite alphabet $S$ and $v = (v(i))_{i \in S}$ be a sequence of non-negative real numbers assigned to the letters of $S$. Using a scoring system similar to that of the game Scrabble, the score of a word $w = i_1 \cdots i_m$ is defined to be $V(w) = v(i_1) + \cdots + v(i_m)$. Let $V_n$ denote the value of the highest-scoring matching contiguous subsequence between $X_1 X_2 \cdots X_n$ and $Y_1 Y_2 \cdots Y_n$. In this paper, we show that $V_n / (K \log n) \to 1$ a.s., where $K \equiv K(q, v)$. The method employed here involves ‘stuttering’ the letters to construct a Markov chain and applying previous results for the length of the longest matching subsequence. An explicit form for $\beta \in \Pr(S)$, where $\beta(i)$ denotes the proportion of letter $i$ found in the highest-scoring word, is given. A similar treatment for Markov chains is also included.
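
To make the definition of the highest-scoring match concrete, the following is a minimal sketch of how its value could be computed for two given sequences by dynamic programming. It is not taken from the paper: the function name `highest_scoring_match` and the example score table are hypothetical, and the sketch assumes that the matching word may start at different positions in the two sequences.

```python
def highest_scoring_match(x, y, v):
    """Return the largest score V(w) over contiguous words w that occur
    in both x and y (assumption: the two occurrences may be shifted)."""
    best = 0.0
    # prev[j] = score of the longest common run ending at x[i-2], y[j-1]
    prev = [0.0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0.0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                # Scores are non-negative, so extending a match never hurts.
                cur[j] = prev[j - 1] + v[x[i - 1]]
                best = max(best, cur[j])
        prev = cur
    return best


# Hypothetical Scrabble-like letter scores, for illustration only.
scores = {"A": 1, "B": 3, "C": 3, "X": 8, "Y": 4}
print(highest_scoring_match("ABCAB", "XBCAY", scores))
```

Because the scores are non-negative, the best word ending at a given pair of positions is always the full run of matching letters ending there, which is why a single pass over all pairs of positions suffices.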

Implicit in these results is a large-deviation result for the additive functional, $H \equiv \sum_{n < \tau} v(X_n)$, for a Markov chain stopped at the hitting time $\tau$ of some state. We give this large-deviation result explicitly, for Markov chains in discrete time and in continuous time.
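
As an illustration of the additive functional in discrete time, here is a minimal simulation sketch that draws one sample of the sum of scores collected before the chain hits a target state. The function name, the dictionary encoding of the transition matrix, and the convention that the chain starts at time 0 and the target state itself contributes nothing are assumptions made for this example, not details from the paper.

```python
import random


def stopped_additive_functional(P, v, start, target, rng, max_steps=10**6):
    """Sample H = sum of v(X_n) over n < tau, where tau is the first time
    n >= 0 at which the chain X_n is in `target`.

    P maps each state to a dict {next_state: probability} (assumed format).
    """
    state = start
    total = 0.0
    for _ in range(max_steps):
        if state == target:
            return total  # stop at the hitting time; target adds nothing
        total += v[state]
        next_states = list(P[state].keys())
        weights = list(P[state].values())
        state = rng.choices(next_states, weights=weights)[0]
    raise RuntimeError("target state not reached within max_steps")


# Hypothetical two-letter chain with an absorbing target state "stop".
P = {
    "a": {"a": 0.5, "b": 0.3, "stop": 0.2},
    "b": {"a": 0.4, "b": 0.4, "stop": 0.2},
    "stop": {"stop": 1.0},
}
v = {"a": 1.0, "b": 3.0, "stop": 0.0}
print(stopped_additive_functional(P, v, "a", "stop", random.Random(0)))
```

Repeating the call many times gives an empirical distribution of the functional, whose upper tail is the quantity a large-deviation result describes.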

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1988 

Footnotes

Research supported by the System Development Foundation, the National Science Foundation, the Institute of Mathematics and its Applications and the National Institutes of Health.
