Entropy estimate by a randomness criterion

Published online by Cambridge University Press:  28 January 2016

TETURO KAMAE*
Affiliation:
Advanced Mathematical Institute, Osaka City University, 558-8585, Japan

Abstract

We propose a new criterion for randomness of a word $x_{1}x_{2}\cdots x_{n}\in \mathbb{A}^{n}$ over a finite alphabet $\mathbb{A}$ defined by

$$\begin{eqnarray}\Xi^{n}(x_{1}x_{2}\cdots x_{n})=\sum_{\xi\in \mathbb{A}^{+}}\psi(|x_{1}x_{2}\cdots x_{n}|_{\xi}),\end{eqnarray}$$
where $\mathbb{A}^{+}=\bigcup _{k=1}^{\infty }\mathbb{A}^{k}$ is the set of non-empty finite words over $\mathbb{A}$, for $\xi\in \mathbb{A}^{k}$,
$$\begin{eqnarray}|x_{1}x_{2}\cdots x_{n}|_{\xi}=\#\{i;~1\leq i\leq n-k+1,~x_{i}x_{i+1}\cdots x_{i+k-1}=\xi\},\end{eqnarray}$$
and for $t\geq 0$, $\psi(0)=0$ and $\psi(t)=t\log t~(t>0)$. This value represents how random the word $x_{1}x_{2}\cdots x_{n}$ is from the viewpoint of the block frequency. In fact, we define a randomness criterion as
$$\begin{eqnarray}Q(x_{1}x_{2}\cdots x_{n})=(1/2)(n\log n)^{2}/\Xi^{n}(x_{1}x_{2}\cdots x_{n}).\end{eqnarray}$$
Then,
$$\begin{eqnarray}\lim _{n\rightarrow \infty }(1/n)Q(X_{1}X_{2}\cdots X_{n})=h(X)\end{eqnarray}$$
holds with probability 1 if $X_{1}X_{2}\cdots \,$ is an ergodic, stationary process over $\mathbb{A}$ that either has finite energy or satisfies $h(X)=0$, where $h(X)$ is the entropy of the process. Another criterion for randomness, using $t^{2}$ instead of $t\log t$, was proposed earlier by Kamae and Xue [An easy criterion for randomness. Sankhya A 77(1) (2015), 126–152]. In comparison, our new criterion provides a better fit with the entropy. We also claim that our criterion not only represents the entropy asymptotically but also gives a good representation of the randomness of fixed finite words.
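
To make the definitions above concrete, the following is a minimal Python sketch of the block-frequency sum and the criterion $Q$ for a finite word. It is an illustration under stated assumptions, not the author's implementation: the function names (`psi`, `block_counts`, `xi_n`, `q_criterion`) are hypothetical, the logarithm is assumed to be the natural logarithm (the abstract does not fix the base), and returning infinity when the sum is zero is a convention chosen here. Only blocks that actually occur in the word contribute, since $t\log t$ vanishes at $t=0$, so the infinite sum reduces to a finite one; the enumeration is quadratic in the word length and is meant for short words only.

```python
import math
from collections import Counter


def psi(t):
    """psi(0) = 0 and psi(t) = t * log(t) for t > 0 (natural log assumed)."""
    return t * math.log(t) if t > 0 else 0.0


def block_counts(word):
    """Count the occurrences of every non-empty block (substring) of word."""
    n = len(word)
    counts = Counter()
    for k in range(1, n + 1):          # block length
        for i in range(n - k + 1):     # start position (0-based)
            counts[word[i:i + k]] += 1
    return counts


def xi_n(word):
    """Sum of psi(count) over all blocks; absent blocks contribute psi(0) = 0."""
    return sum(psi(c) for c in block_counts(word).values())


def q_criterion(word):
    """Q = (1/2) * (n log n)^2 / xi_n(word).

    Assumption of this sketch: when the sum is zero (no block occurs
    twice, or the word has length at most 1) the function returns
    math.inf instead of raising an error.
    """
    n = len(word)
    total = xi_n(word)
    if total == 0:
        return math.inf
    return 0.5 * (n * math.log(n)) ** 2 / total
```

For a word `w`, the quantity `q_criterion(w) / len(w)` is the finite-length analogue of the limit stated above. Because $t\log t$ is superadditive, a constant word such as `"a" * 20` yields a larger sum, and hence a smaller $Q$, than a word of the same length whose blocks are spread over more distinct values.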

Type
Research Article
Copyright
© Cambridge University Press, 2016 

References

Kamae, T. and Xue, Y.-M. An easy criterion for randomness. Sankhya A 77(1) (2015), 126–152.
Kamae, T. and Kim, D. H. A characterization of eventual periodicity. Theoret. Comput. Sci. 581 (2015), 1–8.
Li, M. and Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications, 3rd edn. Springer, New York, 2008, ch. 2.
Ornstein, D. and Weiss, B. How sampling reveals a process. Ann. Probab. 18(3) (1990), 905–930.
Shields, P. C. The Ergodic Theory of Discrete Sample Paths (Graduate Studies in Mathematics, 13). American Mathematical Society, Providence, RI, 1996.
Ziv, J. and Lempel, A. A universal algorithm for sequential data compression. IEEE Trans. Inform. Theory 23 (1977), 337–343.