
Recurrence and transience of a Markov chain on $\mathbb{Z}^+$ and evaluation of prior distributions for a Poisson mean

Published online by Cambridge University Press: 25 April 2024

James P. Hobert and Kshitij Khare

Affiliation: University of Florida

Postal address: Department of Statistics, 103 Griffin Floyd Hall, University of Florida, Gainesville, FL 32611, USA.

Abstract

Eaton (1992) considered a general parametric statistical model paired with an improper prior distribution for the parameter and proved that if a certain Markov chain, constructed using the model and the prior, is recurrent, then the improper prior is strongly admissible, which (roughly speaking) means that the generalized Bayes estimators derived from the corresponding posterior distribution are admissible. Hobert and Robert (1999) proved that Eaton’s Markov chain is recurrent if and only if its so-called conjugate Markov chain is recurrent. The focus of this paper is a family of Markov chains that contains all of the conjugate chains that arise in the context of a Poisson model paired with an arbitrary improper prior for the mean parameter. Sufficient conditions for recurrence and transience are developed and these are used to establish new results concerning the strong admissibility of non-conjugate improper priors for the Poisson mean.

Type: Original Article

Copyright: © The Author(s), 2024. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

There is a well-known connection between the admissibility of statistical estimators and the recurrence of associated stochastic processes (see, e.g., [2, 4, 9]). Eaton [4] considered a general parametric statistical model paired with an improper prior distribution for the parameter that leads to a proper posterior distribution. Let $\theta$ and $\nu({\textrm{d}}\theta)$ denote the parameter and the improper prior distribution, respectively. Eaton proved that if a certain Markov chain (constructed using the model and the prior) is recurrent, then the improper prior is strongly admissible, which means that the generalized Bayes estimator of every bounded function of $\theta$ is almost-$\nu$-admissible under squared error loss. That is, if $g(\theta)$ is any bounded function of $\theta$ and $\delta$ is any estimator of $g(\theta)$ whose mean squared error (MSE) is less than or equal to that of the generalized Bayes estimator of $g(\theta)$ for all $\theta$, then the set of $\theta$s for which the MSE of $\delta$ is strictly less than that of the generalized Bayes estimator has $\nu$-measure 0. (See [5] for an excellent introduction to this theory.) Strong admissibility is a useful property. Indeed, if the prior $\nu$ is strongly admissible, then the statistical model and $\nu$ combine to yield a formal posterior distribution that generates (almost) admissible estimators for a large class of functions of $\theta$, so we might be willing to endorse $\nu$ as a good ‘all purpose’ prior to use in conjunction with this particular statistical model. It is important to keep in mind throughout that Eaton’s condition is merely sufficient. In particular, it remains unknown whether or not transience of Eaton’s Markov chain implies that the prior is not strongly admissible. (See [4, Section 7] for more on this issue.)

Hobert and Robert [7] showed that Eaton’s Markov chain is recurrent if and only if its so-called conjugate Markov chain is recurrent (see also [6]). This is a useful result from a practical standpoint because the conjugate chain is often much easier to analyze than Eaton’s chain. Here we study a set of Markov chains that contains all the conjugate chains that arise in the context of a Poisson model paired with an arbitrary improper prior. We now describe this set of chains.

Let $\{a_m\}_{m=0}^\infty$ be a sequence of strictly positive real numbers such that, for each $i \in {\mathbb Z}^+ \;:\!=\; \{0,1,2,\ldots\}$ , we have $\sum_{j=0}^\infty {a_{i+j}}/{j!} < \infty$ . Define $b_i = ({1}/{i!}) \sum_{j=0}^\infty {a_{i+j}}/{j!}$ . Now let $W =\{W_n\}_{n=0}^\infty$ be a time-homogeneous Markov chain with state space ${\mathbb Z}^+$ and transition probabilities given by

\[ p_{ij} = \mathbb{P}(W_{n+1} = j\mid W_n = i) = \frac{a_{i+j}}{i! j! b_i} \]

for $i,j \in {\mathbb Z}^+$. The fact that the transition probabilities are all strictly positive implies that the chain is irreducible and aperiodic. Moreover, since $p_{ij} b_i = {a_{i+j}}/({i! j!}) = p_{ji} b_j$ for all $i,j \in {\mathbb Z}^+$, the chain is reversible with respect to the sequence $\{b_i\}_{i=0}^\infty$. Thus, $\{b_i\}_{i=0}^\infty$ is an invariant sequence for W, i.e. for each $j \in {\mathbb Z}^+$ we have $\sum_{i=0}^\infty p_{ij} b_i = b_j$. Because W is irreducible and aperiodic, it follows that W is positive recurrent if and only if $\sum_{i=0}^\infty b_i < \infty$ (see, e.g., [1, Section 8]). When this sum diverges, the chain is either null recurrent or transient, and differentiating between these two possibilities in specific examples can be quite challenging. This is our focus. We now provide a simple example.
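To make the construction concrete, here is a minimal numerical sketch of W (assuming NumPy is installed; the truncation levels, and the use of the sequence $a_m = m!/2^{m+1}$ from the example below, are illustrative choices rather than part of the theory). It builds a truncated transition matrix and checks detailed balance and invariance.

```python
import math
import numpy as np

def log_a(m):
    # log a_m for the illustrative sequence a_m = m!/2^(m+1)
    return math.lgamma(m + 1) - (m + 1) * math.log(2.0)

def make_chain(log_a, K, tail=300):
    """Truncated kernel p_ij = a_{i+j}/(i! j! b_i) and the invariant sequence
    b_i = (1/i!) sum_j a_{i+j}/j!, computed in log space to avoid overflow."""
    term = lambda i, j: math.exp(log_a(i + j) - math.lgamma(i + 1) - math.lgamma(j + 1))
    b = np.array([sum(term(i, j) for j in range(tail)) for i in range(K)])
    P = np.array([[term(i, j) / b[i] for j in range(K)] for i in range(K)])
    return P, b

P, b = make_chain(log_a, K=25)
i, j = 3, 7
print(P[i, j] * b[i], P[j, i] * b[j])   # detailed balance: p_ij b_i = p_ji b_j
print(np.abs(b @ P - b)[:10].max())     # invariance, up to truncation error
```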

If we take $a_m = m!/2^{m+1}$ , then, for fixed i,

\[\sum_{j=0}^\infty \frac{a_{i+j}}{j!} = \sum_{j=0}^\infty\frac{(i+j)!}{2^{i+j+1} j!} ,\]

which converges (ratio test). Now,

\[\sum_{i=0}^\infty b_i = \sum_{i=0}^\infty \frac{1}{i!}\sum_{j=0}^\infty \frac{a_{i+j}}{j!} = \sum_{i=0}^\infty \frac{1}{i!}\sum_{j=0}^\infty \frac{(i+j)!}{2^{i+j+1} j!} = \sum_{n=0}^\infty\Big( \frac{1}{2} \Big)^{n+1} \sum_{i=0}^n \binom{n}{i} =\sum_{n=0}^\infty \frac{1}{2} = \infty .\]

We conclude that the Markov chain W corresponding to $a_m =m!/2^{m+1}$ is either null recurrent or transient. We will return to this example several times throughout the paper.
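As a quick empirical companion to this example (a sketch assuming NumPy; the chain length and seed are arbitrary), note that here $b_i = 1$ for every i, since $\sum_{j=0}^\infty \binom{i+j}{i} 2^{-(i+j+1)} = 1$ (a negative binomial series). Consequently $p_{ij} = \binom{i+j}{j} (1/2)^{i+j+1}$, i.e. given $W_n = i$ the next state is negative binomial with parameters $i+1$ and $\frac12$, which makes the chain trivial to simulate:

```python
import numpy as np

rng = np.random.default_rng(0)
w, path = 0, [0]
for _ in range(100_000):
    w = rng.negative_binomial(w + 1, 0.5)   # W_{n+1} | W_n = i ~ NegBin(i+1, 1/2)
    path.append(w)
print("returns to 0:", path.count(0), "  max state reached:", max(path))
```

Long runs show ever longer excursions away from 0 punctuated by returns, consistent with the null recurrence of this chain established later in the paper.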

We now describe the connection between the Markov chain W and the decision-theoretic study of improper priors for a Poisson mean. Suppose that X is a $\mbox{Poisson}(\lambda)$ random variable; that is, $\lambda>0$ and $\mathbb{P}(X=x \mid \lambda) =({{\textrm{e}}^{-\lambda} \lambda^x}/{x!}) \textbf{1}_{{\mathbb Z}^+}(x)$ , where $\textbf{1}_A(\!\cdot\!)$ is the indicator function of the set A. Set $\mathbb{R}^+ = (0, \infty)$ and let $\nu\colon \mathbb{R}^+ \rightarrow\mathbb{R}^+$ be such that $\int_{\mathbb{R}^+} \nu(\lambda) \,{\textrm{d}}\lambda = \infty$ and $\int_{\mathbb{R}^+} \lambda^x {\textrm{e}}^{-\lambda}\nu(\lambda) \, {\textrm{d}}\lambda < \infty$ for all $x \in {\mathbb Z}^+$ . Under these conditions, $\nu(\lambda)$ can be viewed as an improper prior density for the parameter $\lambda$ that yields a proper posterior density given by

\[ \pi(\lambda \mid x) = \frac{{\textrm{e}}^{-\lambda} \lambda^x \nu(\lambda)}{x! m_\nu(x)} \textbf{1}_{{\mathbb R}^+}(\lambda) ,\]

where, of course, $m_\nu(x) \;:\!=\; ({1}/{x!}) \int_{\mathbb{R}^+}\lambda^x {\textrm{e}}^{-\lambda} \nu(\lambda) \, {\textrm{d}}\lambda$ . We associate with each such improper prior $\nu(\!\cdot\!)$ a Markov chain $\Phi^\nu = \{\Phi^\nu_n \}_{n=0}^\infty$ with state space ${\mathbb Z}^+$ and transition probabilities given by

(1.1) \begin{align} \mathbb{P}(\Phi^\nu_{n+1} = j \mid \Phi^\nu_n = i) & = \int_{\mathbb{R}^+} \mathbb{P}(X=j \mid \lambda) \pi(\lambda \mid i) \, {\textrm{d}}\lambda \nonumber \\[5pt] & = \frac{1}{i! j! m_\nu(i)} \int_{\mathbb{R}^+} \lambda^{i+j}{\textrm{e}}^{-2\lambda}\nu(\lambda)\ {\textrm{d}}\lambda \end{align}

for $i,j \in {\mathbb Z}^+$ . This is the conjugate chain mentioned above. As is typical, it is less complex than Eaton’s chain, which has a continuous state space, $\mathbb{R}^+$ , and Markov transition density given by

\[ k(v\mid u) = \sum_{x=0}^\infty \pi(v \mid x) \mathbb{P}(X=x \mid u) = {\textrm{e}}^{-u-v} \nu(v) \sum_{x=0}^\infty \frac{(u v)^x}{(x!)^2 m_\nu(x)} .\]

Clearly, the transition probabilities in (1.1) are strictly positive, which implies that the chain is irreducible and aperiodic. Moreover,

\[ \mathbb{P}(\Phi^\nu_{n+1} = j\mid \Phi^\nu_n = i) m_\nu(i) = \mathbb{P}(\Phi^\nu_{n+1} = i\mid \Phi^\nu_n = j) m_\nu(j)\]

for $i,j \in {\mathbb Z}^+$. Hence, $\Phi^\nu$ is reversible with respect to the sequence $\{m_\nu(i)\}_{i=0}^\infty$. The impropriety of $\nu(\!\cdot\!)$ implies that $\sum_{i=0}^\infty m_\nu(i) = \infty$, so $\Phi^\nu$ is either null recurrent or transient. It follows from results of [4, 7] that if $\Phi^\nu$ is null recurrent, then the prior $\nu$ is strongly admissible under squared error loss. Here is the connection: the chain $\Phi^\nu$ is a member of the general class of chains described above with

\[ a_m = \int_{\mathbb{R}^+} \lambda^m {\textrm{e}}^{-2\lambda} \nu(\lambda) \,{\textrm{d}}\lambda .\]

This connection provides motivation for the development of techniques for differentiating between null recurrence and transience of W when W is not positive recurrent, i.e., when $\sum_{i=0}^\infty b_i$ diverges.
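The first line of (1.1) also suggests a direct way to simulate $\Phi^\nu$: draw $\lambda$ from the posterior $\pi(\cdot \mid i)$ and then draw the next state from the $\mbox{Poisson}(\lambda)$ model. Here is a sketch (assuming NumPy) for the flat prior $\nu(\lambda) = 1$, whose posterior given $x$ is $\mbox{Gamma}(x+1,1)$; marginally, this reproduces the negative binomial transition of the running example.

```python
import numpy as np

rng = np.random.default_rng(1)

def step(i):
    lam = rng.gamma(shape=i + 1, scale=1.0)   # lambda ~ pi(. | i) for nu = 1
    return rng.poisson(lam)                   # next state ~ Poisson(lambda)

phi, visits0 = 0, 0
for _ in range(100_000):
    phi = step(phi)
    visits0 += (phi == 0)
print("visits to 0:", visits0)
```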

We now consider a particular family of improper priors for $\lambda$ that lead to proper posteriors. Take $\nu(\lambda) = \lambda^{\alpha-1} {\textrm{e}}^{-\beta \lambda}$ for $\alpha>0$ and $\beta \in (-1,0]$. This is essentially an improper gamma density, i.e. for $\alpha>0$ and $\beta \in (-1,0]$, we have $\int_{\mathbb{R}^+} \lambda^{\alpha-1} {\textrm{e}}^{-\beta \lambda} \, {\textrm{d}}\lambda = \infty$. The resulting posterior density is proper since, for any $x \in \mathbb{Z}^+$,

\[ \int_{\mathbb{R}^+} \lambda^x {\textrm{e}}^{-\lambda} \nu(\lambda) \, {\textrm{d}}\lambda = \int_{\mathbb{R}^+} \lambda^{x+\alpha-1} {\textrm{e}}^{-\lambda (\beta+1)} \,{\textrm{d}}\lambda < \infty .\]

Under this improper gamma prior, the posterior density is a (proper) gamma density, which is why the priors in this family are called conjugate priors. (Warning: The word ‘conjugate’ is used in two different ways in this paper; one applies to priors and the other to Markov chains.) Again, the associated Markov chain $\Phi^\nu$ is a special case of the Markov chain W, and the corresponding sequence $\{a_m\}_{m=0}^\infty$ is given by

\[ a_m = \int_{\mathbb{R}^+} \lambda^m {\textrm{e}}^{-2\lambda} \nu(\lambda) \, {\textrm{d}}\lambda = \int_{\mathbb{R}^+} \lambda^{m+\alpha-1} {\textrm{e}}^{-(2+\beta) \lambda} \, {\textrm{d}}\lambda = \frac{\Gamma(m+\alpha)}{(2+\beta)^{m+\alpha}} .\]

When $\alpha=1$ and $\beta=0$, we have $a_m = m!/2^{m+1}$, which is precisely the example discussed earlier in this section. It is known that the Markov chain $\Phi^\nu$ is null recurrent when $\beta = 0$ and $\alpha \in (0,1]$, and is transient otherwise. So improper conjugate priors taking the form $\nu(\lambda) = \lambda^{\alpha-1}$, $\alpha \in (0,1]$, are strongly admissible. These priors are improper due to their heavy right tails. When $\alpha = 1$ we recover the so-called flat prior, which is constant on $\mathbb{R}^+$, and when $\alpha \in (0,1)$ the right tail decreases to 0 at a rate dictated by $\alpha$, with smaller values of $\alpha$ leading to a faster decrease. The result concerning the stability of $\Phi^\nu$ was established in [7], which showed that when $\nu$ is conjugate, $\Phi^\nu$ can be represented as a branching process with immigration. The null recurrence/transience results then follow easily from classical theorems in branching process theory. Unfortunately, when $\nu$ is non-conjugate, the branching process representation of $\Phi^\nu$ breaks down, and differentiating between null recurrence and transience is much more difficult. In fact, not much is known about the strong admissibility of improper non-conjugate priors for $\lambda$.

Remark 1.1. Now suppose that, instead of a single observation from the $\mbox{Poisson}(\lambda)$ distribution, we have an independent and identically distributed (i.i.d.) sample of size n, and we want to know if $\nu(\lambda)$ is strongly admissible in this new situation. Since the sum of these Poisson random variables is a sufficient statistic, we can base our inference on the sum, which also has a Poisson distribution. In fact, a straightforward calculation shows that the conjugate Markov chain for this problem is exactly the same as that corresponding to the case of a single observation with prior $\nu(\lambda/n)/n$ . Hence, if the latter chain is recurrent, then $\nu(\lambda)$ is strongly admissible in the i.i.d. sample case. For example, if $\nu(\lambda) = \lambda^{\alpha-1} {\textrm{e}}^{-\beta \lambda}$ , then $\nu(\lambda/n)/n = n^{-1} (\lambda/n)^{\alpha-1} {\textrm{e}}^{-\beta \lambda/n}$ , which is just a slightly different conjugate prior (the factor of $n^{-\alpha}$ plays no role). Since $\beta = 0$ if and only if $\beta/n = 0$ , the conjugate priors that we identified as strongly admissible for the single-observation case remain so for the i.i.d. sample case, which is not surprising.
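Here is a sketch of the calculation behind this reduction. With $S = X_1 + \cdots + X_n \sim \mbox{Poisson}(n\lambda)$, the sequence generating the conjugate chain becomes, after substituting $u = n\lambda$,

\[ a_m = \int_{\mathbb{R}^+} (n\lambda)^m {\textrm{e}}^{-2n\lambda} \nu(\lambda) \, {\textrm{d}}\lambda = \int_{\mathbb{R}^+} u^m {\textrm{e}}^{-2u} \, \frac{\nu(u/n)}{n} \, {\textrm{d}}u , \]

which is exactly the sequence associated with a single observation and the prior $\nu(\lambda/n)/n$.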

Our main contribution is the development of general conditions that can be used to ascertain whether W (characterized by the sequence $\{a_m\}_{m=0}^\infty$ ) is recurrent or transient. In particular, we prove that a sufficient condition for transience of W is

\[ \sum_{n=1}^\infty \frac{(n!)^2}{n^{3/2} a_{2n}} + \sum_{n=1}^\infty \frac{n! (n+1)!}{n^{3/2} a_{2n+1}} < \infty ,\]

and that a sufficient condition for recurrence of W is

\[ \sum_{n=1}^\infty \Bigg[ \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} \Bigg]^{-1} = \infty .\]

These conditions are applicable to all W, but they are most useful in situations where $\sum_{i=0}^\infty b_i = \infty$. In the context of our statistical problem, we show that our sufficient conditions are sharp enough to correctly characterize all of the $\Phi^\nu$s associated with improper conjugate priors, which suggests that they ought to be useful in differentiating between null recurrence and transience when improper non-conjugate priors are used, and we demonstrate that this is indeed the case. Of course, as mentioned above, transience of $\Phi^\nu$ does not tell us anything about $\nu$ (beyond the fact that Eaton’s theory cannot be used to establish the strong admissibility of $\nu$). Therefore, in the context of our statistical problem, the sufficient condition for transience is clearly much less useful than the sufficient condition for recurrence.
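To see the transience condition in action, here is a numerical sketch (assuming NumPy and SciPy; the truncation level is an arbitrary choice) that evaluates its summands, in log space to avoid overflow, for the conjugate-prior sequences $a_m = \Gamma(m+\alpha)/(2+\beta)^{m+\alpha}$ derived above.

```python
import numpy as np
from scipy.special import gammaln

def log_a(m, alpha, beta):
    # log of a_m = Gamma(m + alpha) / (2 + beta)^(m + alpha)
    return gammaln(m + alpha) - (m + alpha) * np.log(2.0 + beta)

def summands(alpha, beta, N=2000):
    n = np.arange(1, N + 1, dtype=float)
    even = 2 * gammaln(n + 1) - 1.5 * np.log(n) - log_a(2 * n, alpha, beta)
    odd = gammaln(n + 1) + gammaln(n + 2) - 1.5 * np.log(n) - log_a(2 * n + 1, alpha, beta)
    return np.exp(even) + np.exp(odd)

print(summands(1.0, 0.0)[[0, 99, 999]])  # decays like c/n: series diverges
print(summands(1.0, -0.5).sum())         # converges: transience
print(summands(2.0, 0.0).sum())          # alpha > 1: converges as well
```

The partial sums converge exactly in the transient cases ($\beta \in (-1,0)$, or $\alpha > 1$), while the summands decay like $c/n$ in the null recurrent cases ($\beta = 0$, $\alpha \in (0,1]$), so the condition fails there, as it must.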

We develop these sufficient conditions for recurrence and transience by leveraging a branch of classical Markov chain theory that is based on connections between reversible Markov chains (on countable state spaces) and electrical networks (see, e.g., [3, 13]). Our work is analogous to that of [8], which developed a sufficient condition for strong admissibility of improper priors for a geometric success probability.

The remainder of this paper is organized as follows. In Section 2, we introduce random walks on networks and explain how they are related to our Markov chain W. Section 3 contains our development of the sufficient condition for transience, which is based on a result from [11]. In Section 4, a result from [12] is employed to develop the sufficient condition for recurrence. We apply our results in Section 5. There it is shown that certain members of the family of improper (non-conjugate) inverse gamma priors are strongly admissible, and that the logarithmic prior $\nu(\lambda) = \log\!(1+\lambda)$ is strongly admissible. Finally, Section 6 contains some closing remarks about our results.

2. Random walks on networks

In this section, we define a weighted random walk on a network, and show that a slightly altered version of the Markov chain W can be represented as such. This representation facilitates our analysis of W because W and the altered version have the same recurrence/transience properties.

A network is a pair $N = [G, c]$ , where G is a simple connected graph with countable vertex set V(G) and edge set E(G), and c is a function with domain E(G) and range $\mathbb{R}^+$ . For $e \in E(G)$ , c(e) is called the conductance of the edge e. If v and w are vertices of G that are connected by an edge, then we write $v \sim w$ and denote the edge connecting v and w by $e_{vw}$ . For $v \in V(G)$ , let $c(v) = \sum_{w: v \sim w}c(e_{vw})$ . A weighted random walk on N is a Markov chain $S = \{S_n\}_{n=0}^\infty$ with state space V(G) whose transition probabilities are given by

\[ \mathbb{P}(S_{n+1} = w\mid S_n = v) = \begin{cases} c(e_{vw})/c(v) & \mbox{if } v \sim w , \\[5pt] 0 & \mbox{otherwise}.\end{cases}\]

In words, if the chain is currently at the vertex v, then its next move is to one of the vertices that share an edge with v according to probabilities that are proportional to the conductances of those edges. Since

\[ \mathbb{P}(S_{n+1} = w\mid S_n = v) c(v) = c(e_{vw}) = \mathbb{P}(S_{n+1} = v\mid S_n = w) c(w)\]

for all $(v,w) \in V(G) \times V(G)$ , the chain S is reversible with respect to the sequence $\{c(v)\}_{v \in V(G)}$ .

The graph G is simple, so it has no self-loops. Hence, the Markov chain S cannot make transitions from a vertex in G back to the same vertex. The Markov chain W, however, can make transitions from any point in $\mathbb{Z}^+$ back to the same point. It follows that W cannot be represented exactly as a weighted random walk on a network. This is why we must consider a slightly altered version of W, which we now describe. Let H be the graph with vertex set $\mathbb{Z}^+$ and an edge joining any two distinct vertices. (We are using H instead of G here because we wish to preserve the generality of the network $N = [G, c]$.) Let i and j be any two distinct points in $\mathbb{Z}^+$ and define the conductance as $d(e_{ij}) = p_{ij} b_i = {a_{i+j}}/({i! j!})$. Now let $T = \{T_n\}_{n=0}^\infty$ be the weighted random walk on the network $M=[H,d]$, which has transition probabilities given by

\[ \mathbb{P}(T_{n+1} = j\mid T_n = i) = \frac{d(e_{ij})}{d(i)} = \frac{p_{ij} b_i}{\sum_{j \ne i} p_{ij} b_i} = \frac{p_{ij}}{1-p_{ii}}\]

for all $i \ne j$. These are also the transition probabilities of the Markov chain $\tilde{W} = \{\tilde{W}_n\}_{n=0}^\infty$ obtained from the chain W by removing repeated values, and, moreover, W is recurrent if and only if $\tilde{W}$ is recurrent (see [8, Section 2] for details). Therefore, T is recurrent if and only if W is recurrent, so we can study the stability of W indirectly by studying T.
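As a small illustration (a sketch assuming NumPy and SciPy; the truncation level is arbitrary), the passage from W to T amounts to zeroing the diagonal of the kernel and renormalizing rows. Below this is done for the running example, whose kernel has the negative binomial form $p_{ij} = \binom{i+j}{j}2^{-(i+j+1)}$ noted earlier.

```python
import numpy as np
from scipy.stats import nbinom

K = 40
# kernel of W for a_m = m!/2^(m+1): p_ij is the NegBin(i+1, 1/2) pmf at j
P = np.array([[nbinom.pmf(j, i + 1, 0.5) for j in range(K)] for i in range(K)])
Q = P.copy()
np.fill_diagonal(Q, 0.0)
Q /= Q.sum(axis=1, keepdims=True)   # transitions of T: p_ij / (1 - p_ii) for j != i
print(Q[3, :6])                     # rows near the truncation boundary are distorted
```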

3. A condition for transience of W

In this section, we develop a sufficient condition for the transience of W by employing a result from [11]. Consider again our generic network $N = [G,c]$ from the previous section. If $a \in V(G)$, a flow from a to $\infty$ is a real-valued function $\theta$ defined on $V(G) \times V(G)$ such that $\theta(v,w) = 0$ unless $v \sim w$, $\theta(v,w) = -\theta(w,v)$ for all $v, w \in V(G)$, and $\sum_{w \in V(G)} \theta(v,w) = 0$ if $v \neq a$. The flow is called a unit flow if $\sum_{w \in V(G)} \theta(a,w) = 1$. The energy of the flow is defined by

\[ \mathcal{E}(\theta) = \frac{1}{2} \sum_{(v,w): v \sim w}\frac{\theta^2(v,w)}{c(e_{vw})}.\]

Theorem 3.1. (Lyons [11].) The weighted random walk on the network $N = [G, c]$ is transient if and only if, for some $a \in V(G)$, there exists a unit flow from a to $\infty$ having finite energy.

In our application of this result, we will be concerned with the particular network $M = [H,d]$ defined in the previous section. We now describe a novel technique for converting certain partitions of $\mathbb{Z}^+$ into flows from 0 to $\infty$ . Let $\{B_k\}_{k=0}^\infty$ denote a partition of $\mathbb{Z}^+$ where $B_0= \{0\}$ . We assume without loss of generality that all sets in the partition are non-empty. We assume further that the partition is ‘monotone’ in the sense that if $i \in B_k$ and $j \in B_\ell$ with $k < \ell$ , then $i < j$ . Now define a function $\theta\colon \mathbb{Z}^+\times \mathbb{Z}^+ \rightarrow \mathbb{R}$ as follows:

\[ \theta(i,j) = \begin{cases} \dfrac{|B_1|}{|B_k| |B_{k+1}|} & \mbox{if } i \in B_k, \, j \in B_{k+1} , \\[9pt] \dfrac{-|B_1|}{|B_{k-1}| |B_{k}|} & \mbox{if } i \in B_k,\, j \in B_{k-1} , \\[9pt] 0 & \mbox{otherwise}. \end{cases}\]

We claim that $\theta$ is a flow. The anti-symmetry of $\theta$ , i.e. $\theta(i,j) = -\theta(j,i)$ , follows immediately by construction. Now suppose that $k \ge 1$ and that $i \in B_k$ . Then we have

\[ \sum_{j=0}^\infty \theta(i,j) = \sum_{j \in B_{k-1}} \theta(i,j) + \sum_{j \in B_{k+1}} \theta(i,j) = \frac{-|B_1|}{|B_{k-1}| |B_{k}|} |B_{k-1}| + \frac{|B_1|}{|B_{k}| |B_{k+1}|} |B_{k+1}| = 0 .\]

Hence, $\theta$ is a flow. Further,

\[ \sum_{j=0}^\infty \theta(0,j) = \sum_{j \in B_1} \theta(0,j) = \frac{|B_1|}{|B_0| |B_1|} |B_1| = |B_1| > 0 .\]

Therefore, we can make the flow a unit flow from 0 to $\infty$ by choosing $B_1 = \{1\}$. We call any flow constructed using the above technique a partition flow. A numerical sanity check of this construction is sketched below, after which we state our main result regarding the transience of the Markov chain W.
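The following sketch (plain Python; the truncation levels are arbitrary) verifies, for the specific partition used in the proof of Proposition 3.1 below ($B_0 = \{0\}$, $B_1 = \{1\}$, and $B_k = \{(k-1)^2+1,\ldots,k^2\}$ for $k \ge 2$), that $\theta$ is antisymmetric, has zero divergence away from 0, and sends unit flow out of 0.

```python
def block_index(i):
    # k such that i lies in B_k, with B_0={0}, B_1={1}, B_k={(k-1)^2+1,...,k^2}
    if i <= 1:
        return i
    k = 2
    while i > k * k:
        k += 1
    return k

def size(k):
    return 1 if k <= 1 else 2 * k - 1   # |B_k| = 2k - 1 for k >= 1

def theta(i, j):
    ki, kj = block_index(i), block_index(j)
    if kj == ki + 1:
        return 1.0 / (size(ki) * size(kj))    # |B_1| = 1 in the numerator
    if kj == ki - 1:
        return -1.0 / (size(ki) * size(kj))
    return 0.0

M = 250
print(sum(theta(0, j) for j in range(M)))   # 1.0: unit flow out of 0
print(max(abs(sum(theta(i, j) for j in range(M))) for i in range(1, 100)))  # ~ 0
print(theta(5, 10) == -theta(10, 5))        # antisymmetry
```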

Proposition 3.1. The Markov chain W (characterized by the sequence $\{a_m\}_{m=0}^\infty$ ) is transient if

(3.1) \begin{equation} \sum_{n=1}^\infty \frac{(n!)^2}{n^{3/2} a_{2n}} + \sum_{n=1}^\infty \frac{n! (n+1)!}{n^{3/2} a_{2n+1}} < \infty . \end{equation}

Proof. Let $\theta$ denote the unit partition flow from 0 to $\infty$ based on the following partition: $B_0 = \{0\}$ , $B_1 = \{1\}$ , and $B_k = \{(k-1)^2+1,\ldots,k^2\}$ for $k \ge 2$ . Note that $|B_k| = 2k-1$ for $k \ge 1$ . We will show that (3.1) implies that the energy of this flow is finite, which in turn implies transience by Theorem 3.1. We can express the energy of our flow as follows:

\begin{align*} \mathcal{E}(\theta) = \frac{1}{2}\sum_{i=0}^{\infty}\sum_{j = 0}^{\infty}\frac{i! j! \theta(i,j)^2}{a_{i+j}} & = \frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{a_n}\sum_{i=0}^n i! (n-i)! \theta(i,n-i)^2 \\[5pt] & = \sum_{n=1}^{\infty}\frac{1}{a_n}\sum_{i=0}^{\lfloor n/2 \rfloor} i! (n-i)! \theta(i,n-i)^2, \end{align*}

where the last step uses the anti-symmetry of $\theta$ , and $\lfloor \cdot \rfloor$ denotes the floor of the argument. Now fix $n \ge 50$ and fix a non-negative integer $i < n/2$ . Suppose that $i \in B_k$ . Then $\theta(i,n-i)^2 \ne 0$ if and only if $n-i \in B_{k+1}$ . It follows from the definition of $B_k$ and $B_{k+1}$ that, in such a case,

\[ n = i + (n-i) \le k^2 + (k+1)^2 = 2k^2 + 2k + 1 . \]

Continuing, since $i \in B_k$ , we have $i \ge k^2 - 2k + 2$ and $\sqrt{i} > k-1$ . Hence,

\[ n \le 2k^2 -4k + 4 + 6(k-1) + 3 < 2i + 6\sqrt{i} + 3 < 2i + 6\sqrt{n/2} + 3 < 2i + 6\sqrt{n} . \]

It follows that $\theta(i,n-i) \ne 0 \Rightarrow i > {n}/{2} - 3 \sqrt{n}$ . We see that $n-i \in B_{k+1}$ implies that $n < 2(k+1)^2$ . Thus, $50 \le n < 2(k+1)^2$ , which implies that $k>4$ and

\[ \theta(i,n-i) = \frac{1}{|B_k| |B_{k+1}|} = \frac{1}{(2k-1)(2k+1)} \le \frac{1}{2(k+1)^2} < \frac{1}{n} . \]

Using the facts just established, we have, for $n \ge 50$ ,

(3.2) \begin{align} \sum_{i=0}^{\lfloor n/2 \rfloor} i! (n-i)! \theta(i,n-i)^2 & \le \frac{1}{n^2} \sum_{i = \lfloor n/2 - 3\sqrt{n}\rfloor }^{\lfloor n/2 \rfloor} i! (n-i)! \nonumber \\[5pt] & \le \frac{1}{n^2} \sum_{i = \lfloor n/2 - 3\sqrt{n}\rfloor}^{\lfloor n/2 \rfloor} \lfloor n/2 - 3\sqrt{n}\rfloor! \big( n - \lfloor n/2 - 3\sqrt{n}\rfloor \big)! \nonumber \\[5pt] & \le \frac{4 \sqrt{n}}{n^2} \lfloor n/2 - 3\sqrt{n}\rfloor! \big( n - \lfloor n/2 - 3\sqrt{n}\rfloor \big)! . \end{align}

The penultimate inequality follows from the fact that $i! (n-i)!$ is a decreasing function of i for $0 \le i \le \lfloor n/2 \rfloor$ . We now look to bound the product of factorials in the last line of (3.2).

Let $h_n = \lfloor n/2 - 3\sqrt{n} \rfloor/n \in (0,1)$ . Then

(3.3) \begin{equation} \frac{n/2 - 3\sqrt{n} - 1}{n} \leq h_n \leq \frac{n/2 - 3\sqrt{n}}{n}. \end{equation}

It follows that

(3.4) \begin{equation} 4 h_n (1-h_n) \leq \bigg( 1 - \frac{6}{\sqrt{n}\,} \bigg) \bigg( 1 + \frac{6}{\sqrt{n}\,} + \frac{2}{n} \bigg) \leq 1 - \frac{36}{n} + \frac{2}{n} \leq 1 - \frac{34}{n} . \end{equation}

Recall Stirling’s approximation:

\[ \sqrt{2\pi n}\bigg(\frac{n}{{\textrm{e}}}\bigg)^n < n! < \sqrt{2\pi n{\textrm{e}}^2}\bigg(\frac{n}{{\textrm{e}}}\bigg)^n . \]

Now fix n even. Using Stirling’s approximation in conjunction with (3.3) and (3.4), we have

(3.5) \begin{align} \frac{\lfloor n/2 - 3\sqrt{n}\rfloor! \big( n - \lfloor n/2 - 3\sqrt{n}\rfloor \big)!}{(n/2)! \, (n/2)!} & \leq 2^{n+1}{\textrm{e}}^2 \sqrt{h_n (1-h_n)} h_n^{n h_n} (1-h_n)^{n(1-h_n)} \nonumber \\[5pt] & \leq 2^{n+1}{\textrm{e}}^2 \sqrt{h_n (1-h_n)} h_n^{n/2 - 3\sqrt{n} - 1} (1-h_n)^{n/2 + 3\sqrt{n}} \nonumber \\[5pt] & = 2{\textrm{e}}^2\sqrt{\frac{1-h_n}{h_n}}\big(4h_n(1-h_n)\big)^{n/2}\bigg(\frac{1-h_n}{h_n}\bigg)^{3\sqrt{n}} \nonumber \\[5pt] & \leq 2{\textrm{e}}^2 \sqrt{\frac{1-h_n}{h_n}} \bigg( 1 - \frac{34}{n} \bigg)^{n/2} \bigg(\frac{1 + {6}/{\sqrt{n}} + {2}/{n}}{1 - {6}/{\sqrt{n}} - {2}/{n}} \bigg)^{3 \sqrt{n}} . \end{align}

Now recall that if $\{u_n\}$ and $\{v_n\}$ are sequences of real numbers such that $u_n \rightarrow \infty$ and $v_n \rightarrow v$ for some $v \in \mathbb{R}$ , then

\[ \bigg( 1 + \frac{v_n}{u_n} \bigg)^{u_n} \rightarrow {\textrm{e}}^v \]

as $n \rightarrow \infty$ . Hence, $(1-34/n)^{n/2} \rightarrow {\textrm{e}}^{-17}$ and

\[ \bigg(\frac{1 + {6}/{\sqrt{n}} + {2}/{n}}{1 - {6}/{\sqrt{n}} - {2}/{n}}\bigg)^{3\sqrt{n}} \rightarrow {\textrm{e}}^{36}. \]

Combining this with the fact that $h_n \rightarrow \frac12$ , we have, for large enough even n,

\[ \frac{\lfloor n/2 - 3\sqrt{n}\rfloor! \big( n - \lfloor n/2 - 3\sqrt{n}\rfloor \big)!}{(n/2)! (n/2)!} \le 4{\textrm{e}}^{21}. \]

It then follows from (3.2) that, for large enough even n,

(3.6) \begin{align} \sum_{i=0}^{\lfloor n/2 \rfloor} i! (n-i)!\theta(i,n-i)^2 \le \frac{16{\textrm{e}}^{21}}{n^{3/2}} (n/2)! (n/2)! . \end{align}

Using Stirling’s approximation again, we have, for large enough odd n,

(3.7) \begin{align} \frac{1}{((n-1)/2)! ((n+1)/2)!} & \leq \frac{2^n{\textrm{e}}^n}{\pi\sqrt{n^2-1}\,}(n-1)^{-(n-1)/2}(n+1)^{-(n+1)/2} \nonumber \\[5pt] & = \frac{2^n{\textrm{e}}^n n^{-n}}{\pi\sqrt{n^2-1}\,} \sqrt{\frac{n-1}{n+1}} \bigg( 1 - \frac{1}{n^2} \bigg)^{-n/2} \nonumber \\[5pt] & \leq 2\frac{2^n{\textrm{e}}^n n^{-n}}{\pi n}. \end{align}

The last expression in (3.7) is precisely twice the Stirling-based upper bound for

\[ \frac{1}{(n/2)! (n/2)!} \]

for n even that we used to get (3.5). Therefore, proceeding exactly as in the n even case, we find that, for large enough odd n,

(3.8) \begin{equation} \sum_{i=0}^{\lfloor n/2 \rfloor} i! (n-i)! \theta(i,n-i)^2 \le \frac{32{\textrm{e}}^{21}}{n^{3/2}} ((n-1)/2)! ((n+1)/2)!. \end{equation}

Combining (3.6) and (3.8), it is clear that $\mathcal{E}(\theta) < \infty$ if

\[ \sum_{n=1}^\infty \frac{(n!)^2}{n^{3/2} a_{2n}} + \sum_{n=1}^\infty \frac{n! (n+1)!}{n^{3/2} a_{2n+1}} < \infty , \]

which completes the proof.

Here is an easy extension of Proposition 3.1.

Corollary 3.1. Let $W$ and $W'$ be Markov chains defined by the sequences $\{a_m\}_{m=0}^\infty$ and $\{a'_{\!\!m}\}_{m=0}^\infty$, respectively. Suppose that $\{a_m\}_{m=0}^\infty$ satisfies (3.1), which implies that $W$ is transient. If there exists a $C>0$ such that $a'_{\!\!m} \ge C a_m$ for all $m \in \mathbb{Z}^+$, then $W'$ is also transient.

Recall that the improper conjugate prior for the Poisson parameter $\lambda$ takes the form $\nu(\lambda) = \lambda^{\alpha-1} {\textrm{e}}^{-\beta \lambda}$ for $\alpha>0$ and $\beta \in (-1,0]$. Again, [7] used a highly specialized branching process argument to show that the corresponding Markov chain, $\Phi^\nu$, is null recurrent when $\beta = 0$ and $\alpha \in (0,1]$, and is transient otherwise. We now demonstrate that Proposition 3.1 can be used to reproduce the transience part of this result. We consider two different cases that lead to transience:

I. $\alpha \in (0,1]$ and $\beta \in (-1,0)$;

II. $\alpha>1$ and $\beta \in (-1,0]$.

As shown in Section 1, $\Phi^\nu$ is a special case of the chain W generated by the sequence $\{a_m\}_{m=0}^\infty$ given by

\[ a_m = \frac{\Gamma(m+\alpha)}{(2+\beta)^{m+\alpha}} .\]

We begin with case I. Results in [15] imply that, for every $s > 0$,

(3.9) \begin{equation} \lim_{x \rightarrow \infty} \frac{\Gamma(x+s)}{x^s \Gamma(x)} = 1 .\end{equation}

It follows that there exists $N = N(\alpha)$ such that

\[ \frac{1}{\Gamma(n + \alpha)} \leq \frac{2}{n^{\alpha} (n-1)!}\]

for all $n > N$ . Hence, for $n > N$ ,

\[ \frac{(n!)^2}{n^{3/2} a_{2n}} = \frac{(n!)^2(2+\beta)^{2n+\alpha}}{n^{3/2} \Gamma(2n+\alpha)} \le \frac{2 (2+\beta)^{2n+\alpha}(n!)^2}{n^{3/2} (2n)^{\alpha} (2n-1)!} = \frac{2^{2-\alpha} (2+\beta)^{2n+\alpha}(n!)^2}{n^{1/2} n^{\alpha} (2n)!} .\]

From the inequality

(3.10) \begin{equation} {\binom{2n}{n}}^{-1} \leq \frac{{\textrm{e}} \sqrt{\pi n}}{2^{2n}} ,\end{equation}

it follows that, for $n>N$ ,

\[ \frac{(n!)^2}{n^{3/2} a_{2n}} \le \frac{{\textrm{e}} \sqrt{\pi} 2^{2-\alpha} (2+\beta)^{2n+\alpha}}{2^{2n} n^{\alpha}} .\]

Note that $r \;:\!=\; [ (2+\beta)/2 ]^2 \in (0,1)$ , and hence $\sum_{n=1}^\infty{r^n}/{n^{\alpha}} < \infty $ , which implies that

\[ \sum_{n=1}^\infty \frac{(n!)^2}{n^{3/2} a_{2n}} < \infty .\]

A very similar argument shows that the second summation in (3.1) is also finite. This takes care of case I. We now consider case II in which $\alpha>1$ and $\beta \in (-1,0]$ . According to (3.9), for any $\alpha>1$ there exists $N= N(\alpha)$ such that, for all $n > N$ ,

\[ \frac{1}{\Gamma(n + \alpha)} \leq \frac{2}{n^{\alpha - \lfloor \alpha \rfloor} \Gamma(n + \lfloor \alpha \rfloor)} = \frac{2}{n^{\alpha - \lfloor \alpha \rfloor} (n + \lfloor \alpha \rfloor - 1)!} \leq \frac{2}{n^{\alpha-1} n!} .\]

Hence, for $n > N$ ,

\[ \frac{(n!)^2}{n^{3/2} a_{2n}} \le \frac{2 (2+\beta)^{2n+\alpha}(n!)^2}{n^{3/2} (2n)^{\alpha-1} (2n)!} \le \frac{2^{2n + \alpha + 1}(n!)^2}{n^{3/2} (2n)^{\alpha-1} (2n)!} .\]

Applying (3.10), it follows that, for $n>N$ ,

\[ \frac{(n!)^2}{n^{3/2} a_{2n}} \le \frac{4 {\textrm{e}} \sqrt{\pi}}{n^{\alpha}},\]

and since $\alpha>1$ , it follows immediately that

\[ \sum_{n=1}^\infty \frac{(n!)^2}{n^{3/2} a_{2n}} < \infty .\]

A very similar argument shows that the second summation in (3.1) is also finite. This takes care of case II. Therefore, as claimed, Proposition 3.1 is sharp enough to identify all of the transient versions of $\Phi^\nu$ when $\nu$ is an improper conjugate prior.
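The two auxiliary facts used in these calculations are easy to check numerically. Here is a quick sketch (assuming NumPy and SciPy; the test values are arbitrary) of Wendel's limit (3.9) and the central binomial bound (3.10).

```python
import numpy as np
from scipy.special import gammaln

x = np.array([10.0, 100.0, 1000.0, 10000.0])
s = 0.7
print(np.exp(gammaln(x + s) - s * np.log(x) - gammaln(x)))    # -> 1, as in (3.9)

n = np.arange(1, 20, dtype=float)
log_lhs = -(gammaln(2 * n + 1) - 2 * gammaln(n + 1))          # log of 1/C(2n, n)
log_rhs = 1.0 + 0.5 * np.log(np.pi * n) - 2 * n * np.log(2.0) # log of e sqrt(pi n)/2^(2n)
print(bool((log_lhs <= log_rhs).all()))                        # True: (3.10) holds
```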

Remark 3.1. The unit flow from 0 to $\infty$ that [8] used to prove their transience result is actually a partition flow based on the partition in which $B_0 = \{0\}$, $B_1 = \{1\}$, and $B_k = \{2^{k-1},\ldots,2^k-1\}$ for $k \ge 2$. We have a proof (not presented herein) that this flow cannot work in our case. In particular, it can be shown that it is impossible to use the flow from [8] in conjunction with Theorem 3.1 to produce a condition for transience of W that reproduces the results of [7] for conjugate priors.

4. A condition for recurrence of W

In this section, we develop a sufficient condition for the recurrence of W by applying a result from [12] to the Markov chain T. In order to state the theorem from [12], we must introduce a few new concepts. Recall our generic graph G and consider forming a new graph by subdividing an edge of G. That is, we add vertices $u_1, \ldots, u_{n-1}$ to G and then replace an edge e in G between the vertices v and w with edges $e_1, \ldots, e_n$, where $e_1$ connects v to $u_1$, $e_k$ connects $u_{k-1}$ to $u_k$ for $2 \leq k \leq n-1$, and $e_n$ connects $u_{n-1}$ to w. A network $\tilde{N} = [\tilde{G}, \tilde{c}]$ is said to be a refinement of the network $N = [G, c]$ if the graph $\tilde{G}$ can be obtained by subdividing some of the edges of G and if, whenever $e \in E(G)$ is replaced by edges $e_1, \ldots, e_n \in E(\tilde{G})$,

(4.1) \begin{equation} \sum_{i=1}^n \tilde{c}(e_i)^{-1} = c(e)^{-1} .\end{equation}

Let $\mathcal{U} = \{U_n\}_{n=0}^{\infty}$ be a partition of V(G) such that, whenever $|m-n| \geq 2$ , there is no edge connecting a vertex in $U_m$ and a vertex in $U_n$ . We call such a partition an N-constriction. Let $\tau_a^N(U_n)$ denote the probability that the weighted random walk on N starting at a eventually reaches a vertex in the set $U_n$ . Let $E_n$ be the set of edges connecting a vertex in $U_{n-1}$ to a vertex in $U_{n}$ .

Theorem 4.1. (McGuinness [12].) Let $N = [G, c]$ be a network and let $a \in V(G)$. Then the weighted random walk on N is recurrent if and only if there exists a refinement $\tilde{N} = [\tilde{G}, \tilde{c}]$ of N having an $\tilde{N}$-constriction $\mathcal{U} = \{U_n\}_{n=0}^{\infty}$ such that $a \in U_0$, $\tau_a^{\tilde{N}}(U_n) = 1$ for all $n \in \{1,2,\dots\}$, and $\sum_{n=1}^{\infty} \big( \sum_{e \in E_n} \tilde{c}(e) \big)^{-1} = \infty$.

Here is our result concerning the recurrence of W.

Proposition 4.1. The Markov chain W (characterized by the sequence $\{a_m\}_{m=0}^\infty$ ) is recurrent if

(4.2) \begin{equation} \sum_{n=1}^\infty \Bigg[ \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} \Bigg]^{-1} = \infty . \end{equation}

Proof. We begin by describing a refinement of $M = [H,d]$ , call it $\tilde{M}= [\tilde{H},\tilde{d}]$ . For all $i,j \in \mathbb{Z}^+$ such that $i+1<j$ , we add vertices $v_{ij}^n$ for $n=i+1,\dots,j-1$ . The edge $e_{ij}$ is replaced by $e^n_{ij}$ for $n=i+1,\dots,j$ , where $e_{ij}^{i+1}$ connects i to $v_{ij}^{i+1}$ , $e_{ij}^j$ connects $v_{ij}^{j-1}$ to j, and $e_{ij}^n$ connects $v_{ij}^{n-1}$ to $v_{ij}^n$ for $n=i+2,\dots,j-1$ . For all $i,j \in \mathbb{Z}^+$ such that $i+1=j$ , we add no new vertices to H, but $e_{ij}$ is renamed $e^j_{ij}$ . The new conductance is defined as

\[\tilde{d}(e^n_{ij}) = \frac{a_{i+j}(j-i)}{i!j!}\]

for every $i,n,j \in \mathbb{Z}_+$ with $i < n \leq j$ . It follows that, for every $i,j \in \mathbb{Z}_+$ with $i < j$ ,

\[ \sum_{n=i+1}^j \tilde{d}(e^n_{ij})^{-1} = \frac{i!j!}{a_{i+j}(j-i)} \sum_{n=i+1}^j 1 = \frac{i!j!}{a_{i+j}} = d(e_{ij})^{-1} .\]

Thus, (4.1) is satisfied. (We note that this refinement is similar to that used in [8, Section 3], but our conductances are not the same as those used in [8].) Now let $U_0 = \{0\}$ and, for $n \in \{1,2,\dots\}$, let $U_n = \{n\}\cup \{v_{ij}^n \colon i < n < j \}$. It follows from the definition of $\tilde{H}$ that every edge in $E(\tilde{H})$ with one end in $U_n$ has its other end in $U_{n-1}$ or $U_{n+1}$. Therefore, $\mathcal{U} = \{U_n\}_{n=0}^{\infty}$ is an $\tilde{M}$-constriction. Moreover, $0\in U_0$, and a straightforward argument in [8, p. 1220] applies directly to our situation and shows that $\tau_0^{\tilde{M}}(U_n) = 1$ for all $n \in \{1,2,\dots\}$. Now, for every $n \geq 1$ we have

\begin{equation*} \sum_{e \in E_n} \tilde{d}(e) = \sum_{i=0}^{n-1} \sum_{j=n}^\infty \tilde{d}(e^n_{ij}) = \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} .\end{equation*}

The result now follows immediately from Theorem 4.1.

Remark 4.1. Recall that W is positive recurrent if $\sum_{i=0}^\infty b_i < \infty$, and is null recurrent or transient otherwise. Proposition 4.1 is applicable regardless of the value of $\sum_{i=0}^\infty b_i$, but its real value resides in cases where this sum diverges.

Here is the analogue of Corollary 3.1.

Corollary 4.1. Let $W$ and $W'$ be Markov chains defined by the sequences $\{a_m\}_{m=0}^\infty$ and $\{a'_{\!\!m}\}_{m=0}^\infty$, respectively. Suppose that $\{a_m\}_{m=0}^\infty$ satisfies (4.2), which implies that $W$ is recurrent. If there exists a $C>0$ such that $a'_{\!\!m} \le C a_m$ for all $m \in \mathbb{Z}^+$, then $W'$ is also recurrent.

Remark 4.2. Corollary 4.1 can be viewed as a generalization of a result in [6] that holds in the context of the Poisson problem. Indeed, let $\nu(\lambda)$ be a prior such that the corresponding $\Phi^\nu$ is (null) recurrent. Suppose that $\nu'(\lambda)$ is another prior for which $\nu'(\lambda) = g(\lambda) \nu(\lambda)$, where $g\colon \mathbb{R}^+\rightarrow \mathbb{R}^+$ is bounded. Then, together, [6, Theorems 4 and 8] imply that $\Phi^{\nu'}$ is also (null) recurrent. Here is the connection with Corollary 4.1. If $\nu'(\lambda) = g(\lambda) \nu(\lambda)$ where $g \le C$, then $\nu'(\lambda) \le C\nu(\lambda)$ and

\[ a'_{\!\!m} = \int_{\mathbb{R}^+} \lambda^m {\textrm{e}}^{-2\lambda} \nu'(\lambda) \, {\textrm{d}}\lambda \le C \int_{\mathbb{R}^+} \lambda^m {\textrm{e}}^{-2\lambda} \nu(\lambda) \, {\textrm{d}}\lambda = C a_m ,\]

which is precisely the condition in Corollary 4.1. Of course, Corollary 4.1 is more general. Firstly, even if $\nu'/\nu$ is unbounded, it is still possible that $a'_{\!\!m} \le Ca_m$ for all m. Secondly, Corollary 4.1 holds for general sequences, $a'_{\!\!m}$ and $a_m$ , not only those associated with the Poisson problem.

We now demonstrate that the results in this section can be used to show that the Markov chain $\Phi^\nu$ corresponding to the improper conjugate prior with $\beta=0$ and $\alpha \in (0,1]$ is null recurrent. Again, the sequence $\{a_m\}_{m=0}^\infty$ associated with this conjugate prior is given by $a_m = {\Gamma(m+\alpha)}/{2^{m+\alpha}}$. Because $\Gamma(x)$ is an increasing function for $x \ge 2$, it follows that, for any $m \ge 2$ and any $\alpha \in (0,1)$,

\[ \frac{\Gamma(m+\alpha)}{2^{m+\alpha}} < 2^{1-\alpha} \frac{\Gamma(m+1)}{2^{m+1}} .\]

Consequently, if we could use Proposition 4.1 to prove recurrence when $\beta=0$ and $\alpha = 1$ , then it would follow immediately by Corollary 4.1 that we also have recurrence when $\beta=0$ and $\alpha \in (0,1)$ . This is our plan. Assume now that $\beta=0$ and $\alpha = 1$ so that $a_m = m!/2^{m+1}$ . When we write $Z \sim \mbox{NB}(r,p)$ , we mean that the random variable Z has a negative binomial distribution with parameters $r\in \{1,2,\dots\}$ and $p \in (0,1)$ , and

\[ \mathbb{P}(Z=z) = \binom{r+z-1}{z} p^r (1-p)^z \textbf{1}_{\mathbb{Z}^+}(z) .\]

Recall that $\mathbb{E}(Z) = r(1-p)/p$ . We have

\begin{align*} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} & = \sum_{j=n}^\infty\frac{a_{i+j}j}{i!j!} - \sum_{j=n}^\infty \frac{a_{i+j}i}{i!j!} \\[8pt] & = \sum_{j=n}^\infty \frac{j (i+j)!}{i!j!2^{i+j+1}} - i \sum_{j=n}^\infty \frac{(i+j)!}{i!j!2^{i+j+1}} \\[8pt] & = (i+1) \sum_{k=n-1}^\infty \frac{(i+k+1)!}{(i+1)!k!2^{i+k+2}} - i \sum_{j=n}^\infty \frac{(i+j)!}{i!j!2^{i+j+1}} \\[8pt] & = (i+1)\sum_{k=n-1}^\infty \binom{i+k+1}{k} 2^{-i-k-2} - i \sum_{j=n}^\infty \binom{i+j}{j} 2^{-i-j-1} \\[8pt] & = (i+1)\mathbb{P}(Z_{i+2} \ge n-1) - i \mathbb{P}(Z_{i+1} \ge n) ,\end{align*}

where $Z_i \sim \mbox{NB}\big(i,\frac12\big)$ for $i \in \{1,2,\dots\}$ . Now let $U \sim \mbox{NB}\big(i+1,\frac12\big)$ , $V \sim \mbox{NB}\big(1,\frac12\big)$ , and assume U and V are independent. Then $U + V \sim \mbox{NB}\big(i+2,\frac12\big)$ . For every $n \ge 2$ and $0 \le i \le n-1$ , we have, by Markov’s inequality,

(4.3) \begin{equation} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} \le \frac{i+2}{n-1} + i ( \mathbb{P}(U + V \geq n-1) - \mathbb{P}(U \geq n) ) .\end{equation}

Since $U \geq n$ implies that $U + V \geq n-1$ , it follows by the independence of U and V that

\begin{align*} \mathbb{P}(U+V \geq n-1) - \mathbb{P}(U \geq n) & = \mathbb{P}(U+V \geq n-1, U < n) \\[8pt] & = \mathbb{P} \big( \uplus_{k=0}^{n-1} \{V \geq n-1-k, U = k\} \big) \\[8pt] & = \sum_{k=0}^{n-1} \mathbb{P}(U = k) \mathbb{P}(V \geq n-1-k) \\[8pt] & = \sum_{k=0}^{n-1} {\binom{i+k}{k}} 2^{-i-k-1} 2^{-(n-k-1)} \\[8pt] & = \sum_{k=0}^{n-1} {\binom{i+k}{k}} 2^{-i-n} \\[8pt] & = {\binom{i+n}{n-1}} 2^{-i-n} .\end{align*}

The last step follows from a repeated application of the fact that ${\binom{n}{r}} + {\binom{n}{r-1}} = {\binom{n+1}{r}}$ (Pascal’s identity), and noting that ${\binom{i}{0}} = {\binom{i+1}{0}}$ . Combining this with (4.3) we have

\begin{align*} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} & \le \frac{i+2}{n-1} + i {\binom{i+n}{n-1}} 2^{-i-n} \\[5pt] & = \frac{i+2}{n-1} + n \frac{i}{i+1} {\binom{i+n}{n}} 2^{-i-n} \\[5pt] & \le \frac{i+2}{n-1} + 2n {\binom{i+n}{n}} 2^{-i-n-1} \\[5pt] & = \frac{i+2}{n-1} + 2n \mathbb{P}(U' = i) ,\end{align*}

where $U' \sim \mbox{NB}\big(n+1,\frac12\big)$ . Hence, for any $n \ge 2$ ,

\begin{align*} \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} & \le \sum_{i=0}^{n-1} \bigg( \frac{i+2}{n-1} + 2n \mathbb{P}(U' = i) \bigg) \\[5pt] & \le \frac{(n+1)(n+2)-2}{2(n-1)} + 2n \\[5pt] & \le 2(n+2) + 2n .\end{align*}

Finally, we have

\[\sum_{n=2}^\infty \Bigg[ \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{a_{i+j}(j-i)}{i!j!} \Bigg]^{-1} \ge \sum_{n=2}^\infty\frac{1}{4n+4} = \infty ,\]

which implies that the Markov chain is (null) recurrent by Proposition 4.1. Therefore, as claimed, our results are sharp enough to identify all of the null recurrent versions of $\Phi^\nu$ when $\nu$ is an improper conjugate prior.
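The negative binomial identity at the heart of this argument is easy to confirm numerically. The sketch below (assuming SciPy; the test values are arbitrary) evaluates the inner double sum of (4.2) for $a_m = m!/2^{m+1}$ via the identity $\sum_{j\ge n} a_{i+j}(j-i)/(i!j!) = (i+1)\mathbb{P}(Z_{i+2} \ge n-1) - i\,\mathbb{P}(Z_{i+1} \ge n)$ and checks it against the bound $4n+4$ derived above.

```python
from scipy.stats import nbinom

def inner(n):
    # sum_{i<n} sum_{j>=n} a_{i+j}(j-i)/(i! j!) for a_m = m!/2^(m+1),
    # using the negative binomial identity; sf(k) = P(Z > k) = P(Z >= k+1)
    return sum((i + 1) * nbinom.sf(n - 2, i + 2, 0.5)
               - i * nbinom.sf(n - 1, i + 1, 0.5) for i in range(n))

for n in [2, 10, 50, 200]:
    print(n, round(inner(n), 3), "<=", 4 * n + 4)
```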

5. Examples

5.1. Improper inverse gamma priors

Consider another family of improper priors for $\lambda$ that lead to proper posteriors. Take $\nu(\lambda) = \lambda^{\gamma-1}{\textrm{e}}^{-\theta/ \lambda}$ for $\gamma \ge 0$ and $\theta > 0$ . This is an improper inverse gamma density, i.e. for $\gamma \ge 0$ and $\theta > 0$ we have $\int_{\mathbb{R}^+} \lambda^{\gamma-1} {\textrm{e}}^{-\theta/\lambda} \,{\textrm{d}}\lambda = \infty$ . The resulting posterior density is proper since, for any $x \in\mathbb{Z}^+$ ,

\[ \int_{\mathbb{R}^+} \lambda^x {\textrm{e}}^{-\lambda} \nu(\lambda) \, {\textrm{d}}\lambda = \int_{\mathbb{R}^+} \lambda^{x+\gamma-1} {\textrm{e}}^{-\lambda - \theta/\lambda} \, {\textrm{d}}\lambda < \infty .\]

In fact, the posterior density is generalized inverse Gaussian. When we write $V \sim \mbox{GIG}(\phi,a,b)$ , we mean that $\phi \in\mathbb{R}$ , $a,b>0$ , and the random variable V has density

\[ f(v) = \frac{a^{\phi/2}}{2 b^{\phi/2}K_\phi (\sqrt{ab})} v^{\phi-1} {\textrm{e}}^{-{a v}/{2} - {b}/{2v}} \textbf{1}_{\mathbb{R}^+}(v) ,\]

where $K_\phi$ is the modified Bessel function of the second kind. So the posterior is $\mbox{GIG}(x+\gamma,2,2\theta)$ . We now investigate the stability of the Markov chains $\Phi^\nu$ associated with this new family of priors. Fix $\gamma \in [0,1]$ and $\theta>0$ . The ratio of this improper inverse gamma prior to the improper conjugate prior with $\alpha = 1$ and $\beta=0$ is

\[ \frac{\lambda^{\gamma-1} {\textrm{e}}^{-\theta/\lambda}}{\lambda^{1-1}} = \lambda^{\gamma-1} {\textrm{e}}^{-\theta/\lambda} ,\]

which is bounded. Hence, it follows from Corollary 4.1 (or the results in [6]) that the associated $\Phi^\nu$ is null recurrent. So improper inverse gamma priors taking the form $\nu(\lambda) = \lambda^{\gamma-1} {\textrm{e}}^{-\theta/ \lambda}$, $\gamma \in [0,1]$ and $\theta > 0$, are strongly admissible. Just like the conjugate priors, these priors are improper due to their heavy right tails. When $\gamma = 1$, the prior increases from 0 to 1 as $\lambda$ increases. When $\gamma \in [0,1)$, the prior is unimodal, converging to 0 at the origin and at $\infty$, and achieving its maximum when $\lambda = \theta/(1-\gamma)$. Note that when $\gamma= 0$, the right tail decreases like $1/\lambda$, a rate not attained by any of the improper conjugate priors.

Now assume that $\gamma>1$ and $\theta>0$ . We have

\begin{equation*} a_m = \int_{\mathbb{R}_+} {\textrm{e}}^{-2\lambda} \lambda^m \nu(\lambda) \, {\textrm{d}}\lambda = \frac{2 \theta^{({m+\gamma})/{2}} K_{m+\gamma}(\sqrt{8\theta})}{2^{({m+\gamma})/{2}}} .\end{equation*}

A standard bound for the ratio of modified Bessel functions (see, e.g., [14, Theorem 1]) gives us

(5.1) \begin{equation} \frac{K_\nu (2\sqrt{2\theta})}{K_{\nu-1} (2\sqrt{2\theta})} > \frac{(\nu-1)}{\sqrt{2\theta}} .\end{equation}

Assume $m \ge 2$ . Repeated application of (5.1) leads to the inequality

\[K_{m+\gamma}(2 \sqrt{2\theta}) > \frac{(m+\gamma-1)(m+\gamma-2) \cdots (2+\gamma- \lfloor \gamma \rfloor)K_{2 + \gamma - \lfloor \gamma \rfloor}(2\sqrt{2\theta})}{(2\theta)^{({m + \lfloor \gamma \rfloor - 2})/{2}}} .\]

Define

\[g(\gamma,\theta) = \frac{2^{({4+\gamma - \lfloor \gamma \rfloor})/{2}} \theta^{({\gamma - \lfloor \gamma \rfloor + 2})/{2}} K_{2+\gamma -\lfloor \gamma \rfloor}(2\sqrt{2\theta})}{\Gamma(2+\gamma-\lfloor \gamma \rfloor)}.\]

So, for $m \ge 2$ , we have

\begin{align*} a_m & > 2^{({4+\gamma - \lfloor \gamma \rfloor})/{2}} \theta^{({\gamma - \lfloor \gamma \rfloor + 2})/{2}} \frac{(m+\gamma-1)(m+\gamma-2) \cdots (2+\gamma-\lfloor \gamma \rfloor) K_{2+\gamma -\lfloor \gamma \rfloor}(2\sqrt{2\theta})}{2^{m+\gamma}} \\[5pt] & = g(\gamma,\theta)\frac{(m+\gamma-1)(m+\gamma-2) \cdots (2+\gamma-\lfloor \gamma \rfloor) \Gamma(2+\gamma-\lfloor \gamma \rfloor)}{2^{m+\gamma}} \\[5pt] & = g(\gamma,\theta) \frac{\Gamma(m+\gamma)}{2^{m+\gamma}} .\end{align*}

Hence, for $m \ge 2$, $a_m > C a'_{\!\!m}$, where $C = g(\gamma,\theta)$ and $\{a'_{\!\!m}\}_{m=0}^\infty$ is the sequence associated with the conjugate prior with $\alpha = \gamma > 1$ and $\beta=0$. It follows from Corollary 3.1 that $\Phi^\nu$ is transient whenever $\gamma>1$ and $\theta > 0$. We conclude that it is not possible to use the results of [4] to establish the strong admissibility of the improper inverse gamma prior when $\gamma>1$ and $\theta>0$.
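As a numerical check on this comparison (a sketch assuming mpmath, whose arbitrary-precision arithmetic avoids overflow in the Bessel and gamma factors; the parameter values are arbitrary), one can confirm that the ratio $a_m/a'_{\!\!m}$ stays bounded away from 0:

```python
import mpmath as mp

gamma_, theta = 2.5, 1.0   # arbitrary illustrative values with gamma > 1
for m in [2, 10, 50, 200]:
    # a_m = 2 (theta/2)^((m+gamma)/2) K_{m+gamma}(sqrt(8 theta))  (inverse gamma prior)
    a_m = 2 * mp.mpf(theta / 2) ** mp.mpf((m + gamma_) / 2) * \
          mp.besselk(m + gamma_, mp.sqrt(8 * theta))
    a_conj = mp.gamma(m + gamma_) / mp.mpf(2) ** (m + gamma_)   # conjugate sequence
    print(m, mp.nstr(a_m / a_conj, 6))   # ratios increase towards 1, never near 0
```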

5.2. A logarithmic improper prior

Recall that the flat prior, i.e. the conjugate prior with $\alpha =1$ and $\beta=0$, is strongly admissible, but any conjugate prior with $\alpha>1$ leads to a transient $\Phi^\nu$. This suggests that it might be interesting to consider the improper prior $\nu(\lambda) = \log\!(1+\lambda)$, since it is increasing but increases more slowly than any conjugate prior with $\alpha>1$ and $\beta=0$. We now use Proposition 4.1 to show that the $\Phi^\nu$ corresponding to $\nu(\lambda) = \log\!(1+\lambda)$ is recurrent, which implies that this prior is strongly admissible. We start by noting that

\[ a_m = \int_{\mathbb{R}^+} \lambda^m {\textrm{e}}^{-2\lambda} \log\!(1+\lambda) \, {\textrm{d}}\lambda = \frac{m!}{2^{m+1}} \mathbb{E}(\log\!(1+X)) ,\]

where $X \sim \mbox{Gamma}(m+1,2)$ . Thus, by Jensen’s inequality,

\[ a_m < \frac{m!}{2^{m+1}} \log\big( (m+3)/2 \big) < \frac{m!}{2^{m+1}} \log\!(m+3) .\]

Thus, by Proposition 4.1, the Markov chain $\Phi^\nu$ corresponding to the logarithmic prior is recurrent if

\begin{equation*} \sum_{n=1}^\infty \Bigg( \sum_{i=0}^{n-1} \sum_{j=n}^\infty \frac{(i+j)!(j-i)\log\!(i+j+3)}{2^{i+j+1}i!j!} \Bigg)^{-1} = \infty.\end{equation*}
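Before working through the bound, here is a small numerical sketch (assuming NumPy and SciPy; the integration routine and test values are illustrative) of the Jensen step above: the exact $\mathbb{E}(\log\!(1+X))$ under $X \sim \mbox{Gamma}(m+1,2)$ indeed sits below $\log\!((m+3)/2)$.

```python
import numpy as np
from scipy import integrate
from scipy.special import gammaln

for m in [0, 1, 5, 20]:
    # Gamma(m+1, rate 2) density: 2^(m+1) lam^m e^(-2 lam) / m!
    dens = lambda lam, m=m: np.exp((m + 1) * np.log(2.0) + m * np.log(lam)
                                   - 2.0 * lam - gammaln(m + 1))
    e_log, _ = integrate.quad(lambda lam: np.log1p(lam) * dens(lam), 0, np.inf)
    print(m, round(e_log, 4), "<", round(np.log((m + 3) / 2), 4))
```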

As we did previously, let $U \sim \mbox{NB}\big(i+1,\frac12\big)$ , $V \sim\mbox{NB}\big(1,\frac12\big)$ , and assume U and V are independent. Recall that $U+V \sim \mbox{NB}\big(i+2,\frac12\big)$ . Fix $n \ge 2$ . We have

\begin{align*} \sum_{j=n}^\infty & \frac{(i+j)!(j-i)\log\!(i+j+3)}{2^{i+j+1}i!j!} \\[5pt] & \qquad = \sum_{j=n}^\infty \frac{j (i+j)!\log\!(i+j+3)}{2^{i+j+1}i!j!} - i \sum_{j=n}^\infty \frac{(i+j)!\log\!(i+j+3)}{2^{i+j+1}i!j!} \\[5pt] & \qquad = (i+1) \sum_{k=n-1}^\infty \mathbb{P}(U+V=k) \log\!(i+k+4) - i \sum_{j=n}^\infty \mathbb{P}(U=j) \log\!(i+j+3) \\[5pt] & \qquad = (i+1) \mathbb{E} \big[ \log\!(i + U + V + 4) \textbf{1}_{\{ U + V \geq n-1\}} \big] - i \mathbb{E} \big[ \log\!(i + U + 3) \textbf{1}_{\{ U \geq n\}} \big] .\end{align*}

Using Jensen’s inequality, we have

\[ \mathbb{E} \big[ \log\!(i + U + V + 4) \textbf{1}_{\{ U + V \geq n-1\}} \big] \le \mathbb{E} \big[ \log\!(i + U + V + 4) \big] \le \log\!(2i + 6).\]

Hence,

(5.2) \begin{multline} \sum_{j=n}^\infty \frac{(i+j)!(j-i)\log\!(i+j+3)}{2^{i+j+1}i!j!} \\[5pt] \le \log\!(2i + 6) + i \big( \mathbb{E} \big[ \log\!(i + U + V + 4) \textbf{1}_{\{ U + V \geq n-1\}} \big] - \mathbb{E} \big[ \log\!(i + U + 3) \textbf{1}_{\{ U \geq n\}} \big] \big).\end{multline}

Now, since $\{U+V \geq n-1\} = \{U \geq n\} \uplus \big( \uplus_{k=0}^{n-1} \{V\geq n-1-k, U = k\} \big)$ , it follows that

(5.3) \begin{multline} \mathbb{E} \big[ \log\!(i + U + V + 4) \textbf{1}_{\{ U + V \geq n-1\}} \big] - \mathbb{E} \big[ \log\!(i + U + 3) \textbf{1}_{\{ U \geq n\}} \big] \\[5pt] = \mathbb{E}\big[\big(\log\!(i + U + V + 4) - \log\!(i + U + 3)\big)\textbf{1}_{\{U\geq n\}}\big] \\[5pt] + \sum_{k=0}^{n-1} \mathbb{E} \big[ \log\!(i + U + V + 4) \textbf{1}_{\{ V \geq n-1-k,U=k\}} \big] .\end{multline}

Now, by the independence of U and V and Jensen’s inequality, we have

(5.4) \begin{align} \mathbb{E} \big[ \big( \log\!(i + U + V + 4) & - \log\!(i + U + 3) \big) \textbf{1}_{\{ U \geq n\}} \big] \nonumber \\[5pt] & = \mathbb{E} \big[ \mathbb{E} \big[ \big( \log\!(i + U + V + 4) - \log\!(i + U + 3) \big) \textbf{1}_{\{ U \geq n\}} \mid U \big] \big] \nonumber \\[5pt] & \le \mathbb{E} \big[ \big( \log\!(i + U + \mathbb{E}(V) + 4) - \log\!(i + U + 3) \big) \textbf{1}_{\{ U \geq n\}} \big] \nonumber \\[5pt] & = \mathbb{E} \big[ \big( \log\!(i + U + 5) - \log\!(i + U + 3) \big) \textbf{1}_{\{ U \geq n\}} \big] \nonumber \\[5pt] & = \mathbb{E} \bigg[ \log \bigg( 1 + \frac{2}{i+U+3} \bigg) \textbf{1}_{\{ U \geq n\}} \bigg] \nonumber \\[5pt] & \le \log \bigg( 1 + \frac{2}{i+n+3} \bigg) \le \frac{2}{i+n+3} , \end{align}

where the last inequality follows from the fact that $\log\!(1+x) \le x$ for $x>0$ . Again using the independence of U and V, we have

\begin{align*} \sum_{k=0}^{n-1} \mathbb{E} \big[ \log\!(i + U + V + 4) & \textbf{1}_{\{ V \geq n-1-k,U=k\}} \big] \nonumber \\[5pt] & = \sum_{k=0}^{n-1} \mathbb{E} \big[ \log\!(i + k + V + 4) \textbf{1}_{\{ V \geq n-1-k,U=k\}} \big] \nonumber \\[5pt] & = \sum_{k=0}^{n-1}\mathbb{E}\big[\log\!(i + k + V + 4)\textbf{1}_{\{V \geq n-1-k\}}\big]\mathbb{P}(U=k) \nonumber \\[5pt] & = \sum_{k=0}^{n-1} \Bigg( \sum_{s=n-1-k}^\infty \log\!(i + k + s + 4) 2^{-s-1} \Bigg) \mathbb{P}(U=k) \nonumber \\[5pt] & = \sum_{k=0}^{n-1}\Bigg(\sum_{t=0}^\infty\log\!(i + n + t + 3)2^{-t-1}\Bigg)2^{-(n-k-1)}\mathbb{P}(U=k) \\[5pt] & = \mathbb{E} [ \log\!(i+n+V+3) ] \sum_{k=0}^{n-1} \binom{i+k}{k} 2^{-i-n} \end{align*}
(5.5) \begin{align} & \le \log\!(i+n+4) \sum_{k=0}^{n-1} \binom{i+k}{k} 2^{-i-n} \nonumber \\[5pt] & = \log\!(i+n+4) \binom{i+n}{n-1} 2^{-i-n} , \end{align}

where the penultimate inequality is Jensen’s, and the last line follows from the argument based on Pascal’s identity that was used in the previous section. Let $U' \sim \mbox{NB}\big(n+1,\frac12\big)$ . By combining (5.2), (5.3), (5.4), and (5.5), we obtain

\begin{align*} \sum_{i=0}^{n-1} \sum_{j=n}^\infty & \frac{(i+j)!(j-i)\log\!(i+j+3)}{2^{i+j+1}i!j!} \\[5pt] & \le \sum_{i=0}^{n-1}\bigg( \log\!(2i + 6) + \frac{2i}{i+n+3} + i \log\!(i+n+4)\binom{i+n}{n-1} 2^{-i-n} \bigg) \\[5pt] & \le n\log\!(2n + 6) + 2n + 2n\log\!(2n+4) \sum_{i=0}^{n-1} \binom{i+n}{i} 2^{-i-n-1} \\[5pt] & \le n\log\!(2n + 6) + 2n + 2n\log\!(2n+4) \mathbb{P}(U' \le n-1) \\[5pt] & \le 5n\log\!(2n + 6) .\end{align*}

Thus,

\begin{equation*}\sum_{n=4}^\infty \Bigg( \sum_{i=0}^{n-1} \sum_{j=n}^\infty\frac{(i+j)!(j-i) \log\!(i+j+3)}{2^{i+j+1}i!j!} \Bigg)^{-1} \ge\sum_{n=4}^\infty \frac{1}{5n\log\!(2n + 6)} \ge \frac{1}{10}\sum_{n=4}^\infty \frac{1}{n\log\!(n)} = \infty .\end{equation*}

Therefore, as claimed, $\Phi^\nu$ corresponding to $\nu(\lambda) =\log\!(1+\lambda)$ is (null) recurrent, and this prior is strongly admissible.

6. Discussion

An obvious question concerning our two sufficient conditions is as follows: Does there exist a gap between them, i.e. are there chains that satisfy neither of the conditions? While we strongly suspect that there do exist examples of W that don’t satisfy either of the sufficient conditions, we have yet to come across one. In particular, note that every version of W analyzed in this paper does satisfy one of the two conditions. We leave the existence/non-existence of a gap as an open problem.

It might be possible to extend our work on the Poisson problem to the multivariate case. Specifically, suppose that, instead of a single observation from the Poisson distribution, we observe multiple independent Poisson random variables with different means. We could then consider prior distributions for the corresponding vector of unknown means and attempt to use Eaton’s [4] theory to develop conditions for strong admissibility. There has been some work on this problem [10], but, as far as we know, the associated conjugate chain has not been analyzed.

Acknowledgement

The authors are grateful to two anonymous reviewers for helpful comments and suggestions that led to an improved version of the paper.

Funding information

There are no funding bodies to thank relating to the creation of this article.

Competing interests

There were no competing interests to declare that arose during the preparation or publication of this article.

References

[1] Billingsley, P. (1995). Probability and Measure, 3rd edn. John Wiley, New York.
[2] Brown, L. D. (1971). Admissible estimators, recurrent diffusions, and insoluble boundary value problems. Ann. Math. Statist. 42, 855–904.
[3] Doyle, P. J. and Snell, J. L. (1984). Random Walks and Electric Networks. Mathematical Association of America, Washington, DC.
[4] Eaton, M. L. (1992). A statistical diptych: Admissible inferences–recurrence of symmetric Markov chains. Ann. Statist. 20, 1147–1179.
[5] Eaton, M. L. (2004). Evaluating improper priors and the recurrence of symmetric Markov chains: An overview. In A Festschrift to Honor Herman Rubin, ed. A. Dasgupta (IMS Lect. Notes Ser. 45). Institute of Mathematical Statistics, Beachwood, OH.
[6] Eaton, M. L., Hobert, J. P. and Jones, G. L. (2007). On perturbations of strongly admissible prior distributions. Ann. Inst. H. Poincaré Prob. Statist. 43, 633–653.
[7] Hobert, J. P. and Robert, C. P. (1999). Eaton’s Markov chain, its conjugate partner and $\mathcal{P}$-admissibility. Ann. Statist. 27, 361–373.
[8] Hobert, J. P. and Schweinsberg, J. (2002). Conditions for recurrence and transience of a Markov chain on ${\mathbb Z}^+$ and estimation of a geometric success probability. Ann. Statist. 30, 1214–1223.
[9] Johnstone, I. (1984). Admissibility, difference equations, and recurrence in estimating a Poisson mean. Ann. Statist. 12, 1173–1198.
[10] Lai, W.-L. (1996). Admissibility and the recurrence of Markov chains with applications. Tech. Rep. No. 612, School of Statistics, University of Minnesota.
[11] Lyons, T. (1983). A simple criterion for transience of a reversible Markov chain. Ann. Prob. 11, 393–402.
[12] McGuinness, S. (1991). Recurrent networks and a theorem of Nash–Williams. J. Theoret. Prob. 4, 87–100.
[13] Peres, Y. (1999). Probability on trees: An introductory climb. In Lectures on Probability Theory and Statistics, ed. P. Bernard (Lect. Notes Math. 1717). Springer, New York, pp. 193–280.
[14] Segura, J. (2023). Simple bounds with best possible accuracy for ratios of modified Bessel functions. J. Math. Anal. Appl. 526, 127211.
[15] Wendel, J. G. (1948). Note on the gamma function. Amer. Math. Monthly 55, 563–564.