Hostname: page-component-745bb68f8f-v2bm5 Total loading time: 0 Render date: 2025-01-10T20:45:18.128Z Has data issue: false hasContentIssue false

Max-linear graphical models with heavy-tailed factors on trees of transitive tournaments

Published online by Cambridge University Press:  15 December 2023

Stefka Asenova*
Affiliation:
UCLouvain
Johan Segers*
Affiliation:
UCLouvain
*
*Postal address: LIDAM/ISBA, Voie du Roman Pays 20, 1348 Louvain-la-Neuve, Belgium.
*Postal address: LIDAM/ISBA, Voie du Roman Pays 20, 1348 Louvain-la-Neuve, Belgium.
Rights & Permissions [Opens in a new window]

Abstract

Graphical models with heavy-tailed factors can be used to model extremal dependence or causality between extreme events. In a Bayesian network, variables are recursively defined in terms of their parents according to a directed acyclic graph (DAG). We focus on max-linear graphical models with respect to a special type of graph, which we call a tree of transitive tournaments. The latter is a block graph combining in a tree-like structure a finite number of transitive tournaments, each of which is a DAG in which every two nodes are connected. We study the limit of the joint tails of the max-linear model conditionally on the event that a given variable exceeds a high threshold. Under a suitable condition, the limiting distribution involves the factorization into independent increments along the shortest trail between two variables, thereby imitating the behaviour of a Markov random field.

We are also interested in the identifiability of the model parameters in the case when some variables are latent and only a subvector is observed. It turns out that the parameters are identifiable under a criterion on the nodes carrying the latent variables which is easy and quick to check.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Dependence in multivariate linear factor models is determined by a collection of independent random variables, called factors, which are shared by the modelled variables. In extreme value analysis there are the max-linear and the additive factor models with heavy-tailed factors. In [Reference Einmahl, Krajina and Segers10], it is shown that both have the same max-domain of attraction.

In [Reference Gissibl and Klüppelberg13], a link is made between such factor models and probabilistic graphical models via a max-linear recursively defined structural equation model on a directed acyclic graph (DAG). Each node carries a variable defined as a weighted maximum of its parent variables and an independent factor. This leads to a representation of the graphical model as a (max-)factor model as in [Reference Einmahl, Krajina and Segers10], the factors relevant for a given variable being limited to the set of its ancestors. More recent is the linear causally structured model in [Reference Gnecco, Meinshausen, Peters and Engelke16]: each variable is the weighted sum of the variables on all its parent nodes plus an independent factor. This leads to a representation where a single variable is a weighted sum of all its ancestral factors.

In this paper, we study a type of graph that, to the best of our knowledge, is not yet known, and to which we have given a name that reflects its most important properties: a tree of transitive tournaments (TTT), denoted by $\mathcal{T}$ . A tournament is a graph obtained by directing a complete graph, while a tournament is said to be transitive if it has no directed cycles. The name reflects the interpretation of such a graph as a competition where every node is a player and a directed edge points from the winner to the loser. Some examples are hierarchical relations between members of animal and bird societies, brand preferences, and votes between two alternative policies [Reference Harary and Moser18]. A TTT links up several such transitive tournaments in a tree-like structure. It is acyclic by construction. If there is a directed path from one node to another one, there is a unique shortest such path. Moreover, between any pair of nodes, there is a unique shortest undirected path.

In this paper, we study max-linear graphical models with respect to a TTT as defined in (2) below. In particular, for a max-linear random vector $X=(X_v, v\in V)$ with node set V, we study the limit in distribution

(1) \begin{equation} \left( X_v/X_u, v\in V\setminus u \mid X_u>t \right) \stackrel{d}{\longrightarrow} (A_{uv}, v\in V), \qquad t\rightarrow\infty.\end{equation}

It is not hard to show that the limit distribution in (1) is discrete [Reference Segers31, Example 1]. We show that if the TTT has a unique node without parents, a so-called source node, the joint distribution of $(A_{uv}, v\in V)$ is determined by products of independent multiplicative increments along the unique shortest undirected paths between the node u at which the high threshold is exceeded on the one hand and the rest of the nodes on the other hand. Such behaviour is analogous to that of Markov random fields on block graphs in [Reference Asenova and Segers5] and of Markov trees in [Reference Segers31, Theorem 1]. In turn, these results go back to the extensive literature on the additive or multiplicative structure of extremes for Markov chains [Reference Janssen and Segers20, Reference Resnick and Zeber28, Reference Segers29, Reference Smith32, Reference Yun36].

An underlying reason for the factorization into independent increments is the fact that a max-linear graphical model with respect to a TTT is a Markov random field with respect to the undirected graph associated to the original, directed graph when the TTT has a unique source. A TTT with unique source has no v-structures, that is, no nodes with non-adjacent parents. Both properties, the factorization of the limiting variables and the Markovianity with respect to the undirected graph, are lost if the graph contains v-structures. To show this, we rely on the recent theory of conditional independence in max-linear Bayesian networks based on the notion of $*$ -connectedness [Reference Améndola, Hollering, Sullivant, Tran, de Campos, Maathuis and Research Press1, Reference Améndola, Klüppelberg, Lauritzen and Tran2]. This theory diverges from classical results on conditional independence in Bayesian networks based on the notion of d-separation [Reference Koller and Friedman23, Reference Lauritzen25].

In our paper the graph is given. A significant line of research in the context of extremal dependence is graph discovery. Given observations on a number of variables represented as nodes in a graph, the task is to estimate the edges. For Bayesian networks we can also talk about causality discovery, because directed edges show the direction of influence. A first attempt to identify the DAG in the context of max-linear models was [Reference Gissibl, Klüppelberg and Otto15], which was followed by several papers focusing on this topic: [Reference Buck and Klüppelberg8], [Reference Gissibl, Klüppelberg and Lauritzen14], [Reference Klüppelberg and Krali21], [Reference Tran, Buck and Klüppelberg33], and [Reference Tran, Buck and Klüppelberg34]. The problems related to identifiability of the true graph and to the estimation of the edge weights are discussed in [Reference Klüppelberg and Lauritzen22]. The paper [Reference Gnecco, Meinshausen, Peters and Engelke16] studies a new metric called the causal tail coefficient, which is shown to reveal the structure of a linear causal recursive model with heavy-tailed noise. Graph discovery for non-directed graphs is studied in [Reference Engelke and Hitz11], [Reference Engelke and Volgushev12], and [Reference Hu, Peng and Segers19].

Inspired by practice, and more specifically by river network applications [Reference Asenova, Mazo and Segers4], we study a different identifiability problem. If the structure of the graph is known, it may happen that on some nodes the variables are latent, i.e., unobserved. The identifiability problem in this case is whether two different parameter vectors can still generate the same distribution of the observable part of the model. If this is possible then we cannot uniquely identify all tail dependence parameters that characterize the full distribution. Similarly to [Reference Asenova and Segers5], the identifiability criterion involves properties of the nodes with latent variables. The criterion is specific for a TTT with unique source and is easy to check. Our identifiability problem resembles the ‘method of path coefficients’ of Sewall Wright, which uses a system of equations involving correlations to solve for the edge coefficients [Reference Wright35].

The novelty of the paper lies in several directions. First, we introduce a new class of graphs, called trees of transitive tournaments (TTT), which are the directed acyclic analogue of block graphs. TTTs can be seen as a generalization of directed trees, where edges are replaced by transitive tournaments. Second, we show that a max-linear graphical model over a TTT with unique source exhibits properties known for other graphical models, namely Markov trees [Reference Segers31] and Markov block graphs [Reference Asenova and Segers5]. In particular, when the TTT has a unique source, the model is Markov with respect to the skeleton of the graph. This property underlies the factorization of the tail limit into independent increments along the unique shortest trails. Finally, we study a problem of identifiability of the edge weights from the angular measure, both when all variables are observed and when some of them are latent.

The structure of the paper is as follows. In Section 2 we introduce the TTT, the max-linear model, and its angular measure, which plays a key role in almost all proofs. In Section 3 we discuss the limiting distribution of (1) and give four equivalent characterizations of a max-linear graphical model with respect to a TTT with unique source. The identifiability problem is covered in Section 4. The discussion summarizes the main points of the paper. The appendices contain some additional lemmas and the proofs that are not presented in the main text.

2. Notions and definitions

2.1. Directed graphs

Let $\mathcal{T}=(V,E)$ be a directed acyclic graph (DAG) with finite vertex (node) set V and edge set $E\subset V\times V$ . An edge $e\;:\!=\;(u,v)\in E$ is directed, meaning $(u,v)\neq (v,u)$ ; it is outgoing with respect to the parent node u and incoming with respect to the child node v. The graph $\mathcal{T}$ excludes loops, i.e., edges of the form (u, u), and as $\mathcal{T}$ is directed, we cannot have both $(u,v)\in E$ and $(v,u)\in E$ . Two nodes u and v are adjacent if (u, v) or (v, u) is an edge. A cycle is a sequence of edges $e_1,\ldots,e_n$ with $e_k = (u_k, u_{k+1})$ and $u_1 = u_{n+1}$ for some nodes $u_1,\ldots,u_n$ . The property that $\mathcal{T}$ is acyclic means that it does not contain any cycles. The graph $\mathcal{T}$ is assumed connected; i.e., for any two distinct nodes u and v, we can find nodes $u_1=u,u_2,\ldots,u_{n+1}=v$ such that $u_k$ and $u_{k+1}$ are adjacent for every $k = 1,\ldots,n$ ; we call the associated edge sequence an undirected path or a trail between u and v. If all edges are directed in the same sense, i.e., $(u_k, u_{k+1}) \in E$ for all $k = 1,\ldots,n$ , we refer to the edge sequence as a (directed) path from the ancestor u to the descendant v. Recall that a path is directed by convention, so when we need non-directed paths this will be indicated explicitly. Between a pair of nodes there may be several paths. The set of all paths between two nodes $u,v\in V$ is denoted by $\pi(u,v)$ . An element, say p, of $\pi(u,v)$ is a collection of edges $\{(v_1, v_2), (v_2,v_3),\ldots, (v_{n-1},v_n)\}$ for a path that involves the non-repeating nodes $\{v_1=u, v_2, \ldots, v_{n-1}, v_{n}=v\}$ . Note that $\pi(u, u) = \varnothing$ in an acyclic graph.

A source is a node without parents. If a DAG has a unique source, this node is an ancestor of every other node. This property follows from the following reasoning: let $u_0$ denote the unique source node of the DAG, and let v be any other node different from $u_0$ . Then v must have a parent, say u. If $u = u_0$ , we are done. Otherwise, replace v by u and restart. Since the graph is finite and has no cycles, this chain must stop at some moment at a node without parents. But this node is necessarily equal to $u_0$ by assumption.

A graph, directed or not, is complete if there is an edge between any pair of distinct nodes. A subgraph of a graph is biconnected if the removal of any of its nodes will not disconnect the subgraph. A maximal biconnected subgraph, also known as a biconnected component, is a subgraph that cannot be extended by adding one adjacent node without violating this principle.

A directed complete graph is called a tournament. A tournament $\tau=(V_\tau, E_\tau)$ is transitive if $(u, v), (v, w) \in E_\tau$ implies $(u, w) \in E_\tau$ . A transitive tournament is necessarily acyclic. The graph-theoretic properties of transitive tournaments are studied in [Reference Harary and Moser18]. The property most used here is that the set of out-degrees of the d nodes of a transitive tournament is $\{d-1, d-2, \ldots, 0\}$ ; the in- and out-degrees of a node are the numbers of incoming and outgoing edges, respectively.

A subgraph of a graph is a maximal transitive tournament if it is not properly contained in another subgraph which is also a transitive tournament. The set of maximal transitive tournaments that are subgraphs of a DAG $\mathcal{T}$ will be denoted by $\mathbb{T}$ . For brevity we will just write ‘tournament’ when we mean a maximal transitive tournament and denote it by $\tau$ .

2.2. Tree of transitive tournaments

A block graph is an undirected graph in which every maximal biconnected subgraph is a complete graph [Reference Le and Tuy26]. Let T denote the non-directed version of $\mathcal{T}$ , also called the skeleton of $\mathcal{T}$ . It shares the same node set as $\mathcal{T}$ , and for every edge (u, v) in the original graph $\mathcal{T}$ , the reverse edge (v, u) is added to form the edge set of the skeleton graph T, after which each pair of edges $\{(u, v), (v, u)\}$ is identified with the undirected edge $\{u, v\}$ of T.

Definition 2.1. (Tree of transitive tournaments (TTT).) A tree of transitive tournaments is a connected DAG whose skeleton is a block graph.

A TTT enjoys three key properties. They all follow from the link with block graphs, whose characteristics can be found in [Reference Le and Tuy26].

Lemma 2.1. (Properties I.) For a TTT, the following properties hold:

  1. (P1) Two or more maximal transitive tournaments can have at most one common node, referred to as a separator node.

  2. (P2) There is no undirected cycle that passes through nodes in different maximal transitive tournaments.

  3. (P3) Between every pair of nodes there is a unique shortest trail (undirected path).

Proof. All properties are direct consequences of the fact that removing directions from the TTT we obtain a block graph. In a block graph the minimal separator sets are singletons [Reference Harary17, Theorem B]; there is a unique shortest path between two nodes [Reference Behtoei, Jannesari and Taeri6, Theorem 1.a)]; and the graph is acyclic up to blocks, a property that follows from the first one.

Similarly to a block graph [Reference Le and Tuy26], a TTT can be seen as a tree whose edges are replaced by transitive tournaments.

In a TTT, if there is at least one (directed) path between distinct nodes u and v, there is a unique shortest path (see Lemma 2.2-1) between them, which we denote by p(u, v), and which belongs to $\pi(u, v)$ . We also set $p(u,u)=\varnothing$ by convention.

A key object in the paper is a TTT with unique source. In Lemma 2.2-2 below it is shown that in this case there are no nodes with parents that are not adjacent or ‘married’, a configuration also known as a v-structure [Reference Koller and Friedman23].

Consider the TTT in Figure 1, which presents some of the notions introduced above. Each tournament is acyclic, and we cannot find a cycle passing through different tournaments either. This is why we call such a graph a tree of transitive tournaments. There are three v-structures: one on nodes 1, 3, 4, one on 3, 7, 8, and one on 5, 7, 8. The main results in this paper require a TTT without v-structures. According to Lemma 2.2, there are no v-structures in a TTT with unique source. This is illustrated in Figure 2.

Figure 1. A tree of four maximal transitive tournaments: $\tau_1$ , $\tau_2$ , $\tau_3$ , and $\tau_4$ . The skeleton graph is the same, but with the arrow heads removed. Node 3 is a separator node between tournaments $\tau_1$ , $\tau_2$ , and $\tau_3$ , while node 7 is a separator node between tournaments $\tau_3$ and $\tau_4$ . Nodes 1, 4, and 8 are source nodes, i.e., have no parents. The subgraph with node set $\{1, 3, 4\}$ is a v-structure: node 3 has non-adjacent parents 1 and 4. Other v-structures within the TTT are the subgraphs with node sets $\{3, 7, 8\}$ and $\{5, 7, 8\}$ . Between any pair of distinct nodes there is a unique shortest trail; for instance, nodes 4 and 8 are connected by the trail passing through nodes 3 and 7. There is no undirected cycle involving several tournaments.

Figure 2. A tree of four maximal transitive tournaments. The skeleton graph is the same as the one in Figure 1, but the graph here has a single source, on node 4. We see that there are no longer any v-structures in the graph.

Considered on its own, every tournament in a TTT has a unique source; this follows from the ordering of the out-degrees due to [Reference Harary and Moser18] mentioned earlier. When we talk about a source node, we will always state whether this is with respect to the whole graph or to a particular tournament.

In a general directed graph (V, E), let $\mathrm{pa}(v)\in V$ denote the set of parents of $v\in V$ , and put $\mathrm{Pa}(v)=\mathrm{pa}(v)\cup \{v\}$ . In a similar way, let $\mathrm{an}(v)$ , $\mathrm{desc}(v)$ , and $\mathrm{ch}(v)$ denote the sets of ancestors, descendants, and children, respectively, excluding v, while $\mathrm{An}(v)$ , $\mathrm{Desc}(v)$ , and $\mathrm{Ch}(v)$ denote the same sets but including v.

Below we present some additional properties used often in the paper.

Lemma 2.2. (Properties II.) Let $\mathcal{T}$ be a TTT as in Definition 2.1. We have the following statements:

  1. 1. If there is a path between two nodes, then there is a unique shortest path between them.

  2. 2. The TTT $\mathcal{T}$ has a unique source if and only if it possesses no v-structures.

  3. 3. If $\mathcal{T}$ has a unique source, then for any two nodes $i\neq j$ , either the sets $\mathrm{Desc}(i)$ and $\mathrm{Desc}(j)$ are disjoint, or one contains the other (that is, i is an ancestor of j or vice versa).

Lemma 2.3. (Properties III.) Consider a TTT $\mathcal{T}=(V,E)$ as in Definition 2.1 with unique source.

  1. 1. If $\{v_1, v_2, \ldots, v_n\}$ is the node sequence of a unique shortest path between nodes $v_1$ and $v_n$ , then every node except for possibly $v_1$ and $v_n$ is the source node of the tournament shared with the next node in the sequence.

  2. 2. For any two distinct nodes u,v in $\mathcal{T}$ , either the unique shortest trail between them is p(u,v) or p(v,u), or there exists a node $w \in V \setminus \{u, v\}$ such that the trail is composed of the two shortest paths p(w,u) and p(w,v).

2.3. Max-linear structural equation model on a TTT

Consider a directed graph, (V, E). To each edge $e = (i, j) \in E$ we associate a weight $c_e = c_{ij} \in [0, \infty)$ . The product of the edge parameters over a directed path $p = \{e_1,\ldots,e_m\}$ is denoted by

\[c_p=\prod_{r=1}^m c_{e_r}.\]

When the product is over the unique shortest path from u to v, we write $c_{p(u,v)}$ . The product over the empty set being one by convention, we have $c_{p(i,i)} = 1$ .

Let $(Z_i, i \in V)$ be a vector of independent unit-Fréchet random variables, i.e., $\operatorname{\mathbb{P}}(Z_i \le z) = \exp\!(\!-\!1/z)$ for $z > 0$ . In the spirit of [Reference Gissibl and Klüppelberg13], a recursive max-linear model on a TTT $\mathcal{T}$ is defined by

(2) \begin{equation} X_v=\bigvee_{i\in \mathrm{pa}(v)}c_{iv}X_i\vee c_{vv}Z_v, \qquad v\in V,\end{equation}

where the parameters $c_e$ , for $e\in E$ , and $c_{vv}$ , for $v\in V$ , are positive. We interpret this constraint as follows: if $c_{ij} = 0$ , the variable $X_j$ cannot be influenced by $X_i$ through the edge (i, j), and the edge could be removed from the graph. If $c_{vv} = 0$ , the factor variable $Z_v$ does not influence $X_v$ . We do not want to deal with such border cases, so we assume that all parameters in the model definition (2) are positive. According to [Reference Gissibl and Klüppelberg13, Theorem 2.2], the expression in (2) is equal also to

(3) \begin{equation} X_v = \bigvee_{i \in V} b_{vi} Z_i,\end{equation}

with

(4) \begin{equation} b_{vi} = \begin{cases} 0 & \text{if $i \not\in \mathrm{An}(v)$,} \\[5pt] c_{vv} & \text{if $i = v$,} \\[5pt] c_{ii} \max_{p \in \pi(i, v)} c_p & \text{if $i \in \mathrm{an}(v)$.} \end{cases}\end{equation}

The cumulative distribution function of $X_v$ is $\operatorname{\mathbb{P}}(X_v \le x) = \exp\!(\!-\!\sum_{i \in V} b_{vi} / x)$ for $x > 0$ . We assume that $(X_v, v\in V)$ are unit-Fréchet, yielding the constraint

(5) \begin{equation} \sum_{i \in V} b_{vi} = \sum_{i \in \mathrm{An}(v)} b_{vi} = 1, \qquad \forall v \in V,\end{equation}

since $b_{vi} = 0$ whenever $i \notin \mathrm{An}(v)$ . It is thus necessary and sufficient to have

(6) \begin{equation} c_{vv} = 1 - \sum_{i \in \mathrm{an}(v)} c_{ii} \max_{p \in \pi(i,v)} c_p, \end{equation}

with $c_{vv}=1$ if $\mathrm{an}(v)=\varnothing$ . By (6), the coefficients $c_{vv}$ for $v \in V$ are determined recursively by the edge weights $c_{e}$ for $e \in E$ . If $c_{iv} \ge 1$ for some $(i, v) \in E$ , then (2) implies that $X_v \ge X_i \vee c_{vv} Z_v$ , and the constraint that $X_i$ and $X_v$ are unit-Fréchet distributed implies that $c_{vv} = 0$ , a case we want to exclude, as explained above. This is why we impose $0 < c_{e} < 1$ for all $e \in E$ from the start, yielding the parameter space

\begin{equation*} \mathring{\Theta} = \left\{ \theta = (c_e, e \in E) \in (0,1)^{E} \;:\; \forall v \in V, \, c_{vv} > 0 \right\}.\end{equation*}

The notion of criticality is important for max-linear structural equation models. We refer to [Reference Améndola, Klüppelberg, Lauritzen and Tran2], [Reference Gissibl and Klüppelberg13], [Reference Gissibl, Klüppelberg and Lauritzen14], and [Reference Klüppelberg and Lauritzen22] for examples where different conditional independence relations arise depending on which path is critical, or for illustrations in the context of graph learning. According to [Reference Gissibl and Klüppelberg13, Definitionn 3.1], a path $p \in \pi(i, v)$ is max-weighted under $\theta \in \Theta$ if it realizes the maximum $\max_{p' \in \pi(i, v)} c_{p'}$ , where p is any path in $\pi(i,v)$ . In [Reference Améndola, Klüppelberg, Lauritzen and Tran2] the term critical is preferred.

If there is a (directed) path between two nodes, there is a unique shortest (directed) path between them (Lemma 2.2-1). This is crucial for our parametric model. We define the critical parameter space $\Theta_* \subset (0,1)^{E}$ as the set of parameters $\theta =(c_e, e \in E)$ such that for every $v \in V$ and every $i \in \mathrm{an}(v)$ , the unique shortest directed path from i to v is the only critical path. Therefore we have $c_{p(i,v)} > c_{p}$ , with strict inequality for any $p \in \pi(i,v)$ different from p(i, v). Formally,

\begin{equation*} \Theta_* = \left\{ \theta \in (0,1)^{E} \;:\; \forall v \in V, \, \forall i \in \mathrm{an}(v), \, \forall p \in \pi(i, v) \setminus \{p(i, v)\}, \; c_{p(i, v)} > c_p \right\}.\end{equation*}

Next, we consider the intersection of the two spaces as an appropriate parameter space for our max-linear structural equation model:

(7) \begin{equation} \mathring{\Theta}_*=\mathring{\Theta}\cap \Theta_*.\end{equation}

For $\theta \in \mathring{\Theta}_*$ , every element of the max-linear coefficient matrix $B_{\theta}=(b_{vi})_{v,i\in V}$ can be rewritten using an edge weight product over the unique shortest path p(i, v) via

(8) \begin{equation} b_{vi} = \begin{cases} 0 & \text{if $i \not\in \mathrm{An}(v)$,} \\[5pt] c_{vv} & \text{if $i = v$,} \\[5pt] c_{ii}c_{p(i,v)} & \text{if $i \in \mathrm{an}(v)$}, \end{cases} \qquad \text{and} \qquad c_{vv} = 1 - \sum_{i \in \mathrm{an}(v)} c_{ii}c_{p(i,v)}.\end{equation}

Also, note that $b_{ii}=c_{ii}$ , leading to the frequently used expression

\[b_{vi} = c_{p(i,v)} b_{ii}, \qquad i \in \mathrm{an}(v).\]

Example 2.1. (Criticality.) The following example shows what happens if the assumption that all shortest paths are critical is omitted. Consider a max-linear model on three nodes $\{1,2,3\}$ and three edges $\{(1,2), (2,3),(1,3)\}$ . The corresponding edge weights are $c_{12}, c_{23}, c_{13}$ . We have

\begin{align*} X_1=c_{11}Z_1, \qquad X_2=c_{12}X_1\vee c_{22}Z_2, \qquad X_3=c_{13}X_1\vee c_{23}X_2 \vee c_{33}Z_3. \end{align*}

The coefficient matrix $B=\{b_{iv}\}$ from (4) together with (5) and (6) is

\begin{equation*} B=\left[\begin{array}{c@{\quad}c@{\quad}c} 1&0&0\\[5pt] c_{12}&(1-c_{12})&0\\[5pt] c_{12}c_{23}\vee c_{13}&c_{23}(1-c_{12}) &1-c_{12}c_{23}\vee c_{13}-c_{23}(1-c_{12}) \end{array}\right]. \end{equation*}

If the shortest path $p=\{(1,3)\}$ from node 1 to node 3 is not critical, then we have $b_{31}=c_{12}c_{23}$ and also $b_{33}=1-c_{23}$ . In this way the coefficient $c_{13}$ has completely left the model. When considering the identifiability problem, we cannot hope to identify a coefficient from some marginal distribution if it is not even identifiable from the full one.

All of the elements are now in place for us to describe our main object of interest.

Assumption 2.1. (Max-linear structural equation model on a TTT.) The random vector $X = (X_v, v \in V)$ has the max-linear representation in (3) and (8) with respect to the TTT $\mathcal{T} = (V, E)$ (Definition 2.1) where $(Z_v, v \in V)$ is a vector of independent unit-Fréchet random variables and the edge weight vector $\theta = (c_e, e \in E)$ belongs to $\mathring{\Theta}_*$ in (7).

The following identity for nodes with a unique parent will be useful:

(9) \begin{equation} \mathrm{pa}(v) = \{i\} \implies b_{vv}=1-c_{iv}.\end{equation}

Indeed, if i is the only parent of v, then $X_v = c_{iv} X_i \vee c_{vv} Z_v$ by (2). The variables $X_v, X_i, Z_v$ are unit-Fréchet distributed and $X_i$ is independent of $Z_v$ , since $X_i$ is a function of $(Z_u, u \in \mathrm{An}(i))$ and $v \not\in \mathrm{An}(i)$ . Hence $ c_{iv} + c_{vv} = 1 $ , and because $c_{vv} = b_{vv}$ , Equation (9) follows.

A notational convention: in the case of double subscripts, we may also write $x_{i_1,i_2}$ instead of $x_{i_1i_2}$ .

2.4. The angular measure

Let X follow a max-linear model with parameter vector $\theta$ as in Assumption 2.1. The joint distribution $P_\theta$ of X on $[0, \infty)^V$ is max-stable and has unit-Fréchet margins. It is determined by

\[P_\theta([0, z]) = \operatorname{\mathbb{P}}(X \le z)= \exp\left( - l_\theta\left((1/z_v)_{v \in V}\right)\right),\qquad z \in (0, \infty]^V,\]

where the stable tail dependence function (STDF) $l_{\theta} \;:\; [0, \infty)^V \to [0, \infty)$ is

(10) \begin{equation} l_\theta(x) = \sum_{i \in V} \max_{v \in V} \left( b_{vi} x_v \right)\end{equation}

for $x = (x_v)_{v \in V} \in [0, \infty)^V$ (see [Reference Einmahl, Krajina and Segers10]).

Let $H_{\theta}$ be the angular measure on the unit simplex $\Delta_V = \{ a \in [0, 1]^V \;:\; \sum_{v \in V} a^{(v)} = 1\}$ corresponding to the STDF $l_{\theta}$ . The link between the STDF and the angular measure is detailed in [Reference De Haan and Ferreira9] for the bivariate case and in [Reference Resnick27, Chapter 5] and [Reference Beirlant, Goegebeur, Segers and Teugels7, Chapters 7–8] for higher dimensions: we have

\begin{equation*} l_\theta(x) = \int_{\Delta_V} \max_{v \in V} {( a^{(v)} x_v )} \, \mathrm{d} H_\theta(a).\end{equation*}

In view of the expression for $l_{\theta}$ in (10), the angular measure is discrete and satisfies

(11) \begin{equation} H_\theta = \sum_{i \in V} m_i \delta_{a_i},\end{equation}

with masses $m_i = \sum_{v \in V} b_{vi}$ and atoms $a_i = (b_{vi} / m_i)_{v \in V} \in \Delta_V$ for $i \in V$ [Reference Einmahl, Krajina and Segers10, p. 1779]. The notation $\delta_{x}$ refers to a unit point mass at x.

If X follows a max-linear model, the angular measure of X is identifiable from its distribution $P_\theta$ via the limit relation

\begin{equation*} t \operatorname{\mathbb{P}} \bigg( \frac{1}{\|{X}\|_1} X \in \,\cdot\,, \, \|{X}\|_1 > t \bigg) \xrightarrow{w} H_{\theta}(\!\cdot\!), \qquad t \to \infty,\end{equation*}

where $\|{x}\|_1 = \sum_i |x_i|$ for a vector x in Euclidean space, while the arrow $\xrightarrow{w}$ denotes weak convergence of finite Borel measures, in this case on $\Delta_V$ .

When we discuss latent variables and identifiability in Section 4, we will have to deal with the angular measure of a subvector of X, say $X_U = (X_v)_{v \in U}$ , for non-empty $U \subset V$ . Its STDF $l_{\theta, U}$ is obtained from $l_\theta$ by setting $x_v = 0$ for all $v \not\in U$ : for $x \in [0, \infty)^U$ we have

\begin{equation*} l_{\theta,U}(x) = \sum_{i \in V} \max_{v \in U} {(b_{vi} x_v)} = \int_{\Delta_U} \max_{v \in U} {(a^{(v)} x_v)} \, \mathrm{d} H_{\theta,U}(a).\end{equation*}

The distribution of $X_U$ is max-linear too, so that its angular measure $H_{\theta,U}$ on $\Delta_U$ has a similar form to that of X:

(12) \begin{equation} H_{\theta,U} = \sum_{i \in V} m_{i,U} \delta_{a_{i,U}},\end{equation}

with masses $m_{i,U} = \sum_{v \in U} b_{vi}$ and atoms $a_{i,U} = (b_{vi} / m_{i,U})_{v \in U} \in \Delta_U$ for $i \in V$ .

3. Conditional tail limit and the TTT with unique source

Here we study the limit distribution of

(13) \begin{equation} \bigg(\frac{X_v}{X_u}, v\in V \mathrel{\Big|} X_u>t \bigg), \qquad t\rightarrow\infty,\end{equation}

when X is a max-linear model with respect to a TTT $\mathcal{T} = (V, E)$ as in Assumption 2.1. In particular, we are interested to know whether the elements of the limiting vector of (13) can be factorized into products of independent increments, similarly to other models with this property as in [Reference Asenova and Segers5, Reference Segers30]. In Proposition 3.1 below, we show that the limit variables factorize according to the unique shortest trails under the condition that the TTT has a unique source (node without parents). Moreover, by Proposition 3.2, the latter criterion is necessary and sufficient for X to satisfy the global Markov property with respect to the skeleton graph associated to $\mathcal{T}$ , i.e., the undirected counterpart of $\mathcal{T}$ .

Even though Proposition 3.1 below looks similar to Theorem 3.5 in [Reference Asenova and Segers5], it does not follow from it. The reason is that we have not been able to verify Assumptions 3.1 and 3.4 in that article for the recursive max-linear model. In these assumptions, the conditioning event involves equality, i.e., $\{X_u=t\}$ , and calculating the conditional distributions and their limits is not easy. This is why we have opted here for a different route: in (13), the conditioning event is $\{X_u>t\}$ and the limit conditional distribution as $t \to \infty$ is found from [Reference Segers31, Example 1].

According to Property (P3), any pair of distinct nodes in a TTT is connected by a unique shortest trail. Let t(u, v) denote the set of edges along the unique shortest trail between two distinct nodes u and v. Consider for instance the shortest trail between nodes 2 and 8 in Figure 2: $t(2,8)=\{(3,7), (7,8), (3,2)\}$ . In contrast, let $t_{u}(u,v)$ be the set of edges incident to the same node set but directed from u to v, irrespective of their original directions, e.g., $t_2(2,8)=\{(2,3), (3,7), (7, 8)\}$ .

For a given node $u \in V$ , let $E_u$ be the set of all edges in such unique shortest paths directed away from u, that is,

(14) \begin{equation} E_u=\bigcup_{v\in V\setminus u} t_u(u,v).\end{equation}

Recall from Section 2 that $\mathbb{T}$ denotes the set of tournaments within the TTT $\mathcal{T}$ . For fixed $u \in V$ there is for every tournament $\tau = (V_\tau, E_\tau) \in \mathbb{T}$ a node, say $w_{u,\tau}$ , which is the unique node in $V_\tau$ such that the trail $t(u, w_{u, \tau})$ is the shortest one among all trails between u and a node v in $V_\tau$ . As an example, consider Figure 3: starting from node $u = 8$ , the closest node from the node set $V_{\tau_1}=\{1,2,3\}$ is 3, so $w_{8,\tau_1}=3$ .

Figure 3. A TTT on four tournaments: $\tau_1$ on node set $\{1,2,3\}$ , $\tau_2$ on $\{3,4\}$ , $\tau_3$ on $\{3,5,6,7\}$ , and $\tau_4$ on $\{8,7\}$ . The variable exceeding a high threshold is at node 8. The set $E_u$ is composed of the thicker edges, which do not necessarily have the same directions in the original graph. On the nodes we have $A^{(8)}=(A_{8i}, i=1, \ldots, 7)$ and on the edges we have the multiplicative increments $(M_{e}, e\in E_u)$ . Increments in different tournaments are mutually independent, while those in the same tournament are dependent according to Lemma 3.1.

With these definitions we are ready to state the condition under which the limiting variables factorize into independent increments.

Proposition 3.1. (Factorization in max-linear model.) Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Fix $u\in V$ . Let $E_u$ be as in (14) and let $(M_e, e\in E_u)$ be a random vector composed of mutually independent subvectors $M^{(u,\tau)}=\bigl( M_{w_{u,\tau},j}\;:\; j \in V_\tau, (w_{u,\tau},j)\in E_u \bigr)$ , one for every transitive tournament $\tau\in \mathbb{T}$ , and with marginal distribution as in Lemma 3.1.

The following statements are equivalent:

  1. (i) $\mathcal{T}$ has a unique source.

  2. (ii) For every $u \in V$ , we have, as $t \to \infty$ , the weak convergence given by

    (15) \begin{equation} \mathcal{L}(X_v/X_u, v\in V\mid X_u>t)\stackrel{d}{\longrightarrow} \mathcal{L}(A^{(u)}) =\mathcal{L}(A_{uv}, v\in V) \end{equation}
    with
    (16) \begin{equation} A_{uv}=\prod_{e\in t_u(u,v)}M_e, \qquad v\in V. \end{equation}
  3. (iii) There exists $u\in V$ such that the limit in (15) and (16) holds.

The following lemma provides the distribution of $M^{(u,\tau)}$ in Proposition 3.1.

Lemma 3.1. Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Let $\tau \in \mathbb{T}$ be a transitive tournament on nodes $V_{\tau}$ . Then for $u\in V_{\tau}$ , we have

(17) \begin{align} \mathcal{L}\bigg(\frac{X_v}{X_u}, v\in V_{\tau} \mathrel{\Big|} X_u>t\bigg) \stackrel{d}{\longrightarrow} \mathcal{L}(M^{(u,\tau)}) =\mathcal{L}(M_{uv}, v\in V_{\tau}) =\sum_{j\in \mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}, v\in V_{\tau}\right\}}. \end{align}

The vector $M^{(u,\tau)}=(M_{uv}, v\in V_{\tau})$ has dependent variables, and the distribution of a single element is as follows:

  1. 1. The distribution of $M_{uv}$ when $(u,v)\in E$ :

    1. (a) If u is the source node of $\tau$ , the distribution is given by $\mathcal{L}(M_{uv})=\delta_{\{c_{uv}\}}$ .

    2. (b) If u is not the source node of $\tau$ , the distribution is given by

      \[ \mathcal{L}(M_{uv})=\sum_{j\in \mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}\right\}}. \]

  2. 2. The distribution of $M_{uv}$ when $(v,u)\in E$ :

    1. (a) If v is the source node of $\tau$ , the distribution is given by

      \[ \mathcal{L}(M_{uv})=c_{vu}\delta_{\{1/c_{vu}\}} +(1-c_{vu})\delta_{\{0\}}. \]
    2. (b) If v is not the source node of $\tau$ , the distribution is given by

      \[ \mathcal{L}(M_{uv}) =\sum_{j\in \mathrm{An}(v)} b_{uj}\delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)} b_{uj}\delta_{\{0\}}. \]

According to Proposition 3.1, the factorization property (16) holds either for all nodes or for no nodes at all, a necessary and sufficient condition being that the TTT has a unique source. The principle of (16) is illustrated in Figure 3 for $u = 8$ . The limit $A^{(u)}=(A_{uv}, v\in V\setminus u)$ is given by

where are independent subvectors by construction.

What underlies the link between the factorization of the limiting variables from Proposition 3.1 on the one hand and the uniqueness of the source of the TTT on the other hand is the Markovianity of X with respect to the skeleton graph T. The Markov property states that for any three non-empty and disjoint sets $A,B,C\subset V$ such that, in the graph T, the nodes in A are separated from the nodes in B by the nodes in C, the vector $X_A=(X_v, v\in A)$ is conditionally independent from $X_B$ given $X_C$ [Reference Lauritzen25]. Another equivalence condition can be added to the list in Proposition 3.1.

Proposition 3.2. Let X follow a max-linear model with respect to the TTT $\mathcal{T}$ as in Assumption 2.1. Then X satisfies the global Markov property with respect to the skeleton graph T if and only if $\mathcal{T}$ has a unique source.

Even though in Proposition 3.2 we consider the undirected graph T associated to the TTT $\mathcal{T}$ , the recursive max-linear specification of X is still with respect to the directed graph $\mathcal{T}$ itself. Indeed, the latter edges’ directions are intrinsically determined by the recursive max-linear model specification of $X = (X_v)_{v \in V}$ . When we also consider the associated skeleton graph T, i.e., without directions, it is because, in view of the factorization property in Proposition 3.1, we are interested in whether the global Markov property holds, a property which is most easily described in terms of the skeleton graph T.

The proof of Proposition 3.2 is based on notions and results from [Reference Améndola, Klüppelberg, Lauritzen and Tran2], which provides an extensive study of conditional independence properties of max-linear models. In particular, [Reference Améndola, Klüppelberg, Lauritzen and Tran2] introduces the notion of a $*$ -connecting path between two nodes in a DAG, which is similar to the notion of an active path between two nodes [Reference Koller and Friedman23, Definition 3.6].

4. Latent variables and parameter identifiability

In practice, it is possible that on some of the nodes, the variables of interest are not observed (latent). Examples from the literature are water heights at certain locations in the river networks of the Danube [Reference Asadi, Davison and Engelke3] and the Seine [Reference Asenova, Mazo and Segers4]. We look at the problem of recovering all parameters of the distribution of the complete vector, based on the distribution of the observed variables only. If this is possible, we can study the parametric model as if all variables were observed: in particular, we are able to compute measures of tail dependence for sets including the unobserved variables. The latter is important as it may be the only possible way to quantify tail dependence, because non-parametric estimates are not available when dealing with unobserved variables.

Consider for instance the network in Figure 4. The max-linear model on $\mathcal{T}=(V,E)$ has eight variables and eleven parameters $\theta=(c_e, e\in E)$ . By Proposition 4.1 below, the parameter $\theta \in \mathring{\Theta}_*$ can be uniquely identified in the case when $X_1, X_3, X_7$ are not observed, on the basis of the joint distribution of the remaining five variables, $X_U=(X_{2}, X_4,X_5,X_6,X_8)$ .

Figure 4. In the following TTT, the nodes that are allowed to contain a latent variable while the edge parameters remain identifiable are 1,3,7. These are the only nodes that each satisfy both (I1) and (I2). For instance, if node 2 has unobserved variables, the parameters attached to edges (1, 2),(3, 2) are not identifiable. This is because the edge weights $c_{12},c_{32}$ take part only in products over paths ending at 2. But if $2\in\bar{U}$ these coefficients disappear from the atoms of the angular measure $a_{i,U}=(b_{vi}/m_{i,U}, v\in U)$ and accordingly from the collection of vectors $\mathcal{B}_{\theta,U}$ .

The problem of parameter identifiability will be formalized on the level of the angular measure $H_{\theta}$ and is presented in detail in the next two subsections.

4.1. Graph-induced characteristics of the angular measure

In this subsection, we argue that the condition $\theta=(c_e, e\in E)\in \mathring{\Theta}_*$ guarantees that all edge weights in $\theta$ are uniquely identifiable from the angular measure $H_{\theta}$ of $X = (X_v, v\in V)$ and thus from the distribution $P_\theta$ of X. Recall from (11) that $H_{\theta}$ is discrete with atoms $a_i = (a_{vi})_{v \in V} \in \Delta_V$ and masses $m_i > 0$ .

Thanks to the assumption $\theta\in \mathring{\Theta}_*$ , we have

(18) \begin{equation} a_{vi} > 0 \iff b_{vi} > 0 \iff i \in \mathrm{An}(v) \iff v \in \mathrm{Desc}(i).\end{equation}

For any DAG, all nodes have different sets of descendants, i.e.,

(19) \begin{equation} \forall i, j \in V \;:\; i \ne j \implies \mathrm{Desc}(i) \neq \mathrm{Desc}(j).\end{equation}

Indeed, if $i \ne j$ and $\mathrm{Desc}(i) \subseteq \mathrm{Desc}(j)$ , then $i \in \mathrm{desc}(j)$ and hence $j \not\in \mathrm{desc}(i)$ , so that $\mathrm{Desc}(j) \not\subseteq \mathrm{Desc}(i)$ .

Lemma 4.1. Let $(X_v, v \in V)$ follow a max-linear model as in Assumption 2.1, with parameter vector $\theta \in \mathring{\Theta}_*$ and induced coefficient matrix $(b_{vi})_{i,v \in V}$ . Let $H_\theta = \sum_{i \in V} m_i \delta_{a_i}$ in (11) be its angular measure. Then the following hold:

  1. (1) We have $m_i > 0$ for all $i \in V$ .

  2. (2) For any atom $a_i = (a_{vi})_{v \in V}$ , we have $a_{vi} > 0$ if and only if $v \in \mathrm{Desc}(i)$ . Specifically, all $|V|$ vectors $a_i$ are different, and every atom can be matched uniquely to a node in V.

  3. (3) For each edge $(i, v) \in E$ , we have $c_{iv} = b_{vi} / b_{ii} = a_{vi} / a_{ii}$ .

In particular, $\theta \in \mathring{\Theta}_*$ is identifiable from $H_\theta$ and thus from $P_\theta$ ; i.e., for $\theta_1 \neq \theta_2\in \mathring{\Theta}_*$ we have $H_{\theta_1} \neq H_{\theta_2}$ and thus $P_{\theta_1} \neq P_{\theta_2}$ .

In Lemma 4.1, if the edge (i, v) is not critical, then there is another path, say p , from i to v with path product $c_{p'} \ge c_{iv}$ , and then we can further lower the value of $c_{iv}$ without changing the coefficients in (4), because they involve $c_{p'}$ rather than $c_{iv}$ , thus yielding the same measure $H_\theta$ . This shows that without the criticality assumption, some edge weights may not be identifiable from $H_\theta$ .

Example 4.1 (Unique zero patterns.)

In dimension $d = 3$ , consider an angular measure given by the following atoms and masses:

\begin{equation*} \omega_1=\frac{1}{2.2}\begin{bmatrix}0.8\\[5pt] 1\\[5pt] 0.4\end{bmatrix}, \; \mu_1=2.2; \quad \omega_2=\frac{1}{0.5}\begin{bmatrix}0\\[5pt] 0\\[5pt] 0.5\end{bmatrix}, \; \mu_2= 0.5; \quad \omega_3= \frac{1}{0.3}\begin{bmatrix}0.2\\[5pt] 0\\[5pt] 0.1\end{bmatrix}, \; \mu_3= 0.3. \end{equation*}

Consider the vectors $\beta_j = \mu_j \omega_j$ for $j \in \{1,2,3\}$ . By Lemma 4.1, the unordered collection $\{ \beta_1, \beta_2, \beta_3 \} = \{(0.8, 1, 0.4)^\top, (0, 0, 0.5)^\top, (0.2, 0, 0.1)^\top\}$ permits one to recover the values of the coefficients in the max-linear model

\begin{align*} X_1=c_{11}Z\vee c_{21}c_{22}Y,\quad X_2=c_{22}Y,\quad X_3=c_{13}c_{11}Z\vee c_{13}c_{21}c_{22}Y\vee c_{33}T, \end{align*}

with (known) edge set $E = \{(2, 1), (1, 3)\}$ , and this is due to the presence of zeros in the vectors. For the current example, we argue as follows. The angular measure $H_{\theta}$ of $(X_1,X_2, X_3)$ has three atoms: atom $a_Z=b_Z/m_Z$ with $b_{Z}=(c_{11}, 0, c_{13}c_{11})^\top$ , atom $a_Y=b_Y/m_Y$ with $b_{Y}=(c_{21}c_{22},\, c_{22},\, c_{13}c_{21}c_{22})^\top$ , and atom $a_T=b_T/m_T$ with $b_{T}=(0,0,c_{33})^\top$ . As unordered sets, $\{\beta_1, \beta_2, \beta_3\}$ and $\{b_Z, b_Y, b_T\}$ are equal, but the question is which vector $\beta_j$ corresponds to which vector $b_{*}$ . From an inspection of the zero entries of the vectors, it is easily seen that the only possible way to identify the three coefficient vectors $\beta_1, \beta_2, \beta_3$ with the vectors $b_Z, b_Y, b_T$ of the angular measure $H_\theta$ is

\begin{align*} \beta_1=\begin{bmatrix}0.8\\[5pt] 1\\[5pt] 0.4\end{bmatrix} &= \begin{bmatrix}c_{21}c_{22}\\[5pt] c_{22}\\[5pt] c_{13}c_{21}c_{22}\end{bmatrix} = b_Y, & \beta_2=\begin{bmatrix}0\\[5pt] 0\\[5pt] 0.5\end{bmatrix} &= \begin{bmatrix} 0\\[5pt] 0\\[5pt] c_{33}\end{bmatrix} = b_T, & \\[5pt] \beta_3=\begin{bmatrix}0.2\\[5pt] 0\\[5pt] 0.1\end{bmatrix} &= \begin{bmatrix} c_{11}\\[5pt] 0\\[5pt] c_{13}c_{11}\end{bmatrix} = b_Z. \end{align*}

Solving the equations yields $(c_{11}, c_{21}, c_{22}, c_{13}, c_{33})=(0.2, 0.8, 1, 0.5, 0.5)$ .

4.2. Identifiability issues with the angular measure of a subvector

When we deal with latent variables, we know the distribution of the observable variables only, $X_U=(X_v, v\in U)$ for non-empty $U \subset V$ . The angular measure, say $H_{\theta,U}$ , of $X_U$ in (12) is discrete and takes the form

(20) \begin{equation} H_{\theta, U} = \sum_{r=1}^s \mu_r \delta_{\omega_r},\end{equation}

with masses $\mu_r > 0$ and s distinct atoms $\omega_r \in \Delta_U$ . Combining (12) and (20), we should have

(21) \begin{equation} \sum_{r=1}^s \mu_r \delta_{\omega_r} = \sum_{i \in V} m_{i,U} \delta_{a_{i,U}},\end{equation}

which means that, as sets, we should have $\{ \omega_1, \ldots, \omega_s \}=\{ a_{i,U} \;:\; i \in V \}$ . In contrast to the situation in Lemma 4.1, the subvectors $a_{i,U}$ for $i \in V$ are not necessarily all different. Any atom $\omega_r$ of $H_{\theta, U}$ is of the form $a_{i,U} = (b_{vi}/m_{i,U})_{v \in U}$ for one or possibly several indices $i \in V$ . For $r=1,\ldots, s$ and $i \in V$ such that $\omega_r = a_{i, U}$ , we know from (18) that

(22) \begin{equation} \{ v \in U \;:\; \omega_{r,v} > 0 \} = \mathrm{Desc}(i) \cap U.\end{equation}

The (unordered) collection of vectors $\{(b_{vi})_{v\in U} \;:\; i \in V\}$ will be denoted by $\mathcal{B}_{\theta,U}$ .

With unobservable variables, there are several issues with the angular measure and the expression for it on the right-hand side of (21):

  • Zero masses. We have $m_{i,U}=\sum_{v\in U}b_{vi}$ , so that if all components of $(b_{vi})_{v\in U}$ are zero, then $m_{i,U}=0$ . This happens when $\mathrm{Desc}(i)\cap U=\varnothing$ . In this case, we have $s<|V|$ , i.e., $H_{\theta,U}$ has fewer atoms than $H_\theta$ .

  • Equal atoms. We may have $a_{i,U}=a_{j,U}$ for some indices $i,j\in V$ and $i\neq j$ . In this case, the terms i and j in (12) are to be aggregated, and again, $H_{\theta,U}$ has fewer than $|V|$ atoms, $s<|V|$ . This happens when the vectors $(b_{vi}, v\in U)$ and $(b_{vj}, v\in U)$ are proportional for some distinct $i,j\in V$ .

  • Zeros in the same positions. A more subtle problem occurs when for two distinct vectors $b,b'\in \mathcal{B}_{\theta,U}$ , the supports $\{v\in U\;:\;b_v>0\}$ and $\{v\in U\;:\;b'_v>0\}$ are equal. Such a situation arises when two distinct nodes $i,j\in V$ satisfy $\mathrm{Desc}(i)\cap U=\mathrm{Desc}(j)\cap U$ . The latter equality is possible only in the presence of latent variables and is to be contrasted with Property (19) when all variables are observable.

4.3. Identifiability criterion

For a max-linear model with respect to a TTT $\mathcal{T}=(V,E)$ with unique source, we need conditions that ensure that the minimal representation of the angular measure of $X_U$ is the one in (12). Consider the following two conditions for the set of nodes $\bar{U} = V \setminus U$ carrying latent variables:

  1. (I1) Any $u \in \bar{U}$ has at least two children.

  2. (I2) Any $u \in \bar{U}$ is the source of some tournament in $\mathcal{T}$ .

Proposition 4.1. Let X follow a max-linear model as in Assumption 2.1 with respect to a TTT $\mathcal{T}=(V,E)$ with unique source. For a non-empty node set $U \subset V$ , the parameter $\theta\in \mathring{\Theta}_*$ is uniquely identifiable from the distribution of $(X_v, v \in U)$ if and only if the conditions (I1) and (I2) are satisfied.

Figure 4 illustrates the identifiability criterion.

5. Discussion

In this paper we have considered a Bayesian max-linear network over a special type of graph which we call a tree of transitive tournaments (TTT). This is a graph which collects in an acyclic manner transitive tournaments which are themselves complete DAGs. The max-linear model is defined on a particular parameter space which ensures that the impact from one variable to another takes place along the shortest path, a consideration that has been defined in the literature as the path’s criticality. It turns out that a TTT with unique source leads to a graph without v-structures; that is, no node has non-adjacent parents. The limit of the scaled random vector, conditional on the event that a high threshold is exceeded at a particular node, is shown to be factorizable into independent multiplicative increments if and only if the TTT has a unique source. This result is analogous to those for Markov trees in [Reference Segers31] and for Markov random fields on undirected block graphs in [Reference Asenova and Segers5]. The feature that the Bayesian max-linear model on a TTT with unique source shares with these other two models is that it satisfies the global Markov property with respect to the undirected counterpart or skeleton graph of the TTT.

In addition, we have provided a simple necessary and sufficient criterion guaranteeing the identifiability of the edge coefficients in the case when some variables are latent. As suggested by a reviewer, it may be possible to extend the criterion to partial identifiability of some edge weights in the case when the criterion is fulfilled only locally.

With appropriate modifications, we expect the results presented in this paper to hold equally for the linear additive causal model introduced in [Reference Gnecco, Meinshausen, Peters and Engelke16]. One of the reasons is that the max-domain of attraction of a linear model with heavy-tailed factors is the same as that of a max-linear one [Reference Einmahl, Krajina and Segers10]. However, the relation between the edge weights $\theta = (c_e)_{e \in E}$ and the coefficient matrix $B_\theta = (b_{ij})_{i,j \in V}$ is different between the max-linear and additive linear versions, and this may require different approaches to showing the same properties for the additive version.

Appendix A. Trees of transitive tournaments

Recall that in a DAG, a v-structure refers to a node with parents that are not adjacent; see Figure 1.

A.1 Proof of Lemma 2.2

Proof. Part 1. Let $a,b\in V$ . If a and b share the same tournament, they must be connected by an arrow, which is then the unique shortest path between them, since all other possible paths have length larger than one.

Let a, b be non-adjacent. If there is a unique directed path between a and b, then this is the unique shortest path. Suppose now that there are two shortest paths: $p_1, p_2\in \pi(a,b)$ . Let the path $p_1$ be along the vertices $\{v_1=a, v_2, \ldots, v_n=b\}$ and the path $p_2$ along the vertices $\{u_1=a, u_2, \ldots, u_n=b\}$ .

We will proceed by contradiction. Assume $v_2\neq u_2$ . If $v_2$ and $u_2$ belong to two different tournaments, then there exists a non-directed cycle through nodes in different tournaments, namely $\{a,v_2, \ldots, b,\ldots, u_2,a\}$ . But this is impossible by Property (P2) of a TTT. Hence, $v_2$ and $u_2$ must belong to the same tournament, say $\tau_a$ , because a is part of the same tournament too. Now consider $u_3$ and $v_3$ . Then either $u_3=v_3$ or they share a tournament, say $\tau_3$ , because otherwise there exists a non-directed cycle through nodes in different tournaments. Since $(v_2, v_3)\in E$ and $(u_2, u_3)\in E$ , and by the assumption that $v_2\neq u_2$ , all four nodes $\{a, v_2, u_2, v_3=u_3\}$ or all five nodes $\{a, v_2, u_2, v_3, u_3\}$ belong to $\tau_a$ . This is because by Property (P1), two tournaments can share only one node, so it is impossible to have $\tau_3\cap \tau_a=\{v_2, u_2\}$ . Because all four or five nodes belong to the same tournament, and since $(a,v_2), (v_2,v_3), (a,u_2), (u_2, u_3)\in E$ , we must have $(a,v_3)\in E$ and $(a,u_3)\in E$ to avoid inter-tournament undirected cycles. Hence the paths $\{a=v_1, v_3, \ldots, v_n=b\}$ and $\{u_1=a, u_3, \ldots, u_n=b\}$ are shorter than $p_1$ and $p_2$ , a contradiction. We must therefore have $v_2 = u_2$ .

We apply the same strategy to the nodes $v_3, u_3$ and $v_4, u_4$ to find that $v_3 = u_3$ . Proceeding recursively, we conclude that $p_1 = p_2$ .

Part 2. First we show that if the TTT has a unique source, there cannot be a v-structure. We proceed by contraposition. Assume that there is a node, v, with parents in two different tournaments $\tau_a$ and $\tau_b$ . Let a and b be the sources of $\tau_a$ and $\tau_b$ respectively [Reference Harary and Moser18, Corollary 5a]. Note that we definitely have $v\neq a$ and $v\neq b$ . From node v we go to node a. If a does not have a parent from another tournament, we have found one node with zero in-degree with respect to the whole graph. If a has parent(s) from another tournament, say $\tau'_{\!\!a}$ , then we go to the node that within $\tau'_{\!\!a}$ has in-degree zero, say node a . We keep on going until we find a node with in-degree zero within the whole graph—such a node must exist because the graph is finite. We repeat the same process for $\tau_b$ , yielding two different nodes having zero in-degree with respect to whole graph. These nodes must be different because of the definition of $\mathcal{T}$ : since we started in two different tournaments $\tau_a$ and $\tau_b$ , we cannot end up in the same node; otherwise there would be a non-directed cycle passing through v and that node. Hence we have found two nodes with zero in-degree, which means that $\mathcal{T}$ does not have a unique source node.

Next we show that if $\mathcal{T}$ has two or more source nodes, u and v, then there is a v-structure. Because u and v are sources they have in-degree zero, so that they cannot belong to the same tournament, and thus they belong to two different tournaments. Consider the unique shortest trail between u, v on a sequence of nodes $\{u=v_1, v_2, \ldots, v_n=v\}$ . Such a trail exists because, by the definition of a TTT, the skeleton of $\mathcal{T}$ is a block graph, and in a block graph there is a unique shortest path between every two nodes [Reference Behtoei, Jannesari and Taeri6, Theorem 1]. For every two consecutive nodes in the shortest path, $v_i,v_{i+1}$ , we have either $(v_i,v_{i+1})\in E$ or $(v_{i+1}, v_i)\in E$ . Because u and v are sources of $\mathcal{T}$ , we have $(u,v_2)\in E$ and $(v,v_{n-1})\in E$ . Note that $n \ge 3$ , since u and v cannot be adjacent. We need three nodes $v_i, v_{i+1}, v_{i+2}$ such that $(v_i,v_{i+1})\in E$ and $(v_{i+2}, v_{i+1})\in E$ . If $n = 3$ , then the triple $(u,v_2,v)$ already fulfils the requirement. If $n\geq 4$ , then we continue from $v_2$ as follows. Let $i = \max \{ j = 1,\ldots,n-2 \;:\;(v_{j}, v_{j+1}) \in E\}$ ; then $(v_i,v_{i+1}) \in E$ and $(v_{i+2},v_{i+1}) \in E$ , as required. Because this is the shortest trail, $v_i$ and $v_{i+2}$ cannot belong to the same tournament, since otherwise there would exist a shorter trail passing only through $v_i$ and $v_{i+2}$ .

Part 3. Suppose that $v \in \mathrm{Desc}(i) \cap \mathrm{Desc}(j)$ but also both $i \not\in \mathrm{an}(j)$ and $j \not\in \mathrm{an}(i)$ ; in particular, i and j do not belong to the same tournament. Consider the paths p(i, v) and p(j, v). Along each path, continue walking upwards considering successive parents. Since the graph is finite, this walk must end for both paths at a node without parents. By assumption, the latter must be the unique source node of the TTT, say $u_0$ . We will thus have found two different paths from $u_0$ to v, one passing via i and the other one via j. However, as i and j do not belong to the same tournament, this is in contradiction to Property (P2) of a TTT.

A.2. Proof of Lemma 2.3

Proof. Part 1. Suppose that there is a node $v_r$ , for $r \in \{2, \ldots, n-1\}$ , which is not the source node in the tournament shared with $v_{r+1}$ , say $\tau$ . Let $\bar{v}$ be a parent of $v_r$ in $\tau$ . Note that $\bar{v}$ must be a parent of $v_{r+1}$ too, because of the out-degree ordering in a tournament. Because $v_{r-1}$ is a parent of $v_r$ too, both $v_{r-1}$ and $\bar{v}$ must belong to $\tau$ , since otherwise $v_r$ would have parents from different tournaments, which is impossible according to Lemma 2.2-2. Hence $v_{r-1}, v_r, \bar{v}, v_{r+1}$ all belong to the same tournament, i.e., to $\tau$ . Necessarily $v_{r-1}$ is a parent of $v_{r+1}$ , because otherwise there would be a directed cycle $\{v_{r-1},v_r, v_{r+1}, v_{r-1}\}$ . But then $\{v_1, \ldots, v_{r-1}, v_{r+1}, \ldots, v_n\}$ is a shorter path between $v_1$ and $v_n$ , in contradiction to the hypothesis.

Part 2. Let the shortest trail between u and v be the one along the node sequence $\{v_1=u, \ldots, v_n=v\}$ . It is sufficient to show that there cannot exist a node $v_{r}$ for $r \in \{2, \ldots, n-1\}$ such that $(v_{r-1}, v_r)\in E$ and $(v_{r+1},v_r)\in E$ . Indeed, suppose to the contrary that there exists $r \in \{2, \ldots, n-1\}$ such that both $v_{r-1}$ and $v_{r+1}$ are parents of $v_r$ . Then $v_{r-1}$ and $v_{r+1}$ must be adjacent, because v-structures are excluded by Lemma 2.2-2. But then $\{v_1, \ldots, v_{r-1}, v_{r+1}, \ldots, v_n\}$ is a shorter trail between u and v, yielding a contradiction.

Appendix B. Proofs and additional results for Section 3

Proof of Lemma 3.1. From [Reference Segers31, Example 1] we have the limit

\[ \sum_{j\in V} b_{uj} \delta_{\left\{\frac{b_{vj}}{b_{uj}}, v\in V_{\tau}\right\}}. \]

Adapting this representation to a model where we have $b_{uj}=0$ for $j\notin \mathrm{An}(u)$ and $b_{ij}=c_{p(j,i)}b_{jj}$ for $j\in \mathrm{An}(i)$ , we obtain

\[ \sum_{j\in \mathrm{An}(u)} b_{uj} \delta_{\left\{\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}}, v\in V_{\tau}\right\}}. \]

Recall that $c_{p(i,i)}=1$ and $c_{p(i,j)}=0$ if $i\notin \mathrm{An}(j)$ .

Next we show that $(M_{uv}, v\in V_{\tau})$ are mutually dependent. When u is the source of $\tau$ , for every $j\in \mathrm{An}(u)$ the atom

\[ \bigg(\frac{c_{p(j,v)}}{c_{p(j,u)}}, v\in V_{\tau}\bigg) =\bigg(\frac{c_{p(j,u)}c_{uv}}{c_{p(j,u)}}, v\in V_{\tau}\bigg) =(1;\; c_{uv}, v\in V_{\tau}\setminus u) \]

has probability $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ . Hence $(M_{uv}, v\in V_{\tau})$ are at the same time perfectly dependent and independent.

For a node u which is not the source node, the general idea is to take a collection of coordinates for which the joint probability is zero while the product of the marginal probabilities is positive, thus showing that the joint probability does not equal the product of the marginal probabilities for the selected possible value of the random vector.

For brevity, let $V_{\tau}=\{1,2,\ldots,m \}$ , with the nodes labelled according to the order of their out-degrees within $\tau$ : the source node of $\tau$ has out-degree $m-1$ (the largest) and is labelled as 1, the node with out-degree $m-2$ is labelled as 2, etc.

Suppose u is the node 2. Thanks to the no-cycle property within a tournament, we have $\mathrm{An}(2)=\mathrm{An}(1)\cup \{2\}$ . For all $j\in \mathrm{An}(1)$ we have

(23) \begin{equation} \bigg(\frac{c_{p(j,v)}}{c_{p(j,2)}}, v=1, \ldots, m \bigg) =\bigg( \frac{1}{c_{12}};\; 1; \frac{c_{p(j,1)}c_{1v}}{c_{p(j,1)}c_{12}}, v = 3, \ldots, m \bigg), \end{equation}

which is an atom of $(M_{2v},v=1,\ldots, m)$ with mass $\sum_{j\in \mathrm{An}(1)}b_{2j}$ . This means that for the marginal distribution of $M_{21}$ we have $\operatorname{\mathbb{P}}(M_{21}=1/c_{12})\geq\sum_{j\in \mathrm{An}(1)}b_{2j}$ . For $j=2$ we have an atom $(0, 1, c_{23}, \ldots, c_{2m})$ with mass $b_{22}$ . This means that for the marginal probabilities of $(M_{23}, \ldots, M_{2m})$ we have $\operatorname{\mathbb{P}}(M_{2v}=c_{2v})\geq b_{22}$ for all $v=3, \ldots, m$ . Take a vector of coordinates $(1/c_{12}, 1, c_{23}, \ldots, c_{2m})$ . Note that this vector cannot be the same as the one in (23). For any $v=3, \ldots, m$ we cannot have $c_{1v}/c_{12}=c_{2v}$ because of the criticality assumption, according to which $c_{1v} > c_{12}c_{2v}$ for any $v = 3,\ldots,m$ . The joint probability of this vector of coordinates is

\[ \operatorname{\mathbb{P}}(M_{21}=1/c_{12}, M_{22}=1, M_{23}=c_{23},\ldots , M_{2m}=c_{2m}) =0. \]

However, the product of the marginal probabilities is positive:

\[ \operatorname{\mathbb{P}}(M_{21}=1/c_{12}) \operatorname{\mathbb{P}}( M_{22}=1) \prod_{v=3}^m\operatorname{\mathbb{P}}( M_{2v}=c_{2v}) \geq \sum_{j\in \mathrm{An}(1)}b_{2j}\times b_{22}^{m-1} >0. \]

Now let $u\geq 3$ . Take the vector of coordinates in (17) corresponding to $j=1$ , which is equal to $(1/c_{1u},c_{12}/c_{1u}, \ldots, c_{1m}/c_{1u} )$ and has probability at least $b_{u1}$ . Consider also the vector of coordinates for $j=u$ , which is $(0, \ldots, 0, 1;\; c_{uv}, v=u+1, \ldots, m)$ with mass at least $b_{uu}$ . Replace the first coordinate by $1/c_{1u}$ . The vector obtained in this way has joint probability zero. For every $j\in \mathrm{pa}(u)$ we have $b_{vj}/b_{uj}=0$ when v is not a child of j, or equivalently, given the order in the node labelling, when $v<j$ . So for fixed $u\geq3$ , for $j=1$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has no zeros. For $j=2$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has one zero, namely $(0;b_{vj}/b_{uj}, v=2, \ldots, m)$ ; for $j=3$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has two zeros, namely $(0,0;b_{vj}/b_{uj}, v=3, \ldots, m)$ ; and so on until $j=u$ with the corresponding vector $(b_{vj}/b_{uj}, v=1, \ldots, m)=(0,\ldots, 0;b_{vj}/b_{uj}, v=u, \ldots, m)$ . If we replace the first coordinate by a non-zero value in this vector, we get an impossible value for the random vector $(M_{uv}, v=1, \ldots, m)$ or a value with probability zero. Considering the univariate marginal distributions of $(M_{uv}, v=1, \ldots, m)$ , we obtain for the product of the marginal probabilities a positive value:

\[ \operatorname{\mathbb{P}}(M_{u1}=1/c_{1u}) \left[\prod_{v=2}^{u-1}\operatorname{\mathbb{P}}( M_{uv}=0)\right] \operatorname{\mathbb{P}}(M_{uu}=1) \prod_{v=u+1}^m\operatorname{\mathbb{P}}( M_{uv}=c_{uv}) \geq b_{u1}\times b_{uu}^{m-1} >0. \]

This shows that for any $u\in V_{\tau}$ the vector $(M_{u1}, \ldots, M_{um})$ has jointly dependent elements.

Next we show the distribution of a single element $M_{uv}, v\in V_{\tau}\setminus u$ :

  1. 1. Consider first the case when u is the source node in $\tau$ . Since $(u, v) \in E$ , we have $\mathrm{An}(u) \subset \mathrm{An}(v)$ and thus $\mathrm{An}(v) \cap \mathrm{An}(u) = \mathrm{An}(u)$ . We have $b_{vj}>0$ , $j\in \mathrm{An}(u)$ ; hence zero is not a possible value of $M_{uv}$ . For $j\in \mathrm{An}(u)$ ,

    \[ \frac{b_{vj}}{b_{uj}}=\frac{c_{p(j,u)}c_{uv}b_{jj}}{c_{p(j,u)}b_{jj}}=c_{uv}, \]
    and since $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ we obtain the desired result under 1.(a).

    When u is not the source node in $\tau$ , not all shortest paths to v pass through u; hence for $j\in\mathrm{An}(u)$ we have

    \[ \frac{b_{vj}}{b_{uj}}=\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,u)}} >0 \]
    with mass $b_{uj}$ . This yields the result in 1.(b). Note that zero is not a possible value, as we still have $\mathrm{An}(u)\subset \mathrm{An}(v)$ . Also, $c_{p(u,u)}=1$ by convention.
  2. 2. Let us now consider $(v,u)\in E$ . In this case $\mathrm{An}(u)\setminus\mathrm{An}(v)$ is not empty, because it contains at least the node u, so zero is a possible value of $M_{uv}$ . We need to distinguish only the zero atoms from the non-zero ones. When v is a source node in $\tau$ , for $j\in \mathrm{An}(v)$ we have

    \[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,v)}c_{vu}} =\frac{1}{c_{vu}} >0, \]
    which is an atom with probability
    \[ \sum_{j\in \mathrm{An}(v)}b_{uj} =\sum_{j\in \mathrm{An}(v)}c_{p(j,v)}c_{vu}b_{jj} =c_{vu}\sum_{j \in \mathrm{An}(v)} c_{p(j,v)} b_{jj} =c_{vu}\sum_{j \in \mathrm{An}(v)} b_{vj} = c_{vu}. \]
    The probability of the zero atom is $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}=\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(v)}b_{uj}=1-c_{vu}$ . This shows 2.(a).

    When v is not a source node of $\tau$ , we have for $j\in \mathrm{An}(v)$

    \[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,u)}}>0, \]
    an atom with mass $b_{uj}$ and zero atom with probability $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}$ . This shows 2.(b).

Remark B.1. From the results in Lemma 3.1 we see that a multiplicative increment does not have a degenerate distribution at zero, so that a product of several such multiplicative increments cannot be degenerate at zero either. This is an important observation that we will use in further proofs.

Lemma B.1. Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Let $\mathcal{T}$ have a unique source. For any $u\in V$ we have

(24) \begin{align} \mathcal{L}\bigg(\frac{X_v}{X_u}, v\in V \mathrel{\Big|} X_u>t\bigg) \stackrel{d}{\longrightarrow} \mathcal{L}(A_{uv}, v\in V) =\sum_{j\in \mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}, v\in V\right\}}. \end{align}

The distribution of $A_{uv}$ depends on the three types of possible trails according to Lemma 2.3-2. In what follows we assume $(u,v)\notin E$ . For the case $(u,v)\in E$ see Lemma 3.1.

  1. 1. Distribution of $A_{uv}$ on a path $\{u=v_1, r=v_2, \ldots, v=v_n\}$ with $u,r\in \tau$ , one of the tournaments of $\mathcal{T}$ :

    1. (a) If u is a source node in $\tau$ , then $\mathcal{L}(A_{uv})=\delta_{\{c_{p(u,v)}\}}$ .

    2. (b) If u is not a source node in $\tau$ , we have

      \[ \mathcal{L}(A_{uv}) =\sum_{j\in \mathrm{An}(u)} b_{uj}\delta_{\left\{\frac{c_{p(j,r)}}{c_{p(j,u)}}c_{p(r,v)}\right\}}. \]

  2. 2. Distribution of $A_{uv}$ on a path $\{v=v_1, r=v_2, \ldots, u=v_n\}$ with $v,r\in \tau$ :

    1. (a) If v is a source node in $\tau$ , then

      \[ \mathcal{L}(A_{uv}) =c_{p(v,u)}\delta_{\left\{\frac{1}{c_{p(v,u)}}\right\}} +(1-c_{p(v,u)})\delta_{\{0\}}. \]
    2. (b) If v is not a source node in $\tau$ , then

      \[ \mathcal{L}(A_{uv}) =\sum_{j\in \mathrm{An}(v)} c_{p(r,u)}b_{rj}\delta_{\left\{ \frac{c_{p(j,v)}}{c_{p(j,r)}c_{p(r,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}\delta_{\{0\}}. \]

  3. 3. Distribution of $A_{uv}$ on a trail composed of two paths p(r,u) and p(r,v): Let the trail be on nodes $\{u, \ldots, m,r, n, \ldots, v\}$ . Also, let $\tau_m, \tau_n$ be two tournaments with $r,m\in \tau_m$ and $r,n\in \tau_n$ .

    1. (a) If r is a source in both $\tau_m$ and $\tau_n$ , then

      \[ \mathcal{L}(A_{uv})=c_{p(r,u)} \delta_{\left\{\frac{c_{p(r,v)}}{c_{p(r,u)}}\right\}} +(1-c_{p(r,u)})\delta_{\{0\}}. \]
    2. (b) If r is a source in $\tau_m$ , but not in $\tau_n$ , then

      \[ \mathcal{L}(A_{uv})=\sum_{j\in\mathrm{An}(r)}c_{p(r,u)}b_{rj} \delta_{\left\{\frac{c_{p(j,n)}c_{p(n,v)}}{c_{p(j,r)}c_{p(r,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}\delta_{\{0\}}. \]
    3. (c) If r is a source in $\tau_n$ , but not in $\tau_m$ , then

      \[ \mathcal{L}(A_{uv})=\sum_{j\in\mathrm{An}(r)}c_{p(m,u)}b_{mj} \delta_{\left\{\frac{c_{p(j,r)}c_{p(r,v)}}{c_{p(j,m)}c_{p(m,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}\delta_{\{0\}}. \]

Proof. We have already seen that from [Reference Segers31, Example 1] we have the limit

\[ \mathcal{L}\bigg(\frac{X_v}{X_u}, v\in V\mid X_u>t\bigg) \stackrel{d}{\longrightarrow} \sum_{j\in V} b_{uj} \delta_{\left\{\frac{b_{vj}}{b_{uj}}, v\in V\right\}}. \]

Adapting this representation to a model where we have $b_{uj}=0$ for $j\notin \mathrm{An}(u)$ and $b_{ij}=c_{p(j,i)}b_{jj}$ for $j\in \mathrm{An}(i)$ , we obtain

\[ \sum_{j\in \mathrm{An}(u)} b_{uj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}, v\in V\right\}}. \]

Recall that $c_{p(i,i)}=1$ and $c_{p(i,j)}=0$ if $i\notin \mathrm{An}(j)$ . For a single $v\in V\setminus u$ we have the marginal distribution

(25) \begin{equation} \mathcal{L}(A_{uv})=\sum_{j\in\mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{b_{vj}}{b_{uj}}\right\}}. \end{equation}

The distribution of $A_{uv}$ depends deterministically on properties of the TTT. When $\mathcal{T}$ has a unique source, according to Lemma 2.3-2 there are three possible shortest trails between two nodes. In addition we have also the property under Lemma 2.3-1. We look at the different distributions of $A_{uv}$ that arise from these two properties of the TTT.

First we deal with 1.(a). Since $\mathrm{An}(u)\subset \mathrm{An}(v)$ , all atoms in (25) are positive and zero is not a possible value of $A_{uv}$ . All paths from $\mathrm{An}(u)$ to v pass through u, because u is a source in $\tau$ and because, by Property (P2) of a TTT, no cycle involving several tournaments is allowed. The case is illustrated by the graph below.

Hence for all $j\in \mathrm{An}(u)$ we have

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,u)}c_{p(u,v)}}{c_{p(j,u)}} =c_{p(u,v)}>0, \]

with mass $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ .

Next we show 1.(b). Because $\mathrm{An}(u)\subset \mathrm{An}(v)$ , zero is not a possible value of $A_{uv}$ . Not all shortest paths from $\mathrm{An}(u)$ to v pass through u, because u is not a source in $\tau$ . However, all paths from $\mathrm{An}(u)$ to v pass through r, as shown in the picture. Paths from $\mathrm{An}(u)$ to v other than these passing through u or r are impossible because of Property (P2) of a TTT.

For $j\in \mathrm{An}(u)$ we have

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,r)}}{c_{p(j,u)}}c_{p(r,v)}>0, \]

with mass $b_{uj}$ , hence the expression in 1.(b).

Next we show 2.(a). When the directed path is from v to u, the set $\mathrm{An}(u)\setminus \mathrm{An}(v)$ contains at least u; hence we have $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(v)$ . This means that zero is a possible value of $A_{uv}$ . All shortest paths from $j\in \mathrm{An}(v)$ to u pass through v, as v is a source in $\tau$ . Otherwise, there would be a cycle involving multiple tournaments, which is not allowed under Property (P2) of a TTT.

For $j\in \mathrm{An}(v)$ , the non-zero atom is given by

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,v)}c_{p(v,u)}}=\frac{1}{c_{p(v,u)}}>0, \qquad j\in\mathrm{An}(v), \]

with mass

\begin{align*} \sum_{j\in \mathrm{An}(v)}b_{uj} =\sum_{j\in \mathrm{An}(v)}c_{p(j,v)}c_{p(v,u)}b_{jj} &=c_{p(v,u)}\sum_{j\in \mathrm{An}(v)}c_{p(j,v)}b_{jj} \\[5pt] &=c_{p(v,u)}\sum_{j\in \mathrm{An}(v)}b_{vj} =c_{p(v,u)}. \end{align*}

For the zero atom we have probability

\[ \sum_{j\in\mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}=\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(v)}b_{uj}=1-c_{p(v,u)}. \]

This shows 2.(a).

To show 2.(b) we note that when v is not a source node of $\tau$ , not all shortest paths from $j\in \mathrm{An}(v)$ to u pass through v. However, all paths from $j\in \mathrm{An}(v)$ to u pass through r, as can be seen from the figure below.

Hence for $j\in \mathrm{An}(v)$ we have

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,r)}c_{p(r,u)}}>0, \]

which is an atom with mass $b_{uj}=c_{p(j,r)}c_{p(r,u)}b_{jj}=c_{p(r,u)}b_{rj}$ . The zero atom comes from the fact that $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(v)$ , and it has probability

\[ \sum_{j\in\mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}. \]

This shows the distribution under 2.(b).

By Lemma 2.3-1, the node r, as part of the path p(r, u), is allowed not to be a source node in $\tau_m$ . A similar statement can be made in relation to the path p(r, v). However, when we combine p(r, u) and p(r, v) in one trail t(u, v), the node r should be a source in at least one of $\tau_m$ and $\tau_n$ . Indeed, if r were not a source either in $\tau_m$ or in $\tau_n$ , then there would be a v-structure. However, Lemma 2.2-2 excludes v-structures when $\mathcal{T}$ has a unique source; hence node r should be a source in at least one of the tournaments $\tau_m$ and $\tau_n$ .

To show 3.(a), we note that all paths from $j\in \mathrm{An}(r)$ to u and to v pass through r, as r is a source in both $\tau_n$ and $\tau_m$ . The case is depicted in the following picture.

Also, we have $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(r)$ . For $j\in \mathrm{An}(r)$ we have

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,r)}c_{p(r,v)}b_{jj}}{c_{p(j,r)}c_{p(r,u)}b_{jj}} =\frac{c_{p(r,v)}}{c_{p(r,u)}}>0, \]

with probability

\begin{align*} \sum_{j\in \mathrm{An}(r)}b_{uj} =\sum_{j\in \mathrm{An}(r)}c_{p(j,r)}c_{p(r,u)}b_{jj} &=c_{p(r,u)}\sum_{j\in \mathrm{An}(r)}c_{p(j,r)}b_{jj} \\[5pt] &=c_{p(r,u)}\sum_{j\in \mathrm{An}(r)}b_{rj} =c_{p(r,u)}. \end{align*}

The probability of the zero atom is

\begin{align*} \sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}=\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in\mathrm{An}(r)}b_{uj}=1-c_{p(r,u)}. \end{align*}

Next we show 3.(b). Because r is not a source in $\tau_n$ , not all paths from $\mathrm{An}(r)$ to v pass through r, but they do all pass through n. Also, all paths from $\mathrm{An}(r)$ to u pass through r, because r is a source in $\tau_m$ .

Hence, for $j\in \mathrm{An}(r)$ ,

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,n)}c_{p(n,v)}b_{jj}}{c_{p(j,r)}c_{p(r,u)}b_{jj}} =\frac{c_{p(j,n)}c_{p(n,v)}}{c_{p(j,r)}c_{p(r,u)}}>0, \]

which is an atom with mass $b_{uj}=c_{p(j,r)}c_{p(r,u)}b_{jj}=b_{rj}c_{p(r,u)}$ . The zero atom has probability equal to $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}$ .

Next we show 3.(c). When r is a source in $\tau_n$ , this means that all paths from $\mathrm{An}(r)$ to v pass through r. Because r is not a source in $\tau_m$ , not all paths from $\mathrm{An}(r)$ to u pass through r, but they do all pass through m.

For $j\in \mathrm{An}(r)$ we have

\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,r)}c_{p(r,v)}b_{jj}}{c_{p(j,m)}c_{p(m,u)}b_{jj}} =\frac{c_{p(j,r)}c_{p(r,v)}}{c_{p(j,m)}c_{p(m,u)}}>0, \]

which is an atom with mass $b_{uj}=c_{p(j,m)}c_{p(m,u)}b_{jj}=b_{mj}c_{p(m,u)}$ . The zero atom comes from $b_{uj}=0$ for all $j\in\mathrm{An}(u)\setminus \mathrm{An}(r)$ . It has probability $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}$ .

Proof of Proposition 3.1. First we prove that (i) implies (ii). Assume $\mathcal{T}$ has a unique source. We have to prove that for any $u\in V$ , an element from the limiting vector in (15) is given by (16).

In Lemma B.1 we have seen a number of cases for the distribution of $A_{uv}$ , depending on deterministic properties of the trail between u and v. Below we consider each of these cases again.

Case 1. Let the unique shortest trail between u and v be a path on the node sequence $\{u=v_1, r=v_2, \ldots, v_n=v \}$ . Let $\tau$ be the tournament containing u, r.

Case 1. (a). Let u be the source in $\tau$ . From Lemma B.1-1.(a) we have $P(A_{uv}=c_{p(u,v)})=1$ . Consider the variables $(M_e, e\in p(u,v))$ , which are by construction independent of each other, because they belong to different tournaments. Note that in this case every node $v_1,\ldots, v_{n-1}$ is the source node in the tournament containing that node and the next one in the sequence. This follows from Lemma 2.3-1. Then, by Lemma 3.1 1.(a), for every $M_e, e\in p(u,v)$ we have $\operatorname{\mathbb{P}}(M_e=c_e)=1$ , and hence

\[ \operatorname{\mathbb{P}}\left(\prod_{e\in p(u,v)}M_e=c_{p(u,v)}\right) = \prod_{e\in p(u,v)}\operatorname{\mathbb{P}}(M_{e}=c_e)=1, \]

which shows $A_{uv}=\prod_{e\in p(u,v)}M_e$ .

Case 1. (b). If u is not the source in $\tau$ , the distribution of $M_{ur}$ is as in Lemma 3.1-1.(b). As in Case 1.(a), every node $r=v_2,v_3,\ldots, v_{n-1}$ is the source node in the tournament containing that node and the next one in the sequence. The variables $M_e, e\in p(r,v)$ are degenerate at $c_e$ . As in Case 1.(a), the variables $(M_e, e\in p(u,v))$ are by construction independent of each other, because they are indexed by edges which belong to different tournaments. Then we have

(26) \begin{equation} \begin{split} \mathcal{L}\left(\prod_{e\in p(u,v)}M_e\right) =\mathcal{L}\left(M_{ur}\prod_{e\in p(r,v)}M_e\right) &=\left(\sum_{j\in \mathrm{An}(u)} b_{uj}\delta_{\left\{\frac{c_{p(j,r)}}{c_{p(j,u)}}\right\}}\right) \otimes \delta_{\{c_{p(r,v)}\}} \\[5pt] &=\sum_{j\in \mathrm{An}(u)} b_{uj}\delta_{\left\{\frac{c_{p(j,r)}}{c_{p(j,u)}}c_{p(r,v)}\right\}}. \end{split} \end{equation}

The symbol $\otimes$ denotes multiplication between two discrete probability measures, say $\mu$ and $\nu$ , of two independent variables, say $\xi_1$ and $\xi_2$ , respectively. For two possible values $a_1,a_2$ of $\xi_1, \xi_2$ respectively, we have $\mu(\{a_1\})\nu(\{a_1\})$ as a measure of the event $\{\xi_1\xi_2=a_1a_2\}=\{\xi_1=a_1, \xi_2=a_2\}$ . The last expression in (26) is the distribution of $A_{uv}$ in Lemma B.1-1.(b).

Case 2. Let the unique shortest trail between u and v be a path from v to u on the node sequence $\{v=v_1, r=v_2, \ldots, v_n=u \}$ . Let $\tau$ be the tournament containing v, r.

Case 2. (a). Let v be the source in $\tau$ . Consider the random variables $M_{v_{i+1}v_{i}}$ , $i=1,\ldots, n-1,$ whose distributions are as in Lemma 3.1-2.(a). Since this is the unique shortest trail from v to u, all edges on it belong to different tournaments, and the vector $(M_{v_{i+1}v_{i}}, i=1,\ldots, n-1)$ contains independent variables by definition. Then

(27) \begin{equation} \operatorname{\mathbb{P}}\Bigg(\prod_{i=1}^{n-1}M_{v_{i+1}v_{i}}=\frac{1}{c_{p(v,u)}}\Bigg)= \prod_{i=1}^{n-1}\operatorname{\mathbb{P}}\bigg(M_{v_{i+1}v_{i}}=\frac{1}{c_{v_{i}v_{i+1}}}\bigg) =\prod_{i=1}^{n-1}c_{v_{i}v_{i+1}}=c_{p(v,u)}. \end{equation}

For the zero atom we have

(28) \begin{align} \operatorname{\mathbb{P}}\Bigg(\prod_{i=1}^{n-1}M_{v_{i+1}v_{i}}=0\Bigg) =1-\prod_{i=1}^{n-1}\operatorname{\mathbb{P}}(M_{v_{i+1}v_{i}}>0)\nonumber &=1-\prod_{i=1}^{n-1}\operatorname{\mathbb{P}}\bigg(M_{v_{i+1}v_{i}} =\frac{1}{c_{v_{i}v_{i+1}}}\bigg) \\[5pt] &=1-c_{p(v,u)}. \end{align}

The expressions in (27) and (28) represent the distribution of $A_{uv}$ in Lemma B.1-2.(a).

Case 2. (b). If v is not the source in $\tau$ , consider a random variable $M_{rv}$ with distribution as in Lemma 3.1-2.(b) and a random variable $A_{ur}$ constructed as in Case 2.(a) above, i.e., as the product $\prod_{i=2}^{n-1}M_{v_{i+1}v_i}$ . By construction $M_{rv}$ is independent from $A_{ur}$ by the same argument as above. We have

\begin{align*} \mathcal{L}(A_{ur}M_{rv}) &=\left(c_{p(r,u)}\delta_{\left\{\frac{1}{c_{p(r,u)}}\right\}} +(1-c_{p(r,u)})\delta_{\{0\}}\right) \\[5pt] &\otimes \left(\sum_{j\in \mathrm{An}(v)} b_{rj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,r)}}\right\}} +\sum_{j\in \mathrm{An}(r)\setminus \mathrm{An}(v)}b_{rj}\delta_{\{0\}}\right), \end{align*}

which gives non-zero atoms $c_{p(j,v)}/(c_{p(j,r)}c_{p(r,u)})$ , $j\in \mathrm{An}(v)$ , with masses $b_{rj}c_{p(r,u)}$ , $j\in \mathrm{An}(v)$ . To show the probability of the zero atom, consider

\begin{align*} &\operatorname{\mathbb{P}}(A_{ur}M_{rv}=0) =1-\operatorname{\mathbb{P}}(A_{ur}>0) \cdot \operatorname{\mathbb{P}}(M_{rv}>0) =1-c_{p(r,u)}\sum_{j\in \mathrm{An}(v)}b_{rj} \\[5pt] &=\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(v)}c_{p(j,r)}b_{jj}c_{p(r,u)} =\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(v)}b_{uj} =\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}, \end{align*}

which is what we need to confirm $A_{uv}=A_{ur}M_{rv}$ , where $A_{uv}$ is as in Lemma B.1-2.(b).

Case 3. In the three cases that follow, let the unique shortest trail from u to v be given by two paths p(r, u) and p(r, v). Let the trail be on nodes $\{u, \ldots, m,r, n, \ldots, v\}$ . Also, let $\tau_m, \tau_n$ be two tournaments with $r,m\in \tau_m$ and $r,n\in \tau_n$ .

Case 3. (a). Let r be the source in both $\tau_m$ and $\tau_n$ . Consider random variables $A_{rv}$ as in Lemma B.1-1.(a) and $A_{ur}$ as in Lemma B.1-2.(a). Above we have shown in Cases 1.(a) and 2.(a) that $A_{rv}$ and $A_{ur}$ are factorizable into independent multiplicative increments. By construction $A_{rv}$ and $A_{ur}$ are independent from each other, because the multiplicative increments are independent. We have

\begin{align*} \operatorname{\mathbb{P}}\bigg(A_{ur}A_{rv}=\frac{c_{p(r,v)}}{c_{p(r,u)}}\bigg) =\operatorname{\mathbb{P}}\bigg(A_{ur}=\frac{1}{c_{p(r,u)}}\bigg)\operatorname{\mathbb{P}}(A_{rv}=c_{p(r,v)}) =c_{p(r,u)}. \end{align*}

For the probability of the zero atom we have

\[ \operatorname{\mathbb{P}}(A_{ur}A_{rv}=0)=P(A_{ur}=0)=(1-c_{p(r,u)}). \]

The two displays above represent the distribution of $A_{uv}$ in Lemma B.1-3.(a).

Case 3. (b). Let r be the source in $\tau_m$ , but not in $\tau_n$ . Consider three random variables $A_{ur}, M_{rn}, A_{nv}$ with distributions as in Lemma B.1-2.(a), Lemma 3.1-1.(b), and Lemma B.1-1.(a), respectively. For $A_{ur}$ and $A_{nv}$ , we have shown in Cases 2.(a) and 1.(a) above that they are factorizable into independent multiplicative increments. By construction $M_{rn}$ is independent from the increments in $A_{ur}$ and $A_{nv}$ . Then

\begin{align*} &\mathcal{L}(A_{ur}M_{rn}A_{nv}) \\[5pt] &=\left(c_{p(r,u)}\delta_{\left\{\frac{1}{c_{p(r,u)}}\right\}} +(1-c_{p(r,u)})\delta_{\{0\}}\right) \otimes \left(\sum_{j\in \mathrm{An}(r)}b_{rj} \delta_{\left\{\frac{c_{p(j,n)}}{c_{p(j,r)}}\right\}}\right) \otimes \delta_{\{c_{p(n,v)}\}} \\[5pt] &=\sum_{j\in \mathrm{An}(r)}b_{rj}c_{p(r,u)} \delta_{\left\{\frac{c_{p(j,n)}c_{p(n,v)}}{c_{p(j,r)}c_{p(r,u)}}\right\}} +(1-c_{p(r,u)})\delta_{\{0\}}. \end{align*}

Note that

\begin{align*} \sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj} =\sum_{j\in\mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(r)}b_{uj} &=1-\sum_{j\in \mathrm{An}(r)} c_{p(j,r)}c_{p(r,u)}b_{jj} \\[5pt] &=1-c_{p(r,u)}\sum_{j\in \mathrm{An}(r)} b_{rj} =1-c_{p(r,u)}. \end{align*}

This shows that the distribution of $A_{ur}M_{rn}A_{nv}$ is that of $A_{uv}$ in Lemma B.1-3.(b).

Case 3. (c). Let r be the source in $\tau_n$ , but not in $\tau_m$ . Consider variables $A_{um}, M_{mr}, A_{rv}$ with distributions as in Lemma B.1-2.(a), Lemma 3.1-2.(b), and Lemma B.1-1.(a), respectively. The variables $A_{um}$ and $A_{rv}$ have been shown to factorize into independent increments in Cases 2.(a) and 1.(a), respectively, of this proof; hence they are also independent from each other. By construction $M_{mr}$ is independent from $A_{um}$ and $A_{rv}$ . Then we have

\begin{align*} \mathcal{L}(A_{um}M_{mr}A_{rv}) &=\left(c_{p(m,u)}\delta_{\left\{\frac{1}{c_{p(m,u)}}\right\}} +(1-c_{p(m,u)})\delta_{\{0\}}\right) \\[5pt] & \otimes \left(\sum_{j\in \mathrm{An}(r)}b_{mj} \delta_{\left\{\frac{c_{p(j,r)}}{c_{p(j,m)}}\right\}} +\sum_{j\in \mathrm{An}(m)\setminus \mathrm{An}(r)}b_{mj}\delta_{\{0\}}\right) \otimes \delta_{\{c_{p(r,v)}\}}. \end{align*}

The non-zero atoms are $c_{p(j,r)}c_{p(r,v)}/(c_{p(j,m)}c_{p(m,u)})$ for $j\in \mathrm{An}(r)$ with masses

\[ c_{p(m,u)}b_{mj}=c_{p(j,m)}c_{p(m,u)}b_{jj}=b_{uj}, \qquad j\in \mathrm{An}(r). \]

The probability of the zero atom is given by

\begin{align*} &\operatorname{\mathbb{P}}(A_{um}M_{mr}A_{rv}=0) =1-\operatorname{\mathbb{P}}(A_{um}>0)\operatorname{\mathbb{P}}(M_{mr}>0) =1-c_{p(m,u)}\sum_{j\in \mathrm{An}(r)}b_{mj} \\[5pt] &=1-\sum_{j\in \mathrm{An}(r)} c_{p(j,m)}c_{p(m,u)}b_{jj} =\sum_{j\in\mathrm{An}(u)}b_{uj}-\sum_{j\in\mathrm{An}(r)}b_{uj} =\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}. \end{align*}

Hence the distribution of $A_{um}M_{mr}A_{rv}$ is that of $A_{uv}$ in Lemma B.1-3.(c). This completes the proof that the statement in (i) implies (ii).

The statement in (iii) holds trivially from (ii).

Next we prove that (iii) implies (i) by contraposition: we assume that $\mathcal{T}$ has at least two sources and show that it is not possible to obtain the factorization in (16). If $\mathcal{T}$ has at least two sources, then by Lemma 2.2-2 there is at least one v-structure, say on nodes 1,2,3 and involving edges $(1,3), (2,3)\in E$ . Consider the nodes 1,2. For every $u\in V$ we have two possibilities:

  1. (a) The v-structure belongs to only one of the trails t(u, 1) and t(u, 2): without loss of generality, $(1,3), (2,3)\in t(u,2)$ and $(1,3), (2,3)\notin t(u,1)$ .

  2. (b) Each trail t(u, 1) and t(u, 2) contains one edge of the v-structure: without loss of generality, $(1,3)\in t(u,1)$ and $(2,3)\in t(u,2)$ .

If $u \in \{1, 2\}$ , then we are in the case (a), while if $u = 3$ we are in the case (b). If $u \not\in \{1,2,3\}$ , then node 3 must belong to at least one of the two trails t(u, 1) or t(u, 2), because otherwise the skeleton graph would have a cycle connecting nodes u, 1, 2, 3 and passing through more than one block. The latter is impossible according to Property (P2). The two possibilities are illustrated in Figure 5.

Figure 5. The two possible configurations of the trails t(u, 1) and t(u, 2) when nodes 1, 2, 3 form a v-structure.

Case 3. (c-i). Consider first the case when, without loss of generality, the v-structure belongs to t(u, 2) but not to t(u, 1); see Figure 5(a). Let the trail from 1 to u be on nodes $\{v_1=1, v_2, \ldots, v_n=u\}$ . We can have any direction on the edges of t(1, u). Recall the distribution of $A_{u2}$ :

\[ \mathcal{L}(A_{u2})=\sum_{j\in\mathrm{An}(u)}b_{uj}\delta_{\{b_{2j}/b_{uj}\}}. \]

We have $b_{2j}=0$ for all $j\notin \mathrm{An}(2)$ . We claim that $\mathrm{An}(u)\cap \mathrm{An}(2)=\varnothing$ . According to Property (P2) of a TTT, $\mathcal{T}$ does not contain an undirected cycle involving several tournaments. This means that it is impossible to find a node from which there are paths going to u and to 2. Also, it is impossible to find a path passing through 3 and going to 2, because otherwise there would be either an undirected cycle involving several tournaments, or a cycle within a tournament. Both are impossible for a TTT. This leads to the conclusion that $A_{u2}$ is degenerate at zero. Now we look at the variables $(M_{v_{i+1}v_i}, i=n-1, \ldots, 1;\; M_{13}, M_{32})$ , which by construction we take to be independent, as they belong to different tournaments. Each of them is one of the variables in Lemma 3.1, and none of these is degenerate at zero. Hence their product cannot be degenerate at zero either.

Case 3. (c-ii). Next we consider the second case, when without loss of generality $(1,3)\in t(u,1)$ and $(2,3)\in t(u,2)$ ; see Figure 5(b). Let the trail from node 3 to u be on nodes $\{v_1=3, v_2, \ldots, v_n=u\}$ . First we consider the case when we have at least one $i=1, \ldots, n-1$ for which $(v_{i+1},v_i)\in E$ , i.e., we have at least one edge with direction from u to 3. Because t(u, 3) is a shortest trail, the edges incident to the nodes on the trail belong to different tournaments. The distribution of $(A_{u1}, A_{u2})$ is given by

\[ \mathcal{L}(A_{u1}, A_{u2})= \sum_{j\in\mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{b_{1j}}{b_{uj}},\frac{b_{2j}}{b_{uj}}\right\}}, \]

where $b_{1j}=0$ and $b_{2j}=0$ if $j\notin \mathrm{An}(1)$ and $j\notin \mathrm{An}(2)$ , respectively. When for some $i=1, \ldots, n-1$ we have $(v_{i+1}, v_i)\in E$ , it follows necessarily that $\mathrm{An}(1) \cap \mathrm{An}(u)=\varnothing$ and $\mathrm{An}(2) \cap \mathrm{An}(u)=\varnothing$ . There cannot be a path from $\mathrm{An}(1)$ or $\mathrm{An}(2)$ to any of the nodes $\{v_2,\ldots, v_n=u\}$ , because otherwise there would be a cycle involving several tournaments, in contradiction to the definition of a TTT. Because of the edge $(v_{i+1}, v_i)\in E$ , all nodes in $\mathrm{An}(1)\cup\mathrm{An}(2)$ are not ancestors of u. Also, because of the edges $(1,3), (2,3)\in E$ , all nodes in $\mathrm{An}(u)$ cannot be ancestors of node 1 or 2. Thus, when for some $i=1, \ldots, n-1$ there is a directed edge $(v_{i+1}, v_i)\in E$ , we have $\mathcal{L}(A_{u1},A_{u2})=\delta_{\{0,0\}}$ . We have found a node $v \in V$ such that $A_{uv} = 0$ almost surely, but then the factorization (15)–(16) cannot hold, because these are never degenerate at zero.

Now let the trail from node 3 to u actually be a path. Also, let nodes 1 and 2 be sources with respect to the tournaments shared with node 3, say $1,3\in V_{\tau_1}$ and $2,3\in V_{\tau_2}$ . It is always possible to choose 1 and 2 in such a way that they are sources of $\tau_1$ and $\tau_2$ . This is because node 3 obviously is not a source in $\tau_1$ and $\tau_2$ , so the sources of these must point to 3. We can decompose $\mathrm{An}(u)$ into three disjoint sets, namely $\mathrm{An}(1)$ , $\mathrm{An}(2)$ , and the rest, $\mathrm{An}(u)\setminus \{\mathrm{An}(1)\cup \mathrm{An}(2)\}$ . For the distribution of $(A_{u1}, A_{u2})$ we have

\begin{align*} &\mathcal{L}(A_{u1}, A_{u2}) =\sum_{j\in \mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{b_{1j}}{b_{uj}}, \frac{b_{2j}}{b_{uj}}\right\}} \\[5pt] &=\sum_{j\in \mathrm{An}(1)}b_{uj} \delta_{\left\{\frac{b_{1j}}{b_{uj}}, \frac{b_{2j}}{b_{uj}}\right\}} +\sum_{j\in \mathrm{An}(2)}b_{uj} \delta_{\left\{\frac{b_{1j}}{b_{uj}}, \frac{b_{2j}}{b_{uj}}\right\}} +\sum_{j\in\mathrm{An}(u)\setminus \{\mathrm{An}(1)\cup \mathrm{An}(2)\}}b_{uj} \delta_{\left\{\frac{b_{1j}}{b_{uj}}, \frac{b_{2j}}{b_{uj}}\right\}}. \end{align*}

For the atoms in the first summation we have

\[ \frac{b_{1j}}{b_{uj}}=\frac{c_{p(j,1)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,1)}}{c_{p(j,1)}c_{13}c_{p(3,u)}} =\frac{1}{c_{13}c_{p(3,u)}} \]

and $b_{2j}/b_{uj}=0$ , as $b_{2j}=0$ for all $j\in \mathrm{An}(1)$ . Hence we have an atom that does not depend on $j\in \mathrm{An}(1)$ , i.e., $\big(1/(c_{13}c_{p(3,u)}),0\big)$ , and its mass is

\[ \sum_{j\in \mathrm{An}(1)}b_{uj}=\sum_{j\in \mathrm{An}(1)}c_{p(j,1)}c_{13}c_{p(3,u)}b_{jj}=c_{13}c_{p(3,u)}=c_{p(1,u)}. \]

In a similar way, from the second summation in the last display we have an atom $\big(0,1/(c_{23}c_{p(3,u)})\big)$ with mass $c_{23}c_{p(3,u)}=c_{p(2,u)}$ . In the third summation term the atom is (0,0), as $b_{1j}=b_{2j}=0$ for all $j\in\mathrm{An}(u)\setminus \{\mathrm{An}(1)\cup \mathrm{An}(2)\}$ , and its mass is $1-c_{13}c_{p(3,u)}-c_{23}c_{p(3,u)}=1-c_{p(3,u)}(c_{13}+c_{23})$ . Consider now the multiplicative increments $(M_{31};\; M_{32}, M_{v_{i+1}v_i} i=1, \ldots,n-1)$ , which are mutually independent since they belong to different tournaments. Because node 1 is a source node in the tournament $\tau_1$ , the distribution of $M_{31}$ is $c_{13}\delta_{\{1/c_{13}\}}+(1-c_{13})\delta_{\{0\}}$ by Lemma 3.1-2.(a); and similarly for $M_{32}$ . We have

(29) \begin{equation} \begin{split} & \operatorname{\mathbb{P}}\left(M_{31}\prod_{i=1}^{n-1}M_{v_{i+1}v_i}=0, M_{32}\prod_{i=1}^{n-1}M_{v_{i+1}v_i}=0\right) =1-\prod_{i=1}^{n-1}\operatorname{\mathbb{P}}(M_{v_{i+1}v_i}>0) \\[5pt] & +\operatorname{\mathbb{P}}(M_{31}=0)\operatorname{\mathbb{P}}(M_{32}=0) -\left(1-\prod_{i=1}^{n-1}\operatorname{\mathbb{P}}(M_{v_{i+1}v_i}>0)\right) \operatorname{\mathbb{P}}(M_{31}=0)\operatorname{\mathbb{P}}(M_{32}=0). \end{split} \end{equation}

After some rearranging of the expression above we obtain

(30) \begin{equation} 1-\prod_{i=1}^{n-1}\operatorname{\mathbb{P}}(M_{v_{i+1}v_i}>0)(c_{13}+c_{23}-c_{13}c_{23}). \end{equation}

There are two further subcases: either every node $v_1,\ldots,v_{n-1}$ is the source node with respect to the tournament involving the next node in the sequence, or not. In the first subcase, namely when every node in $\{v_1=3, v_2, \ldots, v_{n-1}\}$ is the source node with respect to the tournament involving the next node in the sequence, we have $\operatorname{\mathbb{P}}(M_{v_{i+1}v_i}>0)=c_{v_{i}v_{i+1}}$ for $i=1, \ldots, n-1$ . This means that the probability in (30), and consequently in (29), equals $1-c_{p(3,u)}(c_{13}+c_{23}-c_{13}c_{23})$ , which is different from $\operatorname{\mathbb{P}}(A_{u1}=0, A_{u2}=0)=1-c_{p(3,u)}(c_{13}+c_{23})$ . In the second subcase, i.e., if at least one node from $\{v_1=3, v_2, \ldots, v_{n-1}\}$ is not the source with respect to the tournament involving the next node in the sequence, we have that the possible values for $M_{31}\prod_{i=1}^{n-1}M_{v_{i+1}v_i}$ are not only $\{0,1/c_{p(1,u)}\}$ , which are the only possible values of $A_{u1}$ as we showed in the previous paragraph. Let $i \in \{1,\ldots,n-1\}$ be such that node $v_i$ is not the source node in the tournament shared with $v_{i+1}$ , say $\tau_i$ . This is depicted in the following graph:

Recall the distribution of $M_{v_{i+1}v_i}$ from Lemma 3.1-2.(b):

\[ \mathcal{L}(M_{v_{i+1}v_i}) =\sum_{j\in \mathrm{An}(v_{i})}b_{v_{i+1}j}\delta_{\{b_{v_ij}/b_{v_{i+1}j}\}} +\sum_{j\in \mathrm{An}(v_{i+1})\setminus \mathrm{An}(v_i)}\delta_{\{0\}}. \]

Take for instance a node, say s, which is a parent of $v_i$ and accordingly is in $\mathrm{An}(v_i)$ . Then

\[ \frac{b_{v_is}}{b_{v_{i+1}s}} =\frac{c_{sv_i}}{c_{sv_{i+1}}} \]

is a possible value of $M_{v_{i+1}v_i}$ with positive probability, namely at least $b_{v_{i+1}s}$ . Another possible positive value is for $j=v_i\in \mathrm{An}(v_i)$ , namely

\[ \frac{b_{v_iv_i}}{b_{v_{i+1}v_i}} =\frac{1}{c_{v_iv_{i+1}}} \]

with probability at least $b_{v_{i+1}v_i}$ . The criticality assumption on edge weights guarantees that $\frac{c_{sv_i}}{c_{sv_{i+1}}}\neq 1/c_{v_iv_{v+1}}$ . This means that the product $M_{31}\prod_{i=1}^{n-1}M_{v_{i+1}v_i}$ has at least two different positive values: one involving $\frac{c_{sv_i}}{c_{sv_{i+1}}}$ and another $1/c_{v_iv_{i+1}}$ . However, $A_{u1}$ has only one possible positive value.

Proof of Proposition 3.2. Sufficiency. Assume $\mathcal{T}$ has a unique source. We need to show that, for any disjoint and nonempty sets A, B, S, we have $X_A\perp\!\!\!\perp X_B\mid X_S$ whenever S is a separator of A and B in the skeleton T of $\mathcal{T}$ . We would like to use [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Theorem 5.15], by which we need to show that $A\perp_*B\mid S$ in $\mathcal{D}_S^*$ —that is, that there are no $*$ -connecting paths between any pair of nodes in A and B in the conditional reachability DAG $\mathcal{D}_S^*$ . We will explain these notions further.

Let $A, B,S\subset V$ be nonempty disjoint node sets such that S is a separator of A and B in the skeleton T. Consider Figure 6. According to [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Definition 5.4], a $*$ -connecting path between $a \in A$ and $b \in B$ is one of the five configurations shown in the figure. Our goal is to show that for $a\in A$ and $b\in B$ it is impossible to find a $*$ -connecting path in a certain graph $\mathcal{D}_S^*$ , which is neither $\mathcal{T}$ nor T, but is constructed under particular rules given in [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Definition 5.1].

Figure 6. According to [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Definition 5.4], a $*$ -connected path between a and b relative to S is one of the five configurations above. In the last three graphs we have $s\in S$ .

According to this definition, the conditional reachability graph $\mathcal{D}_S^*$ is on the same vertex set, V. Between two nodes i and j in V, there is an edge (i, j) in $\mathcal{D}_S^*$ if and only if there is a directed path from i to j in $\mathcal{T}$ such that no node on that path belongs to S, except possibly for i and j themselves.

Consider Figure 6. We need to show that in the conditional reachability graph $\mathcal{D}_S^*$ , there is no $*$ -connecting path between a node $a \in A$ and a node $b \in B$ .

To obtain the first configuration in $\mathcal{D}_S^*$ , there must be, in the skeleton T, nodes $a\in A$ and $b\in B$ such that no node on the path from a to b passes through S. But this is impossible, because we assumed that S is a separator of A and B in T. A similar argument holds for the second configuration.

For the other three configurations in Figure 6, consider Figure 7.

Figure 7. The left and right trails in the original graph $\mathcal{T}$ are the only possible trails that give rise to the middle path in the graph $\mathcal{D}_S^*$ .

In Figure 7, the left-hand and right-hand trails in the original graph $\mathcal{T}$ are the only possible ones that give rise to the middle path in Figure 6 with respect to the graph $\mathcal{D}_S^*$ : both the left-hand and right-hand graphs in Figure 7 show existing trails between a and b in $\mathcal{T}$ , trails composed of a directed path from a to s, and a directed path from b to s. The only node on these trails which belongs to S is s. Hence in $\mathcal{D}_S^*$ we put a directed edge from a to s and from b to s. This gives the third $*$ -connecting path in Figure 6. But this configuration cannot occur, for the following reason. On the left-hand trail in Figure 7, the separator node s has parents $u_r$ and $v_q$ in different tournaments. This leads to a v-structure between the nodes $u_r,s,v_q$ , in contradiction to Lemma 2.2-2 and the hypothesis that $\mathcal{T}$ has a unique source.

On the right-hand trail in Figure 7, the node s shares a tournament with its parents $u_r$ and $v_q$ , but only s belongs to S; on the trail $\{a=u_1, u_2, \ldots, u_r, v_q, \ldots, v_2,v_1=b\}$ , none of the nodes are in S. In T, this means that there is a path between A and B that does not pass through S. This is in contradiction to the assumption that S separates A and B in T.

To show that the fourth type of $*$ -connecting path in Figure 6 cannot occur, we can use the reasoning used for the third one by setting $a=a'$ in Figure 7. Then either s has parents from two different tournaments or there is a non-directed path from a to b which does not pass through S. The first case is excluded by Lemma 2.2-2 and the assumption that $\mathcal{T}$ has a unique source, and the second one by the assumption that S is a separator of A and B in T. The impossibility of the fifth $*$ -connected configuration follows analogously.

Necessity. We will show that if $\mathcal{T}$ has multiple source nodes, there is a triple of disjoint, non-empty sets $A,B,S\subset V$ such that S is a separator of A and B in T, but $X_A$ and $X_B$ are conditionally dependent given $X_S$ . In the case when $\mathcal{T}$ has at least two sources, we have at least one v-structure in $\mathcal{T}$ by Lemma 2.2-2. Take a triple of nodes in a v-structure, say u, v, w, with u and w being parents of v. Then node v separates nodes u and w in T, i.e., $S = \{v\}$ is a separator of $A = \{u\}$ and $B = \{w\}$ in T. All references below are from [Reference Améndola, Klüppelberg, Lauritzen and Tran2].

To show $X_u\not\!\perp\!\!\!\perp X_w\mid X_v$ , we will use [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Theorem 6.18] (context-free completeness). We need to show that there is an effective $*$ -connecting path in the critical DAG $\mathcal{D}^*_S(\theta)$ between nodes u and w, as in Definitions 5.2 and 6.5 of [Reference Améndola, Klüppelberg, Lauritzen and Tran2].

The subgraph on nodes u, v, w of $\mathcal{D}^*_S(\theta)$ is a v-structure, , according to the definition of $\mathcal{D}^*_S(\theta)$ . According to [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Definition 6.4], the $|S| \times |S|$ substitution matrix of $(u,v)\in E$ relative to $S=\{v\}$ is zero, because S is a singleton and by definition all diagonal entries of the substitution matrix are zero, i.e., $\Xi^{vu}_S=0$ . Similarly, $\Xi^{vw}_S=0$ . Because the edges (u, v), (w, v) form a $*$ -connecting path between u, w in $\mathcal{D}^*_S(\theta)$ , the substitution matrix of this path relative to S, say $\Xi_S$ , is zero too:

\[ \Xi_S=\max(\Xi^{vu}_S, \Xi^{vw}_S)=0. \]

To find out whether the edges (u, v), (w, v) form an effective $*$ -connecting path between u, w, we need to compute the tropical eigenvalue of $\max(\Gamma_{SS},\Xi_S)$ , where $\Gamma$ is as in [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Equation (2.3)] and $\Gamma_{SS}$ is the vv-element of $\Gamma$ , i.e., $\{\Gamma\}_{vv}$ . Because $\{\Gamma\}_{ij}>0$ if and only if there is a directed path from j to i, we have $\Gamma_{SS}=\{\Gamma\}_{vv}=0$ and so

\[ \max(\Gamma_{SS}, \Xi_S)=0. \]

The tropical eigenvalue [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Equation (2.7)] of the above matrix is trivially equal to zero and thus smaller than one. By [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Definition 6.5], there is indeed an effective $*$ -connecting path between u, w in $\mathcal{D}^*_{S}(\theta)$ . In view of [Reference Améndola, Klüppelberg, Lauritzen and Tran2, Theorem 6.18], we conclude that $X_u \not\!\perp\!\!\!\perp X_w \mid X_v$ .

Appendix C. Proofs and additional results for Section 4

C.1 Auxiliary results

Proof of Lemma 4.1. The point masses satisfy $m_i > 0$ for all $i\in V$ , because we have $m_i=0$ if and only if $c_{ii} = 0$ . However, $c_{ii}=0$ is impossible in view of the definition in (6). Therefore we cannot have undefined atoms, which would happen when $m_i=0$ . This shows (i).

Next we show (ii). To see why $a_i\neq a_j$ for $i\neq j$ , let $i, v \in V$ and recall $b_{vi}$ in (8). From the line below (11), recall that we also have $b_{vi} = m_i a_{vi}$ for $i, v \in V$ . Thanks to the assumption $\theta\in \mathring{\Theta}_*$ , we have (18). We also have (19) for any DAG.

The combination of the last two equations implies that in (11), all vectors $a_i$ for $i \in V$ are different, and thus $H_\theta$ has $|V|$ distinct atoms. Also, for every node $i \in V$ , we can find out which of the $|V|$ atoms of $H_\theta$ is $a_i$ , because it is the unique one that satisfies $\mathrm{Desc}(i) = \{ v \in V \;:\; a_{vi} > 0 \}$ . Note that similarly, among the $|V|$ vectors in the set $\mathcal{B}_\theta = \{ (b_{vj})_{v \in V} \;:\; j \in V \}$ , the vector $b_i$ is the unique one such that $\mathrm{Desc}(i) = \{ v \in V \;:\; b_{vi} > 0 \}$ .

Finally consider (iii). By the criticality assumption, every edge is critical, because it is the shortest path between any pair of adjacent nodes. Since $(i, v)\in E$ is critical, we have $b_{vi} = b_{ii} c_{iv}$ and thus $c_{iv} = b_{vi} / b_{ii} = a_{vi} / a_{ii}$ .

In summary, the angular measure $H_\theta$ possesses $|V|$ distinct atoms that can be uniquely matched to the nodes. As a consequence, we can reconstruct the matrix $(b_{vi})_{i,v \in V}$ . Thanks to (iii), this matrix allows us to recover all edge weights $c_{vi}$ .

Lemma C.1.1. Let $\mathcal{T}= (V, E)$ be a TTT as in Definition 2.1, and let $\mathcal{T}$ have a unique source, $u_0$ . Let $U \subset V$ be non-empty and suppose that $\bar{U} = V \setminus U$ satisfies the conditions (I1) and (I2). For every $\bar{u} \in \bar{U}$ there exists $s \in \mathrm{desc}(\bar{u}) \cap U$ such that $\pi(\bar{u}, s)$ is a singleton and the unique path p from $\bar{u}$ to s satisfies the following two properties:

  1. 1. All nodes on p except for s are in $\bar{U}$ .

  2. 2. Each node on p except possibly for $\bar{u}$ has only one parent.

As a consequence, any path with destination s must either start in one of the nodes of p or contain p as a subpath.

Proof. Let $\bar{u}\in \bar{U}$ and suppose $\bar{u}$ has no parents, so $\bar{u}=u_0$ . Take a node whose unique parent is $\bar{u}$ , say $v_2$ . By [Reference Harary and Moser18, Corollary 5.a], such a node exists in every tournament in which $\bar{u}$ takes part. If $v_2\in U$ , then $s=v_2$ and we are done. If $v_2\in \bar{U}$ , then by (I2) $v_2$ must be a source of at least one another tournament. In each of these, there is a node whose only parent is $v_2$ . Take such a node, say $v_3$ . If $v_3\in U$ then $v_3=s$ and we are done; otherwise continue in the same way until we find a node which is in U. Because the graph is finite and because of the condition (I2), such a node must exist. It is clear that the path constructed in this way has the stated properties.

Next suppose that $\bar{u}$ belongs to $\bar{U}$ and that $\bar{u}$ has at least one parent. By (I2) it must be a source of at least one tournament. In each of these tournaments there is a node with single parent $\bar{u}$ . Take one of them, say $v_2$ ; if $v_2\in U$ then we are done, and otherwise we repeat the same procedure as above until we find a node which is in U. Because the graph is finite and because of the condition (I2), such a node must exist. It is clear that the path constructed in this way has the stated properties too.

Lemma C.1.2. Let $\mathcal{T}= (V, E)$ be a TTT as in Definition 2.1, and let $\mathcal{T}$ have a unique source, $u_0$ . Let $U \subset V$ be non-empty and suppose that $\bar{U} = V \setminus U$ satisfies the conditions (I1) and (I2). Let $i, j \in V$ be two distinct nodes. With the roles of i and j switched if needed, the equality $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U$ implies the following:

  1. 1. We have $i \in \bar{U}$ .

  2. 2. We have $\mathrm{desc}(i) = \mathrm{Desc}(j)$ .

  3. 3. We have $\{i\} = \mathrm{pa}(j)$ .

  4. 4. There exists $u \in V$ such that $i, j \in \mathrm{pa}(u)$ .

  5. 5. For $k \in V \setminus \{i, j\}$ , the set $\mathrm{Desc}(k) \cap U$ is different from $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap V$ .

  6. 6. We have $|\mathrm{Desc}(i)\cap U|=|\mathrm{Desc}(j)\cap U|\geq 2$ .

Proof. Part 1. Note first that $\mathrm{Desc}(i) \cap U$ cannot be empty: otherwise, we would have $\mathrm{Desc}(i) \subseteq \bar{U}$ , but this is impossible, since $\mathrm{Desc}(i)$ contains at least one leaf node (a node without children), in contradiction to (I1).

Figure 8. A unique path on nodes $\{\bar{u}=v_1, v_2, v_3, \ldots, s = v_n\}$ under Lemma C.1.1. Each of the nodes $\bar{u}, v_2, \ldots, v_{n-1}$ belongs to $\bar{U}$ . Each of the nodes $v_2, \ldots, v_{n-1}, s$ has a unique parent. The node $\bar{u}\in \bar{U}$ may have parents as illustrated here, but then there is at least one tournament with respect to which it is a source node, e.g., $\tau_1$ . Let $v_2$ be the node with unique parent $\bar{u}$ in $\tau_1$ . When $v_2$ belongs to $\bar{U}$ , it must participate in at least one another tournament, say $\tau_2$ . In $\tau_2$ the node with unique parent $v_2$ is $v_3$ . Following this principle, the path continues until we find a node in U, which is $v_n = s$ in this case.

Since $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U$ and since this set is non-empty, the intersection $\mathrm{Desc}(i) \cap \mathrm{Desc}(j)$ is non-empty as well. In relation to Lemma 2.2-3, this means either $\mathrm{Desc}(i) \subseteq \mathrm{Desc}(j)$ or $\mathrm{Desc}(j) \subseteq \mathrm{Desc}(i)$ . In the remainder of the proof, we suppose $\mathrm{Desc}(j) \subseteq \mathrm{Desc}(i)$ . Then we must have $i \not\in \mathrm{Desc}(j)$ , since otherwise also $\mathrm{Desc}(i) \subseteq \mathrm{Desc}(j)$ and thus $\mathrm{Desc}(i) = \mathrm{Desc}(j)$ , which is impossible since i and j are distinct; see (19). From $\mathrm{Desc}(j)\subseteq\mathrm{Desc}(i)$ and $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U$ it follows that

\[ \mathrm{Desc}(i) \setminus \mathrm{Desc}(j) \subseteq \bar{U}. \]

Because $i \not\in \mathrm{Desc}(j)$ we get $i\in \bar{U}$ .

Parts 2–4. First we show that all elements in $\mathrm{Desc}(i) \setminus \mathrm{Desc}(j)$ are ancestors of j. Let $v \in \mathrm{Desc}(i) \setminus \mathrm{Desc}(j)$ . Because j and v are two different nodes, Lemma 2.2-3 implies that one of three cases must occur: $\mathrm{Desc}(v) \subset \mathrm{Desc}(j)$ , $\mathrm{Desc}(j) \subset \mathrm{Desc}(v)$ , or $\mathrm{Desc}(j) \cap \mathrm{Desc}(v) = \varnothing$ . The first case, $\mathrm{Desc}(v) \subset \mathrm{Desc}(j)$ , is impossible, since $v \not\in \mathrm{Desc}(j)$ . The third case, $\mathrm{Desc}(j) \cap \mathrm{Desc}(v) = \varnothing$ , is impossible too, since it would imply that $\mathrm{Desc}(v) \subseteq \mathrm{Desc}(i) \setminus \mathrm{Desc}(j) \subseteq \bar{U}$ , but this cannot happen since $\mathrm{Desc}(v)$ contains at least one leaf node while $\bar{U}$ does not contain any. Only the second case, $\mathrm{Desc}(j) \subset \mathrm{Desc}(v)$ , remains. As a consequence, v is an ancestor of j, and so all nodes of $\mathrm{Desc}(i)\setminus \mathrm{Desc}(j)$ are ancestors of j. By the proof of Part 1, we get

(31) \begin{equation} \mathrm{Desc}(i) \setminus \mathrm{Desc}(j) \subseteq \mathrm{an}(j) \cap \bar{U}. \end{equation}

Again let $v \in \mathrm{Desc}(i) \setminus \mathrm{Desc}(j)$ . We show that there exists a unique path from v to j and that j has only a single parent. Since $v \in \bar{U}$ , there exists, by Lemma C.1.1, a node $s(v) \in U$ such that there is only one directed path p(v, s(v)) from v to s(v); moreover, this path satisfies the properties 1 and 2 in the lemma. Necessarily,

\[ s(v) \in \mathrm{Desc}(v) \cap U \subseteq \mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U. \]

As s(v) is a descendant of j while j is a descendant of v, the path p(v, s(v)) passes through j. As a consequence, there is a unique path p(v, j) from v to j; otherwise, there would be more than one path from v to s(v). Moreover, by Lemma C.1.1-2, each node on the path p(v, s(v)), except possibly for v, has only one parent. In particular, j has only one parent.

Again take any $v \in \mathrm{Desc}(i) \setminus \mathrm{Desc}(j)$ . By (I1), v has at least two children. They cannot both be ancestors of j, since then there would be two paths from v to j, in contradiction to the previous paragraph. Let u be a child of v that is not an ancestor of j; then $u \in \mathrm{Desc}(j)$ because of (31). This means there are two paths from v to u: the edge (v, u) and a path passing through j. These paths must belong to the same tournament, as the skeleton of $\mathcal{T}$ is a block graph. But then v and j are adjacent, and thus v, which we already knew to be an ancestor of j, is actually a parent of j. But j has only one parent, and so the set $\mathrm{Desc}(i) \setminus \mathrm{Desc}(j)$ must be a singleton. As this set obviously contains node i, we get $v = i$ and thus $\mathrm{Desc}(i) \setminus \mathrm{Desc}(j) = \mathrm{pa}(j) = \{i\}$ . Since $\mathrm{Desc}(i) = \{i\} \cup \mathrm{desc}(i)$ and $\mathrm{Desc}(j) \subset \mathrm{Desc}(i)$ , it follows that $\mathrm{desc}(i) = \mathrm{Desc}(j) = \mathrm{Desc}(i) \setminus \{i\}$ .

Part 5. Let $k \in V \setminus \{i,j\}$ be such that $\mathrm{Desc}(k) \cap U = \mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U$ . By Part 3 we have $\{i\}=\mathrm{pa}(j)$ . From $\mathrm{Desc}(k) \cap U = \mathrm{Desc}(i) \cap U$ we can have either $\{k\}=\mathrm{pa}(i)$ or $\{i\}=\mathrm{pa}(k)$ , while from $\mathrm{Desc}(k) \cap U = \mathrm{Desc}(j) \cap U$ we have either $\{k\}=\mathrm{pa}(j)$ or $\{j\}=\mathrm{pa}(k)$ . Because already $\{i\}=\mathrm{pa}(j)$ , we cannot also have $\{k\}=\mathrm{pa}(j)$ , whence we must have $\{j\}=\mathrm{pa}(k)$ . But then $\{i\}=\mathrm{pa}(k)$ is impossible, so that necessarily $\{k\}=\mathrm{pa}(i)$ . From $\{i\}=\mathrm{pa}(j)$ , $\{j\}=\mathrm{pa}(k)$ , and $\{k\}=\mathrm{pa}(i)$ we get a cycle between the three nodes i, j, k, which is in contradiction to the definition of a DAG.

Part 6. We already know that $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U$ . We need to show that this set contains at least two elements. Consider the triple $\{i,j,u\}$ from Part 4 that forms a triangle with directed edges (i, j), (i, u), and (j, u). By Part 1 we have also $i\in \bar{U}$ . There are four cases, according to whether j and u belong to U or not:

  1. 1. If $j,u\in U$ , they are two distinct elements of $\mathrm{Desc}(j) \cap U$ .

  2. 2. If $j \in U$ but $u \in \bar{U}$ , then take $r \in \mathrm{Desc}(u) \cap U$ (which is non-empty by (I1): all leaf nodes in $\mathrm{Desc}(u)$ are in U), and note that j and r are two distinct elements in $\mathrm{Desc}(j) \cap U$ .

  3. 3. If $j \in \bar{U}$ and $u \in U$ , then, as in Lemma C.1.1, let $s \in U$ be such that there is a unique path p(j, s) from j to s, this path satisfying the properties 1–2 in the same lemma. Then u does not belong to that path (since $u \in U$ and u has at least two parents, i and j), so that s is different from u, and both are members of $\mathrm{Desc}(j) \cap U$ .

  4. 4. If $j, u \in \bar{U}$ , then by Lemma C.1.1 we can find nodes $s\in \mathrm{desc}(j)\cap U$ and $r\in \mathrm{desc}(u)\cap U$ with paths p(j, s) and p(u, r) which satisfy the properties in the lemma. Nodes s, r clearly belong to $\mathrm{Desc}(j)\cap U$ . Moreover, they are distinct: node u, having at least two parents, cannot belong to the unique path p(j, s) between j and s, while by construction, there is a directed path from j to r that passes through u.

Lemma C.1.3. Let X be a max-linear model with respect to a TTT with unique source. The coefficient $b_{vv}$ depends only on the edge weights of the tournament shared by node $v\in V$ and its parents, and it is given by

(32) \begin{equation} b_{vv}=1+\sum_{u\in \mathrm{pa}(v)} \sum_{p\in \pi(u,v) } (\!-\!1)^{|p|}c_p. \end{equation}

Proof. Consider the node v. If v has no parents we have $\mathrm{an}(v)=\varnothing$ and by (6) we have $b_{vv}=1$ . If v has at least one parent, then there is a tournament which contains the parents, say, $\tau=(V_{\tau}, E_{\tau})$ . Let the nodes in $\tau$ be labelled according to their in-/out-degree ordering in $\tau$ : the node with $|V_{\tau}|-1$ children in $\tau$ (the source of $\tau$ ) has index 1, the node with $|V_{\tau}|-2$ children in $\tau$ has index 2, and so on. We can partition the set $\mathrm{an}(v)$ into $\mathrm{An}(1)$ and $\mathrm{pa}(v)\setminus \{1\}$ . For $i\in \mathrm{An}(1)$ the shortest path from i to v necessarily passes through 1, so $b_{vi}=c_{p(i,1)}c_{1v}b_{ii}$ . Then, by (4) and (6), we have

(33) \begin{align} b_{vv}=1-\sum_{i\in \mathrm{an}(v)}b_{vi}= 1-\sum_{i\in \mathrm{An}(1)}c_{p(i,1)}c_{1v}b_{ii} -\sum_{i\in \mathrm{pa}(v)\setminus 1}b_{vi} = 1-c_{1v}-\sum_{i\in \mathrm{pa}(v)\setminus 1}c_{iv}b_{ii}. \end{align}

Let $C=\{c_{ij}\}_{i,j\in V_\tau, i<j}$ be the matrix of edge weights within $\tau$ : it is lower-triangular and has zero diagonal. Let $I_{m}$ denote the $m \times m$ identity matrix, write

$$\boldsymbol{b}=(1,b_{22},\ldots,b_{|V_\tau|,|V_\tau|})^\top$$

(a column vector), and let $1_{|V_\tau|}$ be a column vector of ones of length $|V_\tau|$ . Consider the system of linear equations

(34) \begin{equation} (I_{|V_{\tau}|}+C) \, \boldsymbol{b}=1_{|V_\tau|}. \end{equation}

For $v\geq 2$ the expression in (33) is equivalent to the vth equation in (34). A solution for $\boldsymbol{b}$ is

\[ \boldsymbol{b} = (I_{|V_{\tau}|}+C)^{-1}1_{|V_\tau|} = \left( I_{|V_{\tau}|}-(\!-\!C) \right)^{-1} \, 1_{|V_\tau|}. \]

From the equality

\[ \big(I_{|V_{\tau}|}+(\!-\!C)+(\!-\!C)^2+\cdots+(\!-\!C)^k\big) \big(I_{|V_{\tau}|}-(\!-\!C)\big) = I_{|V_{\tau}|}-(\!-\!C)^{k+1} \]

and the fact that, for a $|V_{\tau}|$ -square lower-triangular matrix with zero diagonal, powers of $k\geq |V_{\tau}|$ are zero matrices, we obtain

\[ \big(I_{|V_{\tau}|}+(\!-\!C)+(\!-\!C)^2+\cdots+(\!-\!C)^{|V_{\tau}|-1}\big) = \big(I_{|V_{\tau}|}-(\!-\!C)\big)^{-1}. \]

If the matrix on the left is denoted by K, we have as solution $\boldsymbol{b}=K1_{|V_{\tau}|}$ . For all $b_{vv}, v\geq 2,$ it can be shown that (32) equals the vvth element of this solution for $\boldsymbol{b}$ . For $b_{11}$ consider the corresponding solution when the tournament $\tau$ is the one which node 1 shares with its parents. If node 1 has no parents in the TTT, then we have the solution $b_{11}=1$ , which is indeed the case.

Lemma C.1.4. Let X follow a max-linear model as in Assumption 2.1 with respect to a TTT $\mathcal{T}$ consisting of a single tournament $\tau=(V,E)$ . If the node $v\in V$ has at least one parent, then the parameter vector $\theta=(c_e)_{e\in E} \in \mathring{\Theta}_*$ is not identifiable from the distribution of $X_{V \setminus v}$ . Specifically, there exists $\theta' = (c'_{\!\!e})_{e \in E} \in \mathring{\Theta}_*$ such that $\theta' \ne \theta$ and the distribution of $X_{V \setminus v}$ is the same under $\theta'$ as under $\theta$ .

Proof. Let $n = |V|$ denote the number of nodes. For convenience, rename the nodes to $V = \{1, \ldots, n\}$ in the ordering induced by the DAG; i.e., node i has $i-1$ parents, for $i \in V$ . The number of edges is $|E| = n(n-1)/2 \;=\!:\; m$ , and the parameter set $\mathring{\Theta}_*$ is an open subset of $\mathbb{R}^E$ . The distribution of X is max-linear and is given by

(35) \begin{equation} X_{j} = \bigvee_{i=1}^j b_{ji} Z_i, \qquad j \in V, \end{equation}

where $b_{11} = 1$ , $b_{jj} = 1 - \sum_{i=1}^{j-1} b_{ji}$ for $j \in V \setminus 1$ , and where the m coefficients $b = (b_{ji} \;:\; 1 \le i < j \le n)$ are determined by the edge parameters $\theta = (c_{ij} \;:\; 1 \le i < j \le n)$ .

Discarding the variable $X_v$ for some $v \in V \setminus 1$ yields the vector $X_{V \setminus v}$ , the distribution of which is determined by the $m - (v-1)$ coefficients $(b_{ji} \;:\; 1 \le i < j \le n, j \ne v)$ . For convenience, identify $\mathbb{R}^E$ with $\mathbb{R}^m$ . Let $\pi \;:\; \mathbb{R}^m \to \mathbb{R}^{m-v+1}$ be the projection that sends $x = (x_{ij} \;:\; 1 \le i < j \le m)$ to $\pi(x) = (x_{ij} \;:\; 1 \le i < j \le m, j \ne v)$ ; i.e., the effect of $\pi$ is to leave out the coordinates (i, v) with $i = 1,\ldots,v-1$ . By (35) with $j = v$ removed, the distribution of $X_{V \setminus v}$ is determined by $\pi(b)$ .

The max-linear coefficients b are a function of the edge parameters $\theta$ . Formally, there exists a map $f \;:\; \mathring{\Theta}_* \to \mathbb{R}^m$ such that

\[ b = f(\theta). \]

The function f can be reconstructed from (8) with $p(i,j) = (i, j)$ for $1 \le i < j \le n$ . Clearly, f is continuous. Since the parameter $\theta$ is identifiable from the distribution of X (Lemma 4.1), the function f is also injective, i.e., $\theta \ne \theta'$ implies $f(\theta) \ne f(\theta')$ . By the invariance-of-domain theorem [Reference Kulpa24], the image $f(\mathring{\Theta}_*)$ is therefore an open subset of $\mathbb{R}^m$ . But then, for any coefficient vector $b \in f(\mathring{\Theta}_*)$ , there exists another coefficient vector $b' \in f(\mathring{\Theta}_*)$ such that $b' \ne b$ but still $b_{ji} = b'_{\!\!ji}$ for all $1 \le i < j \le n$ and $j \ne v$ —in other words, such that $\pi(b) = \pi(b')$ . Since f is injective, the vectors b and b originate from different edge parameter vectors $\theta = f^{-1}(b)$ and $\theta' = f^{-1}(b')$ in $\mathring{\Theta}_*$ . But

\[ \pi(f(\theta)) = \pi(b) = \pi(b') = \pi(f(\theta')), \]

so that the edge weight vectors $\theta$ and $\theta'$ induce the same distribution of $X_{V \setminus v}$ . We conclude that the parameter $\theta$ is not identifiable from the distribution of $X_{V \setminus v}$ .

C.2. Proof of Proposition 4.1

In reading the proof, the following perspective may help. Recall the notation in Equations (12) and (20). Knowledge of the (simple max-stable) distribution of $X_U$ implies knowledge of its angular measure $H_U$ and thus of the unordered collection of pairs of atoms and masses $(\omega_r, \mu_r)$ for $r = 1, \ldots, s$ . The vector $X_U$ can itself be represented as a max-linear model with s independent factors and coefficient vectors $\beta_r = \mu_r \omega_r$ for $r = 1,\ldots,s$ . We first need to ensure that we can match those vectors $\beta_r$ in a unique way to the max-linear coefficient vectors $(b_{vi})_{v \in U}$ for $i \in V$ ; note that the coordinates v of those vectors are restricted to U. Next, from the latter vectors, we need to recover the edge coefficients $\theta = (c_e)_{e \in E}$ .

Proof of sufficiency (if) part of Proposition 4.1. We assume (I1) and (I2). In the first step of the proof we show that the angular measure of $X_U$ in (20) is composed of $|V|$ distinct atoms; furthermore, we show that we can associate every atom in $\{\omega_r \;:\; r=1, \ldots, |V|\}$ to some node $v \in V$ , and accordingly to one of the atoms $a_{i,U}=(b_{vi}/m_{i,U})_{v\in U}$ for $i\in V$ . For this, we focus on the nature of the atoms $\{a_{i,U}\}$ , given the conditions (I1) and (I2). As a consequence, the max-linear coefficient matrix $b_{U \times V} = (b_{vi})_{v \in U, i \in V}$ can be recovered from the distribution of $X_U$ . In Step 2, we show how to recover from this matrix the edge parameters $\theta = (c_e)_{e \in E}$ .

Step 1. Recall the representation $H_U = \sum_{i \in V} m_{i,U} \delta_{a_{i,U}}$ in (12) of the angular measure of $X_U$ . We shall show that all $|V|$ masses $m_{i,U}$ are positive and that all $|V|$ atoms $a_{i,U}$ are distinct. Moreover, we will show how to match the atoms to the nodes—that is, given an atom $\omega \in \{ \omega_r \;:\; r = 1, \ldots, |V|\}$ , how to identify the node $i \in V$ such that $\omega = a_{i,U}$ .

All $|V|$ vectors $\{a_{i,U}\}$ have positive masses $\{m_{i,U}\}$ . Recall $m_{i,U} = \sum_{v \in U} b_{vi}$ , and recall from (18) that $b_{vi} > 0$ if and only if $v \in \mathrm{Desc}(i)$ . It follows that $m_{i,U} = 0$ if and only if $\mathrm{Desc}(i) \cap U = \varnothing$ or, in other words, $\mathrm{Desc}(i) \subseteq \bar{U}$ . But this is impossible since $\mathrm{Desc}(i)$ contains at least one leaf node, that is, a node without children, and such a node belongs to U by (I1). We conclude that $m_{i,U} > 0$ for all $i \in V$ .

All $|V|$ vectors $\{a_{i,U}\}$ are distinct. By (22) it follows that whenever for two different nodes $i,j\in V$ we have $\mathrm{Desc}(i)\cap U\neq \mathrm{Desc}(j)\cap U$ , we can find two atoms, say $\omega'$ and $\omega''$ , within the set $\{\omega_r\}$ such that $\omega'=a_{i,U}$ and $\omega''=a_{j,U}$ . Because $\mathrm{Desc}(i)\cap U\neq \mathrm{Desc}(j)\cap U$ , we necessarily have $a_{i,U}\neq a_{j,U}$ . Suppose, however, that for two different nodes $i,j\in V$ we have $a_{i,U}=a_{j,U}$ . This means that for the uth and jth elements of these vectors we have

\[ a_{i,u;U}=a_{j,u;U} \Longleftrightarrow \frac{b_{ui}}{m_i}=\frac{b_{uj}}{m_j} \qquad \text{and} \qquad a_{i,j;U}=a_{j,j;U} \Longleftrightarrow \frac{b_{ji}}{m_i}=\frac{b_{jj}}{m_j} \]

Considering the ratios above, we should also have

(36) \begin{equation} \frac{a_{i,u;U}}{a_{i,j;U}}=\frac{a_{j,u;U}}{a_{j,j;U}} \qquad \Longleftrightarrow \qquad \frac{b_{ui}}{b_{ji}}=\frac{b_{uj}}{b_{jj}}. \end{equation}

Because $a_{i,U}=a_{j,U}$ , necessarily $ \mathrm{Desc}(i) \cap U = \mathrm{Desc}(j) \cap U. $ By Lemma C.1.2 there exists a node u such that one of the edge sets $\{(i,j), (i,u), (j,u)\}$ and $\{(j,i), (i,u), (j,u)\}$ is contained in E. Without loss of generality, suppose this holds for the first triple. Also, by Lemma 1.2 there cannot be another node k with $\mathrm{Desc}(k)\cap U=\mathrm{Desc}(i)\cap U=\mathrm{Desc}(j)\cap U$ .

Suppose first $j, u \in U$ . From the identities

(37) \begin{align} b_{ji} &= c_{ij} b_{ii}, & b_{ui} &= c_{iu} b_{ii}, & b_{uj} &= c_{ju} b_{jj}, \end{align}

and the criticality requirement

\[ c_{iu} > c_{ij} c_{ju}, \]

we have

(38) \begin{equation} \frac{b_{ui}}{b_{ji}}>\frac{b_{uj}}{b_{jj}}. \end{equation}

This inequality shows that (36) cannot happen; hence we cannot have $a_{i,U}=a_{j,U}$ . This means that all $|V|$ atoms are distinct.

Next suppose $j, u \in \bar{U}$ . By Lemma C.1.1, there exist nodes $j^{\prime}, u^{\prime}\in U$ such that $j^{\prime} \in \mathrm{desc}(j)$ and $u^{\prime} \in \mathrm{desc}(u)$ and the paths p(j, j ) and p(u, u ) satisfy the properties in the lemma. Because each node on the path p(j, j ), except possibly for j, has a unique parent, the path p(j, j ) cannot pass through u (which has parents i and j), and thus $j^{\prime} \neq u^{\prime}$ . As i is a parent of j, the shortest (and in fact the only) path from i to j is the one that concatenates the edge (i, j) with p(j, j ) (Lemma C.1.1). It follows that

(39) \begin{equation} b_{j^{\prime}i} = c_{p(j,j^{\prime})} c_{ij} b_{ii} \qquad \text{and} \qquad b_{j^{\prime}j} = c_{p(j,j^{\prime})} b_{jj}. \end{equation}

By a similar argument, the path that concatenates the edge (i, u) with the path p(u, u ) is the unique shortest path from i to u , while the path that concatenates (j, u) with p(u, u ) is the unique shortest path from j to u . It follows that

(40) \begin{equation} b_{u^{\prime}i} = c_{p(u,u^{\prime})} c_{iu} b_{ii} \qquad \text{and} \qquad b_{u^{\prime}j} = c_{p(u,u^{\prime})} c_{ju} b_{jj}. \end{equation}

Combining these equalities, $c_{iu}>c_{ij}c_{ju}$ implies that we should have

(41) \begin{equation} \frac{b_{u^{\prime}i}}{b_{u^{\prime}j}}>\frac{b_{j^{\prime}i}}{b_{j^{\prime}j}}. \end{equation}

However, from $a_{i,U}=a_{j,U}$ , we have for the u th and j th elements of these vectors

\[ a_{i,u^{\prime};U}=a_{j,u^{\prime};U} \Longleftrightarrow \frac{b_{u^{\prime}i}}{m_i}=\frac{b_{u^{\prime}j}}{m_j} \qquad \text{and} \qquad a_{i,j^{\prime};U}=a_{j,j^{\prime};U} \Longleftrightarrow \frac{b_{j^{\prime}i}}{m_i}=\frac{b_{j^{\prime}j}}{m_j} \]

Considering the ratios above, we should also have

(42) \begin{equation} \frac{a_{i,u^{\prime};U}}{a_{i,j^{\prime};U}}=\frac{a_{j,u^{\prime};U}}{a_{j,j^{\prime};U}} \qquad \Longleftrightarrow \qquad \frac{b_{u^{\prime}i}}{b_{u^{\prime}j}}=\frac{b_{j^{\prime}i}}{b_{j^{\prime}j}}. \end{equation}

Because of (41), the equalities in (42) cannot happen, so we cannot have $a_{i,U}=a_{j,U}$ .

The analysis of the cases $(j, s) \in U \times \bar{U}$ and $(j, s) \in \bar{U} \times U$ is similar. This shows that all $|V|$ vectors $a_{i,U}$ for $i \in V$ are different, and because of (21), all vectors $\omega_r$ for $r \in \{1,\ldots,|V|\}$ are different too.

Distinguishing all atoms $a_{i,U}$ with zeros in the same positions. For two different nodes $i, j \in V$ , the atoms $a_{i,U}$ and $a_{j,U}$ have the same supports $\{v\in U\;:\; a_{vi}>0\}=\{v\in U\;:\; a_{vj}>0\}$ when $\mathrm{Desc}(i) \cap U = \mathrm{Desc}(j)\cap U$ . By Lemma C.1.2-5 there cannot be any other node $k \in V \setminus \{i, j\}$ with the same descendants in U. In the representation

\[ H_{\theta,U} = \sum_{t \in V} m_{t,U} \delta_{a_{t,U}} = \sum_{r = 1}^{|V|} \mu_r \delta_{\omega_r} \]

there are thus exactly two atoms, $\omega$ and $\omega'$ , say, with the same indices of non-zero coordinates as $a_{i,U} = (b_{vi} / m_{i,U})_{v \in V}$ and $a_{j,U} = (b_{vj} / m_{j,U})_{v \in V}$ . The question is then how to know whether $\omega = a_{i,U}$ and $\omega' = a_{j,U}$ or vice versa, $\omega = a_{j,U}$ and $\omega' = a_{i,U}$ . Let $\mu$ and $\mu^{\prime}$ be the masses of $\omega$ and $\omega'$ , respectively, and consider the vectors $\beta = \mu \omega$ and $\beta' = \mu^{\prime} \omega'$ . An equivalent question is then how to identify $\beta$ and $\beta'$ with the two max-linear coefficient vectors $(b_{vi})_{v \in U}$ and $(b_{vj})_{v \in U}$ .

By Parts 2–4 of Lemma C.1.2, we can suppose that i is the unique parent of j and that i and j have a common child u. The analysis is now to be split up into different cases, according to whether j and u belong to U or not. Recall that $b_{vi} = c_{p(i,v)} b_{ii}$ and $b_{vj} = c_{p(j,v)} b_{jj}$ for $v \in \mathrm{Desc}(j) \subset \mathrm{Desc}(i)$ .

Suppose first that $j, u \in U$ . From (37) we deduce (38) thanks to the criticality assumption. In order to make the correct assignment of the two vectors $\beta=(\beta_v)_{v\in U}$ and $\beta'=(\beta'_v)_{v\in U}$ to the nodes i and j, we need to check the inequality (38). If $\beta_u / \beta_j > \beta'_u / \beta'_j$ , then we assign the vector $\beta$ to the node i and the vector $\beta'$ to the node j. If the inequality is reversed, we do the assignment the other way around.

Next suppose that $j, u \in \bar{U}$ . According to Lemma C.1.1, there exist nodes $j^{\prime}, u^{\prime}\in U$ such that there is a unique path from j to j and from u to u . By Lemma C.1.1, the paths from i to u and j , and from j to u , are

\begin{align*} p(i,u^{\prime}) &= \{(i,u)\} \cup p(u,u^{\prime}), & p(i,j^{\prime}) &= \{(i,j)\} \cup p(j,j^{\prime}), & p(j,u^{\prime}) &= \{(j,u)\} \cup p(u,u^{\prime}). \end{align*}

We have the same identities in (39) and (40), which, together with the criticality assumption, lead to the inequality (41). In order to make the correct assignment of the two vectors $\beta=(\beta_v)_{v\in U}$ and $\beta'=(\beta'_v)_{v\in U}$ , we proceed as above for the case $j,u\in U$ .

For the cases $(j,u) \in U \times \bar{U}$ and $(j, u) \in \bar{U} \times U$ , we combine methods from the cases $(j,u) \in U \times U$ and $(j,u) \in \bar{U} \times \bar{U}$ .

With this we finish the proof that we can learn the structure of every atom $\{\omega_r \;:\; r = 1, \ldots, |V|\}$ , i.e., that for every $r=1, \ldots, |V|$ we can identify the unique node $i\in V$ such that $\omega_r=a_{i,U}=(b_{vi}/m_{i,U})_{v\in U}$ . This means that we can also match every element $\beta$ in the collection of vectors $\{\beta_r \;:\; r = 1, \ldots, |V|\}$ to the correct node $i \in V$ such that $\beta = (b_{vi})_{v \in U}$ .

Step 2. In the previous step, we showed that the distribution of $X_U$ (together with knowledge of the graph structure) determines the max-linear coefficient matrix $b_{U \times V} = (b_{vi})_{v \in U, i \in V}$ . Here we show that this matrix suffices to reconstruct the vector of edge coefficients $\theta = (c_e)_{e \in E}$ .

If v is a child of i, then $p(i,v) = \{(i,v)\}$ and thus $b_{vi} = c_{iv} b_{ii}$ . If both i and v belong to U, then clearly we can identify $c_{iv} = b_{vi} / b_{ii}$ .

Let $i\in U$ with child $v \in \mathrm{ch}(i) \cap \bar{U}$ . By Lemma C.1.1 there exists a node $v'\in U$ such that there is a unique path from v to v . Write $v=v_1$ and $v'=v_n$ , and consider the node set $\{v_1, v_2, \ldots, v_n\}$ on that unique path. Using the fact that if a node $\ell$ has a single parent k, then $b_{\ell\ell} = 1 - c_{k \ell}$ (see (9)), we find the following identities for the max-linear coefficients $b_{v_nj}$ for $j \in \{v_n,\ldots,v_2,i\}$ :

(43) \begin{equation} \begin{split} b_{v_nv_n} &= 1 - c_{v_{n-1}v_n},\\[5pt] b_{v_nv_{n-1}} &= c_{p(v_{n-1},v_n)} b_{v_{n-1}v_{n-1}} = c_{v_{n-1} v_n} \left( 1-c_{v_{n-2}v_{n-1}} \right),\\[5pt] &\vdots\\[5pt] b_{v_nv_{2}} &= c_{p(v_{2},v_n)} b_{v_2v_2} = c_{v_2 v_3} \cdots c_{v_{n-1}v_n} \left( 1-c_{v_{1}v_{2}} \right),\\[5pt] b_{v_ni}&=c_{p(i,v_n)}b_{ii} =c_{iv_1}c_{v_1v_2}c_{v_2v_3}\cdots c_{v_{n-1}v_n}b_{ii}.\\[5pt] \end{split} \end{equation}

From the first equation we identify $c_{v_{n-1}v_n}$ , from the second $c_{v_{n-2}v_{n-1}}$ , and so on until we identify $c_{v_1v_2}$ from the penultimate equation. From the last equation we can identify $c_{iv_1}$ , because $b_{ii}$ is available from $b_{U \times V}$ in view of $i \in U$ .

The next step of the proof is to extract the edge parameters between a node with latent variable and its children.

Let $i\in \bar{U}$ . We will show that we can identify all edge weights $c_{ij}$ for $j \in \mathrm{ch}(i)$ . Because i belongs to $\bar{U}$ , it should have at least two children, say v and $\bar{v}$ . Take v to be a node whose only parent is i. Note that we can always find such a node by Lemma C.1.1. Let us first assume $v,\bar{v}\in U$ . Because v has only one parent, we have $b_{vv} = 1-c_{iv}$ and thus $c_{iv}=1-b_{vv}$ . We also know $b_{vi}=c_{iv}b_{ii}$ , and thus $b_{ii}=b_{vi}/c_{iv}=b_{vi}/(1-b_{vv})$ . From $b_{\bar{v} i}=c_{i\bar{v}}b_{ii}$ we deduce that $c_{i\bar{v}}=b_{\bar{v} i}(1-b_{vv})/b_{vi}$ . Hence we have identified all the edge parameters related to children of i which are observable, provided i has two or more children in U.

Next assume that both $v,\bar{v}\in \bar{U}$ . By Lemma C.1.1, there exists a node $v'\in \mathrm{desc}(v) \cap U$ such that there is a unique path from v to v , and it has the properties stated in the lemma. Let the sequence of nodes along which the path passes be denoted by $\{v_1=v, v_2, \ldots, v_n=v'\}$ . Using again the fact that for a node $\ell$ with single parent k, $b_{\ell\ell} = 1 - c_{k \ell}$ (see (9)), we find the same identities as in (43) for the max-linear coefficients $b_{v_nj}$ for $j \in \{v_n,\ldots,v_2\}$ . From the first equation we obtain $c_{v_{n-1} v_n} = 1 - b_{v_nv_n}$ , from the second equation $c_{v_{n-2}v_{n-1}} = 1 - b_{v_nv_{n-1}}/(1-b_{v_nv_n})$ , and so on until we obtain $c_{v_1 v_2}$ from the penultimate equation in (43). Because we assumed $\mathrm{pa}(v_1)=\{i\}$ , we have $b_{v_1v_1}=1-c_{iv_1}$ and thus

\begin{equation*} b_{v_nv_{1}} = c_{p(v_{1},v_n)} b_{v_1v_1} = c_{v_1 v_2} c_{v_2 v_3} \cdots c_{v_{n-1} v_n } \left( 1-c_{iv_1} \right), \end{equation*}

from which we identify $c_{iv_1}$ . Since $v_n \in U$ , all coefficients $b_{v_nj}$ for $j \in V$ are contained in the max-linear coefficient matrix $b_{U \times V}$ . The procedure just described thus allows us to compute all edge coefficients $c_{iv_1}, c_{v_1v_2}, \ldots, c_{v_{n-1}v_n}$ .

Because the path $p(v_1, v_n)$ satisfies the properties in Lemma C.1.1, and in view of the same lemma plus the fact that the only parent of $v_1=v$ is i, it follows that the path $\{(i,v_1)\} \cup p(v_1,v_n)$ is the unique path between i and $v_n$ . Hence we have also

(44) \begin{equation} b_{v_ni} = c_{p(i,{v_n})}b_{ii} =c_{iv_1}c_{v_1 v_2}c_{v_2 v_3}\cdots c_{v_{n-1} v_n}b_{ii}. \end{equation}

As $v_n \in U$ , the value of $b_{v_n i}$ is known from $b_{U \times V}$ . The edge coefficients on the right-hand side were expressed in terms of $b_{U \times V}$ in the previous paragraph. From there we obtain the value of $b_{ii}$ , which will be used next.

Now consider the node $\bar{v}$ , renamed to $\bar{v}_1$ , which was an arbitrary child of i. For $\bar{v}=\bar{v}_1$ too, we can find a node $\bar{v}_m \in U$ and a sequence of nodes $\{\bar{v}_2, \ldots, \bar{v}_m\}$ according to Lemma C.1.1 satisfying the properties stated there. For all $r \in \{2,\ldots,m\}$ , the node $\bar{v}_r$ has a unique parent, $\bar{v}_{r-1}$ . We have the following equalities, again using $b_{\ell\ell} = 1 - c_{k\ell}$ for a node $\ell$ with unique parent k:

\begin{equation*} \begin{split} b_{\bar{v}_m\bar{v}_m} &= 1-c_{\bar{v}_{m-1}\bar{v}_m}, \\[5pt] b_{\bar{v}_m\bar{v}_{m-1}} &= c_{p(\bar{v}_{m-1},\bar{v}_m)} b_{\bar{v}_{m-1}\bar{v}_{m-1}} = c_{\bar{v}_{m-1} \bar{v}_m} \, \left( 1-c_{\bar{v}_{m-2}\bar{v}_{m-1}} \right),\\[5pt] &\vdots\\[5pt] b_{\bar{v}_m\bar{v}_{2}} &= c_{p(\bar{v}_{2},\bar{v}_m)} b_{\bar{v}_2 \bar{v}_2} = c_{\bar{v}_2 \bar{v}_3} \cdots c_{\bar{v}_{m-1} \bar{v}_m} \left( 1-c_{\bar{v}_{1}\bar{v}_{2}} \right),\\[5pt] b_{\bar{v}_mi}&=c_{p(i,\bar{v}_m)}b_{ii} =c_{i\bar{v}_1}c_{\bar{v}_1 \bar{v}_2}c_{\bar{v}_2 \bar{v}_3}\cdots c_{\bar{v}_{m-1} \bar{v}_m}b_{ii} . \end{split} \end{equation*}

In the last equality we used that the path $\{(i,\bar{v}_1)\}\cup p(\bar{v}_1, \bar{v}_m)$ is the unique shortest path between i and $\bar{v}_m$ , because of Lemma C.1.1 and because $p(\bar{v}_1, \bar{v}_m)$ satisfies the properties 1–2 of the lemma. Since $\bar{v}_m \in U$ , the values of the left-hand sides in the previous equations are contained in the given matrix $b_{U \times V}$ . From the first equality we obtain $c_{\bar{v}_{m-1}\bar{v}_m}$ , from the second one $c_{\bar{v}_{m-2}\bar{v}_{m-1}}$ , and so on until we identify $c_{\bar{v}_{1}\bar{v}_{2}}$ from the penultimate equality, i.e., all edge parameters linked to $p(\bar{v}_1,\bar{v}_m)$ . In the last equation above we replace $b_{ii}$ with the expression derived from (44) and we obtain the parameter $c_{i\bar{v}_1}$ .

If some of the children of i are in U and others are in $\bar{U}$ , we apply a combination of the techniques used in the two cases described above—when two children are in U or when two children are in $\bar{U}$ .

This concludes the proof of the sufficiency (if) part.

Proof of necessity (only-if) part in Proposition 4.1. Let $\bar{U} \subset V$ be such that at least one of the two conditions (I1) and (I2) is not satisfied, and let $\theta = (c_e)_{e \in E} \in \mathring{\Theta}_*$ . We will show that there exists another parameter $\theta' = (c'_{\!\!e})_{e \in E} \in \mathring{\Theta}_*$ such that $\theta' \ne \theta$ but the distribution of $X_{V \setminus u}$ under $\theta'$ is the same as the one under $\theta$ .

We consider two cases: (1) (I2) does not hold, i.e., there exists $u \in \bar{U}$ which is not the source of any tournament in $\mathcal{T}$ ; (2) (I2) holds but not (I1), i.e., every $u \in \bar{U}$ is the source of some tournament in $\mathcal{T}$ , but there exists $u \in \bar{U}$ with fewer than two children.

Case 1: there exists $u \in \bar{U}$ which is not the source of any tournament in $\mathcal{T}$ . Then u belongs to only a single tournament, say $\tau = (V_\tau, E_\tau)$ ; indeed, a node that belongs to two different tournaments must be the source of at least one of them, because otherwise it would have parents from two different tournaments, yielding a forbidden v-structure.

Let $X_{V_{\tau}}$ be a max-linear model restricted to a single tournament, $\tau$ . The coefficients associated to $X_{V_{\tau}}$ , denoted by $b_{vi}^\tau$ for $v,i \in V_\tau$ , are determined by the weights of the edges $e \in E_\tau$ . By Lemma 1.4, we can modify the edge weights $c_e$ for $e \in E_\tau$ in such a way that the max-linear coefficients $b_{vi}^\tau$ for $v \in V_\tau \setminus u$ and $i \in V_\tau$ remain unaffected. Let $\tilde{c}_e$ for $e \in E_\tau$ denote such a modified vector of edge weights. Define $\theta' = (c'_{\!\!e})_{e \in E} \in \mathring{\Theta}_*$ by

\begin{align*} c'_{\!\!e} &= \begin{cases} \tilde{c}_e & \text{if $e \in E_\tau$,} \\[5pt] c_e & \text{if $e \in E \setminus E_\tau$.} \end{cases} \end{align*}

Then $\theta' \in \mathring{\Theta}_*$ too: indeed, $c'_{\!\!ii} > 0$ for all $i \in V$ by assumption on $\theta$ and by the fact the vector $(\tilde{c}_e \;:\; e \in E_\tau)$ satisfies $\tilde{c}_{ii} > 0$ for $i \in V_\tau$ . By construction, the distribution of $X_{V_{\tau} \setminus u}$ is the same under $\theta'$ as under $\theta$ .

We show that the distribution of $X_{V \setminus u}$ under $\theta'$ is the same as the one under $\theta$ . We proceed by induction on the number of tournaments.

If $\mathcal{T}$ consists of a single tournament, then $\mathcal{T} = \tau$ and there is nothing more to show.

So suppose $\mathcal{T}$ consists of $m \ge 2$ tournaments. The skeleton graph of $\mathcal{T}$ is a block graph and thus a decomposable graph. By the running intersection property, we can order the tournaments $\tau_1,\ldots,\tau_m$ with node sets $V_1,\ldots,V_m$ in such a way that $\tau_1 = \tau$ , the tournament containing u, and such that $V_m \cap (V_1 \cup \ldots \cup V_{m-1})$ is a singleton, say $\{s\}$ . Then $s \ne u$ since u belongs to only a single tournament.

Write $W = V_1 \cup \ldots \cup V_{m-1} \setminus u$ . The joint distribution of $X_{V \setminus u}$ can be factorized into two parts: first, the distribution of $X_{W}$ , and second, the conditional distribution of $X_{V_m \setminus s}$ given $X_{W}$ . It is sufficient to show that both parts remain the same when $\theta$ is replaced by $\theta'$ .

  • By the induction hypothesis, the distribution of $X_{W}$ is the same under $\theta'$ as under $\theta$ .

  • By the global Markov property (Proposition 3.2), the conditional distribution of $X_{V_m \setminus s}$ given $X_{W}$ is the same as the conditional distribution of $X_{V_m \setminus s}$ given $X_s$ . But the latter is determined by the joint distribution of $X_{V_m}$ , which, in turn, depends only on the weights of the edges e in $\tau_m$ . By construction, these edge weights are the same under $\theta$ as under $\theta'$ . It follows that the conditional distribution of $X_{V_m \setminus u}$ given $X_W$ is the same under $\theta'$ as under $\theta$ .

We conclude that the distribution of $X_{V \setminus u}$ is the same under $\theta'$ as under $\theta$ . Since $\theta \ne \theta'$ , the parameter is not identifiable.

Case 2: any $u \in \bar{U}$ is the source of some tournament in $\mathcal{T}$ , but there exists $u \in \bar{U}$ with fewer than two children. Any $u \in \bar{U}$ must have at least one child (a node without children cannot be the source of a tournament). But then there exists $u \in \bar{U}$ with exactly one child: $\mathrm{ch}(u) = \{w\}$ . The tournament of which u is the source can only consist of the nodes u and w and the edge (u, w). Now there are two subcases, depending on whether u has any parents or not.

Case 2.a: u has no parents. Then u is the source node of $\mathcal{T}$ with a single child w. Removing the node u yields the TTT $\mathcal{T}_{\setminus u} \;:\!=\; (V \setminus u, E \setminus \{(u, w)\})$ with single source w. The random vector $X_{V \setminus u}$ follows the recursive max-linear model (2) with respect to $\mathcal{T}_{\setminus u}$ . Its distribution is determined by the coefficients $c_e$ for $e \in E \setminus \{(u, w)\}$ . The value of $c_{uw}$ can thus be chosen arbitrarily in (0, 1) without affecting the distribution of $X_{V \setminus u}$ .

Case 2.b: u has parents. Any ancestor of w different from u must also be an ancestor of u, since otherwise there would be a v-structure at w; therefore,

\begin{equation*} \mathrm{An}(w) = \mathrm{an}(u) \cup \{u, w\}. \end{equation*}

Let $\lambda > 0$ and close enough to 1 (as specified below). Define $\theta' = (c'_{\!\!e})_{e \in E}$ by modifying the weights of edges adjacent to u: specifically,

\begin{align*} c'_{\!\!ju} &= \lambda c_{ju}, \qquad j \in \mathrm{pa}(u); \\[5pt] c'_{\!\!uw} &= \lambda^{-1} c_{uw}; \\[5pt] c'_{\!\!e} &= c_e, \qquad e \in E \setminus [\{(j,u) \;:\; j \in \mathrm{pa}(u)\} \cup \{(u, w)\}]. \end{align*}

In words, $\theta'$ coincides with $\theta$ for edges e that do not involve u, and $\theta' = \theta$ if and only if $\lambda = 1$ . Since the parameter space $\mathring{\Theta}_*$ is open, $\theta'$ belongs to $\mathring{\Theta}_*$ for $\lambda$ sufficiently close to 1. We claim that the distribution of $X_{V \setminus u}$ is invariant under $\lambda$ . Hence, for $\lambda$ different from but sufficiently close to 1, we have found a parameter $\theta' \neq \theta$ producing the same distribution of $X_{V \setminus u}$ as $\theta$ .

Under $\theta'$ , the random vector $X_{V \setminus u}$ follows the max-linear model

\[ X_v = \bigvee_{i \in \mathrm{An}(v)} b'_{\!\!vi} Z_i, \qquad v \in V \setminus u, \]

where $(Z_i)_{i \in V}$ is a vector of independent unit-Fréchet random variables and where the coefficients $b'_{\!\!vi}$ are given by Equations (4), (5), and (6) with $c_e$ replaced by $c'_{\!\!e}$ .

First, suppose $v \in V \setminus u$ is not a descendant of u. Then for any $i \in \mathrm{An}(v)$ , the coefficient $b'_{\!\!vi}$ is a function of edge weights $c'_{\!\!e}$ for edges $e \in E$ different from (u, w) and from (j, u) for $j \in \mathrm{pa}(u)$ . It follows that $c'_{\!\!e} = c_e$ for such edges, and thus $b'_{\!\!vi} = b_{vi}$ for $v \in V \setminus \mathrm{Desc}(u)$ and $i \in \mathrm{An}(v)$ .

Second, suppose $v \in \mathrm{desc}(u)$ . Then necessarily $v \in \mathrm{Desc}(w)$ too, and for any $i \in \mathrm{An}(u)$ , the path p(i, v) passes through (or arrives at) w. Furthermore, any ancestor of v not in $\mathrm{An}(w)$ is a descendant of w:

(45) \begin{equation} \mathrm{An}(v) = \mathrm{an}(u) \cup \{u, w\} \cup [\mathrm{desc}(w) \cap \mathrm{An}(v)]. \end{equation}

It follows that

\[ X_v = \bigvee_{i \in \mathrm{an}(u)} b'_{\!\!vi} Z_i \vee \left( b'_{\!\!vu} Z_u \vee b'_{\!\!vw} Z_w \right) \vee \bigvee_{i \in \mathrm{desc}(w) \cap \mathrm{An}(v)} b'_{\!\!vi} Z_i, \]

where the last term on the right-hand side is to be omitted if $v = w$ . We treat each of the three terms on the right-hand side separately.

  • For $i \in \mathrm{an}(u)$ , we have

    \[ b'_{\!\!vi} = c'_{\!\!ii} c'_{\!\!p(i,v)} = c'_{\!\!ii} c'_{\!\!p(i,u)} c'_{\!\!uw} c'_{\!\!p(w,v)}, \]
    where $c'_{\!\!p(w,v)} = 1$ if $v = w$ . The coefficients $c'_{\!\!ii}$ and $c'_{\!\!p(w, v)}$ only involve weights $c'_{\!\!e}$ for edges $e \in E$ different from (u, w) and (j, u) for $j \in \mathrm{pa}(u)$ ; it follows that $c'_{\!\!ii} = c_{ii}$ and $c'_{\!\!p(w,v)} = c_{p(w,v)}$ . Furthermore, given $i \in \mathrm{an}(u)$ , there exists $j \in \mathrm{pa}(u)$ such that p(i, u) passes through j right before reaching u (with $i = j$ if $i \in \mathrm{pa}(u)$ ), and then
    \[ c'_{\!\!p(i,u)} c'_{\!\!uw} = c'_{\!\!p(i,j)} c'_{\!\!ju} c'_{\!\!uw} = c'_{\!\!p(i,j)} (\lambda c_{ju}) (\lambda^{-1} c_{uw}). \]
    Since p(i, j) does not involve edges meeting u, we find that $c'_{\!\!p(i,j)} = c_{p(i,j)}$ , so that the above expression does not depend on $\lambda$ . We conclude that $b'_{\!\!vi} = b_{vi}$ for $i \in \mathrm{an}(u)$ .
  • If $v \in \mathrm{desc}(w)$ and $i \in \mathrm{desc}(w) \cap \mathrm{An}(v)$ , the coefficient $b'_{\!\!vi}$ is

    \[ b'_{\!\!vi} = c'_{\!\!ii} c'_{\!\!p(i,v)}. \]
    The path p(i, v) does not involve edges touching u, so $c_{p(i,v)'} = c_{p(i,v)}$ . By Lemma 1.3, the coefficient $c'_{\!\!ii} = b'_{\!\!ii}$ is a function of the edge weights $c'_{\!\!e}$ for edges e in the tournament shared by i and its parents. Since $i \in \mathrm{desc}(w)$ and since w is the only child of u, none of these edges touches u, and thus $c'_{\!\!e} = c_{e}$ for all such edges. It follows that $c'_{\!\!ii} = c_{ii}$ too. We conclude that $b'_{\!\!vi} = b_{vi}$ for $v \in \mathrm{desc}(w)$ and $i \in \mathrm{desc}(w) \cap \mathrm{An}(v)$ .
  • The random variable $b'_{\!\!vu} Z_u \vee b'_{\!\!vw} Z_w$ is independent of all other variables $Z_i$ for $i \in V \setminus \{u,w\}$ , and its distribution is equal to $(b'_{\!\!vu} + b'_{\!\!vw}) Z$ for Z a unit-Fréchet variable. We will show that $b'_{\!\!vu} + b'_{\!\!vw}$ does not depend on $\lambda$ . Since $1 = \sum_{i \in \mathrm{An}(v)} b'_{\!\!vi}$ , the partition (45) yields

    \[ b'_{\!\!vu} + b'_{\!\!vw} = 1 - \sum_{i \in \mathrm{an}(u)} b'_{\!\!vi} - \sum_{i \in \mathrm{desc}(w) \cap \mathrm{An}(v)} b'_{\!\!vi}. \]
    (The last sum on the right-hand side is zero if v is not a descendant of w.) In the two previous bullet points, we have already shown that the coefficients $b'_{\!\!vi}$ for i in $\mathrm{an}(u)$ or $\mathrm{desc}(w) \cap \mathrm{An}(v)$ do not depend on $\lambda$ . By the stated identity, it follows that the sum $b'_{\!\!vu} + b'_{\!\!vw}$ does not depend on $\lambda$ either.

We have thus shown that if $\bar{U}$ does not satisfy (I1)–(I2), then we can find $u \in \bar{U}$ such that the distribution of $X_{V \setminus u}$ is the same under $\theta'$ as under $\theta$ . As $\theta' \ne \theta$ by construction, this means that the parameter $\theta$ is not identifiable from the distribution of $X_{V \setminus u}$ . But as $U = V \setminus \bar{U} \subseteq V \setminus \{u\}$ , the parameter $\theta$ is not identifiable from the distribution of $X_U$ either. This confirms the necessity of (I1)–(I2) for the identifiability of $\theta$ from the distribution of $X_U$ .

Acknowledgements

The comments and suggestions provided by two anonymous reviewers have been greatly helpful in the preparation of the final version of the manuscript. S. Asenova is particularly grateful to Eugen Pircalabelu and Ngoc Tran for their availability and precious help.

Funding information

There are no funding bodies to thank in relation to the creation of this article.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Améndola, C., Hollering, B., Sullivant, S. and Tran, N. (2021). Markov equivalence of max-linear Bayesian networks. In Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence (Proceedings of Machine Learning Research 161), eds de Campos, C. and Maathuis, M. H., Research Press, ML, pp. 1746–1755.Google Scholar
Améndola, C., Klüppelberg, C., Lauritzen, S. and Tran, N. M. (2022). Conditional independence in max-linear Bayesian networks. Ann. Appl. Prob. 32, 145.CrossRefGoogle Scholar
Asadi, P., Davison, A. C. and Engelke, S. (2015). Extremes on river networks. Ann. Appl. Statist. 9, 20232050.10.1214/15-AOAS863CrossRefGoogle Scholar
Asenova, S., Mazo, G. and Segers, J. (2021). Inference on extremal dependence in the domain of attraction of a structured Hüsler–Reiss distribution motivated by a Markov tree with latent variables. Extremes 24, 461500.CrossRefGoogle Scholar
Asenova, S. and Segers, J. (2023). Extremes of Markov random fields on block graphs: max-stable limits and structured Hüsler–Reiss distributions. Extremes 26, 433468.CrossRefGoogle Scholar
Behtoei, A., Jannesari, M. and Taeri, B. (2010). A characterization of block graphs. Discrete Appl. Math. 158, 219221.CrossRefGoogle Scholar
Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. L. (2004). Statistics of Extremes: Theory and Applications. John Wiley, Hoboken, NJ.CrossRefGoogle Scholar
Buck, J. and Klüppelberg, C. (2021). Recursive max-linear models with propagating noise. Electron. J. Statist. 15, 47704822.10.1214/21-EJS1903CrossRefGoogle Scholar
De Haan, L. and Ferreira, A. (2007). Extreme Value Theory: An Introduction. Springer, New York.Google Scholar
Einmahl, J. H. J., Krajina, A. and Segers, J. (2012). An M-estimator for tail dependence in arbitrary dimensions. Ann. Statist. 40, 17641793.CrossRefGoogle Scholar
Engelke, S. and Hitz, A. S. (2020). Graphical models for extremes. J. R. Statist. Soc. B [Statist. Methodology] 82, 871932.CrossRefGoogle Scholar
Engelke, S. and Volgushev, S. (2022). Structure learning for extremal tree models. J. R. Statist. Soc. B [Statist. Methodology] 84, 20552087.CrossRefGoogle Scholar
Gissibl, N. and Klüppelberg, C. (2018). Max-linear models on directed acyclic graphs. Bernoulli 24, 26932720.CrossRefGoogle Scholar
Gissibl, N., Klüppelberg, C. and Lauritzen, S. (2021). Identifiability and estimation of recursive max-linear models. Scand. J. Statist. 48, 188211.CrossRefGoogle Scholar
Gissibl, N., Klüppelberg, C. and Otto, M. (2018). Tail dependence of recursive max-linear models with regularly varying noise variables. Econometrics Statist. 6, 149167.CrossRefGoogle Scholar
Gnecco, N., Meinshausen, N., Peters, J. and Engelke, S. (2021). Causal discovery in heavy-tailed models. Ann. Statist. 49, 17551778.10.1214/20-AOS2021CrossRefGoogle Scholar
Harary, F. (1963). A characterization of block-graphs. Canad. Math. Bull. 6, 16.CrossRefGoogle Scholar
Harary, F. and Moser, L. (1966). The theory of round robin tournaments. Amer. Math. Monthly 73, 231246.CrossRefGoogle Scholar
Hu, S., Peng, Z. and Segers, J. (2022). Modelling multivariate extreme value distributions via Markov trees. To appear in Scand. J. Statist.Google Scholar
Janssen, A. and Segers, J. (2014). Markov tail chains. J. Appl. Prob. 51, 11331153.CrossRefGoogle Scholar
Klüppelberg, C. and Krali, M. (2021). Estimating an extreme Bayesian network via scalings. J. Multivariate Anal. 181, 10461072.CrossRefGoogle Scholar
Klüppelberg, C. and Lauritzen, S. (2019). Bayesian networks for max-linear models. In Network Science, Springer, Cham, pp. 7997.CrossRefGoogle Scholar
Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.Google Scholar
Kulpa, W. (1998). Poincaré and domain invariance theorem. Acta Univ. Carolin. Math. Phys. 39, 127136.Google Scholar
Lauritzen, S. L. (1996). Graphical Models. Oxford University Press.CrossRefGoogle Scholar
Le, V. B. and Tuy, N. N. (2010). The square of a block graph. Discrete Math. 310, 734741.CrossRefGoogle Scholar
Resnick, S. (1987). Extreme Values, Regular Variation, and Point Processes. Springer, New York.CrossRefGoogle Scholar
Resnick, S. I. and Zeber, D. (2013). Asymptotics of Markov kernels and the tail chain. Adv. Appl. Prob. 45, 186213.CrossRefGoogle Scholar
Segers, J. (2007). Multivariate regular variation of heavy-tailed Markov chains. Preprint. Available at https://arxiv.org/abs/math/0701411v1.Google Scholar
Segers, J. (2020). Discussion on the paper by Engelke and Hitz (comment). J. R. Statist. Soc. B [Statist. Methodology] 82, 926.Google Scholar
Segers, J. (2020). One- versus multi-component regular variation and extremes of Markov trees. Adv. Appl. Prob. 52, 855878.CrossRefGoogle Scholar
Smith, R. L. (1992). The extremal index for a Markov chain. J. Appl. Prob. 29, 3745.CrossRefGoogle Scholar
Tran, N. M., Buck, J. and Klüppelberg, C. (2021). Causal discovery of a river network from its extremes. Preprint. Available at https://arxiv.org/abs/2102.06197v2.Google Scholar
Tran, N. M., Buck, J. and Klüppelberg, C. (2021). Estimating a latent tree for extremes. Preprint. Available at https://arxiv.org/abs/2102.06197v2.Google Scholar
Wright, S. (1934). The method of path coefficients. Ann. Math. Statist. 5, 161215.CrossRefGoogle Scholar
Yun, S. (1998). The extremal index of a higher-order stationary Markov chain. Ann. Appl. Prob. 8, 408437.CrossRefGoogle Scholar
Figure 0

Figure 1. A tree of four maximal transitive tournaments: $\tau_1$, $\tau_2$, $\tau_3$, and $\tau_4$. The skeleton graph is the same, but with the arrow heads removed. Node 3 is a separator node between tournaments $\tau_1$, $\tau_2$, and $\tau_3$, while node 7 is a separator node between tournaments $\tau_3$ and $\tau_4$. Nodes 1, 4, and 8 are source nodes, i.e., have no parents. The subgraph with node set $\{1, 3, 4\}$ is a v-structure: node 3 has non-adjacent parents 1 and 4. Other v-structures within the TTT are the subgraphs with node sets $\{3, 7, 8\}$ and $\{5, 7, 8\}$. Between any pair of distinct nodes there is a unique shortest trail; for instance, nodes 4 and 8 are connected by the trail passing through nodes 3 and 7. There is no undirected cycle involving several tournaments.

Figure 1

Figure 2. A tree of four maximal transitive tournaments. The skeleton graph is the same as the one in Figure 1, but the graph here has a single source, on node 4. We see that there are no longer any v-structures in the graph.

Figure 2

Figure 3. A TTT on four tournaments: $\tau_1$ on node set $\{1,2,3\}$, $\tau_2$ on $\{3,4\}$, $\tau_3$ on $\{3,5,6,7\}$, and $\tau_4$ on $\{8,7\}$. The variable exceeding a high threshold is at node 8. The set $E_u$ is composed of the thicker edges, which do not necessarily have the same directions in the original graph. On the nodes we have $A^{(8)}=(A_{8i}, i=1, \ldots, 7)$ and on the edges we have the multiplicative increments $(M_{e}, e\in E_u)$. Increments in different tournaments are mutually independent, while those in the same tournament are dependent according to Lemma 3.1.

Figure 3

Figure 4. In the following TTT, the nodes that are allowed to contain a latent variable while the edge parameters remain identifiable are 1,3,7. These are the only nodes that each satisfy both (I1) and (I2). For instance, if node 2 has unobserved variables, the parameters attached to edges (1, 2),(3, 2) are not identifiable. This is because the edge weights $c_{12},c_{32}$ take part only in products over paths ending at 2. But if $2\in\bar{U}$ these coefficients disappear from the atoms of the angular measure $a_{i,U}=(b_{vi}/m_{i,U}, v\in U)$ and accordingly from the collection of vectors $\mathcal{B}_{\theta,U}$.

Figure 4

Figure 5. The two possible configurations of the trails t(u, 1) and t(u, 2) when nodes 1, 2, 3 form a v-structure.

Figure 5

Figure 6. According to [2, Definition 5.4], a $*$-connected path between a and b relative to S is one of the five configurations above. In the last three graphs we have $s\in S$.

Figure 6

Figure 7. The left and right trails in the original graph $\mathcal{T}$ are the only possible trails that give rise to the middle path in the graph $\mathcal{D}_S^*$.

Figure 7

Figure 8. A unique path on nodes $\{\bar{u}=v_1, v_2, v_3, \ldots, s = v_n\}$ under Lemma C.1.1. Each of the nodes $\bar{u}, v_2, \ldots, v_{n-1}$ belongs to $\bar{U}$. Each of the nodes $v_2, \ldots, v_{n-1}, s$ has a unique parent. The node $\bar{u}\in \bar{U}$ may have parents as illustrated here, but then there is at least one tournament with respect to which it is a source node, e.g., $\tau_1$. Let $v_2$ be the node with unique parent $\bar{u}$ in $\tau_1$. When $v_2$ belongs to $\bar{U}$, it must participate in at least one another tournament, say $\tau_2$. In $\tau_2$ the node with unique parent $v_2$ is $v_3$. Following this principle, the path continues until we find a node in U, which is $v_n = s$ in this case.