1. Introduction
In this paper, we continue the line of research from [Reference Benjamini, Cohen and Shinkar2, Reference Rao and Shinkar15, Reference Johnston and Scott9] studying geometric similarities between different subsets of the hypercube $\mathcal{H}_n = \{0,1\}^n$ . Given a set $A \subseteq \mathcal{H}_n$ of size $\left |A\right | = 2^{n-1}$ and a bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ , we define the average stretch of $\phi$ as
where ${{\sf dist}}(x,y)$ is defined as the number of coordinates $i \in [n]$ such that $x_i \neq y_i$ , and the expectation is taken over a uniformly random $x,x^{\prime} \in \mathcal{H}_{n-1}$ that differ in exactly one coordinate.Footnote 1
The origin of this notion is motivated by the study of the complexity of distributions [Reference Goldreich, Goldwasser and Nussboim5, Reference Viola17, Reference Lovett and Viola12]. In this line of research, given a distribution $\mathcal{D}$ on $\mathcal{H}_n$ the goal is to find a mapping $h \,:\, \mathcal{H}_m \to \mathcal{H}_n$ such that if $U_m$ is the uniform distribution over $\mathcal{H}_m$ , then $h(U_m)$ is (close to) the distribution $\mathcal{D}$ , and each output bit $h_i$ of the function $h$ is computable efficiently (e.g., computable in ${AC}_0$ , i.e., by polynomial size circuits of constant depth).
Motivated by the goal of proving lower bounds for sampling from the uniform distribution on some set $A \subseteq \mathcal{H}_n$ , Lovett and Viola [Reference Lovett and Viola12] suggested the restricted problem of proving that no bijection from $\mathcal{H}_{n-1}$ to $A$ can be computed in ${AC}_0$ . Toward this goal they noted that it suffices to prove that any such bijection requires large average stretch. Indeed, by the structural results of [Reference Håstad7, Reference Boppana4, Reference Linial, Mansour and Nisan11] it is known that any such mapping $\phi$ that is computable by a polynomial size circuit of depth $d$ has ${\sf avgStretch}(\phi ) \lt \log (n)^{O(d)}$ , and hence proving that any bijection requires super-polylogarithmic average stretch implies that it cannot be computed in ${AC}_0$ . Proving a lower bound for sampling using this approach remains an open problem.
Studying this problem, [Reference Benjamini, Cohen and Shinkar2] have shown that if $n$ is odd, and $A_{{\sf maj}} \subseteq \mathcal{H}_n$ is the hamming ball of density $1/2$ , that is $A_{{\sf maj}} = \{x \in \mathcal{H}_n \,:\, \sum _i x_i \gt n/2\}$ , then there is a $O(1)$ -bi-Lipschitz mapping from $\mathcal{H}_{n-1}$ to $A_{{\sf maj}}$ , thus suggesting that proving a lower bound for a bijection from $\mathcal{H}_{n-1}$ to $A_{{\sf maj}}$ requires new ideas beyond the sensitivity-based structural results of [Reference Håstad7, Reference Boppana4, Reference Linial, Mansour and Nisan11] mentioned above. In [Reference Rao and Shinkar15] it has been shown that if a subset $A_{{\sf rand}}$ of density $1/2$ is chosen uniformly at random, then with high probability there is a bijection $\phi \,:\, \mathcal{H}_{n-1} \to A_{{\sf rand}}$ with ${\sf avgStretch}(\phi ) = O(1)$ . This result has recently been improved by Johnston and Scott [Reference Johnston and Scott9], who showed that for a random set $A_{{\sf rand}} \subseteq \mathcal{H}_n$ of density $1/2$ there exists a $O(1)$ -Lipschitz bijection from $\mathcal{H}_{n-1}$ to $A_{{\sf rand}}$ with high probability.
The following problem was posed in [Reference Benjamini, Cohen and Shinkar2], and repeated in [Reference Rao and Shinkar15, Reference Johnston and Scott9].
Problem 1.1. Exhibit a subset $A \subseteq \mathcal{H}_n$ of density $1/2$ such that any bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ has ${\sf avgStretch}(\phi ) = \omega (1)$ , or prove that no such subset exists.Footnote 2
Remark. Note that it is easy to construct a set of density $1/2$ such that any bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ must have a worst case stretch at least $n/2$ . For example, for odd $n$ consider the set $A = \{y \in \mathcal{H}_n \,:\, n/2 \lt \sum _i y_i \lt n\} \cup \{0^n\}$ . Then any bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ must map some point $x \in \mathcal{H}_{n-1}$ to $0^n$ , while all neighbours $x^{\prime}$ of $x$ are mapped to some $\phi (x^{\prime})$ with weight at least $n/2$ . Hence, the worst case stretch of $\phi$ is at least $n/2$ . In contrast, Problem 1.1 does not seem to have a non-trivial solution.
To rephrase Problem 1.1, we are interested in determining a tight upper bound on the $\sf avgStretch$ that holds uniformly for all sets $A \subseteq \mathcal{H}_n$ of density $1/2$ . Note that since the diameter of $\mathcal{H}_n$ is $n$ , for any set $A \subseteq \mathcal{H}_n$ of density $1/2$ and any bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ it holds that ${\sf avgStretch}(\phi ) \leq n$ . It is natural to ask how tight this bound is, that is, whether there exists $A \subseteq \mathcal{H}_n$ of density $1/2$ such that any bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ requires linear average stretch.
It is consistent with our current knowledge (though hard to believe) that for any set $A$ of density $1/2$ there is a mapping $\phi \,:\, \mathcal{H}_{n-1} \to A$ with ${\sf avgStretch}(\phi ) \leq 2$ . The strongest lower bound we are aware of is for the set $A_{\oplus } = \left\{x \in \mathcal{H}_n \,:\, \sum _{i}x_i \equiv 0 \pmod 2\right\}$ . Note that the distance between any two points in $A_{\oplus }$ is at least 2, and hence ${\sf avgStretch}(\phi ) \geq 2$ for any mapping $\phi \,:\, \mathcal{H}_{n-1} \to A_{\oplus }$ . Proving a lower bound strictly greater than 2 for any set $A$ is an open problem, and prior to this work we are not aware of any sublinear upper bounds that apply uniformly to all sets.
Most of the research on metric embedding, we are aware of, focuses on worst case stretch. For a survey on metric embeddings of finite spaces see [Reference Linial10]. There has been a lot of research on the question of embedding into the Boolean cube. For example, see [Reference Angel and Benjamini1, Reference Håstad, Leighton and Newman8] for work on embeddings between random subsets of the Boolean cube, and [Reference Graham6] for isometric embeddings of arbitrary graphs into the Boolean cube.
1.1. A uniform upper bound on the average stretch
We prove a non-trivial uniform upper bound on the average stretch of a mapping $\phi \,:\, \mathcal{H}_{n-1} \to A$ that applies to all sets $A \subseteq \mathcal{H}_n$ of density $1/2$ .
Theorem 1.2. For any set $A \subseteq \mathcal{H}_n$ of density $\mu _n(A) = 1/2$ , there exists a bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ such that ${\sf avgStretch}(\phi ) = O\left(\sqrt{n}\right)$ .
Toward this goal we prove a stronger result bounding the average transportation distance between two arbitrary sets of density $1/2$ . Specifically, we prove the following theorem.
Theorem 1.3. For any two sets $A,B \subseteq \mathcal{H}_n$ of density $\mu _n(A) = \mu _n(B) = 1/2$ , there exists a bijection $\phi \,:\, A \to B$ such that ${\mathbb E}\left[{{\sf dist}}\left(x,\phi (x)\right)\right] \leq \sqrt{2n}$ .
Note that Theorem 1.2 follows immediately from Theorem 1.3 by the following simple argument.
Proposition 1.4. Fix a bijection $\phi \,:\, \mathcal{H}_{n-1} \to A$ . Then ${\sf avgStretch}(\phi ) \leq 2{\mathbb E}_{x \in \mathcal{H}_{n-1}}[{{\sf dist}}$ $(x,\phi (x))] + 1$ .
Proof. Using the triangle inequality we have
as required.
1.2. Bounds on the average stretch for specific sets
Next, we study two specific subsets of $\mathcal{H}_n$ defined by Boolean functions commonly studied in the field “Analysis of Boolean functions” [Reference O’Donnell13]. Specifically, we study two monotone noise-sensitive functions: the recursive majority of 3’s, and the tribes function.
It was suggested in [Reference Benjamini, Cohen and Shinkar2] that the set of ones of these functions $A_f = f^{-1}(1)$ may be such that any mapping $\phi \,:\, \mathcal{H}_{n-1} \to A_f$ requires large $\sf avgStretch$ . We show that for the recursive majority function there is a mapping $\phi _{{\textsf{rec-maj}}} \,:\, \mathcal{H}_{n-1} \to{\textsf{rec-maj}}^{-1}(1)$ with ${\sf avgStretch}(\phi _{{\textsf{rec-maj}}}) = O(1)$ . For the tribes function we show a mapping $\phi _{{\sf tribes}} \,:\, \mathcal{H}_{n-1} \to{\sf tribes}^{-1}(1)$ with ${\sf avgStretch}\left(\phi _{{\sf tribes}}\right) = O(\!\log (n))$ . Below we formally define the functions, and discuss our results.
1.2.1. Recursive majority of 3’s
The recursive majority of 3’s function is defined as follows.
Definition 1.5. Let $k \in{\mathbb N}$ be a positive integer. Define the function recursive majority of 3’s ${\textsf{rec-maj}}_{k} \,:\, \mathcal{H}_{3^k} \to \{0,1\}$ as follows.
For $k = 1$ the function ${\textsf{rec-maj}}_{1}$ is the majority function on the $3$ input bits.
For $k \gt 1$ the function ${\textsf{rec-maj}}_{k} \,:\, \mathcal{H}_{3^k} \to \{0,1\}$ is defined recursively as follows. For each $x \in \mathcal{H}_{3^k}$ write $x = x^{(1)} \circ x^{(2)} \circ x^{(3)}$ , where each $x^{({r})} \in \mathcal{H}_{3^{k-1}}$ for each ${r} \in [3]$ . Then, ${\textsf{rec-maj}}_{k}(x) ={\sf maj}\left({\textsf{rec-maj}}_{k-1}\big(x^{(1)}\big),{\textsf{rec-maj}}_{k-1}\big(x^{(2)}\big),{\textsf{rec-maj}}_{k-1}\big(x^{(3)}\big)\right)$ .
Note that ${\textsf{rec-maj}}_{k}(x) = 1-{\textsf{rec-maj}}_{k}({\textbf{1}}-x)$ for all $x \in \mathcal{H}_n$ , and hence the density of the set ${A_{\textsf{rec-maj}_{k}}} = \left\{x \in \mathcal{H}_{n} \,:\,{\textsf{rec-maj}}_{k}(x) = 1\right\}$ is $\mu _n\left({A_{\textsf{rec-maj}_{k}}}\right) = 1/2$ . We prove the following result regarding the set $A_{\textsf{rec-maj}_{k}}$ .
Theorem 1.6. Let $k$ be a positive integer, let $n = 3^k$ , and let ${A_{\textsf{rec-maj}_{k}}} = \left\{x \in \mathcal{H}_{n} \,:\,{\textsf{rec-maj}}_{k}(x) = 1\right\}$ . There exists a mapping $\phi _{{\textsf{rec-maj}}_{k}} \,:\, \mathcal{H}_{n-1} \to{A_{\textsf{rec-maj}_{k}}}$ such that ${\sf avgStretch}\left(\phi _{{\textsf{rec-maj}}_{k}}\right) \leq 20$ .
1.2.2. The tribes function
The tribes function is defined as follows.
Definition 1.7. Let $s, w \in{\mathbb N}$ be two positive integers, and let $n = s \cdot w$ . The function ${\sf tribes} \,:\, \mathcal{H}_n \to \{0,1\}$ is defined as a DNF consisting of $s$ disjoint clauses of width $w$ .
That is, the function $\sf tribes$ partitions $n = sw$ inputs into $s$ disjoint ‘tribes’ each of size $w$ , and returns 1 if and only if at least one of the tribes ‘votes’ 1 unanimously.
It is clear that $\mathbb{P}_{x \in \mathcal{H}_n}[{\sf tribes}(x) = 1] = 1 - (1-2^{-w})^s$ . The interesting settings of parameters $w$ and $s$ are such that the function is close to balanced, that is, this probability is close to $1/2$ . Given $w \in{\mathbb N}$ , let $s = s_w = \ln (2) 2^w \pm \Theta (1)$ be the largest integer such that $1 - (1-2^{-w})^s \leq 1/2$ . For such choice of the parameters we have $\mathbb{P}_{x \in \mathcal{H}_n}[{\sf tribes}(x) = 1] = \frac{1}{2} - O\left (\frac{\log (n)}{n}\right )$ (see, e.g., [[Reference O’Donnell13], Section 4.2]).
Consider the set ${A_{{\sf tribes}}} = \left\{x \in \mathcal{H}_n \,:\,{\sf tribes}(x) = 1\right\}$ . Since the density of $A_{{\sf tribes}}$ is not necessarily equal to $1/2$ , we cannot talk about a bijection from $\mathcal{H}_{n-1}$ to $A_{{\sf tribes}}$ . In order to overcome this technical issue, let $A^*_{{\sf tribes}}$ be an arbitrary superset of $A_{{\sf tribes}}$ of density $1/2$ . We prove that there is a mapping $\phi _{\sf tribes}$ from $\mathcal{H}_{n-1}$ to $A^*_{{\sf tribes}}$ with average stretch ${\sf avgStretch}(\phi _{\sf tribes}) = O(\!\log (n))$ . In fact, we prove a stronger result, namely that the average transportation distance of $\phi _{\sf tribes}$ is $O(\!\log (n))$ .
Theorem 1.8. Let $w$ be a positive integer, and let $s$ be the largest integer such that $1 - (1-2^{-w})^s \leq 1/2$ . For $n=s \cdot w$ let ${\sf tribes} \,:\, \mathcal{H}_n \to \{0,1\}$ be defined as a DNF consisting of $s$ disjoint clauses of width $w$ . Let ${A_{{\sf tribes}}} = \left\{x \in \mathcal{H}_n \,:\,{\sf tribes}(x) = 1\right\}$ , and let ${A^*_{{\sf tribes}}} \subseteq \mathcal{H}_n$ be an arbitrary superset of $A_{{\sf tribes}}$ of density $\mu _n\left({A^*_{{\sf tribes}}}\right) = 1/2$ . Then, there exists a bijection $\phi _{{\sf tribes}} \,:\, \mathcal{H}_{n-1} \to{A^*_{{\sf tribes}}}$ such that ${\mathbb E}[{{\sf dist}}(x,\phi _{{\sf tribes}}(x))] = O(\!\log (n))$ . In particular, ${\sf avgStretch}(\phi _{\sf tribes}) = O(\!\log (n))$ .
2. Proof of Theorem 1.3
We provide two different proofs of Theorem 1.3. The first proof, in subsection 2.1 shows a slightly weaker bound of $O\left (\sqrt{n \ln (n)}\right )$ on the average stretch using the Gale-Shapley result on the stable marriage problem. The idea of using the stable marriage problem was suggested in [Reference Benjamini, Cohen and Shinkar2], and we implement this approach. Then, in subsection 2.2, we show the bound of $O\left(\sqrt{n}\right)$ by relating the average stretch of a mapping between two sets to known estimates on the Wasserstein distance on the hypercube.
2.1. Upper bound on the average transportation distance using stable marriage
Recall the Gale-Shapley theorem on the stable marriage problem. In the stable marriage problem we are given two sets of elements $A$ and $B$ each of size $N$ . For each element $a \in A$ (resp. $b \in B$ ) we have a ranking of the elements of $B$ (resp. $A$ ) given as an bijection $rk_a \,:\, A \to [N]$ $\left(rk_b \,:\, B \to [N]\right)$ representing the preferences of each $a$ (resp. $b$ ). A matching (or a bijection) $\phi \,:\, A \to B$ is said to be unstable if there are $a,a^{\prime} \in A$ , and $b,b^{\prime} \in B$ such that $\phi (a) = b^{\prime}$ , $\phi (a^{\prime}) = b$ , but $rk_a(b) \lt rk_a(b^{\prime})$ , and $rk_b(a) \lt rk_b(a^{\prime})$ ; that is, both $a$ and $b$ would prefer to be mapped to each other over their mappings given by $\phi$ . We say that a matching $\phi \,:\, A \to B$ is stable otherwise.
Theorem 2.1. (Gale-Shapley theorem) For any two sets $A$ and $B$ of equal size and any choice of rankings for each $a \in A$ and $b \in B$ there exists a stable matching $m \,:\, A \to B$ .
For the proof of Theorem 1.3 the sets $A$ and $B$ are subsets of $\mathcal{H}_n$ of density 1/2. We define the preferences between points based on the distances between them in $\mathcal{H}_n$ . That is, for each $a \in A$ we have $rk_a(b) \lt rk_a(b^{\prime})$ if and only if ${{\sf dist}}(a,b) \lt{{\sf dist}}(a,b^{\prime})$ with ties broken arbitrarily. Similarly, for each $b \in B$ we have $rk_b(a) \lt rk_b(a^{\prime})$ if and only if ${{\sf dist}}(a,b) \lt{{\sf dist}}(a^{\prime},b)$ with ties broken arbitrarily.
Let $\phi \,:\, A \to B$ be a bijection. We show that if ${\mathbb E}_{x \in A}[{{\sf dist}}(x, \phi (x))]\gt 3k$ for $k=\left \lceil \sqrt{n \ln (n)}\right \rceil$ , then $\phi$ is not a stable matching. Consider the set
Note that since the diameter of $\mathcal{H}_n$ is $n$ , and ${\mathbb E}_{x \in A}[{{\sf dist}}(x, \phi (x))] \gt 3k$ , it follows that $\mu _n(F) \gt \frac{k}{n}$ . Indeed, we have $3k \lt{\mathbb E}_{x \in A}[{{\sf dist}}(x, \phi (x))] \leq n \cdot \frac{\mu _n(F)}{\mu _n(A)} + k \cdot \left(1-\frac{\mu _n(F)}{\mu _n(A)}\right) \leq n \cdot \frac{\mu _n(F)}{\mu _n(A)} + k$ , and thus $\mu _n(F) \gt \frac{2k}{n} \cdot \mu _n(A)$ . Next, we use Talagrand’s concentration inequality.
Theorem 2.2. ([[Reference Talagrand16], Proposition 2.1.1]) Let $k \leq n$ be two positive integers, and let $F \subseteq \mathcal{H}_n$ . Let $F_{\geq k} = \left\{x \in \mathcal{H}_n \,:\,{{\sf dist}}(x,y) \geq k \ \forall y \in F \right\}$ be the set of all $x \in \mathcal{H}_n$ whose distance from $F$ is at least $k$ . Then $\mu _n\left(F_{\geq k}\right) \leq e^{-k^2/n}/ \mu _n(F)$ .
By Theorem 2.2 we have $\mu _n\left(F_{\geq k}\right) \leq e^{-k^2/n}/ \mu _n(F)$ , and hence, for $k = \left \lceil \sqrt{n \ln (n)}\right \rceil$ it holds that
In particular, since $\mu _n(\phi (F)) =\mu _n(F) \gt k/n \gt 1/k \geq \mu _n\left(F_{\geq k}\right)$ , there is some $b \in \phi (F)$ that does not belong to $F_{\geq k}$ . That is, there is some $a \in F$ and $b \in \phi (F)$ such that ${{\sf dist}}(a,b) \lt k$ . On the other hand, for $a^{\prime} = \phi ^{-1}(b)$ , by the definition of $F$ we have ${{\sf dist}}(a,\phi (a)) \geq k$ and ${{\sf dist}}(a^{\prime},b=$ $\phi (a^{\prime})) \geq k$ , and hence $\phi$ is not stable, as $a$ and $b$ prefer to be mapped to each other over their current matching. Therefore, in a stable matching ${\mathbb E}_{x \in A}[{{\sf dist}}(x, \phi (x))] \leq 3\left \lceil \sqrt{n \ln (n)}\right \rceil$ , and by the Gale-Shapley theorem such a matching does, indeed, exist.
2.2. Proof of Theorem 1.3 using transportation theory
Next we prove Theorem 1.3, by relating our problem to a known estimate on the Wasserstein distance between two measures on the hypercube. Recall that the $\ell _1$ -Wasserstein distance between two measures $\mu$ and $\nu$ on $\mathcal{H}_n$ is defined as
where the infimum is taken over all couplings $q$ of $\mu$ and $\nu$ , that is, $\sum _{y^{\prime}} q(x,y^{\prime}) = \mu (x)$ and $\sum _{x^{\prime}} q(x^{\prime},y) = \nu (y)$ for all $x,y \in \mathcal{H}_n$ . That is, we consider an optimal coupling $q$ of $\mu$ and $\nu$ minimizing ${\mathbb E}_{(x,y) \sim q}[{{\sf dist}}(x,y)]$ , the expected distance between $x$ and $y$ , where $x$ is distributed according to $\mu$ and $y$ is distributed according to $\nu$ .
We prove the theorem using the following two claims.
Claim 2.3. Let $\mu _A$ and $\mu _B$ be uniform measures over the sets $A$ and $B$ respectively. Then, there exists a bijection $\phi$ from $A$ to $B$ such that ${\mathbb E}[{{\sf dist}}(x,\phi (x))] = W_1(\mu _A,\mu _B)$ .
Claim 2.4. Let $\mu _A$ and $\mu _B$ be uniform measures over the sets $A$ and $B$ respectively. Then $W_1(\mu _A, \mu _B) \leq \sqrt{2n}$ .
Proof of Claim 2.3. Observe that any bijection $\phi$ from $A$ to $B$ naturally defines a coupling $q$ of $\mu _A$ and $\mu _B$ , defined as
Therefore, $W_1(\mu _A,\mu _B) \leq{\mathbb E}_{x \in A}[{{\sf dist}}(x,\phi (x))]$ .
For the other direction note that in the definition of $W_1$ we are looking for the infimum of the linear function $L(q) = \sum _{(x,y) \in A \times B}{{\sf dist}}(x,y) q(x,y)$ , where the infimum is taken over the Birkhoff polytope of all $n \times n$ doubly stochastic matrices. By the Birkhoff-von Neumann theorem [Reference Birkhoff3, Reference von Neumann18] this polytope is the convex hull whose extremal points are precisely the permutation matrices. The optimum is obtained on such an extremal point, and hence there exists a bijection $\phi$ from $A$ to $B$ such that $W_1(\mu _A,\mu _B) ={\mathbb E}[{{\sf dist}}(x,\phi (x))]$ .
Proof of Claim 2.4. The proof of the claim follows rather directly from the techniques in transportation theory (see [[Reference Lovett and Viola12], Section 3.4]). Specifically, using Definition 3.4.2 and combining Proposition 3.4.1, equation 3.4.42, and Proposition 3.4.3, where $\mathcal{X} = \{0,1\}$ , and $\mu$ is the uniform distribution on $\mathcal{X}$ we have the following theorem.
Theorem 2.5. Let $\nu$ be an arbitrary distribution on the discrete hypercube $\mathcal{H}_n$ , and let $\mu _n$ be the uniform distribution on $\mathcal{H}_n$ . Then
where $D(\nu \mid \mid \mu )$ is the Kullback-Leibler divergence defined as $D(\nu \mid \mid \mu ) = \sum _{x} \nu (x) \log\! \left(\frac{\nu (x)}{\mu (x)}\right)$ .
In particular, by letting $\nu = \mu _A$ be the uniform distribution over the set $A$ of cardinality $2^{n-1}$ , we have $D(\mu _A \mid \mid \mu _n) = \sum _{x \in A} \mu _A(x)\log \left (\frac{\mu _A(x)}{\mu _n(x)}\right ) = \sum _{x \in A} \frac{1}{|A|}\log (2) = 1$ , and hence $W_1(\mu _n,\mu _A) \leq \sqrt{\frac{1}{2} n \cdot D(\mu _n \mid \mid \nu )} = \sqrt{n/2}$ . Analogously, we have $W_1(\mu _n,\mu _B) \leq \sqrt{n/2}$ . Therefore, by the triangle inequality, we conclude that $W_1(\mu _A,\mu _B) \leq W_1(\mu _A,\mu _n) + W_1(\mu _n,\mu _B) \leq \sqrt{2n}$ , as required.
This completes the proof of Theorem 1.3.
3. Average stretch for recursive majority of 3’s
In this section we prove Theorem 1.6, showing a mapping from $\mathcal{H}_n$ to $A_{\textsf{rec-maj}_{k}}$ with constant average stretch. The key step in the proof is the following lemma.
Lemma 3.1. Let $k$ be a positive integer, and let $n = 3^k$ . There exists $f_k \,:\, \mathcal{H}_n \to{A_{\textsf{rec-maj}_{k}}}$ satisfying the following properties.
-
1. $f_k(x) =x$ for all $x \in{A_{\textsf{rec-maj}_{k}}}$ .
-
2. For each $x \in{A_{\textsf{rec-maj}_{k}}}$ there is a unique $z \in{Z_{\textsf{rec-maj}_{k}}}{\,:\!=\,} \mathcal{H}_n \setminus{A_{\textsf{rec-maj}_{k}}}$ such that $f_k(z) = x$ .
-
3. For every $i \in [n]$ we have ${\mathbb E}_{x \in \mathcal{H}_n}\left[{{\sf dist}}\left(f_k(x),f_k\left(x+e_i\right)\right) \right] \leq 10$ .
We postpone the proof of Lemma 3.1 for now, and show how it implies Theorem 1.6.
Proof of Theorem 1.6. Let $f_k$ be the mapping from Lemma 3.1. Define $\psi _0,\psi _1 \,:\, \mathcal{H}_{n-1} \to{A_{\textsf{rec-maj}_{k}}}$ as $\psi _b(x) = f_k(x \circ b)$ , where $x \circ b \in \mathcal{H}_n$ is the string obtained from $x$ by appending to it $b$ as the $n$ ’th coordinate.
The mappings $\psi _0,\psi _1$ naturally induce a bipartite graph $G = (V,E)$ , where $V = \mathcal{H}_{n-1} \cup{A_{\textsf{rec-maj}_{k}}}$ and $E = \left\{(x,\psi _b(x)) \,:\, x \in \mathcal{H}_{n-1}, b \in \{0,1\} \right\}$ , possibly, containing parallel edges. Note that by the first two items of Lemma 3.1 the graph $G$ is 2-regular. Indeed, for each $x \in \mathcal{H}_n$ the neighbours of $x$ are $N(x) = \{ \psi _0(x) = f_k(x \circ 0), \psi _1(x) = f_k(x \circ 1) \}$ , and for each $y \in{A_{\textsf{rec-maj}_{k}}}$ there is a unique $x \in{A_{\textsf{rec-maj}_{k}}}$ and a unique $z \in{Z_{\textsf{rec-maj}_{k}}}$ such that $f_k(x) = f_k(z) = 1$ , and hence $N(y) = \left\{ x_{[1,\dots,n-1]}, z_{[1,\dots,n-1]} \right\}$ .
Since the bipartite graph $G$ is 2-regular, it has a perfect matching. Let $\phi$ be the bijection from $\mathcal{H}_{n-1}$ to $A_{\textsf{rec-maj}_{k}}$ induced by a perfect matching in $G$ , and for each $x \in \mathcal{H}_n$ let $b_x \in \mathcal{H}_n$ be such that $\phi (x) = \psi _{b_x}(x)$ . We claim that ${\sf avgStretch}(\phi ) = O(1)$ . Let $x \sim x^{\prime}$ be uniformly random vertices in $\mathcal{H}_{n-1}$ that differ in exactly one coordinate, and let $r \in \{0,1\}$ be uniformly random. Then
For the first term, since $r$ is equal to $b_x$ with probability $1/2$ by Lemma 3.1 Item 3 we get that ${\mathbb E}\left[{{\sf dist}}\left(f_k\left(x \circ b_x\right), f_k(x \circ r)\right)\right]\leq 5$ . Analogously the third term is bounded by $5$ . In the second term we consider the expected distance between $f({\cdot})$ applied on inputs that differ in a random coordinate $i \in [n-1]$ , which is at most $10$ , again, by Lemma 3.1 Item 3. Therefore, ${\mathbb E}\left[{{\sf dist}}\left(\phi (x), \phi \left(x^{\prime}\right)\right)\right] \leq 20$ .
We return to the proof of Lemma 3.1.
Proof of Lemma 3.1. Define $f_k \,:\, \mathcal{H}_n \to{A_{\textsf{rec-maj}_{k}}}$ by induction on $k$ . For $k = 1$ define $f_1$ as
That is, $f_1$ acts as the identity map for all $x \in{A_{\textsf{rec-maj}_{1}}}$ , and maps all inputs in $Z_{\textsf{rec-maj}_{1}}$ to $A_{\textsf{rec-maj}_{1}}$ in a one-to-one way. Note that $f_1$ is a non-decreasing mapping, that is, $(f_1(x))_i \geq x_i$ for all $x \in \mathcal{H}_3$ and $i \in [3]$ .
For $k\gt 1$ define $f_k$ recursively using $f_{k-1}$ as follows. For each ${r} \in [3]$ , let $T_{r} = \left[({r}-1) \cdot 3^{k-1}+1, \ldots,{r} \cdot 3^{k-1}\right]$ be the $r$ ’th third of the interval $\big[3^k\big]$ . For $x \in \mathcal{H}_{3^k}$ , write $x = x^{(1)} \circ x^{(2)} \circ x^{(3)}$ , where $x^{({r})} = x_{T_{r}} \in \mathcal{H}_{3^{k-1}}$ is the $r$ ’th third of $x$ . Let $y = (y_1,y_2,y_3) \in \{0,1\}^3$ be defined as $y_{r} ={\textsf{rec-maj}}_{k-1}\left(x^{({r})}\right)$ , and let $w =(w_1,w_2,w_3) = f_1(y) \in \{0,1\}^3$ . Define
Finally, the mapping $f_k$ is defined as
That is, if ${\textsf{rec-maj}}_{k}(x) = 1$ then $w = y$ , and hence $f_k(x) = x$ , and otherwise, $f_{k-1}^{(r)}\big(x^{(r)}\big) \neq x^{(r)}$ for all $r \in [3]$ where $y_r = 0$ and $w_r = 1$ .
Next we prove that $f_k$ satisfies the properties stated in Lemma 3.1.
-
1. It is clear from the definition that if ${\textsf{rec-maj}}_{k}(x) = 1$ , then $w = y$ , and hence $f_k(x) = x$ .
-
2. Next, we prove by induction on $k$ that the restriction of $f_k$ to $Z_{\textsf{rec-maj}_{k}}$ induces a bijection. For $k = 1$ the statement clearly holds. For $k \gt 2$ suppose that the restriction of $f_{k-1}$ to $Z_{\textsf{rec-maj}_{k-1}}$ induces a bijection. We show that for every $x \in{A_{\textsf{rec-maj}_{k}}}$ the mapping $f_k$ has a preimage of $x$ in $Z_{\textsf{rec-maj}_{k}}$ . Write $x = x^{(1)} \circ x^{(2)} \circ x^{(3)}$ , where $x^{({r})} = x_{T_{r}} \in \mathcal{H}_{3^{k-1}}$ is the $r$ ’th third of $x$ . Let $w = (w_1,w_2,w_3)$ be defined as $w_{r} ={\textsf{rec-maj}}_{k}[k-1]\left(x^{({r})}\right)$ . Since $x \in{A_{\textsf{rec-maj}_{k}}}$ it follows that $w \in \{110,101,011,111\}$ . Let $y =(y_1,y_2,y_3) \in{Z_{\textsf{rec-maj}_{1}}}$ such that $f_1(y) = w$ . For each ${r} \in [3]$ such that $w_{r} = 1$ and $y_{r} = 0$ it must be the case that $x^{({r})} \in{A_{\textsf{rec-maj}_{k-1}}}$ , and hence, by the induction hypothesis, there is some $z^{({r})} \in{Z_{\textsf{rec-maj}_{k-1}}}$ such that $f_{k-1}\left(z^{({r})}\right) = x^{({r})}$ . For each ${r} \in [3]$ such that $y_{r} = w_{r}$ define $z^{({r})} = x^{({r})}$ . Since $y =(y_1,y_2,y_3) \in{Z_{\textsf{rec-maj}_{1}}}$ , it follows that $z = z^{(1)} \circ z^{(2)} \circ z^{(3)} \in{Z_{\textsf{rec-maj}_{k}}}$ . It is immediate by the construction that, indeed, $f_k(z) = x$ .
-
3. Fix $i \in \big[3^k\big]$ . In order to prove ${\mathbb E}[{{\sf dist}}(f_k(x),f_k(x+e_i)) ] = O(1)$ consider the following events.
\begin{align*} E_1 & = \left\{{\textsf{rec-maj}}_{k}(x) = 1 ={\textsf{rec-maj}}_{k}(x+e_i)\right\}, \\[3pt] E_2 & = \left\{{\textsf{rec-maj}}_{k}(x) = 0,{\textsf{rec-maj}}_{k}(x+e_i) = 1\right\}, \\[3pt] E_3 & = \left\{{\textsf{rec-maj}}_{k}(x) = 1,{\textsf{rec-maj}}_{k}(x+e_i) = 0\right\}, \\[3pt] E_4 & = \left\{{\textsf{rec-maj}}_{k}(x) = 0 ={\textsf{rec-maj}}_{k}(x+e_i)\right\}. \end{align*}Then ${\mathbb E}\!\left[{{\sf dist}}\!\left(f_k(x), f_k(x+e_i)\right)\right] = \sum _{j=1,2,3,4}{\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_j\right] \cdot \mathbb{P}\big[E_j\big]$ . The following three claims prove an upper bound on ${\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right)\right]$ .Claim 3.2. ${\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_1\right] = 1$ .
Claim 3.3. ${\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_2\right] \leq 2 \cdot 1.5^k$ .
Claim 3.4. ${\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_4\right] \cdot \mathbb{P}[E_4] \leq 8$ .
By symmetry, it is clear that ${\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_2\right] ={\mathbb E}\left[{{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) | E_3\right]$ . Note also that $\mathbb{P}\big[E_1\big] \lt 0.5$ and $\mathbb{P}\left[E_2 \cup E_3\right] = 2^{-k}$ .Footnote 3 Therefore, the claims above imply that
which completes the proof of Lemma 3.1.
Next we prove the above claims.
Proof of Claim 3.2. If $E_1$ holds then ${{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) ={{\sf dist}}(x,x+e_i) = 1$ .
Proof of Claim 3.3. We prove first that
The proof is by induction on $k$ . For $k = 1$ we have ${\mathbb E}\left[{{\sf dist}}\left(x,f_1(x)\right) |{\textsf{rec-maj}}_{k}(x) = 0\right] = 1.5$ as there are two inputs $x \in{Z_{\textsf{rec-maj}_{k}}}$ with ${{\sf dist}}\left(x,f_1(x)\right) = 1$ and two $x$ ’s in $Z_{\textsf{rec-maj}_{k}}$ with ${{\sf dist}}\left(x,f_1(x)\right) = 2$ . For $k \gt 1$ suppose that ${\mathbb E}\left[{{\sf dist}}(x,f_{k-1}(x)) |{\textsf{rec-maj}}_{k-1}(x)\right] = 1.5^{k-1}$ . Write each $x \in \mathcal{H}_{3^k}$ as $x = x^{(1)} \circ x^{(2)} \circ x^{(3)}$ , where $x^{({r})} = x_{T_{r}} \in \mathcal{H}_{3^{k-1}}$ is the $r$ ’th third of $x$ , and let $y = (y_1,y_2,y_3)$ be defined as $y_{r} ={\textsf{rec-maj}}_{k-1}\left(x^{({r})}\right)$ . Since ${\mathbb E}_{x \in H_{3^{k-1}}}\left[{\textsf{rec-maj}}_{k-1}(x)\right] = 0.5$ , it follows that for a random $z \in{Z_{\textsf{rec-maj}_{k}}}$ each $y \in \{000, 100, 010, 001\}$ happens with the same probability $1/4$ , and hence, using the induction hypothesis we get
which proves equation (1).
Next we prove thatFootnote 4
Note that equation (2) proves Claim 3.3. Indeed, if $E_2$ holds then using the triangle inequality we have ${{\sf dist}}\left(f_k(x), f_k(x+e_i)\right) \leq{{\sf dist}}(f_k(x), x) +{{\sf dist}}(x,x+e_i) +{{\sf dist}}(x+e_i, f_k(x+e_i)) ={{\sf dist}}(f_k(x), x) + 1$ , and hence
as required.
We prove equation (2) by induction on $k$ . For $k = 1$ equation (2) clearly holds. For the induction step let $k \gt 1$ . As in the definition of $f_k$ write each $x \in \mathcal{H}_{3^k}$ as $x = x^{(1)} \circ x^{(2)} \circ x^{(3)}$ , where $x^{({r})} = x_{T_{r}}$ is the $r$ ’th third of $x$ , and let $y = (y_1,y_2,y_3)$ be defined as $y_{r} ={\textsf{rec-maj}}_{k-1}\left(x^{({r})}\right)$ .
Let us suppose for concreteness that $i \in T_1$ . (The cases of $i \in T_2$ and $i \in T_3$ are handled similarly.) Note that if ${\textsf{rec-maj}}_{k}(x) = 0$ , ${\textsf{rec-maj}}_{k}(x+e_i) = 1$ , and $i \in T_1$ , then $y \in \{010,001\}$ . We consider each case separately.
-
1. Suppose that $y = 010$ . Then $w = f(y) =011$ , and hence $f(x)$ differs from $x$ only in $T_3$ . Taking the expectation over $x$ such that ${\textsf{rec-maj}}_{k}(x) = 0$ and ${\textsf{rec-maj}}_{k}(x+e_i) = 1$ by equation (1) we get ${\mathbb E}\left[{{\sf dist}}(x,f(x)) | E_2, y = 010\right] ={\mathbb E}\left[{{\sf dist}}\left(f_{k-1}\big(x^{(3)}\big), x^{(3)}\right) |{\textsf{rec-maj}}_{k-1}\right. \left.\big(x^{(3)}\big)=0\right] = 1.5^{k-1}$ .
-
2. If $y = 001$ , then $w = f_1(y) = 111$ , and $f(x)$ differs from $x$ only in $T_1 \cup T_2$ . Then
\begin{eqnarray*}{\mathbb E}\left[{{\sf dist}}(x,f(x)) | E_2, y = 001\right] &=&{\mathbb E}\left[{{\sf dist}}\big(f_{k-1}\big(x^{(1)}\big), x^{(1)}\big) | E_2, y = 001\right] \\ && +\,{\mathbb E}\left[{{\sf dist}}\big(f_{k-1}\big(x^{(2)}\big), x^{(2)}\big) | E_2, y = 001\right]. \end{eqnarray*}Denoting by $E^{\prime}_2$ the event that ${\textsf{rec-maj}}_{k-1}\big(x^{(1)}\big) = 0,{\textsf{rec-maj}}_{k-1}(x^{(1)}+e_i) = 1$ (i.e., the analogue of the event $E_2$ applied on ${\textsf{rec-maj}}_{k-1}$ ), we note that\begin{equation*} {\mathbb E}\left[{{\sf dist}}\big(f_{k-1}\big(x^{(1)}\big), x^{(1)}\big) | E_2, y = 001\right] = {\mathbb E}\left[ {{\sf dist}}\big(f_{k-1}\big(x^{(1)}\big), x^{(1)}\big) | E^{\prime}_2\right], \end{equation*}which is upper bounded by $\sum _{j=0}^{k-2} 1.5^j$ using the induction hypothesis. For the second term we have\begin{equation*} {\mathbb E}\left[{{\sf dist}}\big(f_{k-1}\big(x^{(2)}\big), x^{(2)}\big) | E_2, y = 001\right] = {\mathbb E}\left[ {{\sf dist}}\big(f_{k-1}\big(x^{(2)}\big), x^{(2)}\big) | {\textsf {rec-maj}}_{k-1}\big(x^{(2)}\big) = 0\right], \end{equation*}which is at most $1.5^{k-1}$ using equation (1). Therefore, for $y = 001$ we have\begin{equation*} {\mathbb E}[{{\sf dist}}(x,f(x)) | E_2, y = 001] \leq \sum _{j=0}^{k-2} 1.5^j + 1.5^{k-1}. \end{equation*}
Using the two cases for $y$ we get
This proves equation (2) for the case where $i \in T_1$ . The other two cases are handled similarly. This completes the proof of Claim 3.3.
Proof of Claim 3.4. For a coordinate $i \in [n]$ and for $0 \leq j \leq k$ let ${r} ={r}_i(j) \in{\mathbb N}$ be such that $i \in [({r}-1) \cdot 3^j + 1, \dots,{r} \cdot 3^j]$ , and denote the corresponding interval by $T_i(j) = [({r}-1) \cdot 3^j + 1, \dots,{r} \cdot 3^j]$ .Footnote 5 These are the coordinates used in the recursive definition of ${\textsf{rec-maj}}_{k}$ by the instance of ${\textsf{rec-maj}}_{j}$ that depends on the $i$ ’th coordinate.
For $x \in \mathcal{H}_n$ and $x^{\prime} = x + e_i$ , define ${\sf \nu }(x)$ as
That is, in the ternary tree defined by the computation of ${\textsf{rec-maj}}_{k}$ , ${\sf \nu }(x)$ is the lowest $j$ on the path from the $i$ ’th coordinate to the root where the computation of $x$ is equal to the computation of $x+e_i$ . Note that if $x$ is chosen uniformly from $\mathcal{H}_n$ , then
Below we show that by conditioning on $E_4$ and on the value of $\sf \nu$ we get
Indeed, suppose that $E_4$ holds. Assume without loss of generality that $x_i = 0$ , and let $x^{\prime} = x + e_i$ . Note that $f_k(x)$ and $f_k(x^{\prime})$ differ only on the coordinates in the interval $T_i({\sf \nu })$ . Let $w = x_{T_i({\sf \nu })}$ , and define $y = (y_1,y_2,y_3) \in \{0,1\}^3$ as $y_{r} ={\textsf{rec-maj}}_{\nu -1}(w^{({r})})$ for each ${r} \in [3]$ , where $w^{({r})}$ is the $r$ ’th third of $w$ . Similarly, let $w^{\prime} = x^{\prime}_{T_i({\sf \nu })}$ , and let $y^{\prime} = \left(y^{\prime}_1,y^{\prime}_2,y^{\prime}_3\right) \in \{0,1\}^3$ be defined as $y^{\prime}_{r} ={\textsf{rec-maj}}_{\nu -1}(w^{\prime({r})})$ for each ${r} \in [3]$ . This implies that
Furthermore, if ${\textsf{rec-maj}}_{\nu }\big(x_{T_i({\sf \nu })}\big) = 1$ $\big($ and ${\textsf{rec-maj}}_{\nu }\big(x^{\prime}_{T_i({\sf \nu })}\big) = 1\big)$ , then $f_k(x)_{T_i({\sf \nu })} = x_{T_i({\sf \nu })}$ , and thus ${{\sf dist}}\left(f_k(x), f_k(x^{\prime})\right) = 1$ .
Next we consider the case of ${\textsf{rec-maj}}_{k}\big(x_{T_i({\sf \nu })}\big) = 0$ $\big($ and ${\textsf{rec-maj}}_{k}\big(x^{\prime}_{T_i({\sf \nu })}\big) = 0\big)$ . Since $x_i = 0$ and $x^{\prime} = x + e_i$ , it must be that $y = 000$ and $y^{\prime}$ is a unit vector. Suppose first that $y^{\prime} = 100$ , that is, the coordinate $i$ belongs to the first third of $T_i({\sf \nu })$ . Write $w = w^{(1)} \circ w^{(2)} \circ w^{(3)}$ , where each $w^{(r)}$ is one third of $w$ . Analogously, write $w^{\prime} = w^{\prime(1)} \circ w^{\prime(2)} \circ w^{\prime(3)}$ , where each $w^{\prime(r)}$ one third of $w^{\prime}$ . Then, since $w^{\prime} = w + e_i$ we have
where the last inequality is by equation (1) and equation (2). Similarly,
where the last inequality is by equation (1). Therefore,
The cases of $y = 010$ and $001$ are handled similarly, and it is straightforward to verify that in these cases we also get the bound of $4 \cdot 1.5^{j-1}$ .
By combining equation (3) with equation (4) it follows that
This completes the proof of Claim 3.4.
4. Average stretch for tribes
In this section we prove Theorem 1.8, showing a mapping from $\mathcal{H}_n$ to $A^*_{{\sf tribes}}$ with $O(\!\log (n))$ average stretch. Let $\mu _{{\sf tribes}}^1$ be the uniform distribution on $A_{{\sf tribes}}$ , and let $\mu _{{\sf tribes}}^0$ be the uniform distribution on ${Z_{{\sf tribes}}} = \mathcal{H}_n \setminus{A_{{\sf tribes}}}$ . The proof consists of the following two claims.
Claim 4.1. For $\mu _{{\sf tribes}}^1$ and $\mu _{{\sf tribes}}^0$ as above it holds that
Next, let ${A^*_{{\sf tribes}}} \subseteq \mathcal{H}_n$ be an arbitrary superset of $A_{{\sf tribes}}$ of density $1/2$ , and let $\mu _{{\sf tribes}}^*$ be the uniform distribution on $A^*_{{\sf tribes}}$ .
Claim 4.2. Consider $\mathcal{H}_{n-1}$ as $\{ x \in \mathcal{H}_n \,:\, x_n = 0\}$ , and let $\mu _{n-1}$ be the uniform measure on $\mathcal{H}_{n-1}$ . Then,
By combining Claim 4.1 and Claim 4.2 we get that the average transportation distance between $\mathcal{H}_{n-1}$ and $A^*_{{\sf tribes}}$ is $W_1\left(\mu _{n-1}, \mu _{{\sf tribes}}^*\right) = O(\!\log (n))$ . By Claim 2.3 it follows that there exists $\phi _{{\sf tribes}} \,:\, \mathcal{H}_{n-1} \to{A^*_{{\sf tribes}}}$ such that ${\mathbb E}[{{\sf dist}}(x,\phi _{{\sf tribes}}(x))] = O(\!\log (n))$ , and using Proposition 1.4 we conclude that ${\sf avgStretch}(\phi _{{\sf tribes}}) = O(\!\log (n))$ . This completes the proof of Theorem 1.8.
Below we prove Claim 4.1 and Claim 4.2.
Proof of Claim 4.1. Let $\mathcal{D} = \mathcal{D}_w$ the uniform distribution over $\{0,1\}^{w} \setminus \{1^w\}$ , let $p=2^{-w}$ , and denote by $\mathcal{L} = \mathcal{L}_{w,s}$ the binomial distribution $\textrm{Bin}(p, s)$ conditioned on the outcome being positive. That is,
Note that $\mu _{{\sf tribes}}^0$ is equal to the product distribution $\mathcal{D}^s$ . Note also that in order to sample from the distribution $\mu _{{\sf tribes}}^1$ , we can first sample $\mathcal{L} \in \{1,\dots,s\}$ , then choose $\mathcal{L}$ random tribes that vote unanimously 1, and for the remaining $s - \mathcal{L}$ tribes sample their values in this tribe according to $\mathcal{D}$ .
We define a coupling $q_{\sf tribes}$ between $\mu _{{\sf tribes}}^0$ and $\mu _{{\sf tribes}}^1$ as follows. First sample $x$ according to $\mu _{{\sf tribes}}^0$ . Then, sample $\mathcal{L} \in \{1,\dots,s\}$ , choose $\mathcal{L}$ tribes $T \subseteq [s]$ uniformly, and let $S = \{ (t-1)w + j \,:\, t \in T, j \in [w]\}$ be all the coordinates participating in all tribes in $T$ . Define $y \in \mathcal{H}_n$ as $y_i = 1$ for all $i \in S$ , and $y_i = x_i$ for all $i \in [n] \setminus S$ . It is clear that $y$ is distributed according to $\mu _{{\sf tribes}}^1$ , and hence $q_{\sf tribes}$ is indeed a coupling between $\mu _{{\sf tribes}}^0$ and $\mu _{{\sf tribes}}^1$ .
We next show that ${\mathbb E}_{(x,y) \sim q_{\sf tribes}}[{{\sf dist}}(x,y)] = O(\!\log (n))$ . We have ${\mathbb E}_{(x,y) \sim q_{\sf tribes}}[{{\sf dist}}(x,y)] \leq{\mathbb E}[\mathcal{L} \cdot w]$ , and by the choice of parameters, we have $w \leq \log (n)$ and ${\mathbb E}[\mathcal{L}] = \frac{{\mathbb E}[\textrm{Bin}(2^{-w},s)]}{1 - \mathbb{P}[\textrm{Bin}(2^{-w},s) = 0]} = \frac{s \cdot 2^{-w}}{1 - 2^{-ws}}$ . By the choice of $s \leq \ln (2) 2^w + O(1)$ it follows that ${\mathbb E}[\mathcal{L}] = O(1)$ , and hence
This completes the proof of Claim 4.1.
Proof of Claim 4.2. We start by showing that
where $\mu _n$ is the uniform measure on $\mathcal{H}_n$ . Indeed, let $q_{\sf tribes}$ be a coupling between $\mu _{{\sf tribes}}^0$ and $\mu _{{\sf tribes}}^1$ . Define a coupling $q_n$ between $\mu _n$ and $\mu _{{\sf tribes}}^1$ as
It is straightforward to verify that $q_n$ is indeed a coupling between $\mu _n$ and $\mu _{{\sf tribes}}^1$ . Letting $q_{\sf tribes}$ be a coupling for which ${\mathbb E}_{(x,y) \sim q_{\sf tribes}}[{{\sf dist}}(x,y)] = W_1\left(\mu _{{\sf tribes}}^0, \mu _{{\sf tribes}}^1\right)$ we get
which proves equation (5).
Next, we show that
Indeed, let $q_n$ be a coupling between $\mu _n$ and $\mu _{{\sf tribes}}^1$ minimizing $\sum _{(x,y) \in \mathcal{H}_{n} \times{A_{{\sf tribes}}}}{{\sf dist}}(x,y) q_n(x,y)$ . Define a coupling $q_{n-1}$ between $\mu _{n-1}$ and $\mu _{{\sf tribes}}^1$ as
It is clear that $q_{n-1}$ is a coupling between $\mu _{n-1}$ and $\mu _{{\sf tribes}}^1$ . Next we prove equation (6).
which proves equation (6).
Next, we show that
In order to prove equation (7), let $\delta = \frac{1}{2} - \frac{\left |{A_{{\sf tribes}}}\right |}{2^n}$ . By the discussion in subsection 1.2.2 we have $\delta = O\left (\frac{\log (n)}{n}\right )$ . Then $\left |{A^*_{{\sf tribes}}} \setminus{A_{{\sf tribes}}}\right | = \delta \cdot 2^n$ . Let $q_{n-1}$ be a coupling between $\mu _{n-1}$ and $\mu _{{\sf tribes}}^1$ such that ${\mathbb E}_{(x,y) \sim q_{n-1}}[{{\sf dist}}(x,y)] = W_1\left(\mu _{n-1}, \mu _{{\sf tribes}}^{1}\right)$ . Define a coupling $q^*$ between $\mu _{n-1}$ and $\mu _{{\sf tribes}}^*$ as
It is straightforward to verify that $q^*$ is a coupling between $\mu _{n-1}$ and $\mu _{{\sf tribes}}^*$ . Next we prove equation (7).
Equation (7) follows from the fact that $\max\! ({{\sf dist}}(x,y)) \leq n$ and $\delta = O\left (\frac{\log (n)}{n}\right )$ .
By combining equations (5) to (7) we get $W_1\left(\mu _{n-1}, \mu _{{\sf tribes}}^*\right) \leq W_1\left(\mu _{{\sf tribes}}^0, \mu _{{\sf tribes}}^1\right) + O(\!\log (n))$ .
5. Concluding remarks and open problems
Uniform upper bound on the average stretch
We’ve shown a uniform upper bound of $O\left(\sqrt{n}\right)$ on the average transportation distance ${\mathbb E}[{{\sf dist}}(x,\phi (x))]$ from $\mathcal{H}_{n-1}$ to any set $A \subseteq \mathcal{H}_n$ of density $1/2$ , where $\mathcal{H}_{n-1}$ is treated as $\{ x \in \mathcal{H}_n \,:\, x_n = 0\}$ . This bound is tight up to a multiplicative constant. Indeed, it is not difficult to see that for any bijection $\phi$ from $\mathcal{H}_{n-1}$ to $A_{{\sf maj}} = \{x \in \mathcal{H}_n \,:\, \sum _i x_i \gt n/2\}$ (for odd $n$ ) the average transportation of $\phi$ is ${\mathbb E}[{{\sf dist}}(x,\phi (x))] \geq \Omega \left(\sqrt{n}\right)$ .
In contrast, we believe that the upper bound of $O\left(\sqrt{n}\right)$ on the average stretch is not tight, and it should be possible to improve it further.
Problem 5.1. Prove/disprove that for any set $A \subseteq \mathcal{H}_n$ of density $1/2$ there exists a mapping $\phi _A \,:\, \mathcal{H}_{n-1} \to A$ with ${\sf avgStretch}(\phi ) = o\left(\sqrt{n}\right)$ .
The tribes function
Considering our results about the tribes function, we make the following conjecture.
Conjecture 5.2. Let $w$ be a positive integer, and let $s$ be the largest integer such that $1 - (1-2^{-w})^s \leq 1/2$ . For $n=s \cdot w$ let ${\sf tribes} \,:\, \mathcal{H}_n \to \{0,1\}$ be defined as a DNF consisting of $s$ disjoint clauses of width $w$ , and let ${A_{{\sf tribes}}} = \left\{x \in \mathcal{H}_n \,:\,{\sf tribes}(x) = 1\right\}$ . There exists ${A^*_{{\sf tribes}}} \subseteq \mathcal{H}_n$ a superset of $A_{{\sf tribes}}$ of density $\mu _n\left({A^*_{{\sf tribes}}}\right) = 1/2$ such that $W_1(\mu _{n-1}, \mu _{{\sf tribes}}^*) = O(1)$ , where $\mu _{{\sf tribes}}^*$ is the uniform distribution on $A^*_{{\sf tribes}}$ .
As a first step toward the conjecture we proposed the following strengthening of Claim 4.1.
Problem 5.3. Let $\mu _{{\sf tribes}}^1$ be the uniform distribution on $A_{{\sf tribes}}$ , and let $\mu _{{\sf tribes}}^0$ be the uniform distribution on ${Z_{{\sf tribes}}} = \mathcal{H}_n \setminus{A_{{\sf tribes}}}$ . It is true that $W_1\left(\mu _{{\sf tribes}}^0, \mu _{{\sf tribes}}^1\right) = O(1)$ ?
A candidate set that requires large average stretch
We propose a candidate set $A^*$ for which we hope that any mapping from $\mathcal{H}_{n-1}$ to $A^*$ requires large average stretch. The set is defined as follows. Let $k^* \in [n]$ be the maximal $k$ such that $\binom{n}{\leq k} = \sum _{j=0}^k{\binom{n}{j}} \leq 2^{n-2}$ . Let $B^0_{1/4} =\left\{ x \in \mathcal{H}_n \,:\, \sum _{i \in [n]} x_i \leq k\right\}$ and $B^1_{1/4} =\left\{ x \in \mathcal{H}_n \,:\, \sum _{i \in [n]} x_i \geq n-k\right\}$ be two (disjoint) antipodal balls of radius $k^*$ , and let $C \subseteq \mathcal{H}_n \setminus \left(B^0_{1/4} \cup B^1_{1/4}\right)$ be an arbitrary set of size $\left |C\right | = 2^{n-1} - \left |B^0_{1/4} \cup B^1_{1/4}\right |$ . Define $A^* = B^0_{1/4} \cup B^1_{1/4} \cup C$ .
Conjecture 5.4. There is no bijection $\phi ^* \,:\, \mathcal{H}_{n-1} \to A^*$ with ${\sf avgStretch}(\phi ^*) = O(1)$ .
Acknowledgements
We are grateful to the anonymous referees for carefully reading the paper, and for their comments that greatly improved the presentation.