A SIMPLIFIED LOWER BOUND FOR IMPLICATIONAL LOGIC

EMIL JEŘÁBEK

doi:10.1017/bsl.2025.6

A SIMPLIFIED LOWER BOUND FOR IMPLICATIONAL LOGIC

Part of: General logic Proof theory and constructive mathematics

Published online by Cambridge University Press: 22 January 2025

EMIL JEŘÁBEK

Show author details

EMIL JEŘÁBEK*: Affiliation:
INSTITUTE OF MATHEMATICS CZECH ACADEMY OF SCIENCES ŽITNÁ 25, 115 67 PRAHA 1 CZECH REPUBLIC E-mail: [email protected] URL: https://users.math.cas.cz/~jerabek

Article contents

Abstract
Introduction
Preliminaries
An exponential lower bound
Extensions
Conclusion
Funding
Footnotes
References

Rights & Permissions

Abstract

We present a streamlined and simplified exponential lower bound on the length of proofs in intuitionistic implicational logic, adapted to Gordeev and Haeusler’s dag-like natural deduction.

Keywords

proof complexity intuitionistic logic natural deduction Kleene slash

MSC classification

Primary: 03F20: Complexity of proofs

Secondary: 03B20: Subsystems of classical logic (including intuitionistic logic)

Type: Article
Information: Bulletin of Symbolic Logic , Volume 31 , Issue 1 , March 2025 , pp. 53 - 87

DOI: https://doi.org/10.1017/bsl.2025.6 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of The Association for Symbolic Logic

1 Introduction

Frege proof systems (often called Hilbert-style systems outside proof complexity) are among the simplest and most natural proof systems for classical and nonclassical propositional logics. By results of Reckhow and Cook [Reference Cook and Reckhow7, Reference Reckhow31], all classical Frege systems are not only polynomially equivalent to each other, but also to natural deduction systems and to sequent calculi (with cut), which is further testimony to their robustness and fundamental status. Although it is commonly assumed for all classical propositional proof systems that some tautologies require exponentially large proofs, this has been proven so far only for relatively weak proof systems, such as constant-depth Frege, polynomial calculus, and cutting planes (see, e.g., [Reference Buss, Nordström, Biere, Heule, Maaren and Walsh5, Reference Krajíček25]). Unrestricted Frege systems are far beyond the reach of current techniques: nothing better is known than a linear lower bound on the number of proof lines and a quadratic bound on the overall proof size [Reference Samuel3, Reference Krajíček23].

Interestingly, the state of affairs is much better in nonclassical logics: Hrubeš [Reference Hrubeš13–Reference Hrubeš15] proved exponential lower bounds on the number of lines in Frege proofs for some modal logics and intuitionistic logic, which was generalized by Jeřábek [Reference Jeřábek19] to all transitive modal and superintuitionistic logics with unbounded branching, and by Jalali [Reference Jalali17] to substructural logics. Even though the techniques are based on variants of the feasible disjunction property (i.e., given a proof of $\varphi \lor \psi $ , we can decide in polynomial time which of $\varphi $ or $\psi $ is provable), and as such ostensibly require disjunction, Jeřábek [Reference Jeřábek20] showed that the superintuitionistic exponential lower bounds hold for a sequence of purely implicational intuitionistic tautologies.

In a series of papers, Gordeev and Haeusler [Reference Gordeev and Haeusler9–Reference Gordeev and Haeusler12] claim to prove that all intuitionistic implicational tautologies have polynomial-size proofs in a dag-like version of (Gentzen/Prawitz-style) natural deduction, which—if true—would imply $\mathrm {NP}=\mathrm {PSPACE}$ . These claims are wrong, as they contradict the above-mentioned exponential lower bounds on the length of proofs of implicational tautologies in intuitionistic proof systems. Unfortunately, this fact may not be so obvious to readers unfamiliar with nonclassical proof complexity literature, and in any event, the full proof of the lower bound requires tracking down multiple papers: the Frege lower bound for implicational tautologies in [Reference Jeřábek20] builds on a lower bound for unrestricted intuitionistic tautologies, as proved in either of [Reference Hrubeš14, Reference Hrubeš15, Reference Jeřábek19]; these in turn rely on an exponential lower bound on the size of monotone circuits separating the Clique–Colouring disjoint NP pair which—in view of an observation of Tardos [Reference Tardos32]—follows from Alon and Boppana [Reference Alon and Boppana1] (improving a superpolynomial lower bound by Razborov [Reference Razborov30]). Finally, one needs a polynomial simulation of natural deduction by Frege systems: this is originally due to Reckhow and Cook [Reference Cook and Reckhow7, Reference Reckhow31], but they state it for a sequent-style formulation of natural deduction rather than Prawitz-style, let alone the further variant introduced only recently by Gordeev and Haeusler; while it is clear to a proof complexity practitioner that the argument can be easily adapted to all such variants, this is, strictly speaking, not explicitly proved in any extant literature.

The primary goal of this paper is to give a simple direct proof of an exponential lower bound on the length of proofs of intuitionistic implicational tautologies in Gordeev and Haeusler’s dag-like natural deduction. The streamlined argument replaces all proof-theoretic components of the lower bound mentioned above (intuitionistic lower bound, reduction to implicational logic, simulation of natural deduction by Frege), thus it is self-contained except for the combinatorial component (i.e., a monotone circuit lower bound; to simplify our tautologies, we will use a lower bound by Hrubeš and Pudlák [Reference Hrubeš and Pudlák16] instead of Alon–Boppana). It is based on the efficient Kleene slash approach employed in [Reference Ferrari, Fiorentini and Fiorino8, Reference Jeřábek18, Reference Jeřábek19, Reference Mints and Kojevnikov27]. While we strive to keep the proof of the main result as simple as possible, we also briefly indicate how to generalize it to recover almost the full strength of the lower bound from [Reference Jeřábek20].

The intended audience of the paper is twofold:

• Readers with some general background in logic or computer science, but unfamiliar with proof complexity. For them, the paper gives a simple, yet detailed, exposition of an exponential lower bound on intuitionistic implicational logic so that they cannot be fooled by the fact that Gordeev and Haeusler’s claims have been published.
• Researchers in proof complexity—not necessarily interested in Gordeev and Haeusler’s claims—for whom the paper brings a new, much shorter proof of the known implicational lower bound, bypassing implicational translation of full intuitionistic logic. We stress that even though the proof system for which it is formulated is not traditional, it is quite natural, and anyway the lower bound also applies to the standard Frege system for implicational intuitionistic logic as the latter obviously embeds in dag-like natural deduction (up to subproofs of Frege axioms, it can be thought of as natural deduction without the $\to $ -introduction rule).

Our proof of the main lower bound does not involve any proof system other than dag-like natural deduction itself. However, for the sake of completeness, we include an appendix showing the equivalence of dag-like natural deduction with the standard intuitionistic implicational Frege system up to polynomial increase in proof size, as well as the polynomial equivalence of both systems to their tree-like versions (adapting the original result of Krajíček along the lines of [Reference Jeřábek20]). Thus, dag-like natural deduction does not offer any significant shortening of proofs compared to the conventional tree-like natural deduction. The appendix may be of independent interest as we took some effort to optimize the bounds.

An anonymous source pointed out that since Gordeev and Haeusler’s “horizontal compression” only changes the shape of the proof, but does not introduce any new formulas, their claims also contradict other well-known results in proof complexity, namely constant-depth Frege lower bounds such as Beame et al. [Reference Beame, Impagliazzo, Krajíček, Pitassi, Pudlák and Woods2]. For a sketch of the argument, take a sequence of tautologies exponentially hard for constant-depth proofs, such as the pigeonhole principle, and convert it to a sequence of (intuitionistically valid) implicational tautologies $\varphi _n$ of polynomial size and constant depth (measured, say, using the definition ). Each $\varphi _n$ has a cut-free sequent proof of polynomial height (and exponential size), which only involves formulas of constant depth by the subformula property, and thus translates to a natural deduction proof of polynomial height using only formulas of constant depth (with polynomially many distinct formulas). Gordeev and Haeusler’s claims imply that this can be compressed to a polynomial-size dag-like natural deduction proof using formulas of constant depth. The latter, however, can be converted to a polynomial-size (classical) constant-depth sequent or Frege proof, contradicting the hardness of the tautologies. We will not pursue this connection further in this paper, and leave the details to an interested reader.

The paper is organized as follows. In Section 2 we review the needed prerequisites such as dag-like natural deduction and monotone Boolean circuits. Section 3 is devoted to the proof of the main exponential lower bound; we discuss extensions of the lower bound in Section 4, and we conclude with a few remarks in Section 5. We present the equivalence of dag-like natural deduction to a Frege system in Appendix A, and the equivalence of both systems to their tree-like versions in Appendix B.

2 Preliminaries

The set $\mathrm {Form}$ of implicational formulas (or just formulas if no confusion arises) is the smallest set that includes the set of propositional variables (or atoms) $\mathrm {Var}=\{p_n:n\in \omega \}$ , and such that if $\varphi $ and $\psi $ are formulas, then $(\varphi \to \psi )$ is a formula. The size $|\varphi |$ of a formula $\varphi $ is the number of occurrences of variables and connectives in $\varphi $ , i.e., $|p_n|=1$ and $|(\varphi \to \psi )|=1+|\varphi |+|\psi |$ . We may omit outer brackets in $\varphi \to \psi $ , and we treat $\to $ as a right-associative operator so that, e.g., $\varphi \to \psi \to \chi \to \omega $ stands for $(\varphi \to (\psi \to (\chi \to \omega )))$ . (Despite these conventions, we may leave various redundant brackets in place to highlight the formula structure.) We will denote formulas with lower-case Greek letters, and for convenience, we will often use lower-case Latin letters (with indices and/or other decoration) other than $p_n$ for variables. We write $\vec p$ for a finite tuple of variables $\langle p_i:i<n\rangle $ , especially if n is immaterial; the notation $\varphi (\vec p)$ indicates that all variables occurring in $\varphi $ are among $\vec p$ .

Upper-case Greek letters will usually denote finite sets or sequences of formulas. Our indices generally start from $0$ ; in particular, $\langle \varphi _i:i<n\rangle $ , or more concisely $\langle \varphi _i\rangle _{i<n}$ , denotes the sequence $\langle \varphi _0,\dots ,\varphi _{n-1}\rangle $ (which is the empty sequence $\langle \rangle $ if $n=0$ ). The length of a sequence $\Gamma =\langle \varphi _i\rangle _{i<n}$ , denoted $|\Gamma |$ , is n, and the size of $\Gamma $ , denoted $\lVert \Gamma \rVert $ , is $\sum _{i<n}|\varphi _i|$ . If $\Gamma =\langle \varphi _i\rangle _{i<n}$ is a sequence of formulas and $\psi \in \mathrm {Form}$ , we introduce the abbreviation $\Gamma \to \psi $ for the formulaFootnote ¹

$$\begin{align*}\varphi_{n-1}\to\dots\to\varphi_1\to\varphi_0\to\psi.\end{align*}$$

Formally, $\Gamma \to \psi $ is defined by induction on n: $\langle \varphi _i\rangle _{i<0}\to \psi $ is $\psi $ and $\langle \varphi _i\rangle _{i<n+1}\to \psi $ is $\varphi _n\to \langle \varphi _i\rangle _{i<n}\to \psi $ .

A substitution is a mapping $\sigma \colon \mathrm {Form}\to \mathrm {Form}$ such that $\sigma (\varphi \to \psi )=(\sigma (\varphi )\to \sigma (\psi ))$ for all $\varphi ,\psi \in \mathrm {Form}$ . If $\Gamma \subseteq \mathrm {Form}$ , we write $\sigma (\Gamma )=\{\sigma (\varphi ):\varphi \in \Gamma \}$ .

The intuitionistic implicational logic $\mathbf {IPC}_\to $ is defined by its consequence relation ${\vdash }\subseteq \mathcal P(\mathrm {Form})\times \mathrm {Form}$ : we put $\Gamma \vdash \varphi $ iff $\varphi $ belongs to the smallest subset of $\mathrm {Form}$ that is closed under the rule of modus ponens

$$\begin{align*}\varphi,\varphi\to\psi\mathrel/\psi,\end{align*}$$

and includes $\Gamma $ and the logical axioms

$$ \begin{align*} \varphi&\to\psi\to\varphi\\ (\varphi\to\psi\to\chi)&\to(\varphi\to\psi)\to(\varphi\to\chi) \end{align*} $$

for $\varphi ,\psi ,\chi \in \mathrm {Form}$ . As is conventional, we omit braces around formulas on the left-hand side of $\vdash $ , and write commas in place of $\cup $ , so that, e.g., $\Gamma ,\varphi ,\psi \vdash \chi $ stands for $\Gamma \cup \{\varphi ,\psi \}\vdash \chi $ ; we may also coerce finite sequences $\Gamma $ to sets. We write $\vdash \varphi $ for $\varnothing \vdash \varphi $ , in which case we say that $\varphi $ is an intuitionistic implicational tautology, or $\mathbf {IPC}_\to $ tautology for short.

Lemma 2.1 (deduction theorem).

Let $\Pi \subseteq \mathrm {Form}$ , $\varphi \in \mathrm {Form}$ , and let $\Gamma $ be a finite sequence of formulas. Then

$$\begin{align*}\Pi,\Gamma\vdash\varphi\iff\Pi\vdash\Gamma\to\varphi.\end{align*}$$

A Kripke model is a structure $\langle W,{\le },{\vDash }\rangle $ , where $\le $ is a partial order on W, and ${\vDash }\subseteq W\times \mathrm {Form}$ satisfies

$$ \begin{align*} x\vDash\varphi&\,\implies\forall y\ge x\:y\vDash\varphi,\\ x\vDash\varphi\to\psi&\iff\forall y\ge x\:(y\vDash\varphi\implies y\vDash\psi) \end{align*} $$

for all $x\in W$ and $\varphi ,\psi \in \mathrm {Form}$ . Unwinding the definitions, we see that for any sequence $\Gamma =\langle \varphi _i\rangle _{i<n}$ ,

$$\begin{align*}x\vDash\Gamma\to\psi\iff\forall y\ge x\:\big((\forall i<n\:y\vDash\varphi_i)\implies y\vDash\psi\big).\end{align*}$$

A formula $\varphi $ holds in $\langle W,{\le },{\vDash }\rangle $ if $x\vDash \varphi $ for all $x\in W$ .

Intuitionistic logic is complete w.r.t. Kripke semantics, even if we only consider finite frames (see e.g., [Reference Chagrov and Zakharyaschev6, Reference Troelstra and van Dalen33]):

Theorem 2.2 (finite model property).

A formula is an $\mathbf {IPC}_\to $ tautology if and only if it holds in all finite Kripke models.

Let us now present Gordeev and Haeusler’s dag-like natural deduction calculus $\mathrm {NM}_\to $ based on [Reference Gordeev and Haeusler10]. An $\mathrm {NM}_\to $ -proof skeleton is a finite directed acyclic graph (dag) $\langle V,E\rangle $ with a unique node of out-degree $0$ , called the root, and with all nodes having in-degree at most $2$ ; nodes of in-degree $0$ , $1$ , and $2$ are called leaves (assumptions), $\mathrm {({\to }I)}$ -nodes, and $\mathrm {({\to }E)}$ -nodes, respectively. If $\langle u,v\rangle \in E$ , then u is a premise Footnote ² of v. A thread is a directed path starting from a leaf; a thread is maximal if it ends in the root. An $\mathrm {NM}_\to $ -derivation $\langle V,E,\gamma \rangle $ is an $\mathrm {NM}_\to $ -proof skeleton $\langle V,E\rangle $ endowed with a vertex labelling $\gamma =\langle \gamma _v:v\in V\rangle $ with $\gamma _v\in \mathrm {Form}$ , such that for all $v\in V$ :

• if v is an $\mathrm {({\to }I)}$ -node, it is labelled with an implication $\alpha \to \beta $ such that the premise of v is labelled with $\beta $ ;
• if v is an $\mathrm {({\to }E)}$ -node, there are formulas $\alpha ,\beta $ such that v is labelled with $\beta $ , and the two premises of v are labelled with $\alpha $ and $\alpha \to \beta $ , respectively.

A thread with leaf v is discharged if it contains an $\mathrm {({\to }I)}$ -node labelled with $\alpha \to \beta $ where $\alpha =\gamma _v$ . Let $\varphi \in \mathrm {Form}$ and $\Gamma \subseteq \mathrm {Form}$ . An $\mathrm {NM}_\to $ -derivation is an $\mathrm {NM}_\to $ -derivation of $\varphi $ from $\Gamma $ if the root is labelled $\varphi $ and the leaves of all undischarged maximal threads are labelled with elements of $\Gamma $ . An $\mathrm {NM}_\to $ -proof of $\varphi $ is an $\mathrm {NM}_\to $ -derivation of $\varphi $ from $\varnothing $ . The number of lines of an $\mathrm {NM}_\to $ -derivation $\Pi =\langle V,E,\gamma \rangle $ is $|V|$ , and the size of $\Pi $ is $\lVert \Pi \rVert =\sum _{v\in V}|\gamma _v|$ .

It may be difficult to verify the condition on discharging maximal threads directly from the definition. As observed in [Reference Gordeev and Haeusler10], it can be checked efficiently as follows. Given an $\mathrm {NM}_\to $ -derivation $\Pi =\langle V,E,\gamma \rangle $ , we define for each $v\in V$ a set $A_v\subseteq \{\gamma _u:u \text { is a leaf}\}$ by well-founded recursion:

$$\begin{align*}A_v=\begin{cases} \{\gamma_v\},&v\text{ is a leaf,}\\ A_u\smallsetminus\{\alpha\},&v\text{ is an }\mathrm{({\to}I)}\text{-node with premise }u\text{ and }\gamma_v=\alpha\to\beta,\\ A_{u_0}\cup A_{u_1},&v\text{ is an }\mathrm{({\to}E)}\text{-node with premises }u_0\text{ and }u_1. \end{cases}\end{align*}$$

Note that given $\Pi $ , we can compute $\langle A_v:v\in V\rangle $ in polynomial time.

Lemma 2.3 [Reference Gordeev and Haeusler10].

An $\mathrm {NM}_\to \!$ -derivation $\langle V,E,\gamma \rangle $ with root $\varrho $ is a derivation of $\gamma _\varrho $ from $\Gamma $ if and only if $A_\varrho \subseteq \Gamma $ .

Proof Show that $A_v$ is the set of labels of undischarged threads ending in v by well-founded induction on v.

Likewise, we can show the soundness of $\mathrm {NM}_\to $ -derivations by well-founded induction on v, using the deduction theorem.

Lemma 2.4. For any $\mathrm {NM}_\to $ -derivation $\langle V,E,\gamma \rangle $ and $v\in V$ , $A_v\vdash \gamma _v$ .

On the other hand, tree-like $\mathrm {NM}_\to $ derivations are the same as the implicational fragment of the usual Gentzen–Prawitz natural deduction (see, e.g., [Reference Mancosu, Galvan and Zach26, Reference Prawitz28]). This implies the completeness of the calculus, as observed in [Reference Gordeev and Haeusler10].

Lemma 2.5. A formula $\varphi $ is an $\mathbf {IPC}_\to $ tautology if and only if it has an $\mathrm {NM}_\to $ -proof.

We assume familiarity with classical propositional logic, but briefly, we consider formulas built from propositional variables using the connectives $\{\to ,\land ,\lor ,\neg ,\top ,\bot \}$ . An assignment to a set of variables X is a function $a\colon X\to \mathbf 2$ , where $\mathbf 2=\{0,1\}$ . We denote the set of all such assignments as $\mathbf 2^X$ . For any $a\in \mathbf 2^X$ and a formula $\varphi $ over variables X, we define the relation $a\vDash \varphi $ (in words, a satisfies $\varphi $ ) in the usual way:

$$ \begin{align*} a\vDash p&\iff a(p)=1,\quad p\in X,\\ a\vDash(\varphi\to\psi)&\iff a\nvDash\varphi\text{ or }a\vDash\psi,\\ a\vDash\neg\varphi&\iff a\nvDash\varphi, \end{align*} $$

and so on for the other connectives. A formula $\varphi $ is a classical tautology if $a\vDash \varphi $ for all assignments a to the variables of $\varphi $ .

We also need a bit of circuit complexity. A monotone circuit over a set X of variables is $C=\langle V,E,g\rangle $ where $\langle V,E\rangle $ is a dag with a unique node $\varrho $ of out-degree $0$ (the root), endowed with a labelling $g\colon V\to X\cup \{\land ,\lor \}$ such that nodes v with $g(v)\in X$ have in-degree $0$ . Nodes $v\in V$ are also called gates, and edges $e\in E$ are called wires. We may write $C(\vec p)$ to denote that C is a circuit over a finite tuple of variables $\vec p$ . The size of a circuit $C=\langle V,E,g\rangle $ is $|C|=|E|$ (i.e., the number of wires). By well-founded recursion, any assignment $a\colon X\to \mathbf 2$ extends to a unique function $\hat a\colon V\to \mathbf 2$ , called the evaluation of C, such that

$$\begin{align*}\hat a(v)=\left\{\!\!\begin{array}{ll}a(g(v)),&g(v)\in X,\\ \mathrm{inf} \hphantom{,}\{\hat a(u):\langle u,v\rangle\in E\},&g(v)=\land,\\ \mathrm{sup}\{\hat a(u):\langle u,v\rangle\in E\},&g(v)=\lor,\end{array}\right. \end{align*}$$

where $\inf \varnothing =1$ , $\sup \varnothing =0$ (thus $\land $ - and $\lor $ -gates without inputs act as constants $\top $ and $\bot $ , respectively). A circuit C with root $\varrho $ computes a Boolean function $f\colon \mathbf 2^X\to \mathbf 2$ if $f(a)=\hat a(\varrho )$ for each $a\in \mathbf 2^X$ . More generally, a disjoint pair is $P=\langle P^0,P^1\rangle $ where $P^0,P^1\subseteq \mathbf 2^X$ and $P^0\cap P^1=\varnothing $ ; a circuit C separates P if $\hat a(\varrho )=i$ for each $i\in \mathbf 2$ and $a\in P^i$ . We will write $a\vDash C$ for $\hat a(\varrho )=1$ .

Let $\vec p$ , $\vec q$ , and $\vec r$ be pairwise disjoint tuples of variables, and $\varphi (\vec p,\vec q)$ and $\psi (\vec p,\vec r)$ classical formulas. Then a circuit $C(\vec p)$ interpolates the implication $\varphi \to \psi $ (which must be a classical tautology) if $\varphi (\vec p,\vec q)\to C(\vec p)$ and $C(\vec p)\to \psi (\vec p,\vec r)$ are classical tautologies (i.e., $a\vDash \varphi \implies a\vDash C$ and $a\vDash C\implies a\vDash \psi $ for all assignments $a\in \mathbf 2^{\{\vec p,\vec q,\vec r\}}$ ), or in other words, if C separates the interpolation pair $\mathrm {Itp}_{\varphi ,\psi }=\langle \mathrm {Itp}_\psi ^0,\mathrm {Itp}_\varphi ^1\rangle $ , where

$$ \begin{align*} \mathrm{Itp}_\psi^0&=\{a\in\mathbf2^{\vec p}:\exists c\in\mathbf2^{\vec r}\:\langle a,c\rangle\nvDash\psi\},\\ \mathrm{Itp}_\varphi^1&=\{a\in\mathbf2^{\vec p}:\exists b\in\mathbf2^{\vec q}\:\langle a,b\rangle\vDash\varphi\}. \end{align*} $$

For any $n\ge 2$ , we define the Colouring–Cocolouring disjoint pair $\mathrm {CC}_n=\langle \mathrm {CC}_n^0,\mathrm {CC}_n^1\rangle $ over the set of variables $X_n=\binom {[n]}2$ (i.e., the set of unordered pairs of elements of $[n]=\{0,\dots ,n-1\}$ ) by

$$ \begin{align*} \mathrm{CC}_n^0&=\big\{E\subseteq X_n:\text{the graph }\langle[n],E\rangle\text{ is }k\text{-colourable}\big\},\\ \mathrm{CC}_n^1&=\big\{E\subseteq X_n:\text{the graph }\langle[n],\overline{E}\rangle\text{ is }k\text{-colourable}\big\}, \end{align*} $$

where $\overline {E}=X_n\smallsetminus E$ , $k=\lceil \sqrt n\rceil -1$ , and we identify $E\subseteq X_n$ with its characteristic function $X_n\to \mathbf 2$ . To see that $\mathrm {CC}_n^0\cap \mathrm {CC}_n^1=\varnothing $ , observe that if $c_0,c_1\colon [n]\to [k]$ are k-colourings of $\langle [n],E\rangle $ and $\langle [n],\overline {E}\rangle $ , respectively, then $c_0\times c_1\colon [n]\to [k]\times [k]$ is an injection, thus $n\le k^2$ .

An exponential lower bound on the monotone circuit complexity (and even monotone real circuit complexity) of $\mathrm {CC}_n$ was proved by Hrubeš and Pudlák [Reference Hrubeš and Pudlák16, Theorem 10], using machinery from Jukna [Reference Jukna21]:

Theorem 2.6. For $n\gg 0$ , all monotone circuits separating $\mathrm {CC}_n$ have size $2^{\Omega (k^{1/4})}=2^{\Omega (n^{1/8})}$ .

Strictly speaking, Hrubeš and Pudlák work with bounded fan-in monotone circuits, i.e., such that the in-degree of all gates is at most $2$ , and they measure size by the number of gates. This makes no difference, as a d-ary $\land $ - or $\lor $ -gate can be simulated by $d-1$ binary gates using $2(d-1)$ wires, thus any monotone circuit with s wires can be transformed to a bounded fan-in monotone circuit with $s'\le 2s$ wires; moreover, a circuit with $s'$ wires has at most $s'+1$ gates (we may associate each node other than the root with an outgoing wire). This mild size increase does not affect the shape of the lower bound in Theorem 2.6.

3 An exponential lower bound

In this section, we will prove our main lower bound, viz. there is an explicit sequence of implicational intuitionistic tautologies that require $\mathrm {NM}_\to $ -proofs with exponentially many lines.

Let us start with construction of the $\mathbf {IPC}_\to $ tautologies, which will express the disjointness of $\mathrm {CC}_n$ . Intuitionistic tautologies expressing disjointness of the Clique–Colouring pair were first considered by Hrubeš [Reference Hrubeš14]; they were made negation-free in Jeřábek [Reference Jeřábek19], and implicational in [Reference Jeřábek20]. We will further simplify the tautologies from [Reference Jeřábek20] by using a somewhat more direct translation to implicational logic, and by replacing Clique–Colouring with the Colouring–Coclouring pair, which leads to more symmetric (and shorter) formulas. Fix $n\ge 2$ and $k=\lceil \sqrt n\rceil -1$ . Our tautologies will employ variables $p_{ij}$ and $p^{\prime }_{i,j}$ ( $i<j<n$ ), representing the edge relation of a graph $G=\langle [n],E\rangle $ and its complement, and variables $q_{il}$ and $r_{il}$ ( $i<n$ , $l<k$ ), representing a k-colouring of G and of its complement (respectively).

To motivate the formal definition below, we can state in classical propositional logic that $\vec q$ define a (possibly multivalued) k-colouring of G by the formula

$$\begin{align*}\bigwedge_{i<n}\bigvee_{l<k}q_{il}\land\bigwedge_{\substack{i<j<n\\unicode{x0142}<k}}\neg(q_{il}\land q_{jl}\land p_{ij}),\end{align*}$$

and similarly for the complement, thus the disjointness of $\mathrm {CC}_n$ is expressed by the classical tautology

$$\begin{align*}\Big(\bigwedge_{i<n}\bigvee_{l<k}q_{il}\to\bigvee_{\substack{i<j<n\\unicode{x0142}<k}}(q_{il}\land q_{jl}\land p_{ij})\Big) \lor\Big(\bigwedge_{i<n}\bigvee_{l<k}q_{il}\to\bigvee_{\substack{i<j<n\\unicode{x0142}<k}}(r_{il}\land r_{jl}\land\neg p_{ij})\Big),\end{align*}$$

which can be made negation-free using the $\vec p\,'$ variables:

$$ \begin{align*} \bigwedge_{i<j<n}(p_{ij}\lor p^{\prime}_{ij})\to{}& \Big(\bigwedge_{i<n}\bigvee_{l<k}q_{il}\to\bigvee_{\substack{i<j<n\\unicode{x0142}<k}}(q_{il}\land q_{jl}\land p_{ij})\Big)\\ {}\lor{}&\Big(\bigwedge_{i<n}\bigvee_{l<k}r_{il}\to\bigvee_{\substack{i<j<n\\unicode{x0142}<k}}(r_{il}\land r_{jl}\land p^{\prime}_{ij})\Big). \end{align*} $$

This turns out to be an intuitionistic tautology as well. In order to convert it to an implicational tautology, we introduce further auxiliary variables u, v, and w: the idea is to rewrite an implication $\psi \to \chi $ as $(\chi \to u)\to (\psi \to u)$ , where $\psi \to u$ and $\chi \to u$ can be written using implicational formulas when $\psi $ and $\chi $ are monotone formulas. After some manipulation we end up with the following.

Definition 3.1. Let $n\ge 2$ and $k=\lceil \sqrt n\rceil -1$ . We define the following implicational formulas in variables $p_{ij}$ , $p^{\prime }_{ij}$ , $q_{il}$ , $r_{il}$ , u, v, and w, where $i<j<n$ and $l<k$ :

$$ \begin{align*} \alpha_n(\vec p,\vec q,v) &=\big\langle\langle q_{il}\to v\rangle_{l<k}\to v\big\rangle_{i<n}\\ &\qquad\to\langle q_{il}\to q_{jl}\to p_{ij}\to v\rangle_{\substack{i<j<n\\unicode{x0142}<k}}\to v,\\ \tau_n(\vec p,\vec p\,',\vec q,\vec r,u,v,w) &=\big\langle(p_{ij}\to u)\to(p^{\prime}_{ij}\to u)\to u\big\rangle_{i<j<n}\\ &\qquad\to\big(\alpha_n(\vec p,\vec q,v)\to u\big)\to\big(\alpha_n(\vec p\,',\vec r,w)\to u\big)\to u. \end{align*} $$

(The order in which we enumerate the multiply-indexed sequences such as $\langle \dots \rangle _{i<j<n}$ does not matter.)

Observation 3.2. $|\tau _n|=O(n^2k)=O(n^{5/2})$ .

Lemma 3.3. The formulas $\tau _n$ are intuitionistic implicational tautologies.

Proof Assume for contradiction that $\tau _n$ does not hold in a finite Kripke model $\langle W,{\le },{\vDash }\rangle $ . This means that there exists $x\in W$ such that $x\vDash (p_{ij}\to u)\to (p^{\prime }_{ij}\to u)\to u$ for all $i<j<n$ , $x\vDash \alpha _n(\vec p,\vec q,v)\to u$ , $x\vDash \alpha _n(\vec p\,',\vec r,w)\to u$ , but $x\nvDash u$ . Replacing x with some $\tilde x\ge x$ if necessary, we may assume that x is maximal such that $x\nvDash u$ , i.e., $x'\vDash u$ for all $x'>x$ .

For each $i<j<n$ , $x\vDash (p_{ij}\to u)\to (p^{\prime }_{ij}\to u)\to u$ implies that $x\nvDash p_{ij}\to u$ or $x\nvDash p^{\prime }_{ij}\to u$ . Since u is true in all $x'>x$ , we obtain

(3.1)

$$ \begin{align} \forall i<j<n\:(x\vDash p_{ij}\text{ or }x\vDash p^{\prime}_{ij}). \end{align} $$

Since $x\vDash \alpha _n(\vec p,\vec q,v)\to u$ , we have $x\nvDash \alpha _n(\vec p,\vec q,v)$ , thus there exists $y\ge x$ such that $y\vDash \langle q_{il}\to v\rangle _{l<k}\to v$ for all $i<n$ , and $y\vDash q_{il}\to q_{jl}\to p_{ij}\to v$ for all $i<j<n$ and $l<k$ , but $y\nvDash v$ . As above, we may assume that $y'\vDash v$ for all $y'>y$ . Then for every $i<n$ , $y\vDash \langle q_{il}\to v\rangle _{l<k}\to v$ implies $y\nvDash q_{il}\to v$ for some $l<k$ , whence $y\vDash q_{il}$ by maximality. That is, we can find a colouring function $c\colon [n]\to [k]$ such that $y\vDash q_{i,c(i)}$ for all $i<n$ .

If $i<j<n$ are such that $c(i)=c(j)=l$ , then $y\vDash q_{il}\to q_{jl}\to p_{ij}\to v$ and $y\nvDash v$ implies $y\nvDash p_{ij}$ , and a fortiori $x\nvDash p_{ij}$ . This shows that c is a proper k-colouring of the graph $\langle [n],E\rangle $ , where $E=\big \{\{i,j\}:x\vDash p_{ij}\big \}$ .

Since $x\vDash \alpha _n(\vec p\,',\vec r,w)\to u$ , the same argument gives a k-colouring $c'\colon [n]\to [k]$ of $\langle [n],E'\rangle $ , where $E'=\big \{\{i,j\}:x\vDash p^{\prime }_{ij}\big \}$ . But then (3.1) implies that the function $c\times c'\colon [n]\to [k]\times [k]$ is injective, thus $n\le k^2<n$ , a contradiction.

The remaining task is to prove a form of monotone feasible interpolation (based on feasible disjunction property) for $\mathrm {NM}_\to $ , which will imply an exponential lower bound for the $\tau _n$ tautologies using Theorem 2.6. There are many ways how to prove the disjunction property of intuitionistic logic and various intuitionistic theories, one of them being Kleene’s slash [Reference Kleene22]. Efficient versions of Kleene’s slash were used by Ferrari, Fiorentini, and Fiorino [Reference Ferrari, Fiorentini and Fiorino8] (under the umbrella machinery of “extraction calculi”) to prove the feasible disjunction property for the intuitionistic natural deduction system (which was originally proved by Buss and Mints [Reference Buss and Mints4] using a form of cut elimination); by Mints and Kojevnikov [Reference Mints and Kojevnikov27] to prove the polynomial equivalence of intuitionistic Frege systems using admissible rules (with a considerably simplified argument given by Jeřábek [Reference Jeřábek18]); and by Jeřábek [Reference Jeřábek19] to prove an exponential lower bound on intuitionistic Extended Frege proofs. We will adapt the argument from [Reference Jeřábek19] to a purely implicational setting, using a disjunction-free analogue of the disjunction property.

Definition 3.4. If $P\subseteq \mathrm {Form}$ , a P-slash is a unary predicate $|^{}$ on $\mathrm {Form}$ such that

$$ \begin{align*} |^{}(\varphi\to\psi)&\iff\big(\|^{}\varphi\implies|^{}\psi\big) \end{align*} $$

for all $\varphi ,\psi \in \mathrm {Form}$ , where we define the short-hand

$$ \begin{align*} \|^{}\varphi&\iff|^{}\varphi\text{ and }\varphi\in P. \end{align*} $$

If $\Gamma $ is a set of formulas, we write $\|^{}\Gamma $ if $\|^{}\varphi $ for all $\varphi \in \Gamma $ . When we need to consider several slash operators at the same time, we may distinguish them by subscripts, which are carried over to $\|^{}$ . We warn the reader that a P-slash is not uniquely determined by P, as we have liberty in defining $|^{} p$ for $p\in \mathrm {Var}$ ; however, an arbitrary choice of $|$ on $\mathrm {Var}$ has a unique extension to a P-slash.

If $\Pi =\langle V,E,\gamma \rangle $ is an $\mathrm {NM}_\to $ -derivation, a set $P\subseteq \mathrm {Form}$ is $\Pi $ -closed if $A_v\subseteq P\implies \gamma _v\in P$ for all $v\in V$ .

Unwinding the definition, we obtain:

Observation 3.5. If $\Gamma $ is a finite sequence of formulas, and $\varphi \in \mathrm {Form}$ , then

$$\begin{align*}|^{}(\Gamma\to\varphi)\iff\big(\|^{}\Gamma\implies|^{}\varphi\big).\end{align*}$$

We first verify that being $\Pi $ -closed is enough to ensure the soundness of the slash:

Lemma 3.6. Let $\Pi $ be an $\mathrm {NM}_\to $ -proof of $\varphi $ , P be a $\Pi $ -closed set of formulas, and $|^{}$ be a P-slash. Then $\|^{}\varphi $ .

Proof We prove

(3.2)

$$ \begin{align} \|^{} A_v\implies\|^{}\gamma_v \end{align} $$

by well-founded induction on $v\in V$ . This is trivial if v is a leaf. Let v be an $\mathrm {({\to }E)}$ -node with premises $u_0$ , $u_1$ , such that $\gamma _{u_0}=\alpha $ , $\gamma _{u_1}=(\alpha \to \beta )$ , and $\gamma _v=\beta $ , and assume $\|^{} A_v$ . Since $A_{u_i}\subseteq A_v$ , the induction hypothesis gives $\|^{}\alpha $ and $\|^{}(\alpha \to \beta )$ . Then the definition of $|^{}(\alpha \to \beta )$ ensures $|^{}\beta $ , and $A_v\subseteq P$ implies $\beta \in P$ as P is $\Pi $ -closed, thus $\|^{}\beta $ .

Finally, let v be an $\mathrm {({\to }I)}$ -node with premise u such that $\gamma _u=\beta $ and $\gamma _v=(\alpha \to \beta )$ , and assume $\|^{} A_v$ . Then $A_v\subseteq P$ implies $\gamma _v\in P$ as P is $\Pi $ -closed, hence it suffices to show $|^{}(\alpha \to \beta )$ . Thus, assume $\|^{}\alpha $ ; since $A_u\subseteq A_v\cup \{\alpha \}$ , we have $\|^{} A_u$ , thus $\|^{}\beta $ by the induction hypothesis.

Next, we need to furnish ourselves with $\Pi $ -closed sets.

Definition 3.7. Let $\Pi =\langle V,E,\gamma \rangle $ be an $\mathrm {NM}_\to $ -derivation and $P\subseteq \mathrm {Form}$ . The $\Pi $ -closure of P, denoted $\operatorname {\mathrm {cl}}_\Pi (P)$ , is $P_{|V|}$ , where we define $P_i$ for each $i\in \omega $ by

$$ \begin{align*} P_0&=P,\\ P_{i+1}&=P_i\cup\{\gamma_v:v\in V,A_v\subseteq P_i\}. \end{align*} $$

Lemma 3.8. Let $\Pi $ be an $\mathrm {NM}_\to $ -derivation and $P\subseteq \mathrm {Form}$ .

(i) The set $\operatorname {\mathrm {cl}}_\Pi (P)\supseteq P$ is $\Pi $ -closed.
(ii) $P\vdash \varphi $ for all $\varphi \in \operatorname {\mathrm {cl}}_\Pi (P)$ .

Proof (i): Let $\Pi =\langle V,E,\gamma \rangle $ and $t=|V|$ . It is clear from the definition that if $P_i=P_{i+1}$ , then $P_i$ is $\Pi $ -closed, and $P_i=P_j$ for all $j\ge i$ . Thus, it suffices to shows that $P_i=P_{i+1}$ for some $i\le t$ . If not, then $P=P_0\subsetneq P_1\subsetneq \dots \subsetneq P_{t+1}$ , thus $|P_i\smallsetminus P|\ge i$ for each $i\le t+1$ by induction on i; but $P_i\subseteq P\cup \{\gamma _v:v\in V\}$ , thus $t\ge |P_{t+1}\smallsetminus P|\ge t+1$ , a contradiction.

(ii): We can prove $P\vdash \varphi $ for all $\varphi \in P_i$ by induction on i using Corollary 2.4.

It will be crucial in what follows that $\Pi $ -closure is efficiently computable: e.g., it is easy to see that it is computable in polynomial time; but what we will actually need is that it is computable by polynomial-size monotone circuits in the following sense.

Lemma 3.9. Let $\Pi =\langle V,E,\gamma \rangle $ be an $\mathrm {NM}_\to $ -derivation with $t=|V|$ lines, $F=\{\varphi _i:i<n\}\subseteq \mathrm {Form}$ be such that $\{\gamma _v:v\in V\}\subseteq F$ , and $\varphi \in F$ .

Then there exists a monotone circuit C of size $O(t^3)$ over variables $X=\{x_i:i<n\}$ such that for every assignment $a\in \mathbf 2^X$ ,

$$\begin{align*}a\vDash C\iff\varphi\in\operatorname{\mathrm{cl}}_\Pi\big(\{\varphi_i:a(x_i)=1\}\big).\end{align*}$$

Proof We may assume $\varphi =\varphi _0$ . If $\varphi \notin F_\Pi =\{\gamma _v:v\in V\}$ , then $\varphi \in \operatorname {\mathrm {cl}}_\Pi (P)\iff \varphi \in P$ , which is computable by the trivial circuit $C=x_0$ , thus we may assume $\varphi \in F_\Pi $ . More generally, we observe that $\operatorname {\mathrm {cl}}_\Pi (P)=P\cup \operatorname {\mathrm {cl}}_\Pi (P\cap F_\Pi )$ , thus we may assume $F=F_\Pi $ ; in particular, $n\le t$ .

We consider a circuit C with nodes $y_{i,j}$ for $i<n$ and $j\le t$ , and $z_{v,j}$ for $v\in V$ and $j<t$ , wired such that

$$ \begin{align*} y_{i,0}&\equiv x_i,\\ y_{i,j+1}&\equiv y_{i,j}\lor\bigvee_{\substack{v\in V\\\gamma_v=\varphi_i}}z_{v,j},\\ z_{v,j}&\equiv\bigwedge_{\substack{i<n\\\varphi_i\in A_v}}y_{i,j}. \end{align*} $$

We define the root of C to be $y_{0,t}$ (and we remove nodes from which $y_{0,t}$ is not reachable to satisfy the formal definition of a circuit). It follows from the definition by induction on j that if $a\in \mathbf 2^X$ and $P=\{\varphi _i:a(x_i)=1\}$ , then

$$ \begin{align*} \hat a(y_{i,j})=1&\iff\varphi_i\in P_j,\\ \hat a(z_{v,j})=1&\iff A_v\subseteq P_j, \end{align*} $$

where $\hat a$ is the evaluation of C extending a. Consequently, $\varphi \in \operatorname {\mathrm {cl}}_\Pi (P)$ iff $\overline {a}(y_{0,t})=1$ .

In order to determine $|C|$ , for each $j<t$ there are n wires going from $y_{i,j}$ to $y_{i,j+1}$ , t wires (one for each $v\in V$ ) going from $z_{v,j}$ to $y_{i,j+1}$ where $\gamma _v=\varphi _i$ , and $\sum _v|A_v|\le nt$ wires going from $y_{i,j}$ to $z_{v,j}$ such that $\varphi _i\in A_v$ . Thus, $|C|\le (n+t+nt)t=O(nt^2)=O(t^3)$ , using $n\le t$ .

Pudlák [Reference Pudlák, Barry Cooper and Truss29] showed that the feasible disjunction property of intuitionistic calculi can serve a similar role as feasible interpolation for classical proof systems, and as such implies conditional lower bounds on the length of intuitionistic proofs. Hrubeš [Reference Hrubeš14] discovered how to modify the set-up to obtain an analogue of feasible monotone interpolation (first considered by Krajíček [Reference Krajíček24]), which yields unconditional exponential lower bounds utilizing monotone circuit lower bounds such as Alon and Boppana [Reference Alon and Boppana1]. These results naturally rely on the presence of disjunction. Jeřábek [Reference Jeřábek20] obtained a lower bound on implicational intuitionistic logic based using implicational translations of intuitionistic formulas, but here we follow a more direct approach: we introduce a version of feasible monotone interpolation based on a “disjunction-free disjunction property”. This is the main new idea of this paper. To help the reader with intuition, we first prove a most simple version of disjunction-free feasible disjunction propertyFootnote ³ , although we will not really use this statement later.

Lemma 3.10. Given an $\mathrm {NM}_\to $ -proof $\Pi $ of a formula $\varphi $ of the form

$$\begin{align*}(\alpha_0\to u)\to(\alpha_1\to u)\to u,\end{align*}$$

where the variable u does not occur in $\alpha _0$ and $\alpha _1$ , we can compute in polynomial time an $i\in \{0,1\}$ such that $\vdash \alpha _i$ .

Proof Put $P=\operatorname {\mathrm {cl}}_\Pi (\alpha _0\to u,\alpha _1\to u)$ , and let $|^{}$ be a P-slash such that $\mathord \nmid ^{} u$ . Since $|^{}\varphi $ by Lemmas 3.6 and 3.8, we have $\mathord \nparallel ^{}(\alpha _i\to u)$ for some $i<2$ by Observation 3.5. In view of $(\alpha _i\to u)\in P$ , this means $\mathord \nmid ^{}(\alpha _i\to u)$ , thus $\|^{}\alpha _i$ . That is, we have verified

$$\begin{align*}\alpha_0\in P\text{ or }\alpha_1\in P.\end{align*}$$

Given $\Pi $ , we can compute P in polynomial time, hence we can compute $i<2$ such that $\alpha _i\in P$ . It remains to verify that this implies $\vdash \alpha _i$ . Lemma 3.8 gives

$$\begin{align*}\alpha_0\to u,\alpha_1\to u\vdash\alpha_i.\end{align*}$$

But u does not occur in $\alpha _i$ , hence we may substitute it with $\top $ , obtaining $\vdash \alpha _i$ .

We now generalize this argument to a Hrubeš-style feasible monotone interpolation.

Theorem 3.11. Let $\vec p=\langle p_i:i<n\rangle $ , $\vec p\,'=\langle p^{\prime }_i:i<n\rangle $ , $\vec q$ , $\vec r$ , and u be pairwise disjoint tuples of variables, and assume that a formula $\varphi $ of the form

$$\begin{align*}\big\langle(p_i\to u)\to(p^{\prime}_i\to u)\to u\big\rangle_{i<n} \to\big(\alpha_0(\vec p,\vec q)\to u\big)\to\big(\alpha_1(\vec p\,',\vec r)\to u\big)\to u\end{align*}$$

has an $\mathrm {NM}_\to $ -proof with t lines. Then there exists a monotone circuit $C(\vec p)$ of size $O(t^3)$ that interpolates the classical tautology

$$\begin{align*}\neg\alpha_1(\neg\vec p,\vec r)\to\alpha_0(\vec p,\vec q),\end{align*}$$

where $\neg \vec p$ denotes $\langle \neg p_i:i<n\rangle $ .

Proof Let $\Pi =\langle V,E,\gamma \rangle $ be a proof of $\varphi $ with s lines. If $I\subseteq [n]$ , we write $p^{}_I=\{p_i:i\in I\}$ , and similarly for $p^{\prime }_I$ . We define

$$ \begin{align*} P&=\{(p_i\to u)\to(p^{\prime}_i\to u)\to u:i<n\}\cup\{\alpha_j\to u:j<2\},\\ P_{I,J}&=\operatorname{\mathrm{cl}}_\Pi(P\cup p^{}_I\cup p^{\prime}_J) \end{align*} $$

for each $I,J\subseteq [n]$ . Let $|^{}_{I,J}$ be a $P_{I,J}$ -slash such that $\mathord \nmid ^{}_{I,J}u$ and $|^{}_{I,J}x$ for all variables $x\ne u$ .

If $i\in I$ , then $\|^{}_{I,J}p_i$ , thus $\mathord \nmid ^{}_{I,J}(p_i\to u)$ , and $|^{}_{I,J}(p_i\to u)\to (p^{\prime }_i\to u)\to u$ by Observation 3.5. Likewise if $i\in J$ , using $\mathord \nmid ^{}_{I,J}(p^{\prime }_i\to u)$ . In view of $(p_i\to u)\to (p^{\prime }_i\to u)\to u \in P_{I,J}$ , we obtain

$$\begin{align*}I\cup J=[n]\implies\|^{}_{I,J}\big\{(p_i\to u)\to(p^{\prime}_i\to u)\to u:i<n\big\}.\end{align*}$$

On the other hand, $|^{}_{I,J}\varphi $ by Lemmas 3.6 and 3.8, thus assuming $I\cup J=[n]$ , Observation 3.5 implies $\mathord \nparallel ^{}_{I,J}(\alpha _j\to u)$ for some $j<2$ . Since $\alpha _j\to u$ is in $P_{I,J}$ , this means $\mathord \nmid ^{}_{I,J}(\alpha _j\to u)$ , which implies $\|^{}_{I,J}\alpha _j$ . That is,

$$\begin{align*}I\cup J=[n]\implies\alpha_0\in P_{I,J}\text{ or }\alpha_1\in P_{I,J}.\end{align*}$$

Applying this to $J=\overline {I}:=[n]\smallsetminus I$ , and using the monotonicity of $\operatorname {\mathrm {cl}}_\Pi $ , we obtain

(3.3)

$$ \begin{align} \forall I\subseteq[n]\:\big(\alpha_0\in P_{I,[n]}\text{ or }\alpha_1\in P_{[n],\overline{I}}\big). \end{align} $$

Put $F=P\cup p^{}_{[n]}\cup p^{\prime }_{[n]}\cup \{\gamma _v:v\in V\}$ . By Lemma 3.9, there is a monotone circuit of size $O(t^3)$ that determines whether $\alpha \in \operatorname {\mathrm {cl}}_\Pi (S)$ for a given $S\subseteq F$ , using variables corresponding to each $p_i\in F$ , which we may identify with $p_i$ itself, variables corresponding to formulas in $P\cup p^{\prime }_{[n]}$ , which we may substitute with $\top $ , and variables corresponding to other formulas from F, which we may substitute with $\bot $ . We obtain a monotone circuit $C(\vec p)$ of size $O(t^3)$ such that

(3.4)

$$ \begin{align} a\vDash C\iff\alpha\in P_{I(a),[n]} \end{align} $$

for all assignments a, where $I(a)=\{i<n:a(p_i)=1\}$ .

We claim that C interpolates $\neg \alpha _1(\neg \vec p,\vec r)\to \alpha _0(\vec p,\vec q)$ . Let $a\in \mathbf 2^{\{\vec p,\vec q,\vec r\}}$ . On the one hand, assume $a\vDash C$ ; we need to show $a\vDash \alpha _0$ . We have $\alpha _0\in P_{I(a),[n]}$ by (3.4). Since all formulas in P are implied by u, we have

$$\begin{align*}p^{}_{I(a)},p^{\prime}_{[n]},u\vdash\alpha_0(\vec p,\vec q)\end{align*}$$

by Lemma 3.8. But $\alpha _0$ does not contain the variables $p^{\prime }_i$ or u, hence we may substitute these with $\top $ , obtaining

$$\begin{align*}p^{}_{I(a)}\vdash\alpha_0(\vec p,\vec q).\end{align*}$$

Since $a\vDash p^{}_{I(a)}$ , also $a\vDash \alpha _0$ .

On the other hand, assume $a\nvDash C$ ; we will verify $a\vDash \alpha _1(\neg \vec p,\vec r)$ . We have $\alpha _0\notin P_{I(a),[n]}$ by (3.4), hence $\alpha _1\in P_{[n],\overline {I(a)}}$ by (3.3), thus

$$\begin{align*}p^{}_{[n]},p^{\prime}_{\overline{I(a)}},u\vdash\alpha_1(\vec p\,',\vec r)\end{align*}$$

by Lemma 3.8. Substituting $\top $ for $\vec p$ and u, we obtain

$$\begin{align*}p^{\prime}_{\overline{I(a)}}\vdash\alpha_1(\vec p\,',\vec r).\end{align*}$$

Finally, we can substitute $p^{\prime }_i$ with $\neg p_i$ for each i, getting

$$\begin{align*}\neg p^{}_{\overline{I(a)}}\vdash\alpha_1(\neg\vec p,\vec r)\end{align*}$$

(in intuitionistic or classical logic with $\neg $ ). Since a satisfies the left-hand side, this implies $a\vDash \alpha _1(\neg \vec p,\vec r)$ .

We are ready to prove the main lower bound by applying Theorem 3.11 to the $\tau _n$ tautologies from Definition 3.1; we only need to observe that interpolation of the implication $\neg \alpha _n(\neg \vec p,\vec r,w)\to \alpha _n(\vec p,\vec q,v)$ is essentially identical to separation of the $\mathrm {CC}_n$ disjoint pair.

Theorem 3.12. If n is sufficiently large, then every $\mathrm {NM}_\to $ -proof of $\tau _n$ has at least $2^{\Omega (n^{1/8})}$ lines.

Consequently, there are infinitely many intuitionistic implicational tautologies $\varphi $ such that every $\mathrm {NM}_\to $ -proof of $\varphi $ needs to have at least $2^{\Omega (|\varphi |^{1/20})}$ lines.

Proof It suffices to prove the first part; the second part then follows using Observation 3.2.

If $\tau _n$ has an $\mathrm {NM}_\to $ -proof with t lines, there is a monotone circuit $C(\vec p)$ of size $O(t^3)$ that interpolates

(3.5)

$$ \begin{align} \neg\alpha_n(\neg\vec p,\vec r,w)\to\alpha_n(\vec p,\vec q,v) \end{align} $$

by Theorem 3.11. We claim that C separates $\mathrm {CC}_n$ , which implies $t=2^{\Omega (n^{1/8})}$ by Theorem 2.6.

Let $E\subseteq \binom {[n]}2$ , and let e be the corresponding assignment to $\vec p$ , i.e., for each $i<j<n$ ,

$$\begin{align*}e(p_{ij})=1\iff\{i,j\}\in E.\end{align*}$$

Assume that $\langle [n],E\rangle $ is k-colourable; we need to show $e\nvDash C$ . Fix a vertex colouring $c\colon [n]\to [k]$ , and extend e to an assignment on $\vec q$ and v by $e(v)=0$ and

$$\begin{align*}e(q_{il})=1\iff c(i)=l\end{align*}$$

for each $i<n$ and $l<k$ . Then for every $i<n$ , $e\nvDash q_{i,c(i)}\to v$ , thus $e\vDash \langle q_{il}\to v\rangle _{l<k}\to v$ . Likewise, for every $i<j<n$ and $l<k$ , $e\vDash q_{il}\to q_{jl}\to p_{ij}\to v$ , i.e., $e\nvDash q_{il}\land q_{jl}\land p_{ij}$ : if $c(i)=l=c(j)$ , then $\{i,j\}\notin E$ as c is a proper colouring. Thus, $e\nvDash \alpha _n(\vec p,\vec q,v)$ , which implies $e\nvDash C(\vec p)$ as C interpolates (3.5).

A symmetrical argument shows that if $\langle [n],\overline {E}\rangle $ is k-colourable, then e extends to an assignment such that $e\nvDash \alpha _n(\neg \vec p,\vec r,w)$ , whence $e\vDash C(\vec p)$ .

4 Extensions

The goal of the previous section has been to get to the basic lower bound (Theorem 3.12) as directly and as simply as possible. However, if we expend more effort, we can improve the result in various ways—more or less up to the strength of Theorem 4.22 of [Reference Jeřábek20]. We briefly indicate these modifications and their difficulty below, but we omit most details, and keep this section informal, as it is essentially an extended remark. We refer the reader to [Reference Jeřábek19, Reference Jeřábek20] for missing definitions.

4.1 Logics of unbounded branching

We proved the lower bound for a proof system for $\mathbf {IPC}_\to $ , but it can be generalized to analogous proof systems for some stronger logics, namely implicational fragments of superintuitionistic (si) logics of unbounded branching. A si logic L has branching at most k if it is complete w.r.t. a class of finite Kripke models such that every node has at most k immediate successors (or if it is included in such a logic); if L does not have branching at most k for any $k\in \omega $ , it has unbounded branching. We consider $\mathrm {NM}_\to $ extended with finitely many axiom schemata as proof systems for such logics. Any implicational logic of unbounded branching is included in $\mathbf {BD}_2$ (the logic of Kripke models of depth $2$ ), which can be axiomatized over $\mathbf {IPC}$ by the schema

(4.1)

$$ \begin{align} ((\varphi\to((\psi\to\chi)\to\psi)\to\psi)\to\varphi)\to\varphi \end{align} $$

(this is an implicational version of the more familiar axiom $\varphi \lor (\varphi \to (\psi \lor \neg \psi ))$ ). It is not a priori clear that the implicational fragment of $\mathbf {BD}_2$ is also axiomatized by (4.1) over $\mathbf {IPC}_\to $ , but this can be shown using the criterion in [Reference Jeřábek20, Lemma 4.11]. Thus, it suffices to prove our lower bound for $\mathrm {NM}_\to $ extended with axioms (4.1). This can be done by a minor modification of the proof of Theorem 3.11: for each instance $\omega $ of (4.1) used in $\Pi $ , we include $\omega $ itself as well as $\varphi \to ((\psi \to \chi )\to \psi )\to \psi $ in P. These formulas are classically valid, hence they will not affect the final argument showing that C interpolates $\neg \alpha _1(\neg \vec p,\vec r)\to \alpha _0(\vec p,\vec q)$ , and their presence in P easily implies $\|^{}_{I,J}\omega $ .

4.2 Full propositional language

It is straightforward to generalize dag-like natural deduction to the full language $\{\to ,\land ,\lor ,\bot \}$ of intuitionistic logic, including a suitable version of Lemma 2.3. The lower bound still holds for this proof system: we can extend Definition 3.4 using the standard Kleene slash conditions

$$ \begin{align*} |^{}(\varphi\land\psi)&\iff|^{}\varphi\text{ and }|^{}\psi,\\ |^{}(\varphi\lor\psi)&\iff\|^{}\varphi\text{ or }\|^{}\psi, \end{align*} $$

and $\mathord \nmid ^{}\bot $ ; then we can prove the analogue of Lemma 3.6, and the rest of the argument goes through unchanged.

The only problem is that this generalization interferes with the extension to logics of unbounded branching from Paragraph 4.1. While positive fragments (i.e., $\{\to ,\land ,\lor \}$ ) of logics of unbounded branching are still included in $\mathbf {BD}_2$ , this is not true for fragments including $\bot $ : then we only get that logics of unbounded branching are included in either $\mathbf {BD}_2$ or $\mathbf {KC}+\mathbf {BD}_3$ (see [Reference Jeřábek19, Theorem 6.9]; $\mathbf {KC}$ denotes the logic of weak excluded middle). The proof of the lower bound in the full language works fine for logics included in $\mathbf {BD}_2$ as indicated above, but unfortunately we do not know a direct way of proving it for $\mathbf {KC}+\mathbf {BD}_3$ . It seems that in this case we need the reduction to the $\bot $ -free fragment as given in [Reference Jeřábek19, Lemma 6.30] or [Reference Jeřábek20, Section 4.1].

4.3 Frege and Extended Frege

As we already mentioned in the introduction, the result applies to the Frege system for $\mathbf {IPC}_\to $ , as this is essentially a fragment of $\mathrm {NM}_\to $ without the $\mathrm {({\to }I)}$ rule (see Theorem A.5 in the appendix for more details). However, the argument can be adapted to Frege systems directly, using closure under modus ponens (MP) in place of $\Pi $ -closure. This also works for Frege systems of si logics included in $\mathbf {BD}_2$ in the full propositional language as explained above. Since the lower bound is on the number of lines rather than overall proof size, it also applies to Extended Frege systems.

4.4 Separation from Substitution Frege

We have only shown that the $\tau _n$ formulas are $\mathbf {IPC}_\to $ tautologies, but more constructively, they have proofs of polynomial size (and polynomial-time constructible) in the Substitution Frege proof system for $\mathbf {IPC}_\to $ . This can be demonstrated along the lines of the proof of [Reference Jeřábek20, Theorem 4.22] or [Reference Jeřábek19, Lemma 6.29]. Thus, for all proof systems subject to the lower bound, we actually obtain an exponential separation from the $\mathbf {IPC}_\to $ Substitution Frege system.

4.5 Larger bounds

The Colouring–Cocolouring tautologies can be made shorter using bit encoding of the colouring functions: instead of the variables $q_{il}$ for $i<n$ , $l<k$ as in Definition 3.1, we use variables $q_{ile}$ for $i<n$ , $l<\lceil \log k\rceil $ , and $e\in \{0,1\}$ , with intended meaning “the lth bit of the colour assigned to node i is e”, and likewise for $\vec r$ . This reduces the size of $\tau _n$ to $O(n^2\log n)$ while keeping the same proof size lower bound in terms of n, thus the lower bound in terms of $|\varphi |$ improves to $2^{|\varphi |^{1/16-o(1)}}$ (i.e., $\Omega (2^{|\varphi |^{1/16-\varepsilon }})$ for arbitrarily small $\varepsilon>0$ ).

Instead of Colouring–Cocolouring tautologies, we can use tautologies based on the original Clique–Colouring disjoint pair as in [Reference Hrubeš13–Reference Hrubeš15, Reference Jeřábek19, Reference Jeřábek20] (and in a preliminary version of this paper). They have larger size, viz. $O(n^2k^2)$ , but the monotone circuit size lower bound increases even more to $2^{\Omega (k^{1/2})}$ by Alon–Boppana [Reference Alon and Boppana1]. For $k\approx \sqrt n$ , this improves the bound in Theorem 3.12 to $2^{\Omega (|\varphi |^{1/12})}$ ; if we raise k to $\approx n^{2/3-o(1)}$ (the largest value to which the Alon–Boppana result applies), it improves further to $2^{|\varphi |^{1/10-o(1)}}$ .

Any improvements of the underlying monotone circuit size lower bounds directly translate to improvements of the proof size lower bounds. Recently, S. de Rezende and M. Vinyals (pers. comm.) proved a strengthening of the Alon–Boppana lower bound to $n^{\Omega (k)}$ for $k\le n^{1/2-o(1)}$ , and of the Hrubeš–Pudlák bound (our Theorem 2.6) to $2^{k^{1/2-o(1)}}$ . This implies improvements of Theorem 3.12 to $2^{|\varphi |^{1/10-o(1)}}$ for the Colouring–Cocolouring tautologies from Definition 3.1, $2^{|\varphi |^{1/8-o(1)}}$ for the bit-encoded Colouring–Cocolouring tautologies, and $2^{|\varphi |^{1/6-o(1)}}$ for Clique–Colouring tautologies with $k=n^{1/2-o(1)}$ . In fact, their results apply to a restricted version of the Clique–Colouring problem where the graph of size $n=(k+1)m$ is $(k+1)$ -partite with each element of the clique chosen from a specific part of size $m\approx k^{1/2+o(1)}$ , and the colours of nodes from a given part are chosen from a palette of constant size; moreover, each colour occurs in the palettes of only $O(1)$ parts. The corresponding $\mathbf {IPC}_\to $ tautologies have size $O(km^2)=O(k^{3+o(1)})$ , yielding a $2^{|\varphi |^{1/3-o(1)}}$ proof size lower bound.

The best circuit size lower bounds one could hope to achieve with this line of reasoning would be a $2^{\Omega (k)}$ bound on Clique–Colouring with k a constant fraction of n (i.e., with $m=O(1)$ ), implying a $2^{\Omega (k)}$ bound on Colouring–Cocolouring. These would translate to a $2^{\Omega (|\varphi |^{1/5})}$ proof size lower bound for the Colouring–Cocolouring tautologies, $2^{\Omega (|\varphi |^{1/4-o(1)})}$ for bit-encoded Colouring–Cocolouring, $2^{\Omega (|\varphi |^{1/4})}$ for Clique–Colouring with $k=\Theta (n)$ , and an optimal $2^{\Omega (|\varphi |)}$ lower bound for the restricted Clique–Colouring tautologies, matching the basic $2^{O(|\varphi |)}$ upper bound on the size of intuitionistic proofs. (All these bounds are essentially tight for the respective tautologies.)

5 Conclusion

We have shown how to prove a disjunction-free formulation of feasible disjunction property for implicational intuitionistic logic directly using an efficient version of Kleene’s slash, without reintroducing disjunctions into the proof. More generally, we demonstrated an implicational version of Hrubeš-style feasible monotone interpolation, and exploited it to prove exponential lower bounds on the number of lines in dag-like natural deduction $\mathrm {NM}_\to $ for intuitionistic implicational logic (or equivalent familiar systems such as Frege). This provides a simple refutation of Gordeev and Haeusler’s claims that all $\mathbf {IPC}_\to $ tautologies have polynomial-size proofs in $\mathrm {NM}_\to $ that should be accessible to a broad logic-aware audience.

Our approach consolidated the proof-theoretic components of the exponential lower bound to a single argument, obviating the need for translation of intuitionistic logic to its implicational fragment, or of dag-like natural deduction to Frege systems. The lower bound is not fully self-contained as we still rely on monotone circuit lower bounds; this combinatorial component of our lower bound has a quite different flavour from the proof-theoretic part and uses quite different techniques, thus it does not look very promising to try to combine them. Fortunately, we believe there is no pressing need for that, as monotone circuit bounds are now a fairly well-understood part of standard literature. The proof of the original Alon–Boppana bound in [Reference Alon and Boppana1] is neither long nor difficult to follow; likewise, the relevant arguments in [Reference Hrubeš and Pudlák16, Reference Jukna21] are easily accessible.

A Equivalence with Frege

Our objective in Section 3 was to prove an exponential lower bound on the size of $\mathrm {NM}_\to $ -proofs as directly as we could, and in particular, we avoided translation of $\mathrm {NM}_\to $ to other proof systems such as Frege. However, no treatment of the proof complexity of $\mathrm {NM}_\to $ can be complete without showing that it is, after all, polynomially equivalent to the (intuitionistic implicational) Frege proof system $\mathrm {F}_\to $ . This is implicit in Reckhow [Reference Reckhow31] and Cook and Reckhow [Reference Cook and Reckhow7], but they work with a different formulation of natural deduction, and with classical logic, hence it is worthwhile to spell out the reduction adapted to our situation, which is the main goal of this section (Theorems A.5, A.10, and A.16).

Let us mention that even though we formulate the results in this and the next section as only bounds on proof size (and other parameters), they are all constructive in that the relevant proofs can be computed by polynomial-time algorithms.

We start by defining the intuitionistic implicational Frege system $\mathrm {F}_\to $ .

Definition A.1. A (sequence-like) $\mathrm {F}_\to $ -derivation of $\varphi \in \mathrm {Form}$ from $\Gamma \subseteq \mathrm {Form}$ is a finite sequence of formulas $\Pi =\langle \gamma _i:i<t\rangle $ such that $t>0$ , $\gamma _{t-1}=\varphi $ , and for each $i<t$ : $\gamma _i\in \Gamma $ , or $\gamma _i$ is an instance of one of the logical axioms

(A1)

$$ \begin{align} \alpha&\to\beta\to\alpha, \end{align} $$

(A2)

$$ \begin{align} (\alpha\to\beta\to\gamma)&\to(\alpha\to\beta)\to(\alpha\to\gamma) \end{align} $$

for some $\alpha ,\beta ,\gamma \in \mathrm {Form}$ , or $\gamma _i$ is derived from $\gamma _j$ and $\gamma _k$ for some $j,k<i$ by the rule of modus ponens

(MP)

$$ \begin{align} \alpha,\alpha\to\beta\mathrel/\beta, \end{align} $$

i.e., $\gamma _k=(\gamma _j\to \gamma _i)$ . The number of lines of $\Pi $ is t, and the size of $\Pi $ is $\lVert \Pi \rVert =\sum _{i<t}|\gamma _i|$ .

A dag-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ is $\Pi =\langle V,E,\gamma \rangle $ , where $\langle V,E\rangle $ is a finite dag with a unique node $\varrho $ of out-degree $0$ (the root), all nodes have in-degree $0$ (the axioms or leaves) or $2$ (the (MP)), $\gamma =\langle \gamma _v:v\in V\rangle $ is a labelling of nodes by formulas such that $\gamma _\varrho =\varphi $ , all leaves are labelled with elements of $\Gamma $ or instances of (A1) or (A2), and if v is an (MP) with premises $v_0$ and $v_1$ , then $\gamma _v$ is derived from $\gamma _{v_0}$ and $\gamma _{v_1}$ by (MP). The number of lines of $\Pi $ is $|V|$ , and the size of $\Pi $ is $\lVert \Pi \rVert =\sum _{v\in V}|\gamma _v|$ .

A sequence-like or dag-like $\mathrm {F}_\to $ -proof of $\varphi $ is a sequence-like or dag-like (resp.) $\mathrm {F}_\to $ -derivation of $\varphi $ from $\varnothing $ .

The height of a dag-like $\mathrm {F}_\to $ -derivation or $\mathrm {NM}_\to $ -derivation $\langle V,E,\gamma \rangle $ is the maximal length of a directed path from a leaf to the root. Such a derivation is tree-like if the underlying dag $\langle V,E\rangle $ is a tree, i.e., all nodes have out-degree at most $1$ . Tree-like $\mathrm {F}_\to $ -derivations and $\mathrm {NM}_\to $ -derivations are also called $\mathrm {F}_\to ^*$ -derivations and $\mathrm {NM}_\to ^*$ -derivations (respectively), and likewise for $\mathrm {F}_\to ^*$ -proofs and $\mathrm {NM}_\to ^*$ -proofs.

The formula size of a dag-like $\mathrm {F}_\to $ - or $\mathrm {NM}_\to $ -derivation $\langle V,E,\gamma \rangle $ is $\max _{v\in V}|\varphi _v|$ , and likewise for sequence-like $\mathrm {F}_\to $ -derivations.

Observe that $\mathrm {NM}_\to ^*$ is the implicational fragment of the standard natural deduction system. It is well known that sequence-like and dag-like Frege are just different presentations of the same proof system.

Lemma A.2. A sequence-like (dag-like) $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ can be converted to a dag-like (sequence-like, resp.) $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ with at most the same size, number of lines, and formula size.

Proof Given a sequence-like derivation $\langle \gamma _i:i<t\rangle $ of $\varphi $ from $\Gamma $ , put $V=[t]$ . Let I be the set of $i<t$ such that $\gamma _i$ is not an axiom (from $\Gamma $ , or an instance of (A1) or (A2)); for each $i\in I$ , fix $i_0,i_1<i$ such that $\gamma _i$ is derived from $\gamma _{i_0}$ and $\gamma _{i_1}$ by (MP), and let $E=\{\langle i_j,i\rangle :i\in I,j\in \{0,1\}\}$ . Observe that $\langle V,E\rangle $ is acyclic as $E\subseteq {<}\mathbin \restriction [t]$ . Then $\langle V,E,\langle \gamma _i:i<t\rangle \rangle $ is a dag-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ , possibly after eliminating nodes from which the root $t-1$ is not reachable.

Conversely, let $\langle V,E,\gamma \rangle $ be a dag-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ , and $t=|V|$ . Since $\langle V,E\rangle $ is acyclic, we can find an enumeration $V=\{v_i:{i<t}\}$ such that $E\subseteq \{\langle v_i,v_j\rangle :i<j\}$ (a “topological ordering” of $\langle V,E\rangle $ ). The root $\varrho \in V$ is the only node without a successor, hence we must have $\varrho =v_{t-1}$ . Then $\langle \gamma _{v_i}:i<t\rangle $ is a sequence-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ .

The sequence-like definition is simpler, and is usually taken as the official definition of Frege systems (we follow this usage). Nevertheless, the dag-like definition has other benefits, in particular it allows the introduction of tree-like proofs and the height measure: this cannot be done directly with sequence-like proofs as it depends on the choice of the dag structure, which may not be uniquely determined by the proof sequence alone.

Let us also note basic dependencies between the various proof parameters.

Observation A.3. An $\mathrm {NM}_\to $ - or (dag-like) $\mathrm {F}_\to $ -derivation with formula size r and t lines has size at most $rt$ and at least $\max \{r,t\}$ . A derivation with height h has less than $2^{h+1}$ lines.

Proof The first part is obvious. In a dag with in-degree $2$ and root $\varrho $ , there are at most $2^l$ paths of length l ending in $\varrho $ . Thus, if $\varrho $ is reachable from any node in at most h steps, there are at most $\sum _{l\le h}2^l=2^{h+1}-1$ nodes.

We mostly consider formula size to be an auxiliary measure that can be used to conveniently bound size as per Observation A.3; it is not that interesting on its own.

A simple, yet very useful, property of Frege and natural deduction systems is that instances of any derivable schema have linear-size proofs. This is convenient for construction of asymptotically short proofs without worrying too much about the choice of basic axioms: we can use any valid schematic axioms and rules in a given argument as long as the number of different schemata is kept fixed.

Lemma A.4. Fix $\Gamma \subseteq \mathrm {Form}$ and $\varphi \in \mathrm {Form}$ in variables $\{p_i:i<k\}$ such that $\Gamma \vdash \varphi $ . Then for all substitutions $\sigma $ , there are $\mathrm {F}_\to ^*$ -derivations and $\mathrm {NM}_\to ^*$ -derivations of $\sigma (\varphi )$ from $\sigma (\Gamma )$ with $O(1)$ lines and size $O(s)$ , where $s=\sum _{i<k}|\sigma (p_i)|$ . (The constants implied in the $O(\dots )$ notation depend on $\Gamma $ and $\varphi $ .) Moreover, we may assume the derivations use each axiom from $\sigma (\Gamma )$ only once.

Proof Let $\Pi =\langle \gamma _i:i<t\rangle $ be a fixed $\mathrm {F}_\to ^*$ -derivation of $\varphi $ from $\Gamma $ such that all variables occurring in $\Pi $ are among $\{p_i:i<k\}$ . Then for any substitution $\sigma $ , $\langle \sigma (\gamma _i):i<t\rangle $ is an $\mathrm {F}_\to $ -derivation of $\sigma (\varphi )$ from $\sigma (\Gamma )$ with t lines and size at most $\left \lVert \Pi \right \rVert s$ . The argument for $\mathrm {NM}_\to ^*$ is completely analogous.

Instead of applying the argument directly to $\Gamma \vdash \varphi $ , we may apply it to the $\mathbf {IPC}_\to $ tautology $\vdash \Gamma \to \varphi $ . This yields tree-like proofs of $\sigma (\Gamma \to \varphi )$ with $O(1)$ lines and size $O(s)$ , which we can turn into derivations of $\sigma (\varphi )$ from $\sigma (\Gamma )$ by $|\Gamma |$ applications of (MP)/ $\mathrm {({\to }E)}$ ; this ensures that each axiom from $\sigma (\Gamma )$ is used only once.

The simulation of $\mathrm {F}_\to $ by $\mathrm {NM}_\to $ is completely straightforward:

Theorem A.5. If $\varphi $ has a dag-like $\mathrm {F}_\to $ -derivation from $\Gamma $ with t lines, height h, formula size r, and size s, then $\varphi $ has an $\mathrm {NM}_\to $ -derivation from $\Gamma $ with $O(t)$ lines, height $h+O(1)$ , formula size $O(r)$ , and size $O(s)$ . If the original $\mathrm {F}_\to $ -derivation is tree-like, the $\mathrm {NM}_\to $ -derivation can also be taken tree-like.

Proof Let $\Pi =\langle V,E,\gamma \rangle $ be a dag-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ . Reinterpreting the (MP)-nodes as $\mathrm {({\to }E)}$ -nodes, $\Pi $ becomes an $\mathrm {NM}_\to $ -derivation from $\Gamma $ plus the instances of (A1) and (A2) that appear in $\Pi $ . By Lemma A.4, each of the latter can be replaced by a tree-like $\mathrm {NM}_\to $ -subproof with $O(1)$ lines (thus height $O(1)$ ) and size linear in the size of the axiom instance, yielding an $\mathrm {NM}_\to $ -derivation of $\varphi $ from $\Gamma $ with the stated parameters.

For the converse simulation of $\mathrm {NM}_\to $ by $\mathrm {F}_\to $ , we will need proofs of some auxiliary formulas. As proved in [Reference Jeřábek20, L. 2.3], there are short proofs of “structural rules” for $\Gamma \to \varphi $ , showing in particular that we can arbitrarily reorder $\Gamma $ so that we can treat it as a set. We include here optimized proofs of some special cases.

Definition A.6. We extend the $\Gamma \to \varphi $ notation to sequences indexed by finite subsets of integers. If $I\subseteq [m]$ and $\Gamma =\langle \alpha _i:i\in I\rangle =\langle \alpha _i\rangle _{i\in I}$ , we define $\Gamma \to \varphi $ by induction on $|I|$ : $\langle \alpha _i\rangle _{i\in \varnothing }\to \varphi $ is $\varphi $ , and if $I\ne \varnothing $ , then $\langle \alpha _i\rangle _{i\in I}\to \varphi $ is $\alpha _h\to \langle \alpha _i\rangle _{i\in I\smallsetminus \{h\}}\to \varphi $ , where $h=\max I$ . (I.e., $\langle \alpha _i\rangle _{i\in I}\to \varphi $ is $\langle \alpha _{i_j}\rangle _{j<n}\to \varphi $ , where $\langle i_j:j<n\rangle $ is an increasing enumeration of I.)

If $\Gamma =\langle \alpha _i\rangle _{i\in I}$ , we put $\operatorname {\mathrm {dom}}(\Gamma )=I$ , $|\Gamma |=|I|$ , and $\lVert \Gamma \rVert =\sum _{i\in I}|\alpha _i|$ . We write $\Gamma \mathbin \restriction J=\langle \alpha _i\rangle _{i\in I\cap J}$ . If $\Delta =\langle \beta _i\rangle _{i\in J }$ , we write $\Gamma \subseteq \Delta $ when $I\subseteq J$ and $\alpha _i=\beta _i$ for all $i\in I$ .

First, a general observation that we will keep using to construct proofs of small height.

Lemma A.7. Given a sequence of formulas $\langle \varphi _i:i\le n\rangle $ , $n\ge 1$ , there is an $\mathrm {F}_\to ^*$ -derivation of $\varphi _0\to \varphi _n$ from $\{\varphi _i\to \varphi _{i+1}:i<n\}$ with $O(n)$ lines, height $O(\log n)$ , formula size $O(r)$ , and size $O(rn)$ that uses each assumption $\varphi _i\to \varphi _{i+1}$ only once, where $r=\max _i|\varphi _i|$ .

Proof We arrange the implications in a balanced binary tree with n leaves. Formally, we construct for eachFootnote ⁴ $k\le \lceil \log n\rceil $ and $i<n$ such that $2^k\mid i$ a derivation $\Pi ^k_i$ of $\varphi _i\to \varphi _{\min \{i+2^k,n\}}$ by induction on k as follows: $\Pi ^0_i$ is the trivial derivation of $\varphi _i\to \varphi _{i+1}$ from itself. Let $k<\lceil \log n\rceil $ and $i<n$ be such that $2^{k+1}\mid i$ . If $i+2^k\ge n$ , we put $\Pi ^{k+1}_i=\Pi ^k_i$ ; otherwise, we combine $\Pi ^k_i$ and $\Pi ^k_{i+2^k}$ to $\Pi ^{k+1}_i$ using an instance of the schematic rule

$$\begin{align*}\alpha\to\beta,\beta\to\gamma\vdash\alpha\to\beta,\end{align*}$$

i.e., an $\mathrm {F}_\to ^*$ -derivation of $\varphi _i\to \varphi _{\min \{i+2^{k+1},n\}}$ from $\varphi _i\to \varphi _{i+2^k}$ and $\varphi _{i+2^k}\to \varphi _{\min \{i+2^{k+1},n\}}$ with $O(1)$ lines and size $O(r)$ that uses each assumption only once, which exists by Lemma A.4.

Then $\Pi ^{\lceil \log n\rceil }_0$ is the desired derivation of $\varphi _0\to \varphi _n$ .

Lemma A.8. Given sequences of formulas $\Gamma $ and $\Delta $ such that $\Delta \subseteq \Gamma $ , and $\varphi \in \mathrm {Form}$ , there exists an $\mathrm {F}_\to ^*$ -proof of

(A.1)

$$ \begin{align} (\Delta\to\varphi)\to(\Gamma\to\varphi) \end{align} $$

with $O(n)$ lines, height $O(\log n)$ , formula size $O(s)$ , and size $O(sn)$ , where $n=\max \big \{|\Gamma |,2\big \}$ and $s=\lVert \Gamma \rVert +|\varphi |$ .

Proof We may assume $\Gamma =\langle \alpha _i\rangle _{i<n}$ and $\Delta =\langle \alpha _i\rangle _{i\in I}$ , $I\subseteq [n]$ . For each $i\le n$ , let $\varphi _i$ denote the formula $(\Delta \mathbin \restriction [i]\to \varphi )\to (\Gamma \mathbin \restriction [i]\to \varphi )$ . Then $\varphi _i\to \varphi _{i+1}$ is an instance of one of the schemata

$$ \begin{align*} (\delta\to\gamma)&\to(\delta\to\alpha\to\gamma),\\ (\delta\to\gamma)&\to((\alpha\to\delta)\to\alpha\to\gamma) \end{align*} $$

with $\delta =(\Delta \mathbin \restriction [i]\to \varphi )$ , $\gamma =(\Gamma \mathbin \restriction [i]\to \varphi )$ , and $\alpha =\alpha _i$ , depending on whether $i\in I$ . Thus, it has an $\mathrm {F}_\to ^*$ -proof with $O(1)$ lines and size $O(s)$ by Lemma A.4. Using Lemma A.7, we can combine these proofs to a proof of $\varphi _0\to \varphi _n$ with $O(n)$ lines, height $O(\log n)$ , formula size $O(s)$ , and size $O(sn)$ . Since $\varphi _n$ is (A.1), it remains to detach the $\mathbf {IPC}_\to $ tautology $\varphi _0=(\varphi \to \varphi )$ , which has a proof with $O(1)$ lines and size $O(|\varphi |)$ .

Lemma A.9. Given sequences of formulas $\Gamma $ , $\Delta $ , and $\Theta $ , and $\varphi ,\psi \in \mathrm {Form}$ , there are $\mathrm {F}_\to ^*$ -proofs of

(A.2)

$$ \begin{align} (\Gamma\to\varphi\to\psi)\to(\Gamma\to\varphi)&\to(\Gamma\to\psi), \end{align} $$

(A.3)

$$ \begin{align} \Gamma&\to(\Gamma\to\varphi)\to\varphi, \end{align} $$

(A.4)

$$ \begin{align} (\Gamma\to\Gamma\to\varphi)&\to(\Gamma\to\varphi), \end{align} $$

(A.5)

$$ \begin{align} (\Theta\to\Gamma\to\Delta\to\varphi)&\to(\Theta\to\Delta\to\Gamma\to\varphi) \end{align} $$

with $O(n)$ lines, height $O(\log n)$ , formula size $O(s)$ , and size $O(sn)$ , where $n=\max \big \{|\Gamma |+|\Delta |+|\Theta |,2\}$ and $s=\lVert \Gamma \rVert +\lVert \Delta \rVert +\lVert \Theta \rVert +|\varphi |+|\psi |$ .

Proof We prove (A.2) using the same strategy as in Lemma A.8: putting

$$\begin{align*}\varphi_i=(\Gamma\mathbin\restriction[i]\to\varphi\to\psi)\to(\Gamma\mathbin\restriction[i]\to\varphi)\to(\Gamma\mathbin\restriction[i]\to\psi)\end{align*}$$

for each $i\le |\Gamma |$ , $\varphi _i\to \varphi _{i+1}$ has a proof with $O(1)$ lines and size $O(s)$ as it is an instance of the schema

$$\begin{align*}(\beta\to\gamma\to\delta)\to((\alpha\to\beta)\to(\alpha\to\gamma)\to(\alpha\to\delta)).\end{align*}$$

These proofs combine to a proof of $\varphi _0\to \varphi _{|\Gamma |}$ with the stated parameters using Lemma A.7. Then $\varphi _{|\Gamma |}$ is (A.2), and $\varphi _0=(\varphi \to \psi )\to (\varphi \to \psi )$ has a short proof.

For (A.3), we put $\varphi _i=((\Gamma \to \varphi )\to \Gamma \mathbin \restriction [i]\to \varphi )\to \Gamma \mathbin \restriction [i]\to (\Gamma \to \varphi )\to \varphi $ . Then $\varphi _0$ is an instance of $\alpha \to \alpha $ , and $\varphi _i\to \varphi _{i+1}$ is an instance of

$$\begin{align*}((\alpha\to\beta)\to\delta)\to(\alpha\to\gamma\to\beta)\to\gamma\to\delta\end{align*}$$

(with $\alpha =(\Gamma \to \varphi )$ , $\beta =(\Gamma \mathbin \restriction [i]\to \varphi )$ , $\gamma =\gamma _i$ , and

$$\begin{align*}\delta=(\Gamma\mathbin\restriction[i]\to(\Gamma\to\varphi)\to\varphi),\end{align*}$$

where $\Gamma =\langle \gamma _i\rangle _{i<|\Gamma |}$ ). Thus, using Lemmas A.4 and A.7, we obtain an $\mathrm {F}_\to ^*$ -proof of $\varphi _{|\Gamma |}$ with $O(n)$ lines, height $O(\log n)$ , formula size $O(s)$ , and size $O(sn)$ . Detaching the premise $(\Gamma \to \varphi )\to \Gamma \to \varphi $ of $\varphi _{|\Gamma |}$ yields (A.3).

(A.4) follows by (MP) from (A.3) and (A.2).

(A.5): We have $(\Gamma \to \Delta \to \varphi )\to (\Delta \to \Gamma \to \Delta \to \Gamma \to \varphi )$ from (A.1), and $(\Delta \to \Gamma \to \Delta \to \Gamma \to \varphi )\to (\Delta \to \Gamma \to \varphi )$ from (A.4), thus we obtain (A.5) when $\Theta =\varnothing $ . The general case follows by applying (A.2).

Theorem A.10. If $\varphi $ has an $\mathrm {NM}_\to $ -derivation from $\Gamma $ with t lines, height h, and size s, then $\varphi $ has a dag-like $\mathrm {F}_\to $ -derivation from $\Gamma $ with $O(t^2)$ lines, height $O(h)$ , formula size $O(s)$ , and size $O(st^2)$ . If the original $\mathrm {NM}_\to $ -derivation is tree-like, the $\mathrm {F}_\to $ -derivation can be taken tree-like as well.

Proof Let $\Pi =\langle V,E,\gamma \rangle $ be an $\mathrm {NM}_\to $ -derivation of $\varphi $ from $\Gamma $ . Let $\langle \gamma ^{\prime }_i\rangle _{i<t'}$ , $t'\le t$ , be an injective enumeration of the set $\{\gamma _v:v\in V\}$ , and for each $v\in V$ , let $A^{\prime }_v$ denote the sequence $\langle \gamma ^{\prime }_i:i<t',\gamma ^{\prime }_i\in A_v\smallsetminus \Gamma \rangle $ ; notice that $\lVert A^{\prime }_v\rVert \le s$ . We consider the collection of $\mathbf {IPC}_\to $ tautologies $\langle A^{\prime }_v\to \gamma _v:v\in V\rangle $ , and complete it to a valid $\mathrm {F}_\to $ -derivation from $\Gamma $ using Lemmas A.8 and A.9.

In more detail, for every $v\in V$ , we construct an $\mathrm {F}_\to ^*$ -derivation $\Pi _v$ of $A^{\prime }_v\to \gamma _v$ from $\{A^{\prime }_u\to \gamma _u:\langle u,v\rangle \in E\}\cup \Gamma $ with $O(t)$ lines, height $O(\log t)$ , and formula size $O(s)$ . Moreover, each assumption $A^{\prime }_u\to \gamma _u$ is used only once, and the path from it to the conclusion has length $O(1)$ ; both of these properties are obtained by constructing a derivation of $\langle A^{\prime }_u\to \gamma _u\rangle _{\langle u,v\rangle \in E}\to A^{\prime }_v\to \gamma _v$ from $\Gamma $ and applying (MP):

• If v is a leaf, then either $\gamma _v\in \Gamma $ and $A^{\prime }_v=\varnothing $ , in which case we take the trivial derivation of $\gamma _v$ from itself, or $A^{\prime }_v=\langle \gamma _v\rangle $ , in which case we find an $\mathrm {F}_\to ^*$ -proof of $\gamma _v\to \gamma _v$ with $O(1)$ lines and size $O(s)$ by Lemma A.4.
• If v is an $\mathrm {({\to }I)}$ -node with premise u, we have $\gamma _v=(\alpha \to \beta )$ and $\gamma _u=\beta $ for some $\alpha $ and $\beta $ , and $A^{\prime }_v=A^{\prime }_u\smallsetminus \{\alpha \}$ as a set. If $\alpha \in A^{\prime }_u$ , then $(A^{\prime }_u\to \beta )\to (A^{\prime }_v\to \alpha \to \beta )$ is an instance of (A.5), otherwise it is an instance of (A.1).
• If v is an $\mathrm {({\to }E)}$ -node with premises $u_0$ and $u_1$ , then $\gamma _{u_0}=\alpha $ , $\gamma _{u_1}=(\alpha \to \beta )$ , and $\gamma _v=\beta $ for some $\alpha $ and $\beta $ . We have $A^{\prime }_{u_i}\subseteq A^{\prime }_v$ , hence (A.1) gives $\mathrm {F}_\to ^*$ -proofs of $(A^{\prime }_{u_0}\to \alpha )\to A^{\prime }_v\to \alpha $ and $(A^{\prime }_{u_1}\to \alpha \to \beta )\to A^{\prime }_v\to \alpha \to \beta $ . We infer $(A^{\prime }_{u_1}\to \alpha \to \beta )\to (A^{\prime }_{u_0}\to \alpha )\to A^{\prime }_v\to \beta $ using the instance $(A^{\prime }_v\to \alpha \to \beta )\to (A^{\prime }_v\to \alpha )\to A^{\prime }_v\to \beta $ of (A.2) and $O(1)$ additional proof lines by Lemma A.4.

Combining these derivations $\Pi _v$ along the shape of the original derivation $\Pi $ yields an $\mathrm {F}_\to $ -derivation (tree-like if $\Pi $ is tree-like) of $\varphi $ from $\Gamma $ with $O(t^2)$ lines, height $O(h+\log t)=O(h)$ (cf. Observation A.3), formula size $O(s)$ , and size $O(st^2)$ as promised.

The bottleneck in the proof of Theorem A.10 is that formulas of the form $\Gamma \to \varphi $ with long $\Gamma $ are cumbersome to operate as $\varphi $ is nested deep inside, and when untangling it we need to keep copying large parts of the formula. This could be avoided if we had a conjunction connective: using $\bigwedge \Gamma \to \varphi $ instead, $\varphi $ sits right at nesting depth $1$ ; if we arrange the big conjunction $\bigwedge \Gamma $ in a balanced binary tree, the individual entries of $\Gamma $ are also easy to access at nesting depth $O(\log n)$ , and wholesale manipulations such as Lemma A.8 can be done using a divide-and-conquer approach that saves size.

We do not have $\land $ in implicational logic, as it is not definable in terms of $\to $ . However, we may observe that if we fix a formula $\varphi $ , then formulas $\alpha ,\beta $ of the form $\Phi \to \varphi $ do have a definable conjunction operation: $\alpha $ is equivalent to $(\alpha \to \varphi )\to \varphi $ , and likewise for $\beta $ , thus also $\alpha \land \beta $ is equivalent to $(\alpha \land \beta \to \varphi )\to \varphi $ , which can be written as $(\alpha \to \beta \to \varphi )\to \varphi $ . This idea was introduced in [Reference Jeřábek20, Prop. 2.6] to prove polynomial simulation of Frege by tree-like Frege for purely implicational logic (cf. Theorem B.3), but here we will use it to improve the bounds in Theorem A.10.

Definition A.11. For any formulas $\varphi $ , $\alpha $ , and $\beta $ , we put

$$ \begin{align*} \alpha^\varphi&=(\alpha\to\varphi)\to\varphi,\\ \alpha\land^{\varphi}\beta&=(\alpha\to\beta\to\varphi)\to\varphi. \end{align*} $$

For all sequences of formulas $\Gamma =\langle \alpha _i:i\in I\rangle $ , $I\subseteq [m]$ , we define ${\bigwedge \nolimits ^{\varphi }_{i\in I}}\alpha _i$ , also denoted $\mathop {\bigwedge ^{\varphi }\Gamma }$ , by induction on m:

$$\begin{align*}\mathop{\bigwedge\nolimits^{\varphi}_{i\in I}}\alpha_i=\begin{cases} \top,&I=\varnothing,\\ \alpha_{i_0}^\varphi,&I=\{i_0\},\\ \displaystyle\mathop{\bigwedge\nolimits^{\varphi}}_{i\in I-2^k}\alpha_{2^k+i},&I\subseteq[2^k,2^{k+1}),\\ \displaystyle\Big(\mathop{\bigwedge\nolimits^{\varphi}}_{\!\!i\in I\cap[2^k]\!\!}\alpha_i\Big)\land^{\varphi}_{\vphantom l}\Big(\mathop{\bigwedge\nolimits^{\varphi}}_{\!\!i\in I-2^k\!\!}\alpha_{2^k+i}\Big), &{\begin{aligned}[t]&I\subseteq[2^{k+1}],\\&I\cap[2^k]\ne\varnothing\ne I\cap[2^k,2^{k+1}),\end{aligned}} \end{cases}\end{align*}$$

where $\top $ is a fixed $\mathbf {IPC}_\to $ tautology, $k\ge 0$ , and $I-2^k=\{i:2^k+i\in I\}$ . We write $\mathop {\bigwedge \nolimits ^{\varphi }_{i<n}}\alpha _i$ for $\mathop {\bigwedge \nolimits ^{\varphi }_{i\in [n]}}\alpha _i$ .

The idea is that $\mathop {\bigwedge \nolimits ^{\varphi }_{i<2^k}}\alpha _i$ consists of $\land ^{\varphi }_{\vphantom l}$ arranged in a perfect binary tree of height k, while if $I\subseteq [2^k]$ , then $\mathop {\bigwedge \nolimits ^{\varphi }_{i\in I}}\alpha _i$ conforms to the same arrangement except that unused leaves and non-splitting inner nodes are omitted; this ensures that the layouts of $\mathop {\bigwedge \nolimits ^{\varphi }_{i\in I}}\alpha _i$ and $\mathop {\bigwedge \nolimits ^{\varphi }_{i\in J}}\alpha _i$ for any $I,J\subseteq [2^k]$ are compatible, facilitating efficient manipulation of $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ in a divide-and-conquer manner.

Lemma A.12. $\left |{\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }}\right |=\lVert \Gamma \rVert +O\big (\left |\varphi \right | n\big )$ , where $n=\max \big \{|\Gamma |,1\big \}$ .

Proof Observe that the inductive definition introduces $\land ^{\varphi }_{\vphantom l}$ only when the sequences on both sides are nonempty. Thus, $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ is a binary tree of $\land ^{\varphi }_{\vphantom l}$ with n leaves where every inner node splits, thus there are $n-1$ inner nodes. Since $\alpha $ and $\beta $ occur only once in $\alpha \land ^{\varphi }_{\vphantom l}\beta $ , each node of the tree gives rise to only one subformula of $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ ; thus, $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ consists of one occurrence of each $\alpha _i$ of total size $\lVert \Gamma \rVert $ , and $O(1)$ occurrences of $\varphi $ and $\to $ per each node of the tree of total size $O\big (\left |\varphi \right | n\big )$ .

The following is a $\mathop {\bigwedge \nolimits ^{\varphi }}$ -version of Lemma A.8 that also handles unions of two sequences.

Lemma A.13. Let $\varphi \in \mathrm {Form}$ and $\Gamma =\langle \alpha _i:i\in I\rangle $ be a sequence of for mulas with $|\Gamma |=n\ge 1$ and $I\subseteq [m]$ , $m\ge 2$ . Let $\Gamma _u=\Gamma \mathbin \restriction I_u$ for $u=0,1,2$ , where $I_u\subseteq I$ are such that $I_2\subseteq I_0\cup I_1$ . Then there is an $\mathrm {F}_\to ^*$ -proof of

(A.6)

$$ \begin{align} \mathop{\bigwedge\nolimits^{\varphi}{\Gamma_0}}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_1}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_2} \end{align} $$

with $O(n)$ lines, height $O(\log m)$ , formula size $O(s+\left |\varphi \right | n)$ , and size $O\big ((s+\left |\varphi \right | n)\log m\big )$ , where $s=\lVert \Gamma \rVert $ .

Proof We construct the proofs by induction on $\lceil \log m\rceil $ . If $m=2$ or $n=1$ , then (A.6) has a proof with $O(1)$ lines and size $O(s+|\varphi |)$ by Lemma A.4. If $I\subseteq [2^k,2^{k+1})$ for some k, we can just apply the induction hypothesis (without changing the proof) to $\Gamma '=\langle \alpha _{2^k+i}:i\in I-2^k\rangle $ and $\Gamma ^{\prime }_u=\Gamma '\mathbin \restriction (I_u-2^k)$ , as $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma _u}=\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^{\prime }_u}$ .

Assume that $I\subseteq [2^{k+1}]$ and $I^0,I^1\ne \varnothing $ , where $I^0=I\cap [2^k]$ and $I^1=I-2^k$ . For each $u<3$ , put $I^0_u=I_u\cap [2^k]$ and $I^1_u=I_u-2^k$ . Let $\Gamma ^0=\Gamma \mathbin \restriction I^0$ , $\Gamma ^1=\langle \alpha _{2^k+i}:i\in I^1\rangle $ , and $\Gamma ^v_u=\Gamma ^v\mathbin \restriction I^v_u$ for each $v<2$ , $u<3$ . There are proofs of

(A.7)

$$ \begin{align} \mathop{\bigwedge\nolimits^{\varphi}\Gamma_u}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^v_u},\qquad\mathop{\bigwedge\nolimits^{\varphi}\Gamma^0_u}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^1_u}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_u} \end{align} $$

with $O(1)$ lines and size $O(s+\left |\varphi \right | n)$ using Lemma A.4: if $I^v_u=\varnothing $ for some $v<2$ , then $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma _u}=\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^{1-v}_u}$ and $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^v_u}=\top $ ; otherwise, $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma _u}$ is $\big (\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^0_u}\big )\land ^{\varphi }_{\vphantom l}\big (\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^1_u}\big )$ , thus (A.7) are instances of the valid schemata $(\alpha ^0\to \varphi )\land ^{\varphi }_{\vphantom l}(\alpha ^1\to \varphi )\to (\alpha ^v\to \varphi )$ and $\alpha \to \beta \to \alpha \land ^{\varphi }_{\vphantom l}\beta $ (observe that each $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^v_u}$ is of the form $\alpha \to \varphi $ for some formula $\alpha $ ).

Using (A.7), we can construct proofs of

(A.8)

$$ \begin{align} \Big(\mathop{\bigwedge\nolimits^{\varphi}\Gamma^0_0}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^0_1}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^0_2}\Big)&\to \Big(\mathop{\bigwedge\nolimits^{\varphi}\Gamma^1_0}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^1_1}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma^1_2}\Big)\nonumber\\&\to \Big(\mathop{\bigwedge\nolimits^{\varphi}\Gamma_0}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_1}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_2}\Big) \end{align} $$

with $O(1)$ lines and size $O(s+\left |\varphi \right | n)$ . The induction hypothesis for $\Gamma ^0$ and $\Gamma ^1$ gives us proofs of $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^v_0}\to \mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^v_1}\to \mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma ^v_2}$ for $v<2$ , and these together yield (A.6).

We can imagine the resulting proof as a binary tree of (A.8) inferences. Since each application of (A.8) corresponds to splitting I to two nonempty disjoint subsets, each inner node has two children, and the tree has at most n leaves. Thus, the proof has $O(n)$ lines. Each application of (A.8) also strictly decreases $\lceil \log m\rceil $ , hence the height of the proof is $O(\log m)$ . The formula size is $O(s+\left |\varphi \right | n)$ using Lemma A.12.

As for the size of the proof, the root of the tree contributes $O(s+\left |\varphi \right | n)$ . Its two children contribute $O(s_0+\left |\varphi \right | n_0)$ and $O(s_1+\left |\varphi \right | n_1)$ , where $s_0+s_1=s$ and $n_0+n_1=n$ , thus $O(s+\left |\varphi \right | n)$ together. Continuing the same way, each level of the tree consists of inferences of size $O(s+\left |\varphi \right | n)$ , and there are at most $O(\log m)$ levels, hence the total size is $O\big ((s+\left |\varphi \right | n)\log m\big )$ . (More formally, we can prove such a bound by induction on $\lceil \log m\rceil $ .)

We cannot use $\mathop {\bigwedge \nolimits ^{\varphi }A^{\prime }_v}\to \gamma _v^\varphi $ with a fixed formula $\varphi $ instead of $A^{\prime }_v\to \gamma _v$ for the simulation of $\mathrm {NM}_\to $ by $\mathrm {F}_\to $ as in the proof of Theorem A.10, because the $\mathrm {({\to }I)}$ -rule would translate to an unsound inference

$$\begin{align*}\mathop{\bigwedge\nolimits^{\varphi}A^{\prime}_v}\to\alpha^\varphi\to\beta^\varphi\vdash\mathop{\bigwedge\nolimits^{\varphi}A^{\prime}_v}\to(\alpha\to\beta)^\varphi.\end{align*}$$

We will in fact work with $\mathop {\bigwedge \nolimits ^{\gamma _v}A^{\prime }_v}\to \gamma _v$ , but this necessitates that we are able to transform $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ to $\mathop {\bigwedge \nolimits ^{\psi }_{}\Gamma }$ for given $\varphi $ , $\psi $ :

Lemma A.14. Let $\varphi ,\psi \in \mathrm {Form}$ and $\Gamma =\langle \alpha _i:i\in I\rangle $ be a sequence of formulas with $I\subseteq [m]$ , $m\ge 2$ . Then there is an $\mathrm {F}_\to ^*$ -proof of

(A.9)

$$ \begin{align} \mathop{\bigwedge\nolimits^{\varphi}\Gamma}\to\Big(\mathop{\bigwedge\nolimits^{{\psi}}\Gamma}\Big)^\varphi \end{align} $$

with $O(n)$ lines, height $O(\log m)$ , formula size $O(s+\left |\varphi \right | n+\left |\psi \right | n)$ , and size $O\big ((s+\left |\varphi \right | n+\left |\psi \right | n)\log m\big )$ , where $n=\max \big \{|\Gamma |,1\big \}$ and $s=\lVert \Gamma \rVert $ .

Proof We construct the proofs by induction on $\lceil \log m\rceil $ , similarly to Lemma A.13. If $m=2$ or $n=1$ , then (A.9) has a proof with $O(1)$ lines and size $O(s+|\varphi |+|\psi |)$ by Lemma A.4. If $I\subseteq [2^k,2^{k+1})$ for some k, we can apply the induction hypothesis to $\Gamma '=\langle \alpha _{2^k+i}:i\in I-2^k\rangle $ without changing the proof. If $I\subseteq [2^{k+1}]$ and $I^0,I^1\ne \varnothing $ , where $I^0=I\cap [2^k]$ and $I^1=I-2^k$ , the induction hypothesis applied to $\Gamma ^0=\Gamma \mathbin \restriction I^0$ and $\Gamma ^1=\langle \alpha _{2^k+i}:i\in I^1\rangle $ gives proofs of

$$\begin{align*}\mathop{\bigwedge\nolimits^{\varphi}\Gamma^0}\to\Big(\mathop{\bigwedge\nolimits^{{\psi}}\Gamma^0}\Big)^\varphi,\qquad \mathop{\bigwedge\nolimits^{\varphi}\Gamma^1}\to\Big(\mathop{\bigwedge\nolimits^{{\psi}}\Gamma^1}\Big)^\varphi.\end{align*}$$

These yield (A.9) using an instance of the schema

$$\begin{align*}(\alpha\to\beta^\varphi)\to(\gamma\to\delta^\varphi)\to\alpha\land^{\varphi}_{\vphantom l}\gamma\to(\beta\land^{{\psi}}_{\vphantom l}\delta)^\varphi\end{align*}$$

(we invite the reader to check this is indeed a valid schema).

The resulting proof has the stated size parameters by the same argument as in Lemma A.13.

Before we get to the improved simulation of $\mathrm {NM}_\to $ by $\mathrm {F}_\to $ , we need to introduce one more size parameter so that we can state the bounds accurately:

Definition A.15. The inferential size of a $\mathrm {NM}_\to $ -derivation or dag-like $\mathrm {F}_\to $ -derivation $\langle V,E,\gamma \rangle $ is $\sum _{v\in V}s_v$ , where $s_v=|\gamma _v|+\sum _{\langle u,v\rangle \in E}|\gamma _u|$ .

Clearly, a derivation with t lines and formula size r has inferential size $O(rt)$ . A tree-like derivation (or more generally, a derivation where each node has bounded out-degree) of size s has inferential size $O(s)$ . We will see later (Lemma B.2) that any dag-like $\mathrm {F}_\to $ -derivation of size s can be shortened to a derivation with inferential size $O(s)$ , but we do not know whether the analogue for $\mathrm {NM}_\to $ -derivations holds.

Theorem A.16. If $\varphi $ has an $\mathrm {NM}_\to $ -derivation from $\Gamma $ with t lines, height h, formula size r, and inferential size $\tilde s$ , then $\varphi $ has a dag-like $\mathrm {F}_\to $ -derivation from $\Gamma $ with $O(t^2)$ lines, height $O(h)$ , formula size $O(rt)$ , and (inferential) size $O(\tilde st\log t)$ . If the original $\mathrm {NM}_\to $ -derivation is tree-like, the $\mathrm {F}_\to $ -derivation can be taken tree-like as well.

Proof We use the same notation and argument structure as in the proof of Theorem A.10, but we work with the formulas $\delta _v=\mathop {\bigwedge \nolimits ^{{\gamma _v}}_{}A^{\prime }_v}\to \gamma _v$ in place of $A^{\prime }_v\to \gamma _v$ . Observe $|\delta _v|=O\big (s+\left |{\gamma _v}\right |t\big )=O(rt)$ , where $s=\lVert \Pi \rVert $ .

For each $v\in V$ , we construct an $\mathrm {F}_\to ^*$ -derivation $\Pi _v$ of $\langle \delta _u\rangle _{\langle u,v\rangle \in E}\to \delta _v$ from $\Gamma $ with $O(t)$ lines, height $O(\log t)$ , formula size $O(s+s_vt)=O(rt)$ , and size $O\big ((s+s_vt)\log t\big )$ , where $s_v=|\gamma _v|+\sum _{\langle u,v\rangle \in E}|\gamma _u|$ :

• The case of v being a leaf is straightforward.
• If v is an $\mathrm {({\to }I)}$ -node with premise u, we have $\gamma _v=(\alpha \to \beta )$ and $\gamma _u=\beta $ for some $\alpha $ and $\beta $ , and $A^{\prime }_u\subseteq A^{\prime }_v\cup \{\alpha \}$ as a set. Lemma A.13 gives a proof of $\mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\to \alpha ^\beta \to \mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_u}$ , which (using $\alpha \to \alpha ^\beta $ ) yields $\delta _u\to \mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\to \gamma _v$ . Combining this with $\mathop {\bigwedge \nolimits ^{{\gamma _v}}_{}A^{\prime }_v}\to \big (\mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\big )^{\gamma _v}$ from Lemma A.14 gives $\delta _u\to \mathop {\bigwedge \nolimits ^{{\gamma _v}}_{}A^{\prime }_v}\to \gamma _v$ , i.e., $\delta _u\to \delta _v$ .
• If v is an $\mathrm {({\to }E)}$ -node with premises $u_0$ and $u_1$ , then $\gamma _{u_0}=\alpha $ , $\gamma _{u_1}=(\alpha \to \beta )$ , and $\gamma _v=\beta $ for some $\alpha $ and $\beta $ , and $A^{\prime }_{u_i}\subseteq A^{\prime }_v$ . Using Lemmas A.13 and A.14, we obtain proofs of $\delta _{u_0}\to \mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\to \alpha ^\beta $ and $\delta _{u_1}\to \mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\to (\alpha \to \beta )^\beta $ , which yield $\delta _{u_0}\to \delta _{u_1}\to \mathop {\bigwedge \nolimits ^{{\beta }}_{}A^{\prime }_v}\to \beta $ (i.e., $\delta _{u_0}\to \delta _{u_1}\to \delta _v$ ) using the schema $\alpha ^\beta \to (\alpha \to \beta )^\beta \to \beta $ .

Combining the $\Pi _v$ derivations yields an $\mathrm {F}_\to $ -derivation (tree-like if $\Pi $ is tree-like) of $\varphi $ from $\Gamma $ with $O(t^2)$ lines, height $O(h)$ , formula size $O(rt)$ , and size $O\big (st\log t+\sum _vs_vt\log t)=O(\tilde st\log t)$ .

Remark A.17. We can improve the resulting $\mathrm {F}_\to $ -derivation to a tree-like derivation of height $O(\log t)$ at the expense of a mild size increase: see Theorem B.5.

If we have a real $\land $ , the $|\gamma _v|$ terms from the size parameters disappear, and we obtain a derivation with formula size $O(s)$ and size $O(st\log t)$ rather than $O(\tilde st\log t)$ . It is unclear how to achieve that in the purely implicational setting. One possible improvement is to modify the inductive definition of $\mathop {\bigwedge \nolimits ^{\varphi }_{}}$ so that $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }=(\Gamma \to \varphi )\to \varphi $ whenever $|\Gamma |\le \ell $ , where $\ell \ge 1$ is an extra parameter. Then $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ has size $O\big (\lVert \Gamma \rVert +\left |\varphi \right |\frac n\ell \big )$ , where $n=\max \big \{|\Gamma |,\ell \big \}$ . The proofs in Lemma A.13 will have formula size $O\big (s+\left |\varphi \right |\frac n\ell \big )$ and size $O\big (s(\log m+\ell )+\left |\varphi \right |\frac n\ell \log m\big )$ , and similarly for Lemma A.14. In the context of the proof of Theorem A.16, the optimal choice is $\ell \approx \sqrt {(\tilde s/s)\log t}$ , which yields an $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ with $O(t^2)$ lines, height $O(\log t)$ , formula size $O\big (s+rt/\ell \big )$ , and size $O\big (st\log t+\sqrt {\tilde sst^2\log t}\big )$ .

Remark A.18. Using similar arguments, we can also prove an efficient version of Lemma 2.1: if $\varphi $ has an $\mathrm {F}_\to $ -derivation from $\Gamma =\{\alpha _i:i<n\}$ and $\Delta $ with $t\ge n$ lines, height h, formula size r, and size s, then $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }\to \varphi $ and $\Gamma \to \varphi $ have $\mathrm {F}_\to $ -derivations from $\Delta $ with $O(t)$ lines, height $O(h)$ , formula size $O\big (r+\lVert \Gamma \rVert +\left |\varphi \right | n\big )$ , and size $O\big (s+(\lVert \Gamma \rVert +\left |\varphi \right | n)t\big )$ .

B Equivalence of dag-like and tree-like proofs

Our final task is to show that $\mathrm {NM}_\to $ and $\mathrm {F}_\to $ are polynomially equivalent to their tree-like versions $\mathrm {NM}_\to ^*$ and $\mathrm {F}_\to ^*$ ; more precisely, we will show that an $\mathrm {F}_\to $ -proof with t lines can be converted to a polynomially larger tree-like proof of height $O(\log t)$ (Theorem B.3), which implies a similar simulation of $\mathrm {NM}_\to $ by $\mathrm {NM}_\to ^*$ (Theorem B.5).

The original argument by Krajíček [Reference Krajíček23, L. 4.4.8] (stated in the context of classical logic, but intuitionistically valid) relies on conjunctions: given a proof $\langle \gamma _i:i<t\rangle $ , we consider the conjunctions $\tau _j=\bigwedge _{i<j}\gamma _i$ , construct short tree-like proofs of $\tau _i\to \tau _{i+1}$ , and combine them to a proof of $\tau _t$ . A purely implicational version of the argument was sketched in [Reference Jeřábek20, Prop. 2.6], using the $\alpha \land ^{\varphi }_{\vphantom l}\beta $ formulas to emulate conjunctions. We now present the argument in detail, incorporating an extra idea to save proof size: instead of (an implicational emulation of) the long conjunctions $\tau _i\to \tau _{i+1}$ , we start with $\tau ^{\prime }_i\to \gamma _i$ where $\tau ^{\prime }_i$ only consists of the premises needed to infer $\gamma _i$ , and we gradually merge these lists of premises in later stages of the proof.

Let us first observe that if we do not care about the exact values of the polynomial bounds, an $O(\log t)$ height bound along with a polynomial formula-size bound is all we need to show, as we will then get tree-like polynomial-size proofs for free.

Lemma B.1. Let $\Pi $ be a dag-like $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ of height h and formula size r. Then there is a tree-like $\mathrm {F}_\to $ -derivation $\Pi '$ of $\varphi $ from $\Gamma $ of height h and formula size r, hence with less than $2^{h+1}$ lines and size $2^{h+1}r$ .

Proof We can unwind a dag-like derivation $\langle V,E,\gamma \rangle $ with root $\varrho $ to a tree-like derivation $\langle V',E',\gamma '\rangle $ of the same height by taking for $V'$ the set of all paths ending in $\varrho $ , with $\langle p,q\rangle \in E'$ if p initially extends q by one edge, and $\gamma ^{\prime }_p=\gamma _v$ where v is the starting vertex of p. The bounds on the number of lines and size follow from Observation A.3.

Thus, a reader who is happy with any polynomial may ignore the exact bounds on the number of lines below and concentrate on height bounds, which are easier to verify.

We need one more structural property of $\mathrm {F}_\to $ -proofs so that we can accurately estimate the resulting proof size. Let us say that an $\mathrm {F}_\to $ -derivation is non-redundant if no formula occurs in it more than once.

Lemma B.2.

(i) Any $\mathrm {F}_\to $ -derivation of $\varphi $ from $\Gamma $ can be made non-redundant by omitting some formulas.
(ii) A non-redundant dag-like $\mathrm {F}_\to $ -derivation of size s has inferential size $O(s)$ .

Proof (i): If we omit all but the first occurrence of each formula from a (sequence-like) $\mathrm {F}_\to $ -derivation, it remains an $\mathrm {F}_\to $ -derivation.

(ii): Clearly, the total size of axioms (logical or from $\Gamma $ ) is at most s. As for (MP) inferences, the size of an inference $\alpha ,\alpha \to \beta \mathrel /\beta $ is linear in the size of its second premise $\alpha \to \beta $ . In a non-redundant proof, each formula of the form $\alpha \to \beta $ can be used at most once as a second premise of an (MP) inference, because the conclusion of such an inference can only be $\beta $ , which can only occur once in the derivation. Thus, the total size of (MP) inferences is also $O(s)$ .

We remark that property (ii) is specific to Frege systems based on (MP) as the only rule of inference; we see no reason it should hold in general. (Another such (MP)-specific property is the last part of Lemma A.4.)

Theorem B.3. If $\varphi $ has an $\mathrm {F}_\to $ -derivation from $\Gamma $ with t lines, formula size r, and size s, then it has a tree-like $\mathrm {F}_\to $ -derivation from $\Gamma $ with $O(t\log t)$ lines, height $O(\log t)$ , formula size $O(s+\left |\varphi \right | t)$ , and size $O\big ((s+\left |\varphi \right | t)(\log t)^2\big )$ .

Proof Let $\Pi =\langle \gamma _i:i<t\rangle $ be a derivation of $\varphi $ from $\Gamma $ , which we may assume to be non-redundant by Lemma B.2. We fix $E\subseteq {<}\mathbin \restriction [t]$ that makes $\langle [t],E,\gamma \rangle $ a dag-like derivation by Lemma A.2. For each $j<t$ and $k\le \lceil \log t\rceil $ such that $2^k\mid j$ , we put

$$ \begin{align*} P_j^k&=\big\{i<j:\exists i'\in[j,j')\:\langle i,i'\rangle\in E\big\},\\ \Gamma_j^k&=\langle\gamma_i:i\in[j,j')\rangle,\\ \Delta_j^k&=\langle\gamma_i:i\in P_j^k\rangle,\\ \tau_j^k&=\mathop{\bigwedge\nolimits^{\varphi}}\Delta_j^k\to\mathop{\bigwedge\nolimits^{\varphi}}\Gamma_j^k, \end{align*} $$

where $j'=\min \{j+2^k,t\}$ . Observe that $|\Gamma _j^k|\le 2^k$ , $|\Delta _j^k|=|P_j^k|=O(2^k)$ , and $|\tau _j^k|=O\big (\lVert \Gamma _j^k\rVert +\lVert \Delta _j^k\rVert +\left |\varphi \right |2^k\big )=O(s_{k,j}+\left |\varphi \right |2^k)$ , where we put $s_i=|\gamma _i|+\sum _{\langle i',i\rangle \in E}|\gamma _{i'}|$ as in Definition A.15, and $s_{k,j}=\sum _{i\in [j,j')}s_i$ . Notice that $\sum _is_i=O(s)$ by Lemma B.2, thus also $\sum _{2^k\mid j}s_{k,j}=O(s)$ for each k.

We construct $\mathrm {F}_\to ^*$ -derivations $\Pi _j^k$ of $\tau _j^k$ from $\Gamma $ by induction on k. For $k=0$ , the formula $\tau _j^0$ is $\mathop {\bigwedge \nolimits ^{\varphi }_{}}\langle \gamma _i:\langle i,j\rangle \in E\rangle \to \gamma _j^\varphi $ , which has a derivation from $\Gamma $ with $O(1)$ lines and size $O(s_j+|\varphi |)$ using Lemma A.4. Assume that $\Pi _j^k$ have been defined for all $j<t$ such that $2^k\mid j$ , and let $j<t$ be such that $2^{k+1}\mid j$ . If $j+2^k\ge t$ , we have $\tau _j^{k+1}=\tau _j^k$ , thus we can take $\Pi _j^{k+1}=\Pi _j^k$ . Otherwise, we combine $\Pi _j^k$ and $\Pi _{j+2^k}^k$ to $\Pi _j^{k+1}$ using an $\mathrm {F}_\to ^*$ -proof of $\tau _j^k\to \tau _{j+2^k}^k\to \tau _j^{k+1}$ with $O(2^k)$ lines, height $O(\log t)$ , formula size $O(s_{k+1,j}+\left |\varphi \right |2^k)$ , and size $O\big ((s_{k+1,j}+\left |\varphi \right |2^k)\log t\big )$ that we construct as follows. Observe that $\Gamma _j^{k+1}$ is the concatenation of $\Gamma _j^k$ and $\Gamma _{j+2^k}^k$ , and $\Delta _j^k\subseteq \Delta _j^{k+1}$ , while $\Delta _{j+2^k}^k$ is a concatenation of a subsequence of $\Delta _j^{k+1}$ and a subsequence of $\Gamma _j^k$ . Thus, Lemma A.13 gives us $\mathrm {F}_\to ^*$ -proofs of

$$ \begin{align*} \tau_j^k&\to\mathop{\bigwedge\nolimits^{\varphi}}\Delta_j^{k+1}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_j^k},\\ \tau_{j+2^k}^k&\to\mathop{\bigwedge\nolimits^{\varphi}\Delta_j^{k+1}}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_j^k}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_{j+2^k}^k},\\ \mathop{\bigwedge\nolimits^{\varphi}\Gamma_j^k}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_{j+2^k}^k}&\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_j^{k+1}}. \end{align*} $$

with the stated size parameters. These together imply $\tau _j^k\to \tau _{j+2^k}^k\to \tau _j^{k+1}$ .

In the end, $\Pi _0^{\lceil \log t\rceil }$ is a derivation of $\top \to \mathop {\bigwedge \nolimits ^{\varphi }_{i<t}\gamma _i}$ from $\Gamma $ . This yields $\mathop {\bigwedge \nolimits ^{\varphi }_{}}\langle \gamma _{t-1}\rangle $ , i.e., $\varphi ^\varphi $ , using Lemma A.13, and we can infer $\varphi $ .

It is clear that the whole derivation has height $O(\log t)$ and formula size $O(s+\left |\varphi \right | t)$ . The derivations $\Pi _j^0$ have together $O(t)$ lines and size $O\big (\sum _j(s_j+|\varphi |)\big )=O(s+\left |\varphi \right | t)$ . Likewise, for each $k<\lceil \log t\rceil $ , there are $t/2^{k+1}$ subproofs of $\tau _j^k\to \tau _{j+2^k}^k\to \tau _j^{k+1}$ with $O(2^k)$ lines each, which together makes $O(t)$ lines of size $O\big (\sum _{2^{k+1}\mid j}(s_{k+1,j}+\left |\varphi \right |2^k)\log t\big )= O\big ((s+\left |\varphi \right | t)\log t\big )$ . Summing over all $k<\lceil \log t\rceil $ , the whole derivation has $O(t\log t)$ lines and size $O\big ((s+\left |\varphi \right | t)(\log t)^2\big )$ .

Remark B.4. We could avoid the machinery of $\mathop {\bigwedge \nolimits ^{\varphi }_{}\Gamma }$ formulas by defining $\tau _j^k=(\Gamma _j^k\to \varphi )\to (\Delta _j^k\to \varphi )$ , and using Lemmas A.8 and A.9 in place of Lemma A.13, yielding an $\mathrm {F}_\to ^*$ -derivation with $O(t\log t)$ lines, height $O(\log t)$ , formula size $O(s)$ , and size $O(st+\left |\varphi \right | t\log t)$ .

If we have a real $\land $ , the $|\varphi |$ terms from the size parameters disappear: we obtain a derivation with $O(t\log t)$ lines, height $O(\log t)$ , formula size $O(s)$ , and sizeFootnote ⁵ $O\big (s(\log t)^2\big )$ .

Back in the implicational setting, we can alternatively use $\mathop {\bigwedge \nolimits ^{{p}}_{}}$ in place of $\mathop {\bigwedge \nolimits ^{\varphi }_{}}$ , where p is the right-most variable occurrence in $\varphi $ , i.e., $\varphi $ is of the form $\Phi \to p$ for some sequence $\Phi $ . This reduces all the $|\varphi |$ terms in the size parameters to $O(1)$ : we obtain a derivation of $\varphi ^p$ from $\Gamma $ with $O(t\log t)$ lines, height $O(\log t)$ , formula size $O(s)$ , and size $O\big (s(\log t)^2\big )$ . We can construct a proof of $\varphi ^p\to \varphi $ using Lemma A.9: two instances of (A.3) give $\Phi \to (((\Phi \to p)\to p)\to p)\to p$ , and (A.5) yields $(((\Phi \to p)\to p)\to p)\to \Phi \to p$ . We obtain an $\mathrm {F}_\to ^*$ -derivation of $\varphi $ from $\Gamma $ with $O(t\log t+n)$ lines, height $O\big (\log (t+n)\big )$ , formula size $O(s)$ , and size $O\big (s(\log t)^2+\left |\varphi \right | n\big )$ , where $n=|\Phi |\le |\varphi |$ . Furthermore, if the $\mathbf {IPC}_\to $ tautology $\Gamma \to \varphi $ is not a substitution instance of any strictly smaller $\mathbf {IPC}_\to $ tautology, then $n=O(t)$ because of [Reference Krajíček23, L. 4.4.4], which simplifies the bounds to $O(t\log t)$ lines, height $O(\log t)$ , formula size $O(s)$ , and size $O\big (s(\log t)^2+\left |\varphi \right | n\big )$ .

We can also modify the definition of $\mathop {\bigwedge \nolimits ^{\varphi }_{}}$ using an extra parameter $\ell $ as in Remark A.17. In the context of the proof of Theorem B.3, the optimal choice is $\ell \approx \sqrt {\left |\varphi \right | t(\log t)/s}$ , which yields an $\mathrm {F}_\to ^*$ -derivation of $\varphi $ from $\Gamma $ with $O(t\log t)$ lines, height $O(\log t)$ , formula size $O\big (s+\sqrt {\left |\varphi \right | st/\log t}\big )$ , and size $O\big (s(\log t)^2+\sqrt {\left |\varphi \right | st(\log t)^3}\big )$ .

Theorems A.10 or A.16, B.3, and A.5 imply a polynomial simulation of $\mathrm {NM}_\to $ by $\mathrm {NM}_\to ^*$ , but we can obtain better bounds by taking into account that the building blocks of the proofs constructed in Theorems A.10 and A.16 are already tree-like.

Theorem B.5. If $\varphi $ has an $\mathrm {NM}_\to $ -derivation from $\Gamma $ with t lines, size s, and inferential size $\tilde s$ , then it has an $\mathrm {F}_\to ^*$ -derivation and $\mathrm {NM}_\to ^*$ -derivation from $\Gamma $ with $O(t^2)$ lines, height $O(\log t)$ , formula size $O(st)$ , and size $O\big (\min \{st^2,\tilde st(\log t)^2\}\big )$ .

Proof In view of Theorem A.5, it is enough to construct an $\mathrm {F}_\to ^*$ -derivation.

We combine the arguments in Theorems A.16 and B.3. Let $\Pi =\langle V,E,\gamma \rangle $ be an $\mathrm {NM}_\to $ -derivation of $\varphi $ from $\Gamma $ . By considering a topological ordering of $\langle V,E\rangle $ , we may assume $V=[t]$ and $E\subseteq {<}\mathbin \restriction [t]$ . As in the proof of Theorem A.16, let $\langle \gamma ^{\prime }_i\rangle _{i<t'}$ , $t'\le t$ , be an injective enumeration of the set $\{\gamma _i:i<t\}$ , and for each $i<t$ , let $A^{\prime }_i$ denote the sequence $\langle \gamma ^{\prime }_j:j<t',\gamma ^{\prime }_j\in A_i\smallsetminus \Gamma \rangle $ . Put $\delta _i=\mathop {\bigwedge \nolimits ^{{\gamma _i}}_{}A^{\prime }_i}\to \gamma _i$ ; we have $|\delta _i|=O(s+\left |{\gamma _i}\right |t)$ .

Similarly to the proof of Theorem B.3, for all $j<t$ and $k\le \lceil \log t\rceil $ such that $2^k\mid j$ , we put

$$ \begin{align*} P_j^k&=\big\{i<j:\exists i'\in[j,j')\:\langle i,i'\rangle\in E\big\},\\ \Gamma_j^k&=\langle\delta_i:i\in[j,j')\rangle,\\ \Delta_j^k&=\langle\delta_i:i\in P_j^k\rangle,\\ \tau_j^k&=\mathop{\bigwedge\nolimits^{\varphi}\Delta_j^k}\to\mathop{\bigwedge\nolimits^{\varphi}\Gamma_j^k}, \end{align*} $$

where $j'=\min \{j+2^k,t\}$ . We have $|\Gamma _j^k|\le 2^k$ and $|\Delta _j^k|=|P_j^k|=O(2^k)$ , thus $|\tau _j^k|=O\big (\lVert \Gamma _j^k\rVert +\lVert \Delta _j^k\rVert +\left |\varphi \right |2^k\big )=O(s2^k+s_{k,j}t)$ , where $s_{k,j}=\sum _{j\le i<j'}|\gamma _i|+\sum _{i\in P_j^k}|\gamma _i|\le s$ . Observe $s_{k,j}\le \sum _{j\le i<j'}s_{0,i}$ , thus for a fixed k, $\sum _js_{k,j}\le \sum _{i<t}s_{0,i}=\tilde s$ . We will now construct $\mathrm {F}_\to ^*$ -derivations $\Pi _j^k$ of $\tau _j^k$ from $\Gamma $ by induction on k.

As shown in the proof of Theorem A.16, for each $j<t$ , there is an $\mathrm {F}_\to ^*$ -derivation of $\Delta _j^0\to \delta _j$ from $\Gamma $ with $O(t)$ lines, height $O(\log t)$ , formula size $O(s+s_{0,j}t)$ , and size $O\big ((s+s_{0,j}t)\log t\big )$ . We can infer $\mathop {\bigwedge \nolimits ^{\varphi }}_{}\Delta _j^0\to \gamma _j^\varphi $ , which is $\tau _j^0$ , using $O(1)$ extra lines of size $O(s+s_{0,j}t)$ ; we denote the resulting derivation $\Pi _j^0$ . In total, these derivations have $O(t^2)$ lines, height $O(\log t)$ , formula size $O(rt)$ (where r is the formula size of $\Pi $ ) and size $O(\tilde st\log t)$ .

Let $k<\lceil \log t\rceil $ and $j<t$ be such that $2^{k+1}\mid j$ . If $j+2^k\ge t$ , then $\tau _j^{k+1}=\tau _j^k$ , and we put $\Pi _j^{k+1}=\Pi _j^k$ . Otherwise, we combine $\Pi _j^k$ and $\Pi _{j+2^k}^k$ to $\Pi _j^{k+1}$ using an $\mathrm {F}_\to ^*$ -proof of $\tau _j^k\to \tau _{j+2^k}^k\to \tau _j^{k+1}$ as constructed in the proof of Theorem B.3: it has $O(2^k)$ lines, height $O(\log t)$ , formula size $O(s2^k+s_{k+1,j}t)=O(st)$ , and size $O\big ((s2^k+s_{k+1,j}t)\log t\big )$ ; summing this over all j for a fixed k gives $O(t)$ lines of total size $O(\tilde st\log t)$ .

Altogether, $\Pi _0^{\lceil \log t\rceil }$ has $O(t^2)$ lines, height $O(\log t)$ , formula size $O(st)$ , and size $O\big (\tilde st(\log t)^2\big )$ . It is a derivation of $\top \to \mathop {\bigwedge \nolimits ^{\varphi }_{i<t}\delta _i}$ from $\Gamma $ . Since $\delta _{t-1}=\top \to \varphi $ , we can infer $\varphi $ using Lemma A.13 without asymptotically increasing any of the size parameters.

We can obtain the $O(st^2)$ size bound similarly, using $\delta _i=A^{\prime }_i\to \gamma _i$ as in the proof of Theorem A.10 in place of Theorem A.16; in this case, we can avoid usage of the $\mathop {\bigwedge \nolimits ^{\varphi }_{}}$ formulas entirely as in Remark B.4.

We mention that if we have a real $\land $ , the size bound improves to $O\big (st(\log t)^2\big )$ .

Acknowledgments

I want to thank Pavel Hrubeš for the suggestion to use tautologies based on the Colouring–Cocolouring principle, Susanna de Rezende for kindly explaining her work, and the anonymous referee for improvements in the presentation.

Funding

The research was supported by the Czech Academy of Sciences (RVO 67985840) and GA ČR project 23-04825S.

Footnotes

1 It might appear more visually pleasing to define it as $\varphi _0\to \varphi _1\to \dots \to \varphi _{n-1}\to \psi $ , but the reverse order will be technically more convenient, e.g., in some inductive arguments in Appendix A.

2 In [Reference Gordeev and Haeusler10], proofs go upside down so that edges are directed from conclusions to premises; we reversed them to a more natural order. Also, they include an auxiliary repetition rule that we omit for simplicity (it can be eliminated from any $\mathrm {NM}_\to $ derivation without increasing its size).

3 It is not surprising that $\alpha _0\lor \alpha _1$ can be expressed by an implicational formula as in Lemma 3.10; what is supposed to be novel here is the way to prove the feasible disjunction property for this formulation without reintroducing disjunctions.

4 In this paper, $\log $ denotes base- $2$ logarithm.

5 [Reference Krajíček23, L. 4.4.8] seemingly claims an even better bound $O(s\log t)$ , but this is a typo, as the argument only warrants size $O(st\log t)$ ; cf. https://www.karlin.mff.cuni.cz/~krajicek/upravy.html.

References

Alon, N. and Boppana, R. B., The monotone circuit complexity of Boolean functions . Combinatorica , vol. 7 (1987), no. 1, pp. 1–22.Google Scholar

Beame, P., Impagliazzo, R., Krajíček, J., Pitassi, T., Pudlák, P., and Woods, A., Exponential lower bounds for the pigeonhole principle , Proceedings of the 24th Annual ACM Symposium on Theory of Computing , 1992, pp. 200–220.Google Scholar

Samuel, R., Some remarks on lengths of propositional proofs . Archive for Mathematical Logic , vol. 34 (1995), no. 6, pp. 377–394.Google Scholar

Buss, S. R. and Mints, G., The complexity of the disjunction and existential properties in intuitionistic logic . Annals of Pure and Applied Logic , vol. 99 (1999), pp. 93–104.Google Scholar

Buss, S. R. and Nordström, J., Proof complexity and SAT solving, Handbook of Satisfiability , second ed. (Biere, A., Heule, M., Maaren, H. van, and Walsh, T., editors), Frontiers in Artificial Intelligence and Applications, 336, IOS Press, 2021, pp. 233–350.Google Scholar

Chagrov, A. V. and Zakharyaschev, M., Modal Logic , Oxford Logic Guides, 35, Oxford University Press, Oxford, 1997.Google Scholar

Cook, S. A. and Reckhow, R. A., The relative efficiency of propositional proof systems . The Journal of Symbolic Logic , vol. 44 (1979), no. 1, pp. 36–50.Google Scholar

Ferrari, M., Fiorentini, C., and Fiorino, G., On the complexity of the disjunction property in intuitionistic and modal logics . ACM Transactions on Computational Logic , vol. 6 (2005), no. 3, pp. 519–538.Google Scholar

Gordeev, L. and Haeusler, E. H., Proof compression and NP versus PSPACE . Studia Logica , vol. 107 (2019), no. 1, pp. 53–83.Google Scholar

Gordeev, L. and Haeusler, E. H., Proof compression and NP versus PSPACE II . Bulletin of the Section of Logic , vol. 49 (2020), no. 3, pp. 213–230.Google Scholar

Gordeev, L. and Haeusler, E. H., Proof compression and NP versus PSPACE II: Addendum . Bulletin of the Section of Logic , vol. 51 (2022), no. 2, pp. 197–205.Google Scholar

Gordeev, L. and Haeusler, E. H., On proof theory in computer science, preprint, 2020, arXiv:2012.04437.Google Scholar

Hrubeš, P., Lower bounds for modal logics . The Journal of Symbolic Logic , vol. 72 (2007), no. 3, pp. 941–958.Google Scholar

Hrubeš, P., A lower bound for intuitionistic logic . Annals of Pure and Applied Logic , vol. 146 (2007), no. 1, pp. 72–90.Google Scholar

Hrubeš, P., On lengths of proofs in non-classical logics . Annals of Pure and Applied Logic , vol. 157 (2009), nos. 2–3, pp. 194–205.Google Scholar

Hrubeš, P. and Pudlák, P., Random formulas, monotone circuits, and interpolation , Proceedings of the 58th Annual IEEE Symposium on Foundations of Computer Science , 2017, pp. 121–131.Google Scholar

Jalali, R., Proof complexity of substructural logics . Annals of Pure and Applied Logic , vol. 172 (2021), no. 7, Article no. 102972, 31 pp.Google Scholar

Jeřábek, E., Frege systems for extensible modal logics . Annals of Pure and Applied Logic , vol. 142 (2006), pp. 366–379.Google Scholar

Jeřábek, E., Substitution Frege and extended Frege proof systems in non-classical logics . Annals of Pure and Applied Logic , vol. 159 (2009), no. 1–2, pp. 1–48.Google Scholar

Jeřábek, E., Proof complexity of intuitionistic implicational formulas . Annals of Pure and Applied Logic , vol. 168 (2017), no. 1, pp. 150–190.Google Scholar

Jukna, S., Boolean Function Complexity: Advances and Frontiers , Algorithms and Combinatorics, 27, Springer, Heidelberg, Dordrecht, New York, 2012.Google Scholar

Kleene, S. C., Disjunction and existence under implication in elementary intuitionistic formalisms . The Journal of Symbolic Logic , vol. 27 (1962), no. 1, pp. 11–18.Google Scholar

Krajíček, J., Bounded Arithmetic, Propositional Logic, and Complexity Theory , Encyclopedia of Mathematics and Its Applications, 60, Cambridge University Press, Cambridge, 1995.Google Scholar

Krajíček, J., Interpolation theorems, lower bounds for proof systems, and independence results for bounded arithmetic . The Journal of Symbolic Logic , vol. 62 (1997), no. 2, pp. 457–486.Google Scholar

Krajíček, J., Proof complexity , Encyclopedia of Mathematics and its Applications, vol. 170, Cambridge University Press, Cambridge, 2019.Google Scholar

Mancosu, P., Galvan, S., and Zach, R., An Introduction to Proof Theory: Normalization, Cut-Elimination, and Consistency Proofs , Oxford University Press, Oxford, 2021.Google Scholar

Mints, G. and Kojevnikov, A., Intuitionistic Frege systems are polynomially equivalent . Zapiski Nauchnyh Seminarov POMI , vol. 316 (2004), pp. 129–146.Google Scholar

Prawitz, D., Natural Deduction: A Proof-Theoretical Study , Dover Publications, Mineola, New York, 2006.Google Scholar

Pudlák, P., On the complexity of propositional calculus , Sets and Proofs: Invited Papers from Logic Colloquium ’97 (Barry Cooper, S. and Truss, J. K., editors), Cambridge University Press, Cambridge, 1999, pp. 197–218.Google Scholar

Razborov, A. A., Lower bounds on the monotone complexity of some Boolean functions . Mathematics of the USSR, Doklady , vol. 31 (1985), pp. 354–357.Google Scholar

Reckhow, R. A., On the lengths of proofs in the propositional calculus , Ph.D. thesis, Department of Computer Science, University of Toronto, 1976.Google Scholar

Tardos, É., The gap between monotone and non-monotone circuit complexity is exponential . Combinatorica , vol. 7 (1987), no. 4, pp. 141–142.Google Scholar

Troelstra, A. S. and van Dalen, D., Constructivism in mathematics: An introduction. I , Studies in Logic and the Foundations of Mathematics, vol. 121, North-Holland, Amsterdam, New York, Oxford, Tokyo, 1988.Google Scholar

Article contents

A SIMPLIFIED LOWER BOUND FOR IMPLICATIONAL LOGIC

Abstract

Keywords

MSC classification

1 Introduction

2 Preliminaries

Lemma 2.1 (deduction theorem).

Theorem 2.2 (finite model property).

Lemma 2.3 [Reference Gordeev and Haeusler10].

3 An exponential lower bound

4 Extensions

4.1 Logics of unbounded branching

4.2 Full propositional language

4.3 Frege and Extended Frege

4.4 Separation from Substitution Frege

4.5 Larger bounds

5 Conclusion

A Equivalence with Frege

B Equivalence of dag-like and tree-like proofs

Acknowledgments

Funding

Footnotes

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests