Von Neumann’s Minimax Theorem in the Gentry-Wichs Separation

Christopher Smith

Last Updated: 2025-11-05

1 Overview

The Gentry-Wichs separation [1] is a fundamental barrier in theoretical cryptography: the adaptive soundness of a SNARG cannot be proved with a black-box reduction to any falsifiable assumption. It should be mentioned that recent results [6–8] circumvent this barrier through clever uses of complexity leveraging (and pay for this in CRS size). Barrier busting aside, some of the techniques used in [1] are quite interesting. In particular, the proof of their key “indistinguishability with auxiliary information” lemma relies on an application of von Neumann’s minimax theorem. The aim of this note is to introduce this minimax theorem and unpack its usage within [1].

2 Minimax Theorems

See the Wikipedia page on minimax theorems [10] for a solid introduction to the topic. For more in-depth resources, see Sion [5] and Kjeldsen [2]. In short, a minimax theorem is a theorem asserting something of the form \[\max_{x \in X} \min_{y \in Y} f(x,y) = \min_{y \in Y} \max_{x \in X} f(x,y)\] under specific conditions on \(X\), \(Y\), and \(f\). A pair \((x^*,y^*)\) at which both optima are attained, i.e., one satisfying \(f(x,y^*) \leq f(x^*,y^*) \leq f(x^*,y)\) for all \(x \in X\) and \(y \in Y\), is often called a saddle point. John von Neumann [4] is credited with the first of these theorems, which is stated as follows.

Theorem 1 (von Neumann minimax [10]). Let \(X = \left \{ (x_1,...,x_n) \in [0,1]^n ~ \middle | ~\sum x_i = 1 \right \}\) and
\(Y = \left \{ (y_1,...,y_m) \in [0,1]^m ~ \middle | ~\sum y_i = 1 \right \}\) be standard simplexes, and let \(f(x,y)\) be linear in each of its arguments (that is, \(f\) is bilinear), so that \(f\) can be written as \(f(x,y) = x^{\top}Ay\) for a finite matrix \(A \in \mathbb{R}^{n \times m}\), or equivalently as \(f(x,y) = \sum_{i=1}^n \sum_{j=1}^m A_{ij}x_iy_j\). Then \[\max_{x \in X} \min_{y \in Y} f(x,y) = \min_{y \in Y} \max_{x \in X} f(x,y)\]
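As a quick sanity check (not part of [1] or [10]), the equality in Theorem 1 can be verified numerically for a small \(2 \times 2\) payoff matrix: since \(f\) is linear in each argument, the inner optimum is always attained at a vertex of the simplex, so a one-dimensional grid search over the outer mixed strategy suffices. The matrix \(A\) below is an arbitrary toy example.

```python
# Numerically check max-min = min-max for a toy 2x2 zero-sum game.
# Mixed strategies over two actions are parameterized by a single
# probability: x = (p, 1-p) and y = (q, 1-q).

A = [[3.0, -1.0],
     [-2.0, 2.0]]  # arbitrary toy payoff matrix

def f(p, q):
    """Bilinear payoff f(x, y) = x^T A y with x = (p, 1-p), y = (q, 1-q)."""
    return (p * q * A[0][0] + p * (1 - q) * A[0][1]
            + (1 - p) * q * A[1][0] + (1 - p) * (1 - q) * A[1][1])

GRID = [i / 10000 for i in range(10001)]

# f is linear in q for fixed p, so the min over the simplex is attained
# at a vertex (q = 0 or q = 1); likewise for the max over p.
max_min = max(min(f(p, 0), f(p, 1)) for p in GRID)
min_max = min(max(f(0, q), f(1, q)) for q in GRID)

print(max_min, min_max)  # both equal the game value 0.5
```

The optimal mixed strategies here are \(x^* = (1/2, 1/2)\) and \(y^* = (3/8, 5/8)\), both of which lie exactly on the grid, so the two quantities agree exactly.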

2.0.0.1 Type Inference for \(f\).

Note that \(f\) technically cannot be bilinear over \(X \times Y\) as \(X,Y\) are not vector spaces. It follows that \(f\) must at least be bilinear over some pair of vector spaces \(\hat{X} \supset X, \hat{Y} \supset Y\). From the characterization in Theorem 1 of \(f\) as \(\sum_{i=1}^n \sum_{j=1}^m A_{ij}x_iy_j\), where the \(A_{ij}\) are entries of a finite matrix \(A \in \mathbb{R}^{n \times m}\), we can deduce that the scalar field of \(f\) is \(\mathbb{R}\), and that \(f\) depends on only the first \(n\) (resp. \(m\)) coordinates of its first (resp. second) argument. Now notice that any vector space containing \(X\) must necessarily contain \(\mathsf{span}(X)\), and \(\mathsf{span}(X) = \mathbb{R}^n\) since the \(n\) standard basis vectors for \(\mathbb{R}^n\) already belong to \(X\). Thus, we may take \(\hat{X} = \mathbb{R}^n\) and \(\hat{Y} = \mathbb{R}^m\). All this to conclude that we can safely think of \(f\) as a bilinear function \(\mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}\) (i.e., a real-valued bilinear form).

2.0.0.2 Bilinear vs. continuous quasi-concave-convex

I feel compelled to mention that, as explained in [2], von Neumann actually proved his theorem for the more general case of a continuous function \(f: X \times Y \to \mathbb{R}\) that is quasiconcave in \(X\) and quasiconvex in \(Y\) (Definition 5 requires \(X\) and \(Y\) to be convex subsets of a real vector space). Perhaps to nobody’s surprise, the bilinear function \(f(x,y) = \sum_{i}\sum_{j}A_{ij}x_iy_j\) of Theorem 1 satisfies these conditions. Let us prove this anyway.

Claim 2. Let \(f: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}\) be bilinear. Then \(f\) is continuous, quasiconcave in its first argument, and quasiconvex in its second argument.

Proof. We first prove continuity. We do this by showing that \(f\) is separately continuous (i.e., \(f(x,\cdot)\) and \(f(\cdot, y)\) are continuous for all \(x,y\)), and then appealing to the fact [9] that a separately continuous bilinear map \(f: X \times Y \to Z\) is continuous if \(X\) is a Baire space and \(Y\) is metrizable. Indeed, since \(\mathbb{R}^n\) is a complete metric space it is both Baire and metrizable, so it remains to show that \(f\) is separately continuous. By symmetry it is enough to show that \(f(\cdot, y) : \mathbb{R}^n \to \mathbb{R}\) is continuous. Because \(f(\cdot, y)\) is a linear operator on the finite-dimensional normed space \(\mathbb{R}^n\), it is bounded by Theorem 2.7-8 of [3]. By Theorem 2.7-9 of [3], a bounded linear operator between normed spaces is continuous, so \(f(\cdot, y)\) is continuous.

Next we show \(f(\cdot, y)\) is quasiconcave for all \(y\). Let \(x_1, x_2 \in \mathbb{R}^n\), \(t \in [0,1]\) be arbitrary. We have: \[\begin{equation*} f(tx_1 + (1-t)x_2, y) = tf(x_1,y) + (1-t)f(x_2,y) \geq \min \left \{ f(x_1,y), f(x_2,y) \right \} \end{equation*}\] Similarly we show quasiconvexity for \(f(x, \cdot)\) for all \(x\). Let \(y_1, y_2 \in \mathbb{R}^m\), \(t \in [0,1]\). We have: \[\begin{equation*} f(x, ty_1 + (1-t)y_2) = tf(x,y_1) + (1-t)f(x,y_2) \leq \max \left \{ f(x,y_1), f(x,y_2) \right \} \end{equation*}\] ◻
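The two inequalities in the proof of Claim 2 follow from plain linearity, so they can be spot-checked numerically. Below is a quick check with an arbitrary random toy matrix \(A\) (nothing here is from [1]): for random inputs, the convex-combination identity holds exactly up to floating-point error, and the quasiconcavity bound follows.

```python
import random

random.seed(0)

n, m = 3, 4
# Arbitrary toy matrix A defining the bilinear map f(x, y) = x^T A y.
A = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

def f(x, y):
    """Bilinear map f(x, y) = sum_ij A[i][j] * x[i] * y[j]."""
    return sum(A[i][j] * x[i] * y[j] for i in range(n) for j in range(m))

def combo(u, v, t):
    """Convex combination t*u + (1-t)*v, computed coordinate-wise."""
    return [t * a + (1 - t) * b for a, b in zip(u, v)]

for _ in range(1000):
    x1 = [random.uniform(-1, 1) for _ in range(n)]
    x2 = [random.uniform(-1, 1) for _ in range(n)]
    y = [random.uniform(-1, 1) for _ in range(m)]
    t = random.random()
    lhs = f(combo(x1, x2, t), y)
    # Linearity in the first argument...
    assert abs(lhs - (t * f(x1, y) + (1 - t) * f(x2, y))) < 1e-9
    # ...which implies the quasiconcavity bound from the proof.
    assert lhs >= min(f(x1, y), f(x2, y)) - 1e-9
```

The quasiconvexity bound for \(f(x,\cdot)\) can be checked symmetrically by fixing \(x\) and mixing \(y_1, y_2\).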

3 Minimax in Gentry-Wichs

In order to see exactly how minimax is applied in [1], we introduce relevant notation from the paper. Let \(\mathsf{size}(m)\) denote the set of all circuits of size \(m\), and let \(\mathsf{dist}(m)\) denote the set of all distributions over \(\mathsf{size}(m)\). For example (specifically the example from [1]), if \(s^*(\cdot)\) is some polynomial, then \(\mathsf{dist}(s^*(n)+1)\) is the set of all distributions over circuits of size \(s^*(n)+1\). Further, fix distribution ensembles \(\mathcal{L}_n\) and \(\overline{\mathcal{L}}_n\) over a language \(L\) and its complement \(\bar{L}\), respectively (the paper places further conditions on \(L\), but they are not important if we just want to see how minimax is applied). Fix some joint distribution \(\mathcal{L}^*_n\) over tuples \((x, \pi)\) such that \(x \gets \mathcal{L}_n\), and let \(\mathsf{dist}(\overline{\mathcal{L}}_n)\) be the set of all joint distributions over tuples \((\bar{x}, \bar{\pi})\) with \(\bar{x} \gets \overline{\mathcal{L}}_n\).

Additionally, for purposes of this note let \(\bar{\ell}^*(n) \coloneqq |\bar{x}| + |\bar{\pi}|\) for \((\bar{x}, \bar{\pi}) \gets \overline{\mathcal{L}}^*_n \in \mathsf{dist}(\overline{\mathcal{L}}_n)\). Then \(\overline{\mathcal{L}}^*_n\) takes values in \(\bar{L} \cap \left \{ 0,1 \right \}^{\bar{\ell}^*(n)}\), and we can let \(M \coloneqq |\bar{L} \cap \left \{ 0,1 \right \}^{\bar{\ell}^*(n)}|\) denote the size of the support for any \(\overline{\mathcal{L}}^*_n \in \mathsf{dist}(\overline{\mathcal{L}}_n)\). Similarly, let \(N \coloneqq |\mathsf{size}(s^*(n)+1)|\) denote the size of the support for any \(\mathbb{D}_n \in \mathsf{dist}(s^*(n)+1)\).

By von Neumann’s minimax theorem, the proof of Lemma 3.1 in the paper claims the following equality: \[\begin{align*} &\min_{\overline{\mathcal{L}}^*_n \in \mathsf{dist}(\overline{\mathcal{L}}_n)} ~ \max_{\mathbb{D}_n \in \mathsf{dist}(s^*(n)+1)} ~ \underset{\substack{(\bar{x}, \bar{\pi}) \gets \overline{\mathcal{L}}^*_n \\ D_n \gets \mathbb{D}_n}}{\mathbb{E}} \left [D_n(\bar{x}, \bar{\pi}) - \underset{(x, \pi) \gets \mathcal{L}^*_n}{\Pr} \left [ D_n(x, \pi) = 1 \right ] \right ] \\ =~ &\max_{\mathbb{D}_n \in \mathsf{dist}(s^*(n)+1)} ~ \min_{\overline{\mathcal{L}}^*_n \in \mathsf{dist}(\overline{\mathcal{L}}_n)} ~ \underset{\substack{(\bar{x}, \bar{\pi}) \gets \overline{\mathcal{L}}^*_n \\ D_n \gets \mathbb{D}_n}}{\mathbb{E}} \left [D_n(\bar{x}, \bar{\pi}) - \underset{(x, \pi) \gets \mathcal{L}^*_n}{\Pr} \left [ D_n(x, \pi) = 1 \right ] \right ] \end{align*}\]

In light of Section 2, in order to invoke Theorem 1 in this way, it must be the case that \(\mathsf{dist}(\overline{\mathcal{L}}_n)\) and \(\mathsf{dist}(s^*(n)+1)\) are standard simplexes, and that the above expectation is bilinear.

Indeed, \(\mathsf{dist}(\overline{\mathcal{L}}_n)\) is the set of all finite probability distributions supported over \(\bar{L} \cap \left \{ 0,1 \right \}^{\bar{\ell}^*(n)}\), and each \(\overline{\mathcal{L}}^*_n \in \mathsf{dist}(\overline{\mathcal{L}}_n)\) can be represented by a vector \((p_1,...,p_M)\), where \(p_i\) is the probability that \(\overline{\mathcal{L}}^*_n\) takes the value of the \(i\)-th tuple \((\bar{x}, \bar{\pi})_i \in \bar{L} \cap \left \{ 0,1 \right \}^{\bar{\ell}^*(n)}\). Thus, \(\mathsf{dist}(\overline{\mathcal{L}}_n)\) is the standard \((M-1)\)-simplex. Similarly, \(\mathsf{dist}(s^*(n)+1)\) is the standard \((N-1)\)-simplex, and any \(\mathbb{D}_n \in \mathsf{dist}(s^*(n)+1)\) can be represented by a vector \((q_1,...,q_N)\) where \(q_j\) is the probability that \(\mathbb{D}_n\) takes the value of the \(j\)-th circuit \(D^{(j)}_n \in \mathsf{size}(s^*(n)+1)\).

Now we show that the expectation expression above is bilinear. Begin by defining the function \(g: (\bar{L} \cap \left \{ 0,1 \right \}^{\bar{\ell}^*(n)}) \times \mathsf{size}(s^*(n)+1) \to \mathbb{R}\) as follows: \[\begin{equation*} g((\bar{x}, \bar{\pi}), D_n) \coloneqq D_n(\bar{x}, \bar{\pi}) - \underset{(x, \pi) \gets \mathcal{L}^*_n}{\Pr} \left [ D_n(x, \pi) = 1 \right ] \end{equation*}\] Because we can index the values in the domain of \(g\) with elements of \([M] \times [N]\), we can equivalently define \(g: [M] \times [N] \to \mathbb{R}\) as: \[\begin{equation*} g(i,j) \coloneqq D^{(j)}_n((\bar{x}, \bar{\pi})_i) - \underset{(x, \pi) \gets \mathcal{L}^*_n}{\Pr} \left [ D^{(j)}_n(x, \pi) = 1 \right ] \end{equation*}\] Now let \(f|_{\mathsf{simplex}}: \mathsf{dist}(\overline{\mathcal{L}}_n) \times \mathsf{dist}(s^*(n)+1) \to \mathbb{R}\) be given by \[\begin{align*} f|_{\mathsf{simplex}}(\overline{\mathcal{L}}^*_n, \mathbb{D}_n) &\coloneqq \underset{\substack{(\bar{x}, \bar{\pi}) \gets \overline{\mathcal{L}}^*_n \\ D_n \gets \mathbb{D}_n}}{\mathbb{E}} \left [g((\bar{x}, \bar{\pi}), D_n) \right ] \\ &= \sum_{(\bar{x}, \bar{\pi}) \in \mathsf{supp}(\overline{\mathcal{L}}^*_n)} ~ \sum_{D_n \in \mathsf{supp}(\mathbb{D}_n)} g((\bar{x}, \bar{\pi}), D_n) \Pr \left [ \overline{\mathcal{L}}^*_n = (\bar{x}, \bar{\pi}) \right ] \Pr \left [ \mathbb{D}_n = D_n \right ] \\ &= \sum_{i \in [M]} \sum_{j \in [N]} g(i,j)p_iq_j \end{align*}\] Written in this way, it is clear that the extension \(f\) of \(f|_{\mathsf{simplex}}\) to \(\mathbb{R}^M \times \mathbb{R}^N \to \mathbb{R}\) is bilinear, and therefore the use of the minimax theorem is justified.
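To make the payoff-matrix view concrete, here is a toy numerical sketch. Everything below is a hypothetical stand-in, not the objects from [1]: "instances" are short bitstrings, "distinguishers" are simple predicates, and the subtracted baselines play the role of \(\Pr_{(x,\pi) \gets \mathcal{L}^*_n}[D_n(x,\pi)=1]\). Filling a matrix \(G\) with the values \(g(i,j)\) and representing the two distributions as simplex vectors collapses the expectation to exactly the bilinear form \(p^{\top}Gq\) required by Theorem 1.

```python
# Toy sketch of the payoff-matrix view (hypothetical stand-ins for the
# objects in Gentry-Wichs): "instances" are bitstrings, "distinguishers"
# are predicates, and g(i, j) fills a matrix G so that the expectation
# becomes the bilinear form p^T G q.

instances = ["00", "01", "10"]          # M = 3 toy tuples (x-bar, pi-bar)
distinguishers = [                      # N = 2 toy "circuits"
    lambda s: int(s[0] == "1"),
    lambda s: int(s.count("1") % 2 == 1),
]
# Stand-in for Pr[(x, pi) <- L*_n : D(x, pi) = 1], one value per D.
baseline = [0.5, 0.25]

M, N = len(instances), len(distinguishers)
G = [[distinguishers[j](instances[i]) - baseline[j]
      for j in range(N)] for i in range(M)]

p = [0.2, 0.3, 0.5]   # a distribution over instances (point in the simplex)
q = [0.6, 0.4]        # a distribution over distinguishers

# The expectation, computed by enumerating the independent joint support...
expectation = sum(G[i][j] * p[i] * q[j] for i in range(M) for j in range(N))
# ...equals the bilinear form p^T G q, the shape required by Theorem 1.
bilinear_form = sum(p[i] * sum(G[i][j] * q[j] for j in range(N))
                    for i in range(M))
print(expectation, bilinear_form)
```

In [1] the matrix \(G\) is astronomically large (\(M\) and \(N\) count all short strings and all small circuits), but the minimax theorem only needs it to be finite.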

4 Supplementary Definitions

Definition 3 (Standard Simplex [12]). The standard \(n\)-simplex (or unit \(n\)-simplex) is the subset of \(\mathbb{R}^{n+1}\) given by \[\begin{equation*} \Delta^n = \left \{ (t_0,...,t_n) \in \mathbb{R}^{n+1} ~ \middle | ~\sum_{i=0}^n t_i = 1 ~and~ t_i \geq 0 ~for~ i=0,...,n \right \} \end{equation*}\]

Definition 4 (Bilinear Map). Let \(X,Y,Z\) be three vector spaces over the same base field \(\mathbb{F}\). A bilinear map is a function \(f: X \times Y \to Z\) such that for all \(y \in Y\), the map \(f(\cdot, y): X \to Z\) is linear, and for all \(x \in X\), the map \(f(x,\cdot):Y \to Z\) is linear. If \(Z = \mathbb{F}\), then \(f\) is a bilinear form.

Definition 5 (Quasiconvex function [11]). A function \(f: S \to \mathbb{R}\) defined on a convex subset \(S\) of a real vector space is quasiconvex if for all \(x,y \in S\) and \(t \in [0,1]\) we have \[\begin{equation*} f(tx + (1-t)y) \leq \max \left \{ f(x), f(y) \right \} \end{equation*}\] Alternatively, \(f\) is quasiconcave if \((-f)\) is quasiconvex, and in this case we have: \[\begin{equation*} f(tx + (1-t)y) \geq \min \left \{ f(x), f(y) \right \} \end{equation*}\]

[1]
Gentry, C. and Wichs, D. 2011. Separating succinct non-interactive arguments from all falsifiable assumptions. Proceedings of the forty-third annual ACM symposium on theory of computing (2011), 99–108.
[2]
Kjeldsen, T.H. 2001. John von Neumann’s conception of the minimax theorem: A journey through different mathematical contexts. Archive for History of Exact Sciences. 56, 1 (2001), 39–68.
[3]
Kreyszig, E. 1991. Introductory functional analysis with applications. John Wiley & Sons.
[4]
Neumann, J. v. 1928. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen. 100, 1 (1928), 295–320.
[5]
Sion, M. 1958. On general minimax theorems. (1958).
[6]
Waters, B. and Wu, D.J. 2025. A pure indistinguishability obfuscation approach to adaptively-sound SNARGs for NP. Annual international cryptology conference (2025), 292–326.
[7]
Waters, B. and Wu, D.J. 2024. Adaptively-sound succinct arguments for NP from indistinguishability obfuscation. Proceedings of the 56th annual ACM symposium on theory of computing (2024), 387–398.
[8]
Waters, B. and Zhandry, M. 2024. Adaptive security in SNARGs via iO and lossy functions. Annual international cryptology conference (2024), 72–104.
[9]
Wikipedia contributors 2025. Bilinear map — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Bilinear_map&oldid=1311554362.
[10]
Wikipedia contributors 2025. Minimax theorem — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Minimax_theorem&oldid=1296417469.
[11]
Wikipedia contributors 2025. Quasiconvex function — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Quasiconvex_function&oldid=1303920671.
[12]
Wikipedia contributors 2025. Simplex — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Simplex&oldid=1314265483.