
Let us consider the following two-player game between Chooser and Guesser. There is a finite set $\Omega$ and $k$ probability distributions on $\Omega$, denoted by $ \mathcal{P} =\{P_1,\ldots,P_k\} $.

Chooser picks a distribution $Q\in \mathcal{P} $ and presents an $X\in\Omega$ drawn according to $Q$ to Guesser. Guesser has pre-committed to a function $\psi:\Omega\to\mathcal{P}$ according to which he will infer the identity of $Q$. The loss suffered by the Guesser is the probability that his inference is incorrect, under the Chooser's adaptive (worst-case) choice of $Q$: $$ L(\psi) = \max_{Q\in \mathcal{P}} \sum_{x\in\Omega}Q(x)\boldsymbol{1}[ Q \neq \psi(x) ]. $$

We say that $\psi^*$ is minimax-optimal (for $\mathcal{P}$) if $$ L(\psi^*) = \min_{ \psi\in \mathcal{P}^\Omega } L(\psi). $$

For binary hypothesis testing, it is an immediate consequence of the Neyman-Pearson Lemma that the maximum-likelihood rule yields the minimax-optimal inference. Namely, for $k=2$, the choice $\psi(x)=\mathrm{argmax}\{P_1(x),P_2(x)\}$ is easily seen to be minimax-optimal. For $k=3$, this is no longer the case. Indeed, one can easily construct three distributions $P_1,P_2,P_3$ such that for all $x\in\Omega$, $P_3(x)\le \max\{P_1(x),P_2(x)\}$. In this case, the choice $\psi(x)=\mathrm{argmax}\{P_1(x),P_2(x),P_3(x)\}$ never selects $P_3$, so the adaptive adversary will always choose $Q=P_3$, for a maximal loss of $1$.
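As a concrete instance of this construction (these particular distributions are my own illustration, not from the original post), a quick numerical check:

```python
import numpy as np

# Three distributions on Omega = {0, 1}; P3 is pointwise dominated by
# max(P1, P2), so the argmax (maximum-likelihood) rule never outputs P3.
P = np.array([[0.9, 0.1],
              [0.1, 0.9],
              [0.5, 0.5]])

def ml_rule(x):
    # Index of the most likely distribution given observation x.
    return int(np.argmax(P[:, x]))

guesses = {ml_rule(x) for x in range(2)}
assert 2 not in guesses  # P3 is never in the image of the ML rule

# Adversary plays Q = P3: every draw is misclassified, so the loss is 1.
loss_P3 = sum(P[2, x] * (ml_rule(x) != 2) for x in range(2))
print(loss_P3)  # 1.0
```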

Question. What is known about this problem? In particular, what is known about the minimax decision rule for $k>2$?

Update: My claims about the implications of the Neyman-Pearson Lemma are incorrect and in particular the maximum-likelihood $\psi$ is not always minimax-optimal. See the answer by Mr X.


2 Answers


A deterministic $\psi$ is bound to fail: you must randomize. Consider three probability measures supported on $\{1,2\}$. A deterministic $\psi$ has at most two of the measures in its image; the adversary chooses the missing one, again yielding a loss of $1$. Once you allow $\psi$ to be randomized, you have a zero-sum game, which for small $\Omega$ can be solved quickly via linear programming.
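A minimal sketch of that linear program using `scipy.optimize.linprog` (the helper name and the example measures below are my own illustration): we minimize the game value $v$ subject to each distribution's error probability being at most $v$, with $\psi(x,\cdot)$ a probability vector for every $x$.

```python
import numpy as np
from scipy.optimize import linprog

def minimax_guesser(P):
    """P: (k, n) array of k distributions on n points.
    Returns (value, psi), where psi[x, i] = Pr[guess P_i | observe x]."""
    k, n = P.shape
    num_vars = 1 + n * k           # v, then psi flattened as (x, i)
    c = np.zeros(num_vars)
    c[0] = 1.0                     # minimize v
    # For each j: loss_j = 1 - sum_x P_j(x) psi[x, j] <= v
    A_ub = np.zeros((k, num_vars))
    b_ub = -np.ones(k)
    for j in range(k):
        A_ub[j, 0] = -1.0
        for x in range(n):
            A_ub[j, 1 + x * k + j] = -P[j, x]
    # For each x: sum_i psi[x, i] = 1
    A_eq = np.zeros((n, num_vars))
    b_eq = np.ones(n)
    for x in range(n):
        A_eq[x, 1 + x * k : 1 + (x + 1) * k] = 1.0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * num_vars)
    return res.fun, res.x[1:].reshape(n, k)

# Three measures on a two-point space: every deterministic rule has
# worst-case loss 1, but the randomized value is 1/2.
P = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
value, psi = minimax_guesser(P)
print(round(value, 6))  # 0.5
```

The randomized value $1/2$ here beats the deterministic value $1$, illustrating the gap the answer describes.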


The argmax choice function minimizes the sum of errors for any $k$, but not the maximum of errors, even for $k=2$. In fact, minimizing the maximum of errors is $NP$-hard even for the case $P_1=P_2=P$: in this case you minimize $\max\{x,1-x\}$, where $x$ ranges over the subset sums of $P=(x_1,\dots,x_n)$, so you're looking for the subset sum closest to $1/2$. This is a variant of the subset sum problem in which the target is half the total sum.

This variant is also $NP$-complete: given a general instance $x_1,\dots,x_n$ of the subset sum problem, we can reduce it to this case. First scale it so that the total sum is $1/3$ and the target sum is $t<\frac{1}{3}$. Define $P=(x_1,\dots,x_n,\frac{1}{2}-t,\frac{1}{6}+t)$. Its total sum is $1$. If a subset of the $x_i$ sums to $t$, then together with $\frac{1}{2}-t$ it sums to $\frac{1}{2}$. Conversely, take any subset of $P$. If it includes both $\frac{1}{2}-t$ and $\frac{1}{6}+t$, it sums to at least $2/3$. If it includes neither, it sums to at most $1/3$. So to hit $1/2$ it must include exactly one of them, and we may assume it includes $\frac{1}{2}-t$ (by taking the complement in the other case). The remaining elements from $x_1,\dots,x_n$ then sum to $t$.
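The reduction above can be sanity-checked by brute force on tiny instances (the helper names and the specific numbers are my own illustration):

```python
from itertools import combinations

def has_subset_sum(xs, target, tol=1e-9):
    # Brute force over all subsets; fine for tiny instances.
    return any(abs(sum(c) - target) < tol
               for r in range(len(xs) + 1)
               for c in combinations(xs, r))

def reduce_to_half(xs, t):
    """Scale instance (xs, t) so the elements total 1/3, then append
    the two padding elements 1/2 - t' and 1/6 + t' from the reduction."""
    scale = (1 / 3) / sum(xs)
    ys = [x * scale for x in xs]
    tp = t * scale
    return ys + [0.5 - tp, 1 / 6 + tp]

# Yes-instance: {3, 5, 8} has a subset summing to 8.
P_yes = reduce_to_half([3, 5, 8], 8)
# No-instance: no subset of {3, 5, 8} sums to 2.
P_no = reduce_to_half([3, 5, 8], 2)

print(has_subset_sum(P_yes, 0.5))  # True
print(has_subset_sum(P_no, 0.5))   # False
```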

Note that the instances we use are all degenerate: the goal of $\psi$ is to infer the distribution from the data, and if $P_1=P_2$ the data clearly tells you nothing. The fact that the problem is $NP$-hard even in this degenerate case means that you probably want to ask a different question.

One good idea is to look at randomized strategies (as Yuval has suggested); finding the optimal strategy is then easy via linear programming with $k|\Omega|+1$ variables: $k|\Omega|$ describing the randomized $\psi$, and one more variable upper-bounding the linear loss expressions appearing in the maximum.

