
In the context of learning theory, we usually have: data $(x,y)\sim P(x,y)$, with $x\in\mathcal{X}\subseteq\mathbb{R}^d$ and $y\in\mathcal{Y}\subseteq\mathbb{R}^k$; a hypothesis class $\mathcal{F}\subseteq\Omega$, where $\Omega$ is the set of measurable functions $\mathcal{X}\rightarrow\mathcal{Y}$; and a loss function $\ell:\mathbb{R}^k\times\mathbb{R}^k\to\mathbb{R}_+$. We also define the risk functional $R$ by $R(f)=\mathbb{E}_{x,y}[\ell(y,f(x))]$.

A Bayes model $f^*$ is defined as any function in $\Omega$ such that $\forall f\in\Omega$: $R(f^*)\leq R(f)$.

A "best in class" $f^*_{\mathcal{F}}$ is defined as any function in $\mathcal{F}$ such that $\forall f\in\mathcal{F}$: $R(f^*_{\mathcal{F}})\leq R(f)$.

Given the random variable $D=((x_1,y_1),\dots,(x_n,y_n))\sim P(x,y)^n$, define $R_{emp}(f; D)=\frac{1}{n}\sum_{i=1}^n\ell(y_i,f(x_i))$. Now, an empirical risk minimiser $f_{erm}$ is defined as any function in $\mathcal{F}$ such that $\forall f\in\mathcal{F}$: $R_{emp}(f_{erm}; D)\leq R_{emp}(f; D)$.
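To make the ERM definition concrete, here is a minimal sketch (my own illustration, not part of the original question): when $\mathcal{F}$ is a *finite* class, the minimum of $R_{emp}(\cdot\,; D)$ is a minimum over finitely many numbers, so $f_{erm}$ trivially exists. The threshold class and the helper `emp_risk` below are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1))             # n = 20 samples, d = 1
y = (X[:, 0] > 0).astype(float)          # labels from a threshold rule

# A finite hypothesis class F of threshold classifiers f_t(x) = 1{x > t}
thresholds = np.linspace(-2, 2, 41)
F = [lambda X_, t=t: (X_[:, 0] > t).astype(float) for t in thresholds]

def emp_risk(f, X, y):
    """Empirical 0/1 risk: (1/n) * number of misclassified points."""
    return np.mean(f(X) != y)

risks = [emp_risk(f, X, y) for f in F]
f_erm = F[int(np.argmin(risks))]         # argmin over a finite set is always attained
print(min(risks))
```

For infinite classes this argument no longer applies directly, which is exactly where the existence question becomes non-trivial.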

My question is: what conditions are needed to guarantee the existence of these functions? (I am not interested in uniqueness.) If one of these conditions is the continuity of $\ell$, do these quantities still exist for the $0/1$ loss?

  • It doesn’t exist in general. For regression, I believe any strongly convex loss function would do the trick. For classification, it is common to use KL divergence (e.g. MLE for logistic regression), and in this case there is no way to guarantee existence. By the way, your question is a bit unprincipled: the way you asked it, randomness does not come into play at all. Are you sure you asked what you wanted to ask?
    – Andrew
    Commented Jan 24 at 3:22
