Questions tagged [learning-theory]
This tag is used for questions that are related with following branches: Statistical learning theory, Machine learning, Vapnik–Chervonenkis theory (VC theory) and all other branches that are studied and applied in the area of learning theory that involves various kinds of mathematics.
100
questions
0
votes
2
answers
287
views
Statistical divergence
Does anyone know about a statistical divergence of this type?
\begin{equation}
\text{D}(P||Q) = \frac{1}{2} \left[\text{KL}(M||P) + \text{KL}(M||Q)\right]
\end{equation}
where $M = \frac{1}{2} [P+Q]$....
2
votes
1
answer
658
views
Why we use Rademacher complexity for generalization error when we can have a trained function?
Let $G$ be a family of functions mapping from $Z$ to $[a, b]$ and $S=\left(z_{1}, \ldots, z_{m}\right)$ a fixed sample of size $m$ with elements in $Z$ . Then, the empirical Rademacher complexity of $...
23
votes
1
answer
4k
views
Relation between information geometry and geometric deep learning
Disclaimer: This is a cross-post from a very similar question on math.SE. I allowed myself to post it here after reading this meta
post about cross-posting between mathoverflow and math.SE, I did
...
6
votes
0
answers
116
views
Functional Equation of Zeta Function on Statistical Model
I've been studying [1] because I was interested in his ideas on the zeta function. I'll define it here (c.f. p. 31):
The Kullback-Leibler distance is defined as
$$
K(w)=\int q(x)f(x, w)dx\quad
f(x,w)...
0
votes
0
answers
149
views
Function classes with high Rademacher complexity
My question is two fold,
Is there any general understanding of what makes a function class have high Rademacher complexity? (Sudakov minoration would say that one sufficient condition for a class of ...
2
votes
0
answers
194
views
Shattering with sinusoids
Let $d \geq 2$ and $K$ some positive integer. Consider distinct points $\theta_1, \ldots, \theta_K\in \mathbb{T}^d$ and (not necessarily distinct) $z_1, \ldots, z_K \in \{-1,1\}$ such that $\sum\...
1
vote
2
answers
314
views
Use covering number to get uniform concentration from pointwise concentration
Let $\Theta$ be a subset of a metric space. Suppose $(X_\theta)_{\theta \in \Theta}$ is a random process on $\Theta$ which is $L$-Lipschitz and with the property that there exists constants $A, B>0$...
1
vote
1
answer
405
views
Growth rate of bounded Lipschitz functions on compact finite-dimensional space
Let $\mathcal X$ be a metric space of diameter $D$ and "dimension" (e.g doubling dimension) $d$. Let $L \in [0, \infty]$ and $M \in [0, \infty)$ and consider the class $\mathcal H_{M,L}$ of $L$-...
1
vote
2
answers
347
views
Is it possible to “solve” iterative (convex/non-convex) optimization problems via learning (one-shot)?
I posted a following question in MSE, but I think it should be posted here in MO. Since I don't know how to transfer the post from MSE to MO, I have pasted the question below. Thank you in advance and ...
2
votes
2
answers
521
views
Lower bound on misclassification rate of Lipschitz functions in terms of Lipschitz constant
Important note
@MateuszKwaśnicki in the comment section has raised a fundamental issue with the current statement of the problem. I'm trying to bugfix it.
Setup
I wish to show that a Lipschitz ...
8
votes
2
answers
1k
views
VC dimension, fat-shattering dimension, and other complexity measures, of a class BV functions
I wish to show that a function which is "essentially constant" (defined shortly) can't be a good classifier (machine learning). For this i need to estimate the "complexity" of such a class of ...
3
votes
0
answers
332
views
From Sudakov minoration principle to lowerbounds on Rademacher complexity
For a compact subset $S \subset \mathbb{R}^n$ (and an implicit metric $d$ on it) and $\epsilon >0$ lets define the following $2$ standard quantities,
Let ${\cal P}(\epsilon,S,d)$ be the $\epsilon-...
2
votes
1
answer
1k
views
Packing number of Lipschitz functions
For some $L>0$ say ${\cal L}$ is the space of all $L-$Lipschitz functions mapping $(X,\rho) \rightarrow [0,1]$ where $(X,\rho)$ is a metric space.
For any $\alpha >0$ do we know of a ...
3
votes
0
answers
102
views
A largest lattice of a given Vapnik-Chervonekis dimension
Prove (or disprove) that a largest lattice of Vapnik-Chervonekis dimension at most $k$ which has at most $n\cdot k$ join-irreducible and $n\cdot k$ meet-irreducible elements is the distributive ...
7
votes
1
answer
3k
views
Covering number of Lipschitz functions
What do we know about the covering number of $L$-Lipschitz functions mapping say, $\mathbb{R}^n \rightarrow \mathbb{R}$ for some $L >0$?
Only 2 results I have found so far are,
That the $\infty$-...