Skip to main content

Questions tagged [learning-theory]

This tag is used for questions that are related with following branches: Statistical learning theory, Machine learning, Vapnik–Chervonenkis theory (VC theory) and all other branches that are studied and applied in the area of learning theory that involves various kinds of mathematics.

0 votes
2 answers
287 views

Statistical divergence

Does anyone know about a statistical divergence of this type? \begin{equation} \text{D}(P||Q) = \frac{1}{2} \left[\text{KL}(M||P) + \text{KL}(M||Q)\right] \end{equation} where $M = \frac{1}{2} [P+Q]$....
Apprentice's user avatar
2 votes
1 answer
658 views

Why we use Rademacher complexity for generalization error when we can have a trained function?

Let $G$ be a family of functions mapping from $Z$ to $[a, b]$ and $S=\left(z_{1}, \ldots, z_{m}\right)$ a fixed sample of size $m$ with elements in $Z$ . Then, the empirical Rademacher complexity of $...
lee's user avatar
  • 53
23 votes
1 answer
4k views

Relation between information geometry and geometric deep learning

Disclaimer: This is a cross-post from a very similar question on math.SE. I allowed myself to post it here after reading this meta post about cross-posting between mathoverflow and math.SE, I did ...
Blupon's user avatar
  • 333
6 votes
0 answers
116 views

Functional Equation of Zeta Function on Statistical Model

I've been studying [1] because I was interested in his ideas on the zeta function. I'll define it here (c.f. p. 31): The Kullback-Leibler distance is defined as $$ K(w)=\int q(x)f(x, w)dx\quad f(x,w)...
Matt Cuffaro's user avatar
0 votes
0 answers
149 views

Function classes with high Rademacher complexity

My question is two fold, Is there any general understanding of what makes a function class have high Rademacher complexity? (Sudakov minoration would say that one sufficient condition for a class of ...
gradstudent's user avatar
  • 2,186
2 votes
0 answers
194 views

Shattering with sinusoids

Let $d \geq 2$ and $K$ some positive integer. Consider distinct points $\theta_1, \ldots, \theta_K\in \mathbb{T}^d$ and (not necessarily distinct) $z_1, \ldots, z_K \in \{-1,1\}$ such that $\sum\...
Rajesh D's user avatar
  • 724
1 vote
2 answers
314 views

Use covering number to get uniform concentration from pointwise concentration

Let $\Theta$ be a subset of a metric space. Suppose $(X_\theta)_{\theta \in \Theta}$ is a random process on $\Theta$ which is $L$-Lipschitz and with the property that there exists constants $A, B>0$...
dohmatob's user avatar
  • 6,824
1 vote
1 answer
405 views

Growth rate of bounded Lipschitz functions on compact finite-dimensional space

Let $\mathcal X$ be a metric space of diameter $D$ and "dimension" (e.g doubling dimension) $d$. Let $L \in [0, \infty]$ and $M \in [0, \infty)$ and consider the class $\mathcal H_{M,L}$ of $L$-...
dohmatob's user avatar
  • 6,824
1 vote
2 answers
347 views

Is it possible to “solve” iterative (convex/non-convex) optimization problems via learning (one-shot)?

I posted a following question in MSE, but I think it should be posted here in MO. Since I don't know how to transfer the post from MSE to MO, I have pasted the question below. Thank you in advance and ...
user550103's user avatar
2 votes
2 answers
521 views

Lower bound on misclassification rate of Lipschitz functions in terms of Lipschitz constant

Important note @MateuszKwaśnicki in the comment section has raised a fundamental issue with the current statement of the problem. I'm trying to bugfix it. Setup I wish to show that a Lipschitz ...
dohmatob's user avatar
  • 6,824
8 votes
2 answers
1k views

VC dimension, fat-shattering dimension, and other complexity measures, of a class BV functions

I wish to show that a function which is "essentially constant" (defined shortly) can't be a good classifier (machine learning). For this i need to estimate the "complexity" of such a class of ...
dohmatob's user avatar
  • 6,824
3 votes
0 answers
332 views

From Sudakov minoration principle to lowerbounds on Rademacher complexity

For a compact subset $S \subset \mathbb{R}^n$ (and an implicit metric $d$ on it) and $\epsilon >0$ lets define the following $2$ standard quantities, Let ${\cal P}(\epsilon,S,d)$ be the $\epsilon-...
gradstudent's user avatar
  • 2,186
2 votes
1 answer
1k views

Packing number of Lipschitz functions

For some $L>0$ say ${\cal L}$ is the space of all $L-$Lipschitz functions mapping $(X,\rho) \rightarrow [0,1]$ where $(X,\rho)$ is a metric space. For any $\alpha >0$ do we know of a ...
gradstudent's user avatar
  • 2,186
3 votes
0 answers
102 views

A largest lattice of a given Vapnik-Chervonekis dimension

Prove (or disprove) that a largest lattice of Vapnik-Chervonekis dimension at most $k$ which has at most $n\cdot k$ join-irreducible and $n\cdot k$ meet-irreducible elements is the distributive ...
Lviv Scottish Book's user avatar
7 votes
1 answer
3k views

Covering number of Lipschitz functions

What do we know about the covering number of $L$-Lipschitz functions mapping say, $\mathbb{R}^n \rightarrow \mathbb{R}$ for some $L >0$? Only 2 results I have found so far are, That the $\infty$-...
gradstudent's user avatar
  • 2,186

15 30 50 per page
1 2 3
4
5
7