Questions tagged [learning-theory]
This tag is used for questions that are related with following branches: Statistical learning theory, Machine learning, Vapnik–Chervonenkis theory (VC theory) and all other branches that are studied and applied in the area of learning theory that involves various kinds of mathematics.
39
questions with no upvoted or accepted answers
8
votes
0
answers
685
views
The function space defined by deep neural nets
Given a deep net graph and the activation functions on the hidden vertices do we have a description of the function space spanned by it? (even if for some specific architectures and activation ...
7
votes
0
answers
423
views
Does the Mandelbrot set have infinite VC dimension?
Define a binary classifier for points in the complex plane, whose parameter $\theta$ is an isometry of $\mathbb{C}$, and which classifies $z \in \mathbb{C}$ based on whether or not $\theta(z)$ is in ...
6
votes
1
answer
458
views
Why is this nonlinear transformation of an RKHS also an RKHS?
I came across this paper (beginning of page 6) where they stated that if $f,h\in \mathcal{H}$, where $\mathcal{H}$ is an RKHS, then $l_{h,f}=\left|f(x)-h(x)\right|^q$ where $q\geq 1$ also belongs to ...
6
votes
0
answers
116
views
Functional Equation of Zeta Function on Statistical Model
I've been studying [1] because I was interested in his ideas on the zeta function. I'll define it here (c.f. p. 31):
The Kullback-Leibler distance is defined as
$$
K(w)=\int q(x)f(x, w)dx\quad
f(x,w)...
4
votes
0
answers
140
views
Known relations between mutual information and covering number?
This is a question about statistical learning theory. Consider a hypothesis class $\mathcal{F}$, parameterized by real vectors $w \in \mathbb{R}^p$. Suppose I have a data distribution $D \sim \mu$ and ...
4
votes
0
answers
120
views
Progress on "Un-Alching" ML?
So, a couple of years ago I watched both Ali Rahimi's NIPS speech "Machine Learning is Alchemy",
(where he talks about how the field lacks a solid, overarching, theoretical foundation) and ...
4
votes
0
answers
161
views
Convergence rates for kernel empirical risk minimization, i.e empirical risk minimization (ERM) with kernel density estimation (KDE)
Let $\Theta$ be an open subset of some $\mathbb R^m$ and let $P$ be a probability distribution on $\mathbb R^d$ with density $f$ in a Sobolev space $W_p^s(\mathbb R^d)$, i.e all derivatives of $f$ ...
3
votes
0
answers
427
views
Analytic formula for the eigenvalues of kernel integral operator induced by Laplace kernel $K(x,x') = e^{-c\|x-x'\|}$ on unit-sphere in $\mathbb R^d$
Let $d \ge 2$ be an integer and let $X=\mathcal S_{d-1}$ the unit-sphere in $\mathbb R^d$. Let $\tau_d$ be the uniform distribution on $X$. Define a function $K:X \times X \to \mathbb R$ by $K(x,y) := ...
3
votes
0
answers
332
views
From Sudakov minoration principle to lowerbounds on Rademacher complexity
For a compact subset $S \subset \mathbb{R}^n$ (and an implicit metric $d$ on it) and $\epsilon >0$ lets define the following $2$ standard quantities,
Let ${\cal P}(\epsilon,S,d)$ be the $\epsilon-...
3
votes
0
answers
102
views
A largest lattice of a given Vapnik-Chervonekis dimension
Prove (or disprove) that a largest lattice of Vapnik-Chervonekis dimension at most $k$ which has at most $n\cdot k$ join-irreducible and $n\cdot k$ meet-irreducible elements is the distributive ...
2
votes
0
answers
64
views
How to naturally define an output space with certain properties
Consider the following regression problem $v=A(u) + \varepsilon$
for some operator $A:\mathcal{U} \rightarrow \mathcal{V}$ and some function spaces $\mathcal{U},\mathcal{V}$, mapping from $\mathcal{X}$...
2
votes
0
answers
264
views
Covering/Bracketing number of monotone functions on $\mathbb{R}$ with uniformly bounded derivatives
I am interested in the $\| \cdot \|_{\infty}$-norm bracketing number or covering number of some collection of distribution functions on $\mathbb{R}$.
Let $\mathcal{F}$ consist of all distribution ...
2
votes
0
answers
194
views
Shattering with sinusoids
Let $d \geq 2$ and $K$ some positive integer. Consider distinct points $\theta_1, \ldots, \theta_K\in \mathbb{T}^d$ and (not necessarily distinct) $z_1, \ldots, z_K \in \{-1,1\}$ such that $\sum\...
2
votes
0
answers
436
views
Relation between pseudo-dimension and Rademacher complexity
With techniques of Dudley's entropy bound and Haussler's upper bound one can show that there exists a constant $C$ such that any class of $\{0,1\}$ indicator functions with Vapnik-Chervonenkis ...
2
votes
0
answers
187
views
Maximum-likelihood estimation for univariate responses from multivariate data
I am new in the field of machine learning, so I hope I will be able to formulate my question in a clear way...
I have some data represented by vectors $\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n \...