Questions tagged [learning-theory]
This tag is used for questions that are related with following branches: Statistical learning theory, Machine learning, Vapnik–Chervonenkis theory (VC theory) and all other branches that are studied and applied in the area of learning theory that involves various kinds of mathematics.
11
questions
13
votes
3
answers
2k
views
Is there research on Machine Learning techniques to discover conjectures (theorems) in a wide range of mathematics beyond mathematical logic?
Although there already exists active research area, so-called, automated theorem proving, mostly work on logic and elementary geometry.
Rather than only logic and elementary geometry, are there ...
8
votes
2
answers
1k
views
VC dimension, fat-shattering dimension, and other complexity measures, of a class BV functions
I wish to show that a function which is "essentially constant" (defined shortly) can't be a good classifier (machine learning). For this i need to estimate the "complexity" of such a class of ...
7
votes
1
answer
3k
views
Covering number of Lipschitz functions
What do we know about the covering number of $L$-Lipschitz functions mapping say, $\mathbb{R}^n \rightarrow \mathbb{R}$ for some $L >0$?
Only 2 results I have found so far are,
That the $\infty$-...
6
votes
1
answer
458
views
Why is this nonlinear transformation of an RKHS also an RKHS?
I came across this paper (beginning of page 6) where they stated that if $f,h\in \mathcal{H}$, where $\mathcal{H}$ is an RKHS, then $l_{h,f}=\left|f(x)-h(x)\right|^q$ where $q\geq 1$ also belongs to ...
2
votes
3
answers
14k
views
The Polynomial Kernel
I Have seen two versions of the Polynomial Kernel during my time learning Kernel Methods for things such as regression analysis.
1) $\kappa_d(x,y) = (x \cdot y)^d$
2) $\kappa_d(x,y) = (x \cdot y + 1)...
2
votes
1
answer
1k
views
Packing number of Lipschitz functions
For some $L>0$ say ${\cal L}$ is the space of all $L-$Lipschitz functions mapping $(X,\rho) \rightarrow [0,1]$ where $(X,\rho)$ is a metric space.
For any $\alpha >0$ do we know of a ...
1
vote
1
answer
255
views
VC dimension of cone-restricted linear classifiers
Let $\mathcal{C}$ be a pointed, salient cone in $\mathbb{R}^d$. We may also assume that $\mathcal{C}$ is full-dimensional. Consider the set of binary classifiers $$\mathcal{H} = \{\boldsymbol{x}\...
1
vote
2
answers
314
views
Use covering number to get uniform concentration from pointwise concentration
Let $\Theta$ be a subset of a metric space. Suppose $(X_\theta)_{\theta \in \Theta}$ is a random process on $\Theta$ which is $L$-Lipschitz and with the property that there exists constants $A, B>0$...
1
vote
1
answer
201
views
Rademacher complexity of function class $(x,y) \mapsto 1[|yf(x)-\alpha| \ge \beta]$ in terms of $\alpha$, $\beta$, and Rademacher complexity of $F$
Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...
1
vote
0
answers
56
views
Properties of a kernel convolution $K'(x,y) = \int_X\int_X K_0(x,a)K(a,b)K_0(b,y)d\mu(a)d\mu(b)$ where $K$ and $K_0$ are kernels on $(X,\mu)$
Let $(X,\mu)$ be a probability measure space and $K:X \times X \to \mathbb R$ be a (psd) kernel on $X$. Let $K_0$ be another kernel on $X$ and defined a new kernel $\widetilde K$ on $X$ by
$$
\...
0
votes
1
answer
203
views
VC dimension of a certain derived class of binary functions
Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...