Skip to main content

Questions tagged [learning-theory]

This tag is used for questions that are related with following branches: Statistical learning theory, Machine learning, Vapnik–Chervonenkis theory (VC theory) and all other branches that are studied and applied in the area of learning theory that involves various kinds of mathematics.

13 votes
3 answers
2k views

Is there research on Machine Learning techniques to discover conjectures (theorems) in a wide range of mathematics beyond mathematical logic?

Although there already exists active research area, so-called, automated theorem proving, mostly work on logic and elementary geometry. Rather than only logic and elementary geometry, are there ...
Xingdong Zuo's user avatar
8 votes
2 answers
1k views

VC dimension, fat-shattering dimension, and other complexity measures, of a class BV functions

I wish to show that a function which is "essentially constant" (defined shortly) can't be a good classifier (machine learning). For this i need to estimate the "complexity" of such a class of ...
dohmatob's user avatar
  • 6,824
7 votes
1 answer
3k views

Covering number of Lipschitz functions

What do we know about the covering number of $L$-Lipschitz functions mapping say, $\mathbb{R}^n \rightarrow \mathbb{R}$ for some $L >0$? Only 2 results I have found so far are, That the $\infty$-...
gradstudent's user avatar
  • 2,186
6 votes
1 answer
458 views

Why is this nonlinear transformation of an RKHS also an RKHS?

I came across this paper (beginning of page 6) where they stated that if $f,h\in \mathcal{H}$, where $\mathcal{H}$ is an RKHS, then $l_{h,f}=\left|f(x)-h(x)\right|^q$ where $q\geq 1$ also belongs to ...
Kashif's user avatar
  • 373
2 votes
3 answers
14k views

The Polynomial Kernel

I Have seen two versions of the Polynomial Kernel during my time learning Kernel Methods for things such as regression analysis. 1) $\kappa_d(x,y) = (x \cdot y)^d$ 2) $\kappa_d(x,y) = (x \cdot y + 1)...
mrehayden's user avatar
2 votes
1 answer
1k views

Packing number of Lipschitz functions

For some $L>0$ say ${\cal L}$ is the space of all $L-$Lipschitz functions mapping $(X,\rho) \rightarrow [0,1]$ where $(X,\rho)$ is a metric space. For any $\alpha >0$ do we know of a ...
gradstudent's user avatar
  • 2,186
1 vote
1 answer
255 views

VC dimension of cone-restricted linear classifiers

Let $\mathcal{C}$ be a pointed, salient cone in $\mathbb{R}^d$. We may also assume that $\mathcal{C}$ is full-dimensional. Consider the set of binary classifiers $$\mathcal{H} = \{\boldsymbol{x}\...
S.B.'s user avatar
  • 215
1 vote
2 answers
314 views

Use covering number to get uniform concentration from pointwise concentration

Let $\Theta$ be a subset of a metric space. Suppose $(X_\theta)_{\theta \in \Theta}$ is a random process on $\Theta$ which is $L$-Lipschitz and with the property that there exists constants $A, B>0$...
dohmatob's user avatar
  • 6,824
1 vote
1 answer
201 views

Rademacher complexity of function class $(x,y) \mapsto 1[|yf(x)-\alpha| \ge \beta]$ in terms of $\alpha$, $\beta$, and Rademacher complexity of $F$

Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...
dohmatob's user avatar
  • 6,824
1 vote
0 answers
56 views

Properties of a kernel convolution $K'(x,y) = \int_X\int_X K_0(x,a)K(a,b)K_0(b,y)d\mu(a)d\mu(b)$ where $K$ and $K_0$ are kernels on $(X,\mu)$

Let $(X,\mu)$ be a probability measure space and $K:X \times X \to \mathbb R$ be a (psd) kernel on $X$. Let $K_0$ be another kernel on $X$ and defined a new kernel $\widetilde K$ on $X$ by $$ \...
dohmatob's user avatar
  • 6,824
0 votes
1 answer
203 views

VC dimension of a certain derived class of binary functions

Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...
dohmatob's user avatar
  • 6,824