Skip to main content

All Questions

3 votes
2 answers
314 views

Minimax optimal multiple hypothesis test

Let us consider the following two-player game between Chooser and Guesser. There is a finite set $\Omega$ and $k$ probability distributions on $\Omega$, denoted by $ \mathcal{P} =\{P_1,\ldots,P_k\} $. ...
Aryeh Kontorovich's user avatar
3 votes
1 answer
100 views

When does the optimal model exist in learning theory?

In the context of learning theory, we usually have: data $(x,y)\sim P(x,y)$, with $x\in\mathcal{X}\subseteq\mathbb{R}^d$ and $y\in\mathcal{Y}\subseteq\mathbb{R}^k$, a hypothesis class $\mathcal{F}\...
rick's user avatar
  • 179
4 votes
0 answers
140 views

Known relations between mutual information and covering number?

This is a question about statistical learning theory. Consider a hypothesis class $\mathcal{F}$, parameterized by real vectors $w \in \mathbb{R}^p$. Suppose I have a data distribution $D \sim \mu$ and ...
Tanishq Kumar's user avatar
1 vote
2 answers
216 views

Beating the $1/\sqrt n$ rate of uniform-convergence over a linear function class

Let $P$ be a probability distribution on $\mathbb R^d \times \mathbb R$, and let $(x_1,y_1), \ldots, (x_n,y_n)$ be an iid sample of size $n$ from $P$. Fix $\epsilon,t\gt 0$. For any unit-vector $w \in ...
dohmatob's user avatar
  • 6,824
0 votes
1 answer
91 views

Is it reasonable to consider the subgaussian property of the logarithm of the Gaussian pdf?

Let $Y$ denote a Gaussian random variable characterized by a mean $\mu$ and a variance $\sigma^2$. Consider $N$ independent and identically distributed (i.i.d.) copies of $Y$, denoted as $Y_1, Y_2, \...
Math_Y's user avatar
  • 311
1 vote
1 answer
189 views

Rademacher complexity for a family of bounded, nondecreasing functions?

Let $\{\phi_k\}_{k=1}^K$ be a family of functions mapping from an interval $[a, b]$ to $[-1, 1]$. That is, $\phi_k \colon[ a,b] \to [-1, 1]$ are nondecreasing maps on some finite interval $[a, b] \...
Drew Brady's user avatar
1 vote
1 answer
163 views

Tight upper-bounds for the Gaussian width of intersection of intersection of hyper-ellipsoid and unit-ball

Let $\Lambda$ be a positive-definite matrix of size $n$ and let $R \ge 0$, which may depend on $n$. Consider the set $S := \{x \in \mathbb R^n \mid \|x\|_2 \le R,\,\|x\|_{\Lambda^{-1}} \le 1\}$ where $...
dohmatob's user avatar
  • 6,824
1 vote
0 answers
24 views

Minimax statistical estimation of proximal transform $\mbox{prox}_g(\theta_0)$, from linear model data $y_i := x_i^\top \theta_0 + \epsilon_i$

tl;dr: My question pertains the subject of minimax estimation theory (mathematical statistics), in the context of linear regression. Given a vector $\theta_0 \in \mathbb R^d$, consider the linear ...
dohmatob's user avatar
  • 6,824
1 vote
0 answers
368 views

Conditions for equivalence of RKHS norm and $L^2(P)$ norm

Let $K$ be a psd kernel on an abstract space $X$ and let $H_K$ be the induced Reproducing Kernel Hilbert Space (RKHS). Let $P$ be a probability measure on $X$ such that $H_K \subseteq L^2(P_X)$ and ...
dohmatob's user avatar
  • 6,824
0 votes
0 answers
95 views

Verification of a certain computation of VC dimension

Disclaimer: I'm not very familiar with the concept of VC dimensions and how to manipulate such objects. I'd be grateful if expects on the subject (learning theory, probability), could kindly proof ...
dohmatob's user avatar
  • 6,824
0 votes
1 answer
203 views

VC dimension of a certain derived class of binary functions

Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...
dohmatob's user avatar
  • 6,824
1 vote
1 answer
201 views

Rademacher complexity of function class $(x,y) \mapsto 1[|yf(x)-\alpha| \ge \beta]$ in terms of $\alpha$, $\beta$, and Rademacher complexity of $F$

Let $X$ be a measurable space and let $P$ be a probability distribution on $X \times \{\pm 1\}$. Let $F$ be a function class on $X$, i.e., a collection of (measurable) functions from $X$ to $\mathbb R$...
dohmatob's user avatar
  • 6,824
0 votes
0 answers
171 views

Upper-bound for bracketing number in terms of VC-dimension

Let $P$ be a probability distribution on a measurable space $\mathcal X$ (e.g;, some euclidean $\mathbb R^m$) and let $F$ be a class of funciton $f:\mathcal X \to \mathbb R$. Given, $f_1,f_2 \in F$, ...
dohmatob's user avatar
  • 6,824
1 vote
0 answers
95 views

$L_1$ convergence rates for multivariate kernel density estimation

Let $X$ be a random variable on $\mathbb R^d$ with probability density function $f$, and let $X_1,\ldots,X_n$ of $X$ be $n$ iid copies of $X$. Given a bandwidth parameter $h=h_n > 0$ and a kernel $...
dohmatob's user avatar
  • 6,824
4 votes
0 answers
161 views

Convergence rates for kernel empirical risk minimization, i.e empirical risk minimization (ERM) with kernel density estimation (KDE)

Let $\Theta$ be an open subset of some $\mathbb R^m$ and let $P$ be a probability distribution on $\mathbb R^d$ with density $f$ in a Sobolev space $W_p^s(\mathbb R^d)$, i.e all derivatives of $f$ ...
dohmatob's user avatar
  • 6,824

15 30 50 per page