Newest 'machine-learning+pr.probability' Questions

2 votes

0 answers

158 views

How to choose N policemen positions to catch a drunk driver in the most effective way (on a Cayley graph of a finite group)?

Consider a Cayley graph of some big finite group. Consider random walk on such a graph - think of it as drunk driver. Fix some number $N$ which is much smaller than group size. Question 1: How to ...

Alexander Chervov

24.4k

asked Jun 1 at 21:04

1 vote

0 answers

36 views

Interpolation in convex hull

I'm reading a paper, Learning in High Dimension Always Amounts to Extrapolation, that provides a result I don't understand. It provides this theorem which I do understand: Theorem 1: (Bárány and ...

Christopher D'Arcy

11

asked May 14 at 18:44

1 vote

1 answer

76 views

Bayes classifiers with cost of misclassification

A minimum ECM classifier disciminate the features $\underline{x}$ to belong to class $t$ ($\delta(\underline{x}) = t$) if $\forall j \ne t$: $$\sum_{k\ne t} c(t|k) f_k(\underline{x})p_k \le \sum_{k\ne ...

BiasedBayes

11

asked May 1 at 14:10

2 votes

1 answer

119 views

How to lower bound the absolute value of the difference of two Kullback-Leibler divergences given the constrains below?

Given that $\min (D[p_1||p_3],D[p_2||p_4])=a$, how to find a lower bound of the difference $\left \vert D[p_1\parallel p_2]-D[p_3\parallel p_4] \right\vert$ with respect to $a$? If the condition is ...

Richard Ben

31

asked Apr 18 at 15:06

6 votes

0 answers

181 views

What are compact manifolds such that GROWTH (of spheres volumes) is well approximated by the Gaussian normal distribution?

Consider some compact Riemannian manifold $M$. Fix some point $p$. Consider a "sub-sphere of radius $r$" - i.e. set of points on distance $r$ from $p$. Consider growth function $g(r)$ to be ...

Alexander Chervov

24.4k

asked Mar 29 at 18:04

0 votes

0 answers

85 views

Some new questions on Rademacher complexity

For $A\subset R^n$,$A=(a_1,a_2,\dots, a_n)$, $\sigma_i$ are Rademacher random variable. Is $|\mathbb{E}_\sigma \inf_{a\in A}\sum_{i=1}^n\sigma_ia_i| \le |\mathbb{E}_\sigma \sup_{a\in A}\sum_{i=1}^n\...

Hao Yu

185

asked Feb 28 at 8:36

3 votes

1 answer

100 views

When does the optimal model exist in learning theory?

In the context of learning theory, we usually have: data $(x,y)\sim P(x,y)$, with $x\in\mathcal{X}\subseteq\mathbb{R}^d$ and $y\in\mathcal{Y}\subseteq\mathbb{R}^k$, a hypothesis class $\mathcal{F}\...

rick

179

asked Jan 23 at 22:07

2 votes

1 answer

110 views

expectation of the product of Gaussian kernels and their input

I was wondering if anybody knows how to solve: $$\mathbb{E}{\mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})}\left[ (\mathbf{x}{i} - \mathbf{z})(\mathbf{x}{j} - \mathbf{z})^\top \exp\left( - (\...

wsz_fantasy

245

asked Jan 16 at 3:09

4 votes

0 answers

140 views

Known relations between mutual information and covering number?

This is a question about statistical learning theory. Consider a hypothesis class $\mathcal{F}$, parameterized by real vectors $w \in \mathbb{R}^p$. Suppose I have a data distribution $D \sim \mu$ and ...

Tanishq Kumar

141

asked Dec 31, 2023 at 7:45

1 vote

2 answers

216 views

Beating the $1/\sqrt n$ rate of uniform-convergence over a linear function class

Let $P$ be a probability distribution on $\mathbb R^d \times \mathbb R$, and let $(x_1,y_1), \ldots, (x_n,y_n)$ be an iid sample of size $n$ from $P$. Fix $\epsilon,t\gt 0$. For any unit-vector $w \in ...

dohmatob

6,824

asked Oct 13, 2023 at 10:09

3 votes

1 answer

156 views

Why is the logistic regression model good? (and its relation with maximizing entropy)

Suppose we're trying to train a classifier $\pi$ for $k$ classes that takes as input a feature vector $x\in\mathbb{R}^n$ and outputs a probability vector $\pi(x)\in\mathbb{R}^k$ such that $\sum_{v=1}^...

stupid_question_bot

7,313

asked Sep 15, 2023 at 0:17

2 votes

0 answers

109 views

Equivalence of score function expressions in SDE-based generative modeling

I am studying the paper "Score-Based Generative Modeling through Stochastic Differential Equations" (arXiv:2011.13456) by Yang et al. The authors use the following loss function (Equation 7 ...

Po-Hung Yeh

43

asked Aug 1, 2023 at 13:30

5 votes

1 answer

1k views

Mathematics research relating to machine learning

What branch/branches of math are most relevant in enhancing machine learning (mostly in terms of practical use as opposed to theoretical/possible use)? Specifically, I want to know about math research ...

Artus

173

asked Apr 28, 2023 at 18:32

3 votes

0 answers

164 views

What is the meaning of big-O of a random variable?

I encountered this problem in a book "Pattern Recognition and Machine Learning" by Christopher M. Bishop. I excerpt it below: screenshot of the book In the excerpt, the big-O notation $O(\xi^...

zzzhhh

31

asked Dec 21, 2022 at 7:31

1 vote

0 answers

56 views

How to calculate the unifrom entropy or VC dimension of the following class of functions?

When dealing with U process I meet with such a uniform entropy to calculate. For any $\eta>0$, function class $\mathcal{F}$ containing functions $f=\left(f_{i, j}\right)_{1 \leq i \neq j \leq n}: \...

leslie zhang

11

asked Sep 12, 2022 at 3:20

Stack Exchange Network

All Questions

How to choose N policemen positions to catch a drunk driver in the most effective way (on a Cayley graph of a finite group)?

Interpolation in convex hull

Bayes classifiers with cost of misclassification

How to lower bound the absolute value of the difference of two Kullback-Leibler divergences given the constrains below?

What are compact manifolds such that GROWTH (of spheres volumes) is well approximated by the Gaussian normal distribution?

Some new questions on Rademacher complexity

When does the optimal model exist in learning theory?

expectation of the product of Gaussian kernels and their input

Known relations between mutual information and covering number?

Beating the $1/\sqrt n$ rate of uniform-convergence over a linear function class

Why is the logistic regression model good? (and its relation with maximizing entropy)

Equivalence of score function expressions in SDE-based generative modeling

Mathematics research relating to machine learning

What is the meaning of big-O of a random variable?

How to calculate the unifrom entropy or VC dimension of the following class of functions?

All Questions

Related Tags