All Questions

3 votes
0 answers
109 views

Short path problem on Cayley graphs as language translation task (from "Permutlandski" to "Cayleylandski"(s) :). Reference/suggestion request

Context: Algorithms to find short paths on Cayley graphs of (finite) groups are of some interest - see below. There can be several approaches to that task. One of the ideas that comes to my mind - in some ...
Alexander Chervov
5 votes
1 answer
1k views

Mathematics research relating to machine learning

What branch/branches of math are most relevant in enhancing machine learning (mostly in terms of practical use as opposed to theoretical/possible use)? Specifically, I want to know about math research ...
Artus
  • 173
1 vote
0 answers
32 views

Convergent gradient-type scheme for solving smooth nonconvex constrained optimization problem

Let $x_1,\ldots,x_n \in \mathbb R^d$ and $y_1,\ldots,y_n \in \{\pm 1\}$, and $\epsilon, h \gt 0$. Define $\theta(t) := Q((t-\epsilon)/h)$, where $Q(z) := \int_{z}^\infty \phi(u)\,\mathrm{d}u$ is the ...
dohmatob
  • 6,824
2 votes
0 answers
37 views

Stochastic gradient descent in 'stronger' settings

I am minimizing a function $F(x) = \mathbb E(f(x,\Xi))$, where $\Xi$ is some random variable, by a stochastic gradient descent that generates a random number $\xi$ from the distribution of $\Xi$ at each ...
lrnv
  • 686
17 votes
3 answers
2k views

Theoretical results on neural networks

With this question I'd like to have a collection of rigorous theoretical results on neural networks. I'd like to have results that have been settled, as opposed to hypotheses. As an example, this ...
1 vote
0 answers
136 views

Continuous decomposition of permutation-invariant set functions

The seminal machine learning paper Deep Sets (Zaheer et al., 2017) discusses representations of permutation-invariant functions on real tuples, or (multi)set functions. Given a countable set $X$ and a ...
Daniel Paleka
7 votes
2 answers
2k views

Mathematics of GANs (generative adversarial networks)

Generative Adversarial Networks were introduced in http://papers.nips.cc/paper/5423-generative-adversarial-nets, a paper that has more than 20000 citations. The paper introduced key paradigm changes which ...
Turbo
  • 13.8k
2 votes
0 answers
49 views

What are some beginner's references on algebraically structured (statistical) models, and their connection with group actions and Fourier transform?

I asked this question on Cross Validated a few days ago but didn't really get a favorable response, so I'm asking here to see if I get any. I'm looking at the description of a short-term position in ...
Stat_math
  • 223
2 votes
1 answer
212 views

Uniform Lipschitz function approximation by shallow neural networks

Fix $d\in \mathbb{N}$. Let $F_1$ be the set of all 1-Lipschitz functions mapping $[0, 1]^d$ to $\mathbb{R}$. For $\varphi: \mathbb{R} \rightarrow \mathbb{R}$ and $m \in \mathbb{N}$, let $N_\varphi^m$ ...
Steve
  • 1,095
11 votes
1 answer
798 views

Abstract mathematical concepts/tools appeared in machine learning research

I am interested in knowing about abstract mathematical concepts, tools or methods that have come up in theoretical machine learning. By "abstract" I mean something that is not immediately related to ...
1 vote
0 answers
99 views

Plethora of variant neural networks?

Since new life was breathed into neural networks a decade ago in the form of deep learning, a plethora of different architectures have come about. Is there a reference that gives a compendium of ...
Turbo
  • 13.8k
13 votes
2 answers
679 views

Reference Request: Theoretical Mixing Times Research in Machine Learning / Artificial Intelligence (AI)

I'm doing a PhD in probability theory, focusing mostly on mixing times. It's a pure maths PhD, considering precise models and showing rigorous mixing results. I'm also interested in stuff like machine ...
Sam OT
  • 560
0 votes
1 answer
108 views

General results regarding linear separability?

I'm reading up on the theory behind support vector machines and would like a good reference with some general results about linear separability. Specifically, questions like the following: Given two ...
Fred Byrd
  • 101
57 votes
4 answers
14k views

Group theory in machine learning

I'm a Machine Learning researcher who would like to research applications of group theory in ML. There is a term "Partially Observed Groups" in machine learning theory which has been ...
drosophyllum
94 votes
14 answers
14k views

Deep learning / Deep neural nets for mathematician

I am interested in finding out the math ideas behind the technologies that are under the umbrella of "Deep Learning" or "Deep neural nets". Most of the papers/books that are often quoted in papers/...