Questions tagged [machine-learning]
The machine-learning tag has no usage guidance.
189 questions

9 votes · 1 answer · 313 views
Who introduced the term hyperparameter?
I am trying to find the earliest use of the term hyperparameter. Currently, it is used in machine learning but it must have had earlier uses in statistics or optimization theory. Even the multivolume ...
2 votes · 0 answers · 109 views
Equivalence of score function expressions in SDE-based generative modeling
I am studying the paper "Score-Based Generative Modeling through Stochastic Differential Equations" (arXiv:2011.13456) by Song et al. The authors use the following loss function (Equation 7 ...
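Not the equivalence proof this question asks for, but a minimal numerical sketch of the denoising score matching objective underlying that loss, assuming a Gaussian perturbation kernel (the toy data, noise scale, and candidate score models below are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2000 points from N(0, I_2). After adding N(0, sigma^2 I) noise,
# the perturbed marginal is N(0, (1 + sigma^2) I), whose score is -x / (1 + sigma^2).
x0 = rng.standard_normal((2000, 2))
sigma = 1.0
xt = x0 + sigma * rng.standard_normal(x0.shape)

def dsm_loss(score_fn):
    """Denoising score matching: E || s(x_t) - grad_x log p(x_t | x_0) ||^2,
    where the conditional score of a Gaussian kernel is -(x_t - x_0) / sigma^2."""
    target = -(xt - x0) / sigma**2
    return np.mean(np.sum((score_fn(xt) - target) ** 2, axis=-1))

good = dsm_loss(lambda x: -x / (1 + sigma**2))  # the true marginal score
bad = dsm_loss(lambda x: np.zeros_like(x))      # a trivial score model
print(good, bad)   # the true score achieves a strictly smaller loss
```

The minimizer of this loss over all functions is the score of the perturbed marginal, which is the fact the paper's Equation 7 exploits.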
8 votes · 1 answer · 546 views
Geometric formulation of the subject of machine learning
Question:
What is the geometric interpretation of the subject of machine learning and/or deep learning?
Being "forced" to have a closer look at the subject, I have the impression that it ...
1 vote · 0 answers · 98 views
Solutions to the problems in "Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning" [closed]
Where can I find the solutions to the problems in the book "Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning"?
3 votes · 0 answers · 61 views
Prove the convergence of the LASSO model in the presence of restricted eigenvalues
I am researching the properties of the Lasso model $\hat \beta:= \operatorname{argmin} \{\|Y-X\beta\|_2^2/n+\lambda\|\beta\|_1\}$, specifically its convergence when the data satisfies restricted ...
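The estimator in this excerpt can be checked numerically. A minimal sketch, assuming a proximal-gradient (ISTA) solver for the stated objective and the usual $\sqrt{\log p / n}$ scaling of $\lambda$; it only illustrates that the estimation error shrinks with $n$, not the restricted-eigenvalue proof itself:

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Proximal gradient (ISTA) for  ||y - X b||_2^2 / n + lam * ||b||_1."""
    n, p = X.shape
    L = 2 * np.linalg.eigvalsh(X.T @ X / n).max()   # Lipschitz const. of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b - 2 * X.T @ (X @ b - y) / (n * L)     # gradient step on the smooth part
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft-thresholding prox
    return b

rng = np.random.default_rng(0)
p = 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]                    # 3-sparse ground truth

errs = []
for n in (50, 2000):
    X = rng.standard_normal((n, p))                 # Gaussian design
    y = X @ beta_true + 0.5 * rng.standard_normal(n)
    lam = 0.5 * np.sqrt(np.log(p) / n)              # standard lambda scaling
    errs.append(np.linalg.norm(lasso_ista(X, y, lam) - beta_true))
print(errs)   # l2 estimation error shrinks as n grows
```

A Gaussian design satisfies the restricted-eigenvalue condition with high probability, which is why the error decays at the theoretical rate here.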
11 votes · 0 answers · 165 views
Worst margin when halving a hypercube with a hyperplane
Consider the $n$-cube $C_n=\lbrace-1,1\rbrace^n$ and the problem of partitioning it into halves with hyperplanes through the origin that avoid all its points. We can parameterize the hyperplanes by ...
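The worst margin in this question can be brute-forced for small $n$, taking the margin of a hyperplane to be the minimal distance from any cube vertex to it (an assumption consistent with the setup above):

```python
import itertools
import numpy as np

def worst_margin(w):
    """Minimal distance from a vertex of {-1, 1}^n to the hyperplane
    {x : <w, x> = 0}; requires the hyperplane to avoid every vertex."""
    dots = [abs(np.dot(w, v)) for v in itertools.product((-1, 1), repeat=len(w))]
    assert min(dots) > 0, "hyperplane passes through a vertex"
    return min(dots) / np.linalg.norm(w)

# Powers of two: every signed sum ±1 ± 2 ± 4 ± 8 is odd, so the plane misses
# all vertices and the worst |<w, x>| is exactly 1.
print(worst_margin(np.array([1.0, 2.0, 4.0, 8.0])))      # 1 / sqrt(85)
# A nearly balanced direction almost passes through a vertex, so its worst
# margin is tiny -- which is why maximizing the worst margin is nontrivial.
print(worst_margin(np.array([1.0, 1.0, 1.0, 1.0 + 1e-6])))
```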
1 vote · 0 answers · 69 views
Curve fitting with "rough" loss functions
Many real-valued classification and regression problems can be framed as minimization in the following way.
Setup:
Let $\Theta \subseteq \mathbb{R}^p$ be the parameter space that we are searching over.
For ...
5 votes · 1 answer · 1k views
Mathematics research relating to machine learning
What branch/branches of math are most relevant in enhancing machine learning (mostly in terms of practical use as opposed to theoretical/possible use)? Specifically, I want to know about math research ...
1 vote · 1 answer · 121 views
Adjoint sensitivity analysis for a cost functional under an ODE constraint
I am trying to recover the result given by equation 10 in the article here. I am unable to get rid of the integral; any help would be much appreciated. To keep the description as self contained as ...
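A self-contained sanity check of the adjoint method on a toy ODE (not the article's equation 10; the scalar dynamics and cost below are assumptions chosen so the gradient has a closed form):

```python
import numpy as np

# Toy problem:  x' = f(x, th) = -th * x,  x(0) = 1,  J(th) = int_0^T x(t)^2 dt.
# Adjoint ODE (backward in time):  lam' = -dg/dx - (df/dx) lam = -2 x + th lam,
# lam(T) = 0, and the gradient is  dJ/dth = int_0^T lam * (df/dth) dt = int lam (-x) dt.

th, T, dt = 1.0, 1.0, 1e-4
steps = int(T / dt)

# Forward pass (explicit Euler), storing the trajectory.
x = np.empty(steps + 1)
x[0] = 1.0
for k in range(steps):
    x[k + 1] = x[k] + dt * (-th * x[k])

# Backward pass for the adjoint, accumulating the gradient on the way.
lam, grad = 0.0, 0.0
for k in range(steps, 0, -1):
    grad += dt * lam * (-x[k])
    lam -= dt * (-2 * x[k] + th * lam)   # Euler step backward in time

# Closed form:  J = (1 - e^{-2 th T}) / (2 th);  differentiate in th.
exact = T * np.exp(-2 * th * T) / th - (1 - np.exp(-2 * th * T)) / (2 * th**2)
print(grad, exact)   # agree up to O(dt) discretization error
```

The key step that removes the integral over the state perturbation is the integration by parts that produces the terminal condition $\lambda(T) = 0$; the code above is that calculation discretized.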
2 votes · 1 answer · 60 views
Convergence of minimiser of empirical risk to minimiser of population risk
Let $X_1, \dots, X_n \sim \mu$ be some random elements of a space $\mathcal{X}$. Let $H$ be a Hilbert space of functions $f: \mathcal{X} \to \mathbb{R}$ with norm $\|\cdot\|_H$.
Let $\|f^*\|_{L_2(\mu)} < \infty$ ...
2 votes · 0 answers · 42 views
Can we get a family of classifiers $\{f_n\}_{n \in \mathbb{N}}$ such that $\lim_{n\to\infty} \left(\mathbb{E}_{(X_1, Y_1), \dots, (X_n, Y_n) \sim \rho}[R(f_n)] - R(f_B)\right) = 0$?
For a given classifier $f: \mathbb{R}^d \mapsto\{0,1,2\}$, let
$$
R(f):=\mathbb{E}_{(X, Y) \sim \rho}\left[\mathbb{1}_{f(X) \neq Y}\right]
$$
where $f_B$ is the Bayes classifier.
Can we get a family of ...
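The classical answer is yes, via universally consistent rules such as $k$-NN with $k_n \to \infty$ and $k_n / n \to 0$. A toy illustration using a single training sample rather than the expectation over samples (the 1-D Gaussian class-conditionals and sample sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Y uniform on {0, 1, 2}; X | Y=y ~ N(2y, 1) in one dimension."""
    y = rng.integers(0, 3, size=n)
    x = 2.0 * y + rng.standard_normal(n)
    return x, y

def knn_predict(x_train, y_train, x_test, k):
    d = np.abs(x_test[:, None] - x_train[None, :])   # pairwise |x - x'|
    nn = np.argsort(d, axis=1)[:, :k]                # indices of k nearest points
    return np.array([np.bincount(y_train[i], minlength=3).argmax() for i in nn])

x_test, y_test = sample(2000)
# Equal priors and variances: the Bayes rule is the nearest class mean,
# i.e. round(x / 2) clipped to {0, 1, 2}.
bayes = np.mean(np.clip(np.round(x_test / 2), 0, 2).astype(int) != y_test)

risks = {}
for n in (12, 2000):
    x_tr, y_tr = sample(n)
    pred = knn_predict(x_tr, y_tr, x_test, k=max(1, int(np.sqrt(n))))
    risks[n] = np.mean(pred != y_test)
print(bayes, risks)   # the k-NN risk approaches the Bayes risk as n grows
```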
3 votes · 0 answers · 57 views
How to prove the empirical risk converges to the expected risk as $n\to \infty$?
For example, for a classical binary classification:
$x \in \mathbb{R}^d$ and $y \in\{0,1\}$
let empirical risk be
$R_{\ell}^n(f):=\frac{1}{n} \sum_{i=1}^n \ell\left(f\left(X_i\right), Y_i\right)$
and ...
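For a fixed $f$, the convergence in this question is just the law of large numbers: the $\ell(f(X_i), Y_i)$ are i.i.d. bounded random variables. A quick Monte Carlo check with 0-1 loss and a 20% label-noise level (both assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_RISK = 0.2   # labels are f's output flipped with probability 0.2

def empirical_risk(n):
    """R_n(f) for the fixed classifier f(x) = 1[x > 0] under 20% label noise."""
    x = rng.standard_normal(n)
    f = (x > 0).astype(int)
    y = f ^ (rng.random(n) < 0.2)          # flip each label with prob. 0.2
    return np.mean(f != y)

devs = {}
for n in (100, 10000, 1000000):
    devs[n] = np.mean([abs(empirical_risk(n) - TRUE_RISK) for _ in range(20)])
print(devs)   # mean |R_n(f) - R(f)| shrinks like O(1 / sqrt(n))
```

Uniform convergence over a whole class of classifiers, as needed for ERM, requires more (e.g. a VC or covering-number argument), but the pointwise statement is exactly this.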
2 votes · 1 answer · 84 views
VC-based risk bounds for classifiers on finite set
Let $X$ be a finite set and let $\emptyset\neq \mathcal{H}\subseteq \{ 0,1 \}^{X}$. Let $\{(X_n,L_n)\}_{n=1}^N$ be i.i.d. random variables on $X\times \{0,1\}$ with law $\mathbb{P}$. ...
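Since $\mathcal{H}$ is finite whenever $X$ is ($|\mathcal{H}| \le 2^{|X|}$), the simpler Hoeffding-plus-union bound already gives a uniform risk bound without VC machinery; evaluating it numerically:

```python
import numpy as np

def finite_class_bound(H_size, N, delta=0.05):
    """Hoeffding + union bound: with prob. >= 1 - delta, every h in H satisfies
    |empirical risk - true risk| <= sqrt(ln(2 |H| / delta) / (2 N))."""
    return np.sqrt(np.log(2 * H_size / delta) / (2 * N))

# On a finite X there are at most 2^|X| hypotheses; |X| = 10 gives |H| <= 1024.
for N in (100, 1000, 10000):
    print(N, finite_class_bound(2**10, N))
```

The bound degrades only logarithmically in $|\mathcal{H}|$, i.e. linearly in $|X|$, which is the finite-set analogue of the VC dimension entering the classical bounds.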
4 votes · 1 answer · 333 views
Perceptron / logistic regression accuracy on the n-bit parity problem
$\DeclareMathOperator{\sgn}{sign}$The perceptron (similarly, logistic regression) of the form $y=\sgn(w^T \cdot x+b)$ is famously known for its inability to solve the XOR problem, meaning it can get ...
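For $n = 2$ the claim in this question is easy to check by brute force: no hyperplane exceeds $3/4$ accuracy on XOR. A random search over hyperplanes (the search budget is an arbitrary choice):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

n = 2
X = np.array(list(itertools.product((-1, 1), repeat=n)))
y = np.prod(X, axis=1)        # 2-bit parity (XOR up to sign), labels in {-1, 1}

best = 0.0
for _ in range(20000):        # random search over classifiers sign(w.x + b)
    w, b = rng.standard_normal(n), rng.standard_normal()
    best = max(best, np.mean(np.sign(X @ w + b) == y))
print(best)   # caps at 0.75: parity is not linearly separable
```

Replacing `n = 2` with larger values lets one probe how the best achievable linear accuracy behaves on the general $n$-bit parity problem.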
1 vote · 0 answers · 32 views
Convergent gradient-type scheme for solving smooth nonconvex constrained optimization problem
Let $x_1,\ldots,x_n \in \mathbb R^d$ and $y_1,\ldots,y_n \in \{\pm 1\}$, and $\epsilon, h > 0$. Define $\theta(t) := Q((t-\epsilon)/h)$, where $Q(z) := \int_{z}^{\infty} \phi(t)\,\mathrm{d}t$ is the ...
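The Gaussian tail function $Q$, and hence $\theta$, is smooth, so gradient-type schemes need $\theta'$. A small sketch of $Q$ via `erfc` and its derivative by the chain rule (the parameter values are arbitrary):

```python
import math

def Q(z):
    """Gaussian tail: Q(z) = integral from z to inf of phi(t) dt = erfc(z / sqrt(2)) / 2."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def theta(t, eps, h):
    return Q((t - eps) / h)

def theta_prime(t, eps, h):
    """Chain rule: d/dt Q((t - eps)/h) = -phi((t - eps)/h) / h."""
    return -phi((t - eps) / h) / h

eps, h, t = 0.1, 0.5, 0.3
fd = (theta(t + 1e-6, eps, h) - theta(t - 1e-6, eps, h)) / 2e-6
print(theta_prime(t, eps, h), fd)   # analytic and finite-difference derivatives agree
```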