
Questions tagged [machine-learning]

The tag has no usage guidance.

9 votes
1 answer
313 views

Who introduced the term hyperparameter?

I am trying to find the earliest use of the term hyperparameter. Currently, it is used in machine learning but it must have had earlier uses in statistics or optimization theory. Even the multivolume ...
ACR • 790
2 votes
0 answers
109 views

Equivalence of score function expressions in SDE-based generative modeling

I am studying the paper "Score-Based Generative Modeling through Stochastic Differential Equations" (arXiv:2011.13456) by Song et al. The authors use the following loss function (Equation 7 ...
Po-Hung Yeh
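A minimal sketch of the denoising score-matching objective that this question refers to, written for a variance-exploding perturbation kernel $p(x_t \mid x_0) = \mathcal{N}(x_0, \sigma(t)^2 I)$. The noise schedule, the placeholder `score_fn`, and the weighting $\lambda(t) = \sigma(t)^2$ are illustrative assumptions, not the paper's code or its Equation 7.

```python
# Hedged sketch: Monte-Carlo estimate of the denoising score-matching loss
#   E_t E_{x0} E_{x_t|x0}  lambda(t) || s_theta(x_t, t) - grad log p(x_t|x0) ||^2
# under a VE-type kernel p(x_t|x0) = N(x0, sigma(t)^2 I).
import numpy as np

rng = np.random.default_rng(0)

def sigma(t, sigma_min=0.01, sigma_max=50.0):
    # Geometric noise schedule (an assumption for this sketch).
    return sigma_min * (sigma_max / sigma_min) ** t

def score_fn(x_t, t):
    # Stand-in for a learned score network s_theta(x_t, t).
    return -x_t / (1.0 + sigma(t) ** 2)

def dsm_loss(data, n_samples=1024):
    t = rng.uniform(1e-5, 1.0, size=(n_samples, 1))
    x0 = data[rng.integers(0, len(data), n_samples)]
    noise = rng.standard_normal(x0.shape)
    x_t = x0 + sigma(t) * noise
    target = -(x_t - x0) / sigma(t) ** 2     # score of the Gaussian kernel
    weight = sigma(t) ** 2                   # lambda(t)
    sq = np.sum((score_fn(x_t, t) - target) ** 2, axis=1, keepdims=True)
    return np.mean(weight * sq)

data = rng.standard_normal((256, 2))         # toy "dataset"
print(dsm_loss(data))
```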
8 votes
1 answer
546 views

Geometric formulation of the subject of machine learning

Question: what is the geometric interpretation of the subject of machine learning and/or deep learning? Being "forced" to have a closer look at the subject, I have the impression that it ...
Manfred Weis • 12.8k
1 vote
0 answers
98 views

Solutions to the problems in "Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning" [closed]

Where can I find the solutions to the problems in the book "Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning"?
zdo0x0 • 11
3 votes
0 answers
61 views

Prove the convergence of the LASSO model under restricted eigenvalue conditions

I am researching the properties of the Lasso model $\hat \beta:= \operatorname{argmin}_\beta \{\|Y-X\beta\|_2^2/n+\lambda\|\beta\|_1\}$, specifically its convergence when the data satisfies restricted ...
GGbond • 39
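A minimal sketch of the Lasso objective stated in this question, $\min_\beta \|Y-X\beta\|_2^2/n+\lambda\|\beta\\|_1$, solved with ISTA (proximal gradient descent). The synthetic data, step size, and iteration count are illustrative assumptions and say nothing about the restricted-eigenvalue analysis being asked about.

```python
# Hedged sketch: Lasso via ISTA (proximal gradient) on toy data.
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_ista(X, Y, lam, n_iter=500):
    n, p = X.shape
    beta = np.zeros(p)
    L = 2.0 * np.linalg.norm(X, 2) ** 2 / n       # Lipschitz const. of the smooth part
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ beta - Y) / n
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50)
beta_true[:5] = 1.0                               # sparse ground truth
Y = X @ beta_true + 0.1 * rng.standard_normal(200)
print(lasso_ista(X, Y, lam=0.1)[:8].round(3))
```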
11 votes
0 answers
165 views

Worst margin when halving a hypercube with a hyperplane

Consider the $n$-cube $C_n=\lbrace-1,1\rbrace^n$ and the problem of partitioning it into halves with hyperplanes through the origin that avoid all its points. We can parameterize the hyperplanes by ...
Veit Elser • 1,065
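A minimal brute-force sketch of the setup in this question for small $n$: enumerate $C_n=\{-1,1\}^n$, check that a hyperplane through the origin misses every vertex and splits the cube into halves, and compute its margin $\min_x |\langle w,x\rangle|/\|w\|$. The random normals are purely illustrative; they do not answer the worst-case question.

```python
# Hedged sketch: margin of halving hyperplanes over the n-cube, small n only.
import itertools
import numpy as np

def margin_if_halving(w, n):
    pts = np.array(list(itertools.product([-1, 1], repeat=n)))
    dots = pts @ w
    if np.any(dots == 0):
        return None                      # hyperplane hits a vertex
    if np.sum(dots > 0) != len(pts) // 2:
        return None                      # does not split the cube into halves
    return np.min(np.abs(dots)) / np.linalg.norm(w)

n = 4
rng = np.random.default_rng(2)
margins = (margin_if_halving(rng.standard_normal(n), n) for _ in range(2000))
worst = min(m for m in margins if m is not None)
print(f"smallest margin seen over random halving hyperplanes (n={n}): {worst:.4f}")
```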
1 vote
0 answers
69 views

Curve fitting with "rough" loss functions

Many real-valued classification and regression problems can be framed as minimization in the following way. Setup: Let $\Theta \subseteq \mathbb{R}^p$ be the parameter space that we are searching over. For ...
Simon Kuang
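A minimal sketch of the framing in this question: minimize an empirical risk whose loss is "rough" (here the piecewise-constant 0-1 loss of a linear classifier) with a derivative-free optimizer. The data model, the choice of 0-1 loss, and the use of Nelder-Mead are illustrative assumptions.

```python
# Hedged sketch: curve fitting with a non-smooth loss via derivative-free search.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(200))

def empirical_risk(theta):
    # 0-1 loss: piecewise constant in theta, so gradients are uninformative.
    preds = np.sign(X @ theta[:2] + theta[2])
    return np.mean(preds != y)

res = minimize(empirical_risk, x0=np.zeros(3), method="Nelder-Mead")
print(res.x.round(3), res.fun)
```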
5 votes
1 answer
1k views

Mathematics research relating to machine learning

What branch/branches of math are most relevant in enhancing machine learning (mostly in terms of practical use as opposed to theoretical/possible use)? Specifically, I want to know about math research ...
Artus • 173
1 vote
1 answer
121 views

Adjoint sensitivity analysis for a cost functional under an ODE constraint

I am trying to recover the result given by equation 10 in the article here. I am unable to get rid of the integral; any help would be much appreciated. To keep the description as self-contained as ...
Abhi. A • 55
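A minimal sketch of the adjoint-sensitivity pattern behind this question: for $\dot x = f(x,\theta)$ and $J(\theta)=\int_0^T g(x)\,dt$, backpropagate through a forward-Euler discretization (a discrete adjoint) and check the gradient against finite differences. The toy ODE $f=-\theta x$ and $g=x^2$ are illustrative assumptions, not the article's equations.

```python
# Hedged sketch: discrete adjoint of forward Euler for dJ/dtheta.
import numpy as np

def f(x, th):   return -th * x
def fx(x, th):  return -th           # df/dx
def fth(x, th): return -x            # df/dtheta
def g(x):  return x ** 2
def gx(x): return 2.0 * x            # dg/dx

def cost_and_grad(th, x0=1.0, T=2.0, N=2000):
    h = T / N
    xs = np.empty(N + 1)
    xs[0] = x0
    for k in range(N):                           # forward pass
        xs[k + 1] = xs[k] + h * f(xs[k], th)
    J = h * np.sum(g(xs[:N]))                    # left Riemann sum for the cost
    lam, dJ = 0.0, 0.0
    for k in range(N - 1, -1, -1):               # reverse (adjoint) pass
        dJ += lam * h * fth(xs[k], th)
        lam = h * gx(xs[k]) + lam * (1.0 + h * fx(xs[k], th))
    return J, dJ

th = 0.7
J, dJ = cost_and_grad(th)
eps = 1e-6
fd = (cost_and_grad(th + eps)[0] - cost_and_grad(th - eps)[0]) / (2 * eps)
print(f"adjoint dJ/dtheta = {dJ:.6f}, finite difference = {fd:.6f}")
```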
2 votes
1 answer
60 views

Convergence of minimiser of empirical risk to minimiser of population risk

Let $X_1, \dots, X_n \sim \mu$ be some random elements of a space $\mathcal{X}$. Let $H$ be a Hilbert space of functions $f: \mathcal{X} \to \mathbb{R}$ with norm $\|\cdot\|_H$. Let $\|f^*\|_{L_2(\mu)} < \infty$ ...
user27182 • 327
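A minimal numerical sketch of the phenomenon this question asks about: the empirical risk minimizer approaches the population risk minimizer as $n$ grows. For illustration the class is finite-dimensional (linear functions under squared loss), so both minimizers are available in closed form; this is a stand-in for the Hilbert-space setting, not a proof.

```python
# Hedged sketch: ||empirical minimizer - population minimizer|| shrinking with n.
import numpy as np

rng = np.random.default_rng(5)
beta_star = np.array([1.0, -2.0, 0.5])          # population minimizer (by construction)

def erm_distance(n):
    X = rng.standard_normal((n, 3))
    y = X @ beta_star + rng.standard_normal(n)  # additive noise keeps it non-trivial
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.linalg.norm(beta_hat - beta_star)

for n in [50, 500, 5000, 50000]:
    print(n, f"{erm_distance(n):.4f}")
```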
2 votes
0 answers
42 views

Can we get a family of classifiers $\{f_n\}_{n \in \mathbb{N}}$ such that $\lim_{n\to\infty} \left(\mathbb{E}_{(X_1, Y_1), \dots,(X_n, Y_n) \sim \rho}[R(f_n)]-R(f_B)\right)=0$?

For a given classifier $f: \mathbb{R}^d \to\{0,1,2\}$, let $$ R(f):=\mathbb{E}_{(X, Y) \sim \rho}\left[\mathbb{1}_{f(X) \neq Y}\right] $$ and let $f_B$ be the Bayes classifier. Can we get a family of ...
fantacy_crs
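A minimal empirical sketch related to this question: $k$-nearest-neighbours with $k_n \to \infty$ and $k_n/n \to 0$ is a classical universally consistent rule (Stone's theorem), so its expected risk tends to the Bayes risk $R(f_B)$. Below, a toy 3-class Gaussian problem where the Bayes rule is the nearest class centre, with $k_n = \sqrt{n}$; the data model is an illustrative assumption.

```python
# Hedged sketch: k-NN risk approaching the Bayes risk on a toy 3-class problem.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(7)
centers = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])

def sample(n):
    y = rng.integers(0, 3, n)
    X = centers[y] + rng.standard_normal((n, 2))
    return X, y

def bayes_predict(X):
    # Equal priors, unit-variance Gaussian classes: pick the nearest centre.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

X_test, y_test = sample(20000)
bayes_risk = np.mean(bayes_predict(X_test) != y_test)
for n in [100, 1000, 10000]:
    X, y = sample(n)
    knn = KNeighborsClassifier(n_neighbors=int(np.sqrt(n))).fit(X, y)
    print(n, f"risk={np.mean(knn.predict(X_test) != y_test):.3f}",
          f"bayes={bayes_risk:.3f}")
```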
3 votes
0 answers
57 views

How to prove the empirical risk converges to the expected risk as $n\to \infty$?

For example, for a classical binary classification problem with $x \in \mathbb{R}^d$ and $y \in\{0,1\}$, let the empirical risk be $R_{\ell}^n(f):=\frac{1}{n} \sum_{i=1}^n \ell\left(f\left(X_i\right), Y_i\right)$ and ...
fantacy_crs
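A minimal sketch of the basic case behind this question: for a *fixed* classifier $f$, the empirical risk $R^n_\ell(f)$ is an average of i.i.d. bounded variables, so it converges to the expected risk by the law of large numbers (Hoeffding gives the rate). Uniform convergence over a class of $f$'s is the harder part and is not shown here; the toy data model below is an assumption.

```python
# Hedged sketch: empirical risk of a fixed classifier converging to its expected risk.
import numpy as np

rng = np.random.default_rng(11)

def sample(n):
    X = rng.standard_normal(n)
    Y = (X + 0.5 * rng.standard_normal(n) > 0).astype(int)   # noisy labels
    return X, Y

f = lambda x: (x > 0).astype(int)                # a fixed classifier

X_big, Y_big = sample(2_000_000)
true_risk = np.mean(f(X_big) != Y_big)           # Monte-Carlo proxy for E[1{f(X)!=Y}]
for n in [100, 1000, 10000, 100000]:
    X, Y = sample(n)
    print(n, f"|R_n - R| ~ {abs(np.mean(f(X) != Y) - true_risk):.4f}")
```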
2 votes
1 answer
84 views

VC-based risk bounds for classifiers on a finite set

Let $\mathcal{X}$ be a finite set and let $\emptyset\neq \mathcal{H}\subseteq \{ 0,1 \}^{\mathcal{X}}$. Let $\{(X_n,L_n)\}_{n=1}^N$ be i.i.d. random variables on $\mathcal{X}\times \{0,1\}$ with law $\mathbb{P}$. ...
Math_Newbie
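A minimal sketch of the standard finite-class bound relevant here: when $\mathcal{X}$ is finite, $\mathcal{H}\subseteq\{0,1\}^{\mathcal{X}}$ is finite too, and Hoeffding plus a union bound give, with probability at least $1-\delta$, $R(h) \le \hat R_N(h) + \sqrt{(\ln|\mathcal{H}| + \ln(1/\delta))/(2N)}$ simultaneously for all $h$. The numbers below just evaluate that bound; the example size $|\mathcal{X}|=20$ is an assumption.

```python
# Hedged sketch: evaluating the finite-class generalization gap bound.
import math

def finite_class_gap(H_size, N, delta=0.05):
    return math.sqrt((math.log(H_size) + math.log(1.0 / delta)) / (2.0 * N))

# If |X| = 20, then |H| <= 2**20 for H a subset of {0,1}^X.
for N in [100, 1000, 10000]:
    print(N, round(finite_class_gap(2 ** 20, N), 3))
```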
4 votes
1 answer
333 views

Perceptron / logistic regression accuracy on the n-bit parity problem

$\DeclareMathOperator{\sgn}{sign}$The perceptron (similarly, logistic regression) of the form $y=\sgn(w^\top x+b)$ is famously known for its inability to solve the XOR problem, meaning it can get ...
ido4848 • 141
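A minimal sketch of the experiment implicit in this question: fit a linear classifier (logistic regression) on the full $n$-bit parity dataset and report its training accuracy. For $n=2$ this is XOR; the exact best achievable linear accuracy for general $n$ is what the question asks, so only empirical numbers are printed here, and the solver settings are assumptions.

```python
# Hedged sketch: logistic regression trained on the complete n-bit parity table.
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

for n in range(2, 7):
    X = np.array(list(itertools.product([0, 1], repeat=n)))
    y = X.sum(axis=1) % 2                            # parity labels
    clf = LogisticRegression(C=1e6, max_iter=10000).fit(X, y)
    print(n, f"train accuracy = {clf.score(X, y):.3f}")
```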
1 vote
0 answers
32 views

Convergent gradient-type scheme for solving smooth nonconvex constrained optimization problem

Let $x_1,\ldots,x_n \in \mathbb R^d$ and $y_1,\ldots,y_n \in \{\pm 1\}$, and $\epsilon, h > 0$. Define $\theta(t) := Q((t-\epsilon)/h)$, where $Q(z) := \int_{z}^\infty \phi(u)\,\mathrm{d}u$ is the ...
dohmatob • 6,824
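A minimal sketch of the objects defined in this question: $\theta(t)=Q((t-\epsilon)/h)$ with $Q$ the Gaussian tail, applied to per-sample margins and minimized by projected gradient descent. The data model, the choice of margins $t_i = y_i\langle w, x_i\rangle$, and the unit-ball constraint are illustrative assumptions about the omitted part of the problem; no convergence claim is made.

```python
# Hedged sketch: projected gradient descent on a Gaussian-tail-smoothed 0-1 loss.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(13)
n, d, eps, h = 200, 5, 0.1, 0.5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true + 0.3 * rng.standard_normal(n))

def objective(w):
    t = y * (X @ w)                               # margins t_i
    return np.mean(norm.sf((t - eps) / h))        # Q(z) = norm.sf(z)

def gradient(w):
    t = y * (X @ w)
    # d/dw Q((t-eps)/h) = -phi((t-eps)/h)/h * y * x
    coef = -norm.pdf((t - eps) / h) / h
    return (coef * y) @ X / n

w = np.zeros(d)
for _ in range(500):
    w = w - 0.5 * gradient(w)
    w = w / max(1.0, np.linalg.norm(w))           # project onto the unit ball
print(f"objective after descent: {objective(w):.3f}")
```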
