Asked by ACR

Who introduced the term hyperparameter?

I am trying to find the earliest use of the term hyperparameter. It is currently used in machine learning, but it must have had earlier uses in statistics or optimization theory. Even the multivolume Lexikon der Mathematik (Springer) does not have this term.

So far, the earliest use I can trace is from 1972: D. V. Lindley and A. F. M. Smith, "Bayes Estimates for the Linear Model", Journal of the Royal Statistical Society, Series B (Methodological), Vol. 34, No. 1 (1972), pp. 1-41. Link

The authors introduce the term hyperparameter with a footnote:

In the present paper we study situations where we have exchangeable prior knowledge and assume this exchangeability described by a mixture. In the example this implies $E\left(\theta_i\right)=\mu$, say, a common value for each $i$. In other words there is a linear structure to the parameters analogous to the linear structure supposed for the observations $\mathbf{y}$. If we add the premise that the distribution from which the $\theta_i$ appear as a random sample is normal, the parallelism between the two stages, for $\mathbf{y}$ and $\boldsymbol{\theta}$, becomes closer. In this paper we study the situation in which the parameters of the general linear model themselves have a general linear structure in terms of other quantities which we call hyperparameters. $\dagger$ In this simple example there is just one hyperparameter, $\mu$.

Footnote

$\dagger$ We believe we have borrowed this terminology from I. J. Good but are unable to trace the reference.
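The quoted passage describes a two-stage hierarchical model: the observations depend on parameters $\theta_i$, which in turn are drawn from a common distribution governed by the hyperparameter $\mu$. A minimal sketch of that setup (not from the paper; the numerical values for $\mu$, $\tau$, $\sigma$ and the group sizes are illustrative assumptions) shows how the posterior mean for each $\theta_i$ shrinks toward the hyperparameter:

```python
import numpy as np

# Illustrative sketch of the two-stage model in the quoted passage:
# each parameter theta_i is itself drawn from a common normal prior
# whose mean mu is the "hyperparameter". All numbers here are assumed.
rng = np.random.default_rng(0)

mu, tau = 5.0, 1.0       # hyperparameter mu and prior spread tau
sigma = 2.0              # observation noise
n_groups, n_obs = 4, 50

# Stage 2: theta_i ~ N(mu, tau^2) -- exchangeable prior on the parameters
theta = rng.normal(mu, tau, size=n_groups)

# Stage 1: y_ij ~ N(theta_i, sigma^2) -- the observations
y = rng.normal(theta[:, None], sigma, size=(n_groups, n_obs))

# Posterior mean of theta_i (with mu, tau, sigma known): a precision-weighted
# average of the group sample mean and the hyperparameter mu ("shrinkage").
prec_prior = 1.0 / tau**2
prec_data = n_obs / sigma**2
post_mean = (prec_prior * mu + prec_data * y.mean(axis=1)) / (prec_prior + prec_data)

# Each posterior mean lies between its group's sample mean and mu.
for i in range(n_groups):
    lo, hi = sorted((y[i].mean(), mu))
    assert lo <= post_mean[i] <= hi
```

In Lindley and Smith's example there is a single hyperparameter $\mu$; the shrinkage toward it is exactly the "parallelism between the two stages" the passage mentions.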

I. J. Good was a statistician turned philosopher, but a Google Scholar search gives no indication that he introduced the term in the 1960s or earlier.