Highest scored 'bayesian-probability' questions

9 votes

3 answers

468 views

What does the KL being symmetric tell us about the distributions?

Suppose two probability density functions, $p$ and $q$, satisfy $\text{KL}(q\|p) = \text{KL}(p\|q) \neq 0$. Intuitively, does that tell us anything interesting about the nature of these densities?

HesterJ

123

asked Nov 2, 2019 at 17:52
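
As a small illustration of the situation the question describes (not an answer to it), the sketch below uses the standard closed form for the KL divergence between two univariate Gaussians. The specific means and standard deviations are assumptions chosen for the example: with equal variances the two directions coincide, while with unequal variances they generally differ.

```python
import math

def kl_gauss(m1, s1, m2, s2):
    """KL( N(m1, s1^2) || N(m2, s2^2) ) in nats, using the closed form."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

# Equal variances (assumed values): KL reduces to (m1 - m2)^2 / (2 s^2),
# which is symmetric in the two means even though p != q.
forward = kl_gauss(0.0, 1.0, 2.0, 1.0)
backward = kl_gauss(2.0, 1.0, 0.0, 1.0)

# Unequal variances (assumed values): the two directions differ.
forward_uneq = kl_gauss(0.0, 1.0, 0.0, 2.0)
backward_uneq = kl_gauss(0.0, 2.0, 0.0, 1.0)

print(forward, backward)
print(forward_uneq, backward_uneq)
```

So a pair of distinct densities with symmetric, nonzero KL certainly exists; what that symmetry implies in general is the open part of the question.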

9 votes

1 answer

313 views

Who introduced the term hyperparameter?

I am trying to find the earliest use of the term hyperparameter. Currently, it is used in machine learning but it must have had earlier uses in statistics or optimization theory. Even the multivolume ...

ACR

790

asked Aug 13, 2023 at 18:14

9 votes

1 answer

477 views

In what sense is the Bayesian posterior mean a “convex combination”?

I asked this on math.stackexchange with no response; I'm hoping someone here might have something. Suppose I want to estimate $x \in \mathbb{R}^n$ from two signals with zero mean, normally ...

Ronaldo Carpio

309

asked Nov 17, 2014 at 14:56

8 votes

1 answer

1k views

Rate of convergence of Bayesian posterior

Suppose a data generating process (DGP) is parameterized by some unknown parameter $\theta_0$, say $P_{\theta_0}$, and we want to estimate the value of $\theta_0$ using a Bayesian method. Let $\pi(\...

Herr K.

183

asked Mar 30, 2015 at 18:12

8 votes

1 answer

319 views

Base schemes and Bayesian priors

One of Grothendieck's dicta about algebraic geometry is to consider "the relative situation", where one doesn't consider the category of schemes but of schemes over a fixed base scheme. In Bayesian ...

Allen Knutson

27.7k

asked Nov 17, 2015 at 1:04

6 votes

1 answer

2k views

What can be said about an infinite linear chain of conjugate prior distributions?

We can sample a discrete value from the multinomial distribution. We can also sample the parameters of the multinomial distribution from its conjugate prior, the Dirichlet distribution. Since the ...

DoubleJay

2,383

asked Apr 30, 2011 at 3:59
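
The two sampling steps mentioned in the excerpt can be sketched as follows. This is a minimal illustration using only the Python standard library; the concentration vector `alpha` and the seed are hypothetical values, and the Dirichlet draw uses the standard construction of normalizing independent Gamma variates.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)      # fixed seed, only for reproducibility
alpha = [1.0, 2.0, 3.0]     # hypothetical concentration parameters

# Step 1: sample the multinomial parameters from the conjugate prior.
theta = sample_dirichlet(alpha, rng)

# Step 2: sample a discrete value given those parameters.
category = rng.choices(range(len(theta)), weights=theta, k=1)[0]

print(theta, category)
```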

6 votes

0 answers

202 views

Existence of stick breaking representations for random measures

The Dirichlet process has a roughly size-ordered representation in terms of beta random variables, called a stick-breaking representation (Sethuraman, 1994). Similar results hold for the beta process, ...

Shannon S.

129

asked Mar 14, 2018 at 22:52

5 votes

6 answers

2k views

Are all probabilities conditional probabilities? [closed]

We know that $P(A\mid B) = \frac{P(A \cap B)}{P(B)}$. So $P(B) = \frac{P(A \cap B)}{P(A\mid B)}$. Thus, are all probabilities conditional probabilities? Can one make a probability more accurate by introducing a ...

Tony

59

asked Jun 13, 2010 at 2:26
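
The definition $P(A\mid B) = P(A \cap B)/P(B)$ quoted in the excerpt can be checked on a toy example. The events below (a fair six-sided die, $A$ = "even", $B$ = "greater than 3") are assumptions made purely for illustration.

```python
from fractions import Fraction

outcomes = set(range(1, 7))               # fair six-sided die (assumed example)
A = {n for n in outcomes if n % 2 == 0}   # event A: the roll is even
B = {n for n in outcomes if n > 3}        # event B: the roll is greater than 3

def prob(event):
    """Probability of an event under the uniform distribution on outcomes."""
    return Fraction(len(event), len(outcomes))

p_B = prob(B)
p_A_and_B = prob(A & B)
p_A_given_B = p_A_and_B / p_B             # definition of conditional probability

print(p_A_given_B)                        # 2/3
```

Rearranging the definition gives $P(A \cap B) = P(A\mid B)\,P(B)$, which holds here as $\tfrac{2}{3} \cdot \tfrac{1}{2} = \tfrac{1}{3}$.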

5 votes

3 answers

1k views

Probability estimates for pairwise majority votes

This is related to the rank aggregation question I asked previously. I have items $I_1, \ldots, I_N$ and the observations of a number of pairwise trials which pit pairs $I_i$ and $I_j$ against ...

David R. MacIver

1,321

asked May 27, 2010 at 16:40

5 votes

1 answer

330 views

Bounding the sensitivity of a posterior mean to changes in a single data point

There is a real-valued random variable $R$. Define a finite set of random variables ("data points") $$X_i = R + Z_i \; \text{for } i\in\{1,\ldots,n\},$$ where $Z_i$ are identically and independently ...

Ben Golub

1,058

asked Aug 4, 2019 at 4:11

4 votes

2 answers

216 views

Do these distributions have a name already?

In playing with some math finance stuff I ran into the following distribution and I was curious if someone had a name for it or has studied it or worked with it already. To start, let $\Delta^n$ be ...

Jess Boling

646

asked Oct 28, 2021 at 17:25

4 votes

1 answer

531 views

Gaussian process kernel parameter tuning

I am reading about Gaussian processes and there are multiple resources that say how the parameters of the prior (kernel, mean) can be fitted based on data, specifically by choosing those that maximize the ...

john

141

asked Apr 8, 2021 at 15:02

4 votes

0 answers

228 views

Convergence of the expectation of a random variable when conditioned on its sum with another, independent but not identically distributed

Suppose that for all $n \in \mathbf{N}$, $X_n$ and $Y_n$ are independent random variables with $$X_n \sim \mathtt{Binomial}(n,1-q),$$ and $$Y_n \sim \mathtt{Poisson}(n(q+\epsilon_n)),$$ where $q \in (...

as1

91

asked May 1, 2020 at 12:16

4 votes

0 answers

651 views

Bayesian Networks and Polytree

I am a bit puzzled by the use of polytrees to infer a posterior in a Bayesian Network (BN). BNs are defined as directed acyclic graphs. A polytree is a DAG whose underlying undirected graph is a tree. ...

Bremen

41

asked Jul 18, 2019 at 5:42

3 votes

2 answers

633 views

Parametrising a sparse orthogonal matrix

I need to find a way to parametrise a matrix that is both sparse (to some degree) and orthogonal, i.e., I am looking for a parametrisation that describes $A \in \mathbb{R}^{n\times m}$ such that $AA^T...

HesterJ

123

asked Jun 5, 2019 at 16:52
