
All Questions

3 votes · 1 answer · 120 views

Parameter distribution of $\theta$ from a rectangular matrix multiplication $C\theta$

I am struggling to see where this problem fits, i.e. what topics it relates to, so I am not able to find the right literature. I want to use some particular information as a prior to a ...
asked by smallStackBigFlow
0 votes · 0 answers · 21 views

Rationale for finding the corresponding prior from a regularizer by taking the exponential of the negative regularizer

In equation (5.112) of textbook "Pattern Recognition and Machine Learning" by Christopher M. Bishop, the simple regularizer takes the form $\frac{\lambda}{2}{\bf w}^T{\bf w}$. The author ...
asked by zzzhhh (333)
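A minimal sketch of the rationale the question above asks about, stated generically (the exact scaling depends on how the data-fit term is normalized): the regularizer $\frac{\lambda}{2}{\bf w}^T{\bf w}$ is, up to an additive constant, the negative log-density of a zero-mean isotropic Gaussian prior,

$$ p({\bf w}) \propto \exp\Big(-\frac{\lambda}{2}{\bf w}^T{\bf w}\Big), \qquad \text{i.e.}\ {\bf w} \sim \mathcal{N}({\bf 0},\, \lambda^{-1}{\bf I}), $$

so maximizing the posterior $p({\bf w}\mid \mathcal{D}) \propto p(\mathcal{D}\mid {\bf w})\,p({\bf w})$ adds exactly this term to the negative log-likelihood; taking $\exp(-\text{regularizer})$ simply reverses that step.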
0 votes · 0 answers · 13 views

Beta distribution equivalence with two redundant parameters [duplicate]

Context: In factor graphs on discrete variables, the parameters are contained in factors, each associated with a subset of the random variables in the system. Each factor provides a different positive ...
asked by Arnaud (566)
2 votes · 1 answer · 40 views

Bayes prior in MAP estimation corresponding to $\ell^0$ penalization

I gather that in the context of penalized least squares, we can interpret a penalty term as corresponding to a prior $\pi(\beta)\propto \exp\{-\text{pen}\}.$ Is this also true for $\ell^0$ ...
asked by Golden_Ratio
4 votes · 1 answer · 653 views

Bayesian priors associated with regularization penalties

I gather that adding a penalty term to (linear) least squares minimization typically corresponds with choosing some prior for Bayes estimation in the normal linear regression model. A couple questions ...
asked by Golden_Ratio
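A minimal sketch of the correspondence raised in the two questions above, assuming a normal linear model $y \mid \beta \sim N(X\beta, \sigma^2 I)$ with $\sigma^2$ treated as known:

$$ \hat\beta_{\mathrm{MAP}} = \arg\max_\beta\, p(y\mid\beta)\,\pi(\beta) = \arg\min_\beta \Big\{ \tfrac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2 - \log \pi(\beta) \Big\}, $$

so a prior $\pi(\beta)\propto \exp\{-\tfrac{\lambda}{2\sigma^2}\,\mathrm{pen}(\beta)\}$ reproduces the penalized criterion $\lVert y - X\beta\rVert_2^2 + \lambda\,\mathrm{pen}(\beta)$. Whether this reading is legitimate for a particular penalty (e.g. $\ell^0$) depends on $\exp\{-\mathrm{pen}\}$ defining a valid, possibly improper, prior.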
6 votes · 1 answer · 362 views

If LASSO is equivalent to Bayesian Regression with a Laplace (double exponential) prior, what would be the prior for non-negative LASSO? Exponential?

We know that the LASSO penalty is equivalent to Laplace prior. So what would be the corresponding prior for a non-negative LASSO? Is it exponential distribution? More generally, is it true that every ...
asked by Jeffrey (107)
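One hedged way to see the exponential-prior guess in the question above: restrict the Laplace prior to the non-negative half-line, i.e. take an exponential prior

$$ \pi(\beta_j) \propto \exp(-\lambda \beta_j)\,\mathbf{1}\{\beta_j \ge 0\} \quad\Longrightarrow\quad -\log \pi(\beta) = \lambda \sum_j \beta_j + \mathrm{const} \quad \text{on } \beta \ge 0, $$

which is the $\ell_1$ penalty (since $|\beta_j| = \beta_j$ there) combined with the non-negativity constraint, i.e. the non-negative LASSO objective under the same MAP reading as in the previous entry.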
6 votes · 1 answer · 214 views

What prior would lead to $\ell_\infty$ regularization of model weights?

A Gaussian prior on the weights of a GLM leads to ridge / squared $\ell_2$ regularization, and a Laplace prior leads to $\ell_1$ regularization. What prior would lead to $\ell_\infty$ regularization?
asked by dohmatob (558)
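A sketch of one candidate answer to the question above, by analogy with the Gaussian/ridge and Laplace/$\ell_1$ cases: a prior with density

$$ \pi(w) \propto \exp\big(-\lambda \lVert w\rVert_\infty\big) = \exp\big(-\lambda \max_j |w_j|\big) $$

contributes exactly $\lambda \lVert w\rVert_\infty$ to the negative log-posterior. It is a proper prior (the density is dominated by $\exp(-\tfrac{\lambda}{d}\sum_j |w_j|)$ in $d$ dimensions), but unlike the Gaussian and Laplace cases it does not factorize across coordinates.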
2 votes · 1 answer · 563 views

Random-walk prior with ridge-like regularization?

I am working with a model that contains a large number of coefficients, arranged in an ordered vector $\beta_1, \dots, \, \beta_N $. I have some prior knowledge that could be used to improve the ...
asked by matteo (3,283)
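A sketch of one way to encode the combination hinted at in the question above, assuming the prior is taken proportional to the product of a Gaussian random-walk term on the ordered coefficients and an independent ridge-like shrinkage term:

$$ \pi(\beta) \propto \exp\Big(-\frac{1}{2\tau^2}\sum_{i=1}^{N-1}(\beta_{i+1}-\beta_i)^2\Big)\, \exp\Big(-\frac{1}{2\sigma_\beta^2}\sum_{i=1}^{N}\beta_i^2\Big), $$

so the negative log-prior is a first-difference (smoothness) penalty along the ordering plus an ordinary ridge penalty on the magnitudes, with $\tau^2$ and $\sigma_\beta^2$ controlling the two effects separately.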
0 votes · 1 answer · 83 views

How to select variables when using shrinkage priors?

I am fitting a linear regression model using shrinkage priors (Horseshoe and Laplace/LASSO). This shrinks many of the variables close to zero, but I would like to select the variables. Can I use the ...
asked by Shrimp (1)
1 vote · 0 answers · 154 views

Deriving posterior mean with horseshoe prior

I want to decompose a matrix $S \in \mathbb{R}^{D \times D}$ as below $$S=vv^T$$ where $v_i\mid\lambda_i,\tau_i \sim N(0,\lambda^2_i\tau^2_i)$, $\lambda_i \sim \text{Cauchy}^+(0,1)$, i.e. $v$ has a horseshoe ...
asked by newbie (225)
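For reference, the standard horseshoe hierarchy the question above appears to intend is

$$ v_i \mid \lambda_i, \tau \sim N(0, \lambda_i^2 \tau^2), \qquad \lambda_i \sim \text{Cauchy}^+(0,1), $$

with local scales $\lambda_i$ giving heavy tails (large signals are shrunk only weakly) and a global scale $\tau$ pulling the bulk of the $v_i$ toward zero. The posterior mean under this prior generally has no closed form; it is usually obtained by integrating over $(\lambda, \tau)$ numerically, e.g. with MCMC.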
4 votes · 2 answers · 268 views

In "A Topology Layer for Machine Learning," are the topological priors learned by the network or imposed by humans?

In this paper by Gabrielsson, Nelson, et al. the authors "present a differentiable topology layer that can, among other things, construct a loss on the output of a deep generative network to ...
asked by kdbanman (857)
2 votes · 1 answer · 336 views

How does L2 penalize large weights?

The L2 regularization term is useful because it penalizes large weights more heavily than smaller ones, which helps prevent overfitting. I'm having a hard time understanding how exactly it does this. This ...
asked by buydadip (123)
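A concrete illustration of the point raised above, with hypothetical numbers: because the penalty is quadratic, one large weight costs more than the same total magnitude spread over several small weights, and its gradient pushes hardest on the largest weights,

$$ \lambda\,\lVert(2,0)\rVert_2^2 = 4\lambda \;>\; 2\lambda = \lambda\,\lVert(1,1)\rVert_2^2, \qquad \nabla_w\,\big(\lambda\lVert w\rVert_2^2\big) = 2\lambda w, $$

so during gradient descent each weight is pulled toward zero with a force proportional to its own size (weight decay), which discourages any single weight from growing large.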
9 votes · 1 answer · 6k views

MAP estimation as regularisation of MLE

Going through the Wikipedia article on maximum a posteriori estimation, I got confused after reading this: It is closely related to the method of maximum likelihood (ML) estimation, but employs ...
asked by naive (1,049)
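A one-line sketch of the relationship the question above asks about: taking logs in Bayes' theorem and dropping the normalizing term that does not depend on $\theta$,

$$ \hat\theta_{\mathrm{MAP}} = \arg\max_\theta\, \big[\log p(x \mid \theta) + \log p(\theta)\big], $$

which is the maximum-likelihood objective plus the extra term $\log p(\theta)$. With a zero-mean Gaussian prior this term is $-\lVert\theta\rVert_2^2/(2\sigma_0^2)$ up to a constant, i.e. an $\ell_2$ regularizer, and a flat prior recovers plain MLE.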
4 votes · 1 answer · 803 views

Difference between random effect and fixed effect with regularization/prior

Let's say I have a random-effect intercept, for example lme4::lmer(yield ~ 1 + (1|Batch)). How is that different from ordinary regression using ...
asked by Jeff (150)
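A sketch of the connection asked about above, under the usual Gaussian assumptions and conditional on the fixed effect $\mu$ and the variance components: the predicted random intercepts $b_j \sim N(0, \sigma_b^2)$ from lme4::lmer(yield ~ 1 + (1|Batch)) minimize a ridge-penalized criterion,

$$ \hat b = \arg\min_b \Big\{ \frac{1}{\sigma^2}\sum_{j}\sum_{i}\big(y_{ij} - \mu - b_j\big)^2 + \frac{1}{\sigma_b^2}\sum_j b_j^2 \Big\}, $$

i.e. ridge shrinkage of the batch intercepts with effective penalty $\lambda = \sigma^2/\sigma_b^2$. The practical difference from fixing a penalty by hand is that lmer estimates $\sigma_b^2$, and hence the amount of shrinkage, from the data.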
2 votes · 1 answer · 240 views

Marginal prior derivation in hierarchical Bayesian model

I am working on a model that is closely related to the normal gamma shrinkage prior setup discussed in Griffin & Brown (2010). Suppose we want to draw $P$ parameters $\beta_p$ with $p=1,...,P$. ...
asked by yrx1702 (710)
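For the hierarchical setup referenced above, the marginal prior is obtained by integrating the conditional Gaussian against the mixing distribution placed on its variance; schematically, with $\beta_p \mid \psi_p \sim N(0, \psi_p)$ and a Gamma prior on $\psi_p$ as in the normal-gamma construction,

$$ \pi(\beta_p) = \int_0^\infty N(\beta_p \mid 0, \psi_p)\,\pi(\psi_p)\,d\psi_p, $$

which yields a density that is more sharply peaked at zero and heavier-tailed than a Gaussian; the Laplace (LASSO) prior is recovered as the special case where the mixing density on the variance is exponential, i.e. a Gamma with unit shape.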
