All Questions
Tagged with prior machine-learning
22
questions
0
votes
0
answers
17
views
Strange Variance Term for Normal Prior $w^2\sigma^2$
I've attached two screenshots, one with the question and one with the answer. It seems to me that the prior is wrong and should include $w^2$, not $w^2\sigma^2$.
I apologise for including such a ...
1
vote
0
answers
22
views
Can we solve by hand the early exit multi-class classification problem? [closed]
Problem: Find a solution $\hat{\varepsilon}$ of the following minimization problem
\begin{align*}
&\min_{\varepsilon \in \mathbb{R}^M} \sum_{h=1}^M \varepsilon^h \hat{R}^h+\beta \sum_{h=1}^M \...
2
votes
1
answer
45
views
Is the Bayesian updating framework a valid concept?
When I searched Google for the term, only 6 pages showed up. There is no authoritative paper on this, except https://arxiv.org/abs/1306.6430, which argues for using informatics concepts to generalize a ...
7
votes
2
answers
305
views
What is the statistical model for a multi-label problem?
In a setting with a binary $y$ like dog/cat, a reasonable statistical model is to posit that the probability parameter $p$ of a $\text{Binomial}(1, p)$ distribution is some function $f$ of features $X$...
1
vote
0
answers
26
views
Specifying conditional distribution in a Bayesian network
I am trying to learn about Bayesian networks and am having a hard time figuring out how to set up some simple models.
Say, I have a model as:
...
0
votes
0
answers
102
views
What is the posterior distribution $p(\textbf{f} | \textbf{y})$ for a Gaussian Process regression?
What is the posterior distribution $p(\textbf{f} | \textbf{y})$ for a Gaussian Process regression?
Suppose that $p(y_n |x_n, f) = N(f(x_n), \sigma^2)$ with prior on $\textbf{f} = [f(x_1), \ldots f(x_n)...
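A sketch of the standard result, under an assumption: the excerpt is cut off before the prior is stated, so suppose a zero-mean Gaussian prior $\textbf{f} \sim N(\textbf{0}, K)$, where $K$ is a hypothetical kernel matrix with entries $K_{ij} = k(x_i, x_j)$. Combined with the Gaussian likelihood above, Gaussian conditioning would then give
\begin{align*}
p(\textbf{f} \mid \textbf{y}) &= N(\textbf{f} \mid \boldsymbol{\mu}, \Sigma), \\
\boldsymbol{\mu} &= K (K + \sigma^2 I)^{-1} \textbf{y}, \\
\Sigma &= K - K (K + \sigma^2 I)^{-1} K.
\end{align*}
If the prior in the full question has a non-zero mean or a different covariance, these expressions change accordingly.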
8
votes
3
answers
507
views
Why does a function being smoother make it more likely?
I am currently studying the textbook Gaussian Processes for Machine Learning by Carl Edward Rasmussen and Christopher K. I. Williams. Chapter 1 Introduction says the following:
Given this training ...
3
votes
2
answers
154
views
The Bayes' Theorem Components of the Probability Output of a Classifier
Let's give a simple setup.
I have $500$ photos of dogs and $500$ photos of cats, all labeled. From these, I want to build a classifier of photos.
For each photo, the classifier outputs a probability ...
1
vote
0
answers
170
views
Bishop: Understanding the prior and posterior for a curve fitting example (1.2)
In Bishop's book Pattern Recognition and Machine Learning, he uses an example of fitting a polynomial to data collected from a sinusoidal curve with Gaussian noise. The goal is to find the most ...
1
vote
0
answers
25
views
Linear regression - Bayesian Predictive distribution
I am trying to answer a question about linear regression but I am stuck:
$y=w \cdot x + \epsilon, \epsilon \sim N(0,\alpha)$
I am also given a prior:
$w\sim N(0,\beta)$
from which I was able to ...
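For the scalar model in this excerpt, the posterior over $w$ and the predictive distribution can be sketched in a few lines. This is a minimal illustration, not the asker's derivation: it assumes $\alpha$ and $\beta$ are variances (the excerpt does not say whether they are variances or precisions), and the function names `posterior_w` and `predictive` are made up for the example.

```python
def posterior_w(xs, ys, alpha, beta):
    """Posterior mean and variance of w for y = w*x + eps.

    Assumption: alpha is the noise variance and beta is the prior variance,
    i.e. eps ~ N(0, alpha) and w ~ N(0, beta).
    """
    # Posterior precision = prior precision + data precision.
    precision = 1.0 / beta + sum(x * x for x in xs) / alpha
    # Posterior mean = (sum of x*y / alpha) / posterior precision.
    mean = sum(x * y for x, y in zip(xs, ys)) / alpha / precision
    return mean, 1.0 / precision


def predictive(x_new, xs, ys, alpha, beta):
    """Mean and variance of the predictive distribution of y at x_new."""
    mean_w, var_w = posterior_w(xs, ys, alpha, beta)
    # Uncertainty in w scales with x_new**2; the noise variance is added on top.
    return mean_w * x_new, x_new * x_new * var_w + alpha


if __name__ == "__main__":
    print(predictive(1.0, [1.0, 2.0], [2.0, 4.0], alpha=1.0, beta=1.0))
```

The predictive variance has two parts: the term from the remaining uncertainty about $w$, which shrinks as more data arrive, and the noise variance $\alpha$, which does not.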
2
votes
0
answers
117
views
What do these equations on Bayesian regression (MAP) from Chapter 3.3 in PRML by Bishop mean?
This was taken from Ch 3.3 on Bayesian Linear Regression from Pattern Recognition and Machine Learning by Bishop.
Apparently the posterior can be described by eq 3.49. Eq 3.48 represents the prior ...
1
vote
0
answers
169
views
Gaussian Process regression: does there exist a conjugate prior over hyperparameters?
When adopting a fully Bayesian hierarchical setting in Gaussian Process regression, is there a choice of kernel (covariance) function such that there exists a conjugate prior? If so, which?
5
votes
2
answers
879
views
Importance of the prior
The maximum a posteriori objective can be written as
$$\widehat{\theta}_\textrm{MAP} = \operatorname*{argmax}_\theta \log P(y\mid\theta) + \log P(\theta)$$
where $\log P(\theta)$, the prior, is a ...
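To make the objective concrete, here is a small sketch for one specific case that is not taken from the question: coin flips with a Bernoulli likelihood and an assumed $\text{Beta}(a, b)$ prior on $\theta$. It maximises $\log P(y\mid\theta) + \log P(\theta)$ over a grid and compares the result with the known closed-form mode. All function names are hypothetical.

```python
import math


def log_likelihood(theta, heads, flips):
    # Bernoulli log-likelihood, dropping terms that do not depend on theta.
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)


def log_prior(theta, a, b):
    # Beta(a, b) log-density, dropping the normalising constant.
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)


def map_estimate(heads, flips, a, b, grid_size=10001):
    # Grid of interior points in (0, 1), so the logarithms stay finite.
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    return max(
        grid,
        key=lambda t: log_likelihood(t, heads, flips) + log_prior(t, a, b),
    )


def map_closed_form(heads, flips, a, b):
    # Mode of the Beta(heads + a, flips - heads + b) posterior (needs a, b > 1).
    return (heads + a - 1.0) / (flips + a + b - 2.0)


if __name__ == "__main__":
    for heads, flips in [(7, 10), (700, 1000)]:
        print(flips, map_estimate(heads, flips, 2.0, 2.0),
              map_closed_form(heads, flips, 2.0, 2.0))
```

With the same prior, the MAP estimate moves toward the maximum-likelihood value `heads / flips` as the number of flips grows, which is one way to see how the weight of the prior term falls off with more data.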
1
vote
1
answer
1k
views
Choice of Gaussian process in non-parametric regression
I have been trying to understand non-parametric regression using Gaussian processes (GP), which are used to represent prior distributions over the space of functions. The linear model considered is
$$ ...
3
votes
1
answer
27
views
Predictive classification including varying information about classes
I'd appreciate help conceptualizing a problem. I constructed a supervised training set where user inputs carry the correct classification. In building a classification model, I'll strip out the ...