
All Questions

Tagged with
0 votes
0 answers
17 views

Strange Variance Term for Normal Prior $w^2\sigma^2$

I've attached two screenshots, one with the question and one with the answer. It seems to me that the prior is wrong and it should include $w^2$, not $w^2\sigma^2$. I apologise for including such a ...
CormJack • 161
1 vote
0 answers
22 views

Can we solve by hand the early exit multi-class classification problem? [closed]

Problem: Find a solution $\hat{\varepsilon}$ of the following minimization problem \begin{align*} &\min_{\varepsilon \in \mathbb{R}^M} \sum_{h=1}^M \varepsilon^h \hat{R}^h+\beta \sum_{h=1}^M \...
ohana • 111
2 votes
1 answer
45 views

Is bayesian updating framework a valid concept?

When I searched Google for the term, only 6 pages showed up. There is no authoritative paper on this, except https://arxiv.org/abs/1306.6430, which argues for using informatics concepts to generalize a ...
Chloe • 21
7 votes
2 answers
305 views

What is the statistical model for a multi-label problem?

In a setting with a binary $y$ like dog/cat, a reasonable statistical model is to posit that the probability parameter $p$ of a $\text{Binomial}(1, p)$ distribution is some function $f$ of features $X$...
Dave • 65k
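A common way to formalize this, sketched here as an assumption rather than as the accepted answer, is the binary-relevance formulation: give each of the $K$ labels its own Bernoulli parameter driven by the features,
$$y_k \mid X \sim \text{Binomial}(1,\, p_k), \qquad p_k = f_k(X), \qquad k = 1, \dots, K,$$
with the labels treated as conditionally independent given $X$; correlated labels would instead need a joint model over the $2^K$ label patterns.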
1 vote
0 answers
26 views

Specifying conditional distribution in a Bayesian network

I am trying to learn about Bayesian networks and am really having a hard time figuring out how to set up some simple models. Say I have a model as: ...
Luca • 4,700
0 votes
0 answers
102 views

What is the posterior distribution $p(\textbf{f} | \textbf{y})$ for a Gaussian Process regression?

What is the posterior distribution $p(\textbf{f} | \textbf{y})$ for a Gaussian Process regression? Suppose that $p(y_n |x_n, f) = N(f(x_n), \sigma^2)$ with prior on $\textbf{f} = [f(x_1), \ldots f(x_n)...
chesslad • 211
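For reference, a sketch of the standard conjugate result, assuming a zero-mean prior $\textbf{f} \sim N(\textbf{0}, K)$ with $K_{ij} = k(x_i, x_j)$ (the truncated excerpt does not show the mean): since $\textbf{y} \mid \textbf{f} \sim N(\textbf{f}, \sigma^2 I)$, Gaussian conditioning gives
$$p(\textbf{f} \mid \textbf{y}) = N\!\big(K(K + \sigma^2 I)^{-1}\textbf{y},\; K - K(K + \sigma^2 I)^{-1}K\big).$$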
8 votes
3 answers
507 views

Why does a function being smoother make it more likely?

I am currently studying the textbook Gaussian Processes for Machine Learning by Carl Edward Rasmussen and Christopher K. I. Williams. Chapter 1 Introduction says the following: Given this training ...
The Pointer • 2,096
3 votes
2 answers
154 views

The Bayes' Theorem Components of the Probability Output of a Classifier

Let's give a simple setup. I have $500$ photos of dogs and $500$ photos of cats, all labeled. From these, I want to build a classifier of photos. For each photo, the classifier outputs a probability ...
Dave • 65k
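As a sketch of the decomposition the question is after, taking the balanced $500/500$ counts as the class priors $P(\text{dog}) = P(\text{cat}) = \tfrac{1}{2}$ (an assumption on my part), the classifier's output for a photo $x$ can be read as
$$P(\text{dog} \mid x) = \frac{p(x \mid \text{dog})\, P(\text{dog})}{p(x \mid \text{dog})\, P(\text{dog}) + p(x \mid \text{cat})\, P(\text{cat})},$$
so the likelihoods $p(x \mid \text{class})$ and the priors are the Bayes' theorem components; with equal priors the posterior is driven entirely by the likelihood ratio.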
1 vote
0 answers
170 views

Bishop: Understanding the prior and posterior for a curve fitting example (1.2)

In Bishop's Pattern Recognition and Machine Learning Book, he uses an example of fitting a polynomial to data collected from a sinusoidal curve with Gaussian noise. The goal is to find the most ...
user137210
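For orientation, a sketch of Bishop's Section 1.2 setup (not of this question's full answer): the prior over the polynomial coefficients is the isotropic Gaussian $p(\textbf{w} \mid \alpha) = N(\textbf{w} \mid \textbf{0}, \alpha^{-1} I)$, the posterior is $p(\textbf{w} \mid \textbf{x}, \textbf{t}, \alpha, \beta) \propto p(\textbf{t} \mid \textbf{x}, \textbf{w}, \beta)\, p(\textbf{w} \mid \alpha)$, and maximizing it is equivalent to minimizing the regularized sum of squares
$$\frac{\beta}{2} \sum_{n=1}^N \{y(x_n, \textbf{w}) - t_n\}^2 + \frac{\alpha}{2} \textbf{w}^\top \textbf{w}.$$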
1 vote
0 answers
25 views

Linear regression - Bayesian Predictive distribution

I am trying to answer a question about linear regression but I am stuck: $y = w \cdot x + \epsilon$, $\epsilon \sim N(0,\alpha)$. I am also given a prior $w \sim N(0,\beta)$, from which I was able to ...
Questions
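A sketch of where this leads, treating $\alpha$ and $\beta$ as variances (the excerpt's notation leaves this ambiguous) and writing the data as $\{(x_i, y_i)\}_{i=1}^N$: the posterior over the scalar weight is $w \mid \mathcal{D} \sim N(\mu_N, s_N^2)$ with $s_N^2 = \big(\tfrac{1}{\beta} + \tfrac{1}{\alpha}\sum_i x_i^2\big)^{-1}$ and $\mu_N = \tfrac{s_N^2}{\alpha} \sum_i x_i y_i$, so the predictive distribution at a new input $x_*$ is
$$p(y_* \mid x_*, \mathcal{D}) = N\!\big(\mu_N x_*,\; \alpha + s_N^2 x_*^2\big).$$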
2 votes
0 answers
117 views

What do these equations on Bayesian regression (MAP) from Chapter 3.3 in PRML by Bishop mean?

This was taken from Ch 3.3 on Bayesian Linear Regression from Pattern Recognition and Machine Learning by Bishop. Apparently the posterior can be described by eq 3.49. Eq 3.48 represents the prior ...
doctopus • 121
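For readers without the book to hand, a sketch of those equations as I read Bishop's Section 3.3: the prior is $p(\textbf{w}) = N(\textbf{w} \mid \textbf{m}_0, S_0)$ (eq. 3.48) and the posterior is $p(\textbf{w} \mid \textbf{t}) = N(\textbf{w} \mid \textbf{m}_N, S_N)$ (eq. 3.49), with
$$\textbf{m}_N = S_N\big(S_0^{-1}\textbf{m}_0 + \beta \Phi^\top \textbf{t}\big), \qquad S_N^{-1} = S_0^{-1} + \beta \Phi^\top \Phi,$$
where $\Phi$ is the design matrix and $\beta$ the noise precision; the MAP estimate is then simply $\textbf{m}_N$.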
1 vote
0 answers
169 views

Gaussian Process regression: does there exist a conjugate prior over hyperparameters?

When adopting a fully Bayesian hierarchical setting in Gaussian Process regression, is there a choice of kernel (covariance) function such that there exists a conjugate prior? If so, which?
Marco Rossi
5 votes
2 answers
879 views

Importance of the prior

The maximum a posteriori objective can be written as $$\widehat{\theta}_\textrm{MAP} = \operatorname*{argmax}_\theta \log P(y\mid\theta) + \log P(\theta)$$ where $\log P(\theta)$, the prior, is a ...
Shrey • 205
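A concrete sketch of why the prior term matters, assuming a Gaussian prior (not stated in the excerpt): if $P(\theta) = N(\theta \mid 0, \sigma^2 I)$, then $\log P(\theta) = -\tfrac{\|\theta\|^2}{2\sigma^2} + \text{const}$, so
$$\widehat{\theta}_\textrm{MAP} = \operatorname*{argmax}_\theta \Big\{ \log P(y \mid \theta) - \frac{\|\theta\|^2}{2\sigma^2} \Big\},$$
i.e. maximum likelihood with an L2 (weight-decay) penalty; as $\sigma^2 \to \infty$ the prior flattens, the penalty vanishes, and the MAP estimate approaches the MLE.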
1 vote
1 answer
1k views

Choice of Gaussian process in non-parametric regression

I have been trying to understand non-parametric regression using Gaussian processes (GP), which are used to represent prior distributions over the space of functions. The linear model considered is $$ ...
Jack2018
3 votes
1 answer
27 views

Predictive classification including varying information about classes

I'd appreciate help conceptualizing a problem. I constructed a supervised training set where user inputs carry the correct classification. In building a classification model, I'll strip out the ...
Eric Green
