All Questions
9 questions
2 votes, 3 answers, 396 views
Implementing multiclass logistic regression from scratch
This is a sequel to a previous question about implementing binary logistic regression from scratch.
Background knowledge:
To train a logistic regression model for a classification problem with $K$ ...
3 votes, 1 answer, 195 views
Implementing binary logistic regression from scratch
Background knowledge:
To train a logistic regression model for a classification problem with two classes (called class $0$ and class $1$), we are given a training dataset consisting of feature vectors ...
0 votes, 1 answer, 183 views
Maximum Entropy Continuous Distribution
In Pattern Recognition and Machine Learning Ch 1.6, the author derives the distribution which maximises the differential entropy:
$$H(\textbf{x})=-\int p(\textbf{x}) \ln (p(\textbf{x})) d\textbf{x}$$
...
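As a rough numerical illustration of the maximisation mentioned above, the sketch below estimates the differential entropy of three one-dimensional densities. It assumes the constraint is a fixed variance (set to 1 here), which the truncated excerpt does not state. The helper `differential_entropy` and the choice of uniform and Laplace comparison densities are hypothetical, picked only for illustration.

```python
import math

def differential_entropy(pdf, lo, hi, n=100_000):
    """Midpoint-rule estimate of -integral of p(x) ln p(x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        p = pdf(lo + (i + 0.5) * h)
        if p > 0.0:  # p ln p -> 0 as p -> 0, so zero-density points are skipped
            total -= p * math.log(p) * h
    return total

def gaussian(x):
    # Standard normal density: mean 0, variance 1.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

HALF_WIDTH = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1

def uniform(x):
    return 1.0 / (2.0 * HALF_WIDTH) if abs(x) <= HALF_WIDTH else 0.0

B = 1.0 / math.sqrt(2.0)  # Laplace scale b, chosen so the variance 2*b^2 equals 1

def laplace(x):
    return math.exp(-abs(x) / B) / (2.0 * B)

h_gauss = differential_entropy(gaussian, -12.0, 12.0)
h_unif = differential_entropy(uniform, -HALF_WIDTH, HALF_WIDTH)
h_lap = differential_entropy(laplace, -30.0, 30.0)

print(h_gauss, h_lap, h_unif)
```

Under these assumptions the Gaussian estimate should come out largest and close to the closed form $\frac{1}{2}\ln(2\pi e)$. Comparing three densities does not prove the general result.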
6 votes, 3 answers, 754 views
Application of the chain rule to $3$-layer neural network
Consider the differentiable functions $L^1(x,\theta^1),L^2(x^2,\theta^2),L^3(x^3,\theta^3)$, where all $x_k,\theta^k$ are real vectors, for $k=1,2,3$. Also define $\theta=(\theta^1,\theta^2,\theta^3)$...
0 votes, 1 answer, 836 views
Simplifying partial derivative of cross-entropy function
How do I simplify:
$$\begin{eqnarray}
\frac{\partial C}{\partial w_j} & = & -\frac{1}{n} \sum_x \left(
\frac{y }{\sigma(z)} -\frac{(1-y)}{1-\sigma(z)} \right)
\frac{\partial \sigma}{\...
0 votes, 1 answer, 465 views
Why are terms flipped in partial derivative of logistic regression cost function?
When calculating the partial derivative:
$$\frac{\partial}{\partial\theta_{j}}J(\theta) $$
from:
$$ J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}(y^{i}\log(h_\theta(x^{i}))+(1-y^{i})\log(1-h_\theta(x^{i})))$$
...
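One way to see where the sign goes is to check the derivative numerically. The sketch below implements $J(\theta)$ as written above with $h_\theta(x)=\sigma(\theta^\top x)$, which is an assumption since the excerpt does not define $h_\theta$. It compares a central finite difference against the standard closed form $\frac{1}{m}\sum_i (h_\theta(x^i)-y^i)\,x^i_j$, which is not quoted from the truncated question. The synthetic data and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    # Assumed hypothesis: h_theta(x) = sigmoid(theta . x)
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, xs, ys):
    """J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h))."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = predict(theta, x)
        total += y * math.log(h) + (1.0 - y) * math.log(1.0 - h)
    return -total / len(xs)

def analytic_grad(theta, xs, ys):
    """(1/m) * sum((h - y) * x_j): the leading minus in J turns (y - h) into (h - y)."""
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        h = predict(theta, x)
        for j, xj in enumerate(x):
            grad[j] += (h - y) * xj
    return [g / len(xs) for g in grad]

def numeric_grad(theta, xs, ys, eps=1e-6):
    """Central finite-difference approximation of dJ/dtheta_j."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(up, xs, ys) - cost(down, xs, ys)) / (2.0 * eps))
    return grad

# Small synthetic dataset (made up for this check); first feature is a bias term.
random.seed(0)
xs = [[1.0, random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(40)]
ys = [1.0 if random.random() < 0.5 else 0.0 for _ in range(40)]
theta = [0.1, -0.2, 0.3]

max_diff = max(abs(a - n) for a, n in
               zip(analytic_grad(theta, xs, ys), numeric_grad(theta, xs, ys)))
print(max_diff)
```

If the two gradients agree to within finite-difference error, the $(h_\theta(x^i)-y^i)$ ordering is consistent with the minus sign in front of the sum.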
1 vote, 1 answer, 138 views
Deriving the maximum likelihood estimate of Gaussian covariance matrix
$\newcommand{\trace}{\operatorname{trace}}$I recently came across a deduction I couldn't follow. It concerns the maximum likelihood estimate of the covariance matrix for a multivariate Gaussian ...
1 vote, 0 answers, 70 views
General correlation between function with itself and other input data
I've collected data for a function F = f(g(x)), for different function shapes g(x).
The goal is to predict values of ...
6 votes, 1 answer, 1k views
Multivariate Gaussian equivalent for a Gaussian integration identity.
For a one-dimensional $x$,
$$\int_{-\infty}^{\infty}x^{2}e^{-x^{2}}dx=\frac{1}{2}\int_{-\infty}^{\infty}e^{-x^{2}}dx$$
This can be shown through integration by parts. There is a good derivation of ...
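The one-dimensional identity above can be sanity-checked numerically. The sketch below uses a plain trapezoid rule on a truncated interval. The helper `trapezoid`, the interval $[-8, 8]$, and the step count are assumptions chosen for illustration; the tails beyond $|x|=8$ are negligible for these integrands.

```python
import math

def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoid rule for the integral of f over [lo, hi] with n steps."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Left-hand side: integral of x^2 * exp(-x^2)
lhs = trapezoid(lambda x: x * x * math.exp(-x * x), -8.0, 8.0)

# Right-hand side: one half of the integral of exp(-x^2)
rhs = 0.5 * trapezoid(lambda x: math.exp(-x * x), -8.0, 8.0)

print(lhs, rhs, math.sqrt(math.pi) / 2.0)
```

Both sides should agree with each other and with $\sqrt{\pi}/2$, since $\int_{-\infty}^{\infty}e^{-x^{2}}dx=\sqrt{\pi}$.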