All Questions

0 votes
0 answers
10 views

Encouraging sparsity at block level or element-wise level?

I have an objective function $f(W)$, where $W$ is a $Kp \times Kp$ matrix. We can view $W$ as a $p \times p$ block matrix, where each block has dimension $K \times K$. Now to optimize $f(W)$, I ...
PiVoyager
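A minimal numerical sketch of the two penalty choices the question contrasts, with illustrative values $K=3$, $p=4$. The block-level penalty here is the standard group-lasso sum of per-block Frobenius norms, which is one common way to encourage sparsity at the block level; it is not necessarily the asker's exact formulation.

```python
import numpy as np

K, p = 3, 4  # block size and number of blocks per side (assumed values)
rng = np.random.default_rng(0)
W = rng.normal(size=(K * p, K * p))

# Element-wise sparsity: the usual l1 penalty over all entries of W.
l1_penalty = np.abs(W).sum()

# Block-level sparsity: a group-lasso penalty, i.e. the sum of the
# Frobenius norms of the K x K blocks; it drives whole blocks to zero.
blocks = W.reshape(p, K, p, K).transpose(0, 2, 1, 3)  # shape (p, p, K, K)
group_penalty = np.linalg.norm(blocks, axis=(2, 3)).sum()

print(l1_penalty, group_penalty)
```

By the norm inequality $\|v\|_1 \ge \|v\|_2$ applied block-wise, the element-wise penalty always dominates the group penalty on the same matrix.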
0 votes
1 answer
20 views

How to sidestep the non-differentiability of the Frobenius norm at 0 in numerical analysis?

I am currently using the L-BFGS-B optimization algorithm. I have my objective function, and I also need its gradient. Part of my objective function is a Frobenius norm ...
PiVoyager
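One common workaround for gradient-based solvers such as L-BFGS-B is to replace $\|W\|_F$ by the smooth surrogate $\sqrt{\|W\|_F^2 + \varepsilon}$, which has a well-defined gradient everywhere, including at $W = 0$. A sketch (this surrogate and the value of $\varepsilon$ are one standard choice, not the only one):

```python
import numpy as np

def smooth_fro(W, eps=1e-8):
    """Smooth surrogate sqrt(||W||_F^2 + eps) for the Frobenius norm."""
    return np.sqrt(np.sum(W**2) + eps)

def smooth_fro_grad(W, eps=1e-8):
    """Gradient W / sqrt(||W||_F^2 + eps); well defined even at W = 0."""
    return W / np.sqrt(np.sum(W**2) + eps)

# At W = 0 the exact Frobenius norm has no gradient, but the surrogate does.
W0 = np.zeros((3, 3))
print(smooth_fro_grad(W0))  # finite (all zeros)
```

As $\varepsilon \to 0$ the surrogate converges uniformly to the exact norm, so the approximation error in the objective is at most $\sqrt{\varepsilon}$.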
2 votes
0 answers
56 views

Covariances not below $\Sigma$

$\Sigma_0,\Sigma_1,\dots,\Sigma_K$ are real covariance matrices. I’m interested in the set of matrices $$\bigcap_{k=1}^K \left\{x: 0 \preceq x \preceq \Sigma_0, \ x\not\prec\Sigma_k\right\}.$$ I’m ...
Christian Chapman
1 vote
0 answers
64 views

Relation between the values of $\xi_i$ and $\alpha_i$ in SVM?

I have a question about a property of the support vectors of SVM which is stated in subsection "12.2.1 Computing the Support Vector Classifier" of "The Elements of Statistical Learning" ...
hasanghaforian
0 votes
0 answers
39 views

Least squares minimum test error solution

Assume we want to learn a model $y = x^T \beta + \varepsilon$, where $\beta \in \mathbb{R}^d$ is constant and $x \in \mathbb{R}^d$ is the input vector with Gaussian distribution $\mathcal{N}(0,\Sigma_x)$ ...
Elad Elmakias
1 vote
0 answers
22 views

What does the spectral norm of a Wigner matrix converge to when the variances are not renormalised?

It seems to be well known that for an $N \times N$ Wigner matrix, that is, a matrix that is symmetric (or Hermitian, but I am only interested in the case where all the entries are real) and has i.i.d. ...
ufghd34 • 81
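For entries of variance $\sigma^2$ without the usual $1/\sqrt{N}$ renormalisation, the spectral norm is known to grow like $2\sigma\sqrt{N}$ (the rescaled matrix $W/\sqrt{N}$ has spectral norm converging to $2\sigma$). A quick Monte Carlo sanity check of that scaling, under the assumption of real Gaussian entries:

```python
import numpy as np

# Symmetric N x N matrix with off-diagonal entries of variance sigma^2;
# without renormalisation the spectral norm behaves like 2*sigma*sqrt(N).
rng = np.random.default_rng(0)
N, sigma = 500, 1.0
A = rng.normal(scale=sigma, size=(N, N))
W = (A + A.T) / np.sqrt(2)  # symmetric; off-diagonal variance sigma^2

spec = np.linalg.norm(W, 2)      # spectral norm (largest singular value)
print(spec, 2 * sigma * np.sqrt(N))
```

The two printed numbers should agree to within a few percent already at $N = 500$; the edge fluctuations shrink as $N$ grows.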
-1 votes
1 answer
47 views

What is the derivative of the $\ell_1$ norm of the matrix? [duplicate]

The question is short. I have a square matrix $W$. I know $\|W\|_1$ denotes the usual $\ell_1$ norm, i.e. the sum of the absolute values of the elements of the matrix. Now I want to compute the ...
PiVoyager
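The element-wise $\ell_1$ norm is not differentiable wherever an entry is zero; away from zeros its gradient is $\operatorname{sign}(W)$, and $\operatorname{sign}(W)$ is a valid subgradient everywhere. A quick numerical check (a sketch, not the linked duplicate's answer):

```python
import numpy as np

def l1_norm(W):
    """Element-wise l1 norm: sum of |W_ij|."""
    return np.abs(W).sum()

def l1_subgrad(W):
    """sign(W) is a subgradient of the element-wise l1 norm; it equals the
    gradient at any W with no entry exactly zero."""
    return np.sign(W)

W = np.array([[1.5, -2.0],
              [0.0,  0.7]])
print(l1_norm(W))      # sum of absolute values of the entries
print(l1_subgrad(W))   # entrywise signs; 0 at the zero entry
```

A finite-difference check at any nonzero entry reproduces the corresponding entry of $\operatorname{sign}(W)$.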
0 votes
0 answers
23 views

What is the intuition to create an orthogonal design matrix in an iterative way (without Gram-Schmidt)?

Let $X_1, X_2, \dots, X_n$ be $n$ observations in $[-1,1]$ with $X_i \ne X_j$ if $i\ne j$. Let $\phi_0(x)=1$, $\phi_1(x) = 2(x-a_1)\phi_0(x)$. When $r\ge 1$, $\phi_{r+1}(x)=2(x-a_{r+1})\...
Kaven Lin
0 votes
0 answers
31 views

How to Upper Bound the Spectral Norm of $\left(XX^T\right)^{-1}\left(XX^T\right)^{-1}X$?

I have an observation matrix $ X \in \mathbb{R}^{n \times n}$. Considering $XX^T$, this matrix can be seen as a correlation matrix between individuals, so it generally has elements close to the ...
Tool • 1
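For square invertible $X$ with SVD $X = U\Sigma V^\top$, we get $(XX^\top)^{-1}(XX^\top)^{-1}X = U\Sigma^{-3}V^\top$, so the spectral norm equals $\sigma_{\min}(X)^{-3}$ exactly, which is also the natural upper bound. A numerical check of that identity:

```python
import numpy as np

# With X = U S V^T invertible, (XX^T)^{-1}(XX^T)^{-1} X = U S^{-3} V^T,
# so its spectral norm is exactly sigma_min(X)^{-3}.
rng = np.random.default_rng(0)
n = 6
X = rng.normal(size=(n, n))

M = np.linalg.inv(X @ X.T) @ np.linalg.inv(X @ X.T) @ X
smin = np.linalg.svd(X, compute_uv=False).min()
print(np.linalg.norm(M, 2), smin ** -3)
```

So any lower bound on $\sigma_{\min}(X)$ translates directly into an upper bound on this spectral norm.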
1 vote
1 answer
63 views

PCA Reconstruction Properties

Let $X \in \mathbb{R}^{n \times d}$ be our data matrix where $n$ is the number of examples and $d$ is the feature dimension. Applying PCA to $X$, we get a low-dimensional representation $A \in \mathbb{...
FountainTree
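One standard reconstruction property that can be checked numerically: with PCA computed from the SVD of the centred data, the squared Frobenius reconstruction error from the top-$k$ components equals the sum of the squared discarded singular values. A sketch (dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 10, 3
X = rng.normal(size=(n, d))
Xc = X - X.mean(axis=0)            # centre the data, as PCA assumes

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
A = Xc @ Vt[:k].T                  # low-dimensional representation (scores)
X_hat = A @ Vt[:k]                 # reconstruction from the top-k components

# Eckart-Young: the squared Frobenius reconstruction error equals the
# sum of the squared discarded singular values.
err = np.linalg.norm(Xc - X_hat) ** 2
print(err, np.sum(s[k:] ** 2))
```

This also makes $X_{\hat{}}$ the best rank-$k$ approximation of the centred data in Frobenius norm.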
2 votes
0 answers
49 views

How to show this discrete quadratic equation converges?

So I have a discrete process $V_{k}=AV_{k-1}A^T+C$, where $C$ is a constant matrix and $V$ is symmetric (this is supposed to be the update for the state covariance of a discrete stochastic process). I read ...
Minecraft dirt block
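When the spectral radius of $A$ is below $1$, this update is a contraction and $V_k$ converges to the unique fixed point, i.e. the solution of the discrete Lyapunov equation $V = AVA^\top + C$. A numerical sketch under that stability assumption (the matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))   # force spectral radius < 1
B = rng.normal(size=(n, n))
C = B @ B.T                                  # symmetric PSD "constant" term

V = np.zeros((n, n))
for _ in range(500):
    V = A @ V @ A.T + C                      # the covariance update

# At convergence V solves the discrete Lyapunov equation V = A V A^T + C.
residual = np.linalg.norm(V - (A @ V @ A.T + C))
print(residual)
```

The limit is the series $\sum_{i\ge 0} A^i C (A^\top)^i$, which converges precisely because the spectral radius of $A$ is less than one; symmetry of $V_k$ is preserved at every step.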
1 vote
1 answer
45 views

If $X^TX\beta=X^TY$, then $X\beta$ is independent of $\beta$

This question is motivated by linear statistical inference, and more specifically, the normal equation for a least squares estimate and estimable functions. But it boils down to pure linear algebra. ...
Anon • 586
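The claim can be sanity-checked numerically: any two solutions of the normal equations $X^TX\beta = X^TY$ differ by a null-space vector of $X$, so the fitted values $X\beta$ coincide. A sketch with a deliberately rank-deficient $X$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 5
X = rng.normal(size=(n, d))
X[:, -1] = X[:, 0] + X[:, 1]        # make X rank-deficient on purpose
Y = rng.normal(size=n)

# One solution of the normal equations, via the pseudoinverse.
beta1 = np.linalg.pinv(X) @ Y

# A second solution: shift beta1 by a null-space vector of X.
# (If X^T X beta = X^T Y and X v = 0, then beta + v also solves it.)
v = np.array([1.0, 1.0, 0.0, 0.0, -1.0])    # X v = 0 by construction
beta2 = beta1 + v

print(np.linalg.norm(X @ beta1 - X @ beta2))  # identical fitted values
```

This is why estimable functions (those of the form $c^\top\beta$ with $c$ in the row space of $X$) are well defined even when $\beta$ itself is not unique.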
1 vote
1 answer
29 views

Compute the inverse of a special $2 \times 2$ block matrix.

Let $$X\in\mathbb{R}^p,\quad \tilde{X} = (1, X^{\top})^{\top}\in\mathbb{R}^{p+1},\quad \tilde{\Sigma}=\mathbb{E} \left[\tilde{X} \tilde{X}^{\top}\right]\in\mathbb{R}^{(p+1)\times (p+1)} $$ $$ (\tilde{...
maskeran • 573
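For this block structure the Schur complement of the $(1,1)$ entry is $S - \mu\mu^\top = \operatorname{Cov}(X)$, which yields the inverse in closed form. A numerical check, with $\mu$ and $S$ denoting $\mathbb{E}[X]$ and $\mathbb{E}[XX^\top]$ (these names and the example values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
mu = rng.normal(size=(p, 1))             # E[X]
Cov = np.eye(p) + 0.1 * np.ones((p, p))  # Cov(X), positive definite
S = Cov + mu @ mu.T                      # E[X X^T]

# Sigma_tilde = E[(1, X^T)^T (1, X^T)] has the 2x2 block form
# [[1, mu^T], [mu, S]]; the Schur complement of the (1,1) entry is
# S - mu mu^T = Cov(X), giving the block inverse below.
Sigma = np.block([[np.ones((1, 1)), mu.T], [mu, S]])
Ci = np.linalg.inv(Cov)                  # inverse of the Schur complement
inv_blocks = np.block([
    [1 + mu.T @ Ci @ mu, -mu.T @ Ci],
    [-Ci @ mu,            Ci      ],
])
print(np.linalg.norm(inv_blocks - np.linalg.inv(Sigma)))
```

The same Schur-complement identity gives the inverse of any $2 \times 2$ block matrix whose $(1,1)$ block and Schur complement are invertible.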
5 votes
1 answer
204 views

Rigorous Mathematical foundations of Machine Learning / Deep Learning / Neural Networks

I am an Engineering Graduate (with a strong background in Probability/Measure Theory, Linear Algebra and Calculus) wanting to dig deep into Deep Learning and Neural Networks, and I'm looking for ...
Michel H • 322
0 votes
0 answers
10 views

Given two matrices, where the average of correlations for each column is $0$, how to determine the range of average correlation values for each row?

Given two matrices $A$ and $B$, where $\mathrm{mean}(\mathrm{corr}(A_{\cdot,i},B_{\cdot,i}))$ equals $0$, how can one determine the range of $\mathrm{mean}(\mathrm{corr}(A_{j,\cdot},B_{j,\cdot}))$?
Seymour
