All Questions
Tagged with statistics machine-learning
661
questions
0
votes
0
answers
18
views
Does the probability flow ODE trajectory (in the context of diffusion models) represent a bijective mapping between any distribution and a Gaussian? [closed]
I have read several papers about diffusion models in the context of deep learning, especially this one.
As explained in the paper, by learning the score function $(\nabla \log(p_t(x)))$, probability ...
0
votes
0
answers
21
views
Sample complexity bounds of $L_S(h)$
Fix $\mathscr{H} \subset \mathscr{Y}^\mathscr{X}$ and a loss $\ell : \hat{Y} \times Y \to [0,1]$. Fix $S \in (\mathscr{X} \times \mathscr{Y})^{2m}$. Assume for now that $S$ is not random. Suppose we ...
0
votes
0
answers
23
views
Harmonizing Classification and Regression
I have recently been encountering explanations of classification and regression which start with discrete label values as defining the former and continuous label values as defining the latter. I have ...
1
vote
0
answers
64
views
Relation between values of $\xi_i$ and $\alpha_i$ in SVM?
I have a question about a property of the support vectors of an SVM, which is stated in subsection "12.2.1 Computing the Support Vector Classifier" of "The Elements of Statistical Learning" ...
0
votes
0
answers
10
views
Paired bootstrap test p-value formula in binary classification
Background
For a binary classification task, let $M(A, Z)$ denote an evaluation metric, such as accuracy, for classifier $A$ and test examples $Z.$ Then, let
$$
\delta(Z) = M(A, Z) - M(B, Z)
$$
denote ...
0
votes
0
answers
39
views
least squares minimum test error solution
Assume we want to learn a model $y = x^T \beta + \varepsilon$
where
$\beta \in \mathbb{R}^d$ is constant
$ x \in \mathbb{R}^d$ is the input vector with Gaussian distribution $\mathcal{N}(0,\Sigma_x)$ ...
2
votes
0
answers
20
views
Would like to validate whether the AUC equation is correct or not
I found a paper "Chapi, Kamran, et al. "A novel hybrid artificial intelligence approach for flood susceptibility assessment." Environmental modelling & software 95 (2017): 229-245" ...
0
votes
1
answer
16
views
Understanding the Reasoning Behind the Growth Function $m_{\mathcal{H}}(N)=2^N$ for Convex Sets
I am currently reading Learning from Data by Abu-Mostafa et al. and I am struggling to understand the reasoning behind the growth function $m_{\mathcal{H}}(N)=2^N$ for convex sets. Here is the ...
0
votes
1
answer
36
views
Estimating the conditional entropy of a discrete variable conditioning on continuous variable
I am doing a machine learning project and I am trying to select the best features by computing their mutual information and select the ones with the highest information gain. I was looking at this ...
0
votes
0
answers
31
views
How to Upper Bound the Spectral Norm of $\left(XX^T\right)^{-1}\left(XX^T\right)^{-1}X$?
I have an observation matrix $ X \in \mathbb{R}^{n \times n}$. Considering $XX^T$, this matrix can be seen as a correlation matrix between individuals, so it generally has elements close to the ...
1
vote
1
answer
41
views
How to expand the double integral in variational objective function?
I am reading John Paisley's lecture note on variational inference. In lecture 6 p.3, he wrote the objective function as follows:
Latex:
$$
\mathcal{L}(a', b', \mu', \Sigma') = \int_{0}^{\infty} \int_{...
0
votes
0
answers
21
views
How to understand the Bayesian likelihood function
$\mathcal{N}(W^T \cdot X, \beta^{-1})$
This is the likelihood distribution for Bayesian linear regression, right? So, the thing is, if I'm doing batch mode Bayesian regression, then:
Weights (W): Size:...
2
votes
1
answer
33
views
How to derive likelihood function
I have been struggling a lot with the concept of likelihood and I'd really appreciate it if someone could verify if my understanding is correct and give input.
If I understand this correctly, we pick ...
0
votes
0
answers
22
views
Bayesian linear regression about finding the likelihood
Pick a single data point $(x,t)$ and calculate and plot the likelihood for this single data point across all $w$ in your parameter space $(w_0 \times w_1)$ (for a single data point it is a univariate ...
1
vote
0
answers
36
views
Bayes classifiers with cost of misclassification
A minimum ECM classifier assigns the features $\underline{x}$ to class $t$ ($\delta(\underline{x}) = t$) if $\forall j \ne t$:
$$\sum_{k\ne t} c(t|k) f_k(\underline{x})p_k \le \sum_{k\ne ...