$\begingroup$

While studying the Hard-SVM topic in the Shalev-Shwartz book, I came across the following proof for the distance between a point and a hyperplane:

$$\begin{aligned}
&\min\{\|\pmb x-\pmb v\| : \langle\pmb w,\pmb v\rangle + b = 0\}\\
&\text{Taking } \pmb v = \pmb x - (\langle \pmb w, \pmb x \rangle + b)\pmb w \text{ we have that}\\
&\langle\pmb w,\pmb v\rangle + b = \langle\pmb w,\pmb x\rangle - (\langle \pmb w, \pmb x \rangle + b)\|\pmb w\|^2 + b = 0,\\
&\text{and}\\
&\|\pmb x-\pmb v\| = |\langle \pmb w,\pmb x \rangle + b|\,\|\pmb w\| = |\langle \pmb w,\pmb x\rangle + b|
\end{aligned}$$

The above is a proof that the distance between the point $\pmb x$ and the hyperplane defined by $(\pmb w, b)$, where $\|\pmb w\|=1$, is $|\langle \pmb w, \pmb x \rangle+b|$.

I can derive the same result by taking a point on the plane, say $\pmb y$, and then taking the orthogonal projection of $\pmb x - \pmb y$ onto the normal vector of the plane, but I am not able to understand the proof provided in the book. I would greatly appreciate it if anyone could explain the above proof.

PS: I understand the first line in the proof points towards finding a point $\pmb v$ on the plane such that the distance between $\pmb x \ \text{and }\ \pmb v$ is minimized.

Thanks

$\endgroup$

1 Answer

$\begingroup$

The proof you provided is not complete. It's only the first part of it.

The distance between a point $\textbf{x}$ and a hyperplane $H$ defined by $(\textbf{w},b)$ is defined by:

$$ d(\textbf{x},H) = \underset{\textbf{v} \in H}{\text{min }} \|\textbf{x}-\textbf{v}\|\ $$

That is, one is trying to find the point $\textbf{v}$ in the hyperplane that minimises the distance to the point $\textbf{x}$. The proof is done by taking any point $\textbf{u} \in H$ and showing that:

$$ \| \textbf{x}-\textbf{u} \| \geq \|\textbf{x}-\textbf{v}\| = |\langle \textbf{w}, \textbf{x} \rangle +b | $$ where $\textbf{v} = \textbf{x} - (\langle \textbf{x}, \textbf{w} \rangle +b) \textbf{w}$.

That is, the point $\textbf{v}$ that we constructed is the one that minimises the distance to the point $\textbf{x}$, and hence $\|\textbf{x} - \textbf{v}\|$ is the distance between the hyperplane and $\textbf{x}$.
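
For completeness, here is a sketch of why the inequality holds. It assumes $\|\textbf{w}\| = 1$ and that $\textbf{u}$ is any point with $\langle \textbf{w}, \textbf{u} \rangle + b = 0$; it is the usual Pythagoras-style argument and may differ in presentation from the book.

$$\begin{aligned}
\|\textbf{x}-\textbf{u}\|^2 &= \|(\textbf{x}-\textbf{v}) + (\textbf{v}-\textbf{u})\|^2 \\
&= \|\textbf{x}-\textbf{v}\|^2 + \|\textbf{v}-\textbf{u}\|^2 + 2\langle \textbf{x}-\textbf{v}, \textbf{v}-\textbf{u} \rangle \\
&\geq \|\textbf{x}-\textbf{v}\|^2 + 2(\langle \textbf{w}, \textbf{x} \rangle + b)\,\langle \textbf{w}, \textbf{v}-\textbf{u} \rangle \\
&= \|\textbf{x}-\textbf{v}\|^2
\end{aligned}$$

The third line drops the non-negative term $\|\textbf{v}-\textbf{u}\|^2$ and uses $\textbf{x}-\textbf{v} = (\langle \textbf{w}, \textbf{x} \rangle + b)\textbf{w}$. The last line uses $\langle \textbf{w}, \textbf{v} \rangle = \langle \textbf{w}, \textbf{u} \rangle = -b$, so $\langle \textbf{w}, \textbf{v}-\textbf{u} \rangle = 0$.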

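Here I used the same notation as in the book and skipped the calculations since they are provided there. The construction of $\textbf{v}$ is based on vector addition. Think of $\textbf{x}$ as the vector from the origin to the point $\textbf{x}$. The shortest path from $\textbf{x}$ to the plane runs along the normal $\textbf{w}$, so write $\textbf{v} = \textbf{x} - t\textbf{w}$ and choose $t$ so that $\textbf{v}$ lies on the plane: $\langle \textbf{w}, \textbf{v} \rangle + b = \langle \textbf{w}, \textbf{x} \rangle - t\|\textbf{w}\|^2 + b = 0$, which gives $t = \langle \textbf{w}, \textbf{x} \rangle + b$ because $\|\textbf{w}\| = 1$. Thus $(\langle \textbf{w}, \textbf{x} \rangle+b) \textbf{w}$ is the vector from the orthogonal projection $\textbf{v}$ to $\textbf{x}$, and the distance we're looking for, i.e., the distance between $\textbf{x}$ and its orthogonal projection, is the norm of this vector.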
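
If it helps, the construction can also be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`dot`, `norm`, `project_onto_hyperplane`) and the example values of $\textbf{w}$, $b$, $\textbf{x}$ and $\textbf{u}$ are made up for illustration and do not come from the book. It assumes $\|\textbf{w}\| = 1$.

```python
import math

def dot(a, b):
    """Inner product <a, b> of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Euclidean norm ||a||."""
    return math.sqrt(dot(a, a))

def project_onto_hyperplane(x, w, b):
    """Return v = x - (<w, x> + b) w, assuming ||w|| = 1."""
    t = dot(w, x) + b
    return [xi - t * wi for xi, wi in zip(x, w)]

# Hypothetical example values (not from the book).
w = [0.6, 0.8]   # unit normal: 0.6^2 + 0.8^2 = 1
b = -1.0
x = [3.0, 4.0]

v = project_onto_hyperplane(x, w, b)

# v should satisfy the hyperplane equation <w, v> + b = 0.
on_plane = dot(w, v) + b

# ||x - v|| should agree with the closed form |<w, x> + b|.
distance = norm([xi - vi for xi, vi in zip(x, v)])
formula = abs(dot(w, x) + b)

# Another point on the plane: 0.6 * u[0] + 0.8 * 0 - 1 = 0.
u = [1.0 / 0.6, 0.0]
distance_to_u = norm([xi - ui for xi, ui in zip(x, u)])

print(on_plane, distance, formula, distance_to_u)
```

Up to floating-point rounding, `on_plane` should be zero, `distance` should match `formula`, and `distance_to_u` should not be smaller than `distance`.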

$\endgroup$
  • $\begingroup$ Thanks for your reply, as I mentioned I do understand the first equation in the proof, finding a point $\mathbf v$ s.t. $\|\mathbf x - \mathbf v\|$ is minimum compared to any other point $\mathbf u$ on the hyperplane, this is essentially the orthogonal distance from the plane. However, can you explain the construction of $\mathbf v$, especially the second and third equations in my question above? $\endgroup$
    – ASR
    Commented Jan 4, 2022 at 7:41
  • $\begingroup$ I modified the answer. For a graphical representation see, e.g.: en.wikibooks.org/wiki/Linear_Algebra/…. As for the third equation, just plug in the definition of $\textbf{v}$ and use the standard rules for inner products. $\endgroup$
    – Saleh
    Commented Jan 12, 2022 at 14:45
