I've been reading the derivation for SVMs in the book by Chris Bishop (Pattern Recognition and Machine Learning). Equation (7.7) describes the Lagrangian. Note the $\frac{1}{2}$ in front of $||w||^2$, which was chosen arbitrarily.

Then, the derivatives with respect to $w$ and $b$ are set to zero producing equations (7.8) and (7.9).

\begin{align} L(w,b,a) = \frac{1}{2} ||w||^2 - \sum_{n=1}^N a_n (t_n (w^T \phi(x_n)+b)-1) \tag{7.7}\end{align}

Separating the terms,

\begin{align}L(w,b,a) = \frac{1}{2}||w||^2 -\sum_{n=1}^N a_nt_nw^T\phi(x_n) -b\sum_{n=1}^N a_nt_n+\sum_{n=1}^N a_n\tag{7.7a}\end{align}

\begin{align} w = \sum_{n=1}^N a_n t_n \phi(x_n) \tag{7.8}\end{align}

\begin{align} 0 = \sum_{n=1}^N a_n t_n \tag{7.9}\end{align}
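
For reference, these follow from setting the derivatives of (7.7) with respect to $w$ and $b$ to zero:

\begin{align} \frac{\partial L}{\partial w} = w - \sum_{n=1}^N a_n t_n \phi(x_n) = 0, \qquad \frac{\partial L}{\partial b} = -\sum_{n=1}^N a_n t_n = 0 \end{align}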

Then, he substitutes equation (7.8) into (7.7).

Note that as a direct consequence of (7.8) we get:

$$||w||^2 = w^Tw = \sum_{n=1}^N \sum_{m=1}^N a_n a_m t_n t_m \phi(x_n)^T \phi(x_m) = \sum_{n=1}^N a_nt_n w^T\phi(x_n)\tag{7.8a}$$

Substituting into (7.7a), the first two terms combine as $\frac{1}{2}w^Tw - w^Tw = -\frac{1}{2}w^Tw = -\frac{1}{2}\sum_{n=1}^N \sum_{m=1}^N a_n a_m t_n t_m \phi(x_n)^T \phi(x_m)$, while the term $-b\sum_{n=1}^N a_nt_n$ vanishes by (7.9). This reduces the Lagrangian to its dual form:

$$L(a) = \sum_{n=1}^N a_n -\frac{1}{2}\sum_{n=1}^N \sum_{m=1}^N a_n a_m t_n t_m \phi(x_n)^T \phi(x_m)$$
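
As a sanity check (my own numpy sketch, not from the book), this substitution can be verified numerically: for any multipliers $a_n$ satisfying (7.9) and $w$ given by (7.8), evaluating (7.7) matches the dual above for any $b$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 6, 3
Phi = rng.normal(size=(N, D))        # rows are feature vectors phi(x_n)
t = rng.choice([-1.0, 1.0], size=N)  # targets t_n in {-1, +1}

# Arbitrary multipliers, projected so that sum_n a_n t_n = 0 (eq. 7.9).
# Positivity is not needed for the algebraic identity checked here.
a = rng.uniform(0.1, 1.0, size=N)
a -= t * (a @ t) / N                 # t @ t == N since t_n = +/-1

w = (a * t) @ Phi                    # eq. (7.8)
b = rng.normal()                     # arbitrary: its term vanishes by (7.9)

# Primal Lagrangian (7.7) evaluated at this w and b
primal = 0.5 * w @ w - np.sum(a * (t * (Phi @ w + b) - 1.0))

# Dual form: sum_n a_n - 1/2 sum_nm a_n a_m t_n t_m phi(x_n)^T phi(x_m)
K = Phi @ Phi.T                      # Gram matrix
dual = a.sum() - 0.5 * (a * t) @ K @ (a * t)

print(np.isclose(primal, dual))      # True
```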

Herein lies my question: the only reason we are left with $-\frac{1}{2}$ is the arbitrary $\frac{1}{2}$ chosen to accompany $||w||^2$. If we had chosen $1$ instead, the quadratic term would cancel out completely, fundamentally changing the Lagrangian.


1 Answer

The answer occurred to me as I was writing the question, but since I had already put a lot of work into it, I decided to post it and answer it for my own reference. The error in my thinking was assuming that (7.8) would remain unchanged if the multiplier accompanying $||w||^2$ (the objective function) were changed.
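
To spell it out (my own working, not from the book): with a general coefficient $c > 0$ on the objective,

$$L(w,b,a) = c||w||^2 - \sum_{n=1}^N a_n (t_n (w^T \phi(x_n)+b)-1),$$

setting $\partial L/\partial w = 0$ now gives $2cw = \sum_{n} a_n t_n \phi(x_n)$, i.e.

$$w = \frac{1}{2c}\sum_{n=1}^N a_n t_n \phi(x_n).$$

The quadratic term then contributes $\frac{1}{4c}\sum_n\sum_m(\cdot)$ and the cross term $-\frac{1}{2c}\sum_n\sum_m(\cdot)$, which combine to $-\frac{1}{4c}\sum_n\sum_m(\cdot)$, so the dual becomes

$$L(a) = \sum_{n=1}^N a_n - \frac{1}{4c}\sum_{n=1}^N \sum_{m=1}^N a_n a_m t_n t_m \phi(x_n)^T \phi(x_m).$$

With $c = 1$ the quadratic term does not cancel; it merely picks up a factor of $\frac{1}{4}$ instead of $\frac{1}{2}$.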
