I am taking an advanced machine learning class, and in the class notes I came across a notation I did not recognize: the notation for the binary loss function. (Please note that it is not binary cross-entropy loss.)
In the context of binary classification, the binary loss function is defined as
$$ (l \circ f)(x_i, y_i) = \mathbf{1}_{[f(x_i) \neq y_i]} $$
where
$x_i$ = feature vector of the $i$-th sample
$y_i$ = true label of the $i$-th sample
$f$ = function representing the machine learning model
$l$ = loss function
Here, $f(x_i) \in \{0, 1\}$ and $y_i \in \{0, 1\}$, so it is a binary classification problem. The notes also state the following:
If $f(x_i) = y_i$, then $\mathbf{1}_{[f(x_i) \neq y_i]} = 1$; otherwise, $\mathbf{1}_{[f(x_i) \neq y_i]} = 0$.
I would interpret the indicator notation as: if the model's output matches the true label, the loss is 0; otherwise, the loss is 1. However, if this interpretation is correct, shouldn't the statement in the notes be the other way around? That is, shouldn't it be
If $f(x_i) = y_i$, then $\mathbf{1}_{[f(x_i) \neq y_i]} = 0$; otherwise, $\mathbf{1}_{[f(x_i) \neq y_i]} = 1$.
After all, the loss should be 0 whenever the model's output matches the true label, and higher when it does not.
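To make my interpretation concrete, here is a minimal Python sketch of how I would compute this loss. The function name `zero_one_loss` and the sample values are my own, just for illustration; they are not from the notes:

```python
# Minimal sketch of my interpretation of the binary (0-1) loss.
# The indicator 1_[f(x_i) != y_i] should be 1 exactly when the
# prediction and the true label disagree, and 0 when they agree.

def zero_one_loss(prediction: int, label: int) -> int:
    """Return 1 if prediction != label, else 0 (indicator of a mismatch)."""
    return int(prediction != label)

# Under my interpretation: loss is 0 on a match, 1 on a mismatch.
assert zero_one_loss(prediction=1, label=1) == 0  # f(x_i) = y_i  -> loss 0
assert zero_one_loss(prediction=0, label=1) == 1  # f(x_i) != y_i -> loss 1
print("zero-one loss behaves as I expect")
```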
I am just trying to reconcile my interpretation with the class notes. If my interpretation is correct, then there is a mistake in the class notes; otherwise, my interpretation is wrong.