
Questions tagged [log-loss]

The tag has no usage guidance.

0 votes
0 answers
9 views

Extremely high logloss in binary classification problem [duplicate]

I have a binary classification problem that I am currently trying to tackle with XGBoost. This is a low signal-to-noise-ratio situation dealing with time series. Per this answer, "Dumb" log-...
Baron Yugovich
1 vote
1 answer
77 views

Logloss worse than random guessing with xgboost

I have a binary classification problem that I am currently trying to tackle with XGBoost. This is a low signal-to-noise-ratio situation dealing with time series. My out-of-sample AUC is 0.65, which is ...
Baron Yugovich
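A sketch of how this can happen (the numbers below are made up for illustration): AUC measures ranking and log loss measures calibration, so a model with perfect ranking can still lose to the $\ln 2 \approx 0.693$ coin-flip baseline if its probabilities are systematically off:

```python
import numpy as np
from sklearn.metrics import log_loss, roc_auc_score

# Toy labels and predictions: the ranking is perfect (AUC = 1.0),
# but every probability is pushed too high, so calibration is poor.
y = np.array([0, 0, 1, 1])
p = np.array([0.70, 0.80, 0.85, 0.90])

print(roc_auc_score(y, p))            # 1.0  -> perfect ranking
print(log_loss(y, p))                 # ~0.77 -> worse than a coin flip
print(log_loss(y, np.full(4, 0.5)))   # ~0.693 -> "random guessing" baseline
```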
2 votes
1 answer
41 views

SAS log likelihood fit statistic from SCORE statement in PROC LOGISTIC [closed]

When using a SCORE statement in PROC LOGISTIC in SAS, I can get fit statistics with FITSTAT. My response variable is binary. I want to get log likelihood, but looking at this documentation, I'm ...
cpahanson
4 votes
1 answer
274 views

Selecting the model by bootstrapping: AIC vs. log-loss?

I'm building a predictive model with potentially multiple predictors. To that end, I try different nested models, each with one more predictor than the previous one, and compare their AICs. The AIC ...
Igor F. • 9,418
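For context on the comparison being asked about, AIC and out-of-sample log-loss both target predictive log-likelihood: $$ \mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \text{log-loss} = -\frac{1}{n}\sum_{i=1}^{n} \ln \hat{p}(y_i \mid x_i), $$ where $k$ is the number of fitted parameters and $\hat{L}$ the maximized likelihood. AIC corrects in-sample log-likelihood with a complexity penalty, while bootstrapped or held-out log-loss measures predictive performance directly.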
1 vote
0 answers
22 views

Log base in Cross Entropy Loss [duplicate]

What is the base of the logarithm used in the cross-entropy loss (during backpropagation for multiclass classification)? Is it e, 2, or 10?
Sachin • 111
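For what it's worth, common implementations use the natural logarithm, and since logarithms of different bases differ only by a constant factor, the choice rescales the loss and its gradients without moving the optimum. A quick check against scikit-learn's log_loss:

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = [1, 0, 1]
y_prob = [0.9, 0.2, 0.7]

# Hand-computed cross entropy using the natural log (base e)
manual = -np.mean([np.log(0.9), np.log(1 - 0.2), np.log(0.7)])

print(np.isclose(manual, log_loss(y_true, y_prob)))  # True -> base e
```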
2 votes
0 answers
54 views

Probabilistic interpretation for log-loss

Suppose I am modelling a binary response variable $Y \sim B(p)$ as a function of $p$ features $X_1, \dots, X_p$ by means of an equation of the form $$ p(Y = 1 \, | \, X = x) = f(x,\theta), $$ where $\...
Othman El Hammouchi
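The standard connection, in the question's notation: for independent observations with $p_i = f(x_i, \theta)$, the Bernoulli log-likelihood is $$ \ln L(\theta) = \sum_{i} \left[ y_i \ln p_i + (1 - y_i) \ln(1 - p_i) \right], $$ so the log-loss is $-\tfrac{1}{n} \ln L(\theta)$ and minimizing it is exactly maximum-likelihood estimation of $\theta$.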
18 votes
2 answers
3k views

Why not use evaluation metrics as the loss function?

Most algorithms use their own loss function for optimization. But these loss functions are usually different from the metrics used for actual evaluation. For example, for building binary classification ...
etang • 1,007
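One concrete reason, sketched below with made-up numbers: metrics like accuracy are piecewise constant in the predictions, so small changes to the model give zero gradient, while log loss responds smoothly:

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss

y = np.array([0, 1, 1])
p1 = np.array([0.40, 0.70, 0.80])   # some predicted probabilities
p2 = np.array([0.40, 0.70, 0.90])   # one probability nudged upward

# Accuracy at threshold 0.5 is identical -> no signal for the optimizer
print(accuracy_score(y, (p1 > 0.5).astype(int)),
      accuracy_score(y, (p2 > 0.5).astype(int)))   # 1.0 1.0

# Log loss improves smoothly, giving a usable gradient
print(log_loss(y, p1), log_loss(y, p2))            # ~0.363 vs ~0.324
```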
3 votes
3 answers
194 views

Challenge an ICML Paper: For a given set of probability predictions and a log loss value, is the set of true labels giving such a loss unique?

Aggarwal's 2021 ICML paper "Label Inference Attacks from Log-loss Scores" seems to argue that the answer to the question in the title is "YES". The paper claims that, given ...
Dave • 65.1k
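A minimal counterexample sketch for the title question taken literally: when two predictions are symmetric around 0.5, distinct label vectors yield the identical log loss, so uniqueness cannot hold without further assumptions on the prediction values:

```python
import numpy as np

def binary_log_loss(y, p):
    """Mean negative log-likelihood for binary labels y and probabilities p."""
    y, p = np.asarray(y, dtype=float), np.asarray(p, dtype=float)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

p = [0.4, 0.6]                      # predictions symmetric around 0.5
print(binary_log_loss([1, 1], p))   # -(ln 0.4 + ln 0.6)/2 ~ 0.714
print(binary_log_loss([0, 0], p))   # same value for different labels
```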
2 votes
1 answer
2k views

Predicted Probability with XGBClassifier ranging only from 0.48 to 0.51 for either class

Why does my XGBClassifier predict probabilities only from 0.48 to 0.51 for either class? I'm very new to XGBoost, so any suggestions are greatly appreciated! Here's ...
yuan-ning • 123
6 votes
2 answers
1k views

"Dumb" log-loss for a binary classifier

I am trying to understand how I can best compare a classifier that I have trained and tuned against a "dumb" classifier, particularly in the context of binary classification with imbalanced ...
wissam124
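For reference, the usual "dumb" baseline predicts the observed prevalence for every case, and its log loss equals the entropy of the label distribution. A minimal sketch, assuming a 10% positive rate purely for illustration:

```python
import numpy as np
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
y = (rng.random(10_000) < 0.10).astype(int)        # imbalanced, ~10% positive

prevalence = y.mean()
p_dumb = np.full_like(y, prevalence, dtype=float)  # same prediction for all

# Any model worth deploying should beat this number on held-out data.
print(log_loss(y, p_dumb))  # ~0.325 = -(0.1*ln 0.1 + 0.9*ln 0.9) at p = 0.10
```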
1 vote
1 answer
291 views

Mean logarithmic square error

I have the following problem: I'm training a neural network against a set of output values (a regression problem). Those values range from -inf to inf and I can't normalize them, because they come ...
Daniel Wiczew
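One possible workaround (an assumption on my part, not necessarily what the asker adopted) is a sign-preserving log transform of the targets, which compresses large magnitudes of either sign and remains invertible:

```python
import numpy as np

def symlog(x):
    """Sign-preserving log transform: handles negative, zero, and huge values."""
    return np.sign(x) * np.log1p(np.abs(x))

def symexp(z):
    """Inverse of symlog, to map network outputs back to the original scale."""
    return np.sign(z) * np.expm1(np.abs(z))

x = np.array([-1e6, -3.0, 0.0, 3.0, 1e6])
print(symlog(x))          # compressed: roughly [-13.8, -1.39, 0, 1.39, 13.8]
print(symexp(symlog(x)))  # round-trips to the original values
```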
3 votes
1 answer
362 views

exp(log_softmax) vs softmax as neural network activation

I have read that log_softmax is more numerically stable than softmax, since it circumvents the division. I need softmax probabilities between 0 and 1 for my neural network loss function. So ...
CreeperPower storing
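A sketch of the standard approach: compute log_softmax with the max-shift (log-sum-exp) trick, then exponentiate to recover probabilities in (0, 1] without ever forming an unstable ratio:

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax via the log-sum-exp trick."""
    z = z - z.max()                     # shift so the largest logit is 0
    return z - np.log(np.exp(z).sum())  # log p_i = z_i - logsumexp(z)

logits = np.array([1000.0, 1001.0, 1002.0])  # naive softmax overflows here
p = np.exp(log_softmax(logits))              # probabilities, summing to 1

print(p)         # ~[0.090, 0.245, 0.665]
print(p.sum())   # 1.0
```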
1 vote
0 answers
48 views

Is half the log loss twice as good?

Let's say I have two different models based on the same dataset. Model A has a log loss of 0.30 on this dataset. Model B has a log loss of 0.60 on this dataset. If our scoring metric is log loss, is ...
WhiskeyHammer
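One interpretable rescaling, sketched below: $\exp(-\text{logloss})$ is the geometric mean of the probabilities the model assigned to the true outcomes, so 0.30 vs. 0.60 corresponds to roughly 0.74 vs. 0.55 per-observation probability on the correct class, rather than a simple factor of two:

```python
import numpy as np

for name, ll in [("Model A", 0.30), ("Model B", 0.60)]:
    # Geometric mean of per-observation likelihoods implied by the log loss
    print(name, np.exp(-ll))   # A: ~0.741, B: ~0.549
```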
1 vote
1 answer
1k views

Understanding multiclass log-loss

I'm trying to understand the multiclass log-loss as described in scikit-learn's documentation. The wording 'Let the true labels (Y) for a set of samples be encoded as a 1-of-K binary indicator matrix...', ...
N Blake • 579
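A compact restatement of the scikit-learn formula as code: one-hot encode the labels into the 1-of-K indicator matrix $Y$, then average $-\sum_k y_{i,k} \ln p_{i,k}$ over the samples, which picks out the log probability of each sample's true class:

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([0, 2, 1])         # class labels for 3 samples
P = np.array([[0.7, 0.2, 0.1],       # predicted probabilities,
              [0.1, 0.3, 0.6],       # each row sums to 1
              [0.2, 0.5, 0.3]])

Y = np.eye(3)[y_true]                # 1-of-K binary indicator matrix
manual = -np.mean(np.sum(Y * np.log(P), axis=1))

print(manual, log_loss(y_true, P))   # identical values, ~0.520
```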
0 votes
0 answers
139 views

If you're trying to match a vector $p$ to $x$, why doesn't a divisive loss function $\frac{p}{x} + \frac{x}{p}$ work better than negative log loss? [duplicate]

Suppose you had a classification problem where you are trying to predict a class label (e.g., $[0 \: 1 \: 0]^T$) with a model. One way to do this is to use log loss: $\Large \ell_{\log} = -\sum_i[y_i\...
Sam • 257
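A short note on why the divisive loss struggles with hard labels: writing $t = p_i/x_i$, each term $t + 1/t$ is minimized at $t = 1$, since $$ \frac{d}{dt}\left(t + \frac{1}{t}\right) = 1 - \frac{1}{t^2} = 0 \iff t = 1 \quad (t > 0), $$ so the minimizer is indeed $p_i = x_i$. But one-hot targets contain zeros, and any $x_i = 0$ makes $p_i/x_i$ undefined, whereas the log loss $-\sum_i x_i \ln p_i$ simply drops those entries and penalizes only the true class's predicted probability.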
