
Questions tagged [f1]

The F1 score, the harmonic mean of precision and recall, is a popular criterion for evaluating binary decision algorithms and classification models.

1 vote · 1 answer · 74 views

Logloss worse than random guessing with xgboost

I have a binary classification problem that I am currently trying to tackle with xgboost. This is a low signal-to-noise ratio situation dealing with time series. My out-of-sample AUC is 0.65, which is ...
asked by Baron Yugovich
0 votes · 0 answers · 10 views

Wilcoxon's Signed-Rank Test in the context of 2 algorithms and 1 domain

I'm trying to understand whether my analysis for a problem is in the right direction. I have 2 algorithms (3D object detectors) that I've applied to the same dataset to obtain TPs, FPs, and FNs for each ...
asked by neoavalon
0 votes · 0 answers · 35 views

Model evaluation approach and how it affects the performance of the model

The task I am working on is supervised video summarization, where the model tries to predict whether a video frame is important or not using its features, with labels given as annotations of frame scores. ...
asked by moha tech
2 votes · 1 answer · 112 views

Comparing probability threshold graphs for F1 score for different models

Below are two plots, side-by-side, for an imbalanced dataset. We have a very large imbalanced dataset that we are processing/transforming in different ways. After each transformation, we run an ...
asked by Ashok K Harnal
0 votes · 1 answer · 29 views

What is the F1 score for this diagram?

I have a Venn diagram that represents a dataset's predictions for identifying whether our products are classified under the "A41" standard or not. The blue circle represents a machine learning model ...
asked by asmgx
0 votes · 0 answers · 22 views

F1 score mismatch with publication

I'm trying to reproduce the results of the baseline model from the SEP28k paper, but I struggle to get the details. Most strikingly, the F1 score for random prediction doesn't match the paper. Here are the ...
asked by marekjg
1 vote · 1 answer · 16 views

Is there an equivalent of Yates' correction for confusion matrix-derived metrics?

Given the following table of predictions vs. actual states: ...
asked by Bryan
2 votes · 1 answer · 100 views

Binary classification metrics - Combining sensitivity and specificity?

The harmonic mean of precision and recall (the F1 score) is a common metric for evaluating binary classification. It is useful because it strikes a balance between precision (penalizing FPs) and recall (penalizing FNs). For ...
asked by usual me
1 vote · 2 answers · 145 views

Why don't we use the harmonic mean of sensitivity and specificity?

There is this question on the F1 score, asking why we compute the harmonic mean of precision and recall rather than their arithmetic mean. There were good arguments in the answers in favor of the ...
asked by user209974
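The two combinations discussed in the entries above are easy to compare side by side. This is a minimal sketch of my own (the function names and counts are illustrative): on an imbalanced sample, F1 (harmonic mean of precision and recall) and the harmonic mean of sensitivity and specificity can diverge noticeably, since only the latter uses true negatives.

```python
def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

def compare(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision   = tp / (tp + fp)
    f1 = harmonic_mean(precision, sensitivity)
    hm_sens_spec = harmonic_mean(sensitivity, specificity)
    return f1, hm_sens_spec

# Imbalanced example: 10 positives, 90 negatives
f1, hm = compare(tp=8, fp=9, fn=2, tn=81)
print(round(f1, 3), round(hm, 3))  # the two summaries disagree
```

Here precision is dragged down by the 9 false positives (F1 ≈ 0.593), while specificity stays high (harmonic mean ≈ 0.847), which is the kind of behavior these two questions are probing.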
0 votes · 0 answers · 505 views

F1 score for validation and testing datasets is different

I have the following F1 score function that I use for the model, both as part of the metrics during training and during prediction: ...
asked by Avv
8 votes · 2 answers · 430 views

Calculating the Brier or log score from the confusion matrix, or from accuracy, sensitivity, specificity, F1 score, etc.

Suppose I have a confusion matrix, or alternatively any one or more of accuracy, sensitivity, specificity, recall, F1 score or friends for a binary classification problem. How can I calculate the ...
asked by Stephan Kolassa
22 votes · 2 answers · 2k views

Academic reference on the drawbacks of accuracy, F1 score, sensitivity and/or specificity

Accuracy, as a KPI for assessing binary classification models, has major drawbacks (see: Why is accuracy not the best measure for assessing classification models?). The exact same issues also plague the F1 ...
asked by Stephan Kolassa
1 vote · 0 answers · 64 views

Statistical significance of performance difference in classification models

Is it possible to assign a p-value to the mean performance difference among three classification models? The models use the same data, the same random seed, and 10-fold cross-validation. Model A has a ...
asked by Adam_G
2 votes · 1 answer · 224 views

Confidence interval of the average of F1 score samples

I have a number of individual F1 score samples, and right now I am measuring the average F1 score across this group. However, I would also like to present a confidence interval on it. It's a continuous ...
asked by SriK
3 votes · 1 answer · 204 views

Singular beta in the F-beta vs. threshold score?

Consider this plot of the $F_\beta$ score for different values of $\beta$. I have a hard time getting an intuition as to why they intersect at the same point. (Cf. this blog post.) In other words, why ...
asked by Tfovid
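On the last entry: the common intersection falls at the threshold where precision equals recall, because $F_\beta = (1+\beta^2)PR/(\beta^2 P + R)$ reduces to $P$ whenever $P = R$, independent of $\beta$. A small numerical sketch of my own (the precision/recall values are made up for illustration):

```python
def f_beta(precision, recall, beta):
    """F-beta score: weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Whenever precision == recall, F-beta equals that common value for every beta,
# so all the curves pass through the same point on a threshold plot.
p = r = 0.7
print([round(f_beta(p, r, b), 3) for b in (0.5, 1, 2)])  # [0.7, 0.7, 0.7]
```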
