Questions tagged [metric]
A metric is a way to evaluate the performance of a machine learning model. Depending on the task, different metrics may be used.
249
questions
1
vote
0
answers
9
views
Use a metric that is not available in the list of metrics for xgboost
Working in R.
I am following this post on Stack Overflow.
I am training an xgboost model and I want to use another metric that is not in the list of metrics we can choose for the eval_metric parameter.
I ...
0
votes
0
answers
16
views
How to choose segment in Grouped AUC metric?
Background
In binary classification, AUC is a common metric. However, Group-AUC performs better in some scenarios, for example when AUC is grouped by user in recommendation systems.
In the below examples, I ...
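As a rough illustration of the grouped AUC idea in the question above, the sketch below computes one AUC per group (for example per user) and averages them, weighted by the number of samples in each group. The weighting scheme, the choice to skip groups that contain only one class, and the names `auc` and `grouped_auc` are assumptions made for this example; the question does not specify them.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when a class is missing.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grouped_auc(groups, labels, scores):
    """Sample-weighted mean of per-group AUCs (assumed weighting scheme)."""
    by_group = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        by_group[g][0].append(y)
        by_group[g][1].append(s)

    total, weight = 0.0, 0
    for ys, ss in by_group.values():
        group_auc = auc(ys, ss)
        if group_auc is None:
            continue  # a group with a single class has no defined AUC
        total += group_auc * len(ys)
        weight += len(ys)
    return total / weight if weight else None


# Hypothetical toy data: three users, the last one has only positives.
groups = ["u1", "u1", "u2", "u2", "u2", "u3"]
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.1, 0.2, 0.3, 0.1, 0.5]
gauc = grouped_auc(groups, labels, scores)
```

How the groups are chosen changes the result, since each group only contributes comparisons between its own positives and negatives.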
0
votes
0
answers
14
views
Keras siamese model history is empty
I am making a siamese neural network with triplet loss using Keras, and have encountered an odd problem. I tried saving my history twice: once in a callback (saved as a dictionary), and once after ...
0
votes
0
answers
14
views
A/B test question - How to test significance for metrics that are not the unit of randomization
We're running an A/B test on an ecommerce website. The feature being launched is not for the "users" that come to buy products on the website but is rather for "suppliers" who add ...
0
votes
0
answers
31
views
What are the most important evaluation metrics for anomaly segmentation?
When people talk about anomaly segmentation models, they often mention evaluation metrics like F1 score, AP, AUROC, and AUPRO. But which one really matters most when comparing models, and why? I'm ...
0
votes
0
answers
10
views
Commonly used metric in NLP literature to compare ranked weighted results with variable importance for top-k results
I have two different search engines that always return the same results but in different orders. The results consist of websites along with confidence scores, which range from 100 to 10,000. The ...
0
votes
0
answers
18
views
How to make my validation plots more stable and improve R2 metric?
I'm working on predicting 4 numeric values based on a signal spectrum (the spectrum is represented as an array of 800 numeric values on a scale of 0 to 1). The input values are scaled using StandardScaler. ...
1
vote
0
answers
52
views
Bad metric results due to strong class imbalance in credit card classification
Hi, I'm currently writing my bachelor's thesis and am stuck at some steps.
I've developed a few ML models (XGBoost, (Balanced) Random Forest, ElasticNet, ...) on an extremely imbalanced ...
0
votes
0
answers
12
views
Which analog of the F1 score can I use in this case?
I am training a CNN segmentation model and I need some analog of the F1 score.
So, we have GT as red rectangles (called "red") and Pred as blue rectangles (called "blue").
It is clear ...
0
votes
0
answers
28
views
How to read the "predicted_true" Metric of an Azure ML experiment?
I followed along with Explore Automated Machine Learning in Azure Machine Learning,
which had me create a regression experiment using data from https://aka.ms/bike-rentals (731 samples; 12 features; 1 ...
0
votes
0
answers
17
views
Calculating Readmission Metrics in Python
I need to compute some hospital readmission variables using Python.
I need to compute the following metrics:
Simple Readmission:
Compute variables for different periods: 3, 7, 14, 30 and 45 days ...
0
votes
0
answers
7
views
How to label a dataset of text pairs to use it as a universal one for calculating the precision@k metric for different models?
I am facing a semantic search problem. I am fine-tuning different NLU models and I want to use precision@k as my main metric. Is it possible to label a dataset of text pairs to use it as a universal ...
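For reference, precision@k can be sketched as below. The example assumes relevance labels are stored per query as a set of relevant document IDs, independent of any model, so that rankings from different models can be scored against the same labels. The function name `precision_at_k` and the IDs are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k  # divides by k even if fewer than k items were returned


# Hypothetical labels for one query, shared by every model under comparison.
relevant = {"d1", "d2"}

# Hypothetical rankings produced by two different models for that query.
model_a_ranking = ["d3", "d1", "d7", "d2"]
model_b_ranking = ["d1", "d2", "d3", "d7"]

p_a = precision_at_k(model_a_ranking, relevant, k=2)
p_b = precision_at_k(model_b_ranking, relevant, k=2)
```

One caveat with fixed labels is that a model may return a document nobody labeled; this sketch counts such documents as not relevant.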
0
votes
0
answers
57
views
Is this the appropriate way to calculate a multiclass reliability diagram for model calibration?
I'm trying to generalize reliability diagrams [1] to a multiclass classifier and implement that using PyTorch and pytorch-metrics.
So far so good, but I'm somewhat confused about the definition of ...
1
vote
1
answer
33
views
Is it bad to average several MAEs calculated from chunks of a big test dataset?
In my regression problem, I am using Mean Absolute Error (MAE) as a metric for my network. My test dataset is too big to fit in memory, so I am reading the test dataset in chunks and then Keras' ...
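The arithmetic behind this question can be checked with a small sketch. Assuming the chunks may differ in size, a plain mean of per-chunk MAEs generally differs from the MAE over the whole test set, while a mean weighted by chunk size reproduces it exactly. The helper names and the toy numbers are made up for the example.

```python
def mae(y_true, y_pred):
    """Mean absolute error of one chunk."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def combine_chunk_maes(chunk_maes, chunk_sizes):
    """Size-weighted mean of per-chunk MAEs; equals the MAE over all samples."""
    total_abs_error = sum(m * n for m, n in zip(chunk_maes, chunk_sizes))
    return total_abs_error / sum(chunk_sizes)


# Hypothetical test set split into two chunks of unequal size.
chunks = [
    ([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]),  # absolute errors: 1, 0, 2
    ([10.0], [4.0]),                     # absolute error: 6
]

chunk_maes = [mae(t, p) for t, p in chunks]
chunk_sizes = [len(t) for t, _ in chunks]

unweighted = sum(chunk_maes) / len(chunk_maes)
weighted = combine_chunk_maes(chunk_maes, chunk_sizes)

# MAE computed in one pass over all samples, for comparison.
all_true = [v for t, _ in chunks for v in t]
all_pred = [v for _, p in chunks for v in p]
global_mae = mae(all_true, all_pred)
```

When every chunk has the same number of samples, the plain mean and the weighted mean coincide.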
0
votes
0
answers
17
views
Survival analysis metric on time series data
I created a model that estimates the probability of failure of an asset (based on the Weibull CDF, with a value between 0 and 1). I have a data point every minute.
I want to measure the model's success based on ...