Questions tagged [metric]

A metric is a way to evaluate the performance of a machine learning model. Depending on the task, different metrics may be used.

1 vote
0 answers
9 views

Use a metric that is not available in the list of metrics for xgboost

Working in R. I am following this post on Stack Overflow. I am training an xgboost model and I want to use another metric that is not in the list of metrics we can choose for the eval_metric parameter. I ...
Camillionnaire's user avatar
0 votes
0 answers
16 views

How to choose segment in Grouped AUC metric?

Background: In binary classification, AUC is a common metric. However, Group-AUC performs better in some scenarios, for example when AUC is grouped by user in recommendation systems. In the examples below, I ...
Travis's user avatar
  • 111
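
A minimal Python sketch of one common way to compute a grouped AUC. It assumes each group is a user, that per-group AUCs are weighted by the number of samples in the group, and that groups containing only one class are skipped because AUC is undefined there. The function names `auc` and `grouped_auc` and the sample data are hypothetical, not taken from the question.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Returns None when only one class is present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def grouped_auc(groups, labels, scores):
    """Sample-count-weighted mean of per-group AUCs (weighting is an assumption)."""
    by_group = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        by_group[g][0].append(y)
        by_group[g][1].append(s)

    weighted_sum, weight_total = 0.0, 0
    for ys, ss in by_group.values():
        group_auc = auc(ys, ss)
        if group_auc is None:
            continue  # skip single-class groups
        weighted_sum += group_auc * len(ys)
        weight_total += len(ys)
    return weighted_sum / weight_total if weight_total else None


# Hypothetical data: two users with different score distributions.
users = ["u1", "u1", "u2", "u2", "u2"]
clicks = [1, 0, 1, 0, 0]
preds = [0.9, 0.1, 0.2, 0.8, 0.1]
print(auc(clicks, preds))                 # global AUC over all samples
print(grouped_auc(users, clicks, preds))  # AUC computed within each user
```

The choice of grouping key (the "segment") changes which pairs are compared: only positives and negatives inside the same group are ranked against each other, so the global and grouped values can differ noticeably.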
0 votes
0 answers
14 views

Keras siamese model history is empty

I am making a siamese neural network with triplet loss using keras, and have encountered an odd problem. I tried saving my history twice: once in a callback (saved as a dictionary), and once after ...
Rotem Ton's user avatar
0 votes
0 answers
14 views

A/B test question - How to test significance for metrics that are not the unit of randomization

We're running an A/B test on an e-commerce website. The feature being launched is not for the "users" that come to buy products on the website but is rather for "suppliers" who add ...
helloworld's user avatar
0 votes
0 answers
31 views

What are the most important evaluation metrics for anomaly segmentation?

When people talk about anomaly segmentation models, they often mention evaluation metrics like F1 score, AP, AUROC, and AUPRO. But which one really matters most when comparing models, and why? I'm ...
Mosh Geb's user avatar
0 votes
0 answers
10 views

Commonly used metric in NLP literature to compare ranked weighted results with variable importance for top-k results

I have two different search engines that always return the same results but in different orders. The results consist of websites along with confidence scores, which range from 100 to 10,000. The ...
hanugm's user avatar
  • 157
0 votes
0 answers
18 views

How to make my validation plots more stable and improve R2 metric?

I'm working on predicting 4 numeric values based on a signal spectrum (the spectrum is represented as an array of 800 numeric values on a scale of 0 to 1). The input values are scaled using StandardScaler. ...
mkow93's user avatar
  • 1
1 vote
0 answers
52 views

Poor metric results due to strong class imbalance in credit card classification

Hi, I'm currently writing my bachelor's thesis and am stuck at a few steps. I've developed a few ML models (XGBoost, (Balanced) Random Forest, ElasticNet, ...) on an extremely imbalanced ...
user159373's user avatar
0 votes
0 answers
12 views

Which analog of the F1 score can I use in this case?

I am training a CNN segmentation model and I need some analog of the F1 score. So, we have GT as red rectangles (called "red") and Pred as blue rectangles (called "blue"). It is clear ...
sixtytrees's user avatar
0 votes
0 answers
28 views

How to read the "predicted_true" Metric of an Azure ML experiment?

I followed along with Explore Automated Machine Learning in Azure Machine Learning, which had me create a regression experiment using data from https://aka.ms/bike-rentals (731 samples; 12 features; 1 ...
joseville's user avatar
  • 143
0 votes
0 answers
17 views

Calculating Readmission Metrics in Python

I need to compute some hospital readmission variables using Python. I would need to compute the following metrics: Simple Readmission: compute variables for different periods: 3, 7, 14, 30 and 45 days ...
Carmen Morales's user avatar
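
A rough Python sketch of per-window readmission flags, using only the standard library. The excerpt is truncated, so the definition used here is an assumption: a stay counts as "readmitted within N days" if the same patient's next admission starts between 0 and N days after that stay's discharge. The function name `readmission_flags`, the tuple layout, and the sample dates are hypothetical.

```python
from datetime import date


def readmission_flags(stays, windows=(3, 7, 14, 30, 45)):
    """Flag, per stay and per window, whether the patient was readmitted.

    stays: iterable of (patient_id, admit_date, discharge_date) tuples.
    Returns a list of (patient_id, discharge_date, {window: bool}).
    """
    by_patient = {}
    for pid, admit, discharge in stays:
        by_patient.setdefault(pid, []).append((admit, discharge))

    results = []
    for pid, visits in by_patient.items():
        visits.sort()  # chronological order by admission date
        for i, (_admit, discharge) in enumerate(visits):
            gap = None  # days from this discharge to the next admission
            if i + 1 < len(visits):
                gap = (visits[i + 1][0] - discharge).days
            flags = {w: gap is not None and 0 <= gap <= w for w in windows}
            results.append((pid, discharge, flags))
    return results


# Hypothetical stays for two patients.
stays = [
    ("p1", date(2024, 1, 1), date(2024, 1, 5)),
    ("p1", date(2024, 1, 10), date(2024, 1, 12)),
    ("p2", date(2024, 2, 1), date(2024, 2, 3)),
]
for row in readmission_flags(stays):
    print(row)
```

Other definitions (for example, excluding planned admissions or transfers) would need extra columns that the excerpt does not mention.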
0 votes
0 answers
7 views

How to label a dataset of text pairs to use it as a universal one for calculating the precision@k metric for different models?

I am facing a semantic search problem. I am fine-tuning different NLU models and I want to use precision@k as my main metric. Is it possible to label a dataset of text pairs to use it as a universal ...
Ir8_mind's user avatar
  • 183
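
For reference, a minimal Python sketch of precision@k for a single query. It assumes relevance labels are binary and that the denominator is always k, even when the model returns fewer than k results; other conventions exist. The function name `precision_at_k` and the document IDs are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    relevant = set(relevant_ids)
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / k  # assumption: divide by k, not by len(top_k)


# Hypothetical ranking from one model and a shared set of relevance labels.
ranking = ["d3", "d1", "d7", "d2"]
labeled_relevant = {"d1", "d2"}
print(precision_at_k(ranking, labeled_relevant, k=3))
print(precision_at_k(ranking, labeled_relevant, k=4))
```

Because the metric only looks up each returned ID in the labeled set, the same labels can score any model, provided every item a model can return in its top k has been labeled.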
0 votes
0 answers
57 views

Is this the appropriate way to calculate a multiclass reliability diagram for model calibration?

I'm trying to generalize reliability diagrams [1] to a multiclass classifier and implement that using pytorch and pytorch-metrics. So far so good but I'm somewhat confused about the definition of ...
Nirro's user avatar
  • 101
1 vote
1 answer
33 views

Is it bad to average several MAEs calculated from chunks of a big test dataset?

In my regression problem, I am using Mean Absolute Error (MAE) as a metric for my network. My test dataset is too big to fit in memory, so I am reading the test dataset in chunks and then Keras' ...
ihavenoidea's user avatar
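
A short Python sketch of the arithmetic behind this question. Since MAE is a sum of absolute errors divided by the sample count, the MAE over the full test set equals the average of the chunk MAEs weighted by chunk size; a plain unweighted average matches only when all chunks have the same size. The function names and the numbers are hypothetical, and no Keras call is used here.

```python
def mae(y_true, y_pred):
    """Mean absolute error over one batch of values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def combine_chunk_maes(chunk_maes, chunk_sizes):
    """Size-weighted average of per-chunk MAEs (equals the global MAE)."""
    total = sum(chunk_sizes)
    return sum(m * n for m, n in zip(chunk_maes, chunk_sizes)) / total


# Hypothetical test set split into chunks of unequal size (4 and 1).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 2.0, 3.0, 4.0, 10.0]
chunks = [(y_true[:4], y_pred[:4]), (y_true[4:], y_pred[4:])]

chunk_maes = [mae(t, p) for t, p in chunks]
chunk_sizes = [len(t) for t, _ in chunks]

print(sum(chunk_maes) / len(chunk_maes))            # unweighted mean of chunk MAEs
print(combine_chunk_maes(chunk_maes, chunk_sizes))  # size-weighted mean
print(mae(y_true, y_pred))                          # MAE over the full set
```

With equal-sized chunks the two averages coincide, so the only chunk that usually needs care is a smaller final one.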
0 votes
0 answers
17 views

Survival analysis metric on time series data

I created a model that estimates the probability of failure of an asset (based on Weibull CDF, value between 0 and 1). I have a data point every minute. I want to measure the model's success based on ...
rvdinter's user avatar
