Questions tagged [evaluation]
Anything related to evaluation of expressions, i.e. the process used to determine the value of expressions in a running program.
evaluation
1,354
questions
0
votes
0
answers
20
views
Why does the F1 score decline from 1.0 to 0.0?
When evaluating a multi-class YOLOv8 model, why does the F1-confidence curve start at 1.0 and then drop from 1.0 to 0.0? Why does this happen when there is an increase in the ...
1
vote
1
answer
31
views
Why do the sensitivity (recall) values differ between classification_report and precision_recall_fscore_support in a loop?
I am working with a synthetic dataset generated using make_classification from sklearn.datasets with 5 classes. I have trained a RandomForestClassifier on this data and am evaluating its performance ...
0
votes
0
answers
18
views
Problems calculating precision and recall
I need to calculate precision and recall to evaluate my model's performance, so I am using this code, which performs inference, annotates the images with the resulting class,
and calculates the precision and ...
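For reference, precision and recall for a single class can be computed directly from prediction counts. The sketch below is a minimal illustration, not the asker's code: the function name `precision_recall` and the sample labels are assumptions made up for this example.

```python
def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall) treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: actually positive but predicted as another class.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels, for illustration only.
y_true = ["cat", "dog", "cat", "cat", "dog"]
y_pred = ["cat", "cat", "cat", "dog", "dog"]
p, r = precision_recall(y_true, y_pred, "cat")
```

Returning 0.0 when a denominator is zero is one common convention; some libraries emit a warning in that case instead.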
0
votes
0
answers
168
views
Mean Reciprocal Rank (MRR) understanding for predicting top k elements
I have the following code from a paper whose authors implemented MRR for recommending top-k elements using a machine learning model.
def MRR(test_y, pred_y, k=5):
    predict = pd.DataFrame([])
...
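As a point of comparison, MRR at a cutoff k is usually defined as the mean, over all queries, of 1/rank of the true item within the top-k predictions, counting 0 when the item is not in the top k. The sketch below follows that common definition; it is not the paper's implementation, and the name `mrr_at_k` and the sample data are assumptions for illustration.

```python
def mrr_at_k(true_items, ranked_predictions, k=5):
    """Mean reciprocal rank of each true item within its top-k ranking."""
    if not true_items:
        return 0.0
    total = 0.0
    for truth, ranking in zip(true_items, ranked_predictions):
        top_k = list(ranking)[:k]
        if truth in top_k:
            # Ranks are 1-based, so the first position contributes 1.0.
            total += 1.0 / (top_k.index(truth) + 1)
    return total / len(true_items)


# Hypothetical data: one true item and one ranked list per query.
truths = ["a", "b", "c"]
rankings = [["a", "x", "y"], ["x", "b", "y"], ["x", "y", "z"]]
score = mrr_at_k(truths, rankings, k=3)
```

Here the three queries contribute 1, 1/2, and 0 respectively, so the score is 0.5.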
0
votes
0
answers
45
views
Unexpected search results: How does Gmail search interpret advanced queries, and how can I use that information to achieve precision?
Consider the following filter
Keep this filter (also provided in query form) in mind. We'll come back to it later.
from: { domain1 domain2 email1 email2 }
subject: { +"exact string 1" +"...
-1
votes
1
answer
46
views
How to explain null-coalescing expression precedence evaluation with some operators?
The following code works fine. What is the logic behind the addition operator (+) being evaluated after the null-coalescing operator (??)? How is that possible? Where is the documentation that explains this?
int? tNullable = 2;
...
1
vote
1
answer
25
views
Why does the approxes variable of a custom eval_metric with catboost for binary classification contain negative values?
To create a custom evaluation function with CatBoost for binary classification, I used the example mentioned here: "How to create custom eval metric for catboost?"
However, I have negative ...
0
votes
0
answers
18
views
How to append validation accuracy for each epoch in the code below?
def train():
    seed_val = 42
    criterion = CosineSimilarityLoss()
    criterion = criterion.to(device)
    random.seed(seed_val)
    torch.manual_seed(seed_val)
We'll store a number of quantities such as training and ...
1
vote
1
answer
62
views
Ensuring Equivalence in Python Functions: Understanding Implementation Impacts
In defining function equivalence, several factors come into play:
Producing equivalent results
Sharing the same (non-)termination behavior
Mutating (non-local) memory similarly
Maintaining identical ...
3
votes
3
answers
145
views
Is "Side effects of a function are sequenced before its evaluation" specified by the C++ standard?
I didn't find relevant wording in "Order of evaluation".
So is the behavior of function g undefined in the code below?
int x;
int f() { return x++; }
void g() { x = f(); }
I compiled the code on ...
0
votes
0
answers
103
views
Model loss value for each epoch in the Sentence Transformers framework
I'm trying to fine-tune a pre-trained language model using sentence transformers.
The model I'm using is based on BERT.
The method I'm using is fine-tuning via a Siamese network, so in order to do ...
0
votes
0
answers
18
views
Evaluation in a recommendation system
I want to calculate the precision, recall, and F1-score values of the recommendation system that I built. I plan to conduct an evaluation by asking users directly whether the recommended items ...
0
votes
0
answers
53
views
How to use SpanQuery to query evaluations in Arize Phoenix
I'm making a RAG application and I use Arize Phoenix for my logs.
I can run evaluations, but it seems I can't write a query that returns the evaluation results as a dataframe.
Does anyone have a ...
0
votes
0
answers
21
views
Testing and Evaluating Potentially Complex System of Interconnected Software Services
How can interconnected software services be tested and evaluated as a whole, with support from scientific citations? I expect to minimize risk by knowing certainty, increasing predictability, and considering more ...
0
votes
0
answers
40
views
How can I make an effective Evaluation function for a Draughts/Checkers game with Minimax + alpha-beta pruning?
I'm making a checkers game for an academic project and struggling to produce effective evaluation methods to push certain scenarios. My game's logic appears to work fine and everything functions in terms ...