Questions tagged [classification]
An instance of supervised learning that identifies the category or categories to which a new instance of a dataset belongs.
3,287
questions
0
votes
0
answers
8
views
Imbalanced Cost-Sensitive Learning Workflow - How to split the data, tune hyperparameters and apply a decision threshold?
I am facing a problem with an imbalanced dataset in which I would like to detect the rare event. My questions are more about general strategy for the whole workflow and I would like to hear your thoughts ...
0
votes
0
answers
12
views
Does it make sense to have an object detection model followed by a classification model?
So I was working with the SKU110k dataset and I was required to identify the different items on the shelf as well, but the SKU110k dataset only annotates shelf items and does not identify them. So I ...
0
votes
0
answers
15
views
Classification: Seeking Model to Recognize Number Relationships
I'm trying to find out if there's a type of AI/ML model capable of recognizing relationships between numbers. Could functional programming help here? I haven't figured out how to approach this yet.
...
0
votes
0
answers
11
views
NER with custom tags and no training data: help with a zero-shot approach
I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...
0
votes
0
answers
8
views
How to adjust classification totals based on known bias of estimator
Let's say I have a dataset, $D$, with known ground truth labels. I nonetheless use a few-shot LLM classifier on this dataset to predict $k$ classes for each label.
From the LLM results, I get ...
0
votes
0
answers
32
views
How does one handle a dataset with groups of features and groups of labels in classification?
I have a large dataset (1.8 million samples). There are 15 features: x1, y1, z1, e1, d1, x2, ..., d3. (x, y, z) are coordinates, e is energy, and d is a derived feature: the Euclidean distance between the ...
1
vote
1
answer
24
views
Everything is classified as background by segmentation model
I am training a U-Net model for medical image segmentation. The problem is that the binary masks that I'm using to train the model mostly consist of background pixels and a very small region of the whole ...
0
votes
1
answer
24
views
How can I improve an XGBoost classifier if overfitting starts from the initial epochs?
I am training an XGBoost multi-class classifier, but got very bad results. The train/val learning curve for multi-class logloss showed that overfitting started from the early epochs. What directions can I ...
1
vote
0
answers
20
views
How to apply CalibratedClassifierCV in external validation of a Random Forest model
I have a model trained on my data. I used joblib to export the model and shared it with other teams to evaluate the performance of the model on their data. One of the teams came back and said that the models ...
0
votes
0
answers
16
views
Training Data for Duplicate Detection: Allow External Information?
We have collected metadata of scientific publications (in a bilingual English-French context) from several international platforms (OpenAlex, Scopus) and French platforms (Hal, Idref, etc.). Many ...
0
votes
0
answers
18
views
Cross-entropy loss for multi-class classification problem
I am handling a multi-class classification problem, with labels in the following form [1333201000]
and the logit output of the model is in the form
([[ 0.4523, 0.0198, -0.1911, -0.0036],
[ 0.4917, 0....
0
votes
0
answers
22
views
Improving Recall and Precision of the Minority Class with XGBoost to Maximize Profits in Unbalanced Data
The company is interested in identifying profitable customers who are likely to purchase a ticket when given a promotional offer. My goal is to build a model to predict whether a customer will buy a ...
0
votes
0
answers
23
views
How to choose thresholds to discretize target for binary classification
My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...
0
votes
0
answers
8
views
I wrote a classifier in C++ but it doesn't learn and returns the same loss each epoch
I wrote a classifier in C++ and tried training it on the MNIST set, but it doesn't learn. Because I am using log loss, it returns a loss of -ln(1/10), basically random chance. I tried tinkering with my ...
0
votes
0
answers
6
views
Classification for multi-row observations: is converting long format to wide format always efficient?
I have a table of observations, or rather 'grouped' observations (that span more than one row), where each group represents a deal and each row represents a product. But the prediction is to be done ...