All Questions

0 votes
0 answers
12 views

Does it make sense to have an object detection model followed by a classification model?

So I was working with the SKU110k dataset and was required to identify the different items on the shelf as well, but the SKU110k dataset only annotates shelf items and does not identify them. So I ...
Ali Raheel
0 votes
0 answers
11 views

NER with custom tags and no training data, zero shot approach help

I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...
redbull_nowings
0 votes
0 answers
32 views

How does one handle a dataset with groups of features and groups of labels in classification?

I have a large dataset (1.8 million samples). There are 15 features: x1, y1, z1, e1, d1, x2, ..., d3. (x, y, z) are coordinates, e is energy, and d is a derived feature: the Euclidean distance between the ...
mche1962
1 vote
1 answer
24 views

Everything is classified as background by segmentation model

I am training a U-Net model for medical image segmentation. The problem is that the binary masks I'm using to train the model consist mostly of background pixels and a very small region of the whole ...
Ashwin Singh
0 votes
0 answers
23 views

How to choose thresholds to discretize target for binary classification

My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...
OstensiblyPutative
1 vote
1 answer
77 views

How do I compute and plot Bias and Variance of a classifier in Python?

I'm new to Machine Learning and I understand bias and variance in theory but I can't seem to find a single source that explains how bias or variance can be computed. I'd like to do it in Python and ...
William
  • 113
0 votes
1 answer
25 views

Fixing class imbalance vs Over-detecting in test data

In my experience, binary classifiers tend to do better in terms of F1 score when the class imbalance is at least reduced. However, this leads to over-predicting in the test data. (Thought) Example: If ...
yurnero
  • 131
0 votes
0 answers
16 views

How to choose the segment in the Grouped AUC metric?

Background: In binary classification, AUC is a common metric. However, Group-AUC performs better in some scenarios; for example, recommendation systems use AUC grouped by user. In the examples below, I ...
Travis
  • 111
1 vote
1 answer
26 views

Feature Engineering a Recency feature

I'm working on a customer scoring problem, specifically predicting conversion and producing a conversion probability score (currently using an XGBoost classifier). There's a feature I want to ...
MetalicSt33l
0 votes
0 answers
13 views

Modeling spatial data

I have the following dataset. For every time point (at a frequency of 1 hour), we can construct a graph consisting of 20 nodes representing countries. Each country (node) is characterized by 5 ...
Peter
  • 1
0 votes
0 answers
37 views

Determining VCdim for union of subspaces $H_i$ - short question

Consider $\mathcal{H} = \mathcal{H}_1 \cup \mathcal{H}_2 \cup \mathcal{H}_3$, where: $\mathcal{H}_1 = \{h_{a} : \mathbb{R} \rightarrow \{0,1\} \ | \ h_{a}(x) = 1_{[x \geq a]}(x) = 1_{[a, +\infty)}(x), ...
Andrei Jarca
0 votes
1 answer
23 views

Should I standardise time series data for deep learning classification?

Say I have time series data for classifying stars using deep learning based on stellar variability, with each time series measuring the flux of the star over time. For each star, I have the data ...
Johnathon Smith
1 vote
1 answer
39 views

Data binning for interval data

I am trying to create an ML model for salary classification into 5 categories (0-90k, 90-120k, 120-180k and so on). The problem is that in my dataset almost all salary data is presented in intervals. ...
pinkkdeerr
0 votes
0 answers
45 views

When is sampling bias acceptable?

Overview: The dataset is small and a bit messy, and the task is to classify 5 classes whose targets are ordinal. Feature engineering and selection, model tuning, etc. did not produce acceptable ...
easymoneysniper
1 vote
0 answers
7 views

Is GroupKFold needed if some samples have some of their feature values equal?

I am given a dataset $D$ of 10k enzyme-substrate complexes having a lock-key relationship, with each sample (complex) characterized by enzyme features $x_e$ and substrate features $x_s$. That is, ...
ado sar
  • 191
