All Questions
Tagged with classification clustering
120
questions
2
votes
1
answer
34
views
Grouping similar classes to improve accuracy, whilst maximising the number of classes
Suppose I have a large number of distinct classes, some of which are related.
My model has high classification accuracy for some classes, whilst other classes are hard to predict.
How could I group ...
0
votes
0
answers
13
views
Need to compare results using Ward's method
So I create clusters like this and StandardScale them
...
0
votes
1
answer
25
views
Group/cluster semantically similar classes in reports?
I'm fine-tuning BERT models to binary classify reports. For example, a report can be about 'birds' or not about 'birds'.
This works really well, but now I want to do multi-label classification, ...
0
votes
0
answers
14
views
How to classify/recognize postage stamp varieties?
As a hobbyist stamp collector, I often need to classify stamps based on minute differences, such as these:
Now, I literally have thousands of them (in ziploc bags) and I am planning ...
0
votes
0
answers
37
views
Classifying Players as winners or losers
I have a dataset that I curated from a game that I play. There are currently 130 instances (i.e. players) and an innumerable number of features. Experience tells me <10 features would be sufficient....
0
votes
0
answers
9
views
Unsupervised learning with bags of words with a word metric
I would like to perform clustering on a collection of documents with the assumption that I have a metric $\rho$ which tells me how close two words are to being synonyms.
If $\mathcal{W}$ is our ...
0
votes
0
answers
23
views
Choosing a cluster validation measure for graph clustering algorithm
I am currently solving a clustering problem. Objects to be clustered are represented as sparse vectors in R^N, N=10. The number of objects is about 1 million. To cluster, I build a graph keeping the largest ...
0
votes
0
answers
12
views
How to solve a classification problem that requires clustering elements, using multinomial classification from CS229?
I just learned about multinomial classification (CS229 lecture notes; what I learned is on page 24) and attempted to solve the Obesity classification problem from Kaggle. Kaggle Link
I tried to ...
1
vote
2
answers
96
views
Cluster/Similarity problem with two datasets of different cardinality
I want to cluster financial products according to their similarity. I have two datasets of different cardinality:
One-to-One dataset: One ID has One attribute/feature per column - Describes a ...
4
votes
1
answer
329
views
Solve tough clustering problem with overlapping clusters
I'm having some trouble solving a hard clustering problem.
I have a 2D dataset characterized by non-spherical and partially overlapping clusters with different densities.
I've read a lot about ...
0
votes
2
answers
53
views
Text Classification Taking too long
I have a sample of 135k preprocessed documents, for which I computed TF-IDF. I tried clustering with KMeans, which gave me a memory problem (20GB). Then I tried with MiniBatch K-Means ...
0
votes
2
answers
88
views
Different Algorithms for 50-50 A/B Testing
We are running A/B tests on web app customers, given a customerId. Each customer will see different web-feature designs. We are trying to avoid using feature flags, as they are not set up yet in our ...
1
vote
1
answer
24
views
Movement in cohorts
I am working on user sales data that is updated week over week. Based on the sales made in each week, the user is categorized in segment A, B or C. This means the size of each segment could change ...
0
votes
0
answers
23
views
Determine unusual occurrence of words in classes
I am working on a project where I have 20+ classes/groups. Each of these groups performs certain text searches. I am looking for specific keywords, for example 'code', which is an anomaly. The challenge is ...
1
vote
1
answer
476
views
In DBSCAN, can the distance between a Noise Point and Border Point be less than Epsilon?
In DBSCAN:
A core point is a point which has at least "MinPts" points inside its Epsilon radius.
A border point is a point inside the Epsilon radius of a core point, but it has a number of ...
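The core/border/noise definitions quoted in this excerpt can be sketched in a few lines of plain Python. This is only an illustration, not any library's implementation: the function name `label_points`, the sample coordinates, and the values of `eps` and `min_pts` are all made up for the example, and it assumes that a point counts as a member of its own Epsilon neighbourhood (conventions differ on this).

```python
from math import dist  # Euclidean distance, Python 3.8+


def label_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (hypothetical helper)."""
    # Neighbourhood of point i: indices of all points within eps of it.
    # Assumption: the point itself is included in its own neighbourhood.
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    # Core points have at least min_pts points in their neighbourhood.
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}

    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            # Not core, but within eps of at least one core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


# Made-up 2D sample data.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.0, 0.0), (1.8, 0.0), (5.0, 5.0)]
labels = label_points(points, eps=0.9, min_pts=4)
for p, label in zip(points, labels):
    print(p, label)
```

With these made-up values, the point at (1.0, 0.0) is labelled border and the point at (1.8, 0.0) is labelled noise even though the two are only 0.8 apart, which is less than `eps`. Under the definitions above, being within Epsilon of a border point is not enough to join a cluster; only proximity to a core point is.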