Newest 'classification+clustering' Questions - Data Science Stack Exchange

2 votes

1 answer

34 views

Grouping similar classes to improve accuracy, whilst maximising the number of classes

Suppose I have a large number of distinct classes, some of which are related. My model has high classification accuracy for some classes, whilst other classes are hard to predict. How could I group ...

MuhammedYunus

681

asked May 25 at 17:48

0 votes

0 answers

13 views

Need to compare results using Ward's method

So I create clusters like this and StandardScale them ...

Poyo

1

asked Apr 7 at 5:53

0 votes

1 answer

25 views

Group/cluster semantically similar classes in reports?

I'm fine-tuning BERT models to binary classify reports. For example, a report can be about 'birds' or not about 'birds'. This works really well, but now I want to do multi-label classification, ...

Rob Audenaerde

143

asked Apr 4 at 12:29

0 votes

0 answers

14 views

How to classify/recognize postage stamp varieties?

As a hobbyist stamp collector, I often run into the need for classifying stamps based on minute differences, such as these: Now, I literally have thousands of them (in ziploc bags) and I am planning ...

René Becker

1

asked Apr 1 at 8:33

0 votes

0 answers

37 views

Classifying Players as winners or losers

I have a dataset that I curated from a game that I play. There are currently 130 instances (i.e. players) and an innumerable number of features. Experience tells me <10 features would be sufficient....

Shawn

35

asked Mar 14 at 21:16

0 votes

0 answers

9 views

Unsupervised learning with bags of words with a word metric

I would like to perform clustering on a collection of documents with the assumption that I have a metric $\rho$ which tells me how close two words are to being synonyms. If $\mathcal{W}$ is our ...

jwhite

101

asked Jan 25 at 21:39

0 votes

0 answers

23 views

Choosing a cluster validation measure for graph clustering algorithm

I am currently solving a clustering problem. Objects to be clustered are represented as sparse vectors in R^N, N=10. The number of objects is about 1kk (1 million). To cluster, I build a graph keeping the largest ...

Sergey Tkachenko

1

asked Jan 23 at 11:45

0 votes

0 answers

12 views

How to solve classification problem that we should cluster elements, with Multinomial classification from CS229?

I just learned about multinomial classification (CS229 lecture notes; what I learned is on page 24) and I attempted to solve the Obesity classification problem from Kaggle (Kaggle link). I tried to ...

Gosu Choi

1

asked Nov 26, 2023 at 9:56

1 vote

2 answers

96 views

Cluster/Similarity problem with two datasets of different cardinality

I want to cluster financial products according to their similarity. I have two datasets of different cardinality: One-to-One dataset: One ID has one attribute/feature per column - Describes a ...

Maeaex1

550

asked Sep 28, 2023 at 14:54

4 votes

1 answer

329 views

Solve tough clustering problem with overlapping clusters

I'm having some trouble solving a hard clustering problem. I have a 2D dataset characterized by non-spherical and partially overlapping clusters with different densities. I've read a lot about ...

Lorenço Santos

41

asked Sep 8, 2023 at 18:44

0 votes

2 answers

53 views

Text Classification Taking too long

I have a sample of 135k preprocessed documents, for which I computed TF-IDF. I tried clustering with KMeans, which gave me a memory problem (20GB). Then, I tried with MiniBatch K-Means ...

ayowhatthedogdoin

3

asked Aug 29, 2023 at 17:40

0 votes

2 answers

88 views

Different Algorithms for 50-50 A/B Testing

We are running A/B tests on web app customers, given a customerId. Each customer will see different web-feature designs. We are trying to avoid using Feature Flags, as they are not currently set up in our ...

mattsmith5

53

asked May 10, 2023 at 21:43
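One common way to get a stable 50-50 split from a `customerId` without a feature-flag service is to hash the ID and take the result modulo 2. The following is only a minimal sketch of that idea, not the approach from the question or its answers; the function name `assign_variant` and the `salt` parameter are hypothetical, introduced here for illustration.

```python
import hashlib


def assign_variant(customer_id, salt="experiment-1"):
    """Deterministically map a customer ID to variant 'A' or 'B'.

    `salt` is a hypothetical per-experiment string, so that different
    experiments do not reuse the same split of customers.
    """
    key = f"{salt}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and take its parity.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "A" if bucket == 0 else "B"


# The same customer always lands in the same variant.
print(assign_variant(12345) == assign_variant(12345))
```

Because the assignment depends only on the ID and the salt, no per-customer state needs to be stored, and the split is close to even for a large number of customers.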

1 vote

1 answer

24 views

Movement in cohorts

I am working on user sales data that gets updated week over week. Based on the sales done in each week, the user is categorized in segment A, B or C. This means the size of each segment could change ...

Sham

31

asked Feb 21, 2023 at 16:07

0 votes

0 answers

23 views

Determine unusual occurrence of words in classes

I am working on a project where I have 20+ classes/groups. Each of these groups performs certain text searches. I am looking for specific keywords (for example, 'code') that are anomalous. The challenge is ...

kruparulz14

11

asked Jan 12, 2023 at 18:25

1 vote

1 answer

476 views

In DBSCAN, can the distance between a Noise Point and Border Point be less than Epsilon?

In DBSCAN: A core point is a point which has at least "MinPts" points inside its Epsilon radius. A border point is a point inside the Epsilon radius of a core point, but it has a number of ...

SuperFluo

13

asked Jan 11, 2023 at 12:31
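The core/border/noise definitions in the DBSCAN question above can be checked on a small example. This is a minimal sketch in plain Python, assuming Euclidean distance and assuming a point counts as its own neighbour (the convention scikit-learn uses); the helper `classify_points` and the sample coordinates are hypothetical, chosen only for illustration.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (DBSCAN definitions)."""
    # Indices of all points within eps of each point (the point itself included).
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            # Not dense enough itself, but within eps of a core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


points = [(-0.5, 0.0), (0.0, 0.5), (0.0, 0.0), (1.0, 0.0), (1.9, 0.0)]
labels = classify_points(points, eps=1.0, min_pts=4)
print(labels)
print(dist(points[3], points[4]) < 1.0)
```

In this example the point at (1.0, 0.0) is a border point and the point at (1.9, 0.0) is noise, even though the two are closer than Epsilon: only core points extend a cluster, so being near a border point is not enough to join it.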
