Newest 'classification+dataset' Questions - Data Science Stack Exchange

0 votes

1 answer

52 views

Train/test split of data, stratified based on label, but ensuring no athletes are in both train/test sets

I’m working on a project that uses data from wearable tech for activity classification. However, I’m having trouble deciding on how to do the train/test split. I’m currently doing the split based on ...

Shane O Mahony

3

asked Apr 26 at 15:56

0 votes

0 answers

45 views

When is sampling bias acceptable?

Overview: Dataset is small and a bit messy and the task is to classify 5 classes wherein the targets are ordinal. Feature Engineering and Selection, Model Tuning, etc. did not produce acceptable ...

easymoneysniper

21

asked Apr 16 at 4:21

0 votes

0 answers

8 views

What is the reference time point relative to which the capital-gain and capital-loss features of the UC Irvine Adults dataset are measured?

The Adults dataset available in the UC Irvine Machine Learning Repository is based on the 1994 census data (USA census, I presume). The dataset has two features named ...

Evan Aad

175

asked Apr 15 at 17:48

0 votes

0 answers

12 views

Building a dataset for classification

I'm thinking of building a PowerShell script classifier using different neural network architectures. I have approximately 6k PowerShell scripts (3k malicious, 3k benign). My questions are: How ...

freaks

1

asked Apr 14 at 23:00

1 vote

1 answer

26 views

Public Email Classification Dataset but not Spam vs Ham

Context: Working to deliver a POC on automated email classification (in a customer service context) to tag emails as related to feedback, complaints, lost and found, etc. The tags are not entirely exclusive,...

Della

335

asked Apr 12 at 4:46

0 votes

0 answers

9 views

Which dataset could be a good choice to train Environment Sound Classification model for user environment awareness while wearing earbuds?

Which dataset could be a good choice to train an Environment Sound Classification model for the following use case: use the model in the earbuds/earphones to detect important sound events in the user'...

Danijel

173

asked Feb 2 at 11:58

0 votes

1 answer

29 views

Where can I get 5000+ classified images of zoo animals? [closed]

Please help! We are college students doing this for a project. The project uses neural networks, and we want to build a model that takes as input a colored image of an animal and outputs the ...

user90061

1

asked Jan 9 at 19:35

0 votes

0 answers

35 views

How can I identify coverage types in NFL games using computer vision

I am currently working on a project that classifies coverage types from sports highlights using advanced computer vision techniques. Next Gen Stats effectively utilizes tracking data to identify ...

Shah Zeb

1

asked Dec 19, 2023 at 6:59

0 votes

0 answers

92 views

Seeking datasets for training a Language Model on U.S. mortgage loan processes

I'm in the process of training a Language Model (LLM) and require datasets that encompass various aspects of the U.S. mortgage loan process. The model's aim is to understand and simulate decision-...

Anand

1

asked Nov 30, 2023 at 7:53

0 votes

0 answers

32 views

Optimal ML classification approach

Background: I have app data (impressions, user activities) that I can use as features for a multiclass classifier (5 classes). I just want to discuss some things that our team is having a ...

easymoneysniper

21

asked Nov 12, 2023 at 15:53

0 votes

1 answer

31 views

Binary Classification of Images- CNN

I am learning ML and am working on a CNN problem where I need to classify images of CATS and DOGS. The way I have set up the labels is that cats are 1 and dogs are 0. I have made the final output layer ...

Hussain Bhavnagarwala

1

asked Oct 16, 2023 at 4:56

1 vote

2 answers

96 views

Cluster/Similarity problem with two datasets of different cardinality

I want to cluster financial products according to their similarity. I have two datasets of different cardinality: One-to-One dataset: One ID has One attribute/feature per column - Describes a ...

Maeaex1

550

asked Sep 28, 2023 at 14:54

1 vote

1 answer

107 views

How to know the confidence of a classification on unlabeled data generated after model training?

I have created (in Python) the code for a Random Forest classification model for a labeled dataset using sklearn. The model works very well. ...

Daniel Vieira

13

asked Aug 15, 2023 at 9:44

0 votes

0 answers

45 views

Open source dataset (manufacturing, machine operations)

I am looking for an open source dataset from the manufacturing domain (sensor data, time series) with specific traits. It should stem from a process consisting of a sequence of distinct machine ...

sinpalabras

1

asked Aug 8, 2023 at 19:22

0 votes

1 answer

12 views

Classification Problematics : Feature Number Variance & Feature Repetition

I have a harsh case study (in my mind). The problem is that I need to make a binary classification of Quality of Service (good or bad). I have feedback on quality for groups of devices belonging to a company. I ...

secuf

1

asked Jul 7, 2023 at 8:07
