Newest 'machine-learning' Questions - Data Science Stack Exchange

0 votes

0 answers

12 views

Does it make sense to have an object detection model followed by a classification model?

So I was working with the SKU110k dataset and I was required to identify the different items on the shelf as well, but the SKU110k dataset only annotates shelf items and does not identify them. So I ...

Ali Raheel

1

asked yesterday

0 votes

0 answers

8 views

Is it appropriate to utilize LSTMs for multivariate binary prediction on a timeseries by sliding block-by-block vs row-by-row?

I am trying to implement an ML algorithm for multivariate regression on a list of several timeseries. There are hundreds of timeseries, each one millions of rows long. There are 13 features, and I'm ...

HyperThready

1

asked 2 days ago

0 votes

0 answers

23 views

What is an appropriate individual KPI for AI projects?

I work in the sales department of an electronics component manufacturing company and we do data science projects using traditional algorithms like random forests (success likelihood of design project), ...

The Great

2,585

asked Jul 18 at 1:07

0 votes

0 answers

11 views

NER with custom tags and no training data, zero shot approach help

I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...

redbull_nowings

1

asked Jul 17 at 21:18

0 votes

0 answers

9 views

Is there a way to create a bootstrapped beta calibration function to use on new data?

I have created ML classification models that are now to be evaluated on a different population for external validation (n=5000, event rates between n=400 and n=1200 for different outcomes under study)....

mmo

1

asked Jul 17 at 9:15

0 votes

2 answers

22 views

LR not decaying for PyTorch AdamW even after hundreds of epochs

I have the following code using the AdamW optimizer from PyTorch: optimizer = AdamW(params=self.model.parameters(), lr=0.00005) I tried ...

RajS

105

asked Jul 16 at 6:09

0 votes

0 answers

14 views

Using LSTMs for Predicting Targets with Known Feature Vector

I am trying to use an LSTM to predict the consecutive "offset" calibration values for an instrument. These offset values have previously been shown to be well correlated with a pair of ...

Joey Wee

1

asked Jul 15 at 16:05

0 votes

0 answers

17 views

Andrew Ng ML course using MATLAB?

Nowadays Python is mostly used for machine learning, and I think it is also used in the new ML courses of Andrew Ng https://www.quora.com/Why-was-MATLAB-not-used-in-the-Andrew-Ng-course-of-deep-learning ...

DSP_CS

101

asked Jul 15 at 13:15

0 votes

1 answer

27 views

Machine learning vs. deep learning in the context of generative AI vs. discriminative AI?

I know that deep learning is a subset of machine learning. But is it correct that classical machine learning algorithms mainly focus on implementing discriminative AI while deep learning algorithms ...

DSP_CS

101

asked Jul 15 at 11:18

0 votes

1 answer

28 views

Hacky backprop outperforms clean backprop - why?

I implemented a basic NN for MNIST in NumPy and started with a hacky implementation of backprop (just randomly multiplying gradients together), but somehow that one works better than my cleaned up ...

Christoph Hörtnagl

1

asked Jul 14 at 15:34

0 votes

1 answer

12 views

Tuning non-hyperparameters in scikit-learn

In scikit-learn random search or grid search, how do I include non-hyperparameters in the tuning process? Non-hyperparameters are parameters not related to the machine learning algorithms. For example ...

Emad Ezzeldin

151

asked Jul 12 at 8:54

0 votes

0 answers

32 views

How does one handle a dataset with groups of features and groups of labels in classification?

I have a large dataset (1.8 million samples). There are 15 features: x1, y1, z1, e1, d1, x2,..., d3. (x,y,z) are coordinates, e is energy, and d is a derived feature: the Euclidean distance between the ...

mche1962

1

asked Jul 11 at 6:42

0 votes

0 answers

45 views

How does the weight vector behave when we initialize the weights to 0 in the case of a perceptron?

While reading a book I encountered this statement: Now, the reason we don't initialize the weights to zero is that the learning rate (eta) only has an effect on the classification outcome if the ...

Vipin Dubey

101

asked Jul 7 at 1:56
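
The effect described in the question above can be checked numerically. The following is a minimal sketch, assuming a Rosenblatt-style perceptron with a unit-step activation and a made-up toy dataset; the names `predict` and `train_perceptron` are hypothetical and not taken from the book. With zero-initialized weights, changing the learning rate only rescales the learned weight vector, so the predictions stay the same.

```python
def predict(w, b, x):
    # Unit-step activation: class 1 if the net input is non-negative, else -1.
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if net >= 0 else -1


def train_perceptron(X, y, eta, epochs=10):
    # Weights and bias start at zero (the case discussed in the question).
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Perceptron rule: update only when the prediction is wrong.
            update = eta * (target - predict(w, b, xi))
            w = [wj + update * xj for wj, xj in zip(w, xi)]
            b += update
    return w, b


# Toy, linearly separable data (assumed for illustration only).
X = [[2.0, 1.0], [3.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]

w_small, b_small = train_perceptron(X, y, eta=0.5)
w_large, b_large = train_perceptron(X, y, eta=2.0)

preds_small = [predict(w_small, b_small, x) for x in X]
preds_large = [predict(w_large, b_large, x) for x in X]
```

Here `w_large` is exactly four times `w_small` (the ratio of the two learning rates), and `preds_small` equals `preds_large`: the decision boundary has the same direction in both runs.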

1 vote

1 answer

24 views

Everything is classified as background by segmentation model

I am training a U-Net model for medical image segmentation. The problem is that the binary masks that I'm using to train the model mostly consist of background pixels and a very small region of the whole ...

Ashwin Singh

61

asked Jul 6 at 15:35

0 votes

0 answers

30 views

Advice on deep learning PC build using dual 4090s

I’m an engineering grad student, and I’ve been tasked with finding parts for building a shared workstation for my lab. Our work includes deep learning, computer vision, network analysis, reinforcement ...

yuki

1

asked Jul 5 at 18:51

0 votes

0 answers

10 views

How to create modeling data for predicting Customer Lifetime Value, and definitions of Customer Lifetime Value

I'm trying to build this CLTV model for customers coming to purchase products over time, but I'm new to CLTV, so I have some questions to clarify: Since each customer was acquired at a different time point,...

Iris

3

asked Jul 5 at 1:55

2 votes

1 answer

25 views

Level of confidence for binary classification

I’m relatively new to PyTorch and deep learning. I was able to create a model and analyze a data set for both a training and test set in a binary classification problem. Everything is working well. ...

Ashishkabaab

21

asked Jul 5 at 1:35

0 votes

0 answers

34 views

Why does the line not cross the b value / y-intercept as expected?

...

jass

1

asked Jul 4 at 10:37

0 votes

0 answers

7 views

Live odds data set for horse racing [closed]

I am looking for a resource of live odds for horse racing to implement in my model. I know they exist, I just can't find anything that has worked, yet. Live, updated and accessible are what I'm ...

Adam

1

asked Jul 3 at 8:40

0 votes

0 answers

14 views

How to increase the optimal cutoff point (Youden index) after training a model?

So I trained a model based on a medical dataset and I got an AUROC of about 0.96 for detecting cancer in brain images, and I noticed that the Youden index is 0.1 but I want to increase it to 0.5, ...

mutli-arm-bandit

23

asked Jun 30 at 21:28
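
For reference, Youden's J at a given threshold is sensitivity + specificity - 1, and the "optimal cutoff" is the score threshold at which J is largest. These are two different numbers. The sketch below, which assumes made-up scores and labels and uses the hypothetical helper names `youden_j` and `best_cutoff`, computes both so they can be compared. Whether the excerpt's 0.1 refers to J or to the cutoff is not stated there.

```python
def youden_j(scores, labels, threshold):
    # Confusion counts when predicting positive for score >= threshold.
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    tn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 0)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_cutoff(scores, labels):
    # Try every observed score as a threshold and keep the one maximizing J.
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: youden_j(scores, labels, t))


# Made-up model scores and ground-truth labels (illustration only).
scores = [0.02, 0.05, 0.12, 0.10, 0.15, 0.30, 0.60]
labels = [0, 0, 0, 1, 1, 1, 1]

cutoff = best_cutoff(scores, labels)
j_at_cutoff = youden_j(scores, labels, cutoff)
```

On this toy data the best cutoff is a score of 0.15, while J at that cutoff is 0.75. The cutoff is a property of the score scale, and J summarizes how well the classes separate at that point.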

0 votes

0 answers

24 views

Where can I find a Database of Corrected Essays in Portuguese-BR?

I'm looking for a database of essays that could be from ENEM, competitions, entrance exams, universities, etc., however, they must have been corrected in Portuguese-BR by humans. Does anyone know ...

Maycon Silva

1

asked Jun 29 at 23:51

0 votes

0 answers

6 views

What's the right machine learning approach to mark rubrics based on sequences of data?

I'm a teacher and I'm working on a pet project to help streamline some of my assessment workflows for my students. One of those workflows is gathering data on student progress in the form of a rubric ...

Kevin

1

asked Jun 29 at 18:51

0 votes

1 answer

37 views

Detection of musical instruments using YAMNet

My goal is to detect musical instruments with AI (machine learning). I'm currently using the YAMNet model to make inferences, but it has a very wide range of categories, for example, "Growling" ...

Maxime Dupré

1

asked Jun 29 at 1:07

1 vote

0 answers

9 views

Use a metric that is not available in the list of metrics for XGBoost

Working in R. I am following this post on Stack Overflow. I am training an XGBoost model and I want to use another metric that is not in the list of metrics we can choose for the eval_metric parameter. I ...

Camillionnaire

11

asked Jun 28 at 13:47

0 votes

1 answer

23 views

How to create consistent dummy variables in Inference code?

I am using pd.get_dummies on a categorical column to create dummy variables. The training pipeline is something like this: normalization, dummy variable creation, ...

Sociopath

1,253

asked Jun 28 at 13:27

0 votes

0 answers

10 views

Frequency encoding in R while using a cross-validated model: How to use step_lencode_mixed()

One way of addressing high cardinality in a column is the use of frequency encoding. However, if you use a cross-validated analysis plan, then you would need to re-encode the column at each step. It's ...

Englishman Bob

113

asked Jun 28 at 1:37

0 votes

0 answers

18 views

Generating transaction data for a dataset to train on

My project is to predict what payment option a customer might use depending on various factors on a checkout screen. For example, here are some of the fields I would have. Variables: User_Location ...

Naeem Mujeeb

1

asked Jun 27 at 22:30

1 vote

1 answer

38 views

How does seeing training batches only once influence the generalization of a neural network?

I am referring to this question/scenario, "Train neural network with unlimited training data", but unfortunately I cannot comment. As I am not seeing any training batch multiple times I would guess that ...

ZenDen

13

asked Jun 26 at 15:07

-1 votes

0 answers

13 views

Multi-target classification

I have 24 columns for a banking hyper-personalized recommendation engine for providing offers to customers. The offer columns provided have two target variables. What approach would be correct ...

Johnimmanuel

1

asked Jun 26 at 13:11

0 votes

0 answers

11 views

What package/algorithm should I use to classify the pixels of a pig's eye in an infrared picture of a pig?

I'm a college student working on a project that involves identifying eye features in infrared pictures of pigs so that we can apply a FEM mesh to it and do computations (we haven't created the mesh yet,...

Ian Wilson

1

asked Jun 24 at 19:06

0 votes

0 answers

11 views

Cumulative feature importance in Random Forest taking into account past days' data

I have a dataframe with past days' data and current-day data. Example columns: [cases, mobility, temp, rh, cases_1, mobility_1, temp_1, rh_1, cases_2, mobility_2, temp_2, rh_2, and so on]. My ...

SHARVARI WANJARI

1

asked Jun 24 at 7:22

0 votes

1 answer

37 views

What programming language is better for data science?

I'm a student who is currently learning Python and Java. I'm very interested to learn about data science and machine learning as well as deep learning. Which of Java and Python is more useful for me?

Charansaiponnada

1

asked Jun 23 at 1:31

0 votes

1 answer

20 views

Only one node generated after using decision tree model on training data set

I am trying to build a decision tree model predicting an outcome variable (named: Results) based on predictor variables. I have applied one-hot encoding on some of the ">2 level" ...

M. Samir

1

asked Jun 22 at 14:53

0 votes

0 answers

23 views

How to choose thresholds to discretize target for binary classification

My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...

OstensiblyPutative

1

asked Jun 21 at 21:25

0 votes

0 answers

13 views

Should multiple categorical embeddings be combined for a conditional GAN (cGAN)?

I'm trying to make a conditional GAN (cGAN) that generates YouTube thumbnails based on a title and a video category/genre. It's not working whatsoever, not even close, and so I'm trying to go back to ...

Dylan Todd

21

asked Jun 19 at 11:43

0 votes

0 answers

10 views

Understanding the data leakage article on The ICML 2013 Whale Challenge - Right Whale Redux

I came across this article while trying to understand more about data leakage. https://www.kaggle.com/competitions/the-icml-2013-whale-challenge-right-whale-redux/discussion/4865 Though really good ...

sarah

1

asked Jun 19 at 8:36

0 votes

0 answers

16 views

Courses or lectures or books on machine learning or AI in general that have a lot of theoretical and practical mathematics but also practical coding

I'm looking for deep learning, or machine learning more generally, or artificial intelligence more generally, courses or lectures or books, that have a lot of theoretical and practical mathematics but ...

Burny

1

asked Jun 18 at 23:17

1 vote

1 answer

29 views

Is it appropriate to use KL Divergence as a loss function for a 1x3 regression model?

I have a regression model with a 1x3 output, which means it predicts three continuous values. I'm wondering if it would be appropriate to use the Kullback-Leibler (KL) Divergence as the loss function ...

Kjyong

175

asked Jun 18 at 1:42

0 votes

0 answers

17 views

Optimal combination of feature values to maximize a metric

I'm facing a complex issue with a project I'm working on and would appreciate some advice. Here's the context: We have a search engine where users input queries and get results filtered through ...

k-eternal007

1

asked Jun 17 at 13:15

6 votes

2 answers

588 views

How do regression loss functions like MAE and MSE work although they remove the plus/minus sign?

I have a question about regression loss functions like Mean Absolute Error (MAE) and Mean Squared Error (MSE) used in deep learning. When we calculate these losses, we remove the plus/minus sign from ...

Kjyong

175

asked Jun 17 at 9:20
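
One way to see the point raised in the question above: the loss value itself is never negative, but the gradient of the loss with respect to each prediction still carries the sign of the error, and the gradient is what the optimizer follows. A minimal sketch in plain Python, with hypothetical function names and made-up numbers:

```python
def mse(preds, targets):
    # Mean squared error: always >= 0, the sign of each error is squared away.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def mse_grad(preds, targets):
    # d(MSE)/d(pred_i) = 2 * (pred_i - target_i) / n, which keeps the sign.
    n = len(preds)
    return [2 * (p - t) / n for p, t in zip(preds, targets)]


def mae_grad(preds, targets):
    # d(MAE)/d(pred_i) = sign(pred_i - target_i) / n (taken as 0 when equal).
    n = len(preds)
    return [((p > t) - (p < t)) / n for p, t in zip(preds, targets)]


preds = [3.0, 1.0]    # first prediction is too high, second is too low
targets = [2.0, 2.0]

loss = mse(preds, targets)
grad_mse = mse_grad(preds, targets)
grad_mae = mae_grad(preds, targets)
```

Both errors contribute equally to `loss`, yet `grad_mse` and `grad_mae` are positive for the over-prediction and negative for the under-prediction, so a gradient step moves each prediction toward its target.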

0 votes

0 answers

21 views

Dummy features have almost the same effect as actual features

Four random dummy features (generated with np.random.randn) and four new real features (brought in from some ideas) show almost the same improvement. In cross-validation, 4 dummies ...

Crispy13

133

asked Jun 17 at 5:07

0 votes

0 answers

30 views

Why is the generator not trained directly in a GAN?

When we build a GAN network we usually do the following: (1) build a discriminator and compile it; (2) build a generator; (3) build a combined generator+discriminator model and compile it. Now for training we do ...

Lockhart

123

asked Jun 16 at 7:25

0 votes

0 answers

32 views

Keras multi-label model predictions always sum to ~1

I believe I've configured this model correctly for multi-label classification, but it would seem that it insists on behaving like a multi-class model, since the predictions it outputs always sum to 1 (...

paammar

101

asked Jun 16 at 0:27
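
The excerpt above does not show the model's output layer, so the actual cause is not known here. One common reason for outputs that always sum to 1 is a softmax activation on the final layer, whereas multi-label setups typically use an independent sigmoid per label. The sketch below illustrates only that difference in plain Python (it is not Keras code, and the logits are made up):

```python
import math


def softmax(logits):
    # Normalizes across labels, so the outputs compete and sum to 1.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    # Applied per label, so each output is an independent probability.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


logits = [2.0, 1.5, -1.0]  # hypothetical pre-activation scores for 3 labels

softmax_probs = softmax(logits)
sigmoid_probs = sigmoid(logits)
```

With these logits, `softmax_probs` sums to 1 no matter how confident the model is about several labels at once, while `sigmoid_probs` can assign a high probability to the first two labels simultaneously.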

0 votes

0 answers

11 views

Training with grouped data and having conditions on the group [closed]

I have data on students, with various features of the students. The students are divided into multiple groups, each group having 3-6 students. I have to predict the marks of individual students ...

Mayank

1

asked Jun 15 at 9:07

0 votes

0 answers

12 views

High Validation Accuracy + High Loss Score and High Training Accuracy + Low Loss Score?

I am having a weird observation in my experiments. I am using BERT with adapter and LoRA PEFT methods for domain adaptation. I first trained the adapter on an unlabeled target-domain dataset using MLM, ...

Mo Rawhani

1

asked Jun 15 at 2:52

0 votes

0 answers

54 views

Seeking Feedback on Data Science Portfolio

I hope you're doing well. I've recently started building my data science portfolio on GitHub with the goal of securing a role as a data scientist. I would greatly appreciate your feedback to help me ...

Colin Lim

1

asked Jun 13 at 9:28

0 votes

0 answers

13 views

Can we build a sequential model?

Is it possible to train a model on X1_i inputs and Y1_i output, and then run a second one on X1_i + Y1_i to give output Y2_i? Context: (I am just getting started with ML and trying to ...

Charles

1

asked Jun 13 at 9:00

0 votes

0 answers

10 views

Correct way to measure the uplift: S-learner or difference between real values and predictions of a model trained on non-treated samples

I want to measure the uplift associated with a change in a profile. For example, if I take the example of Airbnb, I'd like to measure if, when a host adds additional photos to his listing, an increase ...

gummy

1

asked Jun 12 at 16:33

0 votes

0 answers

34 views

How specific should I be with my region of interest in image data for training a CNN model for better accuracies?

I am trying to train a 3D CNN model for classification of cancer stages on a dataset that consists of head-to-neck CT image series, split into 5 classes corresponding to the stages of cancer....

Ashwin Singh

61

asked Jun 12 at 12:12

0 votes

0 answers

20 views

NLP: how to handle bad tokenization

I get nonsense when trying to translate the following German sentence to Swedish using google/madlad400-3b-mt: a. Natürliche Personen: BundID mit ELSTER-Zertifikat oder nPA/eID/eAT-Authentifizierung ...

Mathermind

1

asked Jun 12 at 3:50
